THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Linchi Shea

Checking out SQL Server via empirical data points

A Little Bit Scripting Saves the Day

Recently, I got sucked into importing Excel sheets into a database. I had thought, well that's easy with a tool such as SQL Server 2000 DTS Import/Export Wizard or SQL Server 2005 Import/Export Wizard. Needless to say, the Excel import business ran into a snag. With the error messages from the wizard not being really helpful, I ditched the wizard and wrote the following little ADO.NET program to read the Excel cells and dump them to a text file for a quick bcp import:

using System;
using System.IO;
using System.Data; 
using System.Data.OleDb;

class Junk
{

    static void Main(string[] args) 
    {
        String sConnectionString = 
        "Provider=Microsoft.Jet.OLEDB.4.0;" + 
        "Data Source=" + "Junk.xls" + ";" + 
        "Extended Properties=Excel 8.0;"; 


        OleDbConnection conn = new OleDbConnection(sConnectionString); 
        conn.Open(); 
        OleDbCommand cmd = 
                new OleDbCommand("SELECT * FROM [Vendors$]", conn); 
        OleDbDataReader reader = 
                cmd.ExecuteReader(CommandBehavior.CloseConnection);
        while (reader.Read())
        {
            for(int i=0; i< reader.FieldCount-1; i++)
            {
                Console.Write(reader.GetValue(i).ToString() + "|");
            }
            Console.Write(reader.GetValue(reader.FieldCount-1).ToString() + "}");
        }
        conn.Close(); 
    }
}

But as soon as I thought I was done, I noticed that some cells were truncated in the output text file. I'm sure there is a switch somewhere that I could have used to tell the OLE DB reader not to truncate anything. But I was pressed for time and I had to cut the loss. Plus, I really started to worry about the data quality of the Excel sheet and wanted to be able to check/verify each cell. Instead of figuring out how to get the C# code to work, I ditched it and put together the following little Perl script:

use strict;
use Win32::OLE;
use Win32::OLE::Const 'Microsoft.Excel';
use Win32::OLE::Variant;
use Win32::OLE::NLS qw(:LOCALE :DATE);
my $ex;
eval {$ex = Win32::OLE->GetActiveObject('Excel.Application')};
die "Excel not installed" if $@;
unless (defined $ex) 
{
     $ex = Win32::OLE->new('Excel.Application', sub {$_[0]->Quit;})
              or die "Oops, cannot start Excel";
}
open(LOG, ">readExcel.log") or 
     die "can't open readExcel.log for write.\n";
my $book = $ex->Workbooks->Open('c:\Junk.xls') or
    die Win32::OLE->LastError();
my $sheet = $book->Worksheets(1);
my $array = $sheet->Range("A2:Z1878")->{Value}; # specify the range
my $row_cnt = 0;
for (@$array) {
    foreach my $i (0..(scalar(@$_) - 2)) 
    {
        if (ref($_->[$i])) 
        {
            print $_->[$i]->Date("MM-dd-yyyy") . "\t";
        }
        else 
        {
            my $b = $_->[$i];
            $b =~ s/\s/ /g;     # convert whitespaces to blanks
            print "$b" . "\t";
        }
    }
    my $b = $_->[scalar(@$_) - 1];
    $b =~ s/\s/ /g;
    print "$b" . "\n";
    $row_cnt++;
    print LOG "($row_cnt," . scalar(@$_) . ")\n";
}
close(LOG);

I used several variations of this script to help check the Excel sheet. For instance, I changed the script to count the number of characters in each cell and figured out the max number of characters in each column in the sheet. This helped in determining the width for each column in the destination table. The table automatically generated by the import wizard just got all the column widths wrong.

Now, I'm not suggesting that this particular approach was the best or the script was such a terrific piece of code--I couldn't care less about its style or whether it's most efficient, I just wanted to get the data out of the cells and get the work done. Sure, there are many other ways to accomplish this, and your favorite method depends on your experience and the tool you are familiar with.

The point is that when a pre-built tool doesn't quite work out, it's good to know a little bit scripting that can help fill the gap.

Published Tuesday, March 06, 2007 12:06 AM by Linchi Shea
Filed under: , ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Stephen Moore said:

The approach that I've always used for this is to create an Excel formula that builds an INSERT statement based off of the data in the spreadsheet.  Then just copy the formula results to the clipboard and paste them into SSMS or Query Analyzer and run them.  It sometimes takes a few iterations to make sure I have the formula bugs worked out, but it's the quick and dirty preference for me.

March 8, 2007 5:23 PM
 

Linchi Shea said:

Interesting! Does this work well with arbitrarily long cell contents for a text colum? I'm no Excel expert, but I guess you could use Excel formula to easily find the lengths of all the cells, right?

March 9, 2007 8:55 AM
 

Ranga said:

I have used linked server to query excel sheets.....it works!

April 16, 2007 11:24 AM

Leave a Comment

(required) 
(required) 
Submit

About Linchi Shea

Checking out SQL Server via empirical data points

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement