THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Andy Leonard

Andy Leonard is CSO of Linchpin People and SQLPeople, an SSIS Trainer, Consultant, and developer; a Business Intelligence Markup Language (Biml) developer; SQL Server database and data warehouse developer, community mentor, engineer, and farmer. He is a co-author of SQL Server 2012 Integration Services Design Patterns. His background includes web application architecture and development, VB, and ASP. Andy loves the SQL Server Community!
Note: Comments are moderated. Spam shall not pass! </GandalfVoice>

SSIS is Case-Sensitive

Introduction

SSIS is case-sensitive even if the database is case-insensitive.

Imagine...

 ... you work in an ETL shop where someone who believes in natural keys won the Battle of the Joins. Imagine one of your natural keys is a string. (I know it's a stretch... play along!).

Let's build some tables to sketch it out. If you do not have a TestDB database, why not? Build one! You'll use it often.

Use TestDB
go

Create
Table SSIS1
(StrID char(5)
,Name varchar(15)
,Value int)

Insert
Into SSIS1
Select
'abc','One',1
union all
Select
'def','Two',2
union all
Select
'ghi','Three',3
union all
Select
'jkl','Four',4

Create
Table SSIS2
(STRID char(5) constraint PK_SSIS2_STRID Primary Key
,Name varchar(15)
,Value int)

Insert
Into SSIS2
Select
'abc','One',1
union all
Select
'def','Two',2
union all
Select
'ghi','Three',3
union all
Select
'JKL','Four',4

Cool - now you have a couple tables we can use.

The SSIS Project

Now let's build an SSIS project. After you create the project, add a Data Flow Task to the Control Flow and open it for editing. Drag an OLE DB Source Adapter onto the canvas and configure a connection to the SSIS1 table in your TestDB:

 

Next, add a Lookup Transformation and connect a Data Flow Path from the OLE DB Source Adapter to the Lookup. Configure the Lookup to use a SQL Query that pulls data from the SSIS2 table in your TestDB:

Click the Configure Error Output button and change the Error column for the Lookup Output buffer from the default (Fail Component) to Ignore Failure. This action allows the Lookup to return rows (populated with NULLs) if it does not find a match between Data Flow data and data returned by the SQL Query:

Configure the Columns in the Lookup as shown:

Close the Lookup Transformation editor and drag a Conditional Split Transformation onto the Data Flow canvas. Open the editor and add an Output buffer to detect New Rows as shown:

Close the Conditional Split Transformation editor. Next, add an OLE DB Destination to the canvas and connect a Data Flow Path from the Lookup to it. Open the editor and configure as shown:

Close the OLE DB Destination editor and let's run this! It fails as shown:

Why? That's an excellent question! Let's add some Data Viewers to find out

First, SSIS2.STRID is the primary key. Since the database is case-insensitive it believes 'jkl' is the same as 'JKL' and therefore we are trying to load a duplicate primary key into the table. The technical term for this is "bad" (like crossing the streams in GhostBusters).

The Data Viewers reveal there are four rows flowing in from the Source Adapter, but only one is making it through the Conditional Split (Filter) - the 'jkl' row. Why is the Lookup matching on the other three rows and not this row?

SSIS is case-sensitive and the database is not.

The Lookup Transformation is accomplishing an operation similar to (but not exactly like) a Left Join. If we execute the following T-SQL in SSMS we'll see something similar to the results of the Lookup:

select *
from SSIS1
left join SSIS2 on SSIS1.StrID = SSIS2.STRID

The results in SSMS are:

StrID Name            Value       STRID Name            Value
----- --------------- ----------- ----- --------------- -----------
abc   One             1           abc   One             1
def   Two             2           def   Two             2
ghi   Three           3           ghi   Three           3
jkl   Four            4           JKL   Four            4

Note the join works between the keys 'jkl' and 'JKL' in T-SQL, but not in SSIS.

How Do You Fix It?

Great question. There are a couple approaches - this one works for me.

In the OLE DB Source Adapter I add a field to the query - Upper(StrID) as JoinField:

Why not simply apply the Upper() function to the StrID column in the original query? I want to preserve the case of the actual data during the load. I see no need to alter the case of the data to accomplish a join. I want that data - as is - loaded. And I want my Lookup to work. And I want a pony, but that's another post...

I perform a similar change to the SQL Query in the Lookup: 

The major change is accomplished on the Columns tab of the Lookup where I use the newly added JoinField to match rows from the SSIS1 table (already in the Data Flow's data stream) to rows returned from the SSIS2 table:

Why don't I return the value of JoinField from the Available Lookup Columns? Again, I'm not loading that data. I need that column only to make my join work.

Executing now brings the desired results:

Conclusion

This is one way to work around the case-sensitivity of SSIS when loading case-insensitive data. Do you have a different (or better) way? Please let me know!

:{> Andy

Published Friday, May 28, 2010 8:00 AM by andyleonard

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

jorg said:

Hi Andy,

I've blogged about this a while ago and another way to "fix" this is to set the CacheType property of the Lookup to partial or none. This way not SSIS but SQL Server does the lookup comparison. You find my blog post here: http://sqlblog.com/blogs/jorg_klein/archive/2009/04/14/ssis-lookup-is-case-sensitive.aspx

By the way: I think also using UPPER like you describe is the best approach.

May 28, 2010 8:02 AM
 

unclebiguns said:

Excellent post!  I like the idea of a join field.  I have not done that, but will likely make the change as I work with SSIS in the

future.

May 28, 2010 9:39 AM
 

ssis newbie said:

Hi,

I have done the lookup transformation by using upper() function. but by doing so, i lost the original case of the value.

With your idea, the original case can be retained. thanks for sharing the idea.

September 7, 2011 11:51 PM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

My Company


Other Blog

Check out my personal blog...
http://andyleonard.me

Contact Me

Twitter: @AndyLeonard
Email: andy.leonard@gmail.com

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement