THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Eric Johnson

My Thoughts on SQL Server and Database Related Technologies.

Managing Lookup Cache in SQL Server 2008

The Lookup transformation in SSIS has changed a lot in SQL Server 2008. One of the best new features is the ability to pre-build your cache which gives you a lot of control over what is cached and how the cached data is managed. The basic lookup offers additional cache features such as Full, Partial, or even No Cache, but the real power comes in the new Cache Transformation. The Cache Transformation uses the new Cache Connection manager to allow you to build your cache before it is needed. The connection manager and the Cache Transformation are pretty simple. For the Cache Manager Connection, you just need to specify the columns that will be stored and which columns will be indexed, as shown below. The indexed columns will be the only columns you can use as a lookup column in a Lookup Transformation. 

clip_image002

Once you have the connection manager configured, you just use a Cache Transformation to pump data into the cache. The Cache Transformation is very simple; you specify a Cache Connection manager and set up the mapping of the incoming data to the columns in the cache. Last but not least, you pull data from your source and send it into the Cache Transformation. I have a very simple example data flow below which is populating a lookup cache; and the source in this example uses the T-SQL shown.

SELECT ContactID, EmailAddress
FROM Person.contact
WHERE EmailAddress IS NOT NULL
AND EmailPromotion <> 0

clip_image004

This allows me to populate my cache with all email addresses that are not blank, where the person has not opted out of email promotion. Remember, the data flow to populate your cache can be as complex as you want. This can really allow you to pare down large sets of data to a smaller, easier to use data set that is appropriate for caching.

Once you have populated your cache, all you have left to do is use it in a Lookup Transformation. On the first page of your Lookup Transformation properties, specify a Connection Type of Cache Manager and on the connection page, specify your Cache Connection Manager object. Everything else is just like using the Lookup with a regular data source. You will need to set up your mappings and the column (or columns) that you want to return. One other note: when using a Cache Connection Manager with a lookup, you have to select Full Cache as the cache mode. The other two are disabled and for good reason. You have taken control of your cache at this point and don’t want SSIS doing anything further.

So there you have it, managing your own lookup cache should allow you to make your SSIS packages a little more streamlined and allow them to run a bit faster. Not to mention, you should be able to minimize the memory hogging packages that occurred when older lookups got a little out of hand.

Published Tuesday, September 29, 2009 2:59 PM by ejohnson2010
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Michael said:

Eric,

What if I need to run multiple successive passes using the lookup cache connection and with each pass i want to update or insert new rows to the cache so the the next pass will have the most updated data in the cache?

January 6, 2010 12:22 PM
 

ejohnson2010 said:

You would need to rebuild your cache in the second and subsequent passes.

January 6, 2010 7:28 PM

Leave a Comment

(required) 
(required) 
Submit

About ejohnson2010

Eric (Microsoft SQL MVP) is the co-founder of Consortio Services, and the primary Database Technologies Consultant. His background in Information Technology is diverse, ranging from operating systems and hardware to specialized applications and development. He has even done his fair share of work on networks. Since IT is really just a way to support business processes, he has also acquired his MBA. All in all, he has 10 years of experience with IT, a great amount of which has been working with Microsoft SQL Server. Eric has managed and designed databases of all shapes and sizes. He has delivered numerous SQL Server training classes and webcasts as well as presentations at national technology conferences. Most recently, he presented at TechMentor on SQL Server 2005 Replication, Reporting Services, and Integration Services. In addition, he is active in the local SQL Server Community, serving as the President of the Colorado Springs SQL Server Users Group.
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement