THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

A potted SSIS history via Connect

Yesterday I received a notification that one of my submissions to http://connect.microsoft.com had received a comment. Nothing unusual about that, I receive those sorts of emails every day, on this occasion however the comment was actually worth reading.

The comment was from SSIS development guru Jeff Bernhardt (he has changed job titles recently but I don’t know the new one so “dev guru” will have to do) and was posted in reply to my request for a single type system in SSIS (which I also blogged about last month at Have SSIS' differing type systems ever caused you problems?). Jeff’s comment provided a rather unique insight into some of the machinations within the SSIS product team as they go about triaging our many Connect submissions and also gave an interesting potted history of the SSIS product itself. For these reasons I thought it would be worth highlighting the comment to the larger SSIS community so with Jeff’s permissions I have copied the comment below (you can see the original at [SSIS] Consolidate three type systems into one)


The three type systems make me crazy too. They make all of the developers working on the guts of SSIS crazy. The history is interesting so I will share it:

The first part of SSIS that was built in the proto-days of 1999 was the replacement for the DTS import export wizard. I know, hard to believe. This was the first ‘host’ for the dataflow pipeline. The pipeline was built to use the OLEDB type system to make reading and writing from OLEDB really fast. Our internal buffers are really just OLEDB bound memory layouts. The DT_Foo type system is an exact copy of the OLEDB type system with some extra types added for SSIS specific use. That type system is built in deep to the dataflow and all transforms.

Next to arrive was the Runtime, we needed a real host for pipelines and a way to coordinate activities. In those days, the way software was built and extended here at Microsoft was to use COM and OLE Automation friendly interfaces. We expected that most folks that embedded and extended SSIS (then still called DTS) was to use native code and COM interfaces. Naturally, that is how the Runtime was designed and built; the COM type system is pervasive and runs deep inside the native Runtime. This is why we see the VARIANT types with BSTRs, SAFEARRAYS, etc. (as a fun note, the VARIANT type was design to copy the internal type system of Visual Basic, a VARIANT is a Var )

When it came time to build a designer for these interfaces (and believe it or not, the designer started a year or so after the internals were working) we took a bet on managed code and c#. This was a bet at the time; managed code was new and un-proven. There was a lot of anxiety. The UI is all managed code and so its chock full of the new and fancy CLR type system. We expected that some people might use managed code to host or extend SSIS so a lot of ‘wrappers’ were put together to make that work. Of course the seams on the type system show through.

So here we are with three type systems. Getting rid of the DT_Foo system would mean re-writing the pipeline (probably in managed code) and putting in a big shunting system to maintain backward compatibility with older transforms.

Getting rid of the VARIANT type system would mean re-writing the Runtime (probably managed) and providing some shunting and upgrade system for old packages and tasks.

Re-writing the UI is crazy talk, and would still need all of the back-compat layers put in.

So as much as we hate where we are, fixing this issue is a HUGE undertaking. For instance, it may have been the only work we took on for Denali. Certainly we thought about it, but the value delivered to customers of this one change is low when compared to all of the other things we could do for the same effort.

What does the future hold? It is hard to imagine us ever investing the effort to fix the SSIS type problem, mostly because of all of the backward compatibility issues that it would create. As we look to new problem spaces and new ways of solving problem in the cloud world, I expect the CLR type system will become the common language of data movement for us.

We will remember the pain that our short sighted decision making in 2001 unleashed on our devoted developer friends and we will strive to avoid making these mistakes again.

-Jeff Bernhardt


Published Thursday, April 07, 2011 4:12 PM by jamiet

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Cliff said:

It's nice to have an explanation of why it's the way it is. I'm sure we can all relate to projects where we made design decisions that at the time seemed like the right thing to do. However, I do still think they could throw a facade on top of it so everything looks the same to the user. If we have a class with a property of Name, the consumer sees it as Name. The class sees it as a string. It's an easy solution to a common complaint.

April 7, 2011 12:30 PM
 

andyleonard said:

Thanks for sharing this Jamie. Jeff is a class act.

:{>

April 7, 2011 10:00 PM
 

Roger W said:

It would still be helpful if MS would supply (with a huge signpost to it everywhere on MSDN, Technet etc), a complete chart of all data types used in SSIS (Including varaiable data tpes) & SQL Server, mapping their equivalents.

If you know of one, please supply the signpost (link):0).

Thanks for listening.

Roger

Shaftesbury (UK)

May 17, 2011 2:55 AM
 

SSIS Junkie said:

Once upon a time I blogged at http://consultingblogs.emc.com/jamiethomson but that ended in August 2009

October 25, 2011 4:04 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement