THE SQL Server Blog Spot on the Web


Alberto Ferrari

Threads and custom components: FlowSync 1.0

Several people have downloaded TableDifference to handle SCD faster. Some of them, especially those using it on huge tables (more than 10 million rows), noticed memory problems. The cause is one flow running too fast, which makes TableDifference cache data. We knew about the issue and have now decided to solve it by creating a new component called "FlowSync". You can find all the details and source code here.

The article includes a brief discussion of how SSIS handles the ProcessInput method of a component with more than one input. Here is an extract:

As you may already know, ProcessInput is called once for every buffer. In a component with two or more inputs, like TableDifference or Union All, the method is called once for each buffer of each input, so the inputs are mixed together and handled by the same method. Before deciding to develop FlowSync, we tried to synchronize the inputs by using semaphores to stop the faster input inside the ProcessInput method. It would have been a nicer solution, BUT ProcessInput is called on only ONE thread, even if the component has two input flows. So, if ProcessInput blocks, all the inputs of the component are stopped and the system ends up in a deadlock.
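The deadlock described above can be reproduced outside SSIS with a small simulation. The sketch below is not SSIS code: `run_single_threaded_dispatch` and `process_input` are hypothetical names, the two inputs are simply called "A" (fast) and "B" (slow), and a timeout stands in for what would be an endless wait in a real component.

```python
import threading


def run_single_threaded_dispatch(buffers, timeout=0.2):
    """Deliver buffers from two inputs on ONE thread, as the extract describes.

    `buffers` is the mixed sequence of (input_name, buffer_no) pairs.
    Returns (delivered, stalled) lists of pairs.
    """
    # Permit that lets the fast input "A" proceed; released only when "B" arrives.
    a_may_proceed = threading.Semaphore(0)
    delivered = []
    stalled = []

    def process_input(input_name, buffer_no):
        if input_name == "A":
            # Attempted fix: hold the faster input until the slower one catches up.
            # A real wait would have no timeout and would block forever.
            if not a_may_proceed.acquire(timeout=timeout):
                stalled.append((input_name, buffer_no))
                return False
        else:
            a_may_proceed.release()
        delivered.append((input_name, buffer_no))
        return True

    def dispatcher():
        # The single thread that calls process_input for every input.
        for input_name, buffer_no in buffers:
            if not process_input(input_name, buffer_no):
                break  # in the real case the thread would never come back

    worker = threading.Thread(target=dispatcher)
    worker.start()
    worker.join()
    return delivered, stalled


if __name__ == "__main__":
    # "A" arrives first and waits for "B", but "B" is queued behind it
    # on the same thread, so nothing is ever delivered.
    print(run_single_threaded_dispatch([("A", 1), ("B", 1)]))
```

The buffer for "B" that would release the semaphore can only be delivered by the very thread that is waiting on it, which is why blocking inside the method cannot work.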

This is very strange, because each flow runs in a separate thread, yet the two threads seem to synchronize on a single one when they need to pass data to the component. So the solution was to apply the sync technique where we still have separate threads, that is, directly on the flows, with a transformation component: FlowSync.
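To illustrate why synchronizing on the flows themselves avoids the deadlock, here is a minimal sketch in which each flow runs on its own thread and waits on a shared gate before sending a buffer. It is only an illustration of the idea, not FlowSync's actual implementation: `FlowGate`, `run_flows` and `max_lead` are hypothetical names, and `max_lead` is assumed to be at least 1.

```python
import threading


class FlowGate:
    """Hypothetical gate keeping two flows within `max_lead` buffers of each other."""

    def __init__(self, max_lead=1):
        self.max_lead = max_lead  # assumed >= 1, otherwise both flows would block
        self.sent = {"A": 0, "B": 0}
        self.done = {"A": False, "B": False}
        self.cond = threading.Condition()
        self.max_seen_lead = 0  # largest gap observed while both flows were running

    def before_send(self, name):
        other = "B" if name == "A" else "A"
        with self.cond:
            # Block this flow's OWN thread while it is too far ahead of the other.
            while (not self.done[other]
                   and self.sent[name] - self.sent[other] >= self.max_lead):
                self.cond.wait()
            self.sent[name] += 1
            if not self.done[other]:
                lead = abs(self.sent["A"] - self.sent["B"])
                self.max_seen_lead = max(self.max_seen_lead, lead)
            self.cond.notify_all()

    def finish(self, name):
        # A finished flow must release the other one, or it could wait forever.
        with self.cond:
            self.done[name] = True
            self.cond.notify_all()


def run_flows(n_a, n_b, max_lead=1):
    """Run two flows of n_a and n_b buffers, each on its own thread."""
    gate = FlowGate(max_lead)

    def flow(name, count):
        for _ in range(count):
            gate.before_send(name)
        gate.finish(name)

    threads = [threading.Thread(target=flow, args=("A", n_a)),
               threading.Thread(target=flow, args=("B", n_b))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gate.sent, gate.max_seen_lead
```

Because the waiting happens on the thread of the flow that is ahead, the slower flow keeps running and eventually wakes it up, so neither side starves.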

In the next version of SSIS, I would really like to be able to decide, when developing a component, whether ProcessInput should be called in a multithreaded environment or not. My personal opinion is that programs become easier to write and maintain when they use threads, and TableDifference may be a good candidate to demonstrate this.

Published Thursday, July 06, 2006 9:48 AM by AlbertoFerrari

About AlbertoFerrari

Alberto Ferrari is a Business Intelligence consultant. His interests lie in two main areas: BI development lifecycle methodologies and performance tuning of ETL and SQL code. His main activities are with SSIS and SSAS for the banking, manufacturing and statistical sectors. He is also a speaker at international conferences such as the European PASS Conference and PASS Summit.