THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

Should we be able to deploy a single package to the SSIS Catalog?

My buddy Sutha Thiru sent me an email recently asking about my opinion on a particular nuance of the project deployment model in SQL Server Integration Services (SSIS) 2012 and I’d like to share my response as I think it warrants a wider discussion. Sutha asked:

Jamie

What is your take on this?
http://www.mattmasson.com/index.php/2012/07/can-i-deploy-a-single-ssis-package-from-my-project-to-the-ssis-catalog/

Overnight I was talking to Matt who confirmed that they got no plans to change the deployment model.
For example if we have following scenrio how do we do deploy?

Sprint 1
Pkg1, 2 & 3 has been developed and deployed to UAT. Once signed off its been deployed to Live.

Sprint 2
Pkg 4 & 5 been developed. During this time users raised a bug on Pkg2. We want to make the change to Pkg2 and deploy that to UAT and eventually to LIVE without releasing Pkg 4 &5.
How do we do it?

Matt pointed me to his blog entry which I have seen before . http://www.mattmasson.com/index.php/2012/02/thoughts-on-branching-strategies-for-ssis-projects/

Thanks
Sutha

My response:

Personally, even though I've experienced the exact problem you just outlined, I agree with the current approach. I steadfastly believe that there should not be a way for an unscrupulous developer to slide in a new version of a package under the covers. Deploying .ispac files brings a degree of rigour to your operational processes.

Yes, that means that we as SSIS developers are going to have to get better at using source control and branching properly but that is no bad thing in my opinion. Claiming to be proper "developers" is a bit of a cheap claim if we don't even do the fundamentals correctly.

I would be interested in the thoughts of others who have used the project deployment model. Do you agree with my point of view?

@Jamiet

Published Tuesday, October 16, 2012 10:18 AM by jamiet

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Chris Nelson said:

It seems like one of the main difficulties is that source control is designed for human readable text files and SSIS packages are XML and their functionality and configuration isn't separate.

October 16, 2012 7:05 AM
 

jamiet said:

Hi Chris,

Interesting perspective. I have to say that I don't really agree that source control is for human-readable files - putting XML files into source control seems, to me, like a perfectly acceptable thing to do.

Putting that aside tho, why is the fact that SSIS packages are, under the covers, XML files an issue here? I'm afraid I'm not really understanding the point that you're making. My fault I'm sure, can you clarify?

regards

Jamie

October 16, 2012 8:16 AM
 

D. Pendleton said:

I don't like the idea of deploying packages 4 and 5, but there's really no harm in doing so, I suppose. Even if they aren't working (the packages) they aren't going to execute because they aren't scheduled via the SQL Agent or included in some other means of executing them (e.g. Powershell script or a cmd file). When I encounter this, I delete the packages from the server that were not intended to be deployed.

October 16, 2012 11:19 AM
 

Phil said:

Jamie,

I think what Chris is getting at is that performing a diff or particularly resolving a merge confilct is a_lot_easier in a human readable file (eg .cs or .sql) than in an xml file - particularly when the xml file contains design time logic (ie layout) and runtime logic (ie the actual operations to be performed) as a .dtsx does. I kind of agree, I don't think I've ever successfully performed a manual merge on two SSIS packages using a file compare, I've always resorted to the package designer to do this.

As far as deployment goes I agree that BI developers should do it 'properly' - and in fact it really isn't that difficult. We're still using 2008 R2 in our shop, but we release_all_packages from all projects, every time (in fact the first thing we do is delete all of the packages on the target server).

In the scenario above we would have our 'live' branch separate - so live fixes go in there - and the new packages don't exist, so there's no issue when we re-release to live. The fixes then get merged into the 'development' branch to keep it up-to-date. Usually this can be easily done using a TFS merge. If there have been changes to Package 2 in the meantime however, then you get conflicts and thats were things get trickier, as above.

October 17, 2012 4:53 AM
 

Koen Verbeeck said:

Are there any good articles/blog posts on these branching tactics for SSIS? I'm very interested in learning more about this approach.

October 17, 2012 5:24 AM
 

jamiet said:

Hi Phil,

Thanks for the clarification. Yes, if you're branching then merging will, eventually, be a necessity. Point taken.

I still don't agree that "source control is designed for human readable text files" though :)

JT

October 17, 2012 5:28 AM
 

Chris Nelson said:

Jamie,

Phil elaborated on the point I was trying to state. And yes, source control was designed for human readable text files (C, C# and other programming sources), to make merges possible. (See git and Mercurial for modern examples.) It's just when  XML and binaries are added to the repository, it gets tricky.

October 17, 2012 10:07 AM
 

jamiet said:

Hi Chris,

Oh I certainly don't deny that its tricky to manage XML files in source control. No disagreement there.

Just wasn't sure about your assertion that source control is intended primarily for pure text files (which I think is what you were saying) - I would have thought its more accurate to say that its mainly for none-binary files. Happy to be wrong about this tho.

Another side of the debate is to say that all XML files *are* text files. But its a bit of a pointless debate so I won't go there :)

thanks again for the comments.

JT

October 17, 2012 10:15 AM
 

jamiet said:

Kris,

I don't know of any articles on branching tactics for SSIS. To be honest, I'm  not sure that a branching strategy that is suited for SSIS is what we should be aspiring to. A branching strategy should be agnostic of the technology being branched, should it not?

Just a thought.

JT

October 17, 2012 10:17 AM
 

jamiet said:

P.S. Koen, I know your name isn't Kris. God knows why I wrote that. Sorry :)

October 17, 2012 10:36 AM
 

Phil said:

Jamie,

I don't know if its xml that is really the problem here - its not really that difficult to read (probably easier than HTML!), but the decision taken early on in SSIS to mix presentation logic and functional logic in the same xml file. The same is true of SSAS files. This meams the smallest change to layout triggers a check-out, when actually the package is doing exactly the same job as it was before. I know there are XSLT files available online that attempt to simplify the xml to show only the functional logic - but using these is totally unsupported.

Koen,

I think Jamie is right in that you shouldn't have to tailor a branching strategy to a specific technology. However, SSIS and SSAS do present their own unique challenges. I also think some of the generic source control examples are a bit ott for a basic BI project. Given the interest shown here I'll try and post something in the next few days about how we manage things on my blog here http://phil-austin.blogspot.com/

October 18, 2012 8:40 AM
 

jamiet said:

Hi Phil,

"This meams the smallest change to layout triggers a check-out, when actually the package is doing exactly the same job as it was before"

Yes, I absolutely agree with this. I would be very happy if function and presentation were separate.

"Given the interest shown here I'll try and post something in the next few days about how we manage things on my blog here"

Awesome, I'd love to read that. Let us know if and when you post it.

regards

Jamie

October 18, 2012 8:46 AM
 

Andrej Kuklin said:

October 24, 2012 11:00 AM
 

jamiet said:

Andrej,

Completely agree with those. Didn't even click through - I can tell from the URLs what they're about that I agree.

People have been asking for those things for years to be honest. I'm eternally hopeful that they get sorted one day.

Going to go vote now.

Jamie

October 24, 2012 12:13 PM
 

Phil said:

Possibly stretching the meaning of 'a few days' but I've finally put some thoughts and tips down about SSIS and source control here http://phil-austin.blogspot.co.uk/2012/11/ssis-release-and-source-control.html.

November 4, 2012 4:28 PM
 

Kevin said:

In my former job we had SQL Server 2008 and 100 or so SSIS packages running on a daily basis, with 7 data architects developing those packages. We generally had one SSIS package per table load and with rare exception kept each package isolated so changing one package only affected that package and a single table. So the concept of deploying more than one package to production at once is mostly lost on me. In 5 years of a very well-functioning environment that just didn't come into play.

We used projects in SSIS to organize our packages and other objects related to a data mart. It worked great. You go into the Oncology project (this was a healthcare provider environment) and you see all the packages that populate Oncology tables, as well as the Oncology data mart data model and even the Business Objects universe if we had one. If you needed to add a field to a table, you'd check out that package, make your changes and test, then deploy that package to prod. There's no need to deploy all the other packages because they have nothing to do with the change I just made.

I'm in a new job now and we're looking at 2012. I like the idea of the easy one-click deployment, but I don't like deploying unrelated objects at the same time. So for you purists out there, say you had 100 SSIS packages that don't relate to each other. Would you create a separate project for each package? How do you organize all those projects so they are easy to find, and so you can keep packages on a similar topic organized together? Is it just through folder structures on the file system?

It also seems to me that the discussions around this topic are more on the basis of pure software development and less on the use of an ETL tool if that makes sense. My team is made up of ETL tool users, not software developers, if that helps the discussion at all.

January 31, 2013 11:28 AM
 

jamiet said:

"So for you purists out there, say you had 100 SSIS packages that don't relate to each other. Would you create a separate project for each package? How do you organize all those projects so they are easy to find, and so you can keep packages on a similar topic organized together? Is it just through folder structures on the file system?"

Hi Kevin,

I wouldn't call myself a purist but I'll attempt an answer to those.

"Would you create a separate project for each package?"

No I wouldn't, but then I probably wouldn't create just one project containing 100 packages either. I would hope there would be some sort of logical separation where it makes sense to group some packages into a single project.

"How do you organize all those projects so they are easy to find, and so you can keep packages on a similar topic organized together?"

As I just alluded I would keep packages that are "on a similar topic" in the same project. I'm a little confused though because earlier in that paragraph you said that in this hypothetical scenario the packages "don't relate to each other", now you say "packages on a similar topic". Those two statements seem rather contradictory. Apologies if I have misunderstood.

"Is it just through folder structures on the file system?""

Given how SSIS2012 projects are organised all packages from the same project are in a dedicated folder anyway so I'm not sure there is any difference between the two. In other words, grouping some packages into a project implicitly puts them into a different folder.

Hope that helps

JT

January 31, 2013 11:46 AM
 

Kevin said:

Sorry I wasn't clear on what "related" and "similar topic" meant. I'm trying to make the distinction between packages of a similar subject (like packages that populate tables in the Oncology data mart) that are convenient to put in the same project with other related objects (data models etc) so you can see them all together, versus packages that are dependent on each other in some way and therefore need to be deployed to production together in order to work properly.

I like using projects to organize, but the objects in the project aren't dependent on each other and therefore don't need to be deployed together. If I'm working on two packages in a project (again, same topic, but not dependent on each other), I don't want my in-development package to get deployed yet just because the other package is ready. And so if that non-dependence means those packages should actually be in separate projects (which is what I understand is being suggested here) then I don't love the idea of 100 separate projects since almost all my packages are non-dependent.

January 31, 2013 12:07 PM
 

Emil said:

Hi

This is a comment to Phil's blog post.... which I would do on Phil's blog but I am no longer authorized to do so :D I apologize I didn't re-write it and direct it to everyone but feel free to post your thoughts and experience on this uneasy subject.

"Hi Phil,

I don't think we discussed that at MITIE ;) so I thought I'll catch up with your thoughts on this subject.

First of all I hate branching and merges! and I don't do them if I don't need to (but I do create release files). Obviously that works only to a certain point and may require development freeze which works fine for smaller environments with one dev team.

Larger projects/teams this is a bit of an issue especially with fixes and I like the idea of having a branch and applying a fix to branch + merge with core before next release so monitoring changes is key here.

But what do you suggest for situations when new development is done on core/root and you want to perform new release but only using part of new development?

Would you create a copy of latest branch and bring in new packages you want from core/root (which sounds like a sensible idea) or would you do it another way?

But this also effectively means that your core/branch may often be out of sync with latest branch.

Maybe Matt Masson suggestion to have 3 cores (or more) is a good one? Dev, integration, release? So dev is all dev and then is pushed to integration and release and I presume you would still perform branching on release, maybe integration and sometimes on dev core? and I presume fixes would be release fix + dev core fix at the same time (potentially with new development).

Last thing is situations where you have two streams of work (two teams) and you want to modify the same package for instance adding new dimensions attributes. How would you suggest to handle this situation? Personally this situations for me sounds like an exception and I would just combine two stream of work into one (or at least related aspects!).

I look forward to hearing your point of view.

Take care

Emil"

October 8, 2013 3:13 PM
 

Paul T. said:

We use Git to version all our projects. Everything VisualStudio generally works nicely, although project files can sometimes be a bit of a source of merge conflicts...except SSIS projects. The various proj and other definition files are definitely not DVCS-friendly. In fact, we started trying to write a pre-processor for this that devs would have to run before doing a merge.

Phil said "I know there are XSLT files available online that attempt to simplify the xml to show only the functional logic"

Could someone point some out to us? It's obviously hard to google for "SSIS XSLT project file" and get relevant results.

January 15, 2014 8:35 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement