As a DBA, I often find myself backtracking when it comes to newer technologies and how they affect SQL Server. In 2006, if you had asked me about SQLCLR, my response was adamantly, "Not in my server." Ask me about it today, and I'll start by asking why you think you need to use it, and then make a more informed decision about whether SQLCLR is a good fit for the problem. What changed? I found that a lot of SQLCLR questions in the forums were going unanswered, so I decided to break out Visual Studio and begin learning everything I could about SQLCLR. Do I know it all? Not even close, but I did learn a lot about where SQLCLR is a good fit, and when it should not be used.
It is all too common for DBAs to simply say "not in my database" or "not in my server" when they don't understand how things work, or where they might be a good fit to solve a problem. Case in point: the current adoption rate of SQLCLR in SQL Server 2005 and 2008. To that point, I see a lot of bad implementations of SQLCLR happening, but recently I started a series on SQLServerCentral.com on using SQLCLR to replace xp_cmdshell in SQL Server, and in most cases the feedback mimics my original sentiment, even though I made certain to follow established best practices in creating a secure/safe implementation of SQLCLR using signed assemblies instead of setting TRUSTWORTHY ON.
Until recently, I took the same stance on LINQ to SQL that I had taken on SQLCLR: it wasn't going to be used in my environment. In some consulting work I have done, I have seen more than once performance problems that stemmed entirely from a client's use of LINQ to SQL, and in most cases they involved implicit conversions caused by datatype mismatches between table columns typed as varchar and the resulting LINQ to SQL queries, which always parameterize strings as nvarchar. Something I learned early on in software development was that development always seemed to sit on a triangle that one of my mentors drew on the board during a meeting where we were discussing our ever-increasing delivery timelines.
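To make the implicit-conversion problem concrete, here is a sketch in T-SQL of what LINQ to SQL's parameterization can do to a varchar column. The table and index names are illustrative, not from any real client system:

```sql
-- Hypothetical table with a varchar column and an index on it.
CREATE TABLE dbo.Customers
(
    CustomerID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    LastName   varchar(50) NOT NULL
);
CREATE NONCLUSTERED INDEX IX_Customers_LastName
    ON dbo.Customers (LastName);

-- LINQ to SQL parameterizes .NET strings as nvarchar, so the generated
-- query arrives looking roughly like this. Because nvarchar has higher
-- datatype precedence than varchar, SQL Server implicitly converts the
-- column side of the predicate, which can prevent an index seek on
-- IX_Customers_LastName and force a scan instead.
EXEC sp_executesql
    N'SELECT CustomerID FROM dbo.Customers WHERE LastName = @p0',
    N'@p0 nvarchar(4000)',
    @p0 = N'Smith';
```

Comparing the execution plan of this parameterized form against the same query with a varchar parameter is a quick way to see the difference for yourself.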
At any point in time, the business can only have two sides of the triangle: it is either fast and cheap, but quality is lacking; fast and good, but you are going to pay for it; or cheap and good, but it won't happen as fast. It's an interesting paradigm, and to this day I can see the truth behind this simple drawing.
Until recently, I would have placed LINQ to SQL in the bottom left corner of this triangle; that is to say, using it will be fast because it writes the code for you, and it will be cheap because a developer can crank out code rapidly without having to switch into TSQL programming mode to handle data access, but it will not be a good product in the end. I had the extreme pleasure of doing technical editing for Paul Nielsen's SQL Server 2008 Bible, and one of the chapters I edited covered LINQ to SQL. In the past I have been openly against LINQ to SQL for a number of reasons, including that it often created performance issues where regular TSQL wouldn't have had problems. However, after reviewing the chapter and running a bunch of tests, I noted a couple of key things that made LINQ to SQL seem more usable than I had previously given it credit for, so I decided to put some time into learning a good bit more about LINQ to SQL and how it works.
To be perfectly clear, I have no problems with LINQ as an addition to the .NET Framework. I have used it extensively, especially LINQ to XML, and the ability to query object collections in .NET is simply awesome. In fact, the Extended Events Manager library relies heavily on LINQ to consume information from the database engine about Extended Events.
It just so happened that we had an opportunity to give LINQ to SQL a whirl at work not long after I completed the work on Paul's chapter and my own extended testing. Without providing too much information, we basically had to retool a new application against an existing database, and we had to do so very quickly. I sat in on a meeting with the business sponsors, a project manager, the application development lead for the project, and the development manager, and the key question that kept being asked was how fast the project could be coded and ready for use; the sooner the better. Since time was of the essence, and it only needed to be a quick, cheap solution that could be revisited later, I threw LINQ to SQL on the table as an option, surprising the developer since I had been adamantly against its use in our environment.
So what motivated this decision?
Well a few things:
- It had to be developed fast.
- The number of tables involved was quite small.
- I could change table structures to work best with LINQ to SQL.
- There was a finite set of actual queries possible from the application.
- The data is completely read-only, so I had free rein to create as many indexes on the tables as needed to satisfy the queries generated by LINQ.
I spent the remainder of the afternoon after the meeting changing the table schemas to use Unicode datatypes, duplicating the data into a new database, and providing the table schemas to the developer. The next day, the developer was able to write the necessary functionality for the application and begin testing it. Immediately, performance problems began to surface, which was a bit concerning to him at first. I expected this completely, because I didn't create any indexes on the multi-million row tables when I transferred them into the new database. Rather than duplicate the existing ISV design for the database, which was a single-column index on every column of every table (don't ask me why; I didn't create it), I initially chose to use only a clustered index on the primary keys, and then create indexes only as needed by the application. In the end, I had to create three covering indexes on the primary table, two on one supporting table, and two on a second supporting table. Even using Unicode, the database is about half the size of the original because the indexing actually makes sense.
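The schema and indexing approach above can be sketched in T-SQL. The table and column names here are purely illustrative stand-ins for the real application tables:

```sql
-- Column typed as nvarchar so it matches the nvarchar parameters that
-- LINQ to SQL generates, avoiding implicit conversions entirely.
CREATE TABLE dbo.OrderDetail
(
    OrderDetailID int NOT NULL,
    OrderID       int NOT NULL,
    ProductName   nvarchar(100) NOT NULL,
    Quantity      int NOT NULL,
    CONSTRAINT PK_OrderDetail PRIMARY KEY CLUSTERED (OrderDetailID)
);

-- A covering index for one of the application's known query shapes:
-- filter on OrderID, return ProductName and Quantity. INCLUDE keeps
-- the index key narrow while still letting the index satisfy the
-- query without key lookups against the clustered index.
CREATE NONCLUSTERED INDEX IX_OrderDetail_OrderID
    ON dbo.OrderDetail (OrderID)
    INCLUDE (ProductName, Quantity);
```

Because the set of queries the application could generate was finite and the data read-only, a handful of indexes like this one could be targeted precisely at those queries rather than indexing every column defensively.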
Well, nobody is complaining, including me. I was initially concerned about plan cache bloat, but after a week of steady usage, I haven't had any problems yet.
So it raises the question: would I use LINQ to SQL in a major application? Probably not, but I certainly wouldn't write it off completely. One of my big problems with this most recent implementation is that it uses the dynamic SQL generated by LINQ to SQL rather than stored procedures. This was an accepted trade-off in building it fast, but it doesn't have to be built this way. We could have used stored procedures, which maintain the security model that DBAs love and hold dear. In that case, LINQ to SQL is not much different from using TableAdapters and typed DataSets in a .NET application, which is currently an accepted programming practice.
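The stored-procedure alternative is worth sketching. LINQ to SQL can map a procedure like the hypothetical one below to a method on the DataContext, so the application keeps strongly typed access while the DBA grants EXECUTE permission only, with no direct table access required:

```sql
-- Illustrative procedure; the table and parameter names are examples,
-- not from the actual application described above.
CREATE PROCEDURE dbo.GetOrderDetailsByOrder
    @OrderID int
AS
BEGIN
    SET NOCOUNT ON;

    -- The parameter is strongly typed here, so there is no nvarchar
    -- guessing by the client library and no implicit conversion risk.
    SELECT OrderDetailID, ProductName, Quantity
    FROM dbo.OrderDetail
    WHERE OrderID = @OrderID;
END
```

This keeps the query text under the DBA's control and preserves the familiar permission model, at the cost of some of the speed that made LINQ to SQL attractive for the quick build in the first place.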
Before writing off a new technology as a DBA, make sure you give it a thorough review. I know that after my experiences with SQLCLR and LINQ to SQL, I will certainly be more attentive to keeping up with emerging technologies and identifying where they may or may not be a good fit to solve problems. I'll be posting further information on LINQ to SQL as I continue my investigation into it as a technology and determine how not to use it, as well as where it might be a good fit for development.