Adam Machanic, SQL Server Practice Lead for The Pythian Group, shares his experiences with programming, performance tuning, and optimizing SQL Server 2000, 2005, and 2008, in conjunction with related technologies such as .NET.
-
Next Thursday, May 8, the New England SQL Server Users Group will have a special meeting, featuring Craig Freedman from the SQL Server development team. Craig is The Man when it comes to query optimizer internals, and wrote an incredibly detailed chapter on the topic for "Inside SQL Server 2005: Query Tuning and Optimization". At the meeting next week, Craig will discuss some of what he talked about in the chapter, including the basics of how the query processor works and what iterators are. He'll cover the various operators you'll commonly see in query plans, and describe how they actually work internally.
This should be a great meeting, and we expect it to be very well attended. In order to help us figure out food and drink, in addition to securing enough chairs for the meeting room, we need you to RSVP if you're planning to attend. In order to RSVP, sign up for our mailing list. I will send out an e-mail next Tuesday, and you can RSVP by replying to it. Only attendees who RSVP will be eligible for our prize draw at the end of the night, so make sure to sign up for our list by Monday in order to guarantee that you don't get left out.
We would like to thank Red Gate Software, who made a very generous donation to the group that allowed us to have this special meeting. Red Gate makes some of my favorite SQL Server tools and provides a huge amount of community support in the SQL Server and .NET space, and we hope that you will give their products a try.
|
-
I was just reviewing my calendar for the next several weeks and noticed that
the Toronto SQLTeach conference is now
only a few weeks away. This conference includes quite a few interesting SQL Server-related
sessions, on topics ranging from best practices, to performance, to some of
the new SQL Server 2008 features. I fully expect this to be a great show.
I am doing two breakout sessions during the main conference:
- SQL Server 2005: Authorization, Privilege, and Access Control. In this
talk I cover SQL Server 2005’s enhancements around granting permissions via
stored modules (i.e., stored procedures, views, functions)
- Designing Highly Concurrent Database Applications. In this talk I get
into the business requirements behind supporting concurrent processes, and the
areas where SQL Server (and every other database product) falls short. I then go
on to show how to solve the problems in the database programmatically.
I am also doing a full-day post-conference session on SQLCLR
programming. This will be the first time that I will be presenting all
of my SQLCLR material in a single day; should be fun. I will take attendees from
the basics all the way through some advanced applications and techniques, so if
you’re interested in becoming a SQLCLR expert I highly recommend attending.
The conference starts in just three weeks, but it is not too late to register.
|
-
How creative are you with manipulating your queries to produce more efficient plans? Try the following puzzle and e-mail your solution to me at [<my last name> @ pythian.com]. Make sure to include an explanation of why it works, as well as your mailing address. The best two solutions/explanations win a free copy of Expert SQL Server 2005 Development, a wonderful feeling of accomplishment, plus eternal fame and glory when I reveal your solutions here on the blog.
Run the following T-SQL to create two tables in TempDB:
USE TempDB
GO
CREATE TABLE b1 (blat1 CHAR(5) NOT NULL)
CREATE TABLE b2 (blat2 VARCHAR(200) NOT NULL)
GO
INSERT b1
SELECT LEFT(AddressLine1, 5) AS blat1
FROM AdventureWorks.Person.Address
INSERT b2
SELECT AddressLine1 AS blat2
FROM AdventureWorks.Person.Address
GO
Now consider the following query:
SELECT * FROM b1 JOIN b2 ON b2.blat2 LIKE b1.blat1 + '%'
This query takes around three minutes to run on my notebook, and does over 1.8 million logical reads. Can you figure out a way to re-write it so that it performs better? No modification of the base tables or addition of any other objects is allowed (sorry, no indexed views!) -- the challenge is to tune this by doing nothing more than re-writing the query.
Good luck! I'll leave the contest open for submissions until May 1.
|
-
If you've read many of my blog posts, you know
that I consider lack of procedure cache control to be a major SQL Server pain point. Badly written apps
that use non-parameterized ad hoc queries can quickly flood SQL Server's memory
pools and bring the server to its knees.
SQL Server 2005 brought some relief in the form of
the Forced Parameterization database option, and SP2 took things one step
further with better throttling of the cache... but it's still not enough. We
want a knob!
The bad news: We're not getting quite the knob I
was hoping for.
The good news: SQL Server 2008 will include an
sp_configure option called "optimize for ad hoc workloads". This option
will cause the procedure cache to store only a small stub for an ad hoc
batch the first time it is run, rather than the full compiled plan. This means that
applications passing a large number of non-parameterized batches should see much
lower procedure cache memory utilization and, therefore, better overall
throughput. I'm really looking forward to seeing this in action; this feature
should be added with the next pre-release drop.
Remember, there is simply no substitute for
properly designing your application's data access layer, but hopefully this will
help for those applications that simply can't be changed...
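For reference, here is a minimal sketch of how the option could be switched on, assuming it ships under the name given above and is exposed as an advanced sp_configure option. The second query is only an illustrative way to see how much of the cache ad hoc plans occupy; the column aliases are my own.

```sql
-- Assumption: 'optimize for ad hoc workloads' is an advanced option,
-- so advanced options have to be made visible first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;

-- Rough look at how much memory ad hoc entries take up in the plan cache.
-- plan_count and total_kb are illustrative aliases.
SELECT
    cp.cacheobjtype,
    cp.objtype,
    COUNT(*) AS plan_count,
    SUM(CAST(cp.size_in_bytes AS BIGINT)) / 1024 AS total_kb
FROM sys.dm_exec_cached_plans AS cp
WHERE cp.objtype = 'Adhoc'
GROUP BY
    cp.cacheobjtype,
    cp.objtype;
```

Running the second query before and after enabling the option, under the same workload, should give a feel for whether it helps on a given server.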
|
-
I generally shy away from writing personal blog posts about my life, but when it comes to major career changes it's kind of fun to share the news. After almost three years as an independent consultant, I have stepped out of that business and into a full-time role with a company called The Pythian Group, a provider of remote DBA services.
Never heard of The Pythian Group? That's because although the company is big in the Oracle and MySQL worlds, it has just begun to make its move into the SQL Server arena. My job title is Global Practice Lead for the SQL Server practice, and I am tasked with, among other things, increasing the company's presence and business on this side of the fence. Fun stuff!
The Pythian Group has a very interesting business model that is at once both similar to what I knew as a consultant and totally foreign to the way I'm used to doing things. For example, as a consultant I generally billed my time in 15-minute increments; at The Pythian Group, time is billed in 1-minute increments -- great for the customer, not so great for the DBA. Luckily, they've built some tools to help ease the pain and after a week of totally hating the system I'm already starting to get used to it. (Sort of.)
The company is heavily focused on automated monitoring and 24x7 support delivered via a network of offices around the globe -- things that as an independent consultant I never would have been able to deliver. I'm really excited to watch things scale up from what is currently a relatively small SQL Server group to what I expect will eventually become a fairly large one. I'm especially interested in seeing how some of the SQL Server 2008 tools -- such as Policy-Based Management -- will affect the way we work at Pythian. As a consultant I was not especially enthusiastic about that feature. As a remote DBA, it suddenly makes perfect sense.
Anyway, thanks if you've read this far, and I'll end with a bit of a plug: if this stuff sounds at all interesting to you as a DBA, drop me a line. We're hiring. And don't let the 1-minute increment thing scare you (too much).
|
-
A lot of people will be interested to know that at the launch event in LA it was announced that the T-SQL debugger is returning to Management Studio in SQL Server 2008.
Personally, this is not a feature I've been lamenting the loss of; I never used it in SQL Server 2000, and unless we can view temp tables, table variables, etc, I just don't see it as something with a lot of utility for the way I personally develop T-SQL.
But this isn't just about me, and I know that there was a huge amount of interest in seeing the debugger come back into the core SQL Server tools. So congrats to all of the step debug fans out there; get your F10 keys ready for SQL Server 2008!
|
-
Relative comparison is a simple matter of human nature. From early childhood we compare and contrast what we see in the world around us, building a means by which to rate what we experience. And as it turns out, this desire to discover top and bottom, rightmost and leftmost, or best and worst happens to extend quite naturally into business scenarios. Which product is the top seller? How about the one that's simply not moving off the shelves? Which of our customers has placed the most expensive order? What are the most recent orders placed at each of our outlets?
In the world of common business questions, the edge cases are generally of most interest. What's in the middle is unimportant; it's often too difficult for the mind to compare and comprehend when there are hundreds, thousands, or even millions of items, transactions, or facts that are all within a similar range. Instead, we focus on those that stick out in some extraordinary way.
Those of us who work with SQL products on a regular basis are faced with solving this same problem time and again as we work through various business requirements. Over time, I have noticed four basic query patterns that can be used to solve the problem; each is logically equivalent (within certain restrictions -- more on that later), but they can have surprisingly different performance characteristics depending on the data being queried. In this first post, I will outline the available patterns/methods. In the following posts, I will show the results of testing each pattern against a variety of scenarios in an attempt to discover where and when each should be used.
The four basic patterns are outlined below. Each of the methods is illustrated using a query to show all customers' names, plus their most recent order date, and the amount of that order. I've included notes that indicate where logic differences can arise among the various methods.
Method 1: Join to full group and use correlated subquery with a MIN/MAX aggregate to filter
In this method we use an inner join to get all required columns, then filter the resultant set using a correlated subquery in the WHERE clause.
SELECT
    c.FirstName,
    c.LastName,
    o.OrderDate,
    o.OrderAmount
FROM Customers c
JOIN Orders o ON o.CustomerId = c.CustomerId
WHERE o.OrderDate =
(
    SELECT MAX(o1.OrderDate)
    FROM Orders o1
    WHERE o1.CustomerId = o.CustomerId
)
Logic notes: With this method ties are automatically included in the output, unless a tiebreaker is specified (which can be tricky given that you only have one column to work with). This method does not allow you to pull back an arbitrary number of rows, such as top 10 per customer; you are limited to the edge and any ties that might exist.
Method 1a: Join to full group and use correlated subquery with TOP(n) and ORDER BY to filter
This method is almost identical to Method 1 (which is why it is classified here as 1a), but the TOP and ORDER BY allow for a bit more flexibility than the aggregates.
SELECT
    c.FirstName,
    c.LastName,
    o.OrderDate,
    o.OrderAmount
FROM Customers c
JOIN Orders o ON o.CustomerId = c.CustomerId
WHERE o.OrderDate =
(
    SELECT TOP(1) o1.OrderDate
    FROM Orders o1
    WHERE o1.CustomerId = o.CustomerId
    ORDER BY o1.OrderDate DESC
)
Logic notes: With this method you can more easily integrate a tiebreaker than with Method 1; the comparison column can be anything, including a primary key, and you can still order on whatever column makes most sense. In addition, you can take more rows than with Method 1 by using IN instead of = in the WHERE clause, and increasing the argument value to TOP.
Method 2: CROSS APPLY to ordered TOP(n)
In this method, SQL Server 2005's CROSS APPLY operator is used. This operator allows us to essentially create a table-valued correlated subquery -- something that was impossible in previous versions of SQL Server. By using TOP in conjunction with ORDER BY we can get as many rows per group as needed.
SELECT
    c.FirstName,
    c.LastName,
    x.OrderDate,
    x.OrderAmount
FROM Customers c
CROSS APPLY
(
    SELECT TOP(1)
        o.OrderDate,
        o.OrderAmount
    FROM Orders o
    WHERE o.CustomerId = c.CustomerId
    ORDER BY o.OrderDate DESC
) x
Logic notes: This method is almost identical, from a logic point of view, to Method 1a modified to use IN on a primary key column. With both methods, WITH TIES can be added to the TOP in order to get ties.
Method 3: Join to derived table that uses a partitioned, ordered windowing function, and filter in the outer query based on the row number
In this method a derived table or CTE is used, in conjunction with a windowing function partitioned based on the required grain of the final query. So for the "most recent order per customer" query, the row number is partitioned based on the customer. This gives us a count starting at 1 for each customer, which can be filtered in the outer query.
SELECT
    c.FirstName,
    c.LastName,
    x.OrderDate,
    x.OrderAmount
FROM Customers c
INNER JOIN
(
    SELECT
        o.OrderDate,
        o.OrderAmount,
        o.CustomerId,
        ROW_NUMBER() OVER
        (
            PARTITION BY o.CustomerId
            ORDER BY o.OrderDate DESC
        ) AS r
    FROM Orders o
) x ON
    x.CustomerId = c.CustomerId
    AND x.r = 1
Logic notes: If ties are important, use DENSE_RANK instead of ROW_NUMBER. ROW_NUMBER is good for arbitrary TOP(n), similar to Method 2. Unlike the previously described methods, in conjunction with DENSE_RANK this method can return an arbitrary TOP(n) rows, all of which can include ties. So if you would like to see the three most recent order dates and each happens to have multiple orders, this method will be able to return them all by simply filtering on x.r <= 3. This would not be directly possible with any of the other methods described here.
Method 4: "Carry-along sort"
This is the only "tricky" method, and not one that I recommend using, except as a last resort. I'm including it here only for completeness and comparison because it happens to be a very high performance method in some cases. This method involves converting each of the required inner columns into a string, concatenating them, then applying an aggregate to the string as a whole. By putting the "sort" column first, the other data is "carried along" -- thus the name for the method. The concatenated data is then "unpacked" in the outer query. This can be surprisingly efficient from an I/O standpoint, but the resultant code is a maintenance nightmare and it is quite easy to introduce errors. In addition, this method can only return the top 1 per group -- no ties or multiple return items are supported.
SELECT
    c.FirstName,
    c.LastName,
    --Unpack: the first 8 characters are the date (yyyymmdd), the next 15 are the amount
    CONVERT(DATETIME, SUBSTRING(x.OrderInfo, 1, 8)) AS OrderDate,
    CONVERT(MONEY, SUBSTRING(x.OrderInfo, 9, 15)) AS OrderAmount
FROM Customers c
INNER JOIN
(
    SELECT
        o.CustomerId,
        --The sort column goes first so that MAX orders by date; the amount is carried along
        MAX
        (
            CONVERT(CHAR(8), o.OrderDate, 112) +
            CONVERT(CHAR(15), o.OrderAmount)
        ) OrderInfo
    FROM Orders o
    GROUP BY o.CustomerId
) x ON x.CustomerId = c.CustomerId
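For comparison with the carry-along sort, here is a sketch of the DENSE_RANK variation mentioned in the Method 3 logic notes. It reuses the same illustrative Customers and Orders tables as the queries above, and asks for every order that falls on each customer's three most recent order dates, ties included.

```sql
SELECT
    c.FirstName,
    c.LastName,
    x.OrderDate,
    x.OrderAmount
FROM Customers c
INNER JOIN
(
    SELECT
        o.CustomerId,
        o.OrderDate,
        o.OrderAmount,
        -- DENSE_RANK gives tied dates the same rank and leaves no gaps,
        -- so ranks 1 through 3 are the three most recent distinct dates
        DENSE_RANK() OVER
        (
            PARTITION BY o.CustomerId
            ORDER BY o.OrderDate DESC
        ) AS r
    FROM Orders o
) x ON
    x.CustomerId = c.CustomerId
    AND x.r <= 3
```

A customer with several orders on the same date will get one output row per order, which is the behavior the other methods cannot directly produce.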
This post is just the beginning; watch this space in the coming days for a series of performance tests and analysis of these methods, and some results that I personally found to be quite surprising.
|
-
-
Just a heads up for those in the Boston area: The New England SQL Server Users Group is doing a special event next Wednesday night (January 23), featuring Itzik Ben-Gan, talking about Grouping Sets in SQL Server 2008:
SQL Server 2008 introduces enhanced support for aggregating data addressing the needs to analyze aggregated data dynamically. The enhancements include the new GROUPING SETS clause, the standard CUBE and ROLLUP clauses (not to confuse those with the existing non-standard CUBE and ROLLUP options), the GROUPING_ID function, and other T-SQL enhancements. This session will cover those enhancements in detail, and will describe and demonstrate their practical uses.
This should be a great event, so mark your calendars and check the NESQL Web site for more information. Please note that RSVP is strongly encouraged for this event -- only people who RSVP will be eligible for giveaways that night, and we have a great selection. To get on the list to RSVP, please visit the Web site and sign up for our mailing list. I will send a mailing -- which will include RSVP instructions -- next Monday.
|
-
As with all of the blog posts I keep meaning to write -- I keep a list and given the infrequency with which I've been posting lately, it's getting quite large -- this script has been on the queue for quite some time. So here I find myself with a spare moment right on the cusp of the new year, and figured what better way to end the year than with a script that, at least for me, has been quite useful these last few months.
The driving force behind my writing this script is that I found myself endlessly calling sp_who2 'active' to see who was doing what on servers I needed to take a look at. Then I would have to sort through the results, and end up calling DBCC INPUTBUFFER to take a look at the SQL being used. This was a serious pain, and I finally caved a few months back and decided to end the madness once and for all with the help of some DMVs.
The following script primarily uses the sys.dm_exec_requests view, and finds all "active" requests -- i.e., those that are running, about to start running, or suspended. It also finds some other useful information, including the host name, login name, the start time of the batch, and whether or not the batch is currently blocked. In the outer query I use the sys.dm_exec_sql_text function to get the text of the SQL that all of the active requests are running, in addition to the SQL being run by the blocking sessions, if applicable. This way I don't have to do two lookups to chase down what's blocking what.
You'll notice that I use FOR XML PATH in the subqueries that pull the SQL text. This gives us a nice little bonus: instead of copying the text out of the cell in SSMS and pasting it somewhere else, you can simply click on it -- and it maintains whatever formatting, including white space and carriage returns, that it originally had. This is much, much nicer than getting the batch on a single line.
The only problem is that certain characters, such as greater-than and less-than, get "entitized" when the text is converted to XML. This means that some queries won't be able to be run without a bit of editing. A small price to pay for nicer output, in my opinion. If anyone out there has a solution for the entitization, please let me know! The only way I know to solve it is to convert back to VARCHAR, and that defeats the whole purpose...
Anyway, thanks all for a great 2007. Here's to an even better 2008!
Without further ado, the script:
SELECT
    x.session_id,
    x.host_name,
    x.login_name,
    x.start_time,
    x.totalReads,
    x.totalWrites,
    x.totalCPU,
    x.writes_in_tempdb,
    --SQL text of the active request, returned as clickable XML
    (
        SELECT text AS [text()]
        FROM sys.dm_exec_sql_text(x.sql_handle)
        FOR XML PATH(''), TYPE
    ) AS sql_text,
    COALESCE(x.blocking_session_id, 0) AS blocking_session_id,
    --SQL text of the blocking session's request, if there is one
    (
        SELECT p.text
        FROM
        (
            SELECT MIN(sql_handle) AS sql_handle
            FROM sys.dm_exec_requests r2
            WHERE r2.session_id = x.blocking_session_id
        ) AS r_blocking
        CROSS APPLY
        (
            SELECT text AS [text()]
            FROM sys.dm_exec_sql_text(r_blocking.sql_handle)
            FOR XML PATH(''), TYPE
        ) p (text)
    ) AS blocking_text
FROM
(
    SELECT
        r.session_id,
        s.host_name,
        s.login_name,
        r.start_time,
        r.sql_handle,
        r.blocking_session_id,
        SUM(r.reads) AS totalReads,
        SUM(r.writes) AS totalWrites,
        SUM(r.cpu_time) AS totalCPU,
        SUM(tsu.user_objects_alloc_page_count + tsu.internal_objects_alloc_page_count) AS writes_in_tempdb
    FROM sys.dm_exec_requests r
    JOIN sys.dm_exec_sessions s ON s.session_id = r.session_id
    JOIN sys.dm_db_task_space_usage tsu ON
        s.session_id = tsu.session_id
        AND r.request_id = tsu.request_id
    WHERE r.status IN ('running', 'runnable', 'suspended')
    GROUP BY
        r.session_id,
        s.host_name,
        s.login_name,
        r.start_time,
        r.sql_handle,
        r.blocking_session_id
) x
Enjoy!
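On the entitization question, one approach that may be worth trying is to return the text as an XML processing instruction rather than as a text node, since the content of a processing instruction is not entity-encoded. This is only a sketch of the idea and not part of the script above: the name "query" inside the column alias is arbitrary, and I would expect the conversion to fail for any batch whose text contains the sequence ?> or characters that are not legal in XML.

```sql
SELECT
    r.session_id,
    (
        -- The alias turns the value into <?query ... ?> instead of a text node,
        -- so characters such as < and > should come back as-is
        SELECT t.text AS [processing-instruction(query)]
        FROM sys.dm_exec_sql_text(r.sql_handle) AS t
        FOR XML PATH(''), TYPE
    ) AS sql_text
FROM sys.dm_exec_requests AS r
WHERE r.status IN ('running', 'runnable', 'suspended')
```

The cell is still clickable in SSMS; the text simply shows up wrapped in the processing-instruction delimiters, which need to be trimmed before running it.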
|
-
"Lonely but free I'll be found
Drifting along with the tumbling tumbleweeds"
- Supremes, "Tumbling Tumble Weeds"
Welcome to the first installment of what I hope will be a regular feature on this blog, Anti-Patterns and Malpractices. As a consultant, I get the honor of seeing a lot of different systems, with a lot of different code. Some of it is good, and some of it -- well -- I'll be featuring that which is not so good here. No names will be named, and code will be changed to protect the not-so-innocent; my goal is not to call out or embarrass anyone, but rather to expose those misguided patterns and practices which inevitably lead to problems (and a subsequent call to a consultant; perhaps if I post enough of these I'll have fewer less-than-appealing encounters in my work!)
The topic du jour is the Tumbling Data Anti-Pattern, a name coined by my friend Scott Diehl. Much like the tumbleweed lazily blowing around in the dust, data which exhibits this pattern is slowly and painstakingly moved from place to place, gaining little value along the way.
So what exactly typifies this particular anti-pattern? Consider the following block of T-SQL, designed to count all married employees in the AdventureWorks HumanResources.Employee table and bucket them into age ranges of 20-35 and 36-50, grouped by gender. Employees older than 50 should be disregarded:
--Find all married employees
SELECT *
INTO #MarriedEmployees
FROM HumanResources.Employee
WHERE MaritalStatus = 'M'
/* select * from #marriedemployees where employeeid = 20 */
--Find employees between 20 and 35
SELECT EmployeeId
INTO #MarriedEmployees_20_35
FROM #MarriedEmployees
WHERE DATEDIFF(year, birthdate, getdate()) BETWEEN 20 AND 35
--Find employees between 36 and 50
SELECT EmployeeId
INTO #MarriedEmployees_36_50
FROM #MarriedEmployees
WHERE DATEDIFF(year, birthdate, getdate()) BETWEEN 36 AND 50
--Remove the employees older than 50
DELETE FROM #MarriedEmployees
WHERE
    EmployeeId NOT IN
    (
        SELECT EmployeeId
        FROM #MarriedEmployees_20_35
    )
    AND EmployeeId NOT IN
    (
        SELECT EmployeeId
        FROM #MarriedEmployees_36_50
    )
--Count the remaining employees
SELECT
    e.Gender,
    COUNT(*) AS theCount
INTO #Employee_Gender_Count_20_35
FROM #MarriedEmployees e
JOIN #MarriedEmployees_20_35 m ON e.EmployeeId = m.EmployeeId
GROUP BY e.Gender
--select * from #Employee_Gender_Count_20_35
SELECT
    e.Gender,
    COUNT(*) AS theCount
INTO #Employee_Gender_Count_36_50
FROM #MarriedEmployees e
JOIN #MarriedEmployees_36_50 m ON e.EmployeeId = m.EmployeeId
GROUP BY e.Gender
--Get the final answer
SELECT
    a.Gender,
    a.theCount AS [20 to 35],
    b.theCount AS [36 to 50]
FROM #Employee_Gender_Count_20_35 a
JOIN #Employee_Gender_Count_36_50 b ON b.Gender = a.Gender
This kind of code tells us several things about the person who wrote it. Rather than thinking upfront and designing a complete solution or at least a game plan before typing, this person appears to have thought through the problem at hand in a step-by-step manner, coding along the way. A bit of debugging was done along the way, but the real goal was to spit out an answer as quickly as possible (or so it seemed at the time). No attempt was made to go back and fix the extraneous code or do any cleanup; why bother, when we already have an answer?
It's important to mention that this is a simple example. I generally see this anti-pattern exploited when developers are tasked with producing large, complex reports against data sources that aren't quite as well-designed as they could be. In an attempt to preserve sanity, the developer codes each tiny data transformation in a separate statement, slowly morphing the data into the final form he wishes to output. The resultant scripts are often thousands of lines long and take hours to run, during which time the server crawls (and throughout the office you can hear people muttering "the server sure is slow today").
The solution to this problem is simple, of course, and the best software engineers do it automatically: Before writing a line of code sit back for just a moment and consider your end goal. Do you need to work in steps, or will a single query suffice? Can various join conditions and predicates be merged? Perhaps a Google search is a good idea; what is the best way to produce a histogram without a temp table?
The hurried atmosphere of many companies leads to a "get it done right now--even if it's far from perfect" attitude that ends up wasting a lot more time than it saves. The above example code block took me around 10 minutes to put together. A single-query version took me under two minutes to code. It is less than a third of the length, runs approximately 500 times faster, and uses 0.4% of the resources. All because I spent a couple of moments reflecting on where I was going before I took the first step.
If you find yourself exploiting this anti-pattern, step back and question whether this code will have a life beyond the current query window. If it will ever be checked into source control, it's probably a good idea to go back and do some cleanup.
If you find yourself tasked with maintaining code that looks like what I've posted, my suggestion is to simply re-write it from scratch. I was recently faced with a script containing over 2000 lines of this kind of thing, and I spent almost two days slowly working my way through the mess trying to make sense of it. On the evening of the second day, after talking with some of the stakeholders, I realized that it was actually a simple problem to solve. One hour later and I had a new, totally functional solution -- a couple of hundred lines long, and several orders of magnitude faster. Sometimes it's best not to wade through a muddy puddle, when you can simply hop right over.
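The single-query version is not reproduced in this post, so the following is only a sketch of one way it might look, written against the same AdventureWorks HumanResources.Employee table and using the same DATEDIFF-based age calculation as the temp table code. It is not necessarily the query that produced the numbers quoted above.

```sql
SELECT
    e.Gender,
    -- Each CASE counts a row only if it falls into that age bucket
    SUM(CASE
            WHEN DATEDIFF(year, e.BirthDate, GETDATE()) BETWEEN 20 AND 35 THEN 1
            ELSE 0
        END) AS [20 to 35],
    SUM(CASE
            WHEN DATEDIFF(year, e.BirthDate, GETDATE()) BETWEEN 36 AND 50 THEN 1
            ELSE 0
        END) AS [36 to 50]
FROM HumanResources.Employee e
WHERE
    e.MaritalStatus = 'M'
    -- Employees outside both buckets are disregarded up front
    AND DATEDIFF(year, e.BirthDate, GETDATE()) BETWEEN 20 AND 50
GROUP BY e.Gender
```

One behavioral difference to be aware of: the temp table version inner joins the two count tables on Gender, so a gender with no employees in one of the buckets drops out of the final result, whereas this sketch returns that gender with a zero in the empty bucket.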
|
-
Lazy developer that I am, I just hate running installers to set up VHDs for the SQL Server 2008 CTPs. So I was overjoyed when Microsoft did the work for me and released a pre-installed VHD image for CTP4.
CTP5, alas, did not ship with a VHD, forcing me once again down the path of the dreaded installer. But today I'm happy to report that Microsoft has once again come through for those of us who simply can't be bothered to click the "Next" button; a VHD for CTP5 has appeared on the Microsoft Download Center.
So download, enjoy, and think of all of the effort you'll save by not having to install...
|
-
I found Linchi's recent post on use of cursors in the TPC-E test to be quite interesting. The question is, why are cursors used in the test when the commonly accepted notion within the SQL Server community is that cursors are a bad thing? I've posted in the past about situations where cursors were actually faster than set-based queries. But in this case I just don't see it; cursoring over an input set to do an update? There's no way that's going to be faster.
Greg Linwood commented in Linchi's post that "indexed cursors run just fine for most purposes". And although I have loads of respect for Greg and his opinions, I just can't agree in this case. I did a few tests on my end just to make sure, and indexed or not, even for the simplest of queries, cursors perform at least a few times more slowly than their set-based equivalents. Greg mentioned in this comment that the SQL Server engine executes even set-based queries "internally using cursor style processing", but a loop in the query processor's code is clearly not the same as a T-SQL cursor. The query processor is optimized internally to process data without having to pass it around to different spots in memory or switch context, whereas with a cursor the data is transferred into local variables and your code has to constantly ask the query processor to go back and get some more. This is extremely expensive, which is why even in my experiments with situations where you can see superior performance with cursors, I found that a SQLCLR cursor--which doesn't have to do nearly as much work as a T-SQL cursor--is vastly superior.
I'll close with a simple example. The following two batches each run in AdventureWorks, and indexes are irrelevant in both cases. See for yourself which is faster.
--Set-based
SELECT SUM(Quantity)
FROM Production.TransactionHistory
GO
--Cursor-based
DECLARE @q INT
DECLARE @t INT
SET @t = 0
DECLARE c CURSOR LOCAL FAST_FORWARD FOR
    SELECT Quantity
    FROM Production.TransactionHistory
OPEN c
FETCH NEXT FROM c INTO @q
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @t = @t + @q
    FETCH NEXT FROM c INTO @q
END
CLOSE c
DEALLOCATE c
SELECT @t
GO
|
-
Today I gave two talks at New England Code Camp 8. A fun experience as always, and for those of you who were in my talks and are looking for decks/code, please see this post and this post from when I did slightly different versions of the same talks earlier this year as MSDN Webcasts. I am not quite ready to publish the decks I used today.
But the topic of this post is not so much the code camp as an observation about what I saw there. Recent posts by both of our resident Andys (Kelly and Leonard) share the theme of organizations treating their database staff as next-to-worthless. And developers, in general, seem much more interested in other facets of development than all of that "database stuff."
Today's code camp proved this once again; my two talks were both quite lightly attended, even though I was talking about important issues around data security and exception handling--things that any developer working with data should get. Perhaps it's just me, but the evidence says otherwise: after my talks I peeked into a few others and found a standing room only session on Silverlight and a session on LINQ to SQL that had a comparable number of attendees to what I'd had.
Why is it that data, while the foundation of any business application, is not a draw to the developer masses? How can we ignore the data and instead focus on creating spiffy new UIs (to display flawed data, no doubt)? Perhaps data seems easy--if you know how to write a query and set up an ADO.NET connection, that's all you need, right? Or perhaps data is just someone else's job--just let the DBA or database developer handle it and display anything that comes back, flawed or not. It's not your problem, you're a UI developer. But everyone can't be a UI developer, can they? Someone has to take control of the data.
Bad data can and does lead to project failure. If you're a UI developer you're going to get canned just as quickly as the DBA if your project is no longer being funded--so if your UI displays bad data, you are just as guilty as whoever designed the database that returned it! If you're a business tier developer, you are just as responsible for data validation as the database developer!
Alas, if you're reading this post you're already one of the converted. This is SQLblog.com, so you obviously care enough about your data to read up on it a bit more. But as developers who know the value and importance of data, it is our job to spread the data gospel. Data issues around security, validation, and performance are every developer's responsibility.
|
-
... I know I do. How many times have you seen the procedure cache bloat, for no good reason, because of badly designed applications? How many times have you been frustrated by the fact that SQL Server handles this in a relatively boneheaded way, and just keeps growing it and growing it--an especially huge problem on 64-bit systems? So far I've not had great luck with Connect, but I figured I'd try again. This is something we need now. So if you are concerned about this issue, please vote...
|