
SELECT Hints, Tips, Tricks FROM Hugo Kornelis WHERE RDBMS = 'SQL Server'

  • Misconceptions on parameter sniffing

    In my previous post, I explained the basics of parameter sniffing, and then built on that to explain the lesser-known mechanics of variable sniffing and cardinality sniffing. However, I also sneakily inserted a few comments on misconceptions about the performance impact of parameter sniffing, with the promise to explain in more detail later. Well … it is later now, so here is the promised explanation!

     

    Parameter sniffing’s bad rep

     

    Lots of people in the SQL Server field will claim that parameter sniffing is a bad feature. That is not the case. Just think about it for a moment – the feature has been in the product since at least SQL Server 2005, probably even earlier. If it were truly a bad feature, then surely just removing it would have been a very simple way to improve the product. But Microsoft has done no such thing, and whatever horror stories about Microsoft you believe, they have not become one of the world’s biggest companies by foolishly refusing to take opportunities to improve their software for almost no investment.

     

    However, there is no denying that parameter sniffing can cause terrible performance issues. I will explain by using a slightly modified version of the example that my good friend Grant Fritchey likes to use. (Note that I used the AdventureWorks2012 sample database, but other versions of AdventureWorks should expose the same behaviour).

     

    CREATE INDEX ix_Address_City

    ON Person.[Address] (City);

    GO

     

    CREATE PROC dbo.GetAddressByCity

                @City nvarchar(30)

    AS

    SELECT      a.AddressID,

                a.AddressLine1,

                a.AddressLine2,

                a.City,

                sp.[Name] AS StateProvinceName,

                a.PostalCode

    FROM        Person.[Address] AS a

    INNER JOIN  Person.StateProvince AS sp

          ON    sp.StateProvinceID = a.StateProvinceID

    WHERE       a.City = @City;

    GO

     

    EXEC dbo.GetAddressByCity @City = 'Mentor';

    SET STATISTICS IO ON;

    EXEC dbo.GetAddressByCity @City = 'London';

    SET STATISTICS IO OFF;

    GO

     

    DROP PROC dbo.GetAddressByCity;

    GO

     

    DROP INDEX ix_Address_City ON Person.[Address];

    GO

     

    Since the tables used in this example are fairly small, you will not actually notice any slowness. But if you look at the output generated by the SET STATISTICS IO option, you should see that there actually is a serious issue with this query. The second execution of the stored procedure takes 871 logical reads on the Address table, and 868 logical reads on the StateProvince table – insane numbers when you realize that the entire Address table is stored in just 216 pages, and StateProvince is even just 3 pages in size!

     

    If you look at the execution plan, you will start to see why this is the case:

     

    [Screenshot: execution plan for dbo.GetAddressByCity]

     

    An index seek on the new index is used to find Address rows matching the supplied parameter, the remaining columns are then retrieved with a Key Lookup, and a clustered index seek in StateProvince is used to do the join. This type of plan is very good when the filter is expected to be very selective, which is the case for Mentor (with just one matching row), the value that was sniffed when the plan was compiled during the first call. For the second call the same plan was reused, but because the filter on London (with 435 matches) is far less selective, the plan is now actually quite bad.

     

    The biggest issue with such cases of bad performance caused by parameter sniffing is that they tend to be hard to troubleshoot. The performance of the GetAddressByCity stored procedure depends on the value that happens to be passed in on the first execution after the previous plan cache entry was invalidated or removed. But most of the common reasons for plan invalidation are beyond our control, so to the user of the system it can appear that the performance of this procedure changes at “random” moments. And obviously, once the DBA is ready to respond to the ticket, a new recompile might already have occurred and the performance might be fine again.

     

    Another very common cause for bad parameter sniffing is the use of a single generic procedure to return rows based on a series of parameters that are all considered optional for the search. A simple example would be a procedure dbo.GetAddressByCityOrPostalCode that uses two parameters @City and @PostalCode, and that has a query with a WHERE clause such as “WHERE (City = @City OR @City IS NULL) AND (PostalCode = @PostalCode OR @PostalCode IS NULL)”. When this stored procedure is first called with @PostalCode = ‘W10 6BL’ and @City = NULL, a plan will be compiled and stored in the plan cache that is optimal for this combination of parameters. If a later execution uses @PostalCode = NULL and @City = ‘Mentor’, then the results returned will still be correct but the performance will likely be bad, because the query is still executed using a plan that is optimized for a search by PostalCode only. For instance, it could use a plan with an Index Seek in an index on PostalCode to find rows matching the second part of the WHERE (which would find all rows on the second execution), then use a Key Lookup to find the rest of the data, and then filter down on City = ‘Mentor’. For this type of stored procedure you really want multiple plans stored, with the one that is best for the current parameter values chosen at execution time – but that’s just not how SQL Server works.
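    A minimal sketch of such a procedure, assuming a made-up procedure name and parameter lengths (only Person.Address and its columns come from AdventureWorks):

    CREATE PROC dbo.GetAddressByCityOrPostalCode
                @City       nvarchar(30) = NULL,
                @PostalCode nvarchar(15) = NULL
    AS
    SELECT      a.AddressID,
                a.AddressLine1,
                a.City,
                a.PostalCode
    FROM        Person.[Address] AS a
    WHERE       (a.City = @City OR @City IS NULL)
      AND       (a.PostalCode = @PostalCode OR @PostalCode IS NULL);
    GO

    Whatever combination of NULL and non-NULL parameters is passed in, this single query – and therefore a single cached plan – has to handle it.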

     

    Bad parameter sniffing in a production system can be hard to troubleshoot. The problem only materializes when a procedure that is susceptible to bad parameter sniffing is recompiled, and the parameter values sniffed during that recompile happen to cause a plan that is bad for many other values. This is especially rare when the issue is caused by skewed data distribution, because the really bad performance tends to be caused by a recompile for a rare outlier value, and these are not used often anyway. However, when it does hit you, it often hits you hard. When I see a database application that usually performs well but that at unexpected and unreproducible moments suddenly starts to perform badly (and then sometimes reverts to good performance later, again for no apparent reason), I always have bad parameter sniffing high on my list of likely root causes. The seemingly random switch from good to bad performance can be due to a recompile with a value that causes a bad plan. These occur at semi-random moments because there are lots of reasons that can trigger a recompile (metadata change, (auto) update of statistics, memory pressure), and some of them are beyond our control.

     

    The fanboys (and girls) are wrong

     

    Many of the experts in the SQL Server community try to combat the misconception that parameter sniffing is bad. But they unfortunately use the wrong arguments. They say that yes, there are cases of “bad” parameter sniffing that you will have to work around (which is indeed correct), but they then also claim that apart from these exceptions parameter sniffing is “usually” or even “always” good. I have heard and read such claims from some very smart and experienced people that I personally hold in high esteem, such as Brent Ozar, Grant Fritchey, Gail Shaw, and many others. And I myself have also held and spread this belief for a long time. Until recently when I was trying to find a few examples to actually illustrate the benefit of parameter sniffing – and failed to find any!

     

    I then realized that from a parameter sniffing point of view, there are just two types of queries: those that are affected by parameter sniffing and those that are not. A lot of queries fall in the “not affected” category. I already gave some examples above:

    1.      If a query filters on the primary key, the estimated rowcount based on a sniffed value will be 1, but the estimated rowcount for the same query without sniffing (e.g. when using it as an ad hoc query with a variable) will also be 1 – so we’d get the same plan without sniffing as we do with sniffing.

    2.      If a query filters on a key with a fairly even data distribution, for instance with every value occurring somewhere between 90 and 110 times, then the estimate for a sniffed value can be any value between 90 and 110, and the estimate that we would get without sniffing, based on the average number of rows per value, would probably be around 100. It is extremely unlikely that such small changes in the estimate would cause a big difference in the execution plan. And in the rare cases where they do have an effect, see the below discussion on queries that are affected.

     

    The only way for a query to fall into the “affected” category is when the column has a skewed data distribution. (Or, more rarely, when values occur at a frequency around the threshold between two alternative plans.) In that case, there will be at least two values (A and B) for which different plans are optimal. And here’s the rub. When the value A is sniffed, the plan is cached, and the procedure is then executed with B, the cached plan will be sub-optimal for that value. And vice versa. So the parameter sniffing in this case will always be bad. (You might argue that the plan for A and the plan for B might both be better than the plan for the generic average rows per value; I’d have to agree that in theory this is possible, but I have never run into, nor been able to construct, such a query.)

     

    Good parameter sniffing

     

    Now that I have shown how parameter sniffing is usually irrelevant and sometimes bad, you might wonder why the feature exists at all. The answer is that there are some query patterns where parameter sniffing does actually help – and one of them is common enough that I would never recommend removing this feature from the product.

     

    Lots of tables in lots of databases collect data over time. Orders keep being added in sales systems, banks keep processing transactions, and sensor data continues to pour into a table of readings. Those tables almost always have a column that timestamps the row (e.g. OrderDate, TransactionDate, MeasurementDate). And you will often find stored procedures that allow the user to select a recent subset of data by passing in a @ThresholdDate parameter – so that the results can be limited to e.g. the last hour, the last week, or the last 15 days. So the query in the stored procedure will filter on, for instance, WHERE OrderDate >= @ThresholdDate.
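    A hedged sketch of such a procedure against the AdventureWorks Sales.SalesOrderHeader table (the procedure name and the example date are made up for illustration):

    CREATE PROC dbo.GetRecentOrders
                @ThresholdDate datetime
    AS
    SELECT  soh.SalesOrderID,
            soh.OrderDate,
            soh.TotalDue
    FROM    Sales.SalesOrderHeader AS soh
    WHERE   soh.OrderDate >= @ThresholdDate;
    GO

    -- Typical call: only the most recent slice of a much larger history
    EXEC dbo.GetRecentOrders @ThresholdDate = '20080715';
    GO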

     

    This is the type of stored procedure that will usually benefit hugely from parameter sniffing. The predicate uses inequality. Without sniffing the value, that type of filter is one of the hardest to get right for the cardinality estimator – the value can be November 3rd, 1848, or the Year 2525, or anything in between – and so the number of matching rows can be anything from the entire table to nothing. The cardinality estimator simply uses an estimate of 30% matching rows for this case. If the stored procedure is typically called to fetch just the last few days of data out of a table containing eight years of history, then that estimate is going to be horribly wrong.

     

    This is one case where parameter sniffing really shines. It does not matter whether the sniffed value covers a one-hour period, a one-day period, or a two-week period; in all these cases the plan based on that value will probably be the same, and definitely better than the plan you’d get for a 30% estimate. And if the cached plan is for two weeks and you next run the query for one hour, you’d still get the better performance. This single case is so common, and can have such huge impact on performance, that this use case by itself is already sufficient for me to defend parameter sniffing – in spite of the occasions where it hurts, and the many cases where it’s irrelevant.

     

    Another case of good parameter sniffing, far more esoteric, is if you have just the right combination of both a skewed data distribution and a skewed query pattern. Let’s return to Grant Fritchey’s example based on cities. This data distribution is skewed because there simply are far more people living in a city such as New York or Tokyo than there are in small villages such as Hum or Adamstown, so a stored procedure that filters on City tends to be subject to bad parameter sniffing. But what if you are working for a university and the researchers are working on a project involving only the inhabitants of very small villages? In that case, whenever the optimizer sniffs a value it will always be a small village and only a few rows will be returned, so the plan based on any possible sniffed value is always going to be better, for every value that is typically passed in, than a non-sniffed plan would be.

     

    When good turns bad

     

    You have to be aware, though, that good parameter sniffing, bad parameter sniffing, and irrelevant parameter sniffing, are all the same mechanism. The parameter sniffing remains the same, it’s the circumstances that make it go good or bad. And that’s something to keep in mind especially with code that relies on good parameter sniffing.

     

    Let’s look once more at the Orders table with eight years of history that is only ever queried for the most recent day, week, or two-week period. What happens if, one day, someone does actually submit a report to do some trend analysis on the last 5 years? The best and most likely result, in this case, would be that the cached plan, based on a previously sniffed parameter for a short period, is used. In that case, the report would be unbearably slow – the plan for the short period probably involves Nested Loop joins and Index Seeks, and running a few billion rows through that can take hours, or even days. The submitter of that query would suffer from bad parameter sniffing. But a worse scenario is also possible. If the previous plan has just been invalidated, a new plan will be created based on a sniffed value of five years ago, and now the plan would probably change to use scans. All other calls to the stored procedure will now start to use this plan as well, so that all those requests for short-term periods now run longer. The increased number of table scans will also start to affect overall system performance: the buffer cache fills up whenever the table is scanned, so all other activity on the system has to reread data from disk instead of finding it in cache. In short, the performance of the system as a whole would slow down, and it might take a long time to find the actual cause – and all that time, you would be losing sales because the web shop is now too slow for your impatient customers.

     

    Even in a “good parameter sniffing” scenario, you should still be aware of the potential of bad parameter sniffing occurring, and take appropriate action to try to prevent it.

     

    Dealing with bad parameter sniffing

     

    If parameter sniffing is sometimes good, sometimes bad, and most often irrelevant, then the obvious question is: “how can I get the good without having to accept the bad as well”? And the good news is that this is possible. There are a lot of ways that allow you to deal with parameter sniffing, and I will briefly describe the options. (Note that most of them are described in much more detail elsewhere on the internet).

     

    The most brutal way of preventing bad parameter sniffing is to disable it completely. I do not like this method. To me, it sounds like burning down your house to prevent it from being burgled. But if you insist, you can: activating trace flag 4136 will disable all parameter sniffing (good, bad, and irrelevant) on the entire instance. And yes, it is documented and supported.
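    For completeness, this is how the trace flag would be enabled instance-wide – shown only so you can recognize it, not as a recommendation:

    -- Disables parameter sniffing for the entire instance until the next restart;
    -- use startup parameter -T4136 to make it permanent.
    DBCC TRACEON (4136, -1);
    GO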

     

    A popular technique that is mostly found in old articles is to use local variables – this used to be one of the best options before SQL Server 2008. Declare a variable in the stored procedure for each parameter, assign it the value of the parameter, and then use that variable instead of the parameter in the rest of the code. The result is that SQL Server has to optimize the queries based on the variable, which it cannot sniff, so you always get a plan based on the generic data distribution rather than any sniffed value. I recently saw a recommendation somewhere to always do this for every parameter in every stored procedure – which has the same effect as using trace flag 4136, but with a lot more work. Seeing that recommendation was one of the things that triggered me to write this post. If you still need to support SQL Server 2005 or older, and if you only use it for specific parameters in specific stored procedures that are known or expected to get bad parameter sniffing, then I guess that using the local variable method can be okay. Using it in newer versions, or using it as a generic guideline for all parameters and stored procedures, is not correct.
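    Applied to the earlier GetAddressByCity example, the local variable workaround would look roughly like this (a sketch, deliberately kept short):

    ALTER PROC dbo.GetAddressByCity
                @City nvarchar(30)
    AS
    DECLARE @LocalCity nvarchar(30);
    SET @LocalCity = @City;        -- the optimizer cannot sniff this variable
    SELECT      a.AddressID,
                a.City
    FROM        Person.[Address] AS a
    WHERE       a.City = @LocalCity;
    GO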

     

    Since SQL Server 2008, we can use OPTION (OPTIMIZE FOR (@parameter UNKNOWN)) to force SQL Server to disregard any sniffed value and optimize the query as if the value for @parameter is not known. This has the same effect as using a local variable, except it does away with the extra overhead and you can limit it to just a single query in a multi-statement stored procedure. For that reason, I prefer this method over local variables. This method works very well in situations where the plan based on generic statistics is good enough for all possible cases. These tend to be cases where irrelevant and bad parameter sniffing can occur. In situations with good parameter sniffing, the generic plan you get with this option tends to be not good enough.
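    The same idea, limited to the single query inside the procedure (a sketch based on the earlier example):

    SELECT      a.AddressID,
                a.City
    FROM        Person.[Address] AS a
    WHERE       a.City = @City
    OPTION      (OPTIMIZE FOR (@City UNKNOWN));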

     

    Already since SQL Server 2005, you could use OPTION (OPTIMIZE FOR (@parameter_name = 'value')) to force SQL Server to optimize as if the value provided was sniffed. The risk of this method is that people tend to forget to maintain the value in the hint. For example in the case of retrieving the latest rows from the Orders table, this hint with a value of October 20, 2016 might work very well at this time – but if I forget to regularly update the code with a newer value, then that same hint will actually start to hurt performance in a terrible way after a few years, a year, or perhaps already after a few months. (And unfortunately, I have seen code in similar situations where a hint like this, for a datetime column, had not been changed for over three years!)
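    A sketch of what this hint might look like in the Orders scenario – the hard-coded date is exactly the maintenance hazard described above:

    SELECT  soh.SalesOrderID,
            soh.OrderDate
    FROM    Sales.SalesOrderHeader AS soh
    WHERE   soh.OrderDate >= @ThresholdDate
    OPTION  (OPTIMIZE FOR (@ThresholdDate = '20161020'));  -- must be kept up to date by hand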

     

    Perhaps the best solution, depending on scenario, is to use multiple stored procedures, with one master to determine which one to execute. This can work for scenarios such as the Orders table (with one procedure to search in only recent data and another to search for longer time periods, and a master calling the right one), or for generic search procedures with a limited number of possibilities (with two optional search arguments you have four possible combinations and you can create four procedures; with five parameters you have 32 possible combinations and you do not want to create that many almost similar procedures – but you could check how many combinations are actually really used). It will not work in situations with skewed data, unless you happen to know all the outlier values.
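    A minimal sketch of that master/worker pattern for the Orders scenario; the procedure names and the seven-day cut-off are assumptions for illustration, and the two worker procedures would each contain the same query so that each gets its own cached plan:

    CREATE PROC dbo.GetOrders_Master
                @ThresholdDate datetime
    AS
    IF @ThresholdDate >= DATEADD(DAY, -7, GETDATE())
        EXEC dbo.GetOrders_Recent  @ThresholdDate = @ThresholdDate;   -- hypothetical worker proc
    ELSE
        EXEC dbo.GetOrders_History @ThresholdDate = @ThresholdDate;   -- hypothetical worker proc
    GO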

     

    Another possible solution that works especially well for the optional search case is to use OPTION (RECOMPILE). As shown in my previous post, this enables SQL Server to sniff the variable. And since this is a different process (because the optimizer does not have to cater for later reuse of the same plan with different values), this can give you tremendous benefits, even up to the optimizer removing entire joins from the plan if they are not needed for the current set of values. But there is a price to be paid: every time the stored procedure executes, the query has to be compiled, which is a relatively expensive process. Depending on how often the procedure executes and how complex the query is, just the CPU and memory cost of the recompilations could cause serious issues for the overall performance of your server. My recommendation for this hint is to definitely use it where it makes sense, and equally definitely not use it anywhere else.
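    Applied to the catch-all search procedure sketched earlier, the hint simply goes on the query itself:

    SELECT      a.AddressID,
                a.City,
                a.PostalCode
    FROM        Person.[Address] AS a
    WHERE       (a.City = @City OR @City IS NULL)
      AND       (a.PostalCode = @PostalCode OR @PostalCode IS NULL)
    OPTION      (RECOMPILE);   -- current values are sniffed; predicates on NULL parameters can be optimized away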

     

    And finally, specifically for the procedure with optional search arguments, you could change the stored procedure to use dynamic SQL. This can be very dangerous when not done appropriately, but if you really know what you are doing it can be a viable alternative. You would have to construct a query string that contains only the predicates (and preferably also only the joins) that are needed based on the parameters passed in. You should not hardcode the parameter values themselves in the query string, nor allow any other form of user input to find its way into the query string. So the conditions in the dynamically generated query string would still be of the form WHERE Column1 = @Param1 AND Column3 = @Param3 (if only @Param1 and @Param3 have a value), and you should then use sp_executesql to execute the SQL and pass in the correct parameter values. For the parameter sniffing process, using sp_executesql is exactly the same as calling a stored procedure, but now with the query string instead of the procedure name as the “owner” of the plan. The values will be sniffed, a plan will be created and stored in cache, and the plan will be reused if the same query string is executed again – which in this case is okay, because that means the same combination of set and not-set search parameters was used. For a stored procedure with five optional parameters, this might eventually result in all 32 possible permutations of the query being in the plan cache, each with a plan that is optimal for that permutation – but without you having to write all 32 possible versions of the query. However, be aware of the danger of SQL injection, and be aware that there may be issues with permission settings if you use dynamic SQL. I recommend Erland Sommarskog’s excellent articles on dynamic search conditions and on using dynamic SQL if you consider using this method in your production code.
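    A hedged sketch of that approach for the same two optional parameters (the 1 = 1 dummy predicate is just a convenience so every extra condition can start with AND; parameter lengths are assumed):

    DECLARE @sql nvarchar(max) = N'
    SELECT      a.AddressID, a.City, a.PostalCode
    FROM        Person.[Address] AS a
    WHERE       1 = 1';

    IF @City IS NOT NULL
        SET @sql += N' AND a.City = @City';
    IF @PostalCode IS NOT NULL
        SET @sql += N' AND a.PostalCode = @PostalCode';

    -- The values are passed as parameters, never concatenated into the string
    EXEC sys.sp_executesql
         @sql,
         N'@City nvarchar(30), @PostalCode nvarchar(15)',
         @City = @City, @PostalCode = @PostalCode;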

     

    How about the other sniffings?

     

    In my previous post, I described three forms of sniffing. In this post on good, bad, or irrelevant, I have so far only focused on parameter sniffing (which includes, as briefly mentioned above, the use of sp_executesql). So what about variable sniffing and cardinality sniffing?

     

    To start with the latter: since cardinality sniffing is really just a special case of parameter sniffing, all I wrote above applies equally to cardinality sniffing. It can be good (in some cases), it can be bad (in other cases), and it will in most cases probably have no impact. When it is bad, then OPTION (RECOMPILE) is probably going to be your only useful instrument, and if that causes too much compilation overhead you might have to rearchitect your solution to avoid this problem. If you are in charge of a database that uses table-valued parameters, and after reading this post you realize that you are benefiting or suffering from cardinality sniffing, then I’d love to hear from you in the comments section!

     

    For variable sniffing, none of the above applies. Since variable sniffing only occurs when you use OPTION (RECOMPILE), the plan will always be optimized for the current values, and it will never be reused. There is no possibility at all of bad variable sniffing happening, since the whole concept of bad sniffing is that a plan created for one value is reused for another value, which simply never happens with variable sniffing. In other words, variable sniffing will only ever be irrelevant or good, never bad – but it will also always be associated with the cost of OPTION (RECOMPILE), since that is the only condition that triggers it.

     

    Conclusion

     

    In the previous post, I described three forms of sniffing: the relatively well-known parameter sniffing, the lesser known variable sniffing, and the hitherto undocumented cardinality sniffing. In this post I tried to prove that both the doomsayers who claim that parameter sniffing is the Absolute Evil (TM), and the fanboys who appear to be in total love with parameter sniffing, are incorrect.

     

    Parameter sniffing is actually usually totally irrelevant. But it still has a reason to exist, because in at least one of the cases where it does make a difference, it can provide a huge performance benefit. That does mean, though, that we will also have to accept a few cases where it can hurt.

     

    Because the pain caused by bad parameter sniffing can be immense, I also walked through some of the more common fixes for these problems, describing their benefit and cost and explaining when you would or would not use them. I then also explained how bad, good, and irrelevant sniffing applies to variable and cardinality sniffing.

     

    I hope that these posts help you understand what sniffing is, how it can help or hurt, and how you can fix the latter without losing the former.

  • The sniffing database

    Your SQL Server instances, like people with hay fever who forget to take their antihistamines during summer, are sniffing all the time. Sniffing is a trick employed by the optimizer in an attempt to give you better execution plans.

     

    The most common form of sniffing is parameter sniffing. Many people know about parameter sniffing, but there are a lot of misconceptions about this subject. I have heard people describe parameter sniffing as a bad thing, and I know people who claim that parameter sniffing is mostly good with some exceptions that they then call “bad parameter sniffing”. Neither of these statements is true; in reality parameter sniffing (and the other forms of sniffing) are sometimes good, sometimes bad, and very often irrelevant. I will explain this in more detail in a later post – this post focuses on explaining what parameter sniffing actually is, and on what other forms of sniffing exist.

     

    Yes, other forms of sniffing. Under the right conditions, SQL Server will also sniff variables and cardinalities. Most SQL Server developers and DBAs appear to be blissfully unaware of variable sniffing, and the few speakers and authors that do mention it tend to get it wrong. And cardinality sniffing is, as far as I know, completely undocumented. I have mentioned it a few times in some of my presentations, but never written about it – and I have never seen or heard anyone else describe this unique type of sniffing.

     

    Parameter sniffing explained

     

    To understand parameter sniffing, you have to know a bit about how SQL Server compiles queries into execution plans. When a query batch is submitted (either through an ad-hoc query tool such as Management Studio or the sqlcmd utility, or from a client application through e.g. the ADO.Net library or JDBC), SQL Server will first try to avoid the (expensive) compilation process: it checks the plan cache to see if the same batch has been executed before and its plans are still available. If that is not the case, then SQL Server will parse the entire batch, compile execution plans for each of the queries in the batch, and store them in the plan cache. After that, all of the plans for the batch (either taken from the plan cache, or compiled, stored in the plan cache, and then taken from it) are executed, in sequence or out of sequence as dictated by control-of-flow logic in the batch.
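    If you want to see what is currently in the plan cache on your own instance, you can use a read-only diagnostic query such as the sketch below (on a busy server you may want to add a filter on the text):

    SELECT   cp.objtype,        -- 'Adhoc', 'Prepared', 'Proc', ...
             cp.usecounts,      -- how often this cached plan was reused
             st.[text]
    FROM     sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
    ORDER BY cp.usecounts DESC;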

     

    While compiling the query, the optimizer uses statistics about the data in the tables to estimate how many rows will satisfy any given condition. A condition such as “WHERE Age = 42” on its own is pretty broad. When you understand the data it operates on, your perception of the condition changes: in a database of men in a mid-life crisis, odds are that a rather high percentage of the rows will match; the same condition in the student database of a community college should generate at most a handful of hits. This is reflected in the statistics that the optimizer uses, so the same query condition can result in different plans depending on the data distribution.

     

    When the condition uses a variable (e.g. “WHERE Age = @Age”), then SQL Server cannot use the statistics in the same way. When the query is optimized, the optimizer knows the data type of the variable (because the parser has processed the DECLARE statement), but not the value, because it has not been assigned yet; the assignment occurs when the batch executes, after the optimization process. The optimizer will still use some statistics, but not for a specific age; instead it looks at the number of distinct values and assumes that the data is evenly distributed. So for a kindergarten database, the number of distinct values for Age would probably be three (5, 6, and 7), and the optimizer would assume 33% matching rows for any value of @Age passed in; for the US census database that same Age column would have over 100 distinct values, and the optimizer would estimate that less than 1% of the rows will match the condition for any value of @Age.
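    You can inspect the density information that these generic estimates are based on with DBCC SHOW_STATISTICS; the sketch below assumes the standard AdventureWorks index on SalesOrderDetail.ProductID:

    -- "All density" multiplied by the number of rows in the table gives
    -- the generic per-value estimate used when no value is sniffed.
    DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail', 'IX_SalesOrderDetail_ProductID')
         WITH DENSITY_VECTOR;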

     

    A parameter looks for most purposes exactly like a regular variable. The difference is that a parameter is declared in the header of a separately executable code unit: a stored procedure, scalar user-defined function, or multi-statement user-defined function. (And since you should as a rule not use the latter two, I’ll use a stored procedure for my example). The optimizer treats the body of a stored procedure like an ad-hoc batch: when the procedure is invoked the plan cache is first checked, and when no cached plan for the procedure is found it is generated and then stored for future reuse. The key difference is how parameters are treated. To see this in action, run the below script in the AdventureWorks2012 sample database (though it probably also works in other versions of AdventureWorks), with the option to show the actual execution plan enabled. It contains four batches, to create a sample stored procedure, invoke it twice, and then drop the procedure again. The discussion below will focus on the second and third batches, with the two EXEC statements.

     

    CREATE PROC dbo.ParameterSniffingDemo

                    @ProductID int

    AS

    SELECT  SUM(OrderQty)

    FROM    Sales.SalesOrderDetail

    WHERE   ProductID = @ProductID;

    GO

     

    -- Run the procedure, then check the execution plan.

    EXEC dbo.ParameterSniffingDemo @ProductID = 898;

    GO

     

    -- Run the procedure again, for a different product.

    EXEC dbo.ParameterSniffingDemo @ProductID = 897;

    GO

     

    -- Clean up

    DROP PROC dbo.ParameterSniffingDemo;

    GO

     

    The second batch in the script above, like any other batch, is first parsed and compiled. It is important to be aware that only the batch itself is compiled at this time. Once the compilation is done, SQL Server executes the EXEC statement: it sets the parameter value to 898 and then passes control to the stored procedure. Since the procedure was just created, there is no plan in cache yet, so at this time the compiler is once more invoked to generate an execution plan for the procedure. If you read this sequence of events carefully, you will realize that, unlike “normal” variables, the value of parameter @ProductID has been set before the compiler is invoked. The optimizer can read this value and use the specific statistics for ProductID 898 instead of the generic statistics it would use for a normal variable, to create an execution plan that is optimal for this “sniffed” value.

     

    The third batch invokes the stored procedure again, with a different value passed into it. The batch itself is different from the second batch, so this batch, too, will be first compiled and then executed. And again control will pass to the stored procedure when execution starts, after setting the parameter to its value. But now the compiler will see that there already is an execution plan available for the stored procedure in the plan cache, so instead of invoking the (relatively expensive) optimization process again, it will reuse the existing plan.

     

    [Screenshot: actual execution plan for the second execution of dbo.ParameterSniffingDemo]

     

    The picture above shows the execution plan of the third batch (the second execution plan in the output, as the CREATE PROC and DROP PROC statements do not generate any actual execution plans). Evidence of the parameter sniffing process is present in the plan, but not directly visible in the graphical representation. To see the evidence, you will have to right-click the top left SELECT operator, and then click “Properties” from the context menu. This brings up the full list of properties, as shown below:

     

    [Screenshot: Properties window of the SELECT operator, with the Parameter List section expanded]

     

    The “Parameter List” section is collapsed by default; I have already expanded it in the screenshot above. This is where you can see exactly what happened: there is one parameter in this query; it had a value of 898 (“Parameter Compiled Value”) when the plan was compiled and stored in cache, and during this specific execution the value of the parameter was 897 (“Parameter Runtime Value”).

     

    Variable sniffing

     

    Above, I explained the general process of SQL Server compiling batches and entire stored procedures fully before execution starts; (non-parameter) variables do not have a value at compile time, so the optimizer can only use the more generic statistics. I left out one more specific part: statement-level recompiles.

     

    SQL Server may discard old execution plans and restart the compilation process for various reasons. These reasons fall generally in one of three categories: correctness (any change that might cause the existing plan to fail or to return incorrect data, like dropping an index that might be used in the plan, modifying a table, changing a constraint that might have enabled a plan simplification, etc); performance (any change that might enable SQL Server to find a much faster plan, like adding an index, rebuilding statistics, adding constraints, etc); or because you tell it to (by explicitly clearing the plan cache, or by using hints).

     

    Some of the reasons for recompilation can occur while a multi-statement batch or stored procedure executes. SQL Server detects this before the next statement starts, and then only that statement will be recompiled. This leads to a rather unique situation: execution has started, so the variables now do have a value, which would enable the optimizer to sniff that value just as it sniffs parameters. Except … well, it doesn’t. At least not always. In fact, there is only one very specific case where SQL Server will sniff variables: when a statement-level recompile occurs due to an explicit hint.

     

    Here is a very simple example, again using AdventureWorks2012 (and again, probably working in other versions of AdventureWorks as well):

     

    DECLARE @Variable varchar(20);

    SET @Variable = 'B';

     

    SELECT  FirstName, LastName, Title

    FROM    Person.Person

    WHERE   LastName < @Variable

    OPTION (RECOMPILE);

     

    SELECT  FirstName, LastName, Title

    FROM    Person.Person

    WHERE   LastName < @Variable;

     

    If you copy this query and, without first running it, request an estimated execution plan, you will see the plan the optimizer creates when first compiling the batch:

     

    [Screenshot: estimated execution plans for both queries]

     

    While the batch is being compiled, the optimizer does not know the value of @Variable so it has to use a broad assumption – in the case of this type of inequality filter that assumption is that 30% of the rows will match, so the plan for the queries is based on an estimate of almost 6000 rows. The RECOMPILE hint on the second query does not affect the initial compilation of the batch; both queries get the exact same plan.

     

    If you then actually execute the batch, with the option to include the actual execution plan enabled, you will see this:

     

    [Screenshot: actual execution plans after running the batch]

     

    The first plan has changed. The RECOMPILE hint forces SQL Server to recompile this query, and because this is a statement-level recompile, the SET statement has already executed. SQL Server can sniff the value ‘B’ in @Variable, use that to estimate that only about 900 rows will match, and then generate a plan that is optimized for that lower number of rows. You can play around with the value in @Variable to see that the plan used for the first query can actually change depending on this value.

     

    The plan for the second query will never change. There is no RECOMPILE hint, and there were no other reasons for SQL Server to recompile this statement, so this query will always use the plan that was generated when the batch was compiled.

     

    Fun fact: if you request an estimated execution plan after running the query, you will get the last plan that was actually used. That is because the statement-level recompile will update the plan cache entry for this batch with the new execution plan. The requested estimated execution plan will be presented to you from this plan cache entry. When you actually run the batch again, the initial compilation of the batch will also use the cached entry without redoing the full compilation process, but at run-time the RECOMPILE hint will still trigger a recompilation of the first query only – which will now result in the same plan again, unless you modified the data in the Person.Person table. This is a very important consideration. A lot of people think that adding OPTION (RECOMPILE) to a query results in the execution plan not being cached; this is not true. Those queries still result in the same memory footprint on the plan cache, with the additional cost of not only compiling the query whenever it runs, but also updating the plan in the cache on every execution.

     

    Another interesting observation can be made when you right-click the SELECT operators on the two plans above and look at the properties. The second plan, which was not recompiled and sniffed, does have a Parameter List section in the plan, but in this case there is only a Parameter Runtime Value, no Parameter Compiled Value:

     

    [Screenshot: Parameter List of the non-recompiled plan, showing only a Parameter Runtime Value]

     

    (The term “parameter” is confusing; within the context of execution plan properties SQL Server uses this term for every variable that is used in the query, regardless of whether they are actual parameters, or “normal”, non-parameter variables).

    For the first plan, the properties of the SELECT operator do not include a Parameter List section at all, which is even more confusing – this was the section where we saw evidence of parameter sniffing, and now that a variable is sniffed there is no evidence at all! And how can SQL Server even produce the correct results without embedding the variable and its runtime value in the plan? To get an answer to that question, we’ll have to look at the properties of another operator, the Index Seek (NonClustered) at the far right. And in this case we do not even have to open the full property list, we can just hover our mouse over the operator and wait for the tooltip window to pop up:

     

    [Screenshot: tooltip of the Index Seek (NonClustered) operator]

     

    This shows that the plan that is used for the statement with the OPTION (RECOMPILE) hint does not reference @Variable at all. What actually happened is that, during the statement level recompile, the parser inspected the current value of @Variable and then replaced all references to this variable in the query (or rather, in the internal representation of the query) by the value. The input that was finally received by the optimizer was exactly the same as if we had submitted the query below:

     

    SELECT  FirstName, LastName, Title

    FROM    Person.Person

    WHERE   LastName < N'B'

    OPTION (RECOMPILE);

     

    For “normal” parameter sniffing, this would not be a safe implementation. The next call to the stored procedure could pass a different parameter value, and if the sniffed value were hardcoded in the plan, then the results would obviously be incorrect. That’s why a plan with parameter sniffing must keep the parameter; it optimizes for the sniffed value, but will also produce correct results for other values. In the case of variable sniffing, there is no need to keep the variable as a variable in the execution plan. A cached plan will only be reused if there is a full textual match on the full query text, which includes the OPTION (RECOMPILE) hint – and that option guarantees that the cached plan will be overwritten with a new plan rather than be reused. So the choice to implement variable sniffing differently, by directly injecting the values into the plan, is safe.

     

    But that same safe choice also results in a loss of variable sniffing in other scenarios. When a statement-level recompile occurs for a different reason, for instance because enough rows were added to a table to trigger an automatic update of the statistics on that table, then no sniffing is done. This makes sense when you consider what would otherwise happen. Suppose I have a batch and the third statement uses a variable; just before it executes the threshold for automatic statistics update is hit so new statistics pop up and the optimizer recompiles the query. If it would sniff the variable, it would hard-code the sniffed value in the plan, then replace the old cached plan with the new plan. Next time I execute the same batch, the variable can have a new value – but the text of the batch has not changed, and it is unlikely that there will be a statement-level recompile for another reason. So the updated plan from the cache, that now includes a hard-coded value, would be used – but that value is no longer correct and wrong results would be returned. That is of course not acceptable, and for that reason SQL Server will only sniff variables when a statement-level recompile is forced by the OPTION (RECOMPILE) query hint.

     

    In theory, it would have been possible for Microsoft to implement a second form of variable sniffing, using the same method as for parameter sniffing. In reality, that is not the choice Microsoft made. In order to ensure that a statement-level recompile that is not caused by OPTION (RECOMPILE) produces a “safe” plan, the variables are simply not sniffed. The new plan will be based on the new statistics, but it will still use the generic estimates.

     

    Cardinality sniffing

     

    Table variables behave in many ways exactly the same as normal variables. This includes the optimization process. So when a batch or stored procedure is submitted, the parser interprets the declaration of the table variable and then compilation starts without any data being in the table variable. Based on what is actually in the table variable at the time of compilation, one might expect an estimate of zero rows. However, the optimizer always assumes at least one row (unless the query includes a filter that cannot possibly be true). If you run the query below, you will see that the first two SELECT queries both have an estimated rowcount of 1, even though the actual rowcount is 0 for the first query and 19,972 for the second.

     

    DECLARE @TabVar table

           (PersonID int NOT NULL PRIMARY KEY);

     

    SELECT  PersonID

    FROM    @TabVar;

     

    INSERT  @TabVar

    SELECT  BusinessEntityID

    FROM    Person.Person;

     

    SELECT  PersonID

    FROM    @TabVar;

     

    SELECT  PersonID

    FROM    @TabVar

    OPTION (RECOMPILE);

     

    In the case of the simple queries above, this huge mistake in cardinality estimation does not affect the execution plan. But in more complex queries this can often result in very slow running queries. This is one of many reasons why most people tend to prefer temporary tables over table variables unless they know for sure that there will never be more than a handful of rows involved. This problem can be avoided by adding OPTION (RECOMPILE), as shown in the last SELECT above. This forces a statement-level recompile and now the actual number of rows in @TabVar can be used by the optimizer. This can often help prevent execution plans with terrible performance, but at the price of a full compilation for the statement every time it executes.
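    For comparison, here is a sketch of the same test with a temporary table; temporary tables do get statistics, so the estimate is realistic without needing a recompile hint:

    CREATE TABLE #TabVar
           (PersonID int NOT NULL PRIMARY KEY);

    INSERT  #TabVar
    SELECT  BusinessEntityID
    FROM    Person.Person;

    SELECT  PersonID
    FROM    #TabVar;     -- estimate based on statistics on the temporary table

    DROP TABLE #TabVar;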

     

    All of the above is pretty well known and described at many places; I only include this basic information because it is important to understand what “cardinality sniffing” is. Cardinality sniffing is related to parameter sniffing, because it only relates to parameters passed into stored procedures (and other executable code modules). It is also related to table variables. In fact, cardinality sniffing is specifically related to table-valued parameters – table variables passed as a parameter into a stored procedure. Here is an example:

     

    -- Define a type for the table variables

    CREATE TYPE TabType AS TABLE

           (PersonID int NOT NULL PRIMARY KEY);

    GO

     

    -- Create a stored procedure that uses the table variable

    CREATE PROC dbo.SniffCardinality

           @InputTable TabType READONLY

    AS

    SELECT  PersonID

    FROM    @InputTable;

    GO

     

    -- Create and populate a table variable

    DECLARE @TabVar AS TabType;

     

    INSERT  @TabVar

    SELECT  BusinessEntityID

    FROM    Person.Person;

     

    -- Invoke the stored procedure the first time

    EXEC dbo.SniffCardinality @TabVar;

     

    -- Now remove some of the rows from the table variable

    DELETE  @TabVar

    WHERE   PersonID < 20000;

     

    -- Invoke the stored procedure again

    EXEC dbo.SniffCardinality @TabVar;

    GO

     

    -- Clean up

    DROP PROC dbo.SniffCardinality;

    DROP TYPE TabType;

    GO

     

    If you run the batch above with the option to include the actual execution plan enabled, you will see a bunch of plans for all the queries in the last batch. The second and fourth are the executions of the stored procedure. Previously when we ran the same SELECT statement in a batch, the estimated number of rows for the table variable was 1. Now the estimates are different, as shown in the picture below (note that I edited the picture to remove irrelevant execution plans and include two tooltip windows):

     

    [Screenshot: execution plans for the two executions of dbo.SniffCardinality, with tooltips showing the estimated number of rows]

     

    As you see, the first estimate is exactly correct. That is because the table-valued parameter is used by the optimizer just as it uses other parameters. So when the stored procedure is first executed and no plan for it is in cache yet, it will sniff the table-valued parameter just as it sniffs other parameters. Now it is important to realize that this sniffing is limited. The optimizer will not read data from the table variable to get insight on the data in it. It only sniffs the metadata that is available at the time of compiling the plan. And because table variables do not have statistics, that metadata is limited to only the number of rows in the table variable that is passed into the stored procedure for the first execution.

     

    The second estimate is wrong. After the first call to the stored procedure we deleted most of the rows from the table variable, but the execution plans that are used for the stored procedure are all still optimized on the original row count. That is because the execution simply reused the previously compiled plan from the plan cache. A change in the cardinality on the table-valued parameter, like a change to the value of a regular parameter, is no reason to recompile the execution plan for the stored procedure.
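    If this stale estimate causes real problems, one hedged workaround (with the compile-per-execution cost discussed earlier) is to add a recompile hint inside the procedure, so that the actual row count is sniffed again on every call:

    ALTER PROC dbo.SniffCardinality
           @InputTable TabType READONLY
    AS
    SELECT  PersonID
    FROM    @InputTable
    OPTION  (RECOMPILE);   -- re-sniffs the cardinality of the table-valued parameter at each execution
    GO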

     

    I already mentioned that cardinality sniffing has, to the best of my knowledge, never been documented before. That is not really surprising. I have yet to see my first table-valued parameter in “real” client code, and I hear almost nobody ever talk about them, so the feature appears to be pretty niche. However, when you happen to work on a system that does use table-valued parameters, then you might be faced with cardinality sniffing and its possible performance consequences. (And if not, then you can at least add “Hey, would you like to sniff my cardinality?” to your collection of useless pickup lines).

     

    Conclusion

     

    In this post, I first elaborated a bit on the relatively well-known concept of parameter sniffing, giving some background information into the internals at play in this process. I then went on to explain the much lesser known concept of variable sniffing, and proved that (unlike popular opinion by those who do talk or write about this concept) it actually uses quite different internal mechanisms from parameter sniffing, and that for this reason variable sniffing only happens when a recompile is manually enforced.

     

    In the last paragraph I then introduced a previously unknown form of sniffing: cardinality sniffing. In a process similar to parameter sniffing, the number of rows in a table variable can be sniffed when it is passed as a table-valued parameter into a stored procedure.

     

    In the opening paragraph I already hinted that parameter sniffing, though well-known, is also very misunderstood. It is neither “bad” (as suggested by many experts), nor “mostly good with an occasional exception” (as suggested by most others), but actually “sometimes good, sometimes bad, and mostly irrelevant”. I will explain this in my next blog post on this subject.

  • Upcoming presentations. And an offer.

    It’s not my habit to announce my future presentations on this blog. But I sometimes make exceptions to this rule, and today is a good day for such an exception. Especially since I have been allowed to offer some money off for one of my upcoming presentations.

    I’ll fly to Copenhagen tomorrow, for SQL Saturday Denmark, where I will present two sessions: a 60-minute talk (“T-SQL User-Defined Functions, or: Bad Performance Made Easy”) about how T-SQL user-defined functions can wreck your performance, and a 10-minute lightning talk (“Managing Execution Plans”) about the similarities between execution plans and my coworkers.

    Three weeks later, on October 8, I will be in Germany for SQL Saturday Munich, where I will once more present the “T-SQL User-Defined Functions, or: Bad Performance Made Easy” session – so if you cannot make it to Denmark in time, feel free to sign up for Munich!

    Late October, you can find me in Seattle at the 2016 PASS Summit. If you’ve never heard of this conference, then you are really missing out on something. It’s three days (or five if you include pre-cons) packed full of literally hundreds of sessions delivered by high-end speakers and Microsoft staff, access to Microsoft support staff, and a chance to network with thousands of your peers. Plus, it’s also a lot of fun. And I am very happy to say that I have been selected to present two sessions this year. On October 27, I will deliver the lightning talk “Managing Execution Plans”, in a lightning talk session that also features Rob Volk, Russ Thomas, and Wayne Sheffield. And on October 28, I will present a 75-minute deep-dive session, “Now Where Did THAT Estimate Come From?”, on how SQL Server uses statistics and assumptions to estimate how many rows will be returned by a query, or in intermediate steps in the execution plan.

    And your last opportunity to see me this year will be in Orlando, December 5-9. I will be speaking at Live! 360, a single event that encapsulates six co-located conferences: AppDev Trends, Modern Apps Live!, Office & Sharepoint Live!, SQL Server Live!, TechMentor, and Visual Studio Live!. A single ticket gives access to all these conferences. I will of course speak in the SQL Server track. On the afternoon of December 8, I am scheduled to present two back-to-back sessions (perhaps to ensure that you’ll really want to leave at the end of the day?). In “Powerful T-SQL Improvements that Reduce Query Complexity”, I will explain how the introduction of and later improvements to the OVER clause have simplified code that used to be horrendous, and in “T-SQL User-Defined Functions, or: Bad Performance Made Easy”, I will once more explore the do’s and don’ts of T-SQL user-defined functions, this time with a bit of extra depth because I have 75 minutes available.

    Save money on Live! 360

    I of course hope to see all of you at all those conferences. But I understand that not everybody has the budget to afford them all. The two SQL Saturdays are easy, since they are free, but both the PASS Summit and Live! 360 are paid conferences. I absolutely believe that it’s money well spent, because you get an absurd amount of training for your money at those conferences – but it still has to come out of your pocket, or out of a training budget, and those tend to be finite.

    In the case of Live! 360, I might be able to help out. A bit. The full five-day conference currently sells for $1,995 (Super Early Bird); the price will go up to $2,095 (Early Bird) after October 5, and to $2,395 (Standard Pricing) after November 2. However, I have been allowed to offer readers of my blog a discounted price of $1,895, and that offer remains valid until just before the conference (or until the conference sells out, whichever comes first). The only thing you need to do if you want to make use of this offer is to enter the Priority Code LSPK43 when registering. Or you can simply use this link, which pre-populates the code for you so you cannot even forget it!

    (Unfortunately, the discount cannot be applied to existing registrations or combined with other discounts. It is valid for the 5-Day Package only)

    Shake hands?

    With four conferences coming up in the next few months, I hope that a lot of my readers will find an opportunity to attend at least one of those. If you do, then please feel free to find me in between sessions, introduce yourself, and have a friendly chat.

  • The DIY guide for local-global aggregation

    If you have read the title, you will not be surprised when I tell you that this blog is about a thing called “local-global aggregation”, and on how to do that yourself. So let’s start with the two obvious questions: what the heck is local-global aggregation anyway, and why the heck would you ever want to do it yourself?

     

    What is local-global aggregation

     

    The term local-global aggregation may sound scary, but it’s not that hard to understand. You might even already know the concept if you have ever heard the arguably most-used explanation of parallelism. You have a jar of beans and need to count them. Instead of doing it all by yourself, you trick a few of your gullible friends into helping you, with the promise of a nice beer after an “easy task”. Each of them gets a few scoops of beans, counts them, and writes down the number they have. You collect the papers, sum up the numbers, and the result is the number of beans that were in the jar before you started scooping. Congratulations! You have just done a local-global aggregation.

     

    A more technical explanation that wraps this example back to SQL Server (or, in fact, any data processing application) is that a set of data is divided into subgroups (your friends each getting their share of beans). Each of these subgroups is then aggregated on its own (your friends counting their beans); this is the “local” part of the aggregation, also called “partial” aggregation (a term that in my opinion actually better describes what is going on, but unfortunately “local” is the more commonly used term). The last step is to combine the separate local aggregation results into a single final result (you tallying up the numbers counted by your friends); the “global” aggregation step. Note that the overall amount of work is not less when using this pattern. However, because the work can now be spread over multiple workers, the task still finishes faster. On a single-threaded platform, it would not make sense to work like this.

     

    The basic concept sounds simple, but it can get more complex depending on what type of aggregation you need done. What if you need to know the minimum and maximum weight of the beans in the jar? You could buy a few precision scales, gather your friends again, and have each of them start weighing beans to find the minimum and maximum in their share; you would then just select the lowest and highest weight from all their results to get your final result. But how about the average weight? You could once more get out the scales and gather your friends (assuming you still have any after making them weigh thousands of beans for a single beer for the previous exercise) and have each of them compute the average weight for their share of beans … but then what? How do you get the global average from the local averages? Problem is: you can’t. To understand why, here is an extremely simple example, using just two friends and three beans. Friend 1 gets two beans weighing 1.1 and 1.3 gram; friend 2 gets the third bean which weighs 1.5 gram. You get two results from them: an average weight of 1.2 and an average weight of 1.5. Now how do you combine those two numbers to get the correct average of all beans, which is 1.3? Again, you can’t.

     

    This does not mean that you can’t use local-global aggregation to get an average, it just means that you have to be smarter. You should have asked your friends to tell you the number of beans and their total weight. From those local aggregates, you could then compute the global aggregates, total weight and total number of beans (3.9 gram and 3 beans), and then compute the average by dividing those numbers (3.9 / 3 = 1.3 gram). So to do a correct local-global aggregation for an average aggregate, you have to get different aggregates at the local level, which can then be combined in the correct formula to get the aggregate you need. For an average, this is fairly simple. But once we start looking at statistical aggregate functions such as standard deviation and variance, the trick becomes more complicated. The basic trick remains the same (compute some other aggregate at the local level, then combine with the appropriate formula), but the actual formula itself becomes more complex. More on that later.
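    To make this concrete, here is a minimal T-SQL sketch of the bean example (the table variable and its contents are of course made up); it shows that averaging the local averages gives the wrong answer, while carrying a local sum and count to the global level gives the right one:

    -- Hypothetical bean data: friend 1 gets two beans, friend 2 gets one
    DECLARE @Beans table (FriendID int, Weight decimal(4,2));
    INSERT INTO @Beans (FriendID, Weight)
    VALUES (1, 1.1), (1, 1.3), (2, 1.5);

    -- Wrong: the average of the local averages (1.2 and 1.5) is 1.35, not 1.3
    SELECT AVG(LocalAvg) AS WrongAverage
    FROM  (SELECT   AVG(Weight) AS LocalAvg
           FROM     @Beans
           GROUP BY FriendID) AS LocalAggs;

    -- Right: carry SUM and COUNT to the global level, then divide (3.9 / 3 = 1.3)
    SELECT SUM(TotalWeight) / SUM(NumBeans) AS CorrectAverage
    FROM  (SELECT   SUM(Weight) AS TotalWeight, COUNT(*) AS NumBeans
           FROM     @Beans
           GROUP BY FriendID) AS LocalAggs;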

     

    Local-global aggregation done by the optimizer

     

    You usually don’t need to worry about local-global aggregation. Minds far greater than ours have done all the hard work and implemented this logic in the query optimizer, so everything works smoothly without us having to do anything special for it.

     

    The most obvious place where you expect this to happen is in parallel plans. Here is a very simple example, based on the Microsoft Contoso BI Demo database which can be found here. (I normally prefer to base my examples on AdventureWorks, but the tables in that database are too small to get parallelism on simple example queries).

     

    SELECT AVG(TotalCost)

    FROM dbo.FactOnlineSales;

     

    The execution plan for this extremely simple query looks like this:

     

    image

     

    The query has just a single aggregation, but the plan has two Stream Aggregate operators. The rightmost Stream Aggregate, running in the parallel section, does the local aggregation. If you open the properties and look at the Defined Values property, you will see that it computes two aggregates: “partialagg1004=Count(*)” and “partialagg1006=SUM(TotalCost)” (I have simplified these slightly from the actual expressions in the plan). To the left, in the serial section of the plan, a second Stream Aggregate operator computes the global aggregates: “globalagg1005=SUM(partialagg1004)” and “globalagg1007=SUM(partialagg1006)”. So globalagg1005 is the total number of rows, and globalagg1007 is the grand total of all TotalCost values. The Compute Scalar operator then uses those two global aggregates to compute the average that the query actually requested: “Expr1003=CASE WHEN globalagg1005=0 THEN NULL ELSE globalagg1007 / globalagg1005 END” – in other words, total cost divided by number of rows, or NULL if the number of rows was zero.
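    If you prefer to see that logic as T-SQL rather than as plan properties, a manual sketch of what the plan computes could look like the query below. The partitioning here is illustrative only – in the parallel plan the rows are split over threads, not by a column – and I am assuming that FactOnlineSales has a ProductKey column to group on:

    SELECT CASE WHEN SUM(PartialCount) = 0 THEN NULL
                ELSE SUM(PartialSum) / SUM(PartialCount)
           END AS AvgTotalCost                          -- mirrors the Compute Scalar expression
    FROM  (SELECT   COUNT_BIG(*)   AS PartialCount,     -- local count per subgroup
                    SUM(TotalCost) AS PartialSum        -- local sum per subgroup
           FROM     dbo.FactOnlineSales
           GROUP BY ProductKey) AS LocalAggs;           -- an arbitrary way to form subgroups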

     

    Parallel plans are not the only place where the optimizer will use local-global aggregation. In some other queries it can use this as a trick to save work. See the query below for an example (this time I did base the query on AdventureWorks, because I want to have a serial plan).

     

    SELECT     p.Color, SUM(sod.OrderQty)

    FROM       Sales.SalesOrderDetail AS sod

    INNER JOIN Production.Product AS p

          ON   p.ProductID = sod.ProductID

    GROUP BY   p.Color;

     

    The query requests an aggregation by colour, but because colour isn’t stored in the SalesOrderDetail table we need to join to the Product table. The most straightforward execution plan for this would look something like this (which I made up by using copy and paste from various other plans – we can’t all know all undocumented trace flags by heart):

     

    image

    This made-up plan is a very straightforward implementation of the query: join every row from SalesOrderDetail to its matching row in Product, then aggregate by colour and present the results. In this plan, the join has to operate on all 121,317 rows from SalesOrderDetail, which is estimated to contribute over a quarter of the total cost of the plan.

     

    The execution plan that you actually get for the query above (when running on SQL Server 2012) looks like this:

     

    image

     

    You will see two key differences: far fewer rows going into the join operator (just 266), and two aggregation operators instead of one, forming a local-global aggregation pattern. The Hash Match (Aggregate) operator to the right and in the lower branch does the local aggregation, and the Stream Aggregate operator to the left does the global aggregation. I will not dive into the exact Defined Values properties for each of them, as they are pretty basic (since the query merely requests a SUM). In this case, it is far more interesting to look at the grouping level.

     

    The global aggregate obviously has to group by colour, as requested in the query, and this is confirmed if you look at the Group By property of the Stream Aggregate. But what happens in the local aggregation in the Hash Match (Aggregate) operator? The Product table is not joined yet so we do not have the colour available. To see how the groups are defined here, we’ll have to look at the Hash Keys Build property of this operator, which tells us that the local aggregation groups by ProductID.

     

    What happens here is that the global aggregation, which has a group for each colour, is subdivided into a group for each product. Some colours might be used for just a single product, but most are used in multiple products. You can see this by looking at the Number of Rows property of various operators – after grouping by ProductID the 121,317 rows from SalesOrderDetail are reduced to just 266 rows for 266 distinct products, and after the final aggregation we are left with just 9 rows, for nine colours.

     

    As before, the local-global aggregation pattern does not save work by itself; it actually introduces a bit of extra work. And in this case the plan is serial so we never have more than one worker. And yet this pattern makes sense here – because the extra work introduced by the local-global aggregation is far less than the work saved by joining 266 instead of 121,317 rows.
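    For readers who prefer T-SQL over plan pictures, here is a manual rewrite (a sketch, equivalent in result to the original query) that expresses the same local-global pattern the optimizer chose: aggregate per ProductID first, then join to Product and aggregate again per colour:

    SELECT     p.Color, SUM(sod.QtyPerProduct) AS TotalQty
    FROM      (SELECT   ProductID, SUM(OrderQty) AS QtyPerProduct   -- local aggregation
               FROM     Sales.SalesOrderDetail
               GROUP BY ProductID) AS sod
    INNER JOIN Production.Product AS p
          ON   p.ProductID = sod.ProductID
    GROUP BY   p.Color;                                             -- global aggregation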

     

    Local-global aggregation done by you

     

    So far I have shown you what local-global aggregation is, and how the optimizer will use it when this pattern can improve the overall query performance. It is now time to address the “DIY” part of the title.

     

    There may be situations where a query might benefit from local-global aggregation, but the optimizer can’t or doesn’t use it by itself. This happens very often when you rewrite queries on a columnstore table to work around the limitations of batch mode execution; I cover this extensively in levels 10 and 11 of the Stairway to Columnstore Indexes (not yet published at the time of writing this blog post). It can also, less frequently, happen in other situations.

     

    To see an example, let’s return to the AdventureWorks sample database. Two of its tables are Production.TransactionHistory and Production.TransactionHistoryArchive. The second one is used for older data that will not change anymore and is less frequently queried. There are lots of ways to implement such a business need; this is the one chosen for AdventureWorks (and one that I have also seen at a lot of real companies).

     

    The management dashboard includes a view on these tables that shows aggregated data from these tables, as follows:

     

    CREATE VIEW DashBoard_ProductsSold

    WITH SCHEMABINDING

    AS

    SELECT   ProductID, SUM(Quantity) AS TotalSold

    FROM    (SELECT ProductID, Quantity

             FROM   Production.TransactionHistory

             UNION ALL

             SELECT ProductID, Quantity

             FROM   Production.TransactionHistoryArchive) AS c

    GROUP BY ProductID;

     

    Lately, the DBA team has noticed that more and more managers get access to this dashboard, and some tend to refresh quite often. And every time the dashboard is refreshed, all 200K rows in the two tables have to be read. This is starting to affect overall performance of the system, so we want to index this view. The benefit of an indexed view is that the results are stored, so now refreshing the dashboard is just a simple index scan. The downside is that the stored results need to be updated whenever data changes so there will be some overhead on modifications. In this case, we expect this overhead to be outweighed by the saved IO and processing whenever the dashboard is loaded or refreshed.

     

    However, when we try to index the view we get an error:

     

    CREATE UNIQUE CLUSTERED INDEX cix_DashBoard_ProductsSold

    ON dbo.DashBoard_ProductsSold (ProductID);

     

    Msg 10109, Level 16, State 1, Line 1

    Cannot create index on view "AdventureWorks2012.dbo.DashBoard_ProductsSold" because it references derived table "c" (defined by SELECT statement in FROM clause). Consider removing the reference to the derived table or not indexing the view.

     

    There are lots of limitations for indexed views, and this is one of them. Because the data is in two different tables, we have to combine it and then aggregate the combined result. That isn’t possible without using a derived table or another similar (and also forbidden) construction, so we will have to find another way to optimize the dashboard. And that’s where DIY local-global aggregation comes in. In this case the data already is divided in two smaller groups (the two tables); instead of combining the data and then aggregating, we can aggregate each table individually and then combine the results. Let’s show this in two steps.

     

    The first step is for the local aggregation. For this, we need to create two new views:

     

    CREATE VIEW DashBoard_ProductsSold_History

    WITH SCHEMABINDING

    AS

    SELECT   ProductID, SUM(Quantity) AS TotalSold

           , COUNT_BIG(*) AS NumRows            -- Added to satisfy indexed view req'ment

    FROM     Production.TransactionHistory

    GROUP BY ProductID;

    GO

    CREATE VIEW DashBoard_ProductsSold_History_Archive

    WITH SCHEMABINDING

    AS

    SELECT   ProductID, SUM(Quantity) AS TotalSold

           , COUNT_BIG(*) AS NumRows            -- Added to satisfy indexed view req'ment

    FROM     Production.TransactionHistoryArchive

    GROUP BY ProductID;

     

    These two views can each be indexed without problems:

     

    CREATE UNIQUE CLUSTERED INDEX cix_DashBoard_ProductsSold_History

    ON dbo.DashBoard_ProductsSold_History (ProductID);

    CREATE UNIQUE CLUSTERED INDEX cix_DashBoard_ProductsSold_History_Archive

    ON dbo.DashBoard_ProductsSold_History_Archive (ProductID);

     

    And then, as the final step, we change the original view to return the same results by performing global aggregation on the locally aggregated data from the new views:

     

    ALTER VIEW DashBoard_ProductsSold

    WITH SCHEMABINDING

    AS

    SELECT   ProductID, SUM(TotalSold) AS TotalSold

    FROM    (SELECT ProductID, TotalSold

             FROM   dbo.DashBoard_ProductsSold_History WITH (NOEXPAND)

             UNION ALL

             SELECT ProductID, TotalSold

             FROM   dbo.DashBoard_ProductsSold_History_Archive WITH (NOEXPAND)) AS c

    GROUP BY ProductID;

     

    (The NOEXPAND hints are added to ensure that this works on any edition of SQL Server; on the Enterprise and Developer editions the hints are not required but they do not hurt either).

     

    The last view cannot be indexed because of the derived table. But this view does not need to be indexed, because the local aggregation views already are indexed. If you query this view and check the execution plan, you’ll see fewer than 1,000 rows read from the two views, as opposed to the 200K rows we were reading before.
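    For example, a quick check could look like this (assuming the views were created in the dbo schema, as the index statements above imply):

    SET STATISTICS IO ON;
    SELECT ProductID, TotalSold
    FROM   dbo.DashBoard_ProductsSold;
    SET STATISTICS IO OFF;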

     

    With these same locally aggregated indexed views as input, I can easily extend the dashboard to also show an average, or a count of rows:

     

    ALTER VIEW DashBoard_ProductsSold

    WITH SCHEMABINDING

    AS

    SELECT   ProductID,

             SUM(TotalSold) AS TotalSold,

             SUM(NumRows) AS NumRows,

             SUM(TotalSold) * 1.0 / SUM(NumRows) AS AvgSold

    FROM    (SELECT ProductID, TotalSold, NumRows

             FROM   dbo.DashBoard_ProductsSold_History WITH (NOEXPAND)

             UNION ALL

             SELECT ProductID, TotalSold, NumRows

             FROM   dbo.DashBoard_ProductsSold_History_Archive WITH (NOEXPAND)) AS c

    GROUP BY ProductID;

     

    Adding a minimum or a maximum is slightly more work because I would need to add the MIN and MAX functions to the locally aggregated views, but even this is hardly rocket science. However, things get complicated when management asks to include more advanced statistical information.
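    As a sketch (the MinSold and MaxSold column names are mine): after adding MIN(Quantity) AS MinSold and MAX(Quantity) AS MaxSold to both local views, the global view would combine them like this:

    SELECT   ProductID,
             MIN(MinSold) AS MinSold,   -- the global minimum is the minimum of the local minimums
             MAX(MaxSold) AS MaxSold    -- and the same goes for the maximum
    FROM    (SELECT ProductID, MinSold, MaxSold
             FROM   dbo.DashBoard_ProductsSold_History WITH (NOEXPAND)
             UNION ALL
             SELECT ProductID, MinSold, MaxSold
             FROM   dbo.DashBoard_ProductsSold_History_Archive WITH (NOEXPAND)) AS c
    GROUP BY ProductID;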

     

    Those pesky statistical functions

     

    SQL Server offers four statistical functions out of the box that are more advanced than AVG: VAR, VARP, STDEV, and STDEVP. These, too, can be used in a local-global aggregation pattern. But that is more complex than for the aggregate functions we have seen so far.

     

    It is possible that you, like me, are not a daily user of these functions. And it is even possible that your high school days are so far back that you don’t really remember what they are about. But if you work with SQL Server, then I believe that you should at least know that they exist and (roughly) know what they are and what they are used for. A very short explanation is that both variance and standard deviation are a measure of how spread out the values are. The set of numbers {0, 50, 100} has the same average as {49, 50, 51}. But the distribution is very different, and the standard deviation and variance indicate that difference.
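    You can verify this claim with a quick one-off query (the row constructors below are just the two example sets):

    SELECT AVG(Val) AS Average, STDEVP(Val) AS StDevP, VARP(Val) AS VarP
    FROM   (VALUES (0.0), (50.0), (100.0)) AS WideSpread(Val);

    SELECT AVG(Val) AS Average, STDEVP(Val) AS StDevP, VARP(Val) AS VarP
    FROM   (VALUES (49.0), (50.0), (51.0)) AS NarrowSpread(Val);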

     

    The variance is defined in terms of the difference between each value and the average. The typical algorithm is to first calculate the average, then return to the data set, square the difference between each value and that average, and average those squared differences. The standard deviation is simply the square root of the variance. A small added complexity is that there is a subtle change in the calculation depending on whether you compute it based on all data or on a sample – that is why SQL Server has VAR and STDEV for the variance and standard deviation of a sample and VARP and STDEVP for the variance and standard deviation of the full population (the extra P stands for population). See here for a slightly deeper (but still fairly accessible) explanation.

     

    The problem with the standard algorithm for variance and standard deviation is that it requires two passes over the data: first the average must be found, and then we need a second pass to subtract each value from that average. For computers, where I/O typically costs more than computation, we don’t want to use two passes. Luckily, smart mathematicians have found many other algorithms to compute the same result, and some of them can be implemented with a single pass over the data. The formula that SQL Server uses internally when executing a query that uses a statistical aggregate function is the so-called “Naïve algorithm”, and these are also the formulas that we will use for our DIY local-global aggregation.
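    In its simplest form, the naïve algorithm only needs the count of rows, the sum of the values, and the sum of the squared values: the population variance is (SUM(x²) − SUM(x)² / n) / n, and the sample variance uses n − 1 as the final divisor. Here is a tiny sanity check (illustrative only) that this formula matches the built-in VARP:

    DECLARE @x table (Val float);
    INSERT INTO @x (Val) VALUES (1.1), (1.3), (1.5);

    SELECT VARP(Val) AS BuiltInVarP,
           (SUM(SQUARE(Val)) - SQUARE(SUM(Val)) / COUNT(*)) / COUNT(*) AS NaiveVarP
    FROM   @x;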

     

    Without going too deep into the formulas, I will now show the code that you can use for the local-global aggregation pattern on statistical functions. Note that in this case I have to drop and recreate all views because an indexed view does not support ALTER. Also note that the views need to use an ISNULL function because SQL Server misinterprets the SQUARE function as being nullable even when the input is not nullable; this nullability would trigger yet another limitation on indexed views.

     

    DROP VIEW DashBoard_ProductsSold;

    DROP VIEW DashBoard_ProductsSold_History;

    DROP VIEW dbo.DashBoard_ProductsSold_History_Archive;

    GO

    CREATE VIEW DashBoard_ProductsSold_History

    WITH SCHEMABINDING

    AS

    SELECT   ProductID,

             SUM(Quantity) AS TotalSold,

             SUM(ISNULL(SQUARE(Quantity),0)) AS TotalSquared,

             COUNT_BIG(*) AS NumRows

    FROM     Production.TransactionHistory

    GROUP BY ProductID;

    GO

    CREATE VIEW DashBoard_ProductsSold_History_Archive

    WITH SCHEMABINDING

    AS

    SELECT   ProductID,

             SUM(Quantity) AS TotalSold,

             SUM(ISNULL(SQUARE(Quantity),0)) AS TotalSquared,

             COUNT_BIG(*) AS NumRows

    FROM     Production.TransactionHistoryArchive

    GROUP BY ProductID;

    GO

    CREATE UNIQUE CLUSTERED INDEX cix_DashBoard_ProductsSold_History

    ON dbo.DashBoard_ProductsSold_History (ProductID);

    CREATE UNIQUE CLUSTERED INDEX cix_DashBoard_ProductsSold_History_Archive

    ON dbo.DashBoard_ProductsSold_History_Archive (ProductID);

    GO

    CREATE VIEW DashBoard_ProductsSold

    WITH SCHEMABINDING

    AS

    SELECT   ProductID,

             SUM(TotalSold) AS TotalSold,

             SUM(NumRows) AS NumRows,

             SUM(TotalSold) * 1.0 / SUM(NumRows) AS AvgSold,

             CASE WHEN SUM(NumRows) > 0

                  THEN (SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / SUM(NumRows)

             END                            AS [VarP],

             CASE WHEN SUM(NumRows) > 0

                  THEN SQRT((SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / SUM(NumRows))

             END                            AS [StDevP],

             CASE WHEN SUM(NumRows) > 1

                  THEN (SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / (SUM(NumRows) - 1)

             END                            AS [Var],

             CASE WHEN SUM(NumRows) > 1

                  THEN SQRT((SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / (SUM(NumRows) - 1))

             END                            AS [StDev]

    FROM    (SELECT ProductID, TotalSold, TotalSquared, NumRows

             FROM   dbo.DashBoard_ProductsSold_History WITH (NOEXPAND)

             UNION ALL

             SELECT ProductID, TotalSold, TotalSquared, NumRows

             FROM   dbo.DashBoard_ProductsSold_History_Archive WITH (NOEXPAND)) AS c

    GROUP BY ProductID;

    GO

     

    A small footnote – the square root expression in the calculation for standard deviation can, by mathematical standards, never be negative. But rounding errors, especially with floating point data, introduce a theoretical possibility that the computer runs into a small negative number, which could cause a run-time error. The chance of this happening is incredibly small, but if you want to make sure that this never happens, use this longer but more defensive version instead:

     

    ALTER VIEW DashBoard_ProductsSold

    WITH SCHEMABINDING

    AS

    SELECT   ProductID,

             SUM(TotalSold) AS TotalSold,

             SUM(NumRows) AS NumRows,

             SUM(TotalSold) * 1.0 / SUM(NumRows) AS AvgSold,

             CASE WHEN SUM(NumRows) = 0

                  THEN NULL

                  WHEN (SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / SUM(NumRows) < 0

                  THEN 0

                  ELSE (SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / SUM(NumRows)

             END                            AS [VarP],

             CASE WHEN SUM(NumRows) = 0

                  THEN NULL

                  WHEN (SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / SUM(NumRows) < 0

                  THEN 0

                  ELSE SQRT((SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / SUM(NumRows))

             END                            AS [StDevP],

             CASE WHEN SUM(NumRows) <= 1

                  THEN NULL

                  WHEN (SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / (SUM(NumRows) - 1) < 0

                  THEN 0

                  ELSE (SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / (SUM(NumRows) - 1)

             END                            AS [Var],

             CASE WHEN SUM(NumRows) <= 1

                  THEN NULL

                  WHEN (SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / (SUM(NumRows) - 1) < 0

                  THEN 0

                  ELSE SQRT((SUM(TotalSquared) - (SQUARE(SUM(TotalSold)) / SUM(NumRows))) / (SUM(NumRows) - 1))

             END                            AS [StDev]

    FROM    (SELECT ProductID, TotalSold, TotalSquared, NumRows

             FROM   dbo.DashBoard_ProductsSold_History WITH (NOEXPAND)

             UNION ALL

             SELECT ProductID, TotalSold, TotalSquared, NumRows

             FROM   dbo.DashBoard_ProductsSold_History_Archive WITH (NOEXPAND)) AS c

    GROUP BY ProductID;

     

    Note that in real cases, the source of the local aggregation can be diverse. It can be two or more views as in the example above, but it can also be a subquery with a different grouping level than the final query, or some other subquery, view or CTE. In all cases, the steps to build a DIY aggregate for variance and standard deviation are the same: ensure that the local aggregation computes the three ingredients (count of rows, sum of values, and sum of squares), then use the appropriate formula as shown above for the final result.

     

    Conclusion

     

    Instead of computing an aggregate function over a full data set, it is sometimes a better idea to divide the data into smaller portions, aggregate each individually, and then compute the final aggregates by combining the aggregations of the smaller sets (the partial aggregates). This is called local-global aggregation.

     

    There are cases where the optimizer will do this automatically for you. Parallelism is a prime example of this, but the optimizer can also use local-global aggregation for other optimizations.

     

    In some cases you need to explicitly force local-global aggregation in your queries. The batch execution mode, one of the reasons for the huge performance gain that columnstore indexes can achieve, has lots of limitations that require query rewrites for optimal performance; these often necessitate local-global aggregation patterns in the query. And though many of the batch mode limitations that existed in SQL Server 2012 were fixed in later versions, even in SQL Server 2016 there are still a few queries that need help to get batch mode execution.

     

    Even when not using columnstore indexes, there are still situations that require us to write a manual local-global aggregation pattern. This blog post used an example based on a current and an archive table to demonstrate how to do this.

     

    For the aggregate functions that are most commonly known, the local-global aggregation pattern is fairly easy to implement and understand. For more advanced statistical aggregate functions, such as standard deviation and variance, the formulas to use can get quite complex. In this blog post I showed a query that includes all these formulas.

     

    I hope this blog post can help you when you ever need to write a query with a local-global aggregation pattern – regardless of whether you know that term or not!

  • How TOP wrecks performance (part 2)

    In my previous post, I showed how the vanilla version of TOP can really help performance by nudging the optimizer into picking an execution plan that is optimized for returning just a few rows instead of the entire result set. But I also showed that adding WITH TIES causes a cardinality misestimate, which may lead to suboptimal or downright terrible plan choices. Luckily, this is fixed in SQL Server 2014 and later, provided you use the new cardinality estimator.

     

    In this post I will show how the PERCENT option causes even worse performance, and how you can avoid that performance hit.

     

    PERCENT

     

    Adding PERCENT to a TOP clause does exactly what it says on the label. Here is an example – very similar to the example used in the previous part, but I changed the ORDER BY for reasons I will explain later in the post.

     

    SELECT   TOP(2) PERCENT CustomerID, OrderDate, SubTotal

    FROM     Sales.SalesOrderHeader

    ORDER BY SalesOrderID;

     

    This query returns 630 rows. Why? Because without the TOP clause, the query would return 31,465 rows (the number of rows in Sales.SalesOrderHeader), and two percent of that is 630 (rounded up – TOP PERCENT always rounds up).  Unlike TOP without PERCENT, the value for TOP with PERCENT can be a fractional number, e.g. TOP (2.5) PERCENT in the previous query would return 787 rows.
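    You can verify the arithmetic quickly (31,465 is the number of rows in Sales.SalesOrderHeader):

    SELECT CEILING(2.0 * 31465 / 100) AS Top2Percent,       -- 630
           CEILING(2.5 * 31465 / 100) AS Top2Point5Percent; -- 787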

     

    Because SQL Server does not know how many rows the query without TOP would return, it can only satisfy the TOP PERCENT requirement by first executing the query as if there is no TOP to get that row count (which, for this simple query, means reading the entire table), and then return the required number of rows. Here is the execution plan for the query above:

    image

    Even though I said that SQL Server needs to first read the entire table to get the number of rows and then return to the table to get the 630 expected rows, the plan shows just a single clustered index scan operator accessing the table. So how does this work? The answer is the Table Spool operator.

     

    Let’s step through the plan, starting (as SQL Server does) at the top left operator. The SELECT operator will call the Top operator to get a row, and Top will immediately relay that GetNext row request to the Table Spool operator. Because this spool operator is running in eager spool mode, it will call the Clustered Index Scan operator not once, but over and over again until all rows are read. All rows returned are then “spooled” by the Table Spool operator – i.e., they are written to a worktable, a temporary data structure located in tempdb, from where they can later be retrieved and served (while preserving the order). The Clustered Index Scan operator itself is running as an ordered scan (as shown in the operator properties), which is the reason why this query does not need a Sort operator to satisfy the ORDER BY clause.

     

    Once all rows have been read from the clustered index and stored in the worktable, the Table Spool will return the first row to the Top operator. And, even though this is not visible in the plan, the Top operator also gets access to the total number of rows that is now stored in the worktable. This allows it to compute the number of rows it should return, and then call the Table Spool operator that number of times. For each of those calls, the Table Spool will not call out to the Clustered Index Scan operator again, but simply read the next row from the worktable and return it.

     

    At first sight this table spool may appear to be superfluous. Why not, after getting a count of rows from the table, get back directly to the base table to get the required results? In the case of this simple query that would indeed make sense. But most queries are more complex: they join a bunch of tables together, filter out part of the results based on a complex WHERE clause, perhaps also do grouping and aggregating to get the final result – those queries can be very expensive and long-running, and in that case you really do not want SQL Server to execute the query first in its entirety to just get a row count, and then execute it again to return the requested rows. The Table Spool pattern is used to prevent that double cost, and the overhead of creating the worktable is considered small enough to not bother using a different pattern for simpler queries.

     

    But there is a catch, and it is a big one. The optimizer seems to think that storing rows in a temporary table is “relatively” cheap, but in reality it is not: the mere process of storing those rows is very expensive. You can see that by rerunning the query, with the STATISTICS IO set option enabled:

     

    SET STATISTICS IO ON;

    SELECT   TOP(2) PERCENT CustomerID, OrderDate, SubTotal

    FROM     Sales.SalesOrderHeader

    ORDER BY SalesOrderID;

     

    Switch to the Messages tab after executing this, and you will see that the SalesOrderHeader table has 689 logical reads – a full scan of the entire table, as expected. But you will also see 63,730 logical reads for the worktable – that seems a bit excessive for returning just 630 rows!

     

    Spool, the incredible performance sink

     

    It took me some time before I realised that these logical reads are not produced when reading from the worktable, but when writing to it. I was able to verify this theory by changing the TOP(…) PERCENT parameter to either a very high value (e.g. 99.99999), so that all rows from the table are returned, or to a very low value (e.g. 0.00001) to return just a single row. The logical reads change, but only a bit: the number of logical reads on the worktable varies from 63,727 to return a single row to 63,888 to return them all. This indicates that over 63,720 reads are generated while building the spool, and between just one or two up to 170 or so to read up to 31,465 rows from it.

     

    In order to understand why so many logical reads are generated when populating the spool, I tried variations on the same query. When I added a WHERE clause to limit the number of rows going into the spool,  I found that the number of logical reads decreases by two for every row filtered out. When I added more columns to the SELECT list, or used larger tables as the source (e.g. SalesOrderDetail, or a cross join between two tables), that number at one point changes to a difference of three logical reads per row removed. Upon seeing these ratios, I finally figured out what is causing these excessive reads – apparently, the worktable is created with an internal clustered index on top of it (even if the results stored in it can contain duplicates), and every single insert is performed by doing a seek in that index to find the location to add the row. For tiny worktables that seek starts at just a single read, but as the worktable grows, more levels have to be added to the clustered index, causing the number of reads per row to go up to 2, 3, or even more.

     

    The first rewrite

     

    If the cost of writing to a spool is one, two, or even three logical reads per row inserted, then the overhead it introduces in the plan for the simple query above is way more than the saving. A much better and faster solution is to use a “Do it yourself” TOP PERCENT workaround, as follows:

     

    DECLARE @cnt bigint;

    SET @cnt =

     (SELECT COUNT_BIG(*)

      FROM   Sales.SalesOrderHeader);

     

    SELECT TOP(CAST(CEILING(2.0 * @cnt / 100) AS int))

             CustomerID, OrderDate, SubTotal

    FROM     Sales.SalesOrderHeader

    ORDER BY SalesOrderID;

     

    This method avoids the overhead of the worktable. An additional benefit is that the number of matching rows is now found using COUNT(*), giving the optimizer more freedom to use a different plan. In this case, it does indeed choose to use a smaller index for the first query, reducing the number of logical reads. The total amount of work for this batch is now down to 57 logical reads for the first query and 18 logical reads for the second – a huge reduction from the 689 logical reads from the base table plus 63,730 logical reads from the worktable we started with.

     

    But the proof of the pudding is in the eating, and the proof of the query is (for me) in the actual time used. Switching to SET STATISTICS TIME ON and running both queries several times shows that I did indeed achieve a nice performance improvement. The original query consistently takes approximately 300 ms elapsed time and burns 180 ms CPU time; the combined total of the two queries in the rewrite is just 150 ms elapsed time, and uses a mere 15 ms CPU time.

     

    There are drawbacks too, though. One is that the query logic now has to be written twice. The SELECT clause in the first is simplified to COUNT(*), but all the remaining logic has to be exactly copied – which makes this a time bomb for future maintenance. This can be slightly remedied by using a CTE, but this makes it a lot harder to understand the logic:

     

    WITH     TheResults AS

     (SELECT SalesOrderID, CustomerID, OrderDate, SubTotal

      FROM   Sales.SalesOrderHeader)

    SELECT TOP(CAST(CEILING(2.0 * (SELECT COUNT_BIG(*) FROM TheResults) / 100) AS int))

             CustomerID, OrderDate, SubTotal

    FROM     TheResults

    ORDER BY SalesOrderID;

     

    This version uses the exact same number of logical reads as the two queries from the other version combined, and almost the same amount of time – but it will probably be a challenge to everyone who has never seen this query pattern before.

     

    Another drawback of this method is that under high concurrency, the population of the table may change between the two queries. The harder-to-maintain single-query version makes the window of risk smaller, but is not immune – if this is an important requirement, you need to elevate the transaction isolation level to serializable, which will have a negative impact on concurrency.
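    If you do need that guarantee, a sketch of the two-query version wrapped in a serializable transaction could look like this (the explicit transaction makes both queries see a stable set of rows; be aware of the range locks this holds for the duration):

    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    BEGIN TRANSACTION;

    DECLARE @cnt bigint;
    SET @cnt =
     (SELECT COUNT_BIG(*)
      FROM   Sales.SalesOrderHeader);

    SELECT TOP(CAST(CEILING(2.0 * @cnt / 100) AS int))
             CustomerID, OrderDate, SubTotal
    FROM     Sales.SalesOrderHeader
    ORDER BY SalesOrderID;

    COMMIT TRANSACTION;
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;   -- back to the default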

     

    The second rewrite

     

    The method above is not always a good idea. We avoid using the spool by forcing SQL Server to execute the query twice (the CTE variation removes the duplication from the query itself, but not from the execution plan!). That is an improvement over the relatively high cost of the spool, but only if some conditions are met.

     

    If the optimizer can simplify the first execution of the query (to get the rowcount) to a very cheap plan, and if the percentage returned (and hence the number of rows for which the full rowset has to be produced) is small, then the saving can be huge. But what if we try this with a relatively expensive query, that does not allow simplification to get the rowcount, and we request a high percentage?

     

    The query below demonstrates just that. In order to get a sufficiently expensive query on a small database such as AdventureWorks I had to go a bit fancy, so this query is not very realistic – but on much bigger databases, more realistic queries can quickly become just as expensive or even more so.

     

    SELECT TOP(10) PERCENT

                   soh.CustomerID, soh.OrderDate, soh.SubTotal, Next1KSales.Total

    FROM           Sales.SalesOrderHeader AS soh

    CROSS APPLY

       (SELECT     SUM(sod2.LineTotal) AS Total

        FROM       Sales.SalesOrderHeader AS soh2

        INNER JOIN Sales.SalesOrderDetail AS sod2

              ON   sod2.SalesOrderID = soh2.SalesOrderID

        WHERE      soh2.SalesOrderID BETWEEN soh.SalesOrderID

                                     AND soh.SalesOrderID + 1000) AS Next1KSales

    WHERE          Next1KSales.Total > 500000

    ORDER BY       soh.SalesOrderID;

     

    This query takes roughly 118 seconds to run (both elapsed time and CPU time). In the STATISTICS IO results, I see 1,347,920 logical reads on SalesOrderDetail, 801,607 on SalesOrderHeader, and 61,042 on a worktable – this is for storing the nearly 30,000 rows that satisfy the WHERE clause, of which then 10 percent (nearly 3,000 rows) are returned. Just as in the previous example, the number of logical reads for filling the spool is excessive, so let’s try to avoid this. In this case I choose not to duplicate the query in my code, so that means I have to resort to the more esoteric version with a CTE:

     

    WITH TheResults AS

     (SELECT         soh.SalesOrderID, soh.CustomerID, soh.OrderDate,

                     soh.SubTotal, Next1KSales.Total

      FROM           Sales.SalesOrderHeader AS soh

      CROSS APPLY

         (SELECT     SUM(sod2.LineTotal) AS Total

          FROM       Sales.SalesOrderHeader AS soh2

          INNER JOIN Sales.SalesOrderDetail AS sod2

                ON   sod2.SalesOrderID = soh2.SalesOrderID

          WHERE      soh2.SalesOrderID BETWEEN soh.SalesOrderID

                                       AND soh.SalesOrderID + 1000) AS Next1KSales

      WHERE          Next1KSales.Total > 500000)

    SELECT TOP(CAST(CEILING(10.0 * (SELECT COUNT_BIG(*) FROM TheResults) / 100) AS int))

             CustomerID, OrderDate, SubTotal, Total

    FROM     TheResults

    ORDER BY SalesOrderID;

     

    Looking at the STATISTICS IO results, I see that I did indeed avoid the worktable and its 61,042 logical reads. But I also see that I have to pay a severe penalty: the number of logical reads for SalesOrderDetail has gone up by 162,901 to 1,510,821, and on SalesOrderHeader the reads have increased by 77,795 to 879,402. So in order to avoid just over 61 thousand logical reads, we had to take a hit of over 240 thousand logical reads on other tables. Does not sound like a winning proposal to me.

     

    The run times of the second query are a bit strange. For reasons beyond the scope of this post the first query does not use parallelism; the second does go parallel, which improves the elapsed time quite a bit, at the cost of much higher CPU usage. In order to get a better comparison I added a MAXDOP hint to force both plans to be serial. This does not change the logical reads, but it does change the execution times: the second plan now uses 125 seconds (both elapsed and CPU time), approximately a 6% increase over the query that simply uses TOP.

     

    Luckily, there is yet another way to rewrite the query – a method that is actually almost the same as the execution plan chosen for the original query, except that we force SQL Server to replace the implicit temporary work area of the spool with an explicit temporary table. So I rewrite the original single query as two queries. I first execute the original query without any TOP clause and store the results in a temporary table; I then return the TOP(..) PERCENT rows from that temporary table using the original rewrite – which is now efficient because it is no longer expensive to fetch the same rows multiple times.

     

    SELECT         soh.SalesOrderID, soh.CustomerID, soh.OrderDate,

                   soh.SubTotal, Next1KSales.Total

    INTO           #TheResults

    FROM           Sales.SalesOrderHeader AS soh

    CROSS APPLY

       (SELECT     SUM(sod2.LineTotal) AS Total

        FROM       Sales.SalesOrderHeader AS soh2

        INNER JOIN Sales.SalesOrderDetail AS sod2

              ON   sod2.SalesOrderID = soh2.SalesOrderID

        WHERE      soh2.SalesOrderID BETWEEN soh.SalesOrderID

                                     AND soh.SalesOrderID + 1000) AS Next1KSales

    WHERE          Next1KSales.Total > 500000

    OPTION (MAXDOP 1);

     

    SELECT TOP(CAST(CEILING(10.0 * (SELECT COUNT_BIG(*) FROM #TheResults) / 100) AS int))

             CustomerID, OrderDate, SubTotal, Total

    FROM     #TheResults

    ORDER BY SalesOrderID;

     

    DROP TABLE #TheResults;

     

    The query above produces the same result as all previous queries, but now in the fastest possible way. Of course the first query still has to do the work of evaluating the entire query, which as always necessitates 1,347,920 reads on SalesOrderDetail and 801,607 reads on SalesOrderHeader; these cannot be avoided. (Unless we put in the work to completely rewrite the query, which is probably quite possible in this case but would defeat the point of this blog post). There are however no logical reads for inserting into the temporary table, because in this case the temporary table is a heap so SQL Server does not have to search for the right location when inserting a row.

     

    The second query uses just 374 logical reads to read the data from this temporary table twice. There is a hidden cost, though, that you can see in the execution plan: because the data in a heap is not sorted, the data has to be read completely twice; the first time in order to count rows and the second time in order to feed them all into a Sort operator. These operators typically do not cause I/O activity because the sort is done in-memory; however this does cost a lot of CPU, and a lot of memory.

     

    The query above forces serial execution for the first query, for better comparison to the other versions. It uses approximately 118 seconds – similar to the query with TOP, which is not very surprising because the overhead of the spool operator, though large in relation to the number of rows it has to store, is insignificant in relation to the total amount of work done for the rest of the query. When running both queries multiple times and calculating the average elapsed and CPU time, I expect the query with TOP to be a few tenths of a second slower than the version that inserts into the temporary table. When I allow this query to go parallel, it runs much faster on my system (only 48 seconds elapsed), at the cost of burning way more CPU (over 350 seconds CPU time on my 8-core test system). I would allow the parallelism if this query is critical or if my server has sufficient unused CPU cycles; on a heavily used system that is constantly under CPU pressure I would force the serial plan.

     

    The second query, despite the sort, takes almost no time: just a quarter of a second elapsed time, and less than a tenth of a second CPU time.

     

    It always depends

     

    At the start of this post I promised that I would explain why I changed the ORDER BY for this post from the one I used in the previous post. The reason is obvious as soon as you run the query below:

     

    SELECT   TOP(2) PERCENT CustomerID, OrderDate, SubTotal

    FROM     Sales.SalesOrderHeader

    ORDER BY CustomerID;

     

    The SET STATISTICS IO results do still include a worktable, but in this case the number of logical reads in that table is zero. Is this some hidden magical Table Spool optimization at work, or is there a different explanation? To answer that question we will have to look at the execution plan for this query:

    image

    As you can see, there is no Table Spool at all. It is replaced by a Sort operator. There are some surprising similarities between the Table Spool (Eager Spool) in the first plan and the Sort operator in this plan. Both will consume the entire input before even producing the first row of output. As a result, they both know how many rows there are, and they can (and will) both transmit this information to the Top operator so that it can compute the number of rows for the specified percentage. And this is why no additional Table Spool is needed in the plan above, because all rows are effectively already “spooled” in the Sort.

     

    There are also two important differences between these operators; the first being that the Table Spool preserves the order of the rows and the Sort operator (obviously) doesn’t. But in the context of this blog I am mostly interested in the second difference: the Table Spool stores the data it reads in a worktable in tempdb, but the Sort stores it in memory (using tempdb only when the allocated memory is insufficient). And that is why we don’t see any I/O on the worktable in the last example, and why this query, despite the much higher estimated cost, in reality runs a lot faster than the first example.

     

    Conclusion

     

    For a query that uses TOP with the PERCENT option, the optimizer needs a way to know the number of rows before it can return the correct subset. This requires evaluating the entire query as if there were no TOP, storing the results, and then returning rows from that stored result set. If a Sort operator is needed to satisfy the ORDER BY of the query then this Sort operator gives us the facility to store and count the rows “for free”; in all other cases the optimizer will inject a Table Spool operator in the plan for this purpose.

     

    However, the Table Spool is in practice way more expensive than what the optimizer thinks it costs. In many cases, it would actually be more efficient to first count the number of rows in the result without saving them, and then run the query again with a Top operator to limit the number of rows returned. Unfortunately, because the optimizer does not fully grasp the actual cost of a Table Spool, this optimization can only be achieved by rewriting the query. In this post I have shown two patterns for this: one that is relatively easy to understand but requires duplicating the code, and a second pattern that avoids duplicating code by using a much harder to understand construction.

     

    Sometimes, especially when returning a large percentage of the results from an expensive query that cannot be simplified when only the count of rows is requested, the rewrite pattern I provided backfires. In such cases an alternative form of rewrite is possible where the results of the entire query are first stored in a temporary table, and then the first form of the TOP (…) PERCENT rewrite is used with that temporary table as the source. Though this form is still somewhat faster than just using TOP (…) PERCENT, it does not save quite as much as the other rewrites do in other conditions, so this form should only be used when you have proven that you need it.

     

    And obviously, performance is not the only relevant metric for your code, and often far from the most important metric. This is clearly a case where there is a trade-off between performance and maintainability. If you need to use TOP (…) PERCENT on a resultset that will never measure more than just a few thousand rows, then the performance difference is probably not enough to warrant the rewrite. But if you find that one of your key queries is using TOP (…) PERCENT on a huge resultset and the query does not perform as well as you would like, then hopefully the content of this blog post has given you the tools to tackle that problem.

     

    (And please, for sake of the mental sanity of whoever gets to inherit your code – if you do decide to use any of the rewrite patterns I provided, then please add lots of comments to your code, to explain exactly what you do, and why you are doing it).

  • How TOP wrecks performance (part 1)

    The TOP keyword in the SELECT clause may not be ANSI standard, but I am sure we have all used it. But do you know that it comes with two optional keywords? And have you used them? They are PERCENT and WITH TIES.

     

    TOP

     

    Let’s first look at the basic functionality of TOP, and how it affects performance – which in fact, contrary to the title of this post, is usually quite good for performance.

     

    Here is a simple example, using the AdventureWorks2012 sample database (which can be downloaded from Codeplex). Let’s start with a very basic example:

     

    SELECT   CustomerID, OrderDate, SubTotal

    FROM     Sales.SalesOrderHeader

    ORDER BY CustomerID;

     

    For this query, based on the available indexes, the query optimizer has two obvious choices (and probably hundreds of less obvious ones). One is to scan the table (using the clustered index on SalesOrderID) and then sort the results on CustomerID. The other is to do an ordered scan of the index on CustomerID, which eliminates the sort from the plan, but introduces a lookup to fetch the OrderDate and SubTotal columns. Based on the available statistics, the query optimizer estimates that the cost of doing a lookup for each of the 30K+ rows in the table exceeds the cost of a sort, so the plan I get for this query uses the first option:

    image

    Adding a TOP clause to this query will result in a very nice performance boost. Here are the modified query and the plan I got after adding just a vanilla TOP clause:

     

    SELECT   TOP(3) CustomerID, OrderDate, SubTotal

    FROM     Sales.SalesOrderHeader

    ORDER BY CustomerID;

    image

    The obvious and predictable change is the injection of a Top operator that limits the result set to just the first three rows. But as you see, the rest of the plan also changed shape. This is because the optimizer knows that plans are executed left to right. The Top operator, when called by the SELECT operator, calls the Nested Loops operator to get a row; returns it to the SELECT operator; and then sits waiting until it is called again – which normally will happen fairly quickly, unless the SELECT operator has to wait for the application or network to send out the row. The same thing happens two more times, but when the SELECT operator calls the Top operator for the fourth time, the Top operator will immediately return an “end of data” condition without ever bothering to call its descendant Nested Loops node. No matter how many rows the SalesOrderHeader table has, the TOP clause guarantees that there will never be more than three lookups. The Sort operator we had in the first plan was cheaper than 31,465 lookups, but is far more expensive than just three lookups.

     

    So vanilla TOP is actually good for performance. It not only reduces the amount of data returned, reducing load on the network; it also allows the optimizer to make choices that are optimal for the specified lower number of rows rather than for the full result set.

     

    WITH TIES

     

    Unfortunately, both the WITH TIES and the PERCENT option change this. Let’s first focus on WITH TIES. Adding this option to a TOP clause has the same effect you sometimes see in sports results: the top three athletes are listed, but if numbers three and four finish in the exact same time or have the exact same score, both are considered to be in the number three spot and both are included in the results. However, if numbers two and three finish in the same time, the number four is NOT included anymore – so this is not like DENSE_RANK.
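    A tiny made-up example shows the behaviour (the athletes and scores are of course fictional):

    DECLARE @Results table (Athlete varchar(10), Score int);
    INSERT INTO @Results (Athlete, Score)
    VALUES ('Anna', 100), ('Bert', 90), ('Carla', 80), ('Dirk', 80), ('Emma', 70);

    -- Returns four rows: Dirk ties with Carla for the number three spot, so both are included.
    SELECT   TOP(3) WITH TIES Athlete, Score
    FROM     @Results
    ORDER BY Score DESC;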

     

    Because the concept of ties requires the concept of an order, the WITH TIES option requires that the query has an ORDER BY clause. (But using any TOP clause without ORDER BY is kind of an abomination anyway). Adding WITH TIES to our previous query has only a minimal impact on the plan:

     

    SELECT   TOP(3) WITH TIES CustomerID, OrderDate, SubTotal

    FROM     Sales.SalesOrderHeader

    ORDER BY CustomerID;

    image

    If you run this query you will see that there are still only three rows returned, but the actual number of rows returned from the Nested Loops operator to the Top operator is shown as four. Where does that extra row come from? Well, if you remember the results of the original query, you will know that there is no tie between rows #3 and #4. The way you saw that is to look at both rows and compare the customer number. And that is also exactly how SQL Server handles this. On the fourth call, the Top operator will request another row from its descendant and compare the value of its ORDER BY column(s) to the value those columns had in the third row. If they are the same, it will return the row, wait for the next call, and then repeat this until the value changes; only then will it return an “end of data” condition and stop calling its descendants. In this example the value changed directly on that fourth call, so the number of rows processed is just one more than the number in the TOP expression. If you use TOP(4) WITH TIES, you will get six rows in the result and seven rows processed, because now the Top operator knows there are no more ties after reading the seventh row.

     

    So far, all behavior is exactly as expected – with one exception. In the execution plan above, the estimated number of rows is three. This is clearly not correct; there is no way SQL Server can ever satisfy this query without processing at least four rows (to verify that there is no tie), or more (if there is one). Now you may think this is just a minor difference, and in this specific case it is – one row off for the TOP(3) WITH TIES and three rows off for TOP(4) WITH TIES will probably not have a major impact on plan choice. But the difference can be much bigger, as demonstrated by this query and execution plan:

     

    -- Create index (for the demo)

    CREATE INDEX IX_SalesOrderHeader_TerritoryID

    ON Sales.SalesOrderHeader(TerritoryID);

    GO

     

    SELECT   TOP(3) WITH TIES CustomerID, OrderDate, SubTotal

    FROM     Sales.SalesOrderHeader

    ORDER BY TerritoryID;

    GO

     

    -- Drop index (we don't really need it)

    DROP INDEX Sales.SalesOrderHeader.IX_SalesOrderHeader_TerritoryID;

    GO

    image

    The difference between an estimate of 3 and an actual of 4595 is way too big, and does in this case actually result in a bad plan choice. The lookup is estimated to execute three times, but in reality it executes 4,595 times. The result is a whopping 14,091 logical reads on a table that is only 689 pages in size. In this case, a plan with a clustered index scan and a sort would have been much faster.

     

    There is good news, though. All the above tests were done on my main machine, which is SQL Server 2012 SP1. I also ran a test on SQL Server 2014, and the new cardinality estimator gave me much better results. The estimated row counts for the input to the Top now actually take data distribution and statistics into account when a query uses TOP WITH TIES. For the query above, the number of rows coming into the Top operator was estimated as 3146.5 when I enabled the new cardinality estimator, which is much closer to the actual number (4595). As a result, the plans chosen now have a much better chance of being optimal for the actual data.

     

    One error that was not fixed in SQL Server 2014 is that, even with WITH TIES, the estimated number of rows being read by the Top operator is still equal to the estimated number of rows being returned. Since a Top operator with WITH TIES can only stop returning rows to its parent after seeing a row with different values being returned from its child node, the estimated input should always be one more – but now we are really talking about a minor difference, an off-by-one error, which for estimates and plan choice is really just a minor issue.

     

    Conclusion

     

    So far we have seen that a vanilla TOP operator can give you great performance benefits, but a TOP operator with the WITH TIES option poses a severe danger. In SQL Server 2012 and before, as well as in SQL Server 2014 with the old cardinality estimator, the data distribution of the ORDER BY column(s) is not taken into account, which can result in a gross misestimate, especially if the ORDER BY columns include lots of duplicates. These wrong estimates can easily result in very bad plan choices, and hence in bad performance.

     

    The new cardinality estimator in SQL Server 2014 fixes this. It still has an off-by-one error, but that is just a minor issue that will hardly ever cause serious performance issues.

     

    In the next episode, I will focus on the PERCENT option of TOP, and show how this option will wreck performance in an even worse way.
  • Give speakers feedback. It makes a difference.

    I sometimes struggle with finding a catchy title for a blog post. It has to grab the attention, and clarify what the post is about. If you decided to spend time reading my words, I do not want you to finish reading and think “if I had known this was the content, I would have done something better with my time”.

    For this post, I think the title just sums up the entire post. If you are short on time, you can stop reading now. Seriously.

    Feedback from SQL Server Days

    Last week I received the feedback from my presentation at the SQL Server Days conference in Belgium, three weeks ago. That was quick – which is great! My kudos to the organization for this very fast turnaround.

    As always when I receive feedback for a presentation, the first thing I do is read the comments that attendees have written on the evaluations. I am very happy to say that a lot of people took the opportunity to explain their marks or to give me some tips. Some of them were very positive. One was about the chilly room temperature (and I can only agree to the comment). Some were very sweet. But there were also several comments that I consider to be pointers to me as to where I can improve. Allow me to share a few.

    “The session itself was very good however I expected other content.”

    By itself, I would consider this a cue to revisit the abstract. But some of the other feedback changes how I interpret this. In this case, I think I spent too much time on content that was not advertised.

    “Unfortunately, the time was too short for this session to cover all the speaker wanted to share with us.”

    I get that a lot, and I know. I always either run over or have to speed up near the end, no matter how much time I am given. But in this case, again, the other comments put a different perspective on the feedback – they show me clearly where I go wrong and what causes me to run out of time.

    “The first part showing the old stuff was too long. One or two of the worst examples was probably enough.”

    And this one was followed by four other comments basically saying the same thing. Yes. That message is very clear. When five out of thirteen comments tell me the same thing and two other comments are probably a side effect of the same root cause, I would be a very lousy presenter if I did not take it seriously.

    “The subject was different from the one announced. This session was a part of the precon I followed some years ago with Bob Beauchemin. I didn't learn anything new in this session.”

    Okay, this comment has nothing to do with the previous one, but I just have to single it out and comment on it. I do not understand the first part; the session title and abstract were all on the web site and on the printed overviews – but I did replace a speaker who had to cancel about a month in advance, so maybe this attendee used a very old version of the schedule?

    The rest of the comment is nice. I never attended any precon by Bob, but I do know him, I have attended some of his sessions, and I deeply admire his technical knowledge and presentation skills. If a session written entirely by myself is sufficiently similar to Bob’s material that attendees think I copied it, then that is one of the best compliments I can get. I’ll take it!

    (For the record: I have never copied other people’s sessions. I do sometimes copy bits and pieces – always with permission of the original author, and I always attribute the material to the original author.)

    Message received

    To understand the comments, you need to know the session. In the session, I show how the OVER clause enables us to solve certain problems. The outline of the session is that I first present the problems with their solutions in SQL Server 2000 (before the OVER clause), then give alternative solutions for SQL Server 2005, 2008, and 2008R2 (where the basic version of the OVER clause is available) and finally present the simple and clean solutions that have become possible in SQL Server 2012, when the windowing extensions to the OVER clause were introduced. In between those demo blocks, I introduce the syntax of the OVER clause.

    After reading the comments and with the benefit of 20/20 hindsight, I have to agree. I think I spend about half the available time on SQL Server 2000, 30% on SQL Server 2005-2008R2, and 20% on the SQL Server 2012-2014 stuff. That is not good. There are very few people left who support SQL Server 2000, and I waste a lot of people’s time by going over syntax and solutions for that version for half an hour.

    Root cause analysis

    When I need to fix a bug in my code, I do not just dive in headfirst and start fixing. I have learnt to first step back, analyze the situation, and find the root cause of the problem. Otherwise I end up wasting time fixing a symptom, rather than the problem.

    I do the same here.

    Obviously, I never set out to make this a session on writing big and ugly queries for SQL Server 2000. I wanted to demonstrate the strength of the OVER clause. And I must also admit that I have been struggling with this session from the first time I did it. The first delivery of this session had a different order – I presented the problems one by one, and kept jumping between SQL Server versions each time I moved to the next problem. After that first delivery I left the stage with an unsatisfied feeling; I never got into the flow while presenting, felt awkward all the time, and knew I had to do some serious rewriting or remove the session from my portfolio. I did actually get some fairly decent feedback to that session, but my own mind told me it was no good, and there’s no point in arguing with my harshest critic.

    The version I delivered in Belgium is already the 2.0 version, and a version I have delivered a few times before. When presenting I did feel the flow I normally feel, and I was convinced that, while not my best session, this session is good enough. Until last week. :)

    So how did a session that I designed to highlight functionality introduced in SQL Server 2005 and enhanced in SQL Server 2012 end up spending (wasting) half the available time on SQL Server 2000? Now that the attendees of SQL Server Days have forced me to think about it, I think I can tell. I wanted to convince the audience of the strength of the OVER clause by demonstrating just how much easier some common problems are solved now, as compared to before. That choice was what drove the session design, both the 1.0 and the 2.0 versions. Take a common problem, show the ugly code that was needed to solve this problem back when I was still a young Adonis (“cough”), then show how OVER makes the world a better place. But the problem here is that the queries for SQL Server 2000 are very complex, and very convoluted. To me, it felt wrong to show complex code without giving any explanation. So when presenting the demos, I take the time to highlight how the queries are built, why they are built that way, and why they could not be simplified.

    With very complex queries for four different problems, all those explanations add up, and end up taking way too much time. Time spent on explanations that benefit nobody. What a waste!

    Fix it (Felix)

    Now that I know where I went wrong, I can set out to fix it. One option is of course to retire the session, but that is not an option I ever seriously considered. The OVER clause is a great feature, and I have a good story to tell about it. I just need to find the right way to tell it.

    My first plan was to do another total overhaul, create a version 3.0 of this session. I already had some rough ideas forming in my head. Not a totally new session yet, just first thoughts – more time is needed before this will evolve into a new session.

    And then I realized that I have no time for this. In just a week from now, I will be in Portland, at SQL Saturday Oregon, and I will present there. The chosen session? You got it: “Powerful T-SQL Improvements that Reduce Query Complexity” – the same session I did at the SQL Server Days. One week is not enough time for a full overhaul. Plus, I have to deliver a session that matches the abstract (and hence the expectations of the attendees) next week, and a full overhaul cannot guarantee that.

    So instead of a 3.0 version, I will update my existing version to a 2.5 version. The rough outline will remain the same, but I will severely cut back on the “old” part. I will not show all the demos for that, and definitely not explain as much as I did before. I will only briefly outline why the queries needed to be so complex, without going into details. I will then spend more time on the SQL Server 2005 stuff, because this is still relevant to the large number of people still working on 2005, 2008, and 2008R2. But I plan to spend way more time than before on the enhancements that have become available in SQL Server 2012. In the next week, I will go over my slides and demo code and see what I need to change there in order to support the 2.5 version – probably not much (it’s just a point release, after all), but I need to make sure that slides and demos will help me keep the pace just right, not get in my way.

    A complete rewrite may still be needed. That depends on the feedback I receive next week, both from my harshest critic (that little voice in the back of my head), and from the attendees. I don’t know if there will be official feedback forms in Portland, but I accept feedback from all sources: conversations, emails, comments on my blog, twitter, and whatever other means you can think of. If the gist of the feedback is that the session is great, I will keep it as is, but if attendees think it is still not as good as it should be, I will definitely start working on that 3.0 version.

    Bottom line

    I started this post by saying that you don’t need to read it – just the title is enough. If you ignored me and continued reading until here, you will probably see that I am right.

    Giving speakers feedback does make a difference. Your feedback has opened my eyes to the flaws in my session on the OVER clause, and has convinced me that I need to change it. And it will again be your feedback on the updated version that will tell me whether my changes were sufficient, or more work is needed.

    Speakers give you a lot. They sacrifice their time (a lot of it – only those who ever created a presentation from scratch will really know just how much time that takes!) to share their knowledge with you, often for no more reward than a free entrance pass to the conference. The least you can do as a reward is to give them your feedback. Be positive when they deserve it. And be constructive when they can improve. If you don’t tell us, we will never know!

  • Execution plans, laundry, and a giveaway

    In just a week from now, SQL Saturday #337 will kick off in Portland, Oregon. And I will be there – the third time already for me to be speaking in Portland.

    For a European, Portland is not the most logical location. But the organization over there is a pretty smart bunch of people. They figure that being close to Seattle gives them a great opportunity – so whenever they get the chance, they will reserve a Saturday just before the PASS Summit in Seattle for their own event. And then they make sure to organize this event so extremely well that people, speakers as well as attendees, will want to travel a few days early and make Portland their first stop.

    SQL Mastery Sessions

    But this year, the crew of SQL Saturday Oregon tried something new. They have added a day of pre-cons (or, as they call it, SQL Mastery Sessions) to their program. On Friday, October 31, three speakers will deliver a full-day session, allowing them to dive much deeper in their favorite subjects.

    For the BI fans, Mark Tabladillo will present on “Self-Service BI with Power BI”. I do not know Power BI (or any BI at all), but I do know Mark and I am sure that the session will rock. Based on the abstract, I think it will be a “from zero to hero” type of day. If you want to be the one that brings self-service BI into your organization, go and sign up for this session!

    If you are more DBA oriented, then you should definitely consider going to another rock star in the SQL world: Argenis Fernandez. His session “SQL Server Internals and Data Recovery” promises to give you a real deep-dive insight into how SQL Server stores data, how fate and bad karma can cause that data to become corrupted, and (in my opinion the most important skill a DBA should have, and one you hope never to have to apply to your production database) how to salvage your company’s valuable data if the unthinkable happens. Argenis will correct me if I’m wrong (see the comments below), but based on the abstract I suspect that this session is not for the faint of heart, nor for the beginning DBA. But if you are at least at intermediate level and want to learn how to recover from almost every imaginable disaster, then this is the session where you should be!

    Both sessions sound great to me, but I am spared the challenge of choosing – because I will be presenting the third SQL Mastery Session, aimed at a target audience of both database developers and DBAs at intermediate level. In “Understanding Execution Plans”, I will explain how to read an execution plan, starting at the very beginning (like: what is an execution plan and where do I find it), and then exposing intimate details of what SQL Server is doing when your query is running. You will learn how a merge join operator processes a right anti semi join, how an exchange operator shoves data between threads, how aggregation relates to my laundry, and hundreds of other things – but most of all, you will learn knowledge that you can apply to slow queries at work, to get a better understanding of what makes them slow, and how you could fix it.

    First the bad news

    Earlier this week I was told that my SQL Mastery Session is already sold out. That is great for me – nothing strokes the ego as much as seeing that there are many people who do not run away screaming at the idea of spending a day locked up with me in a classroom, listening to me talking geek, but are actually prepared to spend real dollars for the privilege. But for you, the reader, this may be bad news – especially if the paragraphs above have whetted your appetite and you now also want to be there. But don’t despair, you can sign up for the waiting list and hope a slot frees up. But there is another option as well.

    Then the good news

    The good news is that the kind people of the SQL Saturday Oregon organization have given me permission to give away one seat to my session. If you win, you not only get a $129 value admission ticket, you even get to bypass the waiting list to find yourself in the last available seat, reserved especially for you! Sounds good? Yeah, thought so. So, how do you apply?

    Easy. You just have to help the SQL Saturday Oregon team and me to get all three sessions sold out. Spread the word, do the marketing! Here are the simple steps:

    1. Write on Twitter, Facebook, your blog, a web forum, or any other public place why you think that everyone should sign up for the SQL Mastery sessions in Oregon. Be original, be creative, and most of all: be convincing!

    2. Send me an email, with a link to your post. Send it from an address where you will be able to receive replies – otherwise I will not be able to contact you.

    3. Wait, pray, and hope.

    On Monday, October 27, I will write all email addresses on paper slips, throw them in a bowl, and draw the winner. To reward effort, I do reserve the right to put in two, three or even four slips for entries that are especially creative, convincing, and/or original. This is determined on a purely subjective basis by an independent jury consisting of only me. But every entry will result in at least one slip in the bowl, so every participant has a chance to be the lucky winner!

    I will pass the email address of the winner to the SQL Saturday Oregon team, who will then contact him or her to work out the details.

    Everyone’s a winner

    Only one person can win the main prize. But everyone who enters in the competition will receive a consolation prize. I will not disclose what that prize is at this time – just watch your email next week, to see the surprise.

  • Database Mail … and then the SMTP Server changed

    The database I inherited when I started my current job sends out lots of mails. It does so using Database Mail, and it has lots of mail accounts and mail profiles defined. I know that several of those profiles are used, I suspect that some are not, and I have no idea about the rest – one day I will find the time to clean up, but so far there have always been more pressing matters to attend to.

    But today the mail administrator told me that due to a change in architecture, SQL Server had to start using a different SMTP server for sending mails. Quite an easy task if you have just a single profile – just a few clicks in the Database Mail Configuration Wizard, and done. But repeating those same mouse-clicks for every profile in the list was not my idea of a morning well spent, so I decided that I’d have to script this. (This should be easy – we have just a single SMTP server, so I could hit every single mail account and did not have to bother with exceptions).

    Usually, scripts for such a task are very easy – just type a well-chosen search string in your favorite search engine, check the first two or three hits, and you’ll have a script. Usually even more than one. Carefully inspect the script (just because it’s on the internet does not mean it’s safe!), copy the script, paste into SSMS, make adjustments for your own situation, do one more inspection just to be sure – and then hit the F5 button and say “time to grab a coff … oh wait, it’s already done”.

    In this case I had no luck. Maybe I used the wrong search phrase, or maybe there is a way to accomplish this that is so easy that nobody ever bothered blogging about it and I am the only one who managed to overlook the option. Or maybe nobody has ever tried to automate the task of changing SMTP servers.

    Bottom line, I did not find a pre-made script for this task, so I put in the effort to write one, and then decided to share it with you.

    Note that the script below was tested on SQL Server 2008R2 only. Also note that it will update all mail accounts to use the new SMTP server. If you have a more complex setup with multiple servers and only some need to change, you will have to add the correct filtering criteria to the WHERE clause.

    DECLARE @NewServer sysname = 'NotTelling.mail', -- New SMTP server
            @OldServer sysname = 'MySecret.mail',   -- Old SMTP server
            @account_id int;

    DECLARE Cursor_MailAccounts CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY
    FOR SELECT account_id
        FROM   msdb.dbo.sysmail_server
        WHERE  servername = @OldServer;             -- Add extra logic here

    OPEN Cursor_MailAccounts;

    FETCH NEXT
    FROM  Cursor_MailAccounts
    INTO  @account_id;

    WHILE @@FETCH_STATUS = 0
    BEGIN;
        EXECUTE msdb.dbo.sysmail_update_account_sp
                    @account_id = @account_id,
                    @mailserver_name = @NewServer;
       
        FETCH NEXT
        FROM  Cursor_MailAccounts
        INTO  @account_id;
    END;

    CLOSE Cursor_MailAccounts;
    DEALLOCATE Cursor_MailAccounts;
    (And remember, just because you found it on the internet doesn’t mean it’s safe!)
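
    If you want a quick sanity check before and after running the script, the sketch below lists every Database Mail account and the SMTP server it currently points to, using the same msdb tables that the cursor reads:

    SELECT      a.account_id,
                a.[name]     AS AccountName,
                s.servername AS SMTPServer
    FROM        msdb.dbo.sysmail_account AS a
    INNER JOIN  msdb.dbo.sysmail_server AS s
          ON    s.account_id = a.account_id
    ORDER BY    a.[name];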

    With that done, my next task was to fix the mails being sent from SSIS packages. They use an SMTP connection that is defined in the package, not SQL Server’s own Database Mail, so I had to open and edit them by hand. Luckily, I have only four packages active, and only three of them have a Send Mail task, so I did not see any need to automate this.

  • PASS Summit Abstract Feedback

    Last week, Adam posted the feedback he received from PASS on the abstracts he had submitted for the PASS Summit 2014. I was very happy he did – not only because the post itself was a good read, but also because I had not seen the PASS announcement on making the feedback available (I did see it in a PASS community mail a few days later, though).

    I can imagine that not everyone will want to see this feedback, but I do. I love harsh criticism, as long as it’s fair – that’s how I can learn and improve. (The written comments are always my favorite part of the session feedback from conferences where I have spoken). So I have followed Adam’s lead and requested a copy of the feedback on my abstracts. And I will continue to follow Adam, by posting the feedback here – that way, all of the community can learn from my mistakes, instead of just me.

    Understanding Execution Plans [full day, level 400] (not accepted)

    For troubleshooting long running queries, looking at the execution plan is often a good starting point. Once you know how the query is executed, you know why it's slow and what you can do to speed it up.
    But what if the execution plan is just beyond your understanding? What if it uses operators you have seen before, but do not really understand? What if you look at the execution plan, but just don't see the problem?
    In this full-day workshop, you will learn everything you need to be able to read and understand any execution plan. We'll start with an overview of execution plans as a whole, and then dive in and look at all the components, and how they fit together. This will increase your understanding on why the optimizer picks a plan, and what you can do to make it pick a better plan.

    Whether you have read your share of execution plans or whether you wouldn't know where to find them, this workshop will teach you everything you need to know about execution plans. Attend this workshop if you want to hone your tuning skills!
    Throughout the day, we will have exercises to help you get an even better understanding of the theory. In order to get the most out of the day, attendees are encouraged to bring their own laptop, with SQL Server (any version), Management Studio, and the AdventureWorks sample database installed.

    • This sounds a great topic and a good advanced level. I notice it was presented at 300 level in Copenhagen and this time it's aimed at 400 level. Hopefully it will not start too advanced for the audience. - great abstract. maybe more demo time??
      One of the learning outcomes is "Understand how to read execution plans." - hopefully at level 400 people who know how to do this already
    • Good abstract, very good idea for a session.
    • The topic is very important for any developer and looks like the presenter will be very well trained on this session as it is being presented in three other major conferences.
    • well written abstratc with good level of details. interetsing topic - but i dont see ay references to updates in sql server 2014. goals are clear. demo % could be higher for this topic.
    • Yay for audience participation.
      In the future, when noting this session has been presented at SQL Saturdays, please note if it was presented as a full-day pre-conference or as a normal session as normal SQL Saturday sessions do not match the length of a full session.

    Setting the level in an abstract is a constant struggle for me. Is a session only 500 level if only MVPs have a chance of understanding it, and if people walk out with brain damage? If a presenter is able to take the most complex subject matter and explain it in a way that can be understood by everyone, is this a 200 or 300 level session because the intended audience can be relatively new, or is it still a 500 level session because the subject matter is deep and advanced? Because of this struggle, I often adjust the level when submitting the same session. In Copenhagen, I got feedback that it was more advanced than the audience expected, so I adjusted. The goal of the session is to ensure that even people who have little experience with execution plans can learn to read them (hence the learning outcome), and then continue to dig deeper where most sessions stop (hence the advanced level). This can be seen as sitting uncomfortably between two stools, or as killing two birds with one stone. I think that the session achieves the latter – but I apparently fail to explain that well enough in the abstract.
    Even for a full day session, time is a constraint. I had to choose between more demos and more room for new content, or more coverage of the basic content. I chose the latter (even the SQL 2012 content is just the last 30 minutes or so), and I stand by that choice, but I realize that everyone has their own preferences.
    When I put in where I presented the session before, I assumed it would be obvious that these were pre-cons for SQL Saturdays – otherwise it would not be the same session. But it was apparently not clear at all. I should have been more specific.

    Lessons learned:
    Make sure the abstract reflects that people can come in with little previous knowledge, but will still leave with advanced understanding.
    Be careful with assumptions.

     

    A New Method for Effective Database Design [full day, level 200] (not accepted)

    Your databases may be normalized and may perform well … but is the data actually correct?
    Hidden pitfalls abound even in seemingly-mature database designs. Complexity, overabstraction, and miscommunication, can create situations where data simply can’t meet expectations.
    In this full-day seminar, you will learn how to use a requirements-based methodology to translate you users’ real business problems into a rock-solid database design. You will learn exactly what questions to ask, how to ask them, and how to transform the answers into a fully normalized database that accurately captures all the business requirements.

    If you have ever experienced problems due to bad database design, attend this seminar to make sure that never happens again.

    • Good outcomes - sounds appealing
    • One typo "to translate you" rather than "your".  I really like this session idea, though I can't really foresee what the "new" method might be.  Definitely sounds like the only method I know of. Expect it would be quite good.
    • This is a very important topic.
    • Seems the presenter can pull this off quite nicely.
    • Great topic
    • interesting topic. decent abstract. clear goals. low % of demos for a full day session
    • Abstract is for a full-day session and the submitter notes they have presented the session at SQL Saturdays and SQLBits.  One is not able to compare these due to the massive difference in the session length.

    Hmmm, this is a challenge. The abstract apparently fails to make it clear that the method I present here is indeed very different from traditional database design methods. I remember a previous version of the abstract that was rather boring and technical and got rejected all the time. I tried to spiffy it up and make it sound more appealing, but when I now reread it, it does sound a lot like the marketing speak I see and hear from advocates of other methods. I’ll have to redo this one for the next time, and find a middle ground between accurate but boring, and spiffy marketing speak.
    I agree with the demo remark, and I would love to do more demos, and maybe some classroom exercises, but the subject is simply too big for that. Maybe if I can present both Monday and Tuesday? ;-)
    And I again caused confusion by including references to previous deliveries without explicitly mentioning that it was a precon at those events as well. Too bad: with the mostly positive other comments, I feel that this is the only reason that this session was not selected.

    Lessons learned:
    The abstract should contain some “near-marketing” speak to make it sound appealing, but I went overboard here. I failed to really explain what is new and special about the presented method.
    And, again, make very clear that the other deliveries were precons as well.

     

    T-SQL User-Defined Functions, or: Bad Performance Made Easy [general session, level 300] (accepted)

    User-defined functions in SQL Server are very much like custom methods and properties in .Net languages. At first sight, they seem to be the perfect tool to introduce code encapsulation and reuse in T-SQL. So why is this feature mostly avoided by all T-SQL gurus?
    The reason is performance. In this session, you will learn how user-defined functions feed the optimizer with misleading and insufficient information, how the optimizer fails to use even what little information it has, and how this can lead to shocking query performance.
    However, you will also see that there is a way to avoid the problems. With just a little extra effort, you can reap the benefits of code encapsulation and reuse, and still get good performance.

    • Should be a good session - I would attend.
    • Very good abstract, and a decent idea for a session. I like that it includes positive benefits along with the bad effects as functions aren't all bad in all cases. I think you have a typo in the notes, as  you say it got a 4.73 on a 15 scale :)
    • This session looks very good. We always hear how UDFs are bad for performance but never what we can do about it. Having a session that shows that can be very good.
    • Well written with clear and concise goals.
    • good topic. great abstract. clear goals. exciting to see session with 100% demos

    Thanks for the kind words. And yeah, that 4.73 was on a 1-5 scale, missed the dash.
    I hope the last reviewer will not be disappointed when the actual demo percentage is about 95% (the PASS submission form has a dropdown for the demo percentage with 0%, 25%, 50%, 75%, or 100%, so I had to round).

    Lessons learned:
    Looks like this abstract, apart from the typo, can remain unchanged.

     

    Now Where Did THAT Estimate Come From? [general session, level 500] (not accepted)

    The SQL Server Query Optimizer makes its plan choices based on estimated rowcounts. If those estimates are wrong, the optimizer will very likely produce a poor plan. And there's nothing you can do about it. Or is there?
    In this session, you will learn exactly where these estimates come from. You will gain intimate knowledge of how statistics are built and maintained and how they are used to estimate row counts. But you will also learn how filters and joins influence those estimates.
    Though the focus of this session is on understanding the cause of bad estimates, you will also learn ways to fix the problems and get better estimates - and hence, better performing queries.

    • 500 level session and the pre-requisits seem to make it clear that prior knowledge is essential. I hope the presenter can do justice to this - that would be my main concern.
    • I wish the title make it more clear that we were talking about the optimizer, but for a 500 level session, perhaps that isn't needed. "Though the focus of this session is on understanding the cause of bad estimates" I guess I see that, but really the session is about where estimates come from based on the rest of the abstract, and where a bad one comes from is just an offshoot of that.
    • The abstract looks solid. The goals are aligned with the abstract and also look solid.
    • Abstract gives a clear picture of what to expect.  Goals could be better defined.  Is this truly a 500 level?
    • Type-o in Session Prerequisite, "sold" should be "solid".
    • Doubtful if this 500
    • interesting title. good topic - revevant for senior dbas or developers interesting in an indepth understanding of execution plans. abstract has good level of depth goals are terse but convey the necessary details. high % of demo is good.

    Again the level problem, and I must admit that I was, and still am, in doubt on whether to present this as level 400 or 500.
    The “can do justice” remark makes me wonder if this is from a reviewer who got only the abstract, or the abstract and my name. In other words, are these doubts based on the abstract, or based on a perception of my understanding of and ability to explain the subject? I would love to get into contact with this reviewer (mail me, please!). No hard feelings, I am just curious to understand where this opinion comes from and how I can improve.
    The remark on “talking about the optimizer” is interesting. That is NOT what I talk about in this session, but I see what part of the abstract results in this misconception. I need to rewrite that. And yes, you are completely right that the focus is on where estimates come from, not just bad estimates, so I will definitely steal those words for my v2 abstract.

    Lesson learned:
    Focus the abstract on the actual content, not on related content.

     

    Normalization Beyond Third Normal Form [general session, level 400] (not accepted)

    Do you believe the myths that “Third Normal Form is good enough”, or that “Higher Normal Forms are hard to understand”?
    Do you believe the people who claim that these statements are myths?
    Or do you prefer to form your own opinion?

    If you take database design seriously, you cannot afford to miss this session. You will get a clear and easy to understand overview of all the higher Normal Forms: what they are, how to check if they are met, and what consequences their violations can have. This will arm you with the knowledge to reject the myths about higher Normal Forms. But, more important: it will make you a better designer!

    • This is a good academic session - but not sure about it's real world application - a totlal theoretical discussion (based on the fact it has 0% demos) may not hold peoples interest for too long.
    • Database design is my favorite topic, and one I think db devs need to know more about. My only problem with this abstract is that it feels way overreaching to cover all of these normal forms at a 100 level where 1, 2 and 3 aren't even going to be understood all that well.
    • Looks very interesting. I suppose this is something we all should learn about. Looks the abstract covers a lot of material and while is 100% theory it is theory that must be learned anyway.
    • Love the topic. Abstractand Goals tend to "preach" one side of the story.Real-world examples would be helpful.
    • Not sure this is 100 level,
    • routine topic. abstract and topic dont show anything new or exciting for sql 2012 or 2014. The sessions seems all theory and has no demos. This may not be of much appeal and interest to audience
    • Zero demonstration time.
    • While the abstract is presented as a 100-level; the abstract description and goals are 200-level.

    I always prepare abstracts in a Word document, and then copy the content into the submission form. And in this case I must have made a mistake in the level selection. I intended to submit this as a level 400 session, not as level 100. Oops.
    What is a demo? Is it only a demo if the presenter sits behind the laptop, opens Management Studio, and executes T-SQL code? Or can you consider applying normalization rules to a concrete example of a database design, even if it’s on pre-made Powerpoint slides, as a demo too? The reviewers seem to say “yes” to the last option; I thought “no” when I submitted.
    The “abstract and topic dont show anything new or exciting for sql 2012 or 2014” comment is the only remark in all the feedback that I can’t use in any way to improve myself. The subject of database design is not related to a version of SQL Server, not even to SQL Server itself. I could present this session at a conference on any relational database and it would be equally useful.

    Lessons learned:
    When the session submission form is completed, “check, check and check again” is not sufficient; I probably should wait a day and then check three times more (or just learn not to make such sloppy mistakes).
    All real-world examples (or made up but realistic examples) are demos. Even if I am still in Powerpoint.

     

    Inside the Nonclustered Columnstore Index [general session, level 400] (not accepted)

    You have heard the marketing speak. You may have seen the demos. And maybe you have even played with the feature: Columnstore indexes, the amazing "go faster" switch in SQL Server 2012!
    But do you know how they work? Come to this session to find out. We will lift the hood to give you a fascinating in-depth view of how Columnstore indexes are built and read, how batch-mode execution operates, and how these features work together to achieve an amazing performance boost.

    • Topic should be well received
    • Thanks for the abstract, it's good to have someone speaking on columnstore Index, also it would be great if we can include some real examples and demo.since there is no any demo.
      Thanks

    I must have made another mistake – not visible in the feedback, but in the mail I got with the results this was listed as a half-day session. I entered this as a normal (75 minute) session in my Word document.
    No demo (or rather 10%, but I had to round to either 0 or 25) is a result of my choice to cover the internals in depth. Most internals can’t be shown in demos. I love seeing lots of demos in the sessions I attend, so I fully understand the reviewer. But I am also convinced that adding more demos would not improve the quality of this specific session. If that reduces the chance of being accepted, then so be it. I'd rather present the best possible session at only a few events than a watered-down version at more places.
    That being said, I probably should not have submitted this session, and I will not submit it to other conferences anymore. The nonclustered columnstore was a SQL Server 2012 feature, and is now sort of obsolete. SQL Server 2014 was not released yet when I submitted my abstracts, but I knew it was coming and I knew that it would bring the improved, clustered version of the columnstore index – this session is old news. I love this session and I regret having to decommission it. But this session is now about as relevant as a session on the internals of English Query.

    Lessons learned:
    Once more: check, check, double check – and then check again.
    No matter how much I love a session, when its time has come I must no longer submit it.


    Powerful T-SQL Improvements that Reduce Query Complexity [general session, level 300] (not accepted)

    We’ve all dealt with nightmare queries: huge, twisted monsters that somehow work, despite being ugly and unmanageable. The time has come to tame these beasts, and the solution is available now, in SQL Server 2012.
    New T-SQL functions offer out-of-the-box solutions for many problems that previously required complex workarounds. Paging, Running totals, Moving aggregates, YTD, and much more comes at the power of your fingertips in SQL Server 2012. The only thing you need to do is learn the syntax. And that is exactly what this session is all about: a thorough description and explanation of the syntax, and loads of demos to demonstrate how you can use all these new features.
    Attend this session to boldly take SQL Server where it has never gone before!

    • I like the abstract it informs attendees of some of the things that they can expect to learn.
    • The title of the session is kind of vague, since whereas the windowing functions are useful TSQL improvements, they are not the only ones that are available throughout time. Would have gotten a better review if the title was more specific to windowing functions and 2012.
    • Looks good. If selected I would encourage the presenter to include/replace some aspects of it with SQL 2014 features that also help to improve how queries are written.
    • interesting topic and catchy session name. abstract is very well written and points out sql server 2012 features - relevant and current. good level of details in the goals and good balance of demo %
    • The session name is somewhat misleading for what appears to be a "Windowing Functions" presentation.

    Thanks for calling the title catchy. I like it too. But the comments of three other reviewers make me painfully aware that this title fails to deliver on the most important function of a title: it does not give the conference visitor an “at a glance” idea of the broad subject matter. Yes, all abstracts are available. But how many conference attendees read them all? I don’t, that’s for sure! I use the titles to filter down to just two or three promising sessions, then read the abstracts (if I have time – otherwise I pick based on title alone!). There were indeed more improvements in SQL 2012, and then there are even more in SQL 2014. Based on the title, I myself would expect to see them covered.

    Lesson learned:
    I already knew this but failed to apply the lesson: session titles should be catchy, but should also set a correct expectation for the audience.

     

    And that concludes another long post. Thanks to all the volunteers who spent countless hours on the thankless task of reading through hundreds of abstracts, commenting on them, and selecting between sessions that are almost impossible to compare, with nothing more to go on than a few hundred words in the abstract. Your hard work is much appreciated; the PASS Summit would not be possible without people like you. So thanks!

    And I would like to conclude this post with the same words that Adam has at the end of his: Have any additional feedback for me? Post it below!

  • SQLPass NomCom election: Why I voted twice

    Did you already cast your votes for the SQLPass NomCom election? If not, you really should! Your vote can make a difference, so don’t let it go to waste.

    The NomCom is the group of people that prepares the elections for the SQLPass Board of Directors. With the current election procedures, their opinion carries a lot of weight. They can reject applications, and the order in which they present candidates can be considered a voting advice. So use care when casting your votes – you are giving a lot of influence to the people you choose.

    Because there are three seats open for elections, every PASS member with a completed profile on the PASS site gets a maximum of three votes. I used only two, and I have a good reason for that. I believe that there is something fundamentally broken in the NomCom selection and election process.

    For a few years now, PASS has been actively trying to morph from an “international” organization, run only from the USA, into a truly international organization. As a result of changes to the bylaws and to the procedure for Board of Directors elections, the current Board of Directors now for the first time in history (as far as I know) has two European representatives – but sadly no one from any other non-North-American area. A lot of work still has to be done, and a lot of time still has to pass, before we can expect a truly international Board of Directors.

    The process for the NomCom election appears to be a step back in this regard. Three seats are to be elected. Two are truly open, the third is reserved … for a US/Canada representative. I do not understand this. Does PASS really need a guaranteed US/Canada seat on the NomCom to get a truly international representation? Wouldn’t it have made much more sense to reserve at least one seat for “non-US/Canada”?

    It is also relevant to consider that the actual NomCom consists of five persons. Three are elected by the members, one is the Immediate Past President of PASS, Bill Graziano, and I could not find anything about the fifth seat but my assumption is that this seat is reserved for founding partner and primary sponsor Microsoft. So that means that two seats are already filled with representatives from the US, and the reserved US/Canada seat in the elections means that North America is guaranteed a 3 out of 5 majority in the NomCom. And depending on the votes, US/Canada may also get the last two seats.

    With only two candidates for the NomCom who are not from the US or Canada, you might consider all of this a moot point. From a practical point of view, the reserved seat for US/Canada will never make a difference (though a reserved seat for non-US/Canada might). But purely out of principle, I want to voice my opinion that this process is broken. (And then we can also speculate on whether there would have been more international candidates if there had been no reserved seats – if I knew that the rules of the election favored my opponents and reduced my chances, I would think twice before even applying).

    I am convinced that all NomCom candidates will try to put the best candidates forward, and that none of them will favor candidates from their own region. But if PASS wants to become a truly international organization, it needs to have international representatives in all bodies, including the NomCom. Ideally, that is achieved by having completely open elections for all bodies – no seats should be reserved for regions. But I understand that the current membership and the past record of election participation from members from different regions would cause such a setup to end up with all North American representatives, so I do see the need for reserving seats for under-represented areas – as a temporary measure.

    What I do not understand is the reason for reserving a seat for a region that is already way over-represented. And that’s why I decided to go on a strike against candidates from the US or from Canada, voting only for candidates from the rest of the world. Nothing personal, not for any reasons related to the actual candidates – just making a point, and maximizing the chance of both open seats going to non-North-American candidates. There were only two such candidates – and that is why I used only two of my three votes.

    There are now less than two days left before voting closes. If you have not already, I urge you to go out and vote now. Use your votes. Use them wisely.

  • SQL TuneIn Zagreb 2014 – Session material

    I spent the last few days in Zagreb, Croatia, at the third edition of the SQL TuneIn conference, and I had a very good time there. Nice company, good sessions, and awesome audiences.

    I presented my “Understanding Execution Plans” precon to a small but interested audience on Monday. Participants have received a download link for the slide deck.

    On Tuesday I had a larger crowd for my session on cardinality estimation. The slide deck and demo code used for that presentation will be available through the conference website, but for those who cannot wait, I have also attached them to this blog post.

    The organizers of the event have already announced their plans to host a fourth edition. And if I have any say in the matter, I will visit again.

  • TechDays 2014 – Session material

    Last week in the Hague, I had the honor to present two very different sessions at the Dutch TechDays conference. A deep-dive session on internals of the SQL Server 2012 nonclustered columnstore index, and a very developer-oriented session on the bare basics of performance tuning. To my delight, both times the room was filled with very interested people, asking great questions and, I guess, enjoying my presentations.

    All sessions were recorded, and I have been told that in due time, all will be available on Channel 9. But what if you can’t stand the wait? What if you just want to quickly browse through my slide deck without having to endure my crappy jokes? Or what if you want to play around with my demo code but are of the rightful opinion that manually copying my code while constantly pausing and unpausing the video would be an utter waste of time?

    The answer to each of those problems is: you simply download the slides and demo code from the attachment to this blog post. Simple, huh?

  • Fake statistics, and how to get rid of them

    There are two ways to test how your queries behave on huge amounts of data. The simple option is to actually use them on huge amounts of data – but where do you get that if you have no access to the production database, and how do you store it if you happen not to have a multi-terabyte storage array sitting in your basement? So here’s the second best option: you cheat.

    Luckily, SQL Server has a feature that allows you to do just that. I must warn you that this feature is undocumented and unsupported. You should never under any circumstances use this feature on a production server, unless explicitly directed to by qualified Microsoft staff. Using it, sparingly, on a test box is okay. But as an undocumented feature, there is no guarantee that it will always work, or that it will continue to work in future versions.

    With that disclaimer out of the way, let’s take a look at the “cheat” option. As you probably know, the Query Optimizer relies on statistics for its decisions. If those statistics are wrong, the decisions will probably be wrong. Here is where we find the leverage to cheat: if we can force SQL Server to use statistics that are representative of the production database rather than the test database, we will get the execution plan it would generate on production. The actual performance will probably still be lots faster, but at least we can look at the plan and use our understanding of execution plans to check that we get scans where we want scans, seeks where we want seeks, and all the right types of joins.

    Fake statistics

    The undocumented feature that we can use here actually has a very simple and easy-to-remember syntax. You just run an UPDATE STATISTICS command, adding the options “WITH ROWCOUNT=xxxx, PAGECOUNT=xxxx” to force SQL Server to store the numbers you mention as the rowcount and pagecount. Clear the plan cache, then test your queries and check how they behave with the simulated number of rows and pages. I could give a much longer description and examples of the usage, but others have already done so – see for instance this post from the horse’s mouth (aka the Query Optimization Team), this post from Benjamin Nevarez, or if you are able to read German this excellent post from Uwe Ricken.
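
    For reference, here is a minimal sketch of what that looks like – the numbers are made up, and (to repeat the disclaimer) this belongs on a test server only:

    -- Pretend Sales.SalesOrderDetail holds ten million rows spread over 150,000 pages
    UPDATE STATISTICS Sales.SalesOrderDetail
    WITH ROWCOUNT = 10000000, PAGECOUNT = 150000;

    -- Clear the plan cache so new plans are compiled using the faked numbers
    DBCC FREEPROCCACHE;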

    … and how to get rid of them

    But what all those blog posts fail to mention is what happens later. Of course, if you just create a small test table, slap in a few sample rows, fake big statistics, check the execution plan and then drop the table, that is not an issue. But what if you have a test database that contains a lot of carefully crafted test data, and that is used for other tests as well? You may not want to run all the tests with those artificially bumped rowcounts!

    In my case, I ran into this in the AdventureWorks sample database, my loyal companion for presentations and demos. I will present a pre-conference seminar on Execution Plans in Exeter (March 21), in Copenhagen (March 28), and in Zagreb (June 2). For my preparations, I wanted to force a hash spill warning, so I artificially lowered the rowcount for the SalesOrderDetail table to just 200. That worked just fine – this nifty feature can be used to mimic every rowcount, not just the big numbers.

    Because I use this sample database for other purposes as well, I wanted to immediately reset the statistics to their normal value. I figured rebuilding all the statistics on the table with the FULLSCAN option would do the trick. Right? Wrong! The UPDATE STATISTICS did run, and it took a while to scan all rows in the table (as requested) – but then it still retained the rowcount and pagecount values that I had forced earlier! And there is no visible indication at all – executing the UPDATE STATISTICS … WITH FULLSCAN statement simply runs for a while, then reports success.

    It was actually mere coincidence that I later did some other tests on the same table, and just happened to notice that the estimated rowcount for this table was still 200. It then took me at least an hour of searching the internet and trying many different options (including using zero or a blank value for the rowcount and pagecount parameters, combining different other options of the UPDATE STATISTICS command, clearing or not clearing the plan cache, and a few other things I don’t even remember). None of them worked. I could of course count the actual number of rows and pages and use that, and that would have worked because my AdventureWorks database never changes – but for a test database where the data can actually change over time, this option would not work.

    In the end, I finally found a method that works. But it is really a sledgehammer approach, and I prefer not having to do this on large tables on a regular basis: ALTER TABLE Sales.SalesOrderDetail REBUILD. After running that statement, I found that the statistics on the SalesOrderDetail table had finally reverted to their normal behavior. Until the next time I need to fake a different number.
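
    For completeness, here is that sledgehammer in code. The DBCC FREEPROCCACHE is my own addition, not strictly part of the fix, but it makes sure no cached plans that were compiled with the faked counts linger around (again: test servers only):

    -- Rebuild the table to reset the faked rowcount and pagecount
    ALTER TABLE Sales.SalesOrderDetail REBUILD;

    -- Remove cached plans that were compiled while the fake statistics were in place
    DBCC FREEPROCCACHE;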

    Conclusion

    On a development server, using the WITH ROWCOUNT and WITH PAGECOUNT options of the UPDATE STATISTICS command is a fine method to simulate large numbers of rows, or to simulate the effect of wrong statistics. But unfortunately, the numbers given stick for far longer than I like; this is not documented anywhere, and not easy to undo. Rebuilding the table and indexes appears to be the only solution.

    If you have found other ways to return from fake row- and pagecounts to the normal statistics behavior, post a comment and share your knowledge!

  • Database Design training – for free?

    When I started this blog, my plan was to focus on two main subject areas. One of them is SQL Server, T-SQL, and performance. The other is database design.

    Looking back over my post history, I honestly cannot say I delivered on the second area. Not because I have nothing to say about database design, but because I found it doesn’t lend itself to a blog. In the SQL Server / T-SQL area, there are subjects that can be isolated and described in a single post, or in a short series. In database design, I do not see such subjects. Everything is related, and the only good way to cover this is end to end – starting at requirements gathering, through modeling, normalization, more normalization, and ending at the final database design.

    That cannot be covered on a blog post. So I found a different place: Pluralsight. Pluralsight is a company that provides online training on IT-related subjects. Excellent training, delivered by the best specialists available, for a fraction of the price of classroom training. And with the additional benefit of being able to pause and rewind as often as you want.

    Relational Database Design

    I was delighted when Pluralsight offered me the opportunity to record a course on relational database design for them. I must admit that creating the course took more time than I had ever anticipated – but it is now finished and available in the Pluralsight course library. It is a very comprehensive course – to see the entire course, end to end, you will need to set aside almost a full work day: 7 hours and 33 minutes. But luckily, you can consume the course in small portions, watching a few clips when you have time and stopping to return later when you’ve had enough.

    The first module is the introduction, in which I give a historic overview of how storage hardware and methods have evolved to where they are now. If you are pressed for time, you might consider skipping this module. Knowing the past can help understand the present better, but this module is not required to understand the rest of the course.

    In module 2, I explain the IDEF1X notation for ER diagrams, that I use throughout the course. Module 3 presents you with techniques you can use to gather the information you need from the domain experts, and module 4 shows how to use that information to create a first draft of the data model. Module 5 then explains how to convert between the ER diagram and the relational database representation of the data model – I personally prefer to postpone this conversion until after the normalization, but not everyone likes to work that way; hence my decision to cover the conversion first.

    Normalization is a subject that is far more complex than many people realize – yet much easier to do than many people fear. I use two (long!) modules to cover basic normalization, up to third normal form; and then a third long module to go over all the so-called “higher” normal forms – often considered to be unnecessary, too hard to do, and too complex to understand. I hope you’ll disagree with those qualifications as much as I do after watching this module.

    IDEF1X may be popular, but it is far from the only ER diagramming method. In the last module, I cover several other methods, showing you how much they are alike and where you’ll find the important differences. After seeing it, you should be able to work in any ER diagramming method you may encounter.

    I personally think that this is a very good course on relational database design. But I might be just a bit biased. So I suggest that you go ahead, check it out, and be your own judge!

    Cheap …

    My relational database design course is not only (in my opinion) very good – it is also extremely affordable. Have you ever looked at the cost of a classroom course on database design, then decided that you simply cannot afford it? The good news is – Pluralsight is cheaper! Prices for individual subscriptions start at just $29 per month, or $299 per year for a basic subscription – which already gives unlimited access to the full course library. Or you can pay $49 per month / $499 per year for a “plus” subscription, that adds access to exercise files, assessments, certificates, and offline viewing.

    And for that low price, you do not get access to just my course on relational database design – the full course library of Pluralsight already contains over a thousand titles, with new titles being added almost every day. Readers of this blog should be excited to see over fifty courses in the SQL Server subject area, by renowned experts such as Paul Randal, Jonathan Kehayias, Leonard Lobel, Glenn Berry, and many others. And if you want to look beyond the borders of SQL Server, Pluralsight has courses on almost every other subject (as long as it is related to dev and IT).

    … or free?

    But the title of my blog promises “free”, not “cheap”. Thirty bucks per month may be a steal, but it is not free. Was I lying? Of course not! You can check my relational database design course (and more) for free. How? Simple!

    Method 1: available to all, but a bit limited. Go to the Pluralsight website, and sign up for a free trial. This costs you absolutely nothing, and it gives you access to all the Pluralsight courses for a whole week. There is a catch, though – your viewing time is limited to 200 minutes. So you can use the trial to watch one of the shorter courses completely, to get a very extensive taster of a longer course, or to browse around and watch snippets of many courses – that is up to you!

    Using this method is an easy way to watch almost half of the relational database design course for free. But what about the rest? Enter method 2!

    Method 2: sufficient to watch the entire course, but not available to everyone. Pluralsight has allowed me to give out a few one-week training passes. Like the free trial, these give access to all courses in the course library, for a full week. Unlike the free trial, they are not time limited. If you have the stamina, you can get a whole 168 hours of training out of them. (I do not advise this. I am not responsible for the effects on mental and physical health if anyone thinks they should even try this. Please don’t). But with a bit of planning, a week should be enough to consume the entire relational database design course, and still have time left to check out some of the other courses!

    So how do you get such a free pass? The easiest method is to post a comment to this blog, explaining why I should pick you. On February 10, I will choose the five best, most convincing, or maybe just the funniest comments, and reward them with a training pass!

    So go to the Pluralsight site, sign up for that free trial, check out my course, then return here and post a comment. Convince me!
