THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

SELECT Hints, Tips, Tricks FROM Hugo Kornelis WHERE RDBMS = 'SQL Server'

Parameterization and filtered indexes (part 2)

In my previous post, I demonstrated how the presence of a filtered index can get in the way of successful parameterization, possibly resulting in too much query compilations and procedure cache bloat. I suggest reading that first, because I will go straight where I left off.

 

Use the Force, Luke

 

If you read the previous post very carefully, you will have noticed that I wrote that if a plan is unstable, it is “unsafe for simple parameterization” – and that suggests that SQL Server will not have such inhibitions when using forced parameterization. Let’s have a look. Assuming you still have the indexes I created in the previous blog post, just execute the query below change the parameterization setting for AdventureWorks2012 to forced, clear out the plan cache (and remember not to do this on a production system!), execute the query that would not successfully parameterize when the database was set to simple parameterization, and inspect the plan cache. You will see that this time, the query was indeed parameterized. If you also add the code to watch the “Unsafe Auto-Params/sec” counter, you will see no changes to its value.

 

USE AdventureWorks2012;

go

ALTER DATABASE AdventureWorks2012

SET PARAMETERIZATION FORCED;

go

DBCC FREEPROCCACHE;

go

SELECT SUM(OrderQty)

FROM   Sales.SalesOrderDetail

WHERE  ProductID = 706;

go

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')

SELECT      cp.objtype AS ObjectType,

            cp.usecounts AS UseCount,

            st.[text] AS QueryText,

            cp.plan_handle AS PlanHandle,

            stmt.value('(@ParameterizedPlanHandle)',

                       'varchar(64)') AS ParameterizedPlanHandle,

            qp.query_plan AS [Execution Plan]

FROM        sys.dm_exec_cached_plans AS cp

CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp

CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st

CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS batch(stmt)

WHERE       qp.dbid = DB_ID()

AND         st.[text] NOT LIKE '%ExcludeMe%';

go

 

At first sight, this may appear to be a very simple and elegant solution for the problem. Just set all databases to forced parameterization and then sit back and relax, waiting for the inevitable flood of happy emails from end users who wish to thank you for the tremendous performance boost you just gave them.

 

But wait. When has solving a problem in SQL Server ever been this simple? There must be a catch, right? But where?

 

A new use case

 

In order to see the problem, I will use a different database – so let’s first clean up the mess we made in AdventureWorks2012 and restore it to its original state, so that other bloggers can use it for their demos without my stuff getting in the way:

 

ALTER DATABASE AdventureWorks2012

SET PARAMETERIZATION SIMPLE;

 

DROP INDEX Sales.SalesOrderDetail.ix_ProductID_Over_1000;

DROP INDEX Sales.SalesOrderDetail.ix_ProductID_Incl_OrderQty;

 

A very common use case of filtered indexes is in combination with “soft deletes” – data that is no longer valid is not physically removed from the table, but kept (often to maintain a history of events). A special column, usually called “is_deleted”, is used to track that this information is historic only. Since the majority of queries is only interested in current data, you will find the predicate “is_deleted = 0” in almost every query – so that makes this an ideal candidate for a filtered index. And since we are now under the impression that filtered indexes really require forced parameterization, we will change that setting right away. Here is the script to create this scenario in a sample database (it may take some time to run this!):

 

USE tempdb;

go

IF EXISTS (SELECT * FROM sys.databases WHERE name = 'FilterDemo')

BEGIN;

    ALTER DATABASE FilterDemo SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

    DROP DATABASE FilterDemo;

END;

CREATE DATABASE FilterDemo;

ALTER DATABASE FilterDemo SET PARAMETERIZATION FORCED;

go

USE FilterDemo;

go

CREATE TABLE dbo.AllThings

   (ThingID int NOT NULL,

    ThingName varchar(20) NOT NULL,

    is_deleted bit NOT NULL DEFAULT (0),

    LotsOfStuff char(2000) DEFAULT ('Placeholder'),

    CONSTRAINT PK_AllThings

        PRIMARY KEY (ThingID)

   );

DECLARE @i int = 0;

WHILE @i < 5000

BEGIN;

    SET @i += 1;

    INSERT dbo.AllThings

           (ThingID,

            ThingName,

            is_deleted)

    VALUES (@i,

            CHAR(65 + RAND() * 26) + CHAR(65 + RAND() * 26) + CHAR(65 + RAND() * 26), -- Random name

            CASE WHEN RAND() < 0.99 THEN 1 ELSE 0 END)  -- Most products are deleted

END;

CREATE INDEX ix_Name_NotDeleted

ON dbo.AllThings (ThingName)

INCLUDE (is_deleted)    -- Always include the columns you filter on!

WHERE is_deleted = 0;

go

 

Once the script is done, let’s get a quick count of the number of things that have not been deleted. I used a formula that results in 99% of the things being deleted, but there is a random factor involved. The actual number of non-deleted things should be close to 50. It was 58 on my system, which I found by running this query:

 

SET STATISTICS IO ON;

go

SELECT COUNT(*)

FROM   dbo.AllThings

WHERE  is_deleted = 0;

 

But if you switch to the messages tab, you will see a very disturbing number. I expect SQL Server to execute this query by simply scanning the filtered index, as this index exactly contains all the rows it needs to count. But the output from STATISTICS IO shows that a total of 1256 logical reads have been made. Over twelve hundred reads to count just 58 rows? How is that possible? Let’s take a look at the execution plan to find out:

image

As you see, the filtered index is not used at all; the optimizer chose to scan the clustered index instead, wading through all 5,000 “things” in my table to find just those 58 that were not deleted.

 

The reason for this is simple. I have enabled forced parameterization. So I told SQL Server that, no matter the consequences, it should always replace constant values with parameters. So the plan that the optimizer was forced to compile was not for the query I typed, but for this query instead:

 

DECLARE @1 int = 0;

SELECT COUNT(*)

FROM   dbo.AllThings

WHERE  is_deleted = @1;

 

And the optimizer has to produce a plan that will always return the correct results, for any possible value of the parameter. Of course, when executing this query with parameter value 1, scanning the clustered index is the only possible way to return correct results, so the plan choice that was made was indeed the only possible choice. By enabling forced parameterization, we have effectively crippled the optimizer in using any filtered index at all (except through views).

 

More force?

 

In a case such as this, where we know that we will always use is_deleted = 0 in the query predicate, it can become very tempting to find ways to convince the optimizer to choose the query plan we want without giving up on the forced parameterization plan. Let’s first see what we can achieve by parameterizing this query ourselves and applying the OPTIMIZE FOR hint:

 

DECLARE @1 int = 0;

SELECT COUNT(*)

FROM   dbo.AllThings

WHERE  is_deleted = @1

OPTION (OPTIMIZE FOR (@1 = 0));

 

Okay, I admit, it was a rather desperate attempt and I didn’t really expect much of it. The OPTIMIZE FOR hint tells the optimizer that I want a plan that gives the best performance for that particular value, but the results still have to be correct for other values. So the optimizer will still consider what would happen if I supply other values, and will reject the filtered index because of that.

 

So, back to the original query, and now use more force. I know that this query will always perform better when using the filtered index – so if the optimizer fails to see that, I will just force it. Applying an index hint does just that. It may be called a hint, but it is a directive; if you hint a query, it WILL be used no matter what. So this should help, right?

 

SELECT COUNT(*)

FROM   dbo.AllThings WITH (INDEX = ix_Name_NotDeleted)

WHERE  is_deleted = 0;

 

No, it does not help. The result is an error message. We have now given the optimizer a totally impossible task. We told it to parameterize the query, no matter what, so it did. It now has to produce an execution plan for the parameterized query, and that query will be reused with different values. You and I know that the value will really never be different, but SQL Server does not, and it still has to guarantee correct results. But then we also told SQL Server that it really has to use an index that does not include some of the rows that, for some values of the parameter, may have to be returned. Like I said – an impossible task, and SQL Server responds just like you and I would do, by throwing up its hands in despair and giving up.

 

The simple solution

 

Luckily, there is a solution to this problem. It’s simple – simple parameterization, to be precise. Just set the parameterization option back to its default setting of simple, and you will get a much better behavior.

 

ALTER DATABASE FilterDemo

SET PARAMETERIZATION SIMPLE;

SET STATISTICS IO ON;

go

SELECT COUNT(*)

FROM   dbo.AllThings

WHERE  is_deleted = 0;

 

Now the query takes just two logical reads. And the execution plan looks as expected: a scan of the filtered index, that’s all.

 

Conclusion

 

When you look only at the plan cache, forced parameterization may look like manna from heaven. But when you look further, you will see that setting the parameterization option to forced is probably not a good idea at all. It may appear to solve some issues, but you get bigger issues in return – filtered indexes that might boost performance tremendously are ignored, and if they are hinted it can even cause errors. Do you know all the code that is running on your system? Are you sure that none of your application developers has ever added an index hint? Do you want to find out the hard way?

 

Simple parameterization in combination with filtered indexes may not always play well with the plan cache. But I would think very long and hard, and do a close inspection of all objects in the database and all queries used before even considering switching to forced parameterization.

Published Friday, January 24, 2014 3:43 AM by Hugo Kornelis

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Paul White said:

Hi Hugo,

As an aside, adding OPTION (RECOMPILE) allows the filtered index to be used even where parameterization is forced (there is no possibility of plan reuse).

Paul

January 25, 2014 5:03 PM

Leave a Comment

(required) 
(required) 
Submit

About Hugo Kornelis

Hugo is co-founder and R&D lead of perFact BV, a Dutch company that strives to improve analysis methods and to develop computer-aided tools that will generate completely functional applications from the analysis deliverable. The chosen platform for this development is SQL Server. In his spare time, Hugo likes to visit the SQL Server newsgroups, in order to share and enhance his knowledge of SQL Server.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement