THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alexander Kuznetsov

More specific queries may run faster, but we should not overdo it.

It is well known that replacing a generic, one-size-fits-all query with several more specific ones may boost performance. However, in some cases these more specific queries do not perform any faster than the generic one. We shall consider several examples which demonstrate both scenarios.

Prerequisites

The following script creates a table and populates it with 4M rows, not much but enough for our test cases.

CREATE TABLE dbo.Packages
    
(
      
ID INT NOT NULL
            
IDENTITY ,
      
[length] INT NOT NULL ,
      
width INT NOT NULL ,
      
[weight] INT NOT NULL ,
      
SpaceFiller CHAR(30) NOT NULL,
      
CONSTRAINT PK_Packages PRIMARY KEY ( ID )
    ) ;
GO

INSERT  INTO dbo.Packages
        
( [length] ,
          
width ,
          
[weight],
          
SpaceFiller
        
)
        
SELECT  n1.n ,
                
n2.n ,
                
1 ,
                
'*'
        
FROM    ( SELECT    n
                  
FROM      dbo.Numbers
                  
WHERE     n <= 2000
                
) AS n1
                
CROSS JOIN ( SELECT n
                            
FROM   dbo.Numbers
                            
WHERE  n <= 2000
                          
) AS n2 ;

GO

CREATE INDEX Packages_length_width
  
ON dbo.Packages([length], width) INCLUDE ( [weight], SpaceFiller ) ;
GO

CREATE INDEX Packages_width_length
  
ON dbo.Packages(width, [length]) INCLUDE ( [weight], SpaceFiller ) ;
GO

 

When replacing a generic query with two specific ones boosts performance

The following generic query performs very poorly - it does not utilize any indexes. Instead, it scans the whole table and sorts it:

CREATE PROCEDURE dbo.GetTop1Weight
    
@OrderBy VARCHAR(30) ,
    
@topWeight INT OUT
AS
    BEGIN
;
        
SET NOCOUNT ON ;
        
SELECT TOP ( 1 )
                
@topWeight = [weight]
        
FROM    dbo.Packages
        
ORDER BY CASE WHEN @OrderBy = 'Length,Width' THEN [length]
                      
ELSE [width]
                
END ,
                
CASE WHEN @OrderBy = 'Length,Width' THEN [width]
                    
ELSE [length]
                
END ;
    
END ;
GO
  

You can run it or at least see the execution plan, and see for yourself - the query needs about 25K logical reads to complete.

Let us replace this generic query with two specific ones:

ALTER PROCEDURE dbo.GetTop1Weight
    
@OrderBy VARCHAR(30) ,
    
@topWeight INT OUT
AS
    BEGIN
;
        
SET NOCOUNT ON ;
        
IF @OrderBy = 'Length,Width'
            
BEGIN ;
                
SELECT TOP ( 1 )
                        
@topWeight = [weight]
                
FROM    dbo.Packages
                
ORDER BY [length] ,
                        
[width] ;
            
END ;
        
ELSE
            BEGIN
;
                
SELECT TOP ( 1 )
                        
@topWeight = [weight]
                
FROM    dbo.Packages
                
ORDER BY [width] ,
                        
[length] ;
            
END ;
    
END ;
GO
  

As a result the query speeds up dramatically - for both sort orders it executes as one index seek:

DECLARE @topWeight INT ;
EXEC dbo.GetTop1Weight @OrderBy = 'Length,Width',
    
@topWeight = @topWeight OUT ;
EXEC dbo.GetTop1Weight @OrderBy = 'Width,Length',
    
@topWeight = @topWeight OUT ;

Both queries need just 3 logical reads to complete, which is a huge improvement.

As we have seen, sometimes replacing a generic, one-size-fits-all query with several more specific ones dramatically improves performance. However, in many other cases more specific queries do not perform any better.

When replacing a generic query with two specific ones does not improve performance at all

The following one-size-fits-all query performs very poorly:

CREATE PROCEDURE dbo.GetTotalWeight
    
@length INT = NULL ,
    
@width INT = NULL ,
    
@totalWeight INT OUT
AS
    BEGIN
;
        
SET NOCOUNT ON ;
        
IF ( ( @length IS NULL
               AND
@width IS NULL
             )
             OR (
@length IS NOT NULL
                  AND
@width IS NOT NULL
                )
           )
            
BEGIN ;
                
RAISERROR('Must provide exactly one of the following parameters: @width and @length', 16, 1) ;
                
RETURN 1 ;
            
END ;
        
SELECT  @totalWeight = SUM([weight])
        
FROM    dbo.Packages
        
WHERE   [length] = COALESCE(@length, [length])
                AND
[width] = COALESCE(@width, [width]) ;
    
END ;
GO
 

Note: for simplicity, let us assume that we must always provide exactly one parameter, either length or width, but never both. In all the subsequent versions we shall skip the verification of parameters, the first IF statement, just to keep our examples shorter.

We can run the following script and see that every time our stored procedure scans the whole table and performs 25K logical reads:

DECLARE @totalWeight INT ;
EXEC dbo.GetTotalWeight @length = 1,
    
@totalWeight = @totalWeight OUT ;
EXEC dbo.GetTotalWeight @width = 1,
    
@totalWeight = @totalWeight OUT ;

Speeding up this query is very easy. The following query is index friendly. For brevity, we have omitted the first step, the verification that either length or width, but never both, is provided:

ALTER PROCEDURE dbo.GetTotalWeight
    
@length INT = NULL ,
    
@width INT = NULL ,
    
@totalWeight INT OUT
AS
    BEGIN
;
        
SET NOCOUNT ON ;
        
SELECT  @totalWeight = SUM([weight])
        
FROM    dbo.Packages
        
WHERE   [length] = @length
                
OR [width] = @width ;
    
END ;
GO

Every time it uses the right index and needs just 16 logical reads:

DECLARE @totalWeight INT ;
EXEC dbo.GetTotalWeight @length = 1,
    
@totalWeight = @totalWeight OUT ;
EXEC dbo.GetTotalWeight @width = 1,
    
@totalWeight = @totalWeight OUT ;
 

In this case more specific queries do not run any faster. The following queries show several approaches which we can try out. We can keep all the code in one procedure, as shown in the following scripts:

ALTER PROCEDURE dbo.GetTotalWeight
    
@length INT = NULL ,
    
@width INT = NULL ,
    
@totalWeight INT OUT
AS
    BEGIN
;
        
SET NOCOUNT ON ;
        
SELECT  @totalWeight = SUM([weight])
        
FROM    ( SELECT    [weight]
                  
FROM      dbo.Packages
                  
WHERE     [length] = @length
                  
UNION ALL
                  
SELECT    [weight]
                  
FROM      dbo.Packages
                  
WHERE     [width] = @width
                
) AS T ;
    
END ;
GO

ALTER PROCEDURE dbo.GetTotalWeight
    
@length INT = NULL ,
    
@width INT = NULL ,
    
@totalWeight INT OUT
AS
    BEGIN
;
        
SET NOCOUNT ON ;
        
IF @length IS NOT NULL
            
BEGIN ;
                
SELECT  @totalWeight = SUM([weight])
                
FROM    dbo.Packages
                
WHERE   [length] = @length ;
            
END
        ELSE
            BEGIN
                SELECT  
@totalWeight = SUM([weight])
                
FROM    dbo.Packages
                
WHERE   [width] = @width ;
            
END
    END
;
GO

Alternatively, we can try out two separate procedures:

CREATE PROCEDURE dbo.GetTotalWeight_ByWidth
    
@width INT ,
    
@totalWeight INT OUT
AS
    BEGIN
;
        
SET NOCOUNT ON ;
        
SELECT  @totalWeight = SUM([weight])
        
FROM    dbo.Packages
        
WHERE   [width] = @width ;
    
END ;
GO

CREATE PROCEDURE dbo.GetTotalWeight_ByLength
    
@length INT ,
    
@totalWeight INT OUT
AS
    BEGIN
;
        
SET NOCOUNT ON ;
        
SELECT  @totalWeight = SUM([weight])
        
FROM    dbo.Packages
        
WHERE   [length] = @length ;
    
END ;
GO

DECLARE @totalWeight INT ;
EXEC dbo.GetTotalWeight_ByLength @length = 1,
    
@totalWeight = @totalWeight OUT ;
EXEC dbo.GetTotalWeight_ByWidth @width = 1,
    
@totalWeight = @totalWeight OUT ;

Anyway, none of the approaches we have tried out performs any faster than the optimized generic one - they all complete in 16 logical reads and use approximately the same CPU.

So, in this case replacing a generic procedure with several specific ones does not make any sense at all - it increases complexity without improving performance.

We are better off with the original generic query:

ALTER PROCEDURE dbo.GetTotalWeight
    
@length INT = NULL ,
    
@width INT = NULL ,
    
@totalWeight INT OUT
AS
    BEGIN
;
        
SET NOCOUNT ON ;
        
SELECT  @totalWeight = SUM([weight])
        
FROM    dbo.Packages
        
WHERE   [length] = @length
                
OR [width] = @width ;
    
END ;
GO

Conclusion

Although in many cases replacing a generic query with several specific ones does boost performance, we should not use this approach in all cases as a rule of thumb. Instead, we need to prove that splitting a generic query does improve performance before actually doing it.

 

My other posts discussing T-SQL code reuse and performance are as follows:

 
Will wrapping up code in a reusable stored procedure hurt performance?

 
Reuse Your Code with Table-Valued UDFs

Comparing the cost of INSERT ... EXEC and its alternatives
Published Wednesday, July 20, 2011 11:01 PM by Alexander Kuznetsov

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

No Comments

Leave a Comment

(required) 
(required) 
Submit

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works as an agile developer.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement