THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Paul Nielsen

www.SQLServerBible.com

Partitioned Group Query

In the SSWUG Virtual Conference this morning, in the chat room following my Kill the Cursor session, Boris asked, "How can I find the last five events for every device, without a cursor?"

Here's the script, Boris. Essentially it uses the Over clause with a Partition by in a subquery to segment by device and number each row. The outer query then has a where clause to restrict the results to the last five.

Boris, I'm interested in the performance gains you see with the set-based query over the cursor. Once you test this will you please post your findings? thanks,

USE TempDB

CREATE TABLE Events (

        
EventID INT IDENTITY NOT NULL PRIMARY KEY,

        
Device CHAR(2),

        
EventTime DATETIME,

        
Data CHAR(10)

  )

GO

INSERT EventsDeviceEventTimeData)

    
VALUES ('a'GETDATE(), 'asd')

INSERT EventsDeviceEventTimeData)

    
VALUES ('b'GETDATE(), 'qwe')

INSERT EventsDeviceEventTimeData)

    
VALUES ('c'GETDATE(), 'sdf')

  
-- waitfor delay '00:00:00.100' 

GO 1000

INSERT EventsDeviceEventTimeData)

    
VALUES ('a'GETDATE(), 'last five')

INSERT EventsDeviceEventTimeData)

    
VALUES ('b'GETDATE(), 'last five')

INSERT EventsDeviceEventTimeData)

    
VALUES ('c'GETDATE(), 'last five')

      
WAITFOR delay '00:00:00.100'

GO 5
 


SELECT 

  
FROM (

  
SELECT EventIDDeviceEventTime,

        
Row_Number() OVER(partition BY device 

      
ORDER BY eventtime DESCAS RowNum

    
FROM Eventssq

  
WHERE RowNum <= 5

Published Thursday, June 26, 2008 1:03 PM by Paul Nielsen

Comments

 

jerryhung said:

The everlasting Top N per Group question

Yeah, ROW_NUMBER or RANK is the only way I do it now, or see how it can be done quickly and efficiently

June 27, 2008 9:21 AM
 

Boris Toplak said:

I made a test and a comparision between row-based and 2 cursor based solution on SQL Server 2005. I created a sample database with 1000 devices and each device has 200-3000 events. First I measured without using any index and then added an index.

R1 - Row-Based solution propesed by Paul in the article

Q1 - Cursor-Based solution using temporary table to join all results

Q2 - Cursor-Based solution returning multiple sets at once (1 set for each device)

No Index:

200: R1=0.8 sec; Q1=50 sec; Q2=43 sec

400: R1=1.5 sec; Q1=94 sec; Q2=85 sec

600: R1=2.2 sec; Q1=138 sec; Q2=127 sec

800: R1=3.0 sec; Q1=183 sec; Q2=172 sec

1000: R1=3.7 sec; Q1=226 sec; Q2=213 sec

1200: R1=4.6 sec; Q1=88 sec; Q2=82 sec

1400: R1=5.4 sec; Q1=95 sec; Q2=90 sec

1600: R1=6.3 sec; Q1=99 sec; Q2=93 sec

1800: R1=8.5 sec; Q1=116 sec; Q2=110 sec

2000: R1=9.4 sec; Q1=125 sec; Q2=122 sec

2500: R1=9.8 sec; Q1=143 sec; Q2=141 sec

3000: R1=13.4 sec; Q1=175 sec; Q2=171 sec

Using Index On Device, EventTime:

200: R1=0.8 sec; Q1=3.2 sec; Q2=1.0 sec

400: R1=1.4  sec; Q1=3.2 sec; Q2=0.9 sec

600: R1=2.2 sec; Q1=3.3 sec; Q2=1.0 sec

800: R1=2.9 sec; Q1=3.4 sec; Q2=1.1 sec

1000: R1=3.8 sec; Q1=3.5 sec; Q2=1.2 sec

1200: R1=4.6 sec; Q1=3.6 sec; Q2=1.4 sec

1400: R1=5.5 sec; Q1=3.2 sec; Q2=1.0 sec

1600: R1=6.2 sec; Q1=3.2 sec; Q2=1.0 sec

1800: R1=8.3 sec; Q1=3.2 sec; Q2=1.1 sec

2000: R1=9.3 sec; Q1=3.3 sec; Q2=1.1 sec

2500: R1=9.9 sec; Q1=3.5 sec; Q2=1.1 sec

3000: R1=13.3 sec; Q1=3.5 sec; Q2=1.2 sec

I noticed that for row-based solution the added index does not help It goes through entire table to find the solution and it does not take advantage of the added index.

July 8, 2008 5:51 AM
New Comments to this post are disabled

About Paul Nielsen

Paul Nielsen believes SQL is the romance language of data. As such he’s a hands-on database developer, Microsoft SQL Server MVP, trainer, and author of SQL Server Bible series (Wiley). As a data architect, he developed the concepts of Smart Database Design and Nordic – an open source O/R dbms for SQL Server. He lives in Colorado Springs.

This Blog

Syndication

News

news item test
Privacy Statement