THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Paul Nielsen

www.SQLServerBible.com

Partitioned Group Query

In the SSWUG Virtual Conference this morning, in the chat room following my Kill the Cursor session, Boris asked, "How can I find the last five events for every device, without a cursor?"

Here's the script, Boris. Essentially it uses the Over clause with a Partition by in a subquery to segment by device and number each row. The outer query then has a where clause to restrict the results to the last five.

Boris, I'm interested in the performance gains you see with the set-based query over the cursor. Once you test this will you please post your findings? thanks,

USE TempDB

CREATE TABLE Events (

        
EventID INT IDENTITY NOT NULL PRIMARY KEY,

        
Device CHAR(2),

        
EventTime DATETIME,

        
Data CHAR(10)

  )

GO

INSERT EventsDeviceEventTimeData)

    
VALUES ('a'GETDATE(), 'asd')

INSERT EventsDeviceEventTimeData)

    
VALUES ('b'GETDATE(), 'qwe')

INSERT EventsDeviceEventTimeData)

    
VALUES ('c'GETDATE(), 'sdf')

  
-- waitfor delay '00:00:00.100' 

GO 1000

INSERT EventsDeviceEventTimeData)

    
VALUES ('a'GETDATE(), 'last five')

INSERT EventsDeviceEventTimeData)

    
VALUES ('b'GETDATE(), 'last five')

INSERT EventsDeviceEventTimeData)

    
VALUES ('c'GETDATE(), 'last five')

      
WAITFOR delay '00:00:00.100'

GO 5
 


SELECT 

  
FROM (

  
SELECT EventIDDeviceEventTime,

        
Row_Number() OVER(partition BY device 

      
ORDER BY eventtime DESCAS RowNum

    
FROM Eventssq

  
WHERE RowNum <= 5

Published Thursday, June 26, 2008 1:03 PM by Paul Nielsen

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

jerryhung said:

The everlasting Top N per Group question

Yeah, ROW_NUMBER or RANK is the only way I do it now, or see how it can be done quickly and efficiently

June 27, 2008 9:21 AM
 

Boris Toplak said:

I made a test and a comparision between row-based and 2 cursor based solution on SQL Server 2005. I created a sample database with 1000 devices and each device has 200-3000 events. First I measured without using any index and then added an index.

R1 - Row-Based solution propesed by Paul in the article

Q1 - Cursor-Based solution using temporary table to join all results

Q2 - Cursor-Based solution returning multiple sets at once (1 set for each device)

No Index:

200: R1=0.8 sec; Q1=50 sec; Q2=43 sec

400: R1=1.5 sec; Q1=94 sec; Q2=85 sec

600: R1=2.2 sec; Q1=138 sec; Q2=127 sec

800: R1=3.0 sec; Q1=183 sec; Q2=172 sec

1000: R1=3.7 sec; Q1=226 sec; Q2=213 sec

1200: R1=4.6 sec; Q1=88 sec; Q2=82 sec

1400: R1=5.4 sec; Q1=95 sec; Q2=90 sec

1600: R1=6.3 sec; Q1=99 sec; Q2=93 sec

1800: R1=8.5 sec; Q1=116 sec; Q2=110 sec

2000: R1=9.4 sec; Q1=125 sec; Q2=122 sec

2500: R1=9.8 sec; Q1=143 sec; Q2=141 sec

3000: R1=13.4 sec; Q1=175 sec; Q2=171 sec

Using Index On Device, EventTime:

200: R1=0.8 sec; Q1=3.2 sec; Q2=1.0 sec

400: R1=1.4  sec; Q1=3.2 sec; Q2=0.9 sec

600: R1=2.2 sec; Q1=3.3 sec; Q2=1.0 sec

800: R1=2.9 sec; Q1=3.4 sec; Q2=1.1 sec

1000: R1=3.8 sec; Q1=3.5 sec; Q2=1.2 sec

1200: R1=4.6 sec; Q1=3.6 sec; Q2=1.4 sec

1400: R1=5.5 sec; Q1=3.2 sec; Q2=1.0 sec

1600: R1=6.2 sec; Q1=3.2 sec; Q2=1.0 sec

1800: R1=8.3 sec; Q1=3.2 sec; Q2=1.1 sec

2000: R1=9.3 sec; Q1=3.3 sec; Q2=1.1 sec

2500: R1=9.9 sec; Q1=3.5 sec; Q2=1.1 sec

3000: R1=13.3 sec; Q1=3.5 sec; Q2=1.2 sec

I noticed that for row-based solution the added index does not help It goes through entire table to find the solution and it does not take advantage of the added index.

July 8, 2008 5:51 AM

Leave a Comment

(required) 
(required) 
Submit

About Paul Nielsen

Paul Nielsen believes SQL is the romance language of data. As such he’s a hands-on database developer, Microsoft SQL Server MVP, trainer, and author of SQL Server Bible series (Wiley). As a data architect, he developed the concepts of Smart Database Design and Nordic – an open source O/R dbms for SQL Server. He lives in Colorado Springs.

This Blog

Syndication

News

news item test
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement