THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Adam Machanic

Adam Machanic, Boston-based SQL Server developer, shares his experiences with programming, monitoring, and performance tuning SQL Server. And the occasional battle with the query optimizer.

SQL Server 2005 T-SQL: Aggregates and the OVER clause

A new feature added to SQL Server 2005 for the sake of the windowing functions is the OVER clause. Using this clause, you can specify ordering or partitioning for the windowing functions. For instance, to enumerate the names of all of the products in the AdventureWorks database that have a list price, along with their list prices and the rank of those prices compared to all of the other prices, the following query can now be used:

 

SELECT
P.Name,
P.ListPrice,
DENSE_RANK() OVER (ORDER BY P.ListPrice DESC) AS PriceRank
FROM Production.Product P
WHERE
ListPrice > 0
ORDER BY
P.Name ASC



Name List Price PriceRank
-------------------------------------------------
All-Purpose Bike Stand 159.0000 44
AWC Logo Cap 8.9900 98
Bike Wash - Dissolver 7.9500 99
Cable Lock 25.0000 88
Chain 20.2400 93
Classic Vest, L 63.5000 66
Classic Vest, M 63.5000 66
Classic Vest, S 63.5000 66
Fender Set - Mountain 21.9800 91
Front Brakes 106.5000 55
Front Derailleur 91.4900 59
...

So what does this tell us? All-Purpose Bike Stand is the 44th most expensive item sold by AdventureWorks. AWC Logo Cap is the 98th most expensive item. And the Vests are tied for 66th most expensive. Which is why DENSE_RANK was used for this example! But really, this example is only here to demonstrate one use of the OVER clause. And this post isn't about windowing functions or rankings at all. That's another post for another day.

What this post is about is normal, non-windowing aggregate functions. Like SUM(). It turns out that the OVER clause can be used for them, too!

Pretend that you're an employee of AdventureWorks and your manager comes to you with a request: Write a query to return all of the products, their prices, their subcategories, and the average price for all products in the subcategory that any given product belongs to... Why? Perhaps the manager wants to re-categorize products based on whether they fall, percentage-wise, close to the same average price. Or maybe it just makes a good contrived example for showing this feature! Regardless...

Here's how you can solve this in SQL Server 2000:

 

SELECT
P.Name AS ProductName,
P.ListPrice,
PS.Name AS ProductSubCategoryName,
x.AveragePrice
FROM Production.Product P
JOIN Production.ProductSubCategory PS ON P.ProductSubCategoryID = PS.ProductSubCategoryID
JOIN
(
SELECT
P2.ProductSubCategoryID,
AVG(P2.ListPrice) AS AveragePrice
FROM Production.Product P2
WHERE
P2.ProductSubCategoryID IS NOT NULL
GROUP BY
P2.ProductSubCategoryID
) x ON x.ProductSubCategoryID = P.ProductSubCategoryId
ORDER BY
P.Name

I don't know about you (since I have no clue who you are), but I personally have a difficult time reading this. If I came back to this query in six months, it would take me a few minutes to figue out what was going on. And doesn't it feel like there should be a more efficient way of expressing it?

...Well, now there is...

 

SELECT
P.Name AS ProductName,
P.ListPrice,
PS.Name AS ProductSubCategoryName,
AVG(P.ListPrice) OVER (PARTITION BY P.ProductSubCategoryID)
FROM Production.Product P
JOIN Production.ProductSubCategory PS ON P.ProductSubCategoryID = PS.ProductSubCategoryID
ORDER BY
P.Name

So what's going on here? Under the covers, SQL Server builds a subquery for the average, based on the partitioning column of the OVER clause -- which in this case is ProductSubCategoryID. It's a little bit less efficient in this case than the derived table approach, but a lot cleaner from a readability standpoint. Personally, I think it's a really cool feature, although I don't honestly see myself using it too often.

More ways to express yourself using SQL Server 2005. Madonna would be proud.


Published Wednesday, July 12, 2006 10:44 PM by Adam Machanic
Filed under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

SANJEEV SHARMA said:

I found it very useful. Let me give you a quick overview of the OVER clause. Not very detailed but a will prove a nice kick start for you guys.

http://sanjevsharma.blogspot.com/2008/04/sql-server-2005-over-clause.html

April 20, 2008 2:49 AM
 

Tinku said:

Thanks a alot very nice information

April 25, 2009 12:39 PM
 

John Q. Butliquor said:

Nice!

March 15, 2011 5:27 PM

Leave a Comment

(required) 
(required) 
Submit

About Adam Machanic

Adam Machanic is a Boston-based SQL Server developer, writer, and speaker. He focuses on large-scale data warehouse performance and development, and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "SQL Server 2008 Internals" (Microsoft Press, 2009) and "Expert SQL Server 2005 Development" (Apress, 2007). Adam regularly speaks at conferences and training events on a variety of SQL Server topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, a Microsoft Certified IT Professional (MCITP), and an alumnus of the INETA North American Speakers Bureau.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement