<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Rob Farley : sql, t-sql tuesday</title><link>http://sqlblog.com/blogs/rob_farley/archive/tags/sql/t-sql+tuesday/default.aspx</link><description>Tags: sql, t-sql tuesday</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>Filegroups and Non-Clustered Indexes</title><link>http://sqlblog.com/blogs/rob_farley/archive/2013/03/12/filegroups-and-non-clustered-indexes.aspx</link><pubDate>Tue, 12 Mar 2013 00:20:11 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:48187</guid><dc:creator>Rob Farley</dc:creator><slash:comments>7</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/48187.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=48187</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=48187</wfw:comment><description>&lt;p&gt;Let’s start with some basics and then jump in a bit deeper, for this post to go with the 40th &lt;a href="http://www.midnightdba.com/Jen/2013/03/invitation-to-t-sql-tuesday-040-file-and-filegroup-wisdom/" target="_blank"&gt;T-SQL Tuesday&lt;/a&gt;, hosted this month by &lt;a href="http://www.midnightdba.com/Jen/" target="_blank"&gt;Jen McCown&lt;/a&gt;. &lt;a href="http://www.midnightdba.com/Jen/2013/03/invitation-to-t-sql-tuesday-040-file-and-filegroup-wisdom/" target="_blank"&gt;&lt;img style="background-image:none;border-right-width:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_3F5FB646.jpg" width="170" height="170" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;SQL Server holds data, and that data is stored physically in files.&lt;/p&gt;  &lt;p&gt;Of course, in the database world we think of the data as living in tables&lt;a href="http://sqlblog.com/blogs/rob_farley/archive/2012/12/11/joins-in-single-table-queries.aspx" target="_blank"&gt;*&lt;/a&gt;, so naturally there must be some sort of mapping between the data in a table and the data in a file. This is where filegroups come in.&lt;/p&gt;  &lt;p&gt;When SQL objects are created within a database, there is an option to be able to tell the system where to actually store it. But this isn’t a file, it’s a filegroup. (If it were a file and the disk volume that the file was on filled up, we’d have big problems. Instead, we put it in a filegroup, and can add extra files (on different volumes if so desired) to that filegroup.) Objects are stored within filegroups. Filegroups are groups of files, although many database systems do end up with only a single file per filegroup.&lt;/p&gt;  &lt;p&gt;Filegroups end up providing a large amount of flexibility for the storage of data. Rarely accessed data can be put in filegroups that have files on cheaper (but probably slower) disk, while data that is heavily written can be put on disk that’s more appropriate for that, and so on. I’m sure you get the picture, and this is nothing overly thought-worthy.&lt;/p&gt;  &lt;p&gt;You may even have already considered the idea around partitioning data across filegroups, moving data of a certain age (but potentially even from the same table) onto a different filegroup, so that queries that use different parts of tables can benefit from having some data on faster disk. Lots of potential. :)&lt;/p&gt;  &lt;p&gt;Where it becomes interesting though, is when you start to consider the concept of non-clustered indexes.&lt;/p&gt;  &lt;p&gt;You see, while a clustered index (or heap) provides the structure to store the data across all the columns of a table (although I’m simplifying it a little in regard to LOB data), we use copies of some of that data (in non-clustered indexes) to optimise access to the data.&lt;/p&gt;  &lt;p&gt;So where are these copies stored? Many database people don’t even think about this kind of thing, in the same way that they often don’t think about including indexes in their database design. I’m going to figure you’re not in that boat though, because I probably know you, and you’re probably not one of those people.&lt;/p&gt;  &lt;p&gt;Most people like to have a default filegroup that is NOT the primary filegroup. It means that when they create new objects, those new objects get put in the default filegroup, not the primary one. But it’s not actually that simple.&lt;/p&gt;  &lt;p&gt;Let’s start by creating a database.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;CREATE DATABASE fg_testing;        &lt;br /&gt;GO         &lt;br /&gt;USE fg_testing;         &lt;br /&gt;GO         &lt;br /&gt;SELECT *         &lt;br /&gt;FROM sys.filegroups;         &lt;br /&gt;--Only one filegroup at the moment         &lt;br /&gt;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-right-width:0px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_39E11FD5.png" width="586" height="265" /&gt;&lt;/p&gt;  &lt;p&gt;Notice the column &lt;font face="Consolas"&gt;data_space_id&lt;/font&gt;. This is the column which identifies each filegroup. We’ll use it later.&lt;/p&gt;  &lt;p&gt;Let’s create a new filegroup and set it to be the default.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;ALTER DATABASE fg_testing ADD FILEGROUP FG2;        &lt;br /&gt;GO         &lt;br /&gt;SELECT *         &lt;br /&gt;FROM sys.filegroups;         &lt;br /&gt;GO&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_2FCC7EAA.png" width="567" height="206" /&gt;&lt;/p&gt;  &lt;p&gt;Cool – data_space_id 2 is created.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;ALTER DATABASE fg_testing MODIFY FILEGROUP FG2 DEFAULT;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_7ADF796C.png" width="564" height="105" /&gt;     &lt;br /&gt;&lt;/p&gt;  &lt;p&gt;I think this is a shame – but I guess it’s fair enough. We have to have a file in the filegroup before we can make it the default. That’s easy enough though, and probably what we’d be wanting to do before too long anyway.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;ALTER DATABASE fg_testing ADD FILE ( NAME = N'file2', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012\MSSQL\DATA\fg_testing_file2.ndf') TO FILEGROUP FG2;        &lt;br /&gt;GO         &lt;br /&gt;ALTER DATABASE fg_testing MODIFY FILEGROUP FG2 DEFAULT;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_6A8401B3.png" width="770" height="162" /&gt;&lt;/p&gt;  &lt;p&gt;(I’m doing this on my laptop, which only has one physical drive – on a server it’d go somewhere more sensible of course)&lt;/p&gt;  &lt;p&gt;Now let’s add another filegroup. This is going to be for some data, but I don’t want it to be my default.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;ALTER DATABASE fg_testing ADD FILEGROUP FG3;        &lt;br /&gt;GO         &lt;br /&gt;ALTER DATABASE fg_testing ADD FILE ( NAME = N'file3', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012\MSSQL\DATA\fg_testing_file3.ndf') TO FILEGROUP FG3;&lt;/font&gt;       &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_07A9A9BE.png" width="731" height="186" /&gt;&lt;/p&gt;  &lt;p&gt;Now, I want to create an object on FG3, which I’m going to pretend is my slower disk.&lt;/p&gt;  &lt;p&gt;Once created, I’m looking in sys.objects to confirm which filegroup the table is in, but there’s no information there. Remember that a table is only metadata, and the things that matter are the indexes/heaps that are on it. Checking in sys.indexes shows me that indeed, it’s in data_space_id 3.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;CREATE TABLE dbo.OrderDates (OrderDate date PRIMARY KEY, NumOrders int NOT NULL DEFAULT 0) ON FG3;        &lt;br /&gt;GO         &lt;br /&gt;SELECT *         &lt;br /&gt;FROM sys.objects         &lt;br /&gt;WHERE object_id = OBJECT_ID('dbo.OrderDates');         &lt;br /&gt;GO         &lt;br /&gt;SELECT *         &lt;br /&gt;FROM sys.indexes         &lt;br /&gt;WHERE object_id = OBJECT_ID('dbo.OrderDates');         &lt;br /&gt;GO&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-right-width:0px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_798ABAC0.png" width="1063" height="449" /&gt;&lt;/p&gt;  &lt;p&gt;For completeness’ sake, I’m going to put some data in there, using a query &lt;a href="http://sqlblog.com/blogs/rob_farley/archive/2013/03/11/a-top-query.aspx" target="_blank"&gt;that I blogged about yesterday&lt;/a&gt;.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;INSERT dbo.OrderDates (OrderDate, NumOrders)        &lt;br /&gt;SELECT TOP (DATEDIFF(DAY,'20010101','20130312'))         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; DATEADD(day,ROW_NUMBER() OVER (ORDER BY (SELECT 1))-1,'20010101') AS OrderDate,         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ABS(CHECKSUM(NEWID())) % 100 as NumOrders         &lt;br /&gt;FROM master.dbo.spt_values t1 CROSS JOIN master.dbo.spt_values t2         &lt;br /&gt;ORDER BY ROW_NUMBER() OVER (ORDER BY (SELECT 1))         &lt;br /&gt;GO&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-right-width:0px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_461ED197.png" width="796" height="235" /&gt;&lt;/p&gt;  &lt;p&gt;But the whole point of this was to see what happens with the non-clustered index, which I want to be on recent data only (ie, filtered), and I want it to be in the DEFAULT filegroup.&lt;/p&gt;  &lt;p&gt;As I want it in the default group, I won’t specify a filegroup for the index.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;CREATE INDEX ixRecentData ON dbo.OrderDates (OrderDate)        &lt;br /&gt;INCLUDE (NumOrders)         &lt;br /&gt;WHERE OrderDate &amp;gt;= '20130101';         &lt;br /&gt;GO         &lt;br /&gt;SELECT *         &lt;br /&gt;FROM sys.indexes         &lt;br /&gt;WHERE object_id = OBJECT_ID('dbo.OrderDates');&lt;/font&gt;       &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-right-width:0px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_698B502F.png" width="968" height="333" /&gt;&lt;/p&gt;  &lt;p&gt;But look! It’s put the non-clustered index in the same filegroup as the clustered index. This isn’t what I wanted. In fact, it’s almost never what I’d want, because even if the disk performance is identical, it can be nice to have transactions which update both the clustered and non-clustered indexes using different disks.&lt;/p&gt;  &lt;p&gt;Instead, we have to specify it explicitly to tell it to use the filegroup we want.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;DROP INDEX ixRecentData ON dbo.OrderDates        &lt;br /&gt;GO         &lt;br /&gt;CREATE INDEX ixRecentData ON dbo.OrderDates (OrderDate)         &lt;br /&gt;INCLUDE (NumOrders)         &lt;br /&gt;WHERE OrderDate &amp;gt;= '20130101'         &lt;br /&gt;ON FG2;         &lt;br /&gt;GO         &lt;br /&gt;SELECT *         &lt;br /&gt;FROM sys.indexes         &lt;br /&gt;WHERE object_id = OBJECT_ID('dbo.OrderDates');&lt;/font&gt;       &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-right-width:0px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_5E321625.png" width="902" height="386" /&gt;&lt;/p&gt;  &lt;p&gt;It’s a shame to have to be so explicit with these things, but when you create non-clustered indexes, you really should think about which filegroup they’re going onto, because they may well not go onto the default one as you might expect.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://twitter.com/rob_farley" target="_blank"&gt;@rob_farley&lt;/a&gt;&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=48187" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>Behind the scenes of PowerShell and SQL</title><link>http://sqlblog.com/blogs/rob_farley/archive/2013/02/12/behind-the-scenes-of-powershell-and-sql.aspx</link><pubDate>Tue, 12 Feb 2013 00:11:46 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47649</guid><dc:creator>Rob Farley</dc:creator><slash:comments>3</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/47649.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=47649</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=47649</wfw:comment><description>&lt;p&gt;Every year, PowerShell increases its stranglehold on the Windows Server system and the applications that run upon it – with good reason too. Its consistent mechanisms for interaction between its scripting interface and the underlying systems make it easy for people to feel comfortable, and there is a discoverability that has been lacking in many other scripting environments.&lt;/p&gt;  &lt;p&gt;Of course, SQL Server hasn’t been overlooked at all, and it’s coming up to five years since the &lt;a href="https://msmvps.com/blogs/robfarley/archive/2008/03/04/sql-server-2008-powershell-snapin.aspx" target="_blank"&gt;SnapIns were made available&lt;/a&gt; (even longer since people started to &lt;a href="http://msmvps.com/blogs/robfarley/archive/2007/01/05/powershell-and-sql.aspx" target="_blank"&gt;dabble with SQL using PowerShell&lt;/a&gt;).&lt;/p&gt;  &lt;p&gt;But what’s going on behind the scenes? Does PowerShell present a threat to those amongst us who will always prefer T-SQL? Does PowerShell give us new options that are not available any other way? Well, let’s have a bit of a look, especially since this month’s T-SQL Tuesday (hosted by &lt;a href="http://blog.waynesheffield.com/wayne/archive/2013/02/invitation-for-t-sql-tuesday-39-can-you-shell-what-the-posh-is-cooking/" target="_blank"&gt;Wayne Sheffield&lt;/a&gt; who tweets as &lt;a href="http://twitter.com/DBAWayne" target="_blank"&gt;@DBAWayne&lt;/a&gt;) is on the topic of PowerShell.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://blog.waynesheffield.com/wayne/archive/2013/02/invitation-for-t-sql-tuesday-39-can-you-shell-what-the-posh-is-cooking/" target="_blank"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top:0px;border-right:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_41BF631A.jpg" width="170" height="170" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;So we know PowerShell is useful. However we spin it up, we can quickly jump into writing commands, whether it be interacting with WMI, hooking into some .Net assembly we’ve loaded up, or simply browsing the file system. I’ve developed a tendency to use it to start whichever SQL instances I’m needing for the day – by default I have all of them turned off, since I don’t know which one I’ll be wanting most.&lt;/p&gt;  &lt;p&gt;If we’re going to be interacting with SQL, then it’s easiest to either load up the SQLPS environment directly (there’s a PowerShell shortcut within Management Studio), or else (as I do), start a PowerShell window with the Snapin loaded. I prefer this later option, as the SQLPS environment is a slightly cut-back version of PowerShell. But either way – the stuff I’ll continue on with is essentially the same whichever environment you use.&lt;/p&gt;  &lt;p&gt;If you’ve talked about SQL with me long enough, you’ll have come across the fact that I often use SQL Profiler when I’m curious about where to find information. My former colleague &lt;a href="http://www.jimmcleod.net" target="_blank"&gt;Jim McLeod&lt;/a&gt; (&lt;a href="http://twitter.com/jim_mcleod" target="_blank"&gt;@Jim_McLeod&lt;/a&gt;) blogged &lt;a href="http://www.jimmcleod.net/blog/index.php/2012/08/14/t-sql-tuesday-33-trick-shots/" target="_blank"&gt;about this a few months ago&lt;/a&gt;, with an example that I remember looking through with him four or five years ago. It’s a great technique that works on all kinds of things, even across different versions of SQL Server. It also adds as a terrific reminder that Management Studio is not a special application, it simply knows how to ask for the pieces of information that it shows.&lt;/p&gt;  &lt;p&gt;But PowerShell (or SMO, for that matter), that’s in the .Net world. Surely that would be able to bypass the clunky T-SQL stuff that Management Studio does... I mean, Management Studio has to be flexible enough to work across remote servers, talking through firewalls that only allow T-SQL interfaces. Surely PowerShell has access to a deeper magic.&lt;/p&gt;  &lt;p&gt;Well, no. PowerShell still lets you talk to remote servers, and ends up using the same methods.&lt;/p&gt;  &lt;p&gt;Let’s prove it.&lt;/p&gt;  &lt;p&gt;Spin up Profiler, and start a trace against your favourite instance. I like to watch for SQL:BatchCompleted, SP:StmtCompleted and RPC:Completed events when doing this kind of thing. I’m using an instance that isn’t doing anything else, but you could apply a ColumnFilter to filter the events to things with an ApplicationName starting with SQLPS if you prefer.&lt;/p&gt;  &lt;p&gt;With that running, I jump into PowerShell and do something like:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;PS SQLSERVER:\sql\localhost\sql2008r2&amp;gt; dir Databases | ft name&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;This lists the names of the databases on my SQL2008R2 instances. You don’t need to see the results, you can imagine them for yourself.&lt;/p&gt;  &lt;p&gt;If PowerShell were using some secret interface, it’s unlikely we’d see something in Profiler. But it’s not, and we see a bunch of stuff.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_56ED68C2.png" width="672" height="468" /&gt;&lt;/p&gt;  &lt;p&gt;We see a bunch of Remote Procedure Calls, each with a Stored Procedure Statement Completed event showing the same information. And look – we see queries against master.sys.databases, asking for the name of each of the databases, passing in the name as a parameter. Brilliant! Notice just a bit earlier though, there’s a SQL:BatchCompleted call. This means that a query has been passed in directly. It’s this:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT       &lt;br /&gt;CAST(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; serverproperty(N'Servername')        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; AS sysname) AS [Server_Name],        &lt;br /&gt;dtb.name AS [Name]        &lt;br /&gt;FROM        &lt;br /&gt;master.sys.databases AS dtb        &lt;br /&gt;WHERE        &lt;br /&gt;(CAST(case when dtb.name in ('master','model','msdb','tempdb') then 1 else dtb.is_distributor end AS bit)=0)        &lt;br /&gt;ORDER BY        &lt;br /&gt;[Name] ASC&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_3EB9D89A.png" width="646" height="432" /&gt;&lt;/p&gt;    &lt;p&gt;So it grabs the list of database names first, and then makes extra calls to be able to fetch the list of names again, one by one.&lt;/p&gt;  &lt;p&gt;The reason why it’s grabbing the list of names one by one isn’t because it’s stupid and is asking to be ridiculed. It’s because we've asked to see that property, and I guess the PowerShell people figured that no matter what property you ask for, it’ll go and fetch it to show you.&lt;/p&gt;  &lt;p&gt;When I asked for the CompatibilityLevel property instead, I got some different rows thrown in. Interestingly though, it still asked for the name each time.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_3862DC3F.png" width="635" height="437" /&gt;&lt;/p&gt;  &lt;p&gt;Also interestingly, when I asked for the CompatibilityLevel a subsequent time, the calls for “SELECT dtb.compatibility_level…” weren’t in there. They’d been cached by the PowerShell environment – important to note if you ever come across PowerShell giving you old values.&lt;/p&gt;  &lt;p&gt;So what about asking something more interesting? Let’s try asking about the IndexSpaceUsage in AdventureWorks.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;PS SQLSERVER:\sql\localhost\sql2008r2&amp;gt; gi Databases\AdventureWorks | ft IndexSpaceUsage&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;The result tells me it’s 62576. Yeah, but today I’m not interested in that, just what happened in the trace.&lt;/p&gt;  &lt;p&gt;Four entries. An SP:StmtCompleted with matching RPC:Completed, and two SQL:BatchCompleted.&lt;/p&gt;  &lt;p&gt;The SP:StmtCompleted and RPC:Completed were this statement, passing in the parameter value ‘AdventureWorks’. Clearly incredibly informative.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT       &lt;br /&gt;CAST(0 AS float) AS [IndexSpaceUsage],        &lt;br /&gt;dtb.name AS [DatabaseName]        &lt;br /&gt;FROM        &lt;br /&gt;master.sys.databases AS dtb        &lt;br /&gt;WHERE        &lt;br /&gt;(dtb.name=@_msparam_0)&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;This is getting the value zero. Wow. Brilliant stuff.&lt;/p&gt;  &lt;p&gt;The last entry – the second of the two SQL:BatchCompleted events is:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;select convert(float,low/1024.) from master.dbo.spt_values where number = 1 and type = 'E'&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;If you run this in Management Studio, you’ll discover it gives the value 8. Ok.&lt;/p&gt;  &lt;p&gt;The other entry is more interesting.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;use [AdventureWorks]       &lt;br /&gt;SELECT        &lt;br /&gt;SUM(CASE WHEN a.type &amp;lt;&amp;gt; 1 THEN a.used_pages WHEN p.index_id &amp;lt; 2 THEN a.data_pages ELSE 0 END) AS [DataSpaceUsage],        &lt;br /&gt;SUM(a.used_pages) AS [IndexSpaceTotal]        &lt;br /&gt;FROM        &lt;br /&gt;sys.allocation_units AS a INNER JOIN sys.partitions AS p ON (a.type = 2 AND p.partition_id = a.container_id) OR (a.type IN (1,3) AND p.hobt_id = a.container_id)&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;This is more like it! We run this in Management Studio, and we see two values. DataSpaceUsage is 13682, IndexSpaceTotal is 21504. Neither are our value 62576. But we do have clues in the column names, and in that value 8 that came back too. We can easily deduce that it’s actually (IndexSpaceTotal-DataSpaceUsage)*8, and we have ourselves a nice little method for working out the IndexSpaceUsage ourselves now if we need it.&lt;/p&gt;  &lt;p&gt;Or we can just ask PowerShell next time as well.&lt;/p&gt;  &lt;p&gt;Incidentally – if you’re considering doing the MCM Lab exam some time, then you might find that a familiarity with PowerShell comes in really handy. I’m not saying there are PowerShell questions on the exam at all – I’m just suggesting that you may find that PowerShell becomes a really useful way of getting at some of the information that you’re looking for. If you’re stumbling around the list of DMVs trying to remember which one it is that stores some particular thing, remember that you might be able to get the data out more easily if you use PowerShell instead.&lt;/p&gt;  &lt;p&gt;So can we discover secret things about SQL from PowerShell? Are there things we can do in PowerShell that are impossible through other mechanisms? Hooks that let us break the rules even?&lt;/p&gt;  &lt;p&gt;Recently, Kendal van Dyke asked a question about this kind of thing on Twitter. He was wondering if you could have a default constraint on a column in a view. The reason for his wondering was that he saw a property on a view column in PowerShell that made him wonder. The answer is no though, and there’s a simple reason.&lt;/p&gt;  &lt;p&gt;PowerShell is a programmatic interface. It involves classes and property and methods. It does things row by row, which is why much of what you see in that trace feels amazingly pedantic – asking about things which shouldn’t have to be that complicated. The implication of this though, is that PowerShell reuses the concept of a column, regardless of whether this is a column in a table, a view, or anywhere else it decides to need a column. The fact that columns in tables have some extra properties isn’t enough to make this class re-use pointless. If we try to set a Default constraint for a column in a view though, we get an error, just like if we tried to do it any other way.&lt;/p&gt;  &lt;p&gt;The PowerShell I used was:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;$db = Get-Item SQLSERVER:\sql\localhost\sql2008r2\Databases\AdventureWorks       &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;$def = New-Object -TypeName Microsoft.SqlServer.Management.SMO.Default –ArgumentList $db, &amp;quot;ViewDefault&amp;quot;       &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;$def.TextHeader = &amp;quot;CREATE DEFAULT ViewDefault AS&amp;quot;       &lt;br /&gt;$def.TextBody = &amp;quot;'ABC'&amp;quot;        &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;$def.Create()       &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;$def.BindToColumn(&amp;quot;vStateProvinceCountryRegion&amp;quot;,&amp;quot;StateProvinceCode&amp;quot;,&amp;quot;Person&amp;quot;)&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;The code that ended up getting called was to the stored procedure &lt;em&gt;sp_bindefault&lt;/em&gt; (despite it being deprecated). Naturally, trying to execute this against a view column gives an error regardless of what wrappers you have put around it – PowerShell or not.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_125D061C.png" width="813" height="132" /&gt;&lt;/p&gt;  &lt;p&gt;So PowerShell is very useful, and it provides a handy way of getting to a lot of things that could be otherwise hard. But looking below the surface, it isn’t able to circumvent the basic restrictions of SQL Server, because it still ends up doing its work using T-SQL.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://twitter.com/rob_farley" target="_blank"&gt;@rob_farley&lt;/a&gt;&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=47649" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/powershell/default.aspx">powershell</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>Joins in single-table queries</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/12/11/joins-in-single-table-queries.aspx</link><pubDate>Tue, 11 Dec 2012 00:05:17 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46601</guid><dc:creator>Rob Farley</dc:creator><slash:comments>4</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/46601.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=46601</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=46601</wfw:comment><description>&lt;p&gt;Tables are only metadata. They don’t store data.&lt;/p&gt;  &lt;p&gt;I’ve written &lt;a href="http://sqlblog.com/blogs/rob_farley/archive/2010/09/14/table-no-such-thing.aspx" target="_blank"&gt;something about this&lt;/a&gt; before, but I want to take a viewpoint of this idea around the topic of joins, especially since it’s the topic for T-SQL Tuesday this month. Hosted this time by &lt;a href="http://sqlity.net/en/1175/t-sql-tuesday-37-invite-to-join-me-in-a-month-of-joins/" target="_blank"&gt;Sebastian Meine&lt;/a&gt; (&lt;a href="http://twitter.com/sqlity" target="_blank"&gt;@sqlity&lt;/a&gt;), who has a whole series on joins this month. Good for him – it’s a great topic. &lt;a href="http://sqlity.net/en/1175/t-sql-tuesday-37-invite-to-join-me-in-a-month-of-joins/" target="_blank"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top:0px;border-right:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_1CEA56EB.jpg" width="170" height="170" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;In that last post I discussed the fact that we write queries against tables, but that the engine turns it into a plan against indexes. My point wasn’t simply that a table is actually just a Clustered Index (or heap, which I consider just a special type of index), but that data access always happens against indexes – never tables – and we should be thinking about the indexes (specifically the non-clustered ones) when we write our queries.&lt;/p&gt;  &lt;p&gt;I described the scenario of looking up phone numbers, and how it never really occurs to us that there is a master list of phone numbers, because we think in terms of the useful non-clustered indexes that the phone companies provide us, but anyway – that’s not the point of this post.&lt;/p&gt;  &lt;p&gt;So a table is metadata. It stores information about the names of columns and their data types. Nullability, default values, constraints, triggers – these are all things that define the table, but the data isn’t stored in the table. The data that a table describes is stored in a heap or clustered index, but it goes further than this.&lt;/p&gt;  &lt;p&gt;All the useful data is going to live in non-clustered indexes. Remember this. It’s important. Stop thinking about tables, and start thinking about indexes.&lt;/p&gt;  &lt;p&gt;So let’s think about tables as indexes. This applies even in a world created by &lt;a href="http://www.lmgtfy.com/?q=someone+who's+bad+at+database+design" target="_blank"&gt;someone else&lt;/a&gt;, who doesn’t have the best indexes in mind for you.&lt;/p&gt;  &lt;p&gt;I’m sure you don’t need me to explain Covering Index bit – the fact that if you don’t have sufficient columns “included” in your index, your query plan will either have to do a Lookup, or else it’ll give up using your index and use one that does have everything it needs (even if that means scanning it). If you haven’t seen that before, drop me a line and I’ll run through it with you. Or go and read &lt;a href="http://msmvps.com/blogs/robfarley/archive/2008/10/12/doing-the-maths-to-understand-sql-optimiser-choices.aspx" target="_blank"&gt;a post I did a long while ago&lt;/a&gt; about the maths involved in that decision.&lt;/p&gt;  &lt;p&gt;So – what I’m going to tell you is that a Lookup is a join.&lt;/p&gt;  &lt;p&gt;When I run &lt;font face="Consolas"&gt;SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 285; &lt;/font&gt;against the AdventureWorks2012 get the following plan:&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_31AC299E.png" width="470" height="226" /&gt;&lt;/p&gt;  &lt;p&gt;I’m sure you can see the join. Don’t look in the query, it’s not there. But you should be able to see the join in the plan. It’s an Inner Join, implemented by a Nested Loop. It’s pulling data in from the Index Seek, and joining that to the results of a Key Lookup.&lt;/p&gt;  &lt;p&gt;It clearly is – the QO wouldn’t call it that if it wasn’t really one. It behaves exactly like any other Nested Loop (Inner Join) operator, pulling rows from one side and putting a request in from the other. You wouldn’t have a problem accepting it as a join if the query were slightly different, such as &lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT sod.OrderQty       &lt;br /&gt;FROM Sales.SalesOrderHeader AS soh        &lt;br /&gt;JOIN Sales.SalesOrderDetail as sod        &lt;br /&gt;on sod.SalesOrderID = soh.SalesOrderID        &lt;br /&gt;WHERE soh.SalesPersonID = 285;&lt;/font&gt;      &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_5CA417A3.png" width="478" height="223" /&gt;&lt;/p&gt;  &lt;p&gt;Amazingly similar, of course. This one is an explicit join, the first example was just as much a join, even thought you didn’t actually ask for one.&lt;/p&gt;  &lt;p&gt;You need to consider this when you’re thinking about your queries.&lt;/p&gt;  &lt;p&gt;But it gets more interesting.&lt;/p&gt;  &lt;p&gt;Consider this query: &lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT SalesOrderID       &lt;br /&gt;FROM Sales.SalesOrderHeader        &lt;br /&gt;WHERE SalesPersonID = 276        &lt;br /&gt;AND CustomerID = 29522;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;It doesn’t look like there’s a join here either, but look at the plan.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_3993A633.png" width="513" height="240" /&gt;    &lt;br /&gt;&lt;/p&gt;  &lt;p&gt;That’s not some Lookup in action – that’s a proper Merge Join. The Query Optimizer has worked out that it can get the data it needs by looking in two separate indexes and then doing a Merge Join on the data that it gets. Both indexes used are ordered by the column that’s indexed (one on SalesPersonID, one on CustomerID), and then by the CIX key SalesOrderID. Just like when you seek in the phone book to Farley, the Farleys you have are ordered by FirstName, these seek operations return the data ordered by the next field. This order is SalesOrderID, even though you didn’t explicitly put that column in the index definition. The result is two datasets that are ordered by SalesOrderID, making them very mergeable. &lt;/p&gt;  &lt;p&gt;Another example is the simple query&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT CustomerID       &lt;br /&gt;FROM Sales.SalesOrderHeader        &lt;br /&gt;WHERE SalesPersonID = 276;&lt;/font&gt;      &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_0191B250.png" width="514" height="245" /&gt;&lt;/p&gt;  &lt;p&gt;This one prefers a Hash Match to a standard lookup even! This isn’t just ordinary index intersection, this is something else again! Just like before, we could imagine it better with two whole tables, but we shouldn’t try to distinguish between joining two tables and joining two indexes.&lt;/p&gt;  &lt;p&gt;The Query Optimizer can see (using basic maths) that it’s worth doing these particular operations using these two less-than-ideal indexes (because of course, the best indexese would be on both columns – a composite such as (SalesPersonID, CustomerID – and it would have the SalesOrderID column as part of it as the CIX key still).&lt;/p&gt;  &lt;p&gt;You need to think like this too.&lt;/p&gt;  &lt;p&gt;Not in terms of excusing single-column indexes like the ones in AdventureWorks2012, but in terms of having a picture about how you’d like your queries to run. If you start to think about what data you need, where it’s coming from, and how it’s going to be used, then you will almost certainly write better queries. &lt;/p&gt;  &lt;p&gt;…and yes, this would include when you’re dealing with regular joins across multiples, not just against joins within single table queries.&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=46601" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/indexing/default.aspx">indexing</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>When someone deletes a shared data source in SSRS</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/10/09/when-someone-deletes-a-shared-data-source-in-ssrs.aspx</link><pubDate>Tue, 09 Oct 2012 00:55:50 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:45484</guid><dc:creator>Rob Farley</dc:creator><slash:comments>8</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/45484.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=45484</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=45484</wfw:comment><description>&lt;p&gt;SQL Server Reporting Services plays nicely. You can have things in the catalogue that get shared. You can have Reports that have Links, Datasets that can be used across different reports, and Data Sources that can be used in a variety of ways too.&lt;/p&gt;  &lt;p&gt;So if you find that someone has deleted a shared data source, you potentially have a bit of a horror story going on. And this works for this month’s T-SQL Tuesday theme, hosted by &lt;a href="http://blog.nhaslam.com/2012/10/04/t-sql-tuesday-35-soylent-green-tsql2sday/" target="_blank"&gt;Nick Haslam&lt;/a&gt;, who wants to hear about horror stories.&lt;a href="http://blog.nhaslam.com/2012/10/04/t-sql-tuesday-35-soylent-green-tsql2sday/" target="_blank"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top:0px;border-right:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_53144E68.jpg" width="170" height="170" /&gt;&lt;/a&gt; I don’t write about &lt;a href="http://lobsterpot.com.au/" target="_blank"&gt;LobsterPot&lt;/a&gt; client horror stories, so I’m writing about a situation that a fellow MVP friend asked me about recently instead.&lt;/p&gt;  &lt;p&gt;The best thing to do is to grab a recent backup of the ReportServer database, restore it somewhere, and figure out what’s changed. But of course, this isn’t always possible.&lt;/p&gt;  &lt;p&gt;And it’s much nicer to help someone with this kind of thing, rather than to be trying to fix it yourself when you’ve just deleted the wrong data source. Unfortunately, it lets you delete data sources, without trying to scream that the data source is shared across over 400 reports in over 100 folders, as was the case for my friend’s colleague.&lt;/p&gt;  &lt;p&gt;So, suddenly there’s a big problem – lots of reports are failing, and the time to turn it around is small. You probably know which data source has been deleted, but getting the shared data source back isn’t the hard part (that’s just a connection string really). The nasty bit is all the re-mapping, to get those 400 reports working again.&lt;/p&gt;  &lt;p&gt;I know from exploring this kind of stuff in the past that the ReportServer database (using its default name) has a table called &lt;strong&gt;dbo.Catalog&lt;/strong&gt; to represent the catalogue, and that Reports are stored here. However, the information about what data sources these deployed reports are configured to use is stored in a different table, &lt;strong&gt;dbo.DataSource&lt;/strong&gt;. You could be forgiven for thinking that shared data sources would live in this table, but they don’t – they’re catalogue items just like the reports. Let’s have a look at the structure of these two tables (although if you’re reading this because you have a disaster, feel free to skim past).&lt;/p&gt;  &lt;p&gt;Frustratingly, there doesn’t seem to be a Books Online page for this information, sorry about that. I’m also not going to look at all the columns, just ones that I find interesting enough to mention, and that are related to the problem at hand. These fields are consistent all the way through to SQL Server 2012 – there doesn’t seem to have been any changes here for quite a while.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;dbo.Catalog&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;The Primary Key is &lt;strong&gt;ItemID&lt;/strong&gt;. It’s a uniqueidentifier. I’m not going to comment any more on that. A minor nice point about using GUIDs in unfamiliar databases is that you can more easily figure out what’s what. But foreign keys are for that too…&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Path&lt;/strong&gt;, &lt;strong&gt;Name&lt;/strong&gt; and &lt;strong&gt;ParentID &lt;/strong&gt;tell you where in the folder structure the item lives. Path isn’t actually required – you could’ve done recursive queries to get there. But as that would be quite painful, I’m more than happy for the Path column to be there. Path contains the Name as well, incidentally.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Type&lt;/strong&gt; tells you what kind of item it is. Some examples are 1 for a folder and 2 a report. 4 is linked reports, 5 is a data source, 6 is a report model. I forget the others for now (but feel free to put a comment giving the full list if you know it).&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Content &lt;/strong&gt;is an image field, remembering that image doesn’t necessarily store images – these days we’d rather use varbinary(max), but even in SQL Server 2012, this field is still image. It stores the actual item definition in binary form, whether it’s actually an image, a report, whatever.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;LinkSourceID&lt;/strong&gt; is used for Linked Reports, and has a self-referencing foreign key (allowing NULL, of course) back to ItemID.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Parameter&lt;/strong&gt; is an ntext field containing XML for the parameters of the report. Not sure why this couldn’t be a separate table, but I guess that’s just the way it goes. This field gets changed when the default parameters get changed in Report Manager.&lt;/p&gt;  &lt;p&gt;There is nothing in dbo.Catalog that describes the actual data sources that the report uses. The default data sources would be part of the Content field, as they are defined in the RDL, but when you deploy reports, you typically choose to NOT replace the data sources. Anyway, they’re not in this table. Maybe it was already considered a bit wide to throw in another ntext field, I’m not sure. They’re in dbo.DataSource instead.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;dbo.DataSource&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;The Primary key is &lt;strong&gt;DSID&lt;/strong&gt;. Yes it’s a uniqueidentifier...&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;ItemID &lt;/strong&gt;is a foreign key reference back to dbo.Catalog&lt;/p&gt;  &lt;p&gt;Fields such as &lt;strong&gt;ConnectionString&lt;/strong&gt;, &lt;strong&gt;Prompt&lt;/strong&gt;, &lt;strong&gt;UserName &lt;/strong&gt;and &lt;strong&gt;Password &lt;/strong&gt;do what they say on the tin, storing information about how to connect to the particular source in question.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Link &lt;/strong&gt;is a uniqueidentifier, which refers back to dbo.Catalog. This is used when a data source within a report refers back to a shared data source, rather than embedding the connection information itself. You’d think this should be enforced by foreign key, but it’s not. It does allow NULLs though.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Flags&lt;/strong&gt; this is an int, and I’ll come back to this.&lt;/p&gt;  &lt;p&gt;When a Data Source gets deleted out of dbo.Catalog, you might assume that it would be disallowed if there are references to it from dbo.DataSource. Well, you’d be wrong. And not because of the lack of a foreign key either.&lt;/p&gt;  &lt;p&gt;Deleting anything from the catalogue is done by calling a stored procedure called &lt;strong&gt;dbo.DeleteObject&lt;/strong&gt;. You can look at the definition in there – it feels very much like the kind of Delete stored procedures that many people write, the kind of thing that means they don’t need to worry about allowing cascading deletes with foreign keys – because the stored procedure does the lot.&lt;/p&gt;  &lt;p&gt;Except that it doesn’t quite do that.&lt;/p&gt;  &lt;p&gt;If it deleted everything on a cascading delete, we’d’ve lost all the data sources as configured in dbo.DataSource, and that would be bad. This is fine if the ItemID from dbo.DataSource hooks in – if the report is being deleted. But if a shared data source is being deleted, you don’t want to lose the existence of the data source from the report.&lt;/p&gt;  &lt;p&gt;So it sets it to NULL, and it marks it as invalid.&lt;/p&gt;  &lt;p&gt;We see this code in that stored procedure.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;UPDATE [DataSource]       &lt;br /&gt;&amp;#160;&amp;#160; SET        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; [Flags] = [Flags] &amp;amp; 0x7FFFFFFD, -- broken link        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; [Link] = NULL        &lt;br /&gt;FROM        &lt;br /&gt;&amp;#160;&amp;#160; [Catalog] AS C        &lt;br /&gt;&amp;#160;&amp;#160; INNER JOIN [DataSource] AS DS ON C.[ItemID] = DS.[Link]        &lt;br /&gt;WHERE        &lt;br /&gt;&amp;#160;&amp;#160; (C.Path = @Path OR C.Path LIKE @Prefix ESCAPE '*')&lt;/font&gt;      &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Unfortunately there’s no semi-colon on the end (but I’d rather they fix the ntext and image types first), and don’t get me started about using the table name in the UPDATE clause (it should use the alias DS). But there is a nice comment about what’s going on with the Flags field.&lt;/p&gt;  &lt;p&gt;What I’d LIKE it to do would be to set the connection information to a report-embedded copy of the connection information that’s in the shared data source, the one that’s about to be deleted. I understand that this would cause someone to lose the benefit of having the data sources configured in a central point, but I’d say that’s probably still slightly better than LOSING THE INFORMATION COMPLETELY. Sorry, rant over. I should log a Connect item – I’ll put that on my todo list.&lt;/p&gt;  &lt;p&gt;So it sets the Link field to NULL, and marks the Flags to tell you they’re broken. So this is your clue to fixing it.&lt;/p&gt;  &lt;p&gt;A bitwise AND with 0x7FFFFFFD is basically stripping out the ‘2’ bit from a number. So numbers like 2, 3, 6, 7, 10, 11, etc, whose binary representation ends in either 11 or 10 get turned into 0, 1, 4, 5, 8, 9, etc. We can test for it using a WHERE clause that matches the SET clause we’ve just used. I’d also recommend checking for Link being NULL and also having no ConnectionString. And join back to dbo.Catalog to get the path (including the name) of broken reports are – in case you get a surprise from a different data source being broken in the past.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT c.Path, ds.Name       &lt;br /&gt;FROM dbo.[DataSource] AS ds        &lt;br /&gt;JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID        &lt;br /&gt;WHERE ds.[Flags] = ds.[Flags] &amp;amp; 0x7FFFFFFD        &lt;br /&gt;AND ds.[Link] IS NULL        &lt;br /&gt;AND ds.[ConnectionString] IS NULL;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;When I just ran this on my own machine, having deleted a data source to check my code, I noticed a Report Model in the list as well – so if you had thought it was just going to be reports that were broken, you’d be forgetting something.&lt;/p&gt;  &lt;p&gt;So to fix those reports, get your new data source created in the catalogue, and then find its ItemID by querying Catalog, using Path and Name to find it.&lt;/p&gt;  &lt;p&gt;And then use this value to fix them up. To fix the Flags field, just add 2. I prefer to use bitwise OR which should do the same. Use the OUTPUT clause to get a copy of the DSIDs of the ones you’re changing, just in case you need to revert something later after testing (doing it all in a transaction won’t help, because you’ll just lock out the table, stopping you from testing anything).&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;UPDATE ds SET [Flags] = [Flags] | 2, [Link] = '3AE31CBA-BDB4-4FD1-94F4-580B7FAB939D' /*Insert your own GUID*/       &lt;br /&gt;OUTPUT deleted.Name, deleted.DSID, deleted.ItemID, deleted.Flags        &lt;br /&gt;FROM dbo.[DataSource] AS ds        &lt;br /&gt;JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID        &lt;br /&gt;WHERE ds.[Flags] = ds.[Flags] &amp;amp; 0x7FFFFFFD        &lt;br /&gt;AND ds.[Link] IS NULL        &lt;br /&gt;AND ds.[ConnectionString] IS NULL;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;But please be careful. Your mileage may vary. And there’s no reason why 400-odd broken reports needs to be quite the nightmare that it could be. Really, it should be less than five minutes.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://twitter.com/rob_farley" target="_blank"&gt;@rob_farley&lt;/a&gt;&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=45484" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/reporting+services/default.aspx">reporting services</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>Tricks? In T-SQL?</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/08/14/tricks-in-t-sql.aspx</link><pubDate>Tue, 14 Aug 2012 02:16:19 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:44695</guid><dc:creator>Rob Farley</dc:creator><slash:comments>0</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/44695.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=44695</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=44695</wfw:comment><description>&lt;p&gt;Four years ago, I was preparing to speak at TechEd Australia. I’d been asked to give a session on “T-SQL Tips and Tricks”, but I’d pushed back and we’d gone with “T-SQL Tips and Techniques” instead. I hadn’t wanted to show Tricks, because despite being a fan of ‘magicians’ (like &lt;a href="http://www.youtube.com/watch?v=D5n6lUXVtEo" target="_blank"&gt;Tommy Cooper&lt;/a&gt;) I feel like the trickery should disappear with the understanding of the technique used. This month, &lt;a href="http://www.mikefal.net/2012/08/07/invitation-to-t-sql-tuesday-33-trick-shots-tsql2sday/" target="_blank"&gt;Mike Fal asks about Trick Shots&lt;/a&gt;, and I’m reminded of some of the things I do with T-SQL, and that session I gave nearly four years ago. &lt;a href="http://www.mikefal.net/2012/08/07/invitation-to-t-sql-tuesday-33-trick-shots-tsql2sday/" target="_blank"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_193ADC55.jpg" width="170" height="170" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;So I gave a talk, in which I covered 15 T-SQL Tips (probably more – I definitely threw a lot of stuff in there). They included things like the &lt;a href="http://bit.ly/Simplification" target="_blank"&gt;Incredible Shrinking Execution Plan&lt;/a&gt;, using the OUTPUT clause to return identity values on multiple rows, short-circuiting GROUP BY statements with unique indexes, and plenty more. There are a lot more things that I cover these days – you can’t exactly stay still and remain current – but still I like to maintain that there shouldn’t be trickery with T-SQL.&lt;/p&gt;  &lt;p&gt;The common thread going through many of the tips, along with every class I teach about T-SQL, is the importance of the execution plan. That’s where you can see what’s actually going on, and hopefully it can explain some of the magic that you see. Of course, there’s more to it than that, but getting your head around the relationship between queries and plans can definitely help demystify situations.&lt;/p&gt;  &lt;p&gt;Take recursive CTEs, for example. In the piece of code below (against old AdventureWorks, to which I added a covering index to avoid lookups), we see a sub-query used within a CTE (which is all about giving a name to the sub-query so it can be referenced later), and then in the second half of the UNION ALL statement, still within the sub-query, we see the CTE name used (where I’ve made it &lt;strong&gt;bold&lt;/strong&gt;). Despite the fact that we haven’t even finished using it yet. This functionality has been around for a long time, but yet many people are not used to it, and see it as a trick.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;em&gt;WITH OrgChart AS       &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; SELECT 1 AS EmployeeLevel, *        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FROM HumanResources.Employee        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; WHERE ManagerID IS NULL&lt;/em&gt;&lt;/p&gt;    &lt;p&gt;&lt;em&gt;&amp;#160;&amp;#160;&amp;#160; UNION ALL&lt;/em&gt;&lt;/p&gt;    &lt;p&gt;&lt;em&gt;&amp;#160;&amp;#160;&amp;#160; SELECT o.EmployeeLevel + 1, e.*       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FROM &lt;strong&gt;OrgChart&lt;/strong&gt; AS o        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; JOIN HumanResources.Employee AS e        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ON e.ManagerID = o.EmployeeID        &lt;br /&gt;)        &lt;br /&gt;SELECT *        &lt;br /&gt;FROM OrgChart;&lt;/em&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;There isn’t a trick here, and it comes down the principle of pulling sub-query definitions (including CTEs and non-indexed views) into the outer query. You see, the OrgChart here isn’t a database object, it’s simply a nested sub-query.&lt;/p&gt;  &lt;p&gt;You might imagine that it looks a bit like this, where I’ve replaced the OrgChart reference with a copy of the query itself.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="CTE" border="0" alt="CTE" src="http://sqlblog.com/blogs/rob_farley/CTE_28220B6F.png" width="322" height="370" /&gt;&lt;/p&gt;  &lt;p&gt;...but I’m not a big fan of this kind of representation, because it’s a bit strange to see that “WHERE ManagerID IS NULL” bit in there repeatedly. Are we really going to be getting that row out over and over again?&lt;/p&gt;  &lt;p&gt;I’ve seen people try to demonstrate this something like:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;em&gt;SELECT o2.EmployeeLevel + 1 AS EmployeeLevel, e.*        &lt;br /&gt;FROM         &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; SELECT o1.EmployeeLevel + 1 AS EmployeeLevel, e.*         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FROM         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; (        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; SELECT 1 AS EmployeeLevel, *        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; FROM HumanResources.Employee        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; WHERE ManagerID IS NULL        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ) AS o1        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; JOIN HumanResources.Employee AS e        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ON e.ManagerID = o1.EmployeeID        &lt;br /&gt;) AS o2        &lt;br /&gt;JOIN HumanResources.Employee AS e        &lt;br /&gt;ON e.ManagerID = o2.EmployeeID        &lt;br /&gt;;&lt;/em&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;, but this isn’t right either, because this query is putting the data onto the end of rows, whereas we really do need a UNION ALL.&lt;/p&gt;  &lt;p&gt;The easiest way of showing what’s going on is to look at the execution plan.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/rob_farley/CTE_plan_59CD02F7.png"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="CTE_plan" border="0" alt="CTE_plan" src="http://sqlblog.com/blogs/rob_farley/CTE_plan_thumb_5A75B921.png" width="742" height="283" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Look at the first operator called – it’s an Index Spool (over on the left). It gets its data from a Concatenation between data that comes from an Index Seek, and a join between a Table Spool and another Index Seek. This sounds all well and good, but that Table Spool is empty. There’s nothing on the spool at all.&lt;/p&gt;  &lt;p&gt;At least, until the Concatenation operator returns that first row to the Index Spool. When this happens, the Table Spool can serve another row. The Nested Loop happily takes the row, and requests any matching rows from the Index Seek at the bottom-right of the plan, and the Concatenation operator happily passes these rows back to the Index Spool, at which point the Table Spool has more rows it can serve up again.&lt;/p&gt;  &lt;p&gt;The Index Spool controls all this. At some point, the system has to realise that there’s no more data that’s going to be served up. The Table Spool doesn’t just sit waiting for rows to appear, nor does the spooling behaviour cause the Table Spool to suddenly get kicked off again. This is all handled because the Concatenation operator keeps getting prodded (by the Index Spool) that there’s more data that’s been pushed onto it. The Table Spool doesn’t know (or even care) if the rows it’s handed over are going to end up back on the spool – after all, it doesn’t know if those employees are also managers, it just serves up the data that appears on it, when requested.&lt;/p&gt;  &lt;p&gt;The recursive CTE is not magic. It doesn’t do any kind of trickery. It’s just a loop that feeds its data back into itself. And of course, to understand this properly, you should make sure you know to read plans from the left, revealing the Index Spool which really runs this query.&lt;/p&gt;  &lt;p&gt;Learn to read execution plans. There are a bunch of resources out there (such as other posts of mine, stuff by &lt;a href="http://www.scarydba.com/" target="_blank"&gt;Grant Fritchey&lt;/a&gt;, and more), but above all, just start opening them up and seeing how your queries run. You’ll find that a lot of the ‘tricks’ you think are in T-SQL aren’t really tricks at all, it’s just about understanding how your queries are being executed.&lt;/p&gt;  &lt;p&gt;And before you ask, I won’t be at TechEd Australia this year.&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=44695" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>MERGE gives better OUTPUT options</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/06/12/merge-gives-better-output-options.aspx</link><pubDate>Tue, 12 Jun 2012 00:40:06 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:43836</guid><dc:creator>Rob Farley</dc:creator><slash:comments>11</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/43836.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=43836</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=43836</wfw:comment><description>&lt;p&gt;MERGE is very cool. There are a ton of useful things about it – mostly around the fact that you can implement a ton of change against a table all at once. This is great for data warehousing, handling changes made to relational databases by applications, all kinds of things.&lt;/p&gt;  &lt;p&gt;One of the more subtle things about MERGE is the power of the OUTPUT clause. Useful for &lt;a href="http://sqlvariant.com/2012/06/t-sql-tuesday-31-logging/" target="_blank"&gt;logging&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;&amp;#160;&lt;a href="http://sqlvariant.com/2012/06/t-sql-tuesday-31-logging/" target="_blank"&gt;&lt;img style="background-image:none;border-right-width:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_1AA80E99.jpg" width="170" height="170" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;If you’re not familiar with the OUTPUT clause, you really should be – it basically makes your DML (INSERT/DELETE/UPDATE/MERGE) statement return data back to you. This is a great way of returning identity values from INSERT commands (so much better than SCOPE_IDENTITY() or the older (and worse) @@IDENTITY, because you can get lots of rows back). You can even use it to grab default values that are set using non-deterministic functions like NEWID() – things you couldn’t normally get back without running another query (or with a trigger, I guess, but that’s not pretty).&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_78700312.png" width="390" height="372" /&gt;&lt;/p&gt;  &lt;p&gt;That &lt;em&gt;inserted&lt;/em&gt; table I referenced – that’s part of the ‘behind-the-scenes’ work that goes on with all DML changes. When you insert data, this internal table called &lt;em&gt;inserted&lt;/em&gt; gets populated with rows, and then used to inflict the appropriate inserts on the various structures that store data (HoBTs – the Heaps or B-Trees used to store data as tables and indexes). When deleting, the &lt;em&gt;deleted &lt;/em&gt;table gets populated. Updates get a matching row in both tables (although this doesn’t mean that an update is a delete followed by an inserted, it’s just the way it’s handled with these tables). These tables can be referenced by the OUTPUT clause, which can show you the before and after for any DML statement. Useful stuff.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_7C2DA7E2.png" width="381" height="173" /&gt;&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_0E95F898.png" width="437" height="161" /&gt;&lt;/p&gt;  &lt;p&gt;MERGE is slightly different though.&lt;/p&gt;  &lt;p&gt;With MERGE, you get a mix of entries. Your MERGE statement might be doing some INSERTs, some UPDATEs and some DELETEs. One of the most common examples of MERGE is to perform an &lt;em&gt;UPSERT&lt;/em&gt; command, where data is updated if it already exists, or inserted if it’s new. And in a single operation too. Here, you can see the usefulness of the &lt;em&gt;deleted&lt;/em&gt; and &lt;em&gt;inserted&lt;/em&gt; tables, which clearly reflect the type of operation (but then again, MERGE lets you use an extra column called &lt;em&gt;$action&lt;/em&gt; to show this).&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_54AA37E3.png" width="698" height="325" /&gt;&lt;/p&gt;      &lt;p&gt;(Don’t worry about the fact that I turned on IDENTITY_INSERT, that’s just so that I could insert the values)&lt;/p&gt;  &lt;p&gt;One of the things I love about MERGE is that it feels almost cursor-like – the UPDATE bit feels like “WHERE CURRENT OF …”, and the INSERT bit feels like a single-row insert. And it is – but into the &lt;em&gt;inserted&lt;/em&gt; and &lt;em&gt;deleted&lt;/em&gt; tables. The operations to maintain the HoBTs are still done using the whole set of changes, which is very cool.&lt;/p&gt;  &lt;p&gt;And $action – very convenient.&lt;/p&gt;  &lt;p&gt;But as cool as $action is, that’s not the point of my post. If it were, I hope you’d all be disappointed, as you can’t really go near the MERGE statement without learning about it.&lt;/p&gt;  &lt;p&gt;The subtle thing that I love about MERGE with OUTPUT is that &lt;strong&gt;you can hook into more than just inserted and deleted&lt;/strong&gt;.&lt;/p&gt;  &lt;p&gt;Did you notice in my earlier query that my source table had a ‘src’ field, that wasn’t used in the insert? Normally, this would be somewhat pointless to include in my source query. But with MERGE, I can put that in the OUTPUT clause.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_1680F765.png" width="727" height="319" /&gt;&lt;/p&gt;  &lt;p&gt;This is useful stuff, particularly when you’re needing to audit the changes. Suppose your query involved consolidating data from a number of sources, but you didn’t need to insert that into the actual table, just into a table for audit. This is now very doable, either using the INTO clause of OUTPUT, or surrounding the whole MERGE statement in brackets (parentheses if you’re American) and using a regular INSERT statement.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_7FCB0118.png" width="581" height="310" /&gt;&lt;/p&gt;  &lt;p&gt;This is also doable if you’re using MERGE to just do INSERTs.&lt;/p&gt;  &lt;p&gt;In case you hadn’t realised, you can use MERGE in place of an INSERT statement. It’s just like the UPSERT-style statement we’ve just seen, except that we want nothing to match. That’s easy to do, we just use &lt;em&gt;ON 1=2.&lt;/em&gt;&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_47C90D35.png" width="511" height="371" /&gt;&lt;/p&gt;    &lt;p&gt;This is obviously more convoluted than a straight INSERT. And it’s slightly more effort for the database engine too. But, if you want the extra audit capabilities, the ability to hook into the other source columns is definitely useful.&lt;/p&gt;  &lt;p&gt;Oh, and before people ask if you can also hook into the target table’s columns... Yes, of course. That’s what &lt;em&gt;deleted&lt;/em&gt; and &lt;em&gt;inserted&lt;/em&gt; give you.&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=43836" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>Analytic functions – they’re not aggregates</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/04/10/analytic-functions-they-re-not-aggregates.aspx</link><pubDate>Tue, 10 Apr 2012 01:53:49 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:42753</guid><dc:creator>Rob Farley</dc:creator><slash:comments>0</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/42753.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=42753</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=42753</wfw:comment><description>&lt;p&gt;SQL 2012 brings us a bunch of new &lt;a href="http://technet.microsoft.com/en-us/library/hh213234(v=sql.110).aspx" target="_blank"&gt;analytic functions&lt;/a&gt;, together with enhancements to the &lt;a href="http://technet.microsoft.com/en-us/library/ms189461%28v=sql.110%29.aspx" target="_blank"&gt;OVER clause&lt;/a&gt;. People who have known me over the years will remember that I’m a big fan of the OVER clause and the types of things that it brings us when applied to aggregate functions, as well as the ranking functions that it enables.&lt;/p&gt;  &lt;p&gt;The OVER clause was introduced in SQL Server 2005, and remained frustratingly unchanged until SQL Server 2012.&lt;a href="http://www.nigelpsammy.com/2012/04/t-sql-tuesday-029-lets-have-sql-server.html" target="_blank"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top:0px;border-right:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_04C9E35C.jpg" width="170" height="170" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;This post is going to look at a particular aspect of the analytic functions though (not the enhancements to the OVER clause). When I give presentations about the analytic functions around Australia as part of the tour of SQL Saturdays (starting in Brisbane this Thursday), and in Chicago next month, I’ll make sure it’s sufficiently well described. But for this post – I’m going to skip that and assume you get it.&lt;/p&gt;  &lt;p&gt;The analytic functions introduced in SQL 2012 seem to come in pairs – FIRST_VALUE and LAST_VALUE, LAG and LEAD, CUME_DIST and PERCENT_RANK, PERCENTILE_CONT and PERCENTILE_DISC. Perhaps frustratingly, they take slightly different forms as well. The ones I want to look at now are FIRST_VALUE and LAST_VALUE, and PERCENTILE_CONT and PERCENTILE_DISC.&lt;/p&gt;  &lt;p&gt;The reason I’m pulling this ones out is that they always produce the same result within their partitions (if you’re applying them to the whole partition).&lt;/p&gt;  &lt;p&gt;Consider the following query:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; YEAR(OrderDate),         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FIRST_VALUE(TotalDue)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; OVER (PARTITION BY YEAR(OrderDate)        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ORDER BY OrderDate, SalesOrderID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; RANGE BETWEEN UNBOUNDED PRECEDING        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; AND UNBOUNDED FOLLOWING),        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; LAST_VALUE(TotalDue)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; OVER (PARTITION BY YEAR(OrderDate)        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ORDER BY OrderDate, SalesOrderID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; RANGE BETWEEN UNBOUNDED PRECEDING        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; AND UNBOUNDED FOLLOWING),        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; PERCENTILE_CONT(0.95)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; WITHIN GROUP (ORDER BY TotalDue)        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; OVER (PARTITION BY YEAR(OrderDate)),        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; PERCENTILE_DISC(0.95)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; WITHIN GROUP (ORDER BY TotalDue)        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; OVER (PARTITION BY YEAR(OrderDate))        &lt;br /&gt;FROM Sales.SalesOrderHeader        &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;This is designed to get the TotalDue for the first order of the year, the last order of the year, and also the 95% percentile, using both the continuous and discrete methods (‘discrete’ means it picks the closest one from the values available – ‘continuous’ means it will happily use something between, similar to what you would do for a traditional median of four values). I’m sure you can imagine the results – a different value for each field, but within each year, all the rows the same.&lt;/p&gt;  &lt;p&gt;Notice that I’m not grouping by the year. Nor am I filtering. This query gives us a result for every row in the SalesOrderHeader table – 31465 in this case (using the original AdventureWorks that dates back to the SQL 2005 days). &lt;/p&gt;  &lt;p&gt;The RANGE BETWEEN bit in FIRST_VALUE and LAST_VALUE is needed to make sure that we’re considering all the rows available. If we don’t specify that, it assumes we only mean “RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW”, which means that LAST_VALUE ends up being the row we’re looking at.&lt;/p&gt;  &lt;p&gt;At this point you might think about other environments such as Access or Reporting Services, and remember aggregate functions like FIRST. We really should be able to do something like:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; YEAR(OrderDate),         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FIRST_VALUE(TotalDue)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; OVER (PARTITION BY YEAR(OrderDate)        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ORDER BY OrderDate, SalesOrderID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; RANGE BETWEEN UNBOUNDED PRECEDING        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; AND UNBOUNDED FOLLOWING)        &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;FROM Sales.SalesOrderHeader       &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;GROUP BY YEAR(OrderDate)       &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;But you can’t. You get that age-old error:&lt;/p&gt;  &lt;p&gt;&lt;font color="#ff0000" face="Consolas"&gt;Msg 8120, Level 16, State 1, Line 5     &lt;br /&gt;Column 'Sales.SalesOrderHeader.OrderDate' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.      &lt;br /&gt;Msg 8120, Level 16, State 1, Line 5      &lt;br /&gt;Column 'Sales.SalesOrderHeader.SalesOrderID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.      &lt;br /&gt;&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;Hmm.&lt;/p&gt;  &lt;p&gt;You see, FIRST_VALUE isn’t an aggregate function. None of these analytic functions are. There are too many things involved for SQL to realise that the values produced might be identical within the group. &lt;/p&gt;  &lt;p&gt;Furthermore, you can’t even surround it in a MAX. Then you get a different error, telling you that you can’t use windowed functions in the context of an aggregate. &lt;/p&gt;  &lt;p&gt;And so we end up grouping by doing a DISTINCT.&lt;/p&gt; &lt;font face="Consolas"&gt;SELECT DISTINCT   &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; YEAR(OrderDate),     &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; FIRST_VALUE(TotalDue)&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; OVER (PARTITION BY YEAR(OrderDate)    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ORDER BY OrderDate, SalesOrderID    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; RANGE BETWEEN UNBOUNDED PRECEDING    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; AND UNBOUNDED FOLLOWING),    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; LAST_VALUE(TotalDue)     &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; OVER (PARTITION BY YEAR(OrderDate)    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ORDER BY OrderDate, SalesOrderID    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; RANGE BETWEEN UNBOUNDED PRECEDING    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; AND UNBOUNDED FOLLOWING),    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; PERCENTILE_CONT(0.95)&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; WITHIN GROUP (ORDER BY TotalDue)    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; OVER (PARTITION BY YEAR(OrderDate)),    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; PERCENTILE_DISC(0.95)     &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; WITHIN GROUP (ORDER BY TotalDue)    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; OVER (PARTITION BY YEAR(OrderDate))    &lt;br /&gt;FROM Sales.SalesOrderHeader    &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;;&lt;/font&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_18A32A58.png" width="510" height="149" /&gt;&lt;/p&gt;  &lt;p&gt;I’m sorry. It’s just the way it goes. Hopefully it’ll change the future, but for now, it’s what you’ll have to do.&lt;/p&gt;  &lt;p&gt;If we look in the execution plan, we see that it’s incredibly ugly, and actually works out the results of these analytic functions for all 31465 rows, finally performing the distinct operation to convert it into the four rows we get in the results.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_550B7935.png" width="423" height="129" /&gt;&lt;/p&gt;  &lt;p&gt;You might be able to achieve a better plan using things like TOP, or the kind of calculation that I used in &lt;a title="http://sqlblog.com/blogs/rob_farley/archive/2011/08/23/t-sql-thoughts-about-the-95th-percentile.aspx" href="http://sqlblog.com/blogs/rob_farley/archive/2011/08/23/t-sql-thoughts-about-the-95th-percentile.aspx"&gt;http://sqlblog.com/blogs/rob_farley/archive/2011/08/23/t-sql-thoughts-about-the-95th-percentile.aspx&lt;/a&gt; (which is how PERCENTILE_CONT works), but it’s definitely convenient to use these functions, and in time, I’m sure we’ll see good improvements in the way that they are implemented.&lt;/p&gt;  &lt;p&gt;Oh, and this post should be good for fellow &lt;a href="http://www.nigelpsammy.com/" target="_blank"&gt;SQL Server MVP Nigel Sammy&lt;/a&gt;’s &lt;a href="http://www.nigelpsammy.com/2012/04/t-sql-tuesday-029-lets-have-sql-server.html" target="_blank"&gt;T-SQL Tuesday this month&lt;/a&gt;.&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=42753" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql+improvements/default.aspx">sql improvements</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>Be the surgeon</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/03/13/be-the-surgeon.aspx</link><pubDate>Tue, 13 Mar 2012 00:13:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:42260</guid><dc:creator>Rob Farley</dc:creator><slash:comments>6</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/42260.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=42260</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=42260</wfw:comment><description>&lt;p&gt;It’s a phrase I use often, especially when teaching, and I wish I had realised the concept years earlier. (And of course, fits with this month’s T-SQL Tuesday topic, hosted by &lt;a href="http://sqlblog.com/blogs/argenis_fernandez/archive/2012/03/05/t-sql-tuesday-028-jack-of-all-trades-master-of-none.aspx" target="_blank"&gt;Argenis Fernandez&lt;/a&gt;)&lt;a href="http://sqlblog.com/blogs/argenis_fernandez/archive/2012/03/05/t-sql-tuesday-028-jack-of-all-trades-master-of-none.aspx" target="_blank"&gt;&lt;img width="170" height="170" title="TSQL2sDay150x150" align="right" style="margin:5px;border:0px currentColor;padding-top:0px;padding-right:0px;padding-left:0px;float:right;display:inline;background-image:none;" alt="TSQL2sDay150x150" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_30B9E11A.jpg" border="0"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;When I’m sick enough to go to the doctor, I see a GP. I used to typically see the same guy, but he’s moved on now. However, when he has been able to roughly identify the area of the problem, I get referred to a specialist, sometimes a surgeon.&lt;/p&gt;  &lt;p&gt;Being a surgeon requires a refined set of skills. It’s why they often don’t like to be called “Doctor”, and prefer the traditional “Mister” (the history is that the doctor used to make the diagnosis, and then hand the patient over to the person who didn’t have a doctorate, but rather was an expert cutter, typically from a background in butchering). But if you ask the surgeon about the pain you have in your leg sometimes, you’ll get told to ask your GP. It’s not that your surgeon isn’t interested – they just don’t know the answer.&lt;/p&gt;  &lt;p&gt;IT is the same now.&lt;/p&gt;  &lt;p&gt;That wasn’t something that I really understood when I got out of university. I knew there was a lot to know about IT – I’d just done an honours degree in it. But I also knew that I’d done well in just about all my subjects, and felt like I had a handle on everything. I got into developing, and still felt that having a good level of understanding about every aspect of IT was a good thing.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;This got me through for the first six or seven years of my career.&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;But then I started to realise that I couldn’t compete.&lt;/p&gt;  &lt;p&gt;I’d moved into management, and was spending my days running projects, rather than writing code. The kids were getting older. I’d had a bad back injury (ask anyone with chronic pain how it affects&amp;nbsp; your ability to concentrate, retain information, etc). But most of all, IT was getting larger.&lt;/p&gt;  &lt;p&gt;I knew kids without lives who knew more than I did. And I felt like I could easily identify people who were better than me in whatever area I could think of. Except writing queries (this was before I discovered technical communities, and people like &lt;a href="http://sqlblog.com/blogs/paul_white" target="_blank"&gt;Paul White&lt;/a&gt; and &lt;a href="http://sqlblogcasts.com/blogs/sqlandthelike/" target="_blank"&gt;Dave Ballantyne&lt;/a&gt;). And so I figured I’d specialise.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;I wish I’d done it years earlier.&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;Now, I can tell you plenty of people who are better than me at any area you can pick. But there are also more people who might consider listing me in some of their lists too. If I’d stayed the GP, I’d be stuck in management, and finding that there were better managers than me too.&lt;/p&gt;  &lt;p&gt;If you’re reading this, SQL could well be your thing. But it might not be either. Your thing might not even be in IT. Find out, and then see if you can be a world-beater at it.&lt;/p&gt;  &lt;p&gt;But it gets even better, because you can find other people to complement the things that you’re not so good at.&lt;/p&gt;  &lt;p&gt;My company, &lt;a href="http://lobsterpot.com.au/" target="_blank"&gt;LobsterPot Solutions&lt;/a&gt;, has six people in it at the moment. I’ve hand-picked those six people, along with the one who quit. The great thing about it is that I’ve been able to pick people who don’t necessarily specialise in the same way as me. I don’t write their T-SQL for them – generally they’re good enough at that themselves. But I’m on-hand if needed. Consider &lt;a href="http://www.rogernoble.com" target="_blank"&gt;Roger Noble&lt;/a&gt;, for example. He’s doing stuff in HTML5 and jQuery that I could never dream of doing to create an amazing &lt;a href="http://www.rogernoble.com/2012/02/02/addressing-the-elephant-in-the-room-the-html5-pivotviewer/" target="_blank"&gt;HTML5 version of PivotViewer&lt;/a&gt;. Or &lt;a href="https://twitter.com/ashley_adelaide" target="_blank"&gt;Ashley Sewell&lt;/a&gt;, a guy who does project management far better than I do. I could go on. My team is brilliant, and I love them to bits. We’re all surgeons, and when we work together, I like to think we’re pretty good!&lt;/p&gt;  &lt;p&gt;&lt;a href="https://twitter.com/rob_farley" target="_blank"&gt;@rob_farley&lt;/a&gt;&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=42260" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/lobsterpot/default.aspx">lobsterpot</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>APPLY – not exactly set-based</title><link>http://sqlblog.com/blogs/rob_farley/archive/2012/01/10/apply-not-exactly-set-based.aspx</link><pubDate>Tue, 10 Jan 2012 00:06:04 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40921</guid><dc:creator>Rob Farley</dc:creator><slash:comments>4</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/40921.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=40921</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=40921</wfw:comment><description>&lt;p&gt;In my last post, I showed a technique for &lt;a href="http://sqlblog.com/blogs/rob_farley/archive/2011/12/13/a-t-sql-tip-working-calculations.aspx" target="_blank"&gt;dealing with working columns&lt;/a&gt; when writing T-SQL. The idea was around using APPLY to be able to push values from the existing set through calculations (but preferably not scalar functions, of course), producing new columns which can be used further down the query, even in the WHERE and GROUP BY clauses. Useful stuff indeed.&lt;/p&gt;  &lt;p&gt;But there is a limitation which I didn’t cover, and I feel that this is worth writing about for &lt;a href="http://davidbrycehoward.com/archive/2012/01/tsql-tuesday-026-second-chances/" target="_blank"&gt;this month’s T-SQL Tuesday&lt;/a&gt;. &lt;a href="http://davidbrycehoward.com/archive/2012/01/tsql-tuesday-026-second-chances/" target="_blank"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_7D8F2D23.jpg" width="170" height="170" /&gt;&lt;/a&gt;The theme allows people to write about previous Tuesday topics – so I’m going to revisit topics &lt;a href="http://mattvelic.com/tsql-tuesday-17-invite/" target="_blank"&gt;17&lt;/a&gt; and &lt;a href="http://sqlblog.com/blogs/allen_white/archive/2011/12/05/t-sql-tuesday-025-invitation-to-share-your-tricks.aspx" target="_blank"&gt;25&lt;/a&gt; as I go a little further into APPLY.&lt;/p&gt;  &lt;p&gt;These working columns can only be applied if they are calculations on a single row of data. The resulting set might be bigger or smaller based on how many rows are affected, but the input (at least logically) should be considered on a row-by-row basis.&lt;/p&gt;  &lt;p&gt;Consider the following query:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT p.Name, r.RevName       &lt;br /&gt;FROM         &lt;br /&gt;Production.ProductSubcategory AS s        &lt;br /&gt;CROSS APPLY         &lt;br /&gt;(SELECT REVERSE(s.Name) AS RevName) AS r        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; SELECT TOP (1) *        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FROM Production.Product AS p        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; WHERE p.ProductSubcategoryID = s.ProductSubcategoryID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ORDER BY p.ListPrice DESC        &lt;br /&gt;) AS p        &lt;br /&gt;ORDER BY s.ProductSubcategoryID;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;You see two CROSS APPLYs here – one creating a simple working column in REVERSE(s.Name), the other even involving another table to do a lookup. This second CROSS APPLY doesn’t necessarily produce any rows – if there is no matching Product, the resultset won’t contain any rows for that ProductSubcategory. I’ve shown some of its results below:&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_5A7EBBB3.png" width="554" height="500" /&gt;&lt;/p&gt;  &lt;p&gt;But what about the following query, which gives the same 37 rows on the AdventureWorks database:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT        &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; SELECT TOP (1) p.Name        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FROM Production.Product AS p        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; WHERE p.ProductSubcategoryID = s.ProductSubcategoryID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ORDER BY p.ListPrice DESC        &lt;br /&gt;),         &lt;br /&gt;REVERSE(s.Name) AS RevName        &lt;br /&gt;FROM Production.ProductSubcategory AS s        &lt;br /&gt;ORDER BY s.ProductSubcategoryID;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;This is very similar, but you’ll notice that instead of developing the result set in the FROM clause, I’ve used the calculations directly in the SELECT clause to produce the same result.&lt;/p&gt;  &lt;p&gt;Except that it’s not the same.&lt;/p&gt;  &lt;p&gt;Here, my TOP sub-query can only produce a single value. You’ll notice I change my query to fetch only a single column now – any more would give an error. And I’m lucky I’ve selected TOP(1), not TOP(2), or TOP (1) WITH TIES – which would both be legal in my APPLY system. You’ll see that APPLY provides additional flexibility here.&lt;/p&gt;  &lt;p&gt;So now consider the following query:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; p.Name,         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; p.ProductSubcategoryID,         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; COUNT(*) OVER (PARTITION BY p.ProductSubcategoryID) AS SubcatCnt        &lt;br /&gt;FROM Production.Product AS p        &lt;br /&gt;ORDER BY p.ProductSubcategoryID, p.Name;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;You will see that both look at the list of Products, and produce a count of the number of products which are in the same Subcategory. I’m sure you’re all very familiar with the OVER clause and the use of the windowing technique provided by the PARTITION BY clause.&lt;/p&gt;  &lt;p&gt;If we run this first query and look at some of the results, you’ll see the marvellous windowing technique, which I’ve highlighted using ZoomIt. You’ll see that the SubcatCnt column has the value 3 for the Subcategory with 3 items, and 6 for the one with 6, and so on.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_7225CD4C.png" width="408" height="288" /&gt;&lt;/p&gt;  &lt;p&gt;Excellent stuff. But you’ll probably also be aware that you can’t use windowing functions like this in the WHERE clause – they’re applied only in the SELECT clause.&lt;/p&gt;  &lt;p&gt;Having seen my working columns trick though, you might think otherwise. &lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Check this out – it’s perfectly legal!&lt;/strong&gt;&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT       &lt;br /&gt;p.Name,         &lt;br /&gt;p.ProductSubcategoryID,         &lt;br /&gt;sc.SubcatCnt        &lt;br /&gt;FROM Production.Product AS p        &lt;br /&gt;CROSS APPLY         &lt;br /&gt;(        &lt;br /&gt;SELECT COUNT(*) OVER (PARTITION BY p.ProductSubcategoryID) AS SubcatCnt        &lt;br /&gt;) AS sc        &lt;br /&gt;ORDER BY p.ProductSubcategoryID, p.Name;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Amazing stuff. I can now use SubcatCnt in the WHERE clause. Job done!&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;But I’m having you on. It’s a trick. It’s rubbish.&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;You see, APPLY logically works on a single row at a time, as I said earlier. Look at the same block of results for this second query.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_04CAA137.png" width="416" height="319" /&gt;&lt;/p&gt;  &lt;p&gt;Loads of 1s. Not exactly helpful. It’s accurate though – there is a count of one row in each partition of the set that is logically passed in each time.&lt;/p&gt;  &lt;p&gt;APPLY works out the result for each row individually. It doesn’t do it for the whole set. So if you’re wanting working columns that need to be applied to multiple rows at once, then you need to think of another solution.&lt;/p&gt;  &lt;p&gt;APPLY is good, &lt;a href="http://www.youtube.com/watch?v=d2lpq5WQ7wQ" target="_blank"&gt;but not that good&lt;/a&gt;. Know its limitations and you’ll be better off.&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=40921" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>A T-SQL Tip: Working calculations</title><link>http://sqlblog.com/blogs/rob_farley/archive/2011/12/13/a-t-sql-tip-working-calculations.aspx</link><pubDate>Tue, 13 Dec 2011 00:46:09 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:40290</guid><dc:creator>Rob Farley</dc:creator><slash:comments>5</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/40290.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=40290</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=40290</wfw:comment><description>&lt;p&gt;T-SQL Tuesday again and this month is on T-SQL Tips (thanks &lt;a href="http://sqlblog.com/blogs/allen_white" target="_blank"&gt;Allen&lt;/a&gt;!). In some ways it’s a tough topic, because there are things I don’t really consider tips that other people do, and vice-versa. This one’s legitimate though: &lt;strong&gt;Using CROSS APPLY for working columns in calculations&lt;/strong&gt;.&lt;/p&gt;  &lt;p&gt;Let me give you an example. &lt;a href="http://sqlblog.com/blogs/allen_white/archive/2011/12/05/t-sql-tuesday-025-invitation-to-share-your-tricks.aspx" target="_blank"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top:0px;border-right:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_51311EF9.jpg" width="170" height="170" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Back in 2009, I wrote a blog post on Julian (YYDDD) dates. &lt;a title="http://msmvps.com/blogs/robfarley/archive/2009/03/25/converting-to-and-from-julian-format-in-t-sql.aspx" href="http://msmvps.com/blogs/robfarley/archive/2009/03/25/converting-to-and-from-julian-format-in-t-sql.aspx" target="_blank"&gt;http://msmvps.com/blogs/robfarley/archive/2009/03/25/converting-to-and-from-julian-format-in-t-sql.aspx&lt;/a&gt;. &lt;/p&gt;  &lt;p&gt;Someone asked me recently about using this for time periods, for example, someone’s age when they play a football match. For example, &lt;a href="http://en.wikipedia.org/wiki/List_of_Arsenal_F.C._records_and_statistics" target="_blank"&gt;Cesc Fàbregas holds the record as the youngest Arsenal player, aged 16 years and 177 days&lt;/a&gt;. Given someone’s birthdate and date of interest, this seems like it should be quite easy to work out, but can get frustratingly tricky.&lt;/p&gt;  &lt;p&gt;The idea is quite straight forward – you just count the number of days from their most recent birthday. So let’s have a think.&lt;/p&gt;  &lt;p&gt;Counting the number of years between the startdate and the enddate is a nice place to start. I could count the number of days, but that might give me rounding error based on leap years. But counting the years simply compares the year component, such as 1987 and 2003, and the player might not have had his birthday this year yet. In that case, we’d want to go with the year before. Anyway – once we’ve figured out how old the player is, we can figure out when their last birthday was and count the days since then. Easy. &lt;/p&gt;  &lt;p&gt;But so easy to make a mistake somewhere.&lt;/p&gt;  &lt;p&gt;And this is where CROSS APPLY can come into its own, by allowing us to use working columns.&lt;/p&gt;  &lt;p&gt;Let’s start with a few players. Some young ones, and Thierry Henry, just because. I’m using the VALUES method, but you could have a Players table just as easily.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;select *       &lt;br /&gt;from         &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; values        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ('Cesc Fàbregas', cast('4 May 1987' as date), cast('28 October 2003' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Theo Walcott', cast('16 March 1989' as date), cast('19 August 2006' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Jack Wilshere', cast('1 January 1992' as date), cast('13 September 2008' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Thierry Henry', cast('17 August 1977' as date), cast('7 August 1999' as date))        &lt;br /&gt;) p (Name, DoB, Debut)&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_1639C663.png" width="307" height="128" /&gt;&lt;/p&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;&lt;font face="Verdana"&gt;First, I want to make it clear which column is my startdate and which is my enddate. It’s too easy to be thinking “startdate” and pick up “Debut” here, because this sounds very much like the same thing. There’s the startdate of their playing career, and the startdate of the calculation. To avoid confusion, I’m going to do some simple column-renaming. This gives me more reusable code, and APPLY even means I never have to worry about whether I already have these column names referring to something else, because they’re going to get their own table alias too.&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;select *       &lt;br /&gt;from         &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; values        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ('Cesc Fàbregas', cast('4 May 1987' as date), cast('28 October 2003' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Theo Walcott', cast('16 March 1989' as date), cast('19 August 2006' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Jack Wilshere', cast('1 January 1992' as date), cast('13 September 2008' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Thierry Henry', cast('17 August 1977' as date), cast('7 August 1999' as date))        &lt;br /&gt;) p (Name, DoB, Debut)        &lt;br /&gt;CROSS APPLY         &lt;br /&gt;&amp;#160;&amp;#160; (select p.DoB as startdate, p.Debut as enddate) as working&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_419DE75D.png" width="438" height="117" /&gt;&lt;/p&gt;  &lt;p&gt;Might seem like a bit of a waste to you, but it means so much to me. Really.&lt;/p&gt;  &lt;p&gt;Now I want to count how many years there are between my startdate and enddate, and work out when the startdate is this year. I could do this in one step, sure, but I want to be able to check my working really thoroughly.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;select *       &lt;br /&gt;from         &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; values        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ('Cesc Fàbregas', cast('4 May 1987' as date), cast('28 October 2003' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Theo Walcott', cast('16 March 1989' as date), cast('19 August 2006' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Jack Wilshere', cast('1 January 1992' as date), cast('13 September 2008' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Thierry Henry', cast('17 August 1977' as date), cast('7 August 1999' as date))        &lt;br /&gt;) p (Name, DoB, Debut)        &lt;br /&gt;CROSS APPLY         &lt;br /&gt;(select p.DoB as startdate, p.Debut as enddate) as working        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;(select &lt;strong&gt;datediff(year,working.startdate,working.enddate)&lt;/strong&gt; as YearsDifferent) yd        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;(select &lt;strong&gt;dateadd(year,yd.YearsDifferent,working.startdate)&lt;/strong&gt; as StartDateThisYear) sdty&lt;/font&gt;      &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_50F1496C.png" width="681" height="131" /&gt;&lt;/p&gt;  &lt;p&gt;Now I can easily test to see if I need to subtract a year or not.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;select *       &lt;br /&gt;from         &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; values        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ('Cesc Fàbregas', cast('4 May 1987' as date), cast('28 October 2003' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Theo Walcott', cast('16 March 1989' as date), cast('19 August 2006' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Jack Wilshere', cast('1 January 1992' as date), cast('13 September 2008' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Thierry Henry', cast('17 August 1977' as date), cast('7 August 1999' as date))        &lt;br /&gt;) p (Name, DoB, Debut)        &lt;br /&gt;CROSS APPLY         &lt;br /&gt;(select p.DoB as startdate, p.Debut as enddate) as working        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;(select datediff(year,working.startdate,working.enddate) as YearsDifferent) yd        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;(select dateadd(year,yd.YearsDifferent,working.startdate) as StartDateThisYear) sdty        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;(select &lt;strong&gt;case when sdty.StartDateThisYear &amp;gt; working.enddate then -1 else 0 end&lt;/strong&gt; as YearOffset) yo        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;(select &lt;strong&gt;dateadd(year,yd.YearsDifferent + yo.YearOffset,working.startdate)&lt;/strong&gt; as AdjustedYear) ay&lt;/font&gt;      &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_5BCE2AB4.png" width="838" height="123" /&gt;&lt;/p&gt;        &lt;p&gt;I’m sure you can see where this is going. I now have their latest birthday before the date I’m looking for, and I can easily turn this into a Julian Date format.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;select p.*, jd.JulianDiff       &lt;br /&gt;from         &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; values        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ('Cesc Fàbregas', cast('4 May 1987' as date), cast('28 October 2003' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Theo Walcott', cast('16 March 1989' as date), cast('19 August 2006' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Jack Wilshere', cast('1 January 1992' as date), cast('13 September 2008' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Thierry Henry', cast('17 August 1977' as date), cast('7 August 1999' as date))        &lt;br /&gt;) p (Name, DoB, Debut)        &lt;br /&gt;CROSS APPLY         &lt;br /&gt;&amp;#160;&amp;#160; (select p.DoB as startdate, p.Debut as enddate) as working        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select datediff(year,working.startdate,working.enddate) as YearsDifferent) yd        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select dateadd(year,yd.YearsDifferent,working.startdate) as StartDateThisYear) sdty        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select case when sdty.StartDateThisYear &amp;gt; working.enddate then -1 else 0 end as YearOffset) yo        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select dateadd(year,yd.YearsDifferent + yo.YearOffset,working.startdate) as AdjustedYear) ay        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select &lt;strong&gt;(yd.YearsDifferent + yo.YearOffset) * 1000&lt;/strong&gt; as YearComponent) yc        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select &lt;strong&gt;datediff(day,ay.AdjustedYear,working.enddate)&lt;/strong&gt; as DayComponent) dc        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select &lt;strong&gt;yc.YearComponent + dc.DayComponent&lt;/strong&gt; as JulianDiff) as jd        &lt;br /&gt;;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_5F8BCF84.png" width="350" height="124" /&gt;&lt;/p&gt;  &lt;p&gt;The great thing about this is that I can be very confident of my working, being able to check each step of the calculation along the way. I can even turn it into a simple TVF, whilst keeping the calculations just as verifiable:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;create function dbo.YYDDD(@startdate date, @enddate date) returns table       &lt;br /&gt;as return        &lt;br /&gt;(        &lt;br /&gt;select jd.JulianDiff         &lt;br /&gt;FROM         &lt;br /&gt;(values (@startdate, @enddate)) working (startdate, enddate)        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select datediff(year,working.startdate,working.enddate) as YearsDifferent) yd        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select dateadd(year,yd.YearsDifferent,working.startdate) as StartDateThisYear) sdty        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select case when sdty.StartDateThisYear &amp;gt; working.enddate then -1 else 0 end as YearOffset) yo        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select dateadd(year,yd.YearsDifferent + yo.YearOffset,working.startdate) as AdjustedYear) ay        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select (yd.YearsDifferent + yo.YearOffset) * 1000 as YearComponent) yc        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select datediff(day,ay.AdjustedYear,working.enddate) as DayComponent) dc        &lt;br /&gt;CROSS APPLY        &lt;br /&gt;&amp;#160;&amp;#160; (select yc.YearComponent + dc.DayComponent as JulianDiff) as jd        &lt;br /&gt;)        &lt;br /&gt;;&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;select p.*, jd.JulianDiff       &lt;br /&gt;from         &lt;br /&gt;(        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; values        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ('Cesc Fàbregas', cast('4 May 1987' as date), cast('28 October 2003' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Theo Walcott', cast('16 March 1989' as date), cast('19 August 2006' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Jack Wilshere', cast('1 January 1992' as date), cast('13 September 2008' as date))        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ,('Thierry Henry', cast('17 August 1977' as date), cast('7 August 1999' as date))        &lt;br /&gt;) p (Name, DoB, Debut)        &lt;br /&gt;CROSS APPLY dbo.YYDDD(DoB, Debut) jd;&lt;/font&gt;      &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;I know it’s really simple, but it’s an amazingly useful tip for making sure that you do things right.&lt;/p&gt;  &lt;p&gt;Oh, and as for the plan – SQL just rolls it all up into a single Compute Scalar as if we’d written it out the long way, happily showing us that doing it step-by-step ourselves is no worse at all.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_3CE79109.png" width="606" height="314" /&gt;&lt;/p&gt;  &lt;p&gt;&lt;a href="http://twitter.com/rob_farley" target="_blank"&gt;@rob_farley&lt;/a&gt;&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=40290" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>When is a SQL function not a function?</title><link>http://sqlblog.com/blogs/rob_farley/archive/2011/11/08/when-is-a-sql-function-not-a-function.aspx</link><pubDate>Tue, 08 Nov 2011 00:01:17 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:39687</guid><dc:creator>Rob Farley</dc:creator><slash:comments>8</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/39687.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=39687</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=39687</wfw:comment><description>&lt;p&gt;Should SQL Server even have functions? (Oh yeah – this is a &lt;a href="http://bradsruminations.blogspot.com/2011/10/invitation-for-t-sql-tuesday-024-prox-n.html" target="_blank"&gt;T-SQL Tuesday post&lt;/a&gt;, hosted this month by &lt;a href="http://bradsruminations.blogspot.com/" target="_blank"&gt;Brad Schulz&lt;/a&gt;) &lt;a href="http://bradsruminations.blogspot.com/2011/10/invitation-for-t-sql-tuesday-024-prox-n.html" target="_blank"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top:0px;border-right:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_4A51E977.jpg" width="170" height="170" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Functions serve an important part of programming, in almost any language. A function is a piece of code that is designed to return something, as opposed to a piece of code which isn’t designed to return anything (which is known as a procedure). SQL Server is no different. You can call stored procedures, even from within other stored procedures, and you can call functions and use these in other queries. Stored procedures might query something, and therefore ‘return data’, but a function in SQL is considered to have the type of the thing returned, and can be used accordingly in queries. Consider the internal GETDATE() function.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT GETDATE(), SomeDatetimeColumn       &lt;br /&gt;FROM dbo.SomeTable;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;There’s no logical difference between the field that is being returned by the function and the field that’s being returned by the table column. Both are the datetime field – if you didn’t have inside knowledge, you wouldn’t necessarily be able to tell which was which. And so as developers, we find ourselves wanting to create functions that return all kinds of things – functions which look up values based on codes, functions which do string manipulation, and so on.&lt;/p&gt;  &lt;p&gt;But it’s rubbish.&lt;/p&gt;  &lt;p&gt;Ok, it’s not all rubbish, but it mostly is. And this isn’t even considering the SARGability impact. It’s far more significant than that. (When I say the SARGability aspect, I mean “because you’re unlikely to have an index on the result of some function that’s applied to a column, so try to invert the function and query the column in an unchanged manner”)&lt;/p&gt;  &lt;p&gt;I’m going to consider the three main types of user-defined functions in SQL Server:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Scalar&lt;/li&gt;    &lt;li&gt;Inline Table-Valued&lt;/li&gt;    &lt;li&gt;Multi-statement Table-Valued&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;I could also look at user-defined CLR functions, including aggregate functions, but not today. I figure that most people don’t tend to get around to doing CLR functions, and I’m going to focus on the T-SQL-based user-defined functions.&lt;/p&gt;  &lt;p&gt;Most people split these types of function up into two types. So do I. Except that most people pick them based on ‘scalar or table-valued’. I’d rather go with ‘inline or not’.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;If it’s not inline, it’s rubbish.&lt;/strong&gt; It really is.&lt;/p&gt;  &lt;p&gt;Let’s start by considering the two kinds of table-valued function, and compare them.&lt;/p&gt;  &lt;p&gt;These functions are going to return the sales for a particular salesperson in a particular year, from the AdventureWorks database.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;CREATE FUNCTION dbo.FetchSales_inline(@salespersonid int, @orderyear int)        &lt;br /&gt;RETURNS TABLE AS&amp;#160; &lt;br /&gt;RETURN (        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; SELECT e.LoginID as EmployeeLogin, o.OrderDate, o.SalesOrderID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FROM Sales.SalesOrderHeader AS o        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; LEFT JOIN HumanResources.Employee AS e        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ON e.EmployeeID = o.SalesPersonID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; WHERE o.SalesPersonID = @salespersonid        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;gt;= DATEADD(year,@orderyear-2000,'20000101')        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;lt; DATEADD(year,@orderyear-2000+1,'20000101')        &lt;br /&gt;)        &lt;br /&gt;;        &lt;br /&gt;GO&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas"&gt;CREATE FUNCTION dbo.FetchSales_multi(@salespersonid int, @orderyear int)        &lt;br /&gt;RETURNS @results TABLE (        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; EmployeeLogin nvarchar(512),        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; OrderDate datetime,        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; SalesOrderID int        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; )        &lt;br /&gt;AS         &lt;br /&gt;BEGIN        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; INSERT @results (EmployeeLogin, OrderDate, SalesOrderID)        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; SELECT e.LoginID, o.OrderDate, o.SalesOrderID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FROM Sales.SalesOrderHeader AS o        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; LEFT JOIN HumanResources.Employee AS e        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ON e.EmployeeID = o.SalesPersonID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; WHERE o.SalesPersonID = @salespersonid        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;gt;= DATEADD(year,@orderyear-2000,'20000101')        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;lt; DATEADD(year,@orderyear-2000+1,'20000101')        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; RETURN        &lt;br /&gt;END        &lt;br /&gt;;        &lt;br /&gt;GO&lt;/font&gt;      &lt;br /&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;You’ll notice that I’m being nice and responsible with the use of the DATEADD function, so that I have SARGability on the OrderDate filter.&lt;/p&gt;  &lt;p&gt;Regular readers will be hoping I’ll show what’s going on in the execution plans here. Here I’ve run two SELECT * queries with the “Show Actual Execution Plan” option turned on.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_59391891.png" width="660" height="457" /&gt;&lt;/p&gt;  &lt;p&gt;Notice that the ‘Query cost’ of the multi-statement version is just 2% of the ‘Batch cost’. But also notice there’s trickery going on. And it’s nothing to do with that extra index that I have on the OrderDate column.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Trickery&lt;/strong&gt;.&lt;/p&gt;  &lt;p&gt;Look at it – clearly, the first plan is showing us what’s going on inside the function, but the second one isn’t. The second one is blindly running the function, and then scanning the results. There’s a Sequence operator which is calling the TVF operator, and then calling a Table Scan to get the results of that function for the SELECT operator. But surely it still has to do all the work that the first one is doing...&lt;/p&gt;  &lt;p&gt;To see what’s actually going on, let’s look at the Estimated plan. &lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_2E9D37B4.png" width="1044" height="724" /&gt;&lt;/p&gt;  &lt;p&gt;Now, we see the same plans (almost) that we saw in the Actuals, but we have an extra one – the one that was used for the TVF. Here’s where we see the inner workings of it. You’ll probably recognise the right-hand side of the TVF’s plan as looking very similar to the first plan – but it’s now being called by a stack of other operators, including an INSERT statement to be able to populate the table variable that the multi-statement TVF requires.&lt;/p&gt;  &lt;p&gt;And the cost of the TVF is 57% of the batch!&lt;/p&gt;  &lt;p&gt;But it gets &lt;strong&gt;worse&lt;/strong&gt;.&lt;/p&gt;  &lt;p&gt;Let’s consider what happens if we don’t need all the columns. We’ll leave out the EmployeeLogin column.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_77FF66A2.png" width="1044" height="658" /&gt;&lt;/p&gt;  &lt;p&gt;Here, we see that the inline function call has been simplified down. It doesn’t need the Employee table. The join is redundant and has been eliminated from the plan, making it even cheaper. But the multi-statement plan runs the whole thing as before, only removing the extra column when the Table Scan is performed.&lt;/p&gt;  &lt;p&gt;A multi-statement function is a lot more powerful than an inline one. An inline function can only be the result of a single sub-query. It’s essentially the same as a parameterised view, because views demonstrate this same behaviour of extracting the definition of the view and using it in the outer query. A multi-statement function is clearly more powerful because it can contain far more complex logic.&lt;/p&gt;  &lt;p&gt;But a multi-statement function isn’t really a function at all. It’s a stored procedure. It’s wrapped up like a function, but behaves like a stored procedure. It would be completely unreasonable to expect that a stored procedure could be simplified down to recognise that not all the columns might be needed, but yet this is part of the pain associated with this procedural function situation.&lt;/p&gt;  &lt;p&gt;The biggest clue that a multi-statement function is more like a stored procedure than a function is the “BEGIN” and “END” statements that surround the code. If you try to create a multi-statement function without these statements, you’ll get an error – they are very much required. When I used to present on this kind of thing, I even used to call it “The Dangers of BEGIN and END”, and yes, I’ve written about this type of thing before in a &lt;a href="http://msmvps.com/blogs/robfarley/archive/2009/12/05/dangers-of-begin-and-end.aspx" target="_blank"&gt;similarly-named post&lt;/a&gt; over at my old blog.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Now how about scalar functions...&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;Suppose we wanted a scalar function to return the count of these.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;CREATE FUNCTION dbo.FetchSales_scalar(@salespersonid int, @orderyear int)        &lt;br /&gt;RETURNS int        &lt;br /&gt;AS         &lt;br /&gt;BEGIN        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; RETURN (        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;/font&gt;&lt;font face="Consolas"&gt;SELECT COUNT(*)       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; FROM Sales.SalesOrderHeader AS o        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; LEFT JOIN HumanResources.Employee AS e        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; ON e.EmployeeID = o.SalesPersonID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; WHERE o.SalesPersonID = @salespersonid        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;gt;= DATEADD(year,@orderyear-2000,'20000101')        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;lt; DATEADD(year,@orderyear-2000+1,'20000101')        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; );        &lt;br /&gt;END        &lt;br /&gt;;        &lt;br /&gt;GO        &lt;br /&gt;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Notice the evil words? They’re required. Try to remove them, you just get an error.&lt;/p&gt;  &lt;p&gt;That’s right – any scalar function is procedural, despite the fact that you wrap up a sub-query inside that RETURN statement. It’s as ugly as anything. Hopefully this will change in future versions.&lt;/p&gt;  &lt;p&gt;Let’s have a look at how this is reflected in an execution plan. Here’s a query, its Actual plan, and its Estimated plan:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT e.LoginID, y.year, dbo.FetchSales_scalar(p.SalesPersonID, y.year) AS NumSales       &lt;br /&gt;FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year)        &lt;br /&gt;CROSS JOIN Sales.SalesPerson AS p        &lt;br /&gt;LEFT JOIN HumanResources.Employee AS e        &lt;br /&gt;ON e.EmployeeID = p.SalesPersonID;        &lt;br /&gt;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_338F4F96.png" width="660" height="240" /&gt;&lt;/p&gt;      &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_09CBD4A3.png" width="1044" height="584" /&gt;&lt;/p&gt;  &lt;p&gt;We see here that the cost of the scalar function is about twice that of the outer query. Nicely, the query optimizer has worked out that it doesn’t need the Employee table, but that’s a bit of a red herring here. There’s actually something way more significant going on.&lt;/p&gt;  &lt;p&gt;If I look at the properties of that UDF operator, it tells me that the Estimated Subtree Cost is 0.337999.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_7349104E.png" width="522" height="270" /&gt;&lt;/p&gt;  &lt;p&gt;If I just run the query &lt;font face="Consolas"&gt;SELECT dbo.FetchSales_scalar(281,2003);&lt;/font&gt; we see that the UDF cost is still unchanged.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_433EBECD.png" width="551" height="408" /&gt;&lt;/p&gt;    &lt;p&gt;You see, this 0.0337999 is the cost of running the scalar function ONCE. &lt;/p&gt;  &lt;p&gt;But when we ran that query with the CROSS JOIN in it, we returned quite a few rows. 68 in fact. Could’ve been a lot more, if we’d had more salespeople or more years.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_0EBDEC85.png" width="560" height="214" /&gt;&lt;/p&gt;  &lt;p&gt;And so we come to the biggest problem. This procedure (I don’t want to call it a function) is getting called 68 times – each one between twice as expensive as the outer query. And because it’s calling it in a separate context, there is even more overhead that I haven’t considered here.&lt;/p&gt;  &lt;p&gt;The cheek of it, to say that the Compute Scalar operator here costs 0%! I know a number of IT projects that could’ve used that kind of costing method, but that’s another story that I’m not going to go into here.&lt;/p&gt;  &lt;p&gt;Let’s look at a better way.&lt;/p&gt;  &lt;p&gt;Suppose our scalar function had been implemented as an inline one. Then it could have been expanded out like a sub-query. It could’ve run something like this:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT e.LoginID, y.year,        &lt;br /&gt;(SELECT COUNT(*)        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FROM Sales.SalesOrderHeader AS o        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; LEFT JOIN HumanResources.Employee AS e        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ON e.EmployeeID = o.SalesPersonID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; WHERE o.SalesPersonID = p.SalesPersonID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;gt;= DATEADD(year,y.year-2000,'20000101')        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;lt; DATEADD(year,y.year-2000+1,'20000101')        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ) AS NumSales        &lt;br /&gt;FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year)        &lt;br /&gt;CROSS JOIN Sales.SalesPerson AS p        &lt;br /&gt;LEFT JOIN HumanResources.Employee AS e        &lt;br /&gt;ON e.EmployeeID = p.SalesPersonID;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_45B7CABE.png" width="1044" height="371" /&gt;&lt;/p&gt;  &lt;p&gt;&lt;em&gt;Don’t worry too much about the Scan of the SalesOrderHeader underneath a Nested Loop. If you remember from plenty of other posts on the matter, execution plans don’t push the data through. That Scan only runs once. The Index Spool sucks the data out of it and populates a structure that is used to feed the Stream Aggregate. The Index Spool operator gets called 68 times, but the Scan only once (the Number of Executions property demonstrates this).&lt;/em&gt;&lt;/p&gt;  &lt;p&gt;Here, the Query Optimizer has a full picture of what’s being asked, and can make the appropriate decision about how it accesses the data. It can simplify it down properly.&lt;/p&gt;  &lt;p&gt;To get this kind of behaviour from a function, we need it to be inline. But without inline scalar functions, we need to make our function be table-valued. Luckily, that’s ok.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;CREATE FUNCTION dbo.FetchSales_inline2(@salespersonid int, @orderyear int)        &lt;br /&gt;RETURNS table        &lt;br /&gt;AS         &lt;br /&gt;RETURN (SELECT COUNT(*) as NumSales        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; FROM Sales.SalesOrderHeader AS o        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; LEFT JOIN HumanResources.Employee AS e        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; ON e.EmployeeID = o.SalesPersonID        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; WHERE o.SalesPersonID = @salespersonid        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;gt;= DATEADD(year,@orderyear-2000,'20000101')        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; AND o.OrderDate &amp;lt; DATEADD(year,@orderyear-2000+1,'20000101')        &lt;br /&gt;);        &lt;br /&gt;GO&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;But we can’t use this as a scalar. Instead, we need to use it with the APPLY operator.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT e.LoginID, y.year, n.NumSales       &lt;br /&gt;FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year)        &lt;br /&gt;CROSS JOIN Sales.SalesPerson AS p        &lt;br /&gt;LEFT JOIN HumanResources.Employee AS e        &lt;br /&gt;ON e.EmployeeID = p.SalesPersonID        &lt;br /&gt;OUTER APPLY dbo.FetchSales_inline2(p.SalesPersonID, y.year) AS n;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;And now, we get the plan that we want for this query.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_56DB8294.png" width="1044" height="417" /&gt;&lt;/p&gt;  &lt;p&gt;All we’ve done is tell the function that it’s returning a table instead of a single value, and removed the BEGIN and END statements. We’ve had to name the column being returned, but what we’ve gained is an actual inline simplifiable function. And if we wanted it to return multiple columns, it could do that too. I really consider this function to be superior to the scalar function in every way.&lt;/p&gt;  &lt;p&gt;It does need to be handled differently in the outer query, but in many ways it’s a more elegant method there too. The function calls can be put amongst the FROM clause, where they can then be used in the WHERE or GROUP BY clauses without fear of calling the function multiple times (another horrible side effect of functions).&lt;/p&gt;  &lt;p&gt;So please. If you see BEGIN and END in a function, remember it’s not really a function, it’s a procedure. And then fix it.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://twitter.com/rob_farley" target="_blank"&gt;@rob_farley&lt;/a&gt;&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=39687" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/performance/default.aspx">performance</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>Joins without JOIN</title><link>http://sqlblog.com/blogs/rob_farley/archive/2011/10/04/joins-without-join.aspx</link><pubDate>Tue, 04 Oct 2011 02:43:38 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:38837</guid><dc:creator>Rob Farley</dc:creator><slash:comments>7</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/38837.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=38837</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=38837</wfw:comment><description>&lt;p&gt;I’m now doing two sessions at the &lt;a href="http://www.sqlsaturday.com/92/schedule.aspx?sat=92" target="_blank"&gt;SQL Saturday event in Portland&lt;/a&gt;. I had been scheduled to do a &lt;a href="http://www.sqlsaturday.com/viewsession.aspx?sat=92&amp;amp;sessionid=5833" target="_blank"&gt;single session (on indexes)&lt;/a&gt;, but got an email yesterday asking if I could do another one as well. So now I’m going to do a &lt;a href="http://www.sqlsaturday.com/viewsession.aspx?sat=92&amp;amp;sessionid=5275" target="_blank"&gt;session earlier in the day about Joins&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;&lt;img style="margin:5px;display:inline;float:left;" align="left" src="http://www.sqlsaturday.com/images/sqlsat92_web.png" /&gt;Yes, JOINs. Nice co-incidence to find that this month’s T-SQL Tuesday, hosted by &lt;a href="http://codegumbo.com/index.php/2011/09/27/tsql2sday-t-sql-tuesday-23early-edition/" target="_blank"&gt;Stuart Ainsworth&lt;/a&gt; and a week early because of the PASS Summit, is on that same topic. Plus the fact that I gave a presentation on it to the &lt;a href="http://adelaide.sqlpass.org/" target="_blank"&gt;Adelaide SQL User Group&lt;/a&gt; last week.&lt;a href="http://codegumbo.com/index.php/2011/09/27/tsql2sday-t-sql-tuesday-23early-edition/" target="_blank"&gt;&lt;img style="background-image:none;border-right-width:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="" border="0" alt="" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_04CB4BA0.jpg" width="170" height="170" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;So let’s jump in…&lt;/p&gt;  &lt;p&gt;There are three types of joins that we write with the JOIN keyword – INNER, OUTER and CROSS. I’m imagine if you’ve read this far through this post, you’re very much aware of that. But there are other types of joins as well, that don’t use the JOIN keyword (and I’m not counting the comma short-hand for CROSS JOIN that we all used back in the 90s).&lt;/p&gt;  &lt;p&gt;A join describes the way that two sets are related to each other within a query (and for the purposes of this post, I’m not going to count joins that concatenate two sets, such as using the UNION keyword). If you’re using the JOIN keyword, the relationship between the tables is shown in the ON clause – although if you’re using CROSS JOIN, the relationship is that every row in one set is related to every row in the other set.&lt;/p&gt;  &lt;p&gt;If we don’t use the JOIN keyword, then where do we see joins? Well, in &lt;strong&gt;correlated sub-queries&lt;/strong&gt;. I showed this when demonstrating that the APPLY keyword applies a join between one set an another in another &lt;a href="http://sqlblog.com/blogs/rob_farley/archive/2011/04/12/the-power-of-t-sql-s-apply-operator.aspx" target="_blank"&gt;recent blog post&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;But every correlated sub-query, whether using APPLY or not must perform a join.&lt;/p&gt;  &lt;p&gt;Consider the following:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;SELECT *     &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; , (SELECT COUNT(*) FROM Production.Product AS p       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; WHERE p.ProductSubcategoryID = s.ProductSubcategoryID) AS ProdCount      &lt;br /&gt;FROM Production.ProductSubcategory AS s;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;There’s a join here, between Production.ProductSubcategory and Production.Product. The ON clause equivalent is the WHERE clause of the correlated sub-query. But what kind of join is it?&lt;/p&gt;  &lt;p&gt;Let’s think... There’s clearly a relationship between the two tables, so it’s not a CROSS JOIN. Also, there’s nothing that would stop a row from the ProductSubcategory table to be returned, so it’s not an INNER JOIN (which only returns rows that match). It’s an OUTER JOIN. And we can prove this by looking at the execution plan.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_5B7403A1.png" width="660" height="201" /&gt;&lt;/p&gt;  &lt;p&gt;The plan says “Right Outer Join”. Don’t be upset by the fact that it’s a Right Join rather than a Left Join – it’s just that Product table is coming from the top row and ProductSubcategory is from the bottom row. The ‘side’ of an Outer Join is just a matter of perspective. A Right Join is no different to a Left Join, it’s simply a question of which side of the desk you’re on.&lt;/p&gt;  &lt;p&gt;Another type of join with JOIN uses EXISTS or IN. These two act very similarly, and I’m going to use EXISTS in my examples.&lt;/p&gt;  &lt;p&gt;Consider the following query:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;SELECT *     &lt;br /&gt;FROM Production.ProductSubcategory AS s      &lt;br /&gt;WHERE EXISTS       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; (SELECT * FROM Production.Product AS p       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160; WHERE p.ProductSubcategoryID = s.ProductSubcategoryID);&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;and it’s converse:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;SELECT *     &lt;br /&gt;FROM Production.ProductSubcategory AS s      &lt;br /&gt;WHERE NOT EXISTS       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; (SELECT * FROM Production.Product AS p       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160; WHERE p.ProductSubcategoryID = s.ProductSubcategoryID);&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Just as before, this performs a join between the two tables. But it’s not actually an Outer Join we see going on here. After all, we see that rows can be filtered out of the ProductSubcategory table. Furthermore, we don’t have access to any information in the Product table, not even the count of rows. But it does Filter. This is neither an Inner Join or an Outer Join. The EXISTS form is a &lt;strong&gt;Semi Join&lt;/strong&gt;, and the NOT EXISTS form is an &lt;strong&gt;Anti Semi Join&lt;/strong&gt;. These operations simply filter one set based on whether a match is present or not. We see this operation in the two plans involved.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_4EA59522.png" width="517" height="218" /&gt;&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_62AE8BDE.png" width="660" height="238" /&gt;&lt;/p&gt;  &lt;p&gt;The fact that the Query Optimizer chose to implement one with a Hash Match and one with a Nested Loop is irrelevant. We’re looking at the Logical aspects of the plan, not the Physical. &lt;/p&gt;  &lt;p&gt;And so we see there are plenty of times that a Join can appear without the JOIN keyword.&lt;/p&gt;  &lt;p&gt;But let me show you something interesting about the Anti Semi Join for a moment...&lt;/p&gt;  &lt;p&gt;Many people don’t like NOT EXISTS, and would rather write my earlier query like this:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;SELECT *     &lt;br /&gt;FROM Production.ProductSubcategory AS s      &lt;br /&gt;LEFT OUTER JOIN Production.Product AS p       &lt;br /&gt;ON p.ProductSubcategoryID = s.ProductSubcategoryID      &lt;br /&gt;WHERE p.ProductID IS NULL;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;It’s relatively easy to show that these two queries should be identical. Both of them find rows where a match doesn’t exist. Both perform a logical Anti Semi Join. And yet many people will tell you that doing an Outer Join plus a WHERE clause is better – from a performance perspective. They will generally tell you that NOT EXISTS is logically equivalent.&lt;/p&gt;  &lt;p&gt;We’ve seen plenty of times when the Query Optimizer will realise that two queries are identical and produce the same plan, but this doesn’t seem to be the case here.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="" border="0" alt="" src="http://sqlblog.com/blogs/rob_farley/image_01C182F2.png" width="620" height="220" /&gt;&lt;/p&gt;  &lt;p&gt;This plan shows very much what we asked for. An Outer Join plus a Filter, as opposed to the Anti Semi Join. &lt;/p&gt;  &lt;p&gt;And this has very little reason to be any faster than an Anti Semi Join operation. In fact, the interesting thing here is that the Outer Join completes. It tracks every successful match and passes that up to the Filter. Our Anti Semi Join operation doesn’t do that.&lt;/p&gt;  &lt;p&gt;I see no reason to use an Outer Join plus Filter over NOT EXISTS. It may have been quicker at some point if the Anti Semi Join operation was done poorly in earlier versions, but I’d really like to hear if anyone can show me the Outer Join plus Filter method being quicker.&lt;/p&gt;  &lt;p&gt;&lt;em&gt;PS: This Saturday (Oct 8th) I’ll be presenting this kind of stuff in my “Deeper JOINs” session at &lt;/em&gt;&lt;a href="http://www.sqlsaturday.com/92/schedule.aspx?sat=92" target="_blank"&gt;&lt;em&gt;SQL Saturday #92 in Portland&lt;/em&gt;&lt;/a&gt;&lt;em&gt;, and in my &lt;/em&gt;&lt;a href="http://www.sqlpass.org/summit/2011/Speakers/CallForSpeakers/SessionDetail.aspx?sid=1204" target="_blank"&gt;&lt;em&gt;pre-conference seminar a couple of days later (Monday 10th) in Seattle at the PASS Summit&lt;/em&gt;&lt;/a&gt;&lt;em&gt;. You can still register for these events, so if you find this kind of thing interesting, get yourself over to the Pacific North West and I’ll see you there!&lt;/em&gt;&lt;/p&gt;  &lt;p&gt;&lt;a href="http://twitter.com/rob_farley" target="_blank"&gt;@rob_farley&lt;/a&gt;&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=38837" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/performance/default.aspx">performance</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>My biggest recommendation for people learning T-SQL</title><link>http://sqlblog.com/blogs/rob_farley/archive/2011/07/12/my-biggest-recommendation-for-people-learning-t-sql.aspx</link><pubDate>Mon, 11 Jul 2011 23:58:28 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:36789</guid><dc:creator>Rob Farley</dc:creator><slash:comments>1</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/36789.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=36789</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=36789</wfw:comment><description>&lt;p&gt;It’s not quite a &lt;a href="http://troubleshootingsql.com/2011/07/05/invitation-for-t-sql-tuesday-19-t-sql-best-practices/" target="_blank"&gt;Best Practice&lt;/a&gt;, but it’s something that I see as very important. It makes the difference between someone who might be quite good at T-SQL, and someone who can go past the rest and become one of those people who get asked to solve other people’s T-SQL problems.&lt;/p&gt; &lt;a href="http://troubleshootingsql.com/2011/07/05/invitation-for-t-sql-tuesday-19-t-sql-best-practices/" target="_blank"&gt;&lt;img style="background-image:none;border-right-width:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top-width:0px;border-bottom-width:0px;border-left-width:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_0A2C37EC.jpg" width="170" height="170" /&gt;&lt;/a&gt;   &lt;p&gt;It’s easy – &lt;strong&gt;you read the plans&lt;/strong&gt;.&lt;/p&gt;  &lt;p&gt;You see, the plans explain to you what you’re actually doing, instead of just following the standard formula to get it done.&lt;/p&gt;  &lt;p&gt;It’s the same in many parts of life. If you’re going to be a mechanic, it helps to understand how an engine works. If you’re a pilot, you should understand the principles of flight.&lt;/p&gt;  &lt;p&gt;And so it is with T-SQL. We write a query, and the Query Optimizer pulls it apart and translates it into an execution plan. It’s this plan that runs, not our query. Unfortunately, the nuances of this translation demonstrate all of us ignorant. There may be people who understand it better than most, such as some of them people who work on the Query Optimizer at Microsoft.&lt;/p&gt;  &lt;p&gt;Grasping a handle on the main ideas is definitely worthwhile though. If you can understand the ways that Physical Joins work, and the things that influence index choices, you will naturally write better queries. You may still have trouble working out the logic that produces the correct result, but trouble-shooting will be much easier and performance gains will surely follow.&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=36789" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/performance/default.aspx">performance</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>CTEs – query-scoped views</title><link>http://sqlblog.com/blogs/rob_farley/archive/2011/05/10/ctes-query-scoped-views.aspx</link><pubDate>Tue, 10 May 2011 00:04:37 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:35543</guid><dc:creator>Rob Farley</dc:creator><slash:comments>2</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/35543.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=35543</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=35543</wfw:comment><description>&lt;p&gt;This T-SQL Tuesday is on the topic of CTEs, and is hosted by &lt;a href="http://www.bobpusateri.com/" target="_blank"&gt;Bob Pusateri&lt;/a&gt; (&lt;a href="http://twitter.com/sqlbob" target="_blank"&gt;@SQLBob&lt;/a&gt;). &lt;a href="http://www.bobpusateri.com/archive/2011/04/invitation-to-t-sql-tuesday-18-ctes/" target="_blank"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;margin:5px;padding-left:0px;padding-right:0px;display:inline;float:right;border-top:0px;border-right:0px;padding-top:0px;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_331D42CD.jpg" width="154" height="154" /&gt;&lt;/a&gt;For a bunch more posts on the topic, follow the link to his &lt;a href="http://www.bobpusateri.com/archive/2011/04/invitation-to-t-sql-tuesday-18-ctes/" target="_blank"&gt;Invitation Post&lt;/a&gt;, and you’ll see a pile of them, all about CTEs. If you’re reading this and it’s still May 10th 2011 (GMT), then why not put a post together yourself!&lt;/p&gt;  &lt;p&gt;I’m sure there will be plenty of posts about how CTEs can be used to wrap ranking functions, and about recursion – useful features both – but I want to explain how I see CTEs.&lt;/p&gt;  &lt;p&gt;CTEs are Common Table Expressions. The clue is in the name – they’re table expressions, and able to be referred to in the FROM clause, fulfilling the same role that a table might. Not a table in the sense of storing data, but in the sense that a table refers to a set of data, ready to be transformed into a result set.&lt;/p&gt;  &lt;p&gt;There are a few things that can be used as a table expression in a FROM clause. Obviously a table, but others include views, table-valued functions (whether inline or procedural), and table-valued sub-queries. Those of you who regularly read &lt;a href="http://sqlblog.com/blogs/rob_farley" target="_blank"&gt;my blog&lt;/a&gt; (&lt;em&gt;and if you’re not reading this at sqlblog.com, where are you reading it?&lt;/em&gt;) will know that I like to call views ‘stored sub-queries’, and TVFs ‘stored correlated sub-queries’ or ‘parameterised views’. So perhaps there’s no surprises here that these are all table expressions too.&lt;/p&gt;  &lt;p&gt;But this post is about CTEs. They’re half way between view and sub-queries. They exist only within a single query, and contain a table-expression. A stored sub-query if you like, but only scoped to a particular query. They’re like views, but don’t persist.&lt;/p&gt;  &lt;p&gt;So a CTE is a view for when you don’t really want a view. It’s a query-scoped view. Very useful really – and you can even use them for UPDATEs, INSERTs and DELETEs, so long as you understand that this is subject to the same rules as views (but obviously you can’t put triggers on them).&lt;/p&gt;  &lt;p&gt;But this query-scoping can feel like a problem. What if I want my view to persist for a bit longer? That could be quite handy, couldn’t it? What if we wanted a view to be created at the start of a stored procedure, and be scoped there? We can do this with tables, we just put a hash in front of the name: &lt;font face="Consolas"&gt;CREATE TABLE #blah (id int)&lt;/font&gt;, and the scope is set.&lt;/p&gt;  &lt;p&gt;Unfortunately when we try to do this with views, we get a nice error.&lt;/p&gt;  &lt;p&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:inline;border-top:0px;border-right:0px;padding-top:0px;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_6E9D05F3.png" width="480" height="110" /&gt;&lt;/p&gt;  &lt;p&gt;That’s right: “Temporary views are not allowed.” It knows what we’re trying to do, it just doesn’t let us do it. It’ll give similar errors for temporary functions too. &lt;/p&gt;  &lt;p&gt;Luckily, you can vote for this on the Connect site. There are a couple of different suggestions on how it could be done, so I’ll let you choose the one that you prefer&lt;/p&gt;  &lt;p&gt;&lt;a href="http://connect.microsoft.com/SQLServer/feedback/details/640863"&gt;http://connect.microsoft.com/SQLServer/feedback/details/640863&lt;/a&gt; (Please allow creation of temporary views)&lt;/p&gt;  &lt;p&gt;&lt;a href="http://connect.microsoft.com/SQLServer/feedback/details/343067"&gt;http://connect.microsoft.com/SQLServer/feedback/details/343067&lt;/a&gt; (Module-level table expressions)&lt;/p&gt;  &lt;p&gt;Maybe at some point we’ll be able to have a wider range of options over the scope of our ‘common’ table-expressions.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://twitter.com/rob_farley" target="_blank"&gt;@rob_farley&lt;/a&gt;&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=35543" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item><item><title>The power of T-SQL's APPLY operator</title><link>http://sqlblog.com/blogs/rob_farley/archive/2011/04/13/the-power-of-t-sql-s-apply-operator.aspx</link><pubDate>Tue, 12 Apr 2011 23:30:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:34845</guid><dc:creator>Rob Farley</dc:creator><slash:comments>13</slash:comments><comments>http://sqlblog.com/blogs/rob_farley/comments/34845.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/rob_farley/commentrss.aspx?PostID=34845</wfw:commentRss><wfw:comment>http://sqlblog.com/blogs/rob_farley/rsscomments.aspx?PostID=34845</wfw:comment><description>&lt;p&gt;I could go on all day about APPLY – it really is an incredible part of T-SQL. It helps solves problems were frustratingly painful to solve otherwise. It empowers the query writer in a way that is matched by few other features. This makes it a worthy topic for &lt;a href="http://mattvelic.com/tsql-tuesday-17-invite/"&gt;this month’s T-SQL Tuesday&lt;/a&gt;, hosted by &lt;a href="http://mattvelic.com/"&gt;Matt Velic&lt;/a&gt;. &lt;/p&gt;  &lt;p&gt;&lt;a href="http://mattvelic.com/tsql-tuesday-17-invite/"&gt;&lt;img style="border:0px currentColor;padding-top:0px;padding-right:0px;padding-left:0px;float:right;display:inline;background-image:none;" title="TSQL2sDay150x150" border="0" alt="TSQL2sDay150x150" align="right" src="http://sqlblog.com/blogs/rob_farley/TSQL2sDay150x150_5127808C.jpg" width="150" height="150"&gt;&lt;/a&gt;Yes, I’ve already written &lt;a href="http://sqlblog.com/blogs/rob_farley/archive/2011/04/12/select-and-where-but-no-from-clause.aspx" target="_blank"&gt;a post for T-SQL Tuesday this month&lt;/a&gt;, but I’m on a flight for a couple of hours, and I do like the APPLY operator. This post will probably be more like what other people are writing this month, looking at some of the inner workings of the APPLY operator. So if you’re reading this, go and read some of the other posts about APPLY, to have things explained in a slightly different way.&lt;/p&gt;  &lt;p&gt;APPLY, in both its inner (CROSS APPLY) and outer (OUTER APPLY) forms, allows a correlated sub-query or table-valued function to be part of the FROM clause. This is different to a regular JOIN, which is between two sets that are independent of each other, but able to be joined according to some predicate that lives in the ON clause. Using APPLY, that restriction disappears, and the predicate can be built into the sub-query (or TVF) itself.&lt;/p&gt;  &lt;p&gt;These two queries are exactly the same. They have identical execution plans, and produce the same result.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT p.Name, s.Name, s.ProductCategoryID       &lt;br&gt;FROM Production.Product AS p        &lt;br&gt;JOIN Production.ProductSubcategory AS s        &lt;br&gt;ON s.ProductSubcategoryID = p.ProductSubcategoryID;&lt;/font&gt;&lt;/p&gt;    &lt;p&gt;     &lt;br&gt;&lt;font face="Consolas"&gt;SELECT p.Name, s.Name, s.ProductCategoryID       &lt;br&gt;FROM Production.Product AS p        &lt;br&gt;CROSS APPLY (        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT subcat.Name, subcat.ProductCategoryID         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Production.ProductSubcategory AS subcat        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE subcat.ProductSubcategoryID = p.ProductSubcategoryID) AS s;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="border:0px currentColor;padding-top:0px;padding-right:0px;padding-left:0px;display:inline;background-image:none;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_0A3BA66A.png" width="520" height="368"&gt;&lt;/p&gt;  &lt;p&gt;You may look at this and figure that the first option is entirely adequate, and see no reason to switch. There isn’t a reason to switch for most situations, and I don’t actually want you to change the way you write regular joins. But I do want you to consider the power of having a predicate as part of a correlated table-sub-query.&lt;/p&gt;  &lt;p&gt;One common scenario is being able to hook easily into the first row in a particular join. It’s easy – you just add &lt;font face="Consolas"&gt;TOP (1)&lt;/font&gt; to the sub-query, and put an ORDER BY clause in there. I do this happily in BI solutions, where I want to grab a record from a dimension table that is current at the time that an event occurred.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;OUTER APPLY (       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT TOP (1) d.DimKey        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM dbo.DimensionTable AS d        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE d.BusinessKey = e.DimensionCode        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND d.StartDate &amp;lt; e.EventDate        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY d.StartDate DESC) AS d&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;I know, you can do this very easily using a correlated sub-query in the SELECT clause, since the sub-query returns a single value (yes, it’s &lt;em&gt;scalar&lt;/em&gt;). But what if we needed to return multiple columns from a sub-query, or needed to return the best three records for a particular category. Here I’m finding the three best-selling products for each subcategory.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT s.Name, TopProds.*       &lt;br&gt;FROM Production.ProductSubcategory AS s        &lt;br&gt;OUTER APPLY (        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; SELECT TOP (3) p.Name, SUM(o.OrderQty) AS NumOrdered        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Production.Product AS p        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN Sales.SalesOrderDetail AS o        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ON o.ProductID = p.ProductID        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE p.ProductSubcategoryID = s.ProductSubcategoryID        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; GROUP BY p.ProductID, p.Name        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY SUM(o.OrderQty) DESC) as TopProds        &lt;br&gt;;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;This could be done with a Common Table Expression (CTE) and the ROW_NUMBER() functdion, but that solution isn’t nearly as elegant as this. This is the kind of solution that was designed for APPLY. One of those times when a problem and solution are nicely matched, made for each other.&lt;/p&gt;  &lt;p&gt;Here’s an example that achieves the same using a CTE:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;WITH OrderedProds AS       &lt;br&gt;(        &lt;br&gt;&amp;nbsp;&amp;nbsp; SELECT p.ProductSubcategoryID,&amp;nbsp; &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; p.Name,&amp;nbsp; &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; SUM(o.OrderQty) AS NumOrdered,&amp;nbsp; &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ROW_NUMBER() OVER (PARTITION BY p.ProductSubcategoryID         &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/font&gt;&lt;font face="Consolas"&gt;ORDER BY SUM(o.OrderQty) DESC) AS ProdOrder       &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Production.Product AS p        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN Sales.SalesOrderDetail AS o        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ON o.ProductID = p.ProductID        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; GROUP BY p.ProductID, p.Name, p.ProductSubcategoryID        &lt;br&gt;)        &lt;br&gt;SELECT s.Name, o.Name, o.NumOrdered        &lt;br&gt;FROM Production.ProductSubcategory s        &lt;br&gt;LEFT JOIN OrderedProds o        &lt;br&gt;ON o.ProductSubcategoryID = s.ProductSubcategoryID        &lt;br&gt;WHERE o.ProdOrder &amp;lt;= 3        &lt;br&gt;;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;I’m sure you’ll appreciate that this feels a lot more contrived than using APPLY. The Query Optimizer should really break these two queries down to exactly the same, recognising that joining on the PARTITION BY column(s) is a correlation mechanism. It doesn’t do that yet, but one day, I’m sure it might.&lt;/p&gt;  &lt;p&gt;&lt;em&gt;(The CTE query runs quicker than the APPLY equivalent in this particular case on the AdventureWorks database, but it shouldn’t need to. The behaviour largely depends on a number of other factors, in the same way that multiple Seeks can sometimes be faster than a Scan, but not always.)&lt;/em&gt;&lt;/p&gt;  &lt;p&gt;Let’s think about views for a moment. They’re sometimes referred to as “Virtual Tables” (yes, &lt;a href="http://msdn.microsoft.com/en-us/library/ms187956.aspx" target="_blank"&gt;SQL Books Online&lt;/a&gt;, I’m looking at you!)&lt;/p&gt;  &lt;p&gt;People who have heard my &lt;strike&gt;preaching&lt;/strike&gt; presenting about writing good T-SQL will have heard me refer to views as “stored sub-queries”. I reject the notion that they’re “virtual tables”. That’s rubbish – they’re treated in by the Query Optimizer as a stored sub-query, and should be referred to as such. The natural extension of this is that an inline table-valued function is a parameterised view, or stored correlated table sub-query. Basically the same as what we see here.&lt;/p&gt;  &lt;p&gt;So let’s create it.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;CREATE FUNCTION dbo.Top3ProductsBySubcategory(@subcat int) RETURNS TABLE       &lt;br&gt;AS RETURN(        &lt;br&gt;SELECT TOP (3) p.Name, SUM(o.OrderQty) AS NumOrdered        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FROM Production.Product AS p        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; JOIN Sales.SalesOrderDetail AS o        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ON o.ProductID = p.ProductID        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE p.ProductSubcategoryID = @subcat        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; GROUP BY p.ProductID, p.Name        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ORDER BY SUM(o.OrderQty) DESC        &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; );&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Notice I’ve simply changed the correlation hook (&lt;font face="Consolas"&gt;s.ProductSubcategoryID&lt;/font&gt;) into the parameter. Also note that I’m NOT using BEGIN and END, which would render the function procedural and non-simplifiable. To hear me go on about that in more detail, have a look at the video at &lt;a href="http://bit.ly/SimpleRob" target="_blank"&gt;http://bit.ly/SimpleRob&lt;/a&gt; – a URL I have to thank &lt;a href="http://sqlblog.com/blogs/paul_white/" target="_blank"&gt;Farmer Paul&lt;/a&gt; for, which will direct you to a presentation I did at &lt;a href="http://www.sqlbits.com/" target="_blank"&gt;SQLBits V&lt;/a&gt;, in late 2009. &lt;em&gt;(I figure what Paul meant when he created “Simple Rob” is that I present simple truths that have a profound importance, and not suggesting that I’m ‘simple’ at all. He’s not suggesting that my IQ is low or anything like that. He wouldn’t do that, he’s far too nice a bloke, and my friend. I mean, I know he’s a Kiwi, and Kiwis have a friendly rivalry with Aussies – I think they’re jealous of our weather – but he wouldn’t pick on me like that. Besides, Paul and I are both English originally, he just moved to NZ instead of Australia. Maybe he just enjoys colder weather.)&lt;/em&gt;&lt;/p&gt;  &lt;p&gt;So now my query can become:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas"&gt;SELECT s.Name, TopProds.*       &lt;br&gt;FROM Production.ProductSubcategory AS s        &lt;br&gt;OUTER APPLY dbo.Top3ProductsBySubcategory(s.ProductSubcategoryID) AS TopProds        &lt;br&gt;;&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;This executes in exactly the same way, but we’ve contained our logic nicely by storing the sub-query in an inline TVF. Notice the join condition – it’s buried inside the function, abstracted away, and yet completely simplifiable by the Query Optimizer. It’s still very much a join – if you were to look at the execution plan, you’d see a Nested Loop (Left Outer Join), with a Seek Predicate being used further down. (I’ve created indexes to help, of course.) You should be able to click on the image to see it in more detail.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/rob_farley/image_122322FF.png"&gt;&lt;img style="border:0px currentColor;padding-top:0px;padding-right:0px;padding-left:0px;display:inline;background-image:none;" title="image" border="0" alt="image" src="http://sqlblog.com/blogs/rob_farley/image_thumb_54A298AA.png" width="660" height="220"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;I really could keep going about APPLY, and talk about a bunch of other uses for it, but I’ll leave that for another time.&lt;/p&gt;  &lt;p&gt;What I want to leave you with is the reminder that APPLY is just a more powerful way of doing a JOIN, because the set that you’re joining to doesn’t need to exist yet. You may have heard people talk about APPLY being bad, in the same breath as people will tell you that a correlated sub-query is bad. Unfortunately, there are times when this is true. But there are times when any join is bad, it largely depends on what indexes you have in place, and what kind of simplification the Query Optimizer can do in working out your plan. If you have considered issues such as &lt;a href="http://sqlblog.com/blogs/rob_farley/archive/2010/02/02/a-case-study-in-sargability.aspx" target="_blank"&gt;SARGability&lt;/a&gt; and &lt;a href="http://sqlblog.com/blogs/rob_farley/archive/2011/03/22/probe-residual-when-you-have-a-hash-match-a-hidden-cost-in-execution-plans.aspx" target="_blank"&gt;Residual Predicates&lt;/a&gt;, then you really shouldn’t have much of a problem.&lt;/p&gt;  &lt;p&gt;Of course you understand joining to Tables, and to Views. If you aren’t used to joining to sub-queries, then try expanding a view yourself. Then think about what happens if you use APPLY and sneak the join condition into a predicate within the sub-query. All of a sudden you’ll find yourself able to write richer queries, and have a much stronger T-SQL arsenal.&lt;/p&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=34845" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/sql/default.aspx">sql</category><category domain="http://sqlblog.com/blogs/rob_farley/archive/tags/t-sql+tuesday/default.aspx">t-sql tuesday</category></item></channel></rss>