<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Linchi Shea : Best Practices, SQL Server 2005</title><link>http://sqlblog.com/blogs/linchi_shea/archive/tags/Best+Practices/SQL+Server+2005/default.aspx</link><description>Tags: Best Practices, SQL Server 2005</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>A Rose By Any Other Name</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2007/11/20/a-rose-by-any-other-name.aspx</link><pubDate>Tue, 20 Nov 2007 20:04:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:3442</guid><dc:creator>Linchi Shea</dc:creator><slash:comments>8</slash:comments><comments>http://sqlblog.com/blogs/linchi_shea/comments/3442.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/linchi_shea/commentrss.aspx?PostID=3442</wfw:commentRss><description>&lt;P&gt;Regardless of the DBMS make or model, the transaction throughput curve of a database system is often shaped like a trapezoid. As the load level goes up initially, so does the transaction throughput. As the load level continues to go up, the transaction throughput will reach a plateau. After the load level is cranked up still further, the transaction throughput will inevitably start to drop and may eventually come to a&amp;nbsp; halt.&lt;/P&gt;
&lt;P&gt;With respect to the throughput drop under severe load, two questions are interesting: (1) what is the cause of the drop? and (2) what can we do to delay the drop until an even higher load level?&lt;/P&gt;
&lt;P&gt;The answers are of course dependent on the nature of the workload, the system configurations, and a host of other factors. &lt;/P&gt;
&lt;P&gt;So rather than trying to address the questions in general, let's look at one specific type of tests I often perform, which is to stress the processors, memory, and their interconnect of a server with a SQL Server read-only workload that effectively caches all the data pages in memory. In other words, this workload takes the disk I/Os out of the equation.&lt;/P&gt;
&lt;P&gt;As expected and with no exception, I would see the transaction throughput start to drop--sometimes rather dramatically--as I increase the load level beyond saturation. When this may happen often correlates nicely with a dramatic increase in waits on &lt;I&gt;Thread-safe memory objects waits&lt;/I&gt; under &lt;I&gt;SQLServer:Wait Statistics&lt;/I&gt;. Unfortunately, this wait counter is documented with a rather useless tautology in that the explanation more or less repeats what the name already suggests without adding useful information. (BTW, the explanation is "Statistics for processes waiting on thread-safe memory allocators"). So,&amp;nbsp; from looking at the wait stats, I can't really tell what wait or waits are killing the transaction throughput.&lt;/P&gt;
&lt;P&gt;The point I want to illustrate, however, is the dramatic impact of a change to the workload--a change that merely gives the stored procedures different names.&lt;/P&gt;
&lt;P&gt;&lt;B&gt;Original workload:&lt;/B&gt; The database calls of the workload calls are even distributed between two stored procedures, spOrderStatus and spStockLevel. &lt;/P&gt;
&lt;P&gt;&lt;B&gt;Modified workload&lt;/B&gt;: In the same database, I created 100 stored procedures: spOrderStatus_1 through spOrderStatus_100, that are identical to spOrderStatus except the names. Similarly, I created another 100 stored procedures: spStockLevel_1 through spStockLevel_100, that are identical to spStockLevel except the names. I then modified the original workload to distribute the database calls evenly among spOrderStatus_1 through spOrderStatus_100 and spStockLevel_1 through spStockLevel_100.&lt;/P&gt;
&lt;P&gt;With the modified workload, the same database was able to sustain the peak throughput for significantly higher load level (i.e. more users). The following chart summarizes the throughput behavior between the original workload and the modified workload on the test server:&lt;/P&gt;
&lt;P&gt;&lt;IMG src="http://sqlblog.com/blogs/linchi_shea/attachment/3442.ashx"&gt; &lt;/P&gt;
&lt;P&gt;I didn't specifically test how many identical but differently named stored procedures I had to create to see the impact on the transaction throughput. 100 for each stored procedure was picked randomly. &lt;/P&gt;
&lt;P&gt;Note that this is a known technique and is used in some of the published SQL Server TPC-C tests. For instance, in the best &lt;A href="http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=105112801"&gt;SQL Server TPC-C results obtained on HP Superdome&lt;/A&gt;, 32 New_Order stored procedures are created with 32 different names but identical code inside. Stored procedures with different names but identical code are also created for several other transactions.&lt;/P&gt;
&lt;P&gt;So, a rose by a different name may just smell better.&lt;/P&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=3442" width="1" height="1"&gt;</description><enclosure url="http://sqlblog.com/blogs/linchi_shea/attachment/3442.ashx" length="5378" type="image/gif" /><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/Best+Practices/default.aspx">Best Practices</category><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/Performance/default.aspx">Performance</category><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/SQL+Server+2005/default.aspx">SQL Server 2005</category><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/Waits/default.aspx">Waits</category></item><item><title>Nondeterministic UPDATE</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2007/11/13/nondeterministic-update.aspx</link><pubDate>Tue, 13 Nov 2007 19:15:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:3312</guid><dc:creator>Linchi Shea</dc:creator><slash:comments>2</slash:comments><comments>http://sqlblog.com/blogs/linchi_shea/comments/3312.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/linchi_shea/commentrss.aspx?PostID=3312</wfw:commentRss><description>&lt;P&gt;Under the description for UPDATE in SQL Server &lt;A href="http://msdn2.microsoft.com/en-us/library/aa260662(SQL.80).aspx"&gt;2000&lt;/A&gt; and &lt;A href="http://msdn2.microsoft.com/en-us/library/ms177523.aspx"&gt;2005&lt;/A&gt; Books Online, you can find the following statement (thanks to SQL Server MVP Steve Kass for pointing me to this passage):&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;BOL goes on to give a simple example of such an UPDATE. However, the example doesn't give you a real feel for the nondeterministic behavior. Although the result is supposed to be undefined and indeed you don't know&amp;nbsp;what value UPDATE will end up assigning, the result does come out consistently the same no matter how many times you execute it. At least during my limited number of test runs, UPDATE seems to always pick the first value. But of course you can't count on that because a different execution plan may pick a different value.&lt;/P&gt;
&lt;P&gt;Still, it would be nice to see a nondeterministic UPDATE statement actually picks a different value to drive it home that you should not use it in your app, unless of course your app logic wants to take advantage of this particular non-determinism (which is difficult to imagine).&lt;/P&gt;
&lt;P&gt;Recently, however, I happened to run into an interesting case where the nondeterministic behavior of UPDATE can be easily reproduced on a multi-core server when query parallelism is used. Here's the script to see the behavior.&lt;/P&gt;
&lt;P&gt;First, run the following script to create two tables and a stored procedure in any user database on a SQL Server 2005 instance:&lt;/P&gt;
&lt;DIV style="BACKGROUND-COLOR:#dddddd;"&gt;&lt;PRE&gt;CREATE TABLE t1 (
	c1 int         NULL,
	c2 varchar(50) NULL,
	rate int       NULL
)
go
CREATE INDEX c1_ind ON t1 (c1)
go

CREATE TABLE t2 (
	c1 int           NULL,
	c2 varchar(50)   NULL,
	rate int         NOT NULL
)
go

drop proc testProc
go
create proc testProc
as
set nocount on
begin tran
  UPDATE t1
     SET rate = t2.rate
    FROM t1 With (Index (c1_ind)), t2
   WHERE t1.c1 = t2.c1
     and t1.c2 = t2.c2

  SELECT 'count_0' = count(*) FROM t1 WHERE rate = 0;
  SELECT 'count_1' = count(*) FROM t1 WHERE rate = 1;
  SELECT 'rate_sum' = sum(rate) FROM t1
rollback tran
go&lt;/PRE&gt;&lt;/DIV&gt;
&lt;P&gt;Second, populate the two tables with data using the following script:&lt;/P&gt;
&lt;DIV style="BACKGROUND-COLOR:#e8e8e8;"&gt;&lt;PRE&gt;WITH tmp (c1, c2,c3) as (
    SELECT 1,1,1
    UNION ALL
    SELECT cast(rand(c3)*1000000 as int)%200, 
           cast(rand(c3)*1000000 as int)%200, 
           c3 + 1
      FROM tmp
     WHERE c3 &amp;lt; 100000
)
INSERT t1
SELECT c1, CAST(c2 as varchar(50)), NULL
  FROM tmp
OPTION (maxrecursion 0);

WITH tmp (c1, c2, c3) as (
    SELECT 0,0,0
    UNION ALL
    SELECT c1 + 1, c2 + 1, c3 + 1
      FROM tmp
     WHERE c1 &amp;lt; 200
)
INSERT t2
SELECT cast(c1 as char(6)), 
       cast(c2 as varchar(50)), 
       0                         -- c3 value = 0
  FROM tmp
OPTION (maxrecursion 0);

-- give each c1 value a second c3 value = 1
INSERT t2
SELECT c1, c2, 1
  FROM t2;
go
&lt;/PRE&gt;&lt;/DIV&gt;
&lt;P&gt;Finally, run the stored procedure multiple times to see that different results are produced: &lt;/P&gt;
&lt;DIV style="BACKGROUND-COLOR:#e5e5e5;"&gt;EXEC testProc;&lt;/DIV&gt;
&lt;P&gt;Apparently, nondeterminism in query processing gives rise to the different results. A few observations are noteworthy:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The non-determinstic behavior is not a result of different query execution plan. At least, the reported execution plan&amp;nbsp;remains the same for all my test runs.&lt;/LI&gt;
&lt;LI&gt;If you disable query parallelism by setting &lt;U&gt;max degree of parallelism&lt;/U&gt; to 1, the stored procedure will produce the same results.&lt;/LI&gt;
&lt;LI&gt;On a SQL Server 2000 instance, you probably won't get a parallel plan, and thus won't see different results.&lt;/LI&gt;
&lt;LI&gt;The size of the table t1 is significant. On my test server, populating table t1 with 70,000 still leads to a parallel query plan, thus different results in multiple executions of the stored procedure. But if I only populate the table with 50,000 rows, I don't get an query parallelism, and I get consistent results. So SQL Server query optimizer is sensitive to the table size in deciding whether to use a parallel plan.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=3312" width="1" height="1"&gt;</description><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/Best+Practices/default.aspx">Best Practices</category><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/SQL+Server+2005/default.aspx">SQL Server 2005</category><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/T-SQL/default.aspx">T-SQL</category></item><item><title>Performance Impact: The Potential Cost of Read_Committed_Snapshot</title><link>http://sqlblog.com/blogs/linchi_shea/archive/2007/10/05/performance-impact-the-potential-cost-of-read-committed-snapshot.aspx</link><pubDate>Fri, 05 Oct 2007 03:49:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:2883</guid><dc:creator>Linchi Shea</dc:creator><slash:comments>4</slash:comments><comments>http://sqlblog.com/blogs/linchi_shea/comments/2883.aspx</comments><wfw:commentRss>http://sqlblog.com/blogs/linchi_shea/commentrss.aspx?PostID=2883</wfw:commentRss><description>&lt;P&gt;In response to my previous blog post--&lt;A href="http://sqlblog.com/blogs/linchi_shea/archive/2007/10/01/performance-impact-setting-a-database-to-read-only.aspx"&gt;Performance Impact: Setting a Database to Read Only&lt;/A&gt;, Shailesh Khanal mentioned that he observed significant performance degradation from&amp;nbsp;READ COMMITTED SNAPSHOT OFF to ON for a read-only workload. This&amp;nbsp;is counter intuitive since there is nothing in the version store if only SELECT statements are running, and the overhead of turning READ COMMITTED SNAPSHOT ON without maintaining any versions shouldn't be that high.&lt;/P&gt;
&lt;P&gt;So I decided to conduct some tests myself with a different read-only workload on SQL Server 2005 SP2 (9.00.3042) Enterprise Edition. The read-only workload was the same as that described in&amp;nbsp;the previous blog post--&lt;A href="http://sqlblog.com/blogs/linchi_shea/archive/2007/10/01/performance-impact-setting-a-database-to-read-only.aspx"&gt;Performance Impact: Setting a Database to Read Only&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;My results were mixed. On one hand, I confirmed Shailesh Khanal's observation that turning READ COMMITTED SNAPSHOT ON caused significant performance degradation. It's interesting to note that performance degradation was all observed on 64-bit 16-way machines. On the other hand, I didn't always observe heavy performance penalty when turning READ COMMITTED SNAPSHOT ON. On a 32-bit 4-way machine, I didn't see any degradation at all. The following two charts summarize&amp;nbsp;my results:&lt;/P&gt;
&lt;P&gt;&lt;IMG src="http://sqlblog.com/blogs/linchi_shea/attachment/2883.ashx"&gt; &lt;/P&gt;
&lt;P&gt;In Fig 1, the performance penalty of READ_COMMITTED_SNAPSHOT&amp;nbsp;for a read-only workload was astounding. Just by turning READ_COMMITTED_SNAPSHOT OFF, the transaction throughput gained a whopping ~60%. This was observed in repeated test runs, and on two different platforms. The common characteristics of these two platforms included: (1) the SQL Server 2005 instances were both x64 running on Windows 2003 x64, and (2) the servers&amp;nbsp;both had 16 cores.&lt;/P&gt;
&lt;P&gt;That's bad news. But there&amp;nbsp;was good news. On a different platform, repeating the same test led to very different results as summarized in Fig 2 above. If there is any difference between the two lines in Fig 2, turning READ_COMMITTED_SNAPSHOT ON appears to have produced slightly better throughput, though the difference should probably be considered insignificant and within the margin of error. This test platform was 32-bit SQL Server 2005 running on 32-bit Windows 2003, and the server had 4 cores.&lt;/P&gt;
&lt;P&gt;Now, I don't know the test environment that Shailesh Khanal used&amp;nbsp;to obtain his results. But from my own tests,&amp;nbsp;I'd venture a conjecture that the performance difference could be attributed to the difference between 64-bit and 32-bit, not the hardware difference between 4 cores and 16 cores. And I'd further venture to suggest that the performance degradation from READ_COMMITTED_SNAPSHOT OFF to ON in the case of 64-bit and read-only workload was most likely due to a bug because the degradation was simply too large to be a feature or by design.&lt;/P&gt;
&lt;P&gt;So is there any practical significance to all this?&lt;/P&gt;
&lt;P&gt;Clearly, you need to use READ_COMMITTED_SNAPSHOT with caution in performance tuning. Best practice recommends that you consider using it if contention is costing you throughput. For read-only workloads, there is no reader/writer contention for READ_COMMITTED_SNAPSHOT to help avoid, and there is no statement-level data consistency problem for it to prevent.&lt;/P&gt;
&lt;P&gt;The other lesson is that you really can't assume anything, even if it intuitively makes sense. You should always confirm it through testing, if possible.&lt;/P&gt;
&lt;P&gt;Finally, there could be a scenario where you have a read/write workload, and READ_COMMITTED_SNAPSHOT improves its performance by resolving your reader/write contention problem, but at the same time degrades the performance of its read-only queries. Overall, you may or may not see a net gain. Again, you'll have to run tests to be sure one way or another.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://sqlblog.com/aggbug.aspx?PostID=2883" width="1" height="1"&gt;</description><enclosure url="http://sqlblog.com/blogs/linchi_shea/attachment/2883.ashx" length="11210" type="image/gif" /><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/Best+Practices/default.aspx">Best Practices</category><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/Performance/default.aspx">Performance</category><category domain="http://sqlblog.com/blogs/linchi_shea/archive/tags/SQL+Server+2005/default.aspx">SQL Server 2005</category></item></channel></rss>