THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Linchi Shea

Checking out SQL Server via empirical data points

T-SQL Cursors -- the Case of the Published TPC-E Tests

Within the SQL Server community, there is so much publicity on T-SQL set-oriented processing as good practice and the use of T-SQL cursors as bad practice that  T-SQL cursors are effectively being seen as a plague to be avoided when in fact it's not the use of T-SQ cursors, but their misuse, that should be avoided. To be fair, I must say that I have seen set-minded folks coming out to exhibit a more balanced view.

I'm not here to engage in a new debate on the pros and cons of T-SQL cursors or when they should be used. Such a debate is necessarily a contentious one at the best, and the voice for the set-oriented approach is so loud that any other arguments or suggestions risk getting drowned out.

What I do want to highlight in this post is that T-SQL cursors are used heavily in all the three TPC-E test results published to date (see www.tpc.org). As of 2007/10/09, all the published TPC-E official tests were obtained with SQL Server 2005 X64 edition and with essentially the same database schema. If you look at the full disclosure reports for these TPC-E results (available at www.tpc.org), you'll find the stored procedures that were written to implement the TPC-E transactions, and of the 26 stored procedures, nine used T-SQL cursors.

Of course, the fact that these TPC-E implementations used T-SQL cursors doesn't mean that T-SQL cursors should be generally recommended. But these published TPC-E results do serve as a few data points to highlight the traps of such unqualified statement as "avoid using cursors."

Yeah, the fact that these TPC-E implementations used T-SQL cursors doesn't even mean cursors are the best possible solution for their business logic. However, note that TPC benchmarking is a highly competitive business. Significant resources are being invested by the test sponsors and the DBMS vendors to get the best performance numbers for their respective systems. You can safely assume that whatever design and configurations that are used in a published TPC benchmark are not there by accident, but are deliberately chosen after many alternative designs and configurations have been explored, considered, and tested.

Published Tuesday, October 09, 2007 12:39 AM by Linchi Shea

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Adam Machanic said:

What are the cursors being used for in the test?  I can't imagine that nine cusors would really be required, out of 26 stored procedures.  Most apps I see where they are the better choice might have one or maybe two, out of hundreds of stored procedures...

October 9, 2007 11:08 AM
 

Linchi Shea said:

Adam;

T-SQL cursors are used to implement the TradeUpdate transactions, TradeOrder transactions, TradeLookup transactions, MarketWatch transactions, and MarkeFeed transactions. Here's one of the TransUpdate stored procedures:

-- Trade Update Frame 3 returns up to max_trades trades for a given

-- security symbol on or after a specified point in time.

-- It may also modify up to max_updates rows of CASH_TRANSACTION.

--

CREATE  PROCEDURE   TradeUpdateFrame3

                       @max_trades         int,

                       @max_updates        int,

                       @symbol             varchar(15),

                       @start_trade_dts    datetime,

                       @end_trade_dts      datetime,

                       @max_acct_id        IDENT_T

AS

BEGIN

   SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

BEGIN TRANSACTION

   DECLARE @trade_id           TRADE_T,

           @t_ca_id            IDENT_T,

           @t_exec_name        varchar(64),

           @t_is_cash          TPCE_BOOLEAN,

           @t_trade_price      S_PRICE_T,

           @t_qty              S_QTY_T,

           @t_dts              datetime,

           @t_tt_id            char(3),

           @se_cash_type       varchar(40),

           @se_cash_due_date   datetime,

           @se_amount          VALUE_T,

           @ct_amount          VALUE_T,

           @ct_name            varchar(100),

           @ct_dts             datetime,

           @num_found          int,

           @num_updated        int,

           @ct_string          varchar(20),

           @s_name             varchar(70),

           @type_name          char(12)

   DECLARE Trade_List CURSOR LOCAL FORWARD_ONLY SCROLL_LOCKS

               FOR SELECT  TOP 20

                           T_ID,

                           T_CA_ID,

                           T_EXEC_NAME,

                           T_IS_CASH,

                           T_TRADE_PRICE,

                           T_QTY,

                           T_DTS,

                           T_TT_ID

                   FROM    TRADE

                   WHERE   T_S_SYMB    =  @symbol AND

                           T_DTS       BETWEEN @start_trade_dts AND @end_trade_dts

                   ORDER   BY T_DTS

   SELECT  @num_found      = 0,

           @num_updated    = 0

   SELECT  @s_name = S_NAME

   FROM    SECURITY

   WHERE   S_SYMB  = @symbol

   OPEN    Trade_List

   FETCH FROM  Trade_List

         INTO  @trade_id,

               @t_ca_id,

               @t_exec_name,

               @t_is_cash,

               @t_trade_price,

               @t_qty,

               @t_dts,

               @t_tt_id

   WHILE (@@FETCH_STATUS = 0 AND @max_trades > 0)

   BEGIN

       -- set cash transaction local variables to NULL

       SELECT  @ct_amount  = NULL,

               @ct_name    = NULL,

               @ct_dts     = NULL,

               @num_found  = @num_found + 1,

               @max_trades = @max_trades - 1,

               @type_name  = TT_NAME

       FROM    TRADE_TYPE

       WHERE   TT_ID       = @t_tt_id

       SELECT  @se_cash_type       = SE_CASH_TYPE,

               @se_cash_due_date   = SE_CASH_DUE_DATE,

               @se_amount          = SE_AMT

       FROM    SETTLEMENT

       WHERE   SE_T_ID             = @trade_id

       -- if this is a cash transaction

       -- get information from CASH_TRANSACTION

       IF (@t_is_cash <> 0)

       BEGIN

           IF (@num_updated < @max_updates)

           BEGIN

               SELECT  @ct_string  = RTRIM(RTRIM(@type_name) + ' ' + LTRIM(CAST(@t_qty AS char(6))))

               UPDATE  CASH_TRANSACTION

               SET     CT_NAME = CASE  WHEN CT_NAME LIKE '% Shares of %' THEN @ct_string + ' shares of ' + RTRIM(@s_name)

                                       WHEN CT_NAME LIKE '% shares of %' THEN @ct_string + ' Shares of ' + RTRIM(@s_name)

                                       ELSE CT_NAME

                                 END

               WHERE   CT_T_ID = @trade_id

               SET     @num_updated    = @num_updated + @@ROWCOUNT

           END

           SELECT  @ct_amount  = CT_AMT,

                   @ct_name    = CT_NAME,

                   @ct_dts     = CT_DTS

           FROM    CASH_TRANSACTION

           WHERE   CT_T_ID     = @trade_id

       END

       -- return as a rowset

       SELECT  @t_ca_id,

               @ct_amount,

               @ct_dts,

               @ct_name,

               @t_exec_name,

               @t_is_cash,

               @t_trade_price,

               @t_qty,

               @s_name,

               @se_amount,

               @se_cash_due_date,

               @se_cash_type,

               @t_dts,

               @trade_id,

               @t_tt_id,

               @type_name

       -- get the trade history of 2-3 rows for this trade

       SELECT  TOP 3

               TH_DTS,

               TH_ST_ID

       FROM    TRADE_HISTORY

       WHERE   TH_T_ID = @trade_id

       ORDER   BY TH_DTS

       IF (@max_trades > 0)

           FETCH FROM  Trade_List

                 INTO  @trade_id,

                       @t_ca_id,

                       @t_exec_name,

                       @t_is_cash,

                       @t_trade_price,

                       @t_qty,

                       @t_dts,

                       @t_tt_id

   END

   SELECT  @num_found,

           @num_updated

   CLOSE       Trade_List

   DEALLOCATE  Trade_List

   COMMIT TRANSACTION

END

GO

You can look up others on www.tpc.org.

October 9, 2007 8:12 PM
 

Adam Machanic said:

Interesting.  I can't believe that this would run faster, but perhaps they get a slight concurrency benefit by doing the updates one-by-one?

Are they not allowed to use snapshot isolation?  It would certainly give them more concurrency than repeatable read... Or did they not use it because it won't scale to TPC volumes? :)

October 10, 2007 12:53 PM
 

Greg Linwood said:

Even though SQL expressions are set based, queries are still executed internally using cursor style processing (row by row) so this shouldn't be terribly surprising. Every query is ultimately processed a cursor..

The Oracle community still use language cursors prolifically but for some reason they've developed a bad reputation in the SQL Server community. I think a major factor has been that Oracle devs have long used rule-based SQL optimsation techniques & have therefore paid much closer attention to the real performance determinent - indexes (b/c you have to think about physical processing when designing SQL for rule based processing), whilst SQL Server devs have always been virtually 100% reliant on cost based optimisation & have tended to think much less about indexes & relied too heavily on the "mystical optimiser".

Well indexed cursors run just fine for most purposes, assuming the dev isn't doing something totally stupid with business rules in-between iterations.

You hit the nail on the head in your first para - it's not the use of cursors that's a problem, its the misuse.

October 11, 2007 7:52 PM
 

Adam Machanic said:

I found Linchi's recent post on use of cursors in the TPC-E test to be quite interesting. The question

October 13, 2007 10:06 AM
 

Alex Kuznetsov said:

My 2 cents.

Linchi, how many rows in TRADE_HISTORY are there on average for @trade_id?

If, for instance, there are 100K row in TRADE_HISTORY for @trade_id, then it is easy to imagine that 20 tiny queries like the folowing run faster than a one set based query for 20 of top 3 trade histories using ROW_NUMBER()

     -- get the trade history of 2-3 rows for this trade

      SELECT  TOP 3

              TH_DTS,

              TH_ST_ID

      FROM    TRADE_HISTORY

      WHERE   TH_T_ID = @trade_id

      ORDER   BY TH_DTS

Interesting that TRADE_HISTORY is not qualified by schema name.

Greg, I think that first of all the performance penalty of using cursors in Oracle is way less pronounced than in SQL Server. So it is natural to prefer a clear and simple cursor with predictable performance over a complex set based query which may become slow and require optimization the next time you upgrade.

October 13, 2007 7:18 PM
 

Adam Machanic said:

Alex,

I would personally use CROSS APPLY for that... probably better than ROW_NUMBER?

On second read I see that they also need to do an update in there, which complicates things.  Perhaps the cursor does make sense given that req...

October 13, 2007 8:23 PM

Leave a Comment

(required) 
(required) 
Submit

About Linchi Shea

Checking out SQL Server via empirical data points

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement