THE SQL Server Blog Spot on the Web
Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

Kalen Delaney

  • Did You Know? Free SQL Server Troubleshooting Tools

    I'm very excited about the new Management Data Warehouse coming in SQL Server 2008 (which was called Performance Studio at one point), but keep in mind there are lots of available tools in the current versions.

    One of the students in my class last week put together this list of his favorites. Note that some of the tools can be used together. For example, the PSSDIAG collection utility output can be pumped into SQL Nexus for nice reporting and analysis.

     

    RML Utilities for SQL Server

    http://support.microsoft.com/kb/944837

    http://www.microsoft.com/downloads/details.aspx?familyid=7EDFA95A-A32F-440F-A3A8-5160C8DBE926&displaylang=en

    SQL Nexus

    http://www.sqlnexus.net/

    PSSDIAG Data Collection Utility

    http://www.microsoft.com/downloads/details.aspx?FamilyId=5564386A-28C2-4483-8293-76FFF67B9EB3&displaylang=en

    Description of the SQL Server Performance Analysis Utilities Read80Trace and OSTRESS

    http://support.microsoft.com/kb/887057

    SQL Server Health and History Tool (SQLH2)

    http://www.microsoft.com/downloads/details.aspx?FamilyID=EEDD10D6-75F7-4763-86DE-D2347B8B5F89&displaylang=en

    Performance Dashboard

    http://www.microsoft.com/downloads/details.aspx?familyid=1d3a4a0d-7e0c-4730-8204-e419218c1efc&displaylang=en

     

    Have fun!

    ~Kalen


  • Did You Know? I have a question for you on DBA Blunders!

    A good friend of mine is putting together a presentation on the top DBA Blunders from the view point of a SQL Server Database Admin, not developer. He would love to hear what other people consider to be in the big mistake list.

    Of course, you don't have to admit that these were your own blunders; they can be ones you heard about, or that 'a friend' is guilty of. :-)

    Thanks for your input!

    ~Kalen


  • Did You Know? I'm going to Portland

    I almost titled this one with OT, but then I realized that this MY BLOG, and nothing is really off-topic, right? It's my blog and I can say whatever I want. :-)

    I get a real big kick out of seeing places in movies where I have actually been. I took my husband to NYC for his birthday in early March, and the first night, in our hotel room, the movie playing on HBO was "Night at the Museum" and we were planning to go to the Museum of Natural History while in the Big Apple! Even though we'd seen the movie before, it was fun to see it again, knowing we were going to be there the next day. Our last night in town, we saw a show (Wicked) and then walked along Broadway. The next day flying home, the movie on the flight was "Enchanted", in which a Princess pops up right in the middle of Broadway. She was right on a corner where we had been the day before. Again, it was so fun just to see a place in a film that we completely recognized.

    Last night we rented a movie that took place in one of my favorite cities, Portland, Oregon.

    What The Bleep!? home page

    Again, it was really fun to see sights that we knew really well. Portland is only about a 3 hour drive from where I live, so when I have a class there, I can drive. I love going to a job in my own car, rather than flying, because then I don't have to worry about baggage restrictions or losing my pocketknife to TSA. I just throw whatever I want in the trunk of my car.  Since I live out in the boondocks, there are not too many places I can go to work that I don't have to fly away.

    I'll be in Portland twice in the next two months.  May 12 - 16, I'll be teaching my 5-day SQL Server Internals and Tuning class publicly, for my training partner SQLSoft+.  June 9-13 I'll be teaching a private class, less than 5 minutes away from the SQLSoft+ location, and I'll be staying at the same hotel. I guess if I just forget something, like my pocketknife, I could just retrieve it when I go back.

    So now you know. Or do you?

    ~Kalen


  • Geek City: Clustered or Nonclustered? Why not both?

    I had been thinking of this post all day, and then noticed that Denis wrote a post with almost the same name. I was worried he might have written about something similar, but it turns out not to be the case.

    A group of colleagues have been having a discussion about this topic recently, which was spurred by the fact that the  Microsoft supplied Northwind database has duplicates of many of its indexes. If Microsoft does this, many new database users might assume it's a good idea. It's not! Having two identical indexes gives you no additional benefit, but does give you lots of additional overhead when those indexes need to be maintained. In SQL Server 2008, you will be able to create a policy to detect and prohibit this behavior if you choose. But that of course, is still in the future.

    But what if the indexes are not quite identical? What if one is a clustered index and one is nonclustered? My colleagues report seeing this behavior frequently when someone declares a Primary Key on a column, and then doesn't realize that automatically builds an index (clustered by default), so she then builds a nonclustered on the same key.  Is this necessarily all bad? The clustered index is useful when most of the columns of many of the rows need to be returned, based on the value in the PK column, or when the data needs to be returned sorted by the PK. But what if you only need a few rows, and you need only key columns? For example, what about a count(*) query?

    In general, when satisfying a count(*) query, the SQL Server optimizer will choose the index with the smallest number of pages at the leaf level. A nonclustered index typically will have far fewer leaf level rows that a clustered, but still contains an entry for every single row, so the count(*) value will be accurate. So consider this example:

    USE AdventureWorks;
    -- create a big table by copying another one
    IF EXISTS (SELECT * FROM sys.tables WHERE name = 'newdetails')
         DROP TABLE newdetails;
    SELECT * INTO newdetails FROM Sales.SalesOrderDetail;

    -- Build a PK and a NC index on the same column
    ALTER TABLE newdetails
             ADD CONSTRAINT PK_Detail PRIMARY KEY (SalesOrderDetailID);
    CREATE UNIQUE INDEX UNQ_Detail ON newdetails (SalesOrderDetailID);

    -- Look at the plan; the optimizer will choose the NC index
    SET SHOWPLAN_TEXT ON;
    SELECT count(*) FROM newdetails
    WHERE SalesOrderDetailID BETWEEN 1000 and 2000;
    SET SHOWPLAN_TEXT OFF;

    -- Look at the performance; the nonclustered is performing better
    -- Even though the NC index will be chosen without the hint,
    --   i included it here to make it more obvious which index is chosen
    SET STATISTICS IO ON;
    SELECT count(*) FROM newdetails WITH (INDEX = UNQ_Detail)
    WHERE SalesOrderDetailID BETWEEN 1000 and 2000
    SELECT count(*) FROM newdetails WITH (INDEX = PK_Detail)
    WHERE SalesOrderDetailID BETWEEN 1000 and 2000
    SET STATISTICS IO OFF;

    I am absolutely not recommending that you always duplicate your PK index with another one, but rather illustrating that it is not always completely pointless or harmful to do so. Choosing the best indexes takes a lot of careful consideration of your data distribution and your query usage patterns, and there is no one-size-fits-all answer.

    I hope this is useful to you,

    ~Kalen


  • Did You Know? Hot Off the Press -- Jim Gray Research Center in Wisconsin

     

    Jim Gray, winner of the Turing Award and author of the forward to three of my books, was the reason Microsoft opened the Bay Area Research Center. This article in Wired explains it:

    When Gray joined Microsoft in 1995, he convinced the software behemoth to launch a research center in San Francisco so that he and his wife wouldn't have to move to Redmond. "If Jim had wanted a lab in Monte Carlo, we would have built a lab in Monte Carlo," says Microsoft Research chief Rick Rashid.

    (The article also describes Jim's loss in January 2007, just outside San Francisco Bay, and the massive search for him.)

    Now, Microsoft is building a new research center in Madison, Wisconsin, and naming it after Jim Gray. You can read the early announcement here.

    Maybe I can figure out a way to get to Madison this fall... I'd love to visit the new Center.

    ~Kalen


  • Did You Know? Another Cumulative Update Package is Available

    Yep, CU #7 was announced yesterday... see http://support.microsoft.com/kb/949095 for details.

    It is interested that MS says:

    A supported cumulative update package is now available from Microsoft. However, it is intended to correct only the problems that are described in this article. Apply it only to systems that are experiencing these specific problems. This cumulative update package may receive additional testing. Therefore, if you are not severely affected by any of these problems, we recommend that you wait for the next SQL Server 2005 service pack that contains the hotfixes in this cumulative update package.

    So we're all waiting... just when will this next SQL Server Service Pack be available? When will it even be announced?  Maybe any day now....

     

    From the Microsoft MVP Summit, in beautiful downtown Seattle...

    ~Kalen


  • What Do You Think? Do Spelling and Grammar Matter?

    Ok, I'm not talking about simple typos. I can make typos like anyone else, but I don't like it. I try to reread anything I write before posting, on blogs and on newsgroups. With my blog, if I find a typo, I can go back and correct it and will always do that. With really nasty typos in newsgroup messages, I will post a followup correction.  For my published writing, there are teams of editors to seek out and destroy my typos, but still a few always slip through.

    But does it matter? Is it important to spell correctly and use proper grammar? I am probably a bigger stickler than most people, and I understand in a quick blog post, you don't want to spend hours with an editor. But how about reading over what you've posted just for something obvious? There are always spellcheckers, too. (My blog writing spellchecker doesn't recognize the word 'spellchecker'.) I  know they don't catch everything, but many people let things slip through that would be easily caught.

    And what about in emails? Does it matter in email to friends? What about to colleagues? What about to current or prospective clients?  What about on your website?

    I have had business associates who thought that spelling and grammar did not matter at all. After seeing mistakes  in letters to prospective clients, I would always offer to proof read anything before it was sent out. I was told that it didn't matter because the writer was a technie, and that clients would judge them on their technical skills, not their writing skills.  I always wondered how anyone could be sure that they had never lost work because of this. If someone was put off by the writing, they might just not respond  and never explain why.

    Yesterday, I saw a horrible paragraph on a website advertising a conference. In one paragraph, there were several basic spelling and grammar errors. But in the same paragraph, they wrote one of the speaker names as Willinam  instead of William. And for another speaker, they totally mangled his topic, repeating words from part of the description later in the sentence. (I guess they had meant to move words around, but did a 'copy and paste' instead of 'cut and paste'.)

    The grammar mistakes I see most often  are:

    incorrect use of its and it's
    incorrect use of your and you're
    incorrect use of "I" as the object (ex: The invitation was sent to my boss and I. -- yuck!)

     

    (I just googled and found that ZDNet does think we should care, and has this list of top 10 grammar mistakes that make us look 'stupid'.)

    Spelling mistakes are too common to enumerate.

    So should I just give up caring? Please note that this is absolutely positively NOT directed at any particular blogger, especially none of my esteemed colleagues on SQLBlog. This post was triggered by the Conference Website I saw yesterday, but has been bugging me for a long time. I was just wondered if it bugged anyone else?

    Thanks

    ~Kalen

    www.InsideSQLServer.com
    www.SQLServerDVD.com


  • Geek City: How SQL Server Detects the Correct Versioned Rows

    Here is a question I just received from the feedback page on my web site:

    I have finished the book <the storage engine> and like it very much. I am now reading <query tuning and optimization>

    I know in the READ COMMITTED SNAPSHOT isolation, when a row is being modified in a transaction, it generates an old committed version so another transaction can read it without being blocked.

    But I don’t know how SQL Server uses SNAPSHOT isolation to prevent Phantom Read being happening.

    In SERIALIZABLE isolation the ranged key or the whole table is locked, but in SNAPSHOT isolation, it can NOT generate any committed version since the row does NOT exist at the moment. So how does it know that the newly inserted data should not be included in the second SELECT statement?

    -Tom

    Tom is correct in that INSERTS do not generate versioned rows. However, SQL Server is able to keep track of when each change was made under snapshot isolation. Once a database is enabled for snapshots, every rows inserted, deleted or modified gets an additional 14 bytes of overhead added to it. These bytes contain 2 pieces of information. First is a row pointer to the previous committed version of the row, which is stored in the version store in tempdb. This pointer is only used for deleted and updated rows, since there will be no previous values for newly inserted rows.

    However, these extra bytes also include a value called XSN, or transaction sequence number, which you can think of like a timestamp for a database. Any database enabled for snapshot keeps an internal XSN value, that is incremented each time any change is made, or any snapshot select is performed. The metadata also keeps track of all active transactions, and what the XSN was when the transaction started. It uses the view sys.dm_tran_active_snapshot_database_transactions for this.  So when you are reading data, SQL Server will look at the XSN number in each row, and not return any rows that have an XSN value greater than the XSN value at the time the transaction started.

    I hope this helps!

    ~Kalen

    www.InsideSQLServer.com
    www.SQLServerDVD.com


  • Did You Know? My First Talk on SQL 2008

    I gave my first SQL Server 2008 presentation yesterday, and also spoke at a Launch Event for the first time. It was a terrific experience, primarily because it forced me to really start digging into SQL Server 2008. I had been doing so much 2005 training recently, I just really hadn't had a chance to really USE 2008, although I have dutifully installed all the CTPs. 

    My one-hour talk covered 3 main areas:

    1. Policy-based Management
    2. Performance Studio
    3. Multi-server Management

    In my prep the evening before, I had been able to go through the whole thing in about 55 minutes, and I usually end up talking faster when in front of an audience. So figured I had time to show a few extra things in the demos, as well as show off the new ZoomIt presentation tool that I talked about in my previous post.

    I have no idea what happened, but I was just barely finished with the first demo, and I looked at the clock and realized 45 minutes had gone by! I have absolutely no idea how that happened, and I would really like to apologize to all the people in the audience for having to deal with me rushing through the Performance Studio demo and not being able to say much at all about Multi-server management.  I must have just been having too much fun with policies, but that is really no excuse.

    I will be giving the same talk at our local (Puget Sound) .NET Developers Association Meeting on April 21, so if you want to hear the rest of the story, and see the rest of my demos, come on by!

    Thanks!

    ~Kalen

    www.InsideSQLServer.com
    www.SQLServerDVD.com


  • Did You Know? The Most Amazing Tool!

    I know, I said that about the SQL Internals Viewer, and that of course was amazing, but very limited in scope.

    I just found Zoomit at http://technet.microsoft.com/en-us/sysinternals/bb897434.aspx# 

    This is the tool I have been hoping to find for years and years of teaching classes and presenting conference sessions. And it was written by Mark Russinovich, of SysInternals fame... so you know you can trust it.

    It's small, it's easy, it does just what I want as a presenter, but not a whole lot more.

    A quick hotkey, and I can magnify an area of my screen... I can zoom in and out with my mouse wheel. This is great to show menus in Management Studio.

    Another hotkey and I can draw on the screen, change the width of the line and the color.

    One more hotkey, and I can WRITE TEXT on the screen! This is so way cool...I'm supposed to be prepping for a presentation tomorrow, it is almost midnight, and all I want to do is draw all over my screen with Zoomit. 

    Oh, and there is also a timer function for letting students know when to come back from breaks!

    THANK YOU MARK!!!

    Have fun...

    ~Kalen


  • Did You Know? Interview with CSTechCast

    A couple of weeks ago, I was interviewed on CSTechCast,  a technology podcast released every Monday hosted by Eric Johnson, Eric Beehler, and Josh Jones.

    You can access the TechCast at

    http://cstechcast.com/home.aspx?Episode=16

    After clicking the "Listen Now" button, you'll be shown a list of Recording TechCasts, and you need to make sure you choose #16 if you want to listen to me. The TechCast is actuallly an entire news show, and the interview with me is only part of it. My interview starts about 12 minutes and 15 seconds into the recording.

    Enjoy!

    ~Kalen

    www.InsideSQLServer.com

    www.SQLServerDVD.com

     


  • Did You Know? There are only a few seats left for my Kansas City Seminar!

    I'll be heading out to the middle of the country in less than a week. I am very pleased to see such incredible interest in my seminar. Since there really is an absolute maximum number of people we can have in the seminar room, we have to adhere to our upper limit for registrations, and we're very close to that number now, but there are a few seats left.

    Full details, including registration information, can be found here:
    http://www.cleardata.biz/events/query-plans-2008.aspx

     

    Thanks!

    ~Kalen


  • Did You Know? DVD Feedback

    As much as I love to teach, I don't like to watch myself teaching very much. I am always find things to be critical of and it is very difficult for me to be objective when watching video of myself.  So it was very encouraging that two terrific SQL Server professionals made such positive comments about my Lesson 1 DVD on their blogs.

    Database columnist Sean McCown wrote a very nice review on his blog today. Thanks, Sean!
    You can read it here:
    http://dbarant.blogspot.com/2008/03/sql-server-done-right.html

    And yes, it is ok to post my email responses, but it would have been nice if you had cleaned up my typos. :-)

     

    Author Brian Kelley also had some very nice things to say on his blog over at SQL Server Central.  Thank you, too, Brian.
    http://blogs.sqlservercentral.com/brian_kelley/archive/2008/02/12/kalen-delaney-sql-server-internals-dvd.aspx

     

    Have fun!

    ~Kalen


  • Geek City: Nonclustered Index Keys

    I recently received a question about the storage of nonclustered index keys. I am assuming you are aware of the fact that if your table has a clustered index, SQL Server uses the clustered index key (all of its columns, if it is a composite index) as a 'bookmark' in your nonclustered indexes,  to allow your nonclustered indexes to uniquely identify the row that the index is pointing to.

    In particular, this question was about a couple of statements in Inside SQL Server 2005: Query Tuning and Optimization, Chapter 3. Now, I didn't write Chapter 3, but I thought I knew all about how index keys were stored.  But (you may want to be sitting down for this) I WAS WRONG.

    Here are the statements:

    “…when you create a nonunique nonclustered index on a table with a clustered index, we append the clustered index keys to the nonclustered index keys if they are not explicitly part of the nonclustered index keys”

    And then a bit later on, there is another quote discussing this same effect:

    “Note that the key columns for each of the nonclustered indexes on T_clu include the clustered index key column a with the exception of T_clu_f, which is a unique index.”

    Although I had read this chapter when we were putting the book together, I must have read this last sentence too fast. The first quoted sentence is basically what I said in my lead-in paragraph. But the second sentence now seemed wrong. I thought it was saying that if your nonclustered index is UNIQUE, the clustered key columns are not stored in the index at all. I knew that wasn't true, so I contacted the author of Chapter 3. It turns out he was using a very specific definition of key.

    KEY COLUMN: a column that is stored in the leaf pages and the interior (node, or non-leaf) pages of the B-tree and that you can use in an index seek. 

    He also uses another term in some of the related sections of the chapter:

    COVERED COLUMN: a column that can be returned by the index without performing a bookmark lookup; it may or may not be a key column but it is stored in the leaf pages of the B-tree.

    The definition of an index key includes the the property that it is stored at all levels of an index, not just the leaf level. Covered columns are in the leaf, so we have the potential of a covering index, and the possibility of retrieving the data we need without accessing the table itself. But not all covered columns are key columns.

    This was news to me. I assumed that the fact that the clustered index key was in the nonclustered index meant that it was part of the key. And all my previous testing seemed to bear that out; I must never have tested at this level of detail with a unique nonclustered index. What this means is that I have been wrong in the following claim: "Explicitly declaring your clustered key columns as part of your nonclustered indexes does not make any difference in the storage of the index." That statement is true for nonunique indexes, but not true for unique.

    What does this really mean?

    Consider a TabA with a clustered index on col1.  These two index definitions are identical:

    CREATE INDEX nc_index ON TabA(col2)

    CREATE INDEX nc_index ON TabA(col2, col1)

    For a nonunique, nonclustered index (which the above index is), the clustered key col1 is automatically included at the leaf, and at all other levels, just as if we explicitly declared col1 to be part of the nonclustered index key.

    However, these two index definitions are not exactly the same:

    CREATE UNIQUE INDEX nc_index ON TabA(col2)

    CREATE UNIQUE INDEX nc_index ON TabA(col2, col1)

    The first index definition will create a nonclustered index on col2, and add col1 only at the leaf. The second index definition will  also create an nonclustered index on col2, but it will add col1 as a true key column and it will be stored at all index levels.

    Here's a specific example

    Let's look at a specific example. I will be demonstrating the contents of the index rows using DBCC PAGE, and determine the index pages I am interested in by using DBCC IND. For more information about these two undocumented commands, one of the best places to look is at a couple of blog posts by Paul Randal:

    http://www.sqlskills.com/blogs/paul/2007/10/01/InsideTheStorageEngineUsingDBCCPAGEAndDBCCINDToFindOutIfPageSplitsEverRollBack.aspx

    http://blogs.msdn.com/sqlserverstorageengine/archive/2006/12/13/More-undocumented-fun_3A00_-DBCC-IND_2C00_-DBCC-PAGE_2C00_-and-off_2D00_row-columns.aspx

    DBCC IND returns one row for every page of a table or index. The rows contain values indicating what type of page is listed (2 is an index page), what index the page belongs to, what level of an index the page is at (0 is the leaf), and then the file ID and page number. These last two values are then used in DBCC PAGE to get the contents of the page.  The output also includes the file ID and page number for the next and previous pages in logical order. A page with no previous page is the first page in order.

    When running DBCC IND on a large table, you can get lots of rows returned, and it can be tricky to find the exact rows you're interested in. For this reason, I usually take the output of DBCC IND and save it to a table, and then I can query the table. I can included the script to build a table called sp_index_info in the master database. Because of the sp_ prefix, this table can be accessed from any database.

    I'm going to use a table that I copy from the AdventureWorks database into a test database. You can use any database you like to run this code on your own server.

    USE testdb
    GO
    IF EXISTS (SELECT name FROM sys.tables WHERE name = 'Sales')
       DROP TABLE Sales;
    GO
    SELECT  * INTO Sales
    FROM AdventureWorks.Sales.SalesOrderDetail;
    GO

    Now build a clustered index on SalesOrderID and three similar nonclustered indexes on SalesOrderDetailID, which is unique.


    CREATE CLUSTERED INDEX Sales_ID_Index ON Sales(SalesOrderID);
    GO
    -- The first index is not declared as a unique index
    CREATE INDEX Sales_DetailID_Index1 ON Sales(SalesOrderDetailID);
    GO
    -- The second index is declared as a unique index
    CREATE UNIQUE INDEX Sales_DetailID_Index2 ON Sales(SalesOrderDetailID);
    GO
    -- The third index is declared as a unique index and explicitly includes the clustered index key
    CREATE UNIQUE INDEX Sales_DetailID_Index3 ON Sales(SalesOrderDetailID, SalesOrderID);
    GO

    Now, populate the sp_index_info table with the results of DBCC IND

    TRUNCATE TABLE sp_index_info
    INSERT INTO sp_index_info
        EXEC ('DBCC IND ( testdb, Sales, -1)'  );
    GO

    Find the first leaf page from each index, by looking for pages with no previous page, and with an IndexLevel value of 0.


    SELECT PageFID, PagePID, IndexID
    FROM sp_index_info
    WHERE PageType = 2 AND IndexLevel = 0
      AND PrevPagePID = 0
    ORDER BY IndexID;
    GO

    Here's my output:

    PageFID PagePID     IndexID
    ------- ----------- -------
    1       3416        2
    1       1248        3
    1       1680        4

    To look at the actual rows, we need to use DBCC PAGE. An option value of 3 is very useful for index pages, as it gives us tabular output showing each individual index row.  I'll take each of the file and page numbers and use them in DBCC PAGE, and of course you'll have to substitute whatever page numbers you get.


    DBCC TRACEON (3604);
    DBCC PAGE(testdb, 1, 3416, 3);
    DBCC PAGE(testdb, 1, 1248, 3);
    DBCC PAGE(testdb, 1, 1680, 3);

    Here are my results:

    image

    The results shown are for the leaf level. Note that because my clustered index is not unique, it includes a uniqueifier for every row. The uniqueifier is never visible using SELECT statements, but is a hidden column stored within the row to make sure each row in the clustered index is unique. This uniqueifier column is always considered part of the clustered key internally, so every nonunique clustered index is really a composite index.

    There is the exact same data in the leaf for all three nonclustered indexes (except for page numbers of course; all three indexes have their own pages). The same index keys are in each row of the first leaf level index page. However, you might note a couple of differences int the column headers. Just as the nonunique clustered index includes a uniqueifier, SQL Server has to have some way to make sure each nonclustered index row is unique. The column headers actually index which columns of my nonclustered indexes make up the key. For the nonunique index, SQL Server has to consider the entire clustered key as part of the nonclustered key. It knows the clustered key is unique, so adding it to the nonclustered guarantees uniqueness. You can see that all 3 columns are labeled as (key): SalesOrderDetailID, SalesOrderID (the clustered key) and the clustered index uniqueifier.

    For the second index, which was declared as unique, but did not explicitly include the clustered key, the only key the nonclustered index needs is the key the index was built on. Because that single key was declared as unique, no further information is needed to be part of the key. However, the clustered key stored in the leaf level rows, because SQL Server needs that to be able to find the row in the table data.

    The third index explicitly declared the clustered key to be part of the nonclustered index key. So the column labels indicate that both the column SalesOrderDetailID and SalesOrderID are key columns. Since the index was declared to be unique, only the explicitly declared key columns are part of the key. The uniqueifier is there, because as part of the clustered key is must be in the leaf level pages, but it is not marked as part of the nonclustered key.

    So how are these indexes different?

    The indexes differ in the upper levels. The table sp_index_info still contains the output of DBCC IND, so we can find an upper level page for each index. 

    At the level above the leaf the IndexLevel is 1. I don't want to see the rows for the upper level pages of the clustered index, so I filter them out.


    SELECT PageFID, PagePID, IndexID
    FROM sp_index_info
    WHERE PageType = 2 AND IndexLevel > 0 and IndexID > 1
      AND PrevPagePID = 0
    ORDER BY IndexID;

    Here are my results: 

    PageFID PagePID     IndexID
    ------- ----------- -------
    1       3520        2
    1       1360        3
    1       1744        4

    And then I use those results to look at the pages:

    DBCC TRACEON (3604);
    DBCC PAGE(testdb, 1, 3520, 3);
    DBCC PAGE(testdb, 1, 1360, 3);
    DBCC PAGE(testdb, 1, 1744, 3);

    Here are my results:

    image

    Now you should see something different. The first index, nonunique, still has all three columns in the upper level page: the nonclustered key SalesOrderDetailID, and the two columns of the clustered key: SalesOrderID and the uniqueifier.

    The second index is the unique nonclustered on a single column, and only has SalesOrderDetailID in the upper level page.

    The third index is unique and composite, so both of the declared keys are in the upper level.

    What other possibilities are there?

    One additional question you might have is what happens if the clustered index is also unique. How do the nonclustered index rows look different? I think with the tools and examples presented here, you should be able to figure that out for yourself!

    Have fun!

    ~Kalen

    http://DVD.KalenDelaney.com

    www.SQLCommunity.com

     

     

    ------------------------------Script to create a table to hold DBCC IND output--------------------------------------------------------

    -- Create a table to hold the output of DBCC IND
    USE master
    GO
    IF EXISTS (SELECT name FROM sys.tables
                WHERE name = 'sp_index_info')
       DROP TABLE sp_index_info;

    GO
    CREATE TABLE sp_index_info
    (PageFID  tinyint,
      PagePID int,  
      IAMFID   tinyint,
      IAMPID  int,
      ObjectID  int,
      IndexID  tinyint,
      PartitionNumber tinyint,
      PartitionID bigint,
      iam_chain_type  varchar(30),   
      PageType  tinyint,
      IndexLevel  tinyint,
      NextPageFID  tinyint,
      NextPagePID  int,
      PrevPageFID  tinyint,
      PrevPagePID int,
      Primary Key (PageFID, PagePID));


  • Geek City: LOBs are either BLOBs or CLOBs

    When I first heard the word 'BLOB' I had no idea it was an acronym. I thought it really just meant an unformatted bunch of bits. :-)

    But since that time, I've learned a bit more. SQL Server supports two kinds of Large Objects, and has since version 4.0 (Sybase). There is text, which is a large type holding character data, and image, which is a large type holding binary data. SQL 2000 added ntext, to hold Unicode characters.  

    So LOB (Large OBject) refers to any of these types, which have a special way of being stored internally, on special pages called 'LOB Pages'.

    BLOB means Binary Large OBject and only refers to image.

    CLOB means Character Large OBject and refers to text or ntext.

    A column of LOB data can store up to 2GB worth of data, so that could be lots of pages. When you select a column holding LOB data (or select *), SQL Server needs to access every one of those pages. However, if you are looking at the page accesses using SET STATISTICS IO,  you get different output in SQL 2000 and SQL 2005.

    Here's an example on 2005.

    -- First, create a table in a test database.

    USE testdb;
    CREATE TABLE hugerows
       (a char(100), 
        b varchar(1000),
        c text );

    -- Insert a row into the table; Note that replicate cannot return more than 8000 bytes unless you convert the first argument to varchar(max)

    INSERT INTO hugerows
         SELECT REPLICATE('g', 100), REPLICATE('h', 1000),
              REPLICATE(CAST('x' as varchar(max)), 300000);

    The following code will show you how many of each type of page belong to the 'hugerows' table.

    SELECT type_desc, sum(total_pages)
        FROM  sys.partitions p JOIN sys.allocation_units a
        ON  p.partition_id = a.container_id
    WHERE p.object_id = object_id('hugerows')
    GROUP BY type_desc;

    The output should indicate 40 pages for LOB data.

    Now SET STATISTICS IO to ON and SELECT from this table:

    SET STATISTICS IO ON;
    SELECT * FROM hugerows;

    You should see the following message:

    (1 row(s) affected)
    Table 'hugerows'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 224, lob physical reads 0, lob read-ahead reads 0.

    SQL 2005 includes the metrics for 'lob logical reads' and 'lob physical reads' which are not available in SQL 2000 at all. SQL 2000 would just show you the regular logical reads value and that would NOT include the reads of the lob pages. 

    So be careful!

    ~Kalen


More Posts Next page »