THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Paul White: Page Free Space

A technical SQL Server blog from New Zealand. See also my articles on SQLperformance.com

I see no LOBs!

Is it possible to see LOB (large object) logical reads from STATISTICS IO output on a table with no LOB columns?

I was asked this question today by someone who had spent a good fraction of their afternoon trying to work out why this was occurring – even going so far as to re-run DBCC CHECKDB to see if any corruption had taken place.  The table in question wasn’t particularly pretty – it had grown somewhat organically over time, with new columns being added every so often as the need arose.  Nevertheless, it remained a simple structure with no LOB columns – no TEXT or IMAGE, no XML, no MAX types – nothing aside from ordinary INT, MONEY, VARCHAR, and DATETIME types.  To add to the air of mystery, not every query that ran against the table would report LOB logical reads – just sometimes – but when it did, the query often took much longer to execute.

Ok, enough of the pre-amble.  I can’t reproduce the exact structure here, but the following script creates a table that will serve to demonstrate the effect:

IF      OBJECT_ID(N'dbo.Test', N'U')
        IS NOT NULL
        DROP TABLE dbo.Test
GO
CREATE  TABLE
        dbo.Test
        (
        row_id      NUMERIC IDENTITY NOT NULL,
 
        col01       NVARCHAR(450) NOT NULL,
        col02       NVARCHAR(450) NOT NULL,
        col03       NVARCHAR(450) NOT NULL,
        col04       NVARCHAR(450) NOT NULL,
        col05       NVARCHAR(450) NOT NULL,
        col06       NVARCHAR(450) NOT NULL,
        col07       NVARCHAR(450) NOT NULL,
        col08       NVARCHAR(450) NOT NULL,
        col09       NVARCHAR(450) NOT NULL,
        col10       NVARCHAR(450) NOT NULL,
        
        CONSTRAINT  [PK dbo.Test row_id]
            PRIMARY KEY CLUSTERED (row_id)
        )
;

The next script loads the ten variable-length character columns with one-character strings in the first row, two-character strings in the second row, and so on down to the 450th row:

WITH    Numbers
AS      (
        -- Generates numbers 1 - 450 inclusive
        SELECT  TOP (450)
                n = ROW_NUMBER() OVER (ORDER BY (SELECT 0))
        FROM    master.sys.columns C1,
                master.sys.columns C2,
                master.sys.columns C3
        ORDER   BY
                n ASC
        )
INSERT  dbo.Test
        WITH (TABLOCKX)
SELECT  REPLICATE(N'A', N.n), REPLICATE(N'B', N.n),
        REPLICATE(N'C', N.n), REPLICATE(N'D', N.n),
        REPLICATE(N'E', N.n), REPLICATE(N'F', N.n),
        REPLICATE(N'G', N.n), REPLICATE(N'H', N.n),
        REPLICATE(N'I', N.n), REPLICATE(N'J', N.n)
FROM    Numbers AS N
ORDER   BY
        N.n ASC
;

Once those two scripts have run, the table contains 450 rows and 10 columns of data like this:

Test Data Sample

Most of the time, when we query data from this table, we don’t see any LOB logical reads, for example:

-- Find the maximum length of the data in 
-- column 5 for a range of rows
SELECT  result = MAX(DATALENGTH(T.col05))
FROM    dbo.Test AS T
WHERE   row_id BETWEEN 50 AND 100
;

No LOB Reads

But with a different query…

-- Read all the data in column 1
SELECT  result = MAX(DATALENGTH(T.col01))
FROM    dbo.Test AS T
;

LOB Reads

…suddenly we have 49 LOB logical reads, as well as the ‘normal’ logical reads we would expect.

The Explanation

If we had tried to create this table in SQL Server 2000, we would have received a warning message to say that future INSERT or UPDATE operations on the table might fail if the resulting row exceeded the in-row storage limit of 8060 bytes.  If we needed to store more data than would fit in an 8060 byte row (including internal overhead) we had to use a LOB column – TEXT, NTEXT, or IMAGE.  These special data types store the large data values in a separate structure, with just a small pointer left in the original row.

Row Overflow

SQL Server 2005 introduced a feature called row overflow, which allows one or more variable-length columns in a row to move to off-row storage if the data in a particular row would otherwise exceed 8060 bytes.  You no longer receive a warning when creating (or altering) a table that might need more than 8060 bytes of in-row storage; if SQL Server finds that it can no longer fit a variable-length column in a particular row, it will silently move one or more of these columns off the row into a separate allocation unit.

Only variable-length columns can be moved in this way (for example the (N)VARCHAR, VARBINARY, and SQL_VARIANT types).  Fixed-length columns (like INTEGER and DATETIME for example) never move into ‘row overflow’ storage.  The decision to move a column off-row is done on a row-by-row basis – so data in a particular column might be stored in-row for some table records, and off-row for others.

In general, if SQL Server finds that it needs to move a column into row-overflow storage, it moves the largest variable-length column record for that row.  Note that in the case of an UPDATE statement that results in the 8060 byte limit being exceeded, it might not be the column that grew that is moved!

Sneaky LOBs

Anyway, that’s all very interesting but I don’t want to get too carried away with the intricacies of row-overflow storage internals.  The point is that it is now possible to define a table with non-LOB columns that will silently exceed the old row-size limit and result in ordinary variable-length columns being moved to off-row storage.  Adding new columns to a table, expanding an existing column definition, or simply storing more data in a column than you used to – all these things can result in one or more variable-length columns being moved off the row.

Note that row-overflow storage is logically quite different from old-style LOB and new-style MAX data type storage – individual variable-length columns are still limited to 8000 bytes each – you can just have more of them now.  Having said that, the physical mechanisms involved are very similar to full LOB storage – a column moved to row-overflow leaves a 24-byte pointer record in the row, and the ‘separate storage’ I have been talking about is structured very similarly to both old-style LOBs and new-style MAX types.  The disadvantages are also the same: when SQL Server needs a row-overflow column value it needs to follow the in-row pointer a navigate another chain of pages, just like retrieving a traditional LOB.

And Finally…

In the example script presented above, the rows with row_id values from 402 to 450 inclusive all exceed the total in-row storage limit of 8060 bytes.  A SELECT that references a column in one of those rows that has moved to off-row storage will incur one or more lob logical reads as the storage engine locates the data.  The results on your system might vary slightly depending on your settings, of course; but in my tests only column 1 in rows 402-450 moved off-row.  You might like to play around with the script – updating columns, changing data type lengths, and so on – to see the effect on lob logical reads and which columns get moved when.  You might even see row-overflow columns moving back in-row if they are updated to be smaller (hint: reduce the size of a column entry by at least 1000 bytes if you hope to see this).

Be aware that SQL Server will not warn you when it moves ‘ordinary’ variable-length columns into overflow storage, and it can have dramatic effects on performance.  It makes more sense than ever to choose column data types sensibly.  If you make every column a VARCHAR(8000) or NVARCHAR(4000), and someone stores data that results in a row needing more than 8060 bytes, SQL Server might turn some of your column data into pseudo-LOBs – all without saying a word.

Finally, some people make a distinction between ordinary LOBs (those that can hold up to 2GB of data) and the LOB-like structures created by row-overflow (where columns are still limited to 8000 bytes) by referring to row-overflow LOBs as SLOBs.  I find that quite appealing, but the ‘S’ stands for ‘small’, which makes expanding the whole acronym a little daft-sounding…small large objects anyone?

© Paul White 2011

email: SQLkiwi@gmail.com
twitter: @SQL_Kiwi

Published Saturday, February 19, 2011 8:11 AM by Paul White
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Roy Ernest said:

Very good blog. I had seen this behavior in our set up. Now I know why. Thanks.

February 18, 2011 2:30 PM
 

Rob Volk said:

Instead of "S" meaning "small", how about "Silent" Large-OBjects?

February 18, 2011 2:40 PM
 

Paul White said:

Roy,

Thanks for the feedback - I'm so glad you found this entry useful.

Paul

February 18, 2011 3:39 PM
 

Paul White said:

Hey Rob,

Yeah :)

Silent is about right!

Cheers,

Paul

February 18, 2011 3:40 PM
 

Paul White: Page Free Space said:

There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes. 

February 22, 2011 10:19 AM
 

Paul White: Page Free Space said:

There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes.

February 25, 2011 5:06 PM

Leave a Comment

(required) 
(required) 
Submit
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement