THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Michael Zilberstein

Monitoring page splits with Extended Events

After reading Kalen Delaney's post about single insert causing 10 page splits, I wanted to see those splits in detail - their order at first place. And in SQL Server 2008 there is a way to trace splits - using new Extended Events infrastructure. Here is simple script that creates the trace and afterwards displays results.

First of all, create and populate table in tempdb as described in the Kalen's post.

USE tempdb;
GO

CREATE TABLE split_page 
(id INT IDENTITY(0,2) PRIMARY KEY,
id2 bigint DEFAULT 0,
data1 VARCHAR(33) NULL, 
data2 VARCHAR(8000) NULL);
GO

INSERT INTO split_page DEFAULT VALUES;
GO 385

Now, let's create and start Extended Events session. The only event we would like to monitor is page_split. On the way we'll capture sql text in order to be sure that it is our insert that caused split.

CREATE EVENT SESSION MonitorPageSplits ON SERVER
ADD EVENT sqlserver.page_split
(
  
ACTION (sqlserver.database_id, sqlserver.sql_text
   
WHERE sqlserver.database_id = 2
)
ADD TARGET package0.asynchronous_file_target
(
  
SET 
      
filename = N'c:\temp\MonitorPageSplits.etx'
      
metadatafile = N'c:\temp\MonitorPageSplits.mta'
);
GO

ALTER EVENT SESSION MonitorPageSplits ON SERVER STATE = start;
GO

Now execute INSERT command from Kalen's script.

SET IDENTITY_INSERT split_page  ON;
GO
INSERT INTO split_page (id, id2, data1, data2)
     
SELECT 111, 0, REPLICATE('a', 33), REPLICATE('b', 8000);
GO
SET IDENTITY_INSERT split_page  OFF;
GO

Afterwards we'll close the session and display results.

ALTER EVENT SESSION MonitorPageSplits ON SERVER STATE = STOP;
GO

DROP EVENT SESSION MonitorPageSplits ON SERVER;
GO

SELECT 
  
split.value('(/event/data[@name=''file_id'']/value)[1]','int') AS [file_id],
  
split.value('(/event/data[@name=''page_id'']/value)[1]','int') AS [page_id],
  
split.value('(/event[@name=''page_split'']/@timestamp)[1]','datetime') AS [event_time],
  
split.value('(/event/action[@name=''sql_text'']/value)[1]','varchar(max)') AS [sql_text]
FROM
  

      
SELECT CAST(event_data AS XML) AS split
      
FROM sys.fn_xe_file_target_read_file('c:\temp\MonitorPageSplits*.etx', 'c:\temp\MonitorPageSplits*.mta', NULL, NULL)
   )
AS t
ORDER BY [event_time]
GO

OK, the result is:

image

So, we see 10 splits, among them 4 splits of page 148, another 3 of page 178 etc. It makes sense. When split occurs, ~half of the data from the old page goes to the new one. So if the new row - the one that caused split - should originally have entered first half of the page, after the split it would still try to enter the old page - not the new one. In our case originally we had 56 rows before the new one (id from 0 to 110 step 2) and 385 - 56 = 329 rows after. I would still expect 3 and not 4 splits of the initial page because (((385 / 2) / 2) / 2) = ~48 < 57 (new row's place). So I have expected that after the third split new row would at last leave the initial page. But I was wrong - don't know whether that's just not strict math or there're other factors I didn't think of.

Published Thursday, February 05, 2009 6:05 PM by Michael Zilberstein

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Jason Massie said:

Pardon my laziness but is object_id and index_id available in the xml? That would make this really useful. Of course, you can drop into dbcc page but it would be nice to group and order by your indexes with the most splits.

February 5, 2009 10:36 AM
 

Michael Zilberstein said:

There're 2 types of data in Extended Events - properties of particular event (for page_split those are file_id and page_id only) and data you capture when event fires - specified in ACTIONS predicate (sqlserver.database_id and sqlserver.sql_text in my example). I tried to find action for capturing object_id but looks like it doesn't exist:

SELECT p.name + '.' + o.name as [Full Name], o.[description]

FROM

sys.dm_xe_objects o

INNER JOIN sys.dm_xe_packages p on o.package_guid = p.[guid]

WHERE o.object_type = 'action'

ORDER BY p.name, o.name

Still, DBCC PAGE is probably unnecessary here - you can use sys.dm_os_buffer_descriptors or %%physloc%% to identify object by file_id and page_id (although it is valid for cached objects, so is good for real-time analysis only).

February 5, 2009 10:56 AM
 

Adam Machanic said:

A couple of comments:

A) Agreed with Michael that the buffer descriptors DMV is probably the best way to get related data (if you're collecting fairly often, which you probably should be).  But don't go for the object_id -- that will only tell you the affected table and not which index actually had the split.  Instead go for the allocation_unit_id.

B) The problem with this event, aside from the lack of data, is that it fires for ANY page allocation, whether or not data is moved.  So for example if you have a clustered index based on an IDENTITY column, that will cause "splits" at the last page, but this won't result in fragmentation -- not something to be concerned with.  I've been playing with this for a couple of weeks and have come up with the following algorithm to determine potential problems.  What do you think?

(pseudocode follows -- this is to be used for each index found)

---

int maxPageId = 0

int totalPotentialIssues = 0

foreach (pageId order by timestamp)

 if pageId <= maxPageId

 {

   totalPotentialIssues++

 }

 else

 {

   maxPageId = pageId

 }

---

In English: Find all of the data for a given index, and order it by timestamp, ascending. If the split happened on a page with a greater ID than any page we've seen before, assume that it's probably a leaf-level split.  If the split happened on a page less than or equal to the maximum page ID we've seen before, it's likely to be an internal split, so it should be counted as a potential issue.

Obviously this isn't failsafe but I think it gets us much closer to an accurate picture than simply looking at how many times the event fires.  What do you think?

February 5, 2009 11:21 AM
 

Greg Linwood said:

Nice idea Adam

I wonder whether its possible to simply check the first / last allocation in sys.indexes when the split event fires & see whether the new page is linked to either?

Cheers,

Greg

February 5, 2009 4:23 PM
 

Adam Machanic said:

Hi Greg,

I thought about this for a while and I think your idea would be impossible today due to lack of a streaming provider.  Since the check would effectively have to be done asynchronously, by the time you check there is no guarantee that another split would not have taken place.  Even if there were a streaming provider I would be a bit nervous about putting something like that in place -- it might be a bit too expensive to do so many checks.  My hope is that by the time there is such a provider we won't have to worry about it because the XE team will have made the event [more] useful...

February 9, 2009 10:09 AM
 

Jonathan Kehayias said:

There is a connect item in for the expansion of the information available in event:

https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=388482

February 9, 2009 4:27 PM
 

Michael Zilberstein said:

Hi Adam,

A) Sure, in buffer descriptors you have allocation_unit_id, not object_id.

B) Interesting idea. You can build a job that will probe extended events trace files every, say, 10 minutes and summarize splits per index (comparing page_id to the previous maximum value and updating it on the fly) plus write timestamp aside in order to avoid counting same split more than once. Still the ideal solution would be some asynchronous bucketizing target (bucket per index) but for this we need allocation_unit_id as property of event or as possible Action.

February 11, 2009 2:59 AM
 

Jay said:

This is an good article based upon e.g. but to get page split real time is very different & difficult for particluar table or Indexes.

Will like to see how to get real time for Indexes which are triggering page_split.

July 4, 2010 5:36 PM
 

Michael Zilberstein said:

Jay,

you're right - tracing page split in real environment should be done very carefully, especially in IO-intensive environment. Although event itself is very lightweight, writing it to file isn't that cheap when you have thousands splits per second. So it should be very focused and limited in time.

July 4, 2010 5:54 PM
 

Roji said:

Micheal,

>> I would still expect 3 and not 4 splits of the initial page because (((385 / 2) / 2) / 2) = ~48 < 57 (new row's place).

AFAIK, the purpose of the split is not to get teh row location at the end, but to find enough space to accomodate the new data.

You can try the above example with id 1 instead of 111, which will prove my point.

July 15, 2010 2:28 PM
 

Michael Zilberstein said:

Roji,

you're right about purpose of the split. But take into account, that my row's size is exactly 1 full page, so it can't be just added to another row - i causes other rows to migrate from the page. Here is how I see it step by step:

1) Split of page 148; 193 rows remain in p148, 192 rows migrate to another page.

Where my row should enter now? Still page 148. Does it contain enough free space? No.

2) Split of page 148; 92 rows remain in p148, 91 rows migrate to another page.

Where my row should enter now? Still page 148. Does it contain enough free space? No.

3) Split of page 148; 46 rows remain in p148 (id betwen 0 and 90), 46 rows migrate to another page (id starting from 92).

Where my row (id = 111) should enter now? New page where rows starting from id = 92 went during last split. Does it contain enough free space? No. But next split won't be of the initial page 148. That's why I say that 4 splits surprised me.

July 15, 2010 4:05 PM
 

The Rambling DBA: Jonathan Kehayias said:

Nearly two years ago Kalen Delaney blogged about Splitting a page into multiple pages , showing how page

December 27, 2010 9:58 PM
 

The Rambling DBA: Jonathan Kehayias said:

Nearly two years ago Kalen Delaney blogged about Splitting a page into multiple pages , showing how page

December 27, 2010 10:01 PM
 

Pavel Nefyodov said:

Hi Michael,

Your math and logic are good.

But have you noticed that before last row was inserted there were 2 pages and no Index pages?

Creation of Index page counts as a page split. So, you have got 3 data page splits + 1 index page= 4 page splits as shown in your results.

Hope that solves a problem.

April 20, 2011 11:31 AM
 

Michael Zilberstein said:

Hi Pavel,

That's interesting. Indeed before the split there was one data page and one IAM page. After the split there're 2 IAM pages, 1 upper-level index page and 10 data pages. So from 2 pages we're now at 13 - 11 "splits" (if page allocation is counted as split). On another hand, trace contains page_id - and none of those pages is index page. I'll try to understand it deeper and - if I something interesting is uncovered - probably it'll be another post :-)

April 20, 2011 11:57 AM
 

Pavel Nefyodov said:

There should be 1 index page, 1 IAM page and 10 data pages after the split. I checked it in SQL Server 2005/2008. Perhaps, it might be completely different situation in Denali.

BTW I made my own research on this subject. It's too big to fit into comments field, but if you are interested let me know.

April 21, 2011 5:23 AM
 

Michael Zilberstein said:

Hi Pavel,

Re-tested it again. Actually, transaction log provides the most complete information - much better than extended events. So I see that first index page is allocated, after it 9 data page splits occur (3 on the first page etc exactly as I calculated). Surprisingly, extended events session still shows as if 4 splits on the first page occurred. Looks like bug to me - index page allocation is presented as split in data page.

April 24, 2011 9:54 PM
 

Pavel Nefyodov said:

Yes, I have seen it in a script. You might be absolutely right WRT the bug. What version of SQL Server are you testing on? I have got index page allocation is presented as split in data page in 2005/2008.

Do you mind if I send you to your DBArt email some results of my research on that? It does not explain the bug, but it is relevant and you may find it interesting to read.

April 25, 2011 8:12 AM
 

Michael Zilberstein said:

2 years ago I wrote about monitoring page splits with Extended Events . Only 2 bloggers explored Extended

April 25, 2011 4:50 PM
 

Michael Zilberstein said:

Pavel,

I tested it now on SQL2008R2. I don't think there is any significant change in behavior between 2005, 2008 and 2008 R2. Anyway, I wrote new post which answers (I hope) all the questions that remained open after this post. Of course, you can send your results too - if you explored the problem deeper, I'll be happy to learn something new.

April 25, 2011 5:26 PM
 

Sribhagat said:

Hi,

I have used this Xevent to track page splits on test server. I have noticed many entries with file_id and page_id as 0. The xevent is not able to extract the sql_text for these page splits entries as well. Can someone please explain whats happening inside the SQL server...

January 13, 2014 1:57 AM

Leave a Comment

(required) 
(required) 
Submit
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement