THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

The Rambling DBA: Jonathan Kehayias

The random ramblings and rantings of frazzled SQL Server DBA

Parsing the sqlserver.sql_text Action in Extended Events by Offsets

A couple of weeks back I received an email from a member of the community who was reading the XEvent a Day blog series and had a couple of interesting questions about Extended Events.  This person had created an Event Session that captured the sqlserver.sql_statement_completed and sqlserver.sql_statement_starting Events and wanted to know how to do a correlation between the related Events so that the offset information from the starting Event could be used to find the statement of the completed Event to know the specific statement and its reads, cpu and write information from the sqlserver.sql_statement_completed Event.  Confused yet?  Since it doesn’t really read well what the intent was, lets look at the columns that are output by both of these Events, and it will make more sense.  I am going to break this post into two sections, SQL Server 2008 and SQL Server Denali, because there are significant differences between the two with regard to Extended Events and specifically these two events.

SQL Server 2008

In SQL Server 2008, the sqlserver.sql_statement_starting Event has the following output columns:

select name, column_id, type_name
from sys.dm_xe_object_columns
where object_name = 'sql_statement_starting'
  and column_type <> 'readonly'

image

The sqlserver.sql_statement_completed Event has the following output columns:

select name, column_id, type_name
from sys.dm_xe_object_columns
where object_name = 'sql_statement_completed'
  and column_type <> 'readonly'

image

The intent as I understood it was to use the offset and offset_end from the sqlserver.sql_statement_starting Event with the information in the sqlserver.sql_statement_completed Event and I could only assume the sqlserver.sql_text Action.  When I got the email I replied back to use the sqlserver.tsql_stack Action to get the offset information with the sqlserver.sql_statement_completed Event and use that to parse the statement from the sqlserver.sql_text Action.  I didn’t have time to play with it at the moment, but I promised to write a blog post showing how to do it later that day. 

When I got a free minute where I could write a demo to blog about for this I ran into some problems with the results that I got from parsing the sql_text data that was returned by Extended Events.  To show this, lets create an Event Session and run a test multi-statement batch so that we have to some sample Event Data to do processing on.

IF EXISTS (SELECT 1 FROM sys.dm_xe_sessions WHERE name = 'SQLStmtEvents')
    DROP EVENT SESSION SQLStmtEvents ON SERVER
GO
-- Create our Event Session for current Session_ID
DECLARE @SqlCmd NVARCHAR(MAX) = N'
CREATE EVENT SESSION SQLStmtEvents
ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(    ACTION (sqlserver.sql_text, sqlserver.tsql_stack)
    WHERE (sqlserver.session_id = '+CAST(@@SPID AS NVARCHAR(4))+')    ),
ADD EVENT sqlserver.sql_statement_starting
(    ACTION (sqlserver.sql_text, sqlserver.tsql_stack)
    WHERE (sqlserver.session_id = '+CAST(@@SPID AS NVARCHAR(4))+')    )
ADD target package0.ring_buffer
WITH (MAX_DISPATCH_LATENCY=5SECONDS, TRACK_CAUSALITY=ON)';
EXEC(@SqlCmd);
GO
-- Start the Event Session
ALTER EVENT SESSION SQLStmtEvents
ON SERVER
STATE=START;
GO
-- Run a multi-statement Batch
SELECT * 
FROM sys.objects

SELECT * 
FROM master..spt_values
WHERE type = N'P'

SELECT *
FROM INFORMATION_SCHEMA.TABLES
GO

 

With the Event Session created I did quick check to validate that the offset information held in the tsql_stack matched the offset information returned by the sql_statement_starting Event.  The tsql_stack returns an XML document that is similar to the following:

<frame level="1" handle="0x020000004765D81E643DBDA1E90A8F634B78C042FF805374" line="2" offsetStart="64" offsetEnd="124" />

After validating the offsets matched, I wrote out the parser for the Event Data and sql_text as follows:

-- Extract the Event information from the Event Session 
SELECT 
    event_name, 
    timestamp, 
    cpu, 
    duration, 
    reads, 
    writes, 
    state, 
    ISNULL(offset, tsql_stack.value('(frame/@offsetStart)[1]', 'int')) as offset,
    ISNULL(offset_end, tsql_stack.value('(frame/@offsetEnd)[1]', 'int')) as offset_end,
    ISNULL(nest_level, tsql_stack.value('(frame/@level)[1]', 'int')) as nest_level,
    SUBSTRING(sql_text,    
                (ISNULL(offset, tsql_stack.value('(frame/@offsetStart)[1]', 'int'))/2)+1, 
                ((CASE ISNULL(offset_end, tsql_stack.value('(frame/@offsetEnd)[1]', 'int')) 
                    WHEN -1 THEN DATALENGTH(sql_text) 
                    ELSE ISNULL(offset_end, tsql_stack.value('(frame/@offsetEnd)[1]', 'int')) 
                  END - ISNULL(offset, tsql_stack.value('(frame/@offsetStart)[1]', 'int')))/2) + 1) AS statement_text_Xevents,
    sql_text,
    event_sequence,
    activity_id
FROM
(
    SELECT 
        event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
        DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
        COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
            event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
        event_data.value('(event/action[@name="session_id"]/value)[1]', 'int') AS [session_id],
        event_data.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
        event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
        event_data.value('(event/data[@name="reads"]/value)[1]', 'bigint') AS [reads],
        event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS [writes],
        event_data.value('(event/data[@name="state"]/text)[1]', 'nvarchar(4000)') AS [state],
        event_data.value('(event/data[@name="offset"]/value)[1]', 'int') AS [offset],
        event_data.value('(event/data[@name="offset_end"]/value)[1]', 'int') AS [offset_end],
        event_data.value('(event/data[@name="nest_level"]/value)[1]', 'int') AS [nest_level],
        CAST(event_data.value('(event/action[@name="tsql_stack"]/value)[1]', 'nvarchar(4000)') AS XML) AS [tsql_stack],
        event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS [sql_text],
        event_data.value('(event/data[@name="source_database_id"]/value)[1]', 'int') AS [source_database_id],
        event_data.value('(event/data[@name="object_id"]/value)[1]', 'int') AS [object_id],
        event_data.value('(event/data[@name="object_type"]/text)[1]', 'int') AS [object_type],
        CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) as activity_id,
        CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) as event_sequence
    FROM 
    (   SELECT XEvent.query('.') AS event_data 
        FROM 
        (    -- Cast the target_data to XML 
            SELECT CAST(target_data AS XML) AS TargetData 
            FROM sys.dm_xe_session_targets st 
            JOIN sys.dm_xe_sessions s 
                ON s.address = st.event_session_address 
            WHERE name = 'SQLStmtEvents' 
              AND target_name = 'ring_buffer'
        ) AS Data 
        -- Split out the Event Nodes 
        CROSS APPLY TargetData.nodes ('RingBufferTarget/event') AS XEventData (XEvent)   
    ) AS tab (event_data)
) AS results

If you run these two examples, you will notice immediately that something is not right with the statement_text_XEvents data, it didn’t parse out correctly.

image

I also validated that this same behavior exists in SQL Server Denali CTP1.  To look at this further and determine where exactly the problem might be, I used the plan_handle from the tsql_stack to get the sql text from sys.dm_exec_sql_text() and then offset parse that in the same manner to validate that the offset information is not the problem.

SELECT 
    event_name, 
    tsql_stack.value('(frame/@offsetStart)[1]', 'int') AS offset,
    tsql_stack.value('(frame/@offsetEnd)[1]', 'int') AS offset_end,
    tsql_stack.value('(frame/@level)[1]', 'int') AS nest_level,
    SUBSTRING(sql_text,    
                (tsql_stack.value('(frame/@offsetStart)[1]', 'int')/2)+1, 
                ((CASE tsql_stack.value('(frame/@offsetEnd)[1]', 'int')
                    WHEN -1 THEN DATALENGTH(sql_text) 
                    ELSE tsql_stack.value('(frame/@offsetEnd)[1]', 'int')
                  END - tsql_stack.value('(frame/@offsetStart)[1]', 'int'))/2) + 1) AS statement_text_Xevents,
    SUBSTRING(st.text, (tsql_stack.value('(/frame/@offsetStart)[1]', 'int')/2)+1, 
        ((CASE tsql_stack.value('(/frame/@offsetEnd)[1]', 'int')
          WHEN -1 THEN DATALENGTH(st.text)
         ELSE tsql_stack.value('(/frame/@offsetEnd)[1]', 'int')
         END - tsql_stack.value('(/frame/@offsetStart)[1]', 'int'))/2) + 1) AS statement_text_from_cache, 
         sql_text
FROM
(
    SELECT 
        event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
        CAST(event_data.value('(event/action[@name="tsql_stack"]/value)[1]', 'nvarchar(4000)') AS XML) AS [tsql_stack],
        event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS [sql_text]
    FROM 
    (   SELECT XEvent.query('.') AS event_data 
        FROM 
        (    -- Cast the target_data to XML 
            SELECT CAST(target_data AS XML) AS TargetData 
            FROM sys.dm_xe_session_targets st 
            JOIN sys.dm_xe_sessions s 
                ON s.address = st.event_session_address 
            WHERE name = 'SQLStmtEvents' 
              AND target_name = 'ring_buffer'
        ) AS Data 
        -- Split out the Event Nodes 
        CROSS APPLY TargetData.nodes ('RingBufferTarget/event') AS XEventData (XEvent)   
    ) AS tab (event_data)
) AS results
CROSS APPLY sys.dm_exec_sql_text(tsql_stack.value('xs:hexBinary(substring((/frame/@handle)[1], 3))', 'varbinary(max)')) AS st

image

The information from the cache is correct, so there has to be something different about the sql_text Action output.  To figure out what that was, I changed my script slightly so that it would parse out each character in the text from cache and the sql_text Action to see what was different and where:

-- Extract the Event information from the Event Session 
SELECT 
    sv.number AS Position, 
    ASCII(SUBSTRING(text, sv.number+1, 1)) AS cache_text,
    ASCII(SUBSTRING(sql_text, sv.number+1, 1)) AS sql_text
FROM
(
    SELECT 
        CAST(event_data.value('(event/action[@name="tsql_stack"]/value)[1]', 'nvarchar(4000)') AS XML) AS [tsql_stack],
        event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS [sql_text]
    FROM 
    (   -- Get only one of the events this time
        SELECT TOP 1 XEvent.query('.') AS event_data 
        FROM 
        (    -- Cast the target_data to XML 
            SELECT CAST(target_data AS XML) AS TargetData 
            FROM sys.dm_xe_session_targets st 
            JOIN sys.dm_xe_sessions s 
                ON s.address = st.event_session_address 
            WHERE name = 'SQLStmtEvents' 
              AND target_name = 'ring_buffer'
        ) AS Data 
        -- Split out the Event Nodes 
        CROSS APPLY TargetData.nodes ('RingBufferTarget/event') AS XEventData (XEvent)   
    ) AS tab (event_data)
) AS results
CROSS APPLY sys.dm_exec_sql_text(tsql_stack.value('xs:hexBinary(substring((/frame/@handle)[1], 3))', 'varbinary(max)')) AS st
CROSS JOIN master.dbo.spt_values AS sv
WHERE sv.type = 'P'
  AND sv.number <= DATALENGTH(text)

image

What I found was that the carriage returns are missing from the sql_text Action in Extended Events.  We can work around that by doing a REPLACE(sql_text, CHAR(10), CHAR(13)+CHAR(10)) operation to return the carriage returns to the output.

-- Extract the Event information from the Event Session 
SELECT 
    event_name, 
    timestamp, 
    cpu, 
    duration, 
    reads, 
    writes, 
    state, 
    SUBSTRING(sql_text,    
                (tsql_stack.value('(frame/@offsetStart)[1]', 'int')/2)+1, 
                ((CASE tsql_stack.value('(frame/@offsetEnd)[1]', 'int')
                    WHEN -1 THEN DATALENGTH(sql_text) 
                    ELSE tsql_stack.value('(frame/@offsetEnd)[1]', 'int') 
                  END - tsql_stack.value('(frame/@offsetStart)[1]', 'int'))/2) + 1) AS statement_text_Xevents,
    SUBSTRING(st.text, (tsql_stack.value('(/frame/@offsetStart)[1]', 'int')/2)+1, 
        ((CASE tsql_stack.value('(/frame/@offsetEnd)[1]', 'int')
          WHEN -1 THEN DATALENGTH(st.text)
         ELSE tsql_stack.value('(/frame/@offsetEnd)[1]', 'int')
         END - tsql_stack.value('(/frame/@offsetStart)[1]', 'int'))/2) + 1) AS statement_text_from_cache,
    sql_text,    
    event_sequence,
    activity_id
FROM
(
    SELECT 
        event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
        DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
        COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
            event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
        event_data.value('(event/action[@name="session_id"]/value)[1]', 'int') AS [session_id],
        event_data.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
        event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
        event_data.value('(event/data[@name="reads"]/value)[1]', 'bigint') AS [reads],
        event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS [writes],
        event_data.value('(event/data[@name="state"]/text)[1]', 'nvarchar(4000)') AS [state],
        event_data.value('(event/data[@name="offset"]/value)[1]', 'int') AS [offset],
        event_data.value('(event/data[@name="offset_end"]/value)[1]', 'int') AS [offset_end],
        event_data.value('(event/data[@name="nest_level"]/value)[1]', 'int') AS [nest_level],
        CAST(event_data.value('(event/action[@name="tsql_stack"]/value)[1]', 'nvarchar(4000)') AS XML) AS [tsql_stack],
        REPLACE(event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)'), CHAR(10), CHAR(13)+CHAR(10)) AS [sql_text],
        event_data.value('(event/data[@name="source_database_id"]/value)[1]', 'int') AS [source_database_id],
        event_data.value('(event/data[@name="object_id"]/value)[1]', 'int') AS [object_id],
        event_data.value('(event/data[@name="object_type"]/text)[1]', 'int') AS [object_type],
        CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) as activity_id,
        CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) as event_sequence
    FROM 
    (   SELECT XEvent.query('.') AS event_data 
        FROM 
        (    -- Cast the target_data to XML 
            SELECT CAST(target_data AS XML) AS TargetData 
            FROM sys.dm_xe_session_targets st 
            JOIN sys.dm_xe_sessions s 
                ON s.address = st.event_session_address 
            WHERE name = 'SQLStmtEvents' 
              AND target_name = 'ring_buffer'
        ) AS Data 
        -- Split out the Event Nodes 
        CROSS APPLY TargetData.nodes ('RingBufferTarget/event') AS XEventData (XEvent)   
    ) AS tab (event_data)
) AS results
CROSS APPLY sys.dm_exec_sql_text(tsql_stack.value('xs:hexBinary(substring((/frame/@handle)[1], 3))', 'varbinary(max)')) AS st

image

I filed this as a bug on Connect titled Extended Events sqlserver.sql_text Action is not Offset Parseable after trading a couple of emails with Mike Wachal about this, and he has already replied to the connect item stating that they are looking at the difference between the two operations in the engine. 

SQL Server Denali

I was hesitant to even bother filing this as a bug because in SQL Server Denali CTP1, none of this is necessary to collect the statement, since the statement exists as a customizable column on the Events.

In Denali CTP1, the sql_statement_starting Event has the following columns:

select name, column_id, type_name, column_type
from sys.dm_xe_object_columns
where object_name = 'sql_statement_starting'
  and column_type <> 'readonly'

image

The sql_statement_completed Event has the following columns:

select name, column_id, type_name, column_type
from sys.dm_xe_object_columns
where object_name = 'sql_statement_completed'
  and column_type <> 'readonly'

image

If you compare the columns output by the sql_statement_completed Event in Denali CTP1, you will notice that it has significantly more columns in its base column payload, as well as having the statement as a customizable column.  In Denali CTP1, the only thing necessary to get the specific TSQL statement that is executing is to turn collection of the customizable column statement on using the collect_statement column.

IF EXISTS (SELECT 1 FROM sys.dm_xe_sessions WHERE name = 'SQLStmtEvents')
    DROP EVENT SESSION SQLStmtEvents ON SERVER
GO
DECLARE @SqlCmd NVARCHAR(MAX) = N'
CREATE EVENT SESSION SQLStmtEvents
ON SERVER
ADD EVENT sqlserver.sql_statement_completed
(    SET collect_statement=1
    ACTION (sqlserver.sql_text, sqlserver.tsql_stack)
    WHERE (sqlserver.session_id = '+CAST(@@SPID AS NVARCHAR(4))+')    ),
ADD EVENT sqlserver.sql_statement_starting
(    SET collect_statement=1
    ACTION (sqlserver.sql_text, sqlserver.tsql_stack)
    WHERE (sqlserver.session_id = '+CAST(@@SPID AS NVARCHAR(4))+')    )
ADD target package0.ring_buffer
WITH (MAX_DISPATCH_LATENCY=5SECONDS, TRACK_CAUSALITY=ON)';
EXEC(@SqlCmd);
GO
ALTER EVENT SESSION SQLStmtEvents
ON SERVER
STATE=START;
GO

SELECT * 
FROM sys.objects

SELECT * 
FROM master..spt_values
WHERE type = N'P'

SELECT *
FROM INFORMATION_SCHEMA.TABLES
GO

With this turned on, all we have to do is query the Event Data and parse out the statement column to know the specific statement that was executing.

-- Extract the Event information from the Event Session 
SELECT 
    event_name, 
    timestamp, 
    cpu, 
    duration, 
    reads, 
    writes, 
    state, 
    statement,
    activity_id,
    event_sequence
FROM
(
    SELECT 
        event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
        DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
        event_data.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
        event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
        event_data.value('(event/data[@name="reads"]/value)[1]', 'bigint') AS [reads],
        event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS [writes],
        event_data.value('(event/data[@name="state"]/text)[1]', 'nvarchar(4000)') AS [state],
        event_data.value('(event/data[@name="statement"]/value)[1]', 'nvarchar(max)') AS [statement],
        CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) as activity_id,
        CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) as event_sequence
    FROM 
    (   SELECT XEvent.query('.') AS event_data 
        FROM 
        (    -- Cast the target_data to XML 
            SELECT CAST(target_data AS XML) AS TargetData 
            FROM sys.dm_xe_session_targets st 
            JOIN sys.dm_xe_sessions s 
                ON s.address = st.event_session_address 
            WHERE name = 'SQLStmtEvents' 
              AND target_name = 'ring_buffer'
        ) AS Data 
        -- Split out the Event Nodes 
        CROSS APPLY TargetData.nodes ('RingBufferTarget/event') AS XEventData (XEvent)   
    ) AS tab (event_data)
) AS results
image 

The new statement column in Denali CTP 1 eliminates the need to do offset parsing of the sql_text Action entirely.  Knowing that, it really became clear to me that the addition of the offset information in the Event’s base column payload was really useless, and for that I submitted a different Connect feedback item Extended Events - Remove data from Event output that is available by Actions.  You might ask why does this matter?  It matters because for everytime the Event fires, this information is included in the Event Data for the Event.  Since Event Data is materialized as XML, this becomes needless bloat of the Event Data output for the Event and it really is useless information.  If you really wanted to collect the offset information, you could add the tsql_stack Action to the Events to collect it, so there is no reason for the Event to carry this information by default.

So there you have it.  How to parse the sqlserver.sql_text Action in SQL Server 2008 using offset information to isolate the actual executing statement from a multi-statement batch using a work around, and why its not necessary to do this in SQL Server Denali going forward.

Published Sunday, January 30, 2011 10:05 PM by Jonathan Kehayias

Comments

 

Pankush said:

hi Jonathan,

Thank you for the post and the whole series. It definately helps.

January 30, 2011 10:27 PM
Anonymous comments are disabled

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement