Hey folks! Yes, I am alive, but as you can imagine, I am spending my time chewing on SQL Server 2008 and have limited time to post much. But today, I received an email that I wanted to post on since this is a common question.
"How do I shred an XML document into a table."
The big issue here is that there really is not one best way to do it. If you have a relatively small document and you want a simple solution, what I'm going to show here is a reasonably well suited. However, if you have a large document, or a large number of small documents, this approach is relatively inefficient and you probably should look at using an SSIS, BizTalk or custom developed solution. If you have schematically complex XML, BizTalk or a custom develop solution is probably going to be needed at some point.
Then there is always the question "should you?" My answer is typically yes if your primary concern is query performance under load. The more transactional processing you need to do on the data encapsulated in the document, the more sense shredding makes. Conversely, if there is only limit query of the XML and reassembling the data into an XML format would be expensive, then you are probably better off leaving it "as is."
Following are two example of bringing a small amount of XML into a table using SQL Server 2005 and SQL Server 2008. In the case of SQL Server 2005, the example simply inserts data into the target table without any handling or attempts to update existing matching rows.
use scratch
go
if not OBJECT_ID('dbo.books') is null
drop table dbo.books;
go
create table dbo.books(
id tinyint not null primary key,
sku nvarchar(15) not null,
skuType nvarchar(10) not null,
title nvarchar(100) not null,
listPrice money not null);
go
set nocount on
declare @booklist xml;
select @booklist = bulkcolumn from openRowSet(bulk 'c:\booklist.xml',single_blob) as b;
select @booklist;
-- Yukon-style insert, you could do update similarly.
insert into dbo.books
select t.c.value('./id[1]','tinyint')
, t.c.value('./sku[1]','nvarchar(15)')
, t.c.value('./skuType[1]','nvarchar(10)')
, t.c.value('./title[1]','nvarchar(100)')
, t.c.value('./listPrice[1]','money')
from @booklist.nodes('//book') as t(c);
go
select * from dbo.books;
go
Transact-SQL for SQL Server 2008 adds a very useful new statement called MERGE. This statement allows us synchronize two data sources. While that may not seem to have much use in a case like this, we can actually use it like an "UPSERT" where existing data is update and new data is inserted into a destination table.
declare @booklist xml;
select @booklist = bulkcolumn from openRowSet(bulk 'c:\booklist.xml',single_blob) as b;
with b2(id,sku,skuType,title,listPrice) as (
select t.c.value('./id[1]','tinyint')
, t.c.value('./sku[1]','nvarchar(15)')
, t.c.value('./skuType[1]','nvarchar(10)')
, t.c.value('./title[1]','nvarchar(100)')
, t.c.value('./listPrice[1]','money')
from @booklist.nodes('//book') as t(c))
merge into dbo.books as b1
using b2
on(b1.id = b2.id)
when matched then
update set
b1.sku = b2.sku
, b1.skuType = b2.skuType
, b1.title = b2.title
, b1.listPrice = b2.listPrice
when target not matched then
insert (id,sku,skuType,title,listPrice)
values (b2.id,b2.sku ,b2.skuType,b2.title,b2.listPrice);
go
select * from dbo.books;
go
The syntax of the MERGE statement takes a bit of getting used to, but once you have the hang of it, it's very useful.
Please feel free to send me a message using the link below if you are interested in downloading the example code and data.