THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

Service Broker, not ETL

I have been very quiet on this blog of late and one reason for that is I have been very busy on a client project that I would like to talk about a little here.

The client that I have been working for has a website that runs on a distributed architecture utilising a messaging infrastructure for communication between different endpoints. My brief was to build a system that could consume these messages and produce analytical information in near-real-time. More specifically I basically had to deliver a data warehouse however it was the real-time aspect of the project that really intrigued me.

This real-time requirement meant that using an Extract transformation, Load (ETL) tool was out of the question and so I had no choice but to write T-SQL code (i.e. stored-procedures) to process the incoming messages and load the data into the data warehouse. This concerned me though – I had no way to control the rate at which data would arrive into the system yet we were going to have end-users querying the system at the same time that those messages were arriving; the potential for contention in such a scenario was pretty high and and was something I wanted to minimise as much as possible. Moreover I did not want the processing of data inside the data warehouse to have any impact on the customer-facing website. As you have probably guessed from the title of this blog post this is where Service Broker stepped in!

For those that have not heard of it Service Broker is a queuing technology that has been built into SQL Server since SQL Server 2005. It provides a number of features however the one that was of interest to me was the fact that it facilitates asynchronous data processing which, in layman’s terms, means the ability to process some data without requiring the system that supplied the data having to wait for the response. That was a crucial feature because on this project the customer-facing website (in effect an OLTP system) would be calling one of our stored procedures with each message – we did not want to cause the OLTP system to wait on us every time we processed one of those messages. This asynchronous nature also helps to alleviate the contention problem because the asynchronous processing activity is handled just like any other task in the database engine and hence can wait on another task (such as an end-user query).

Service Broker it was then! The stored procedure called by the OLTP system would simply put the message onto a queue and we would use a feature called activation to pick each message off the queue in turn and process it into the warehouse. At the time of writing the system is not yet up to full capacity but so far everything seems to be working OK (touch wood) and crucially our users are seeing data in near-real-time. By near-real-time I am talking about latencies of a few minutes at most and to someone like me who is used to building systems that have overnight latencies that is a huge step forward!

So then, am I advocating that you all go out and dump your ETL tools? Of course not, no! What this project has taught me though is that in certain scenarios there may be better ways to implement a data warehouse system then the traditional “load data in overnight” approach that we are all used to. Moreover I have really enjoyed getting to grips with a new technology and even if you don’t want to use Service Broker you might want to consider asynchronous messaging architectures for your BI/data warehousing solutions in the future.

This has been a very high level overview of my use of Service Broker and I have deliberately left out much of the minutiae of what has been a very challenging implementation. Nonetheless I hope I have caused you to reflect upon your own approaches to BI and question whether other approaches may be more tenable. All comments and questions gratefully received!

Lastly, if you have never used Service Broker before and want to kick the tyres I have provided below a very simple “Service Broker Hello World” script that will create all of the objects required to facilitate Service Broker communications and then send the message “Hello World” from one place to anther! This doesn’t represent a “proper” implementation per se because it doesn’t close down down conversation objects (which you should always do in a real-world scenario) but its enough to demonstrate the capabilities!

@Jamiet

-----------------------------------------------------------------------------------------------

/*This is a basic Service Broker Hello World app. Have fun!
-Jamie
*/

USE MASTER
GO
CREATE DATABASE SBTest
GO
--Turn Service Broker on!
ALTER DATABASE SBTest SET ENABLE_BROKER
GO
USE SBTest
GO
-- 1) we need to create a message type. Note that our message type is
-- very simple and allowed any type of content
CREATE MESSAGE TYPE HelloMessage
VALIDATION = NONE
GO

-- 2) Once the message type has been created, we need to create a contract
-- that specifies who can send what types of messages
CREATE CONTRACT HelloContract
(HelloMessage SENT BY INITIATOR)
GO
--We can query the metadata of the objects we just created
SELECT * FROM   sys.service_message_types WHERE name = 'HelloMessage';
SELECT * FROM   sys.service_contracts WHERE name = 'HelloContract';
SELECT * FROM   sys.service_contract_message_usages
WHERE  service_contract_id IN (SELECT service_contract_id FROM sys.service_contracts WHERE name = 'HelloContract')
AND       
message_type_id IN (SELECT message_type_id FROM sys.service_message_types WHERE name = 'HelloMessage');

-- 3) The communication is between two endpoints. Thus, we need two queues to
-- hold messages
CREATE QUEUE SenderQueue
CREATE QUEUE ReceiverQueue
GO
--more querying metatda
SELECT * FROM sys.service_queues WHERE name IN ('SenderQueue','ReceiverQueue');
--we can also select from the queues as if they were tables
SELECT * FROM SenderQueue  
SELECT * FROM ReceiverQueue  

-- 4) Create the required services and bind them to be above created queues
CREATE SERVICE Sender
 
ON QUEUE SenderQueue
CREATE SERVICE Receiver
 
ON QUEUE ReceiverQueue (HelloContract)
GO
--more querying metadata
SELECT * FROM sys.services WHERE name IN ('Receiver','Sender');

-- 5) At this point, we can begin the conversation between the two services by
-- sending messages
DECLARE @conversationHandle UNIQUEIDENTIFIER
DECLARE
@message NVARCHAR(100)

BEGIN
  BEGIN TRANSACTION
;
 
BEGIN DIALOG @conversationHandle
       
FROM SERVICE Sender
       
TO SERVICE 'Receiver'
       
ON CONTRACT HelloContract WITH ENCRYPTION=OFF
 
-- Send a message on the conversation
 
SET @message = N'Hello, World';
 
SEND  ON CONVERSATION @conversationHandle
       
MESSAGE TYPE HelloMessage (@message)
 
COMMIT TRANSACTION
END
GO
--check contents of queues
SELECT * FROM SenderQueue  
SELECT * FROM ReceiverQueue  
GO
-- Receive a message from the queue
RECEIVE CONVERT(NVARCHAR(MAX), message_body) AS MESSAGE
FROM
ReceiverQueue
GO
--If no messages were received and/or you can't see anything on the queues you may wish to check the following for clues:
SELECT * FROM sys.transmission_queue

-- Cleanup
DROP SERVICE Sender
DROP SERVICE Receiver
DROP QUEUE SenderQueue
DROP QUEUE ReceiverQueue
DROP CONTRACT HelloContract
DROP MESSAGE TYPE HelloMessage
GO

USE MASTER
GO
DROP DATABASE SBTest
GO

Published Tuesday, June 14, 2011 7:58 PM by jamiet

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Víctor M said:

Hi Jamiet.. once again a great article. Keep sharing your experience with us.

Regards!

June 14, 2011 2:24 PM
 

Matt Masson said:

Jeff Bernhardt did a great presentation about the different Data Integration technologies in the Microsoft stack, and when to make use of them.

http://channel9.msdn.com/Events/TechEd/NorthAmerica/2010/BIE202

He talks about using Service Broker around minute 23...

June 14, 2011 2:34 PM
 

suhail ali said:

Great article.  I'm curious if you did all the typical DW logic all in the stored procedure called by  the service broker.  This would include things like looking up surrogate key, handling late or in your just-in-time dimension processing, data cleaning and so forth.  I would think some type of service broker to SSIS integration would be ideal for a real-time DW solution.

June 14, 2011 2:39 PM
 

jamiet said:

Hi Suhail,

Good question. In short, no, all that the activation stored proc does is take a message off the queue and pass it onto a second stored proc that determines what type of message it is. Depending on the message type it gets passed to a stored proc that handles that message type (s we have one stored proc per message type). In short, everything is split into many stored procs.

Regards

jamie

June 14, 2011 4:37 PM
 

Davide Mauri said:

I just finished a consulting work where I talk of the very same thing. Service Broker is simply perfect when you have to do real-time integration and be sure that you can scale out *a lot*.

I also add that using it in conjuction with SQLCLR allows to create killer-applications :)

June 15, 2011 3:48 AM
 

Claire said:

Agreed re: using asynchronous messaging--it's much safer.

@Davide Isn't CLR the only way to make a trigger asynchronous?

June 15, 2011 10:10 AM
 

jamiet said:

Hiya Claire,

Interesting that you say its "safer". That's not a word I used - why do you say that?

Just interested to know, that's all.

JT

June 15, 2011 10:15 AM
 

Claire said:

Hi Jamie,

I think of async as safer because it lets your systems distribute processing more evenly, so your server isn't over-taxed at one point and under-utilized at another. Also, even if the system of the record app fails, everyone using that data is safe because it's also cached locally.

But if you think otherwise, please let me know!

June 15, 2011 10:33 AM
 

jamiet said:

Hey Claire,

Nope, I agree with everything you said there. I suspected that's what you meant - just wanted you to clarify that's all!

JT

June 15, 2011 11:00 AM
 

Davide Mauri said:

@Claire

When I need to do async jobs in a trigger I just use the Service Broker. The rare case when I need to call a Webservice, to geodecode data for example, I just use this technique. Is very easy to implement, quick and also easy to maintain.

You can probably do something even more powerful with SQLCLR but you really need to do some very specific to do that, otherwise the Service Broker is just the way to go, IMHO.

June 15, 2011 5:49 PM
 

Simon Munro said:

Welcome to the dark side Jamie. I have been looking forward to service broker nuggets instead of SSIS nuggets

June 17, 2011 6:56 AM
 

Rich said:

Nice stuff Jamie. I've just been "kicking the tyres" of SB over the past couple of months and have been pretty impressed with it. Similarly, its the async side of things that caught my eye and i've been meaning to blog about my simple implementation. Looks like I may not have to :-)

June 20, 2011 11:59 AM
 

Dave P said:

Isn't this what StreamInsight is for?

July 3, 2011 1:20 AM
 

jamiet said:

Dave P,

"Isn't this what StreamInsight is for?"

I presume you mean real-time BI/reporting/analytics. yes, it is, but we didn't use it for 2 reasons:

1) Unfamiliarity. We had a short time window in which to implement this and StreamInsight was considered too much of a risk given we had no experience of using it.

2) Historial reporting was also required - hence however we did this some form of data warehouse would be required somewhere.

3) Although i said "real-time", a small amount of latency was acceptable - we didn't require sub-second, or even sub-minute, latency.

Regards

Jamie

July 3, 2011 8:18 PM
 

Khalid Flynn said:

Cheers for the insight Jamie (no StreamInsight pun I promise!), and the simple example.

July 7, 2011 5:39 PM
 

Roddy said:

Hi Jamie,

I've been following your posts for a while now Jamie and I have to thank you for all the info you provide to the community. After reading your post about Queues I started playing with them and the first thing I tried was the SSIS task "Message Queue". In a (very) basic test scenario seems to work ok. In the analysis phase, did you think about consuming the Queue with SSIS (e.g. Message Queue task inside an infinite loop)?

Many thanks

Roddy

July 11, 2011 7:23 AM
 

jamiet said:

Hi Roddy,

We didn't consider that, no. For our purposes SSIS was considered too "heavyweight". What I mean by that is; there is a lot of extra stuff around the fringes with SSIS that is good in a batch scenario but not in the message-oriented scenario in which we found ourselves. Plus, running something in an infinite loop rather than a ub/sub model like Service Broker activation doesn't strike me as being particularly efficient.

Regards

Jamie

July 11, 2011 8:11 AM
 

Roddy said:

Hi Jamie,

Thanks for your reply. I understand what you mean by heavyweight and I agree. By Infinite loop I mean the Message Task will be waiting for new messages to arrive, once a new message arrives it processes it and then waits for the next to arrive, it wouldn't be doing anything else (apart from being in memory of course) but as you say it is not the most efficient way of doing it.

As always I guess it depends on what you need to do and measure Simplicity/Performance/Maintainability etc.

Many thanks

Roddy

July 11, 2011 9:06 AM
 

Miron said:

CDC combined with indexed views and well tuned timing of delta grooming supports near real data warehouse very nicely.

May 7, 2013 8:55 AM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement