
Linchi Shea

Checking out SQL Server via empirical data points

Transactional replication and massive data updates

SQL Server transactional replication has been a rather solid feature and works well for delivering data to reporting servers (among other things) in near real time. That said, it may not work too well when you need to perform massive updates on a published table, for instance, when you need to archive a large amount of data. It can take a long time for all the changes to be replicated to the subscribers.

For some requirements, an ideal solution would be to temporarily disable transactional replication, perform the large-scale data archiving concurrently on both the publisher and all the subscribers, and then resume replication. That is, this would work very well if you could keep the published table from being modified by any other process for the duration.

But wait! This won’t work because there is no built-in SQL Server feature to disable and then re-enable transactional replication in a way that is completely transparent to data updates made while it is disabled.

A potential workaround is to remove the replication completely before making massive data changes and then re-create it afterwards. Before I actually started using this approach, it sounded to me like a rather dicey way of getting better performance. A number of things could go horribly wrong and ruin your otherwise peaceful weekend:

  • The replication you re-create afterwards may not be identical to what you removed.
  • Your table(s) could get out of sync among the servers before you re-create replication.
  • Coordinating concurrent updates on multiple servers is inherently more complex than just updating the publisher.

But now that I have put this approach into practice and battle-tested it for many months, I can report that it can be a very effective way to gain performance in your archiving process. Here are some tips for achieving that effectiveness:

  • You must have a robust mechanism to quiesce all the processes that update your published data. If your data is updated (by a non-archiving process) on the publisher while replication is removed, your subscribers will be out of sync with the publisher and you must re-sync them with the publisher, which defeats the purpose of removing replication in the first place.
  • Once the data update processes are quiesced, you must verify that the tables are indeed in sync between the publisher and the subscribers. If this check fails, there is no point in proceeding further (a sketch of such a check follows this list).
  • You’ll need a program or a script to generate the T-SQL scripts for deleting and creating the replication. I do this with a C# program via RMO, but a PowerShell script will do just fine. The T-SQL scripts need to be generated immediately before you remove the replication so that any recent change to the replication setup is captured.
    • If you are archiving a single table or a few tables, you can just script out the replication setup for those tables. If, as in many cases, you are archiving a whole database, you should script out all the replication setup for that database.
    • Among other things, your replication script generator should take the name of the table or database as an input and automatically discover all the replication setup that needs to be scripted. This ensures that you don’t need to change the script generator as publications and/or subscriptions are added, dropped, or modified.
    • You’ll also need a program to execute the generated replication delete scripts and another to execute the generated replication create scripts. These programs need to be aware of where a given script must be executed; for instance, a script to create a pull subscription needs to be executed on the subscriber. My solution is to have the replication script generator record this information: as it goes through each publication and pull subscription to generate scripts, it records the generated T-SQL script locations and the target server names in a table. The programs that execute the generated scripts are driven by this table to find each script and the server on which it needs to be executed (a sketch of such a table follows this list).
  • Once the tables are verified to be in sync and the replication is removed (by executing the delete scripts), you can start to archive data concurrently on all the servers. This step is best driven by a common list of what needs to be archived, so that archiving is guaranteed to be done on all the servers against exactly the same data and, at the end of archiving, the tables are guaranteed to still be in sync.
  • Since archiving is done on separate servers, things can obviously go wrong and crash it on one server but not on the others. Regardless of what may go wrong, before you execute the generated T-SQL scripts to re-create the replication, you must verify that:
    • The archiving processes are indeed done on all the servers, and
    • The data is still in sync across all the servers
  • All the subscriptions need to be re-created with no-sync; otherwise, the subscriber will be re-initialized from a snapshot and you will have archived it for nothing. This means that the replication script generator needs to modify the generated T-SQL scripts so that the value of the @sync_type parameter for the sp_addsubscription procedure is set to ‘none’ regardless of how it is currently set (see the example after this list).
  • To ensure that the re-created replication is the same as the original (before it was removed), you can take a snapshot of the syspublications, sysarticles, and syssubscriptions tables before the replication is removed, and another snapshot of the same three tables after the replication is re-created. By comparing the before and after configurations recorded in these three tables, you can determine whether you have changed any replication setup (a sketch follows this list).
  • I would also suggest that you run a test change and watch it get replicated from the publisher to the subscribers. Nothing gives you a more comforting feeling than seeing it in action and seeing that it works.
  • The distribution agents should run under the security context of the SQL Agent service account. Otherwise, because the generated scripts don’t include any password text, the script containing sp_addpushsubscription_agent or sp_addpullsubscription_agent will fail and no distribution agent will be created (see the example after this list).
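
To make some of the steps above concrete, here are a few sketches. Every server, database, and object name in them is illustrative, not from an actual setup. First, the sync check: this minimal version assumes a linked server named SUBSCRIBER1 pointing at a subscriber database ReportDB and a published table dbo.Orders. The tablediff utility is a more thorough (if slower) alternative.

    -- Compare row counts and an aggregate checksum between the publisher
    -- and one subscriber; all names are illustrative.
    DECLARE @pub_count bigint, @sub_count bigint,
            @pub_cksum int, @sub_cksum int;

    SELECT @pub_count = COUNT_BIG(*),
           @pub_cksum = CHECKSUM_AGG(BINARY_CHECKSUM(*))
    FROM dbo.Orders;

    SELECT @sub_count = COUNT_BIG(*),
           @sub_cksum = CHECKSUM_AGG(BINARY_CHECKSUM(*))
    FROM SUBSCRIBER1.ReportDB.dbo.Orders;

    IF @pub_count <> @sub_count OR @pub_cksum <> @sub_cksum
        RAISERROR('dbo.Orders is out of sync; do not proceed.', 16, 1);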
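
Next, a hypothetical shape for the table that drives script execution; the actual columns you need may vary:

    -- One row per generated script; the execution programs walk this table.
    CREATE TABLE dbo.ReplScriptQueue
    (
        ScriptID     int IDENTITY(1,1) PRIMARY KEY,
        ScriptType   varchar(10)   NOT NULL, -- 'delete' or 'create'
        TargetServer sysname       NOT NULL, -- server where the script must run
        ScriptPath   nvarchar(512) NOT NULL, -- file location of the generated T-SQL
        Executed     bit           NOT NULL DEFAULT (0)
    );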
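
The no-sync re-creation of a subscription amounts to a generated call along these lines (publication, subscriber, and database names illustrative):

    -- Re-create a pull subscription without initializing it from a snapshot;
    -- this assumes the tables have already been verified to be in sync.
    EXEC sp_addsubscription
        @publication       = N'MyPublication',
        @subscriber        = N'SUBSCRIBER1',
        @destination_db    = N'ReportDB',
        @subscription_type = N'pull',
        @sync_type         = N'none';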
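
For the before/after comparison of the replication configuration, something like the following (run in the published database) works. Note that identifier columns such as pubid can legitimately change when replication is re-created, so compare only the columns that describe the setup you care about:

    -- Before removing replication:
    SELECT name, repl_freq, status, sync_method
    INTO dbo.syspublications_before
    FROM dbo.syspublications;

    -- After re-creating replication; any rows returned indicate a drift:
    SELECT name, repl_freq, status, sync_method FROM dbo.syspublications_before
    EXCEPT
    SELECT name, repl_freq, status, sync_method FROM dbo.syspublications;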
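
Finally, the reason the scripted agent-creation calls replay cleanly under the SQL Agent service account: when @job_login and @job_password are both NULL, the agent job runs under that account, so the generated script carries no credentials to lose. For example (server and database names illustrative):

    EXEC sp_addpullsubscription_agent
        @publisher    = N'PUBSERVER',
        @publisher_db = N'SalesDB',
        @publication  = N'MyPublication',
        @job_login    = NULL,  -- NULL = run under the SQL Server Agent service account
        @job_password = NULL;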

As mentioned, this approach has proven to be effective in practice, and it is also quite maintainable. Still, it amounts to regular schema changes with a lot of moving parts. It would be much better if replication could be disabled with a single DDL statement, just as you can disable an index or a constraint.

Published Thursday, October 06, 2011 1:19 AM by Linchi Shea


Comments

 

Moham said:

Hi Linchi,

Have you ever considered publishing stored procedure execution?

It is much more helpful when it comes to large batch operations, instead of replicating the individual changes for each row. You can write the archiving code as a stored procedure and publish its execution.
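
For example, something along these lines (publication and procedure names illustrative):

    -- Publish the execution of the archiving proc rather than the row changes;
    -- 'serializable proc exec' replicates the call only when it runs inside
    -- a serializable transaction.
    EXEC sp_addarticle
        @publication   = N'MyPublication',
        @article       = N'usp_ArchiveOrders',
        @source_object = N'usp_ArchiveOrders',
        @type          = N'serializable proc exec';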

The above approach might work, but, as you pointed out, we need to ensure that nothing besides the archiving operation works against the published tables (maybe setting the DB to single_user, or disabling remote connections).

Also, the subscriber validation operation (to ensure rows are in sync) can be slow with huge tables and the checksum option.

The issue with mass table updates is usually the time it takes the Log Reader to post the marked transactions from the database’s transaction log to the distribution database.

In case the articles get out of sync and it's very expensive to re-initialize, I would use the Distribution Agent's "Continue On Data Consistency Errors" profile and sync the tables (tablediff utility or Red Gate).
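
For example, tablediff can both report the differences and generate a T-SQL fix script (server, database, and table names illustrative):

    tablediff -sourceserver PUBSERVER -sourcedatabase SalesDB ^
              -sourcetable Orders ^
              -destinationserver SUBSCRIBER1 -destinationdatabase ReportDB ^
              -destinationtable Orders ^
              -f C:\temp\Orders_fix.sql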

Cheers

October 16, 2011 3:49 AM
 

Linchi Shea said:

Hi Moham;

I did, and as a matter of fact, in one case we were replicating a purge stored procedure. The problem is that it doesn't give us enough granularity of control over the process. And the execution of the proc on the subscriber has to wait until the execution of the proc on the publisher is finished, making the whole purge take twice as long as executing it concurrently on all the servers.

October 17, 2011 3:56 PM
 

John Sansom said:

Interesting stuff.

I often have a need to perform similar Replication admin tasks. My preference, however, is to remove the given article (table) from Replication, perform the required large data modifications, and then push a copy of the table data to the Subscribers using the Import/Export Wizard or SSIS. Naturally, a quiesce exercise/data validation may need to be performed. The article is then added back to the Publication/Subscribers (also using the nosync option), roughly as sketched below. This method has the advantage of enabling Replication to continue for all but the article/table being modified.
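
In outline (publication, article, and server names illustrative, and with the data copy step omitted):

    -- Drop the subscription(s) to the article, then the article itself:
    EXEC sp_dropsubscription
        @publication = N'MyPublication', @article = N'Orders', @subscriber = N'all';
    EXEC sp_droparticle
        @publication = N'MyPublication', @article = N'Orders';

    -- ... perform the large data modification and copy the data out ...

    -- Add the article back and re-subscribe without initialization:
    EXEC sp_addarticle
        @publication = N'MyPublication', @article = N'Orders',
        @source_object = N'Orders';
    EXEC sp_addsubscription
        @publication = N'MyPublication', @article = N'Orders',
        @subscriber = N'SUBSCRIBER1', @destination_db = N'ReportDB',
        @sync_type = N'none';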

I actually believe these types of maintenance tasks could be more common in the wild than we may first think. I would love to see more support within the product for this type of administrative activity. Right now, it's all implemented via the replication stored procedures and a thorough understanding of all the parameters/options is needed.

I also second your advice with regard to testing; it's imperative when it comes to working with Replication.

Thanks for the great post.

October 24, 2011 11:51 AM
 

andrismg said:

Hello Linchi, I have the following scenario in my environment: SQL Server 2005 Enterprise Edition 32-bit, 8GB RAM (6GB dedicated to SQL Server). I also have Database Mirroring implemented without automatic failover (no witness).

I am evaluating the possibility of creating transactional replication because my client needs to have, on their local server, 7 tables from one database and 4 from another database to build their reporting system. My client only needs to 'select' on their side and have the data updated at least every hour.

Would you recommend this type of replication?

Do I need to take into account that the processing load on my server could rise a lot?

Thanks a million Linchi!

October 24, 2011 12:52 PM
