SQL Server Fine – Tuning Transactional Replication Performance – Part 2

Hope you have liked the first part of SQL Server fine – tuning transactional replication performance. In case you have not been through it you can check it here .

In this part we will talk about improving distribution agent throughput by modifying the profile parameters .it’s important to note that the changes that you make in the log reader agent might increase latency from distributor to subscriber if you leave the distribution agent unchanged .This is because if you increase the log reader throughput only, you will be hitting the distribution database with more data.

I would suggest that before making changes in the agent profiles, always check for other bottlenecks like IO contention, Memory and CPU bottlenecks, fragmented indexes, stale statistics etc . Without removing these bottlenecks ,you won’t be able to leverage much out of profile parameter changes .In fact you might degrade performance more .

As I stated in my previous blog, all of these setting have been practiced by me on large enterprise servers where the databases are between many GBs to TBs (largest being 7TB).

-CommitBatchSize : Like –ReadBatchSize , -CommitBatchSize is the number of transactions carried out to the subscriber in one batch before a commit is fired. Increasing this value means committing more in one batch .Unlike the Log Reader agent, a publication has a seperate distribution agent for each Subscriber .So , Distribution agent might have to distribute more than the log reader agent in case same article is distributed to more than one subscriber .So this setting is to be changed for all the distribution agents you want to work more .The default is 100 and I would advise you to increase it with 10 at a time .You will see the entry like the one below in the verbose log :

100 transaction(s) with 1000 command(s) were delivered.

Once you reach the level of breaking the distribution agent capacity you will start getting the below error as a signal that you should stop now and the distribution agent is (or agents are) bleeding.

The replication agent has not logged a progress message in %ld minutes. This might indicate an unresponsive agent or high system activity. Verify that records are being replicated to the destination and that connections to the Subscriber, Publisher, and Distributor are still active.

For more information refer: http://technet.microsoft.com/en-us/library/ms152484.aspx 

   

-CommitBatchThreshold : This setting is the partner of –ReadBatchThreshold setting in log reader agent. I always prefer to increase this parameter first .This parameter affects the number of commands to be picked up in one transaction (in one single commit) at the subscriber .The default is 1000 for each agent and I would increase it by 100 in one go .Again, stop incrementing once you see that the agent is showing stress (see above example) .Verbose log will show you this entry :100 transaction(s) with 1000 command(s) were delivered.

 -SubscriptionStreams : This setting is a boon as well as well as curse .However, I treat is mostly a boon .This setting breaks the batch in to multiple parallel connections (threads) with each connection taking a part of batch. This keeps the subscriber database transactionally inconsistent with publisher until the complete transaction reaches the subscriber .This setting immensely helpful for the environments with very high transactions per seconds and faster SANs /Networks. However , if a single stream fails , everything rolls back and starts afresh in single stream .You will have to again set it back to multiple stream .This will cause more latency for some time because there are new transactions that are flowing in Plus the transactions that have rolled back + Plus the time till the agent was delivering in single stream mode .For more information please refer : http://support.microsoft.com/kb/953199 .Mostly the reason is long term blocking .

On busy systems you might see short burst of blocking due to subscriptionStreaming but you can ignore it as the waittime is very less .In case the waitime is more , then there are other resource bottlenecks (most probably hardware bottlenecks) .The values can be between 0 and 64 .I would prefer using half of the cores initially . Also the order in which transactions are received at the Subscriber may differ from the order in which they were made at the Publisher. This might create some problem .Please refer this article before implementing subscription stream: http://technet.microsoft.com/en-us/library/ms147373.aspx .

This is an advanced setting and hence I would suggest you to do a thorough testing before rolling this to production.

Happy Learning !
Regards

Abhay Chaudhary

Like us on FaceBook Follow us on Twitter

Follow me on FaceBook | Join the fastest growing SQL Server group on FaceBook

   

2 Comments on “SQL Server Fine – Tuning Transactional Replication Performance – Part 2”

  1. Hi,

    An excellent post sir.

    I’m wondering if perhaps the following sentance: “Unlike Log Reader agent, every publication has one distribution agent .” should instead read something like: “Unlike the Log Reader agent, a publication has a seperate distribution agent for each Subscriber.” ?

Leave a Reply

Your email address will not be published.