Today I’m going discuss about an issue that I’ve experienced with SQL Server snapshot job in one of our environments . To give you an overview before I start on the actual problem; we have mirroring setup and on mirror server snapshotjob scheduled to run 4 times a day as per client requirement and is used for reporting purpose. Approximate run time isn’t more than ~2 minutes from history but on this occasion it went on running for ~50 minutes and didn’t complete.
I did check that mirroring is in sync, there is no blocking on snapshot job thread and query state is varying between runnable and running so everything seemed to be is fine as per initial look. Now, question was quite obvious that if mirroring is in sync, there are no blockings and the job got successful in 2 minutes (max) or even less time in all previous schedules then what has happened in between which is holding job to complete this time; definitely there is something what we are missing out here.
One thing clicked on my mind i.e. mostly we have our maintenance jobs scheduled on weekends and re-indexing is the one which generates large transaction logs. Just to mention here this database is quite big one in size i.e. around 700 Gigs and I realized though mirroring shown as Synchronized nothing wrong if I can have a look into Unrestored log and Time to restore log on mirroring monitor. Do as you think, so when checked following screenshot was something which I observed;
You can clearly see here that plenty of log records that were to be applied to mirror database and an approximate time to restore log records shown up on mirroring monitor. Unless they are fully applied to mirror database we cannot say it is in SYNC. with principal. As obvious, I did not have any other options than allowing mirror database to catch up with principal and check mirroring monitor to get an idea on Unrestored log and Time to restore log. Once they shown as zero (screenshot below), snapshot job completed successfully.
We can monitor unrestored log etc. through perfmon counter Redo Queue KB and so on however we should be fine with mirroring monitor.
I think learning here is not to run a snapshot job with re-indexing which generates considerable amount of log records and depends on your database size. In case we get to see Unrestored log on mirroring monitor, we should disable snapshot job (if not running) and allow mirror database to catch up with principal. Only when Unrestored log shown as zero we should be fine with running snapshot job to ensure we do not run into such issues.
I’m very keen to hear if anyone of you has experienced similar issues on your environments and how it was resolved.
Database Mirroring and Database Snapshots : http://technet.microsoft.com/en-us/library/ms175511.aspx
SQL Server, Database Mirroring Object : http://technet.microsoft.com/en-us/library/ms189931.aspx