Contention is the word that you need to concentrate in resolving the performance issue, page latches contention is common in multi-CPU systems due to the nature of when multiple threads concurrently attempt to acquire incompatible latches to the same in-memory structure.
Going back to basiscs a latch is an internal mechanism that is used by SQL DB engine automaticatlly to determine when to use it. By design they are deterministic and any behavior of schema design can affect the issue.One of the Technet article clearly refers about how SQL Server uses the latches:
- A page in SQL Server is 8KB and can store multiple rows. To increase concurrency and performance, buffer latches are held only for the duration of the physical operation on the page, unlike locks which are held for the duration of the logical transaction.
In general there is a mis-representation that latches means locks, but not both are different to each other. Latch is used to provide consistency of in-memory structures, this means they are automatic on the SQL Server when multi-CPU system is used to allow maxium concurrency and provide maximum performance. By design they are held only for the duration of the physical operation on the database.To know more about how the system uses the latches the two DMVs: sys.dm_os_latch_stats about non-buffer latch waits & sys.dm_os_latch_stats about aggregated waits for each index are helpful, the later is helpful for troubleshooting latch related performance issues.
Recently I had a discussion with one of CSS engineers about a performance problem that is related to latches issue at a customer site, and the explanation that I recieved from that engineer was as follows: In a heavy OLTP system with high concurrency workload you will observer many number of page splits with throughput that will certainly leads to performance degradation. Since SQL 2005 SuperLatches have been introducted which can enable increased performance for accessing the shared pages in the database for a multiple concurrency workload which intrun requires worker threads require SH (shared) latches. No need to set or enable any configuration this is performed automatically/dynamically based on the mutli-CPU configuration to promote a latch on a page to be a super-latch. Then this SuperLatch partitions the single latch into array of sublatch structure per CPU core.
To see more information and resource information on Latches invoke PERFMON *SYSMON tool to look at the SQL Server:Latches object with associated counters for IO, CPU and memory to gather information about SuperLatches, including the number of SuperLatches, SuperLatch promotions per second, and SuperLatch demotions per second. Read this SQL Server:Latches object documentation and also more about SuperLatches refer to How It Works: SQL Server SuperLatching / Sub-latches documentation. Also we can use sys.dm_os_wait_stats
DMV to obtain information on Buffer, Non-buffer and IO latches information with a read about Analyzing I/O Characteristics and Sizing Storage Systems reference.
The required tools that I use to diagnose the latch related issues are PERFMON (SYSMON), DMVs and SQL Server error log dumps (get RML utilities too). Use this query to find waiting tasks:
oswt.session_id, oswt.wait_type, er.last_wait_type AS last_wait_type, oswt.wait_duration_ms , oswt.blocking_session_id, oswt.blocking_exec_context_id, resource_description FROM sys.dm_os_waiting_tasks oswt JOIN sys.dm_exec_sessions es ON oswt.session_id = es.session_id JOIN sys.dm_exec_requests er ON oswt.session_id = er.session_id WHERE es.is_user_process = 1
Analyze current wait buffer latches for both PAGEIOLATCH_ or PAGELATCH_ types:
oswt.session_id, oswt.wait_type , er.last_wait_type AS last_wait_type , oswt.wait_duration_ms , oswt.blocking_session_id, oswt.blocking_exec_context_id, resource_description FROM sys.dm_os_waiting_tasks oswt JOIN sys.dm_exec_sessions es ON oswt.session_id = es.session_id JOIN sys.dm_exec_requests er ON oswt.session_id = er.session_id WHERE es.is_user_process = 1 AND oswt.wait_type <> ‘SLEEP_TASK’ ORDER BY oswt.wait_duration_ms
Later I will update the scenario in resolving the issues for a particular type of queries that are generated for a workload.
Satya Shyam K Jayanty