Performance_Tuning_Waits_Queues ---2005

Memory pressure and IO subsystem issues

Latencies caused by disk to memory transfers frequently surface as PageIOLatch waits.  Memory pressure or disk IO subsystem issues can also increase PageIOLatch waits. When a user needs a page that is not in buffer cache, SQL Server has to first allocate a buffer page, and then puts a exclusive PageIOLatch_ex latch on the buffer while the page is transferred from disk to cache. Meanwhile, SQL Server puts a PageIOLatch_sh request on the buffer on behalf of the user. After the write to cache finishes, the PageIOLatch_ex latch is released.  This allows the user to read the buffer page after which the PageIOLatch_sh is released.  Consequently, high values for both PageIOLatch_ex and PageIOLatch_sh wait types can indicate IO subsystem issues.

Pertinent performance counters include Physical disk: disk seconds/read and Physical disk: disk seconds/write and SQL Server Buffer Manager: Page Life Expectancy. See counters for more information.

IO Stalls

The table valued dynamic management function, sys.dm_io_virtual_file_stats provides a breakdown of SQL Server reads, writes, and io_stalls for a particular database or transaction log file.  IO_stalls is the total cumulative time, in milliseconds, that users waited for I/O to be completed on the file since the last restart of SQL Server.

·         Select * from sys.dm_io_virtual_file_stats (dbid,file#)

·         Select * from sys.dm_io_virtual_file_stats (dbid,NULL) to list all files for a database.

 

If IO_stalls is inordinately high for one or more files, it is possible that there is either a disk bottleneck or that high reads and writes are occurring on one drive. Average IO Waits per read or write can distinguish between consistently high IO queues or a temporary IO spike. A significantly higher average value for IO stalls on one particular drive indicates consistently high IO requests. This should be corroborated with Performance Monitor counters Physical Disk: Average Disk Seconds/Read and Average Disk Seconds/Write.  The following script can also compute the Average Disk Seconds/Read and Average Disk Seconds/Write using sys.dm_io_virtual_file_stats.

---- average stalls per read, write and total

---- adding 1.0 to avoid division by zero errors

select database_id, file_id

    ,io_stall_read_ms

    ,num_of_reads

    ,cast(io_stall_read_ms/(1.0+num_of_reads) as numeric(10,1)) as 'avg_read_stall_ms'

    ,io_stall_write_ms

    ,num_of_writes

    ,cast(io_stall_write_ms/(1.0+num_of_writes) as numeric(10,1)) as 'avg_write_stall_ms'

    ,io_stall_read_ms + io_stall_write_ms as io_stalls

    ,num_of_reads + num_of_writes as total_io

    ,cast((io_stall_read_ms+io_stall_write_ms)/(1.0+num_of_reads + num_of_writes) as numeric(10,1)) as 'avg_io_stall_ms'

from sys.dm_io_virtual_file_stats(null,null)

order by avg_io_stall_ms desc

Missing or poorly formed indexes

Missing or poorly formed indexes can also cause excessive memory pressure or cache flushes. In some cases, SQL Server 2005 optimizer identifies potentially useful indexes to benefit a specific query (figure 1). The computed benefit of the index can be seen in the column avg_user_impact (percentage improvement with suggested index). It should be noted that this benefit applies to the individual query only where the maintenance cost is borne by inserts, updates, and delete operations.

The following is a list of useful indexes.

 

-- Potentially Useful Indexes

select d.*

        , s.avg_total_user_cost

        , s.avg_user_impact

        , s.last_user_seek

        ,s.unique_compiles

from sys.dm_db_missing_index_group_stats s

        ,sys.dm_db_missing_index_groups g

        ,sys.dm_db_missing_index_details d

where s.group_handle = g.index_group_handle

and d.index_handle = g.index_handle

order by s.avg_user_impact desc

go

--- suggested index columns and usage

declare @handle int

 

select @handle = d.index_handle

from sys.dm_db_missing_index_group_stats s

        ,sys.dm_db_missing_index_groups g

        ,sys.dm_db_missing_index_details d

where s.group_handle = g.index_group_handle

and d.index_handle = g.index_handle

 

select *

from sys.dm_db_missing_index_columns(@handle)

order by column_id

Largest IO queries.

The underlying purpose of an index suggested by sys.dm_db_missing_index_columns, is to avoid doing large amounts of IO for the query in question. Therefore, you can expect such queries to rank among the highest IO queries. To find the highest IO queries, you can use the following sample code:

--- top 50 statements by IO

SELECT TOP 50

        (qs.total_logical_reads + qs.total_logical_writes) /qs.execution_count as [Avg IO],

        substring (qt.text,qs.statement_start_offset/2,

         (case when qs.statement_end_offset = -1

        then len(convert(nvarchar(max), qt.text)) * 2

        else qs.statement_end_offset end -    qs.statement_start_offset)/2)

        as query_text,

    qt.dbid,

    qt.objectid

FROM sys.dm_exec_query_stats qs

cross apply sys.dm_exec_sql_text (qs.sql_handle) as qt

ORDER BY [Avg IO] DESC

Query optimizer, query plans and statistics

The SQL Server 2005 Query Optimizer compiles a query plan for a user’s query.  It is the job of the SQL Server Query Optimizer to determine the lowest cost strategy that will be used to retrieve or modify the data.  The query plan contains the strategy or series of steps to be executed in the query plan.

 

For more information about SQL Server 2005 Query Optimizer, Query plans, and Statistics refer to http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx.

Query plan reuse and DMVs

High query plan reuse is important for OLTP applications where there are many identical transactions. The advantage of plan reuse means you will not incur the CPU cost of optimization for each execution of the same plan. The statements with the lowest plan reuse can be found using DMVs as follows:

--- DMV reports statements with lowest plan reuse

---

SELECT TOP 50

        qs.sql_handle

        ,qs.plan_handle

        ,cp.cacheobjtype

        ,cp.usecounts

        ,cp.size_in_bytes 

        ,qs.statement_start_offset

        ,qs.statement_end_offset

        ,qt.dbid

        ,qt.objectid

        ,qt.text

        ,SUBSTRING(qt.text,qs.statement_start_offset/2,

             (case when qs.statement_end_offset = -1

            then len(convert(nvarchar(max), qt.text)) * 2

            else qs.statement_end_offset end -qs.statement_start_offset)/2)

        as statement

FROM sys.dm_exec_query_stats qs

cross apply sys.dm_exec_sql_text(qs.sql_handle) as qt

inner join sys.dm_exec_cached_plans as cp on qs.plan_handle=cp.plan_handle

where cp.plan_handle=qs.plan_handle

and qt.dbid = db_id()    ----- put the database ID here

ORDER BY [Usecounts] ASC

Query plan reuse and Performance counters

The Perfmon object SQL Server:SQL Statistics contains counters that can be used to compute plan reuse. The idea is to compare batch requests to initial compilations. An initial compilation occurs when a plan is not found in cache. OLTP applications should have high plan reuse, > 90%.

Initial Compilations = SQL Compilations/sec – SQL Re-Compilations/sec

Plan reuse = (Batch requests/sec – Initial Compilations/sec) / Batch requests/sec

Memory pressure can cause query plans to be discarded and therefore result in reduced plan reuse. See OLTP Blueprints for memory pressure.

Statement level recompilation and DMVs

In SQL Server 2005, individual statements inside a stored procedure can be recompiled. The recompilation feature can be beneficial. For example, assume that you have a stored procedure that creates a table called MyTable, populates the table, and then joins MyTable to other tables. Because these operations occur AFTER the initial compile, the final row size and rowcount of MyTable is not known until run time. It would be possible that when MyTable is joined, it can contain 1 million rows. SQL Server tracks the statistics of MyTable and recompiles the join statement to take advantage of the new statistics for MyTable. For more information about SQL Server 2005 Optimizer and Statistics refer to http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx

Recompilation is not always a good idea; for example, when the recomplied plan is the same as the orginal plan. In these cases, you will want to identify the recompiled statements. For more information about SQL Server 2005 recompilation refer to http://www.microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx.

The following script returns statements that have been recompiled.

---- Recompilation and SQL.sql

----     (plan_generation_num) and sql statements

---- A statement has been recompiled WHEN the plan generation number is incremented

----

select top 25

    --sql_text.text,

    sql_handle,

    plan_generation_num,

    substring(text,qs.statement_start_offset/2,

             (case when qs.statement_end_offset = -1

            then len(convert(nvarchar(max), text)) * 2

            else qs.statement_end_offset end - qs.statement_start_offset)/2)

        as stmt_executing,

    execution_count,

    dbid,

    objectid

from sys.dm_exec_query_stats as qs

    Cross apply sys.dm_exec_sql_text(sql_handle) sql_text

where plan_generation_num >1

order by sql_handle, plan_generation_num

Tracking down blocking issues.

Long blocks can be trapped with Profiler and sp_configure. The blocked process threshold, set with sp_configure “blocked process threshold”, is the mechanism for reporting any blocks that exceed this configured number of seconds. Make sure that the threshold is not set too low as it can capture false positives. After setting the blocked process threshold using sp_configure, Profiler is then used to capture the blocker and blocked statements using the Errors and Warnings object, Blocked Process Report event.

In order to see the main objects of blocking contention, the following code lists the table and index with most blocks:

----Find Row lock waits

declare @dbid int

select @dbid = db_id()

Select dbid=database_id, objectname=object_name(s.object_id)

, indexname=i.name, i.index_id      --, partition_number

, row_lock_count, row_lock_wait_count

, [block %]=cast (100.0 * row_lock_wait_count / (1 + row_lock_count) as numeric(15,2))

, row_lock_wait_in_ms

, [avg row lock waits in ms]=cast (1.0 * row_lock_wait_in_ms / (1 + row_lock_wait_count) as numeric(15,2))

from sys.dm_db_index_operational_stats (@dbid, NULL, NULL, NULL) s,     sys.indexes i

where objectproperty(s.object_id,'IsUserTable') = 1

and i.object_id = s.object_id

and i.index_id = s.index_id

order by row_lock_wait_count desc

 

Notice the average block time reported in the above script is in milliseconds. You must convert the average block time to seconds in order to set the ‘blocked process threshold’ with sp_configure.  This should give you a hint on how to set the sp_configure ‘blocked process threshold’ if you are not sure where to start. Remember not to set the blocked process threshold (seconds) too low as this will generate false positives.  The blocked process threshold fires a trace event (Blocked Process Report) for any block that exceeds the configured number of seconds.

Retrieving statements in the waiter list

The stored procedure get_statements_in_waiter_list reports statements in the waiter list that matches the optional parameter @wait_list. If @wait_list is NULL, it lists ALL statements in the waiter list. It can be run at any time to capture statements waiting in the waiter list. For example, you can list any statement waiting for parallelism as follows:

Exec get_statements_in_waiter_list @wait_type = ‘CXPACKET’