Troubleshooting performance issues in the heat of the moment when you have managers, engineers, developers, and the occasional stray C-level executive panting anxiously down the back of your neck as they sweat dollar bills is a high-pressure situation and as a result two things are often possible:

  1.  Incorrect decisions are made as a result of poor diagnostic data at your disposal.
  2. You miss collecting all the diagnostic information you need to provide that well-informed recommendation.

I recently encountered a situation where that came to light while working with a client's DBA.  While working on an unrelated issue for them I was sent a message that amounted to the following: "Oh my! Look at this query that is producing over two billion reads against the database!" 

Folks, that will get your attention.

I looked at the query results they sent over and the metadata looked a bit suspect as well as incomplete.  The query was indeed producing those level of reads but those were over the course of time since the DMV metadata was last cleared or the query plan dropped out of cache - whichever happened last.  It's not that each time the query was executed it wa producing two billion reads but rather over the course of its "lifespan" it was producing that level of activity.  Furthermore there were not enough metrics in the supplied result set to provide any insight - there was no associated execution plan xml nor was there execution count metadata (besides other desirable metrics) that could provide an informed decision.  When I raised this issue to the DBA I was then given another set of results that gave me the query plan and executing statements.  However when I started to dig into the issue to get some idea of what was going on the plan was not showing me where the query plan was abusing the buffer pool and disk by requesting these pages.  That's when I pulled out my own comprehensive query and ran it against the instance.  It gave me a completely different story (and target) to focus on.

The Right Tools for the Job

It's important that the DBA - like any other profession - has the right tools (scripts) for the job.  The DBA and I were making incorrect decisions because of a tool issue - a bad set of scripts that had us looking the the wrong direction.  I told the DBA we needed to take a step back and use a tool I've kept in my toolbelt for a while now and have provided below.  It's actually two queries based upon two different Dynamic Management Views at their core: one targeting query execution statistics (which may include both ad hoc and prepared statement execution activity) and the other that isolated execution metrics for only stored procedures.  By using the two scripts appropriately you get a comprehensive view of queries and stored procedures that are causing the most amount of issues on your instance whether those issues are:

  • High reads
  • High writes
  • prolonged execution times
  • High CPU

These two scripts are customizable for each situation by simply de-commenting the appropriate ORDER BY clause for your needs.  Additionally you can de-comment the WHERE clause in either script and then replace the template parameter to filter on database name if you want to narrow your focus from the instance level of granularity to that of the database scope.  In each query you'll see returned all statistics pertaining to:

  • Descriptive identification (name, code, query plans)
  • Read activity (logical or physical where available)
  • Write activity
  • Historical information for returned rows volume
  • CPU metrics
  • Execution time, counts, duration
  • In the case of query analysis you'll also get plan cache lifespan data

This information is then sub-divided into min, max, average, and total for each of those sub-sections.

The Query Code and Sample Results

Below I've provided the code for both analysis queries - first that one which is query-centric followed by the one associated/focused solely on stored procedures.  The code is quite extensive but you'll easily be able to copy/paste this into your own environments for use (at your discretion and responsibility of course.)  I've gone so far as to wrap these queries inside stored procedures for my use.  As always the results have been altered to protect myself and my customers.  The metrics are all from real-world scenarios however all plans, statement execution code and names have been changed.  In order to better display the results I've first exported them into Excel for display.

Analyze High-Impact Queries for a SQL Server Instance:

The heart of this query is sourced from the sys.dm_exec_procedure_stats DMV that stores execution data for each cached query run in the instance.

 

SELECT TOP 25 db_name(eST.[dbid]) AS [database]
        , eQS.execution_count

        -- CPU
        , eQS.min_worker_time AS [min_cpu]
        , eQS.max_worker_time AS [max_cpu]
        , eQS.total_worker_time/ISNULL(eQS.execution_count, 1) AS [avg_cpu]
        , eQS.last_elapsed_time AS [last_cpu]
        , eQS.total_worker_time AS [total_cpu]

        -- ELAPSED TIME
        , eQS.min_elapsed_time AS [min_duration]
        , eQS.max_elapsed_time AS [max_duration]
        , eQS.total_elapsed_time/ISNULL(eQS.execution_count, 1) AS [avg_duration]
        , eQS.last_elapsed_time AS [last_duration]
        , eQS.total_elapsed_time AS [total_duration]  

        -- LOGICAL READS
        , eQS.min_logical_reads AS [min_logical_reads]
        , eQS.max_logical_reads AS [max_logical_reads]
        , eQS.total_logical_reads/ISNULL(eQS.execution_count, 1) AS [avg_logical_reads]
        , eQS.last_logical_reads AS [last_logical_reads]
        , eQS.total_logical_reads

        -- PHYSICAL READS
        , eQS.min_physical_reads AS [min_physical_reads]
        , eQS.max_physical_reads AS [max_physical_reads]
        , eQS.total_physical_reads/ISNULL(eQS.execution_count, 1) AS [avg_physical_reads]
        , eQS.last_physical_reads AS [last_physical_reads]
        , eQS.total_physical_reads

        -- LOGICAL WRITES
        , eQS.min_logical_writes AS [min_writes]
        , eQS.max_logical_writes AS [max_writes]
        , eQS.total_logical_writes/ISNULL(eQS.execution_count, 1) AS [avg_writes]
        , eQS.last_logical_writes AS [last_writes]
        , eQS.total_logical_writes AS [total_writes]

        --ROW COUNTS
        , eQS.min_rows AS [min_rows]
        , eQS.max_rows AS [max_rows]
        , eQS.total_rows/ISNULL(eQS.execution_count, 1) AS [avg_rows]
        , eQS.last_rows AS [last_rows]
        , eQS.total_rows

        -- CACHE & EXEC TIMES
        , eQS.last_execution_time
        , eQS.creation_time
        , DATEDIFF(Minute, eQS.creation_time, GetDate()) AS 'minutes_in_cache'
        , eQS.execution_count/ISNULL(DATEDIFF(Minute, NULLIF(eQS.creation_time,0), GetDate()), 1) AS [calls/minute]
        , eQS.execution_count/ISNULL(DATEDIFF(Second, NULLIF(eQS.creation_time,0), GetDate()), 1) AS [calls/second]

        --STATEMENTS AND QUERY TEXT DETAILS
        , eST.text AS [batch_text]
        , SUBSTRING
                (
                        eST.text, (eQS.statement_start_offset/2) + 1
                                ,
                                        (
                                                (
                                                        CASE eQS.statement_end_offset  
                                                                WHEN -1 THEN DATALENGTH(eST.text)  
                                                                ELSE eQS.statement_end_offset  
                                                        END
                                                                - eQS.statement_start_offset
                                                )/2
                                        ) + 1
                ) AS [statement_executing]  
        , eQP.[query_plan]
        , eQS.[plan_handle]
FROM sys.dm_exec_query_stats AS eQS  
        CROSS APPLY sys.dm_exec_sql_text(eQS.sql_handle) AS eST  
        CROSS APPLY sys.dm_exec_query_plan (eQS.plan_handle) AS eQP
WHERE db_name(eST.[dbid]) = '<database_name,,>'
--ORDER BY eQS.total_logical_reads/ISNULL(eQS.execution_count, 1) DESC;         -- [avg_logical_reads]
--ORDER BY eQS.total_physical_reads/ISNULL(eQS.execution_count, 1) DESC;        -- [avg_physical_reads]
--ORDER BY eQS.total_logical_writes/ISNULL(eQS.execution_count, 1) DESC;        -- [avg_logical_writes]
--ORDER BY eQS.total_worker_time/ISNULL(eQS.execution_count, 1) DESC;           -- [avg_cpu]
--ORDER BY eQS.total_elapsed_time/ISNULL(eQS.execution_count, 1) DESC;          -- [avg_duration]
--ORDER BY eQS.total_rows/ISNULL(eQS.execution_count, 1) DESC;                          -- [avg_rows]

 

Running this we see the results (which I broke-down by relative section of metrics.  The first two columns repeat in the display only for ease of reading the comprehensive results. 

 

Analyze High-Impact Stored Procedures for a SQL Server Instance:

Below we now look at the code and results for the query focused on the sys.dm_exec_procedure_stats DMV.  Some columns are not included in this DMV compared to her sister DMV at the core of the previous query so the resulting columns are slightly different.

SELECT TOP 25 db_name(eST.[dbid]) AS [database]
        , OBJECT_SCHEMA_NAME(ePS.[object_id], ePS.database_id) AS [schema_name]
        , OBJECT_NAME(ePS.[object_id], ePS.database_id) AS [procedure_name]
        , ePS.execution_count

        -- CPU
        , ePS.min_worker_time AS [min_cpu]
        , ePS.max_worker_time AS [max_cpu]
        , ePS.total_worker_time/ISNULL(ePS.execution_count, 1) AS [avg_cpu]
        , ePS.last_elapsed_time AS [last_cpu]
        , ePS.total_worker_time AS [total_cpu]

        -- ELAPSED TIME
        , ePS.min_elapsed_time AS [min_duration]
        , ePS.max_elapsed_time AS [max_duration]
        , ePS.total_elapsed_time/ISNULL(ePS.execution_count, 1) AS [avg_duration]
        , ePS.last_elapsed_time AS [last_duration]
        , ePS.total_elapsed_time AS [total_duration]  

        -- LOGICAL READS
        , ePS.min_logical_reads AS [min_logical_reads]
        , ePS.max_logical_reads AS [max_logical_reads]
        , ePS.total_logical_reads/ISNULL(ePS.execution_count, 1) AS [avg_logical_reads]
        , ePS.last_logical_reads AS [last_logical_reads]
        , ePS.total_logical_reads

        -- PHYSICAL READS
        , ePS.min_physical_reads AS [min_physical_reads]
        , ePS.max_physical_reads AS [max_physical_reads]
        , ePS.total_physical_reads/ISNULL(ePS.execution_count, 1) AS [avg_physical_reads]
        , ePS.last_physical_reads AS [last_physical_reads]
        , ePS.total_physical_reads

        -- LOGICAL WRITES
        , ePS.min_logical_writes AS [min_writes]
        , ePS.max_logical_writes AS [max_writes]
        , ePS.total_logical_writes/ISNULL(ePS.execution_count, 1) AS [avg_writes]
        , ePS.last_logical_writes AS [last_writes]
        , ePS.total_logical_writes AS [total_writes]

        -- CACHE & EXEC TIMES
        , ePS.last_execution_time

        --STATEMENTS AND QUERY TEXT DETAILS
        , eST.text AS [procedure_code]
        , ePS.[plan_handle]

FROM sys.dm_exec_procedure_stats AS ePS  
        CROSS APPLY sys.dm_exec_sql_text(ePS.sql_handle) AS eST  
        CROSS APPLY sys.dm_exec_query_plan (ePS.plan_handle) AS eQP
WHERE db_name(eST.[dbid]) = '<database_name,,>'
ORDER BY ePS.total_logical_reads/ISNULL(ePS.execution_count, 1) DESC;           -- [avg_logical_reads]
--ORDER BY ePS.total_physical_reads/ISNULL(ePS.execution_count, 1) DESC;        -- [avg_physical_reads]
--ORDER BY ePS.total_logical_writes/ISNULL(ePS.execution_count, 1) DESC;        -- [avg_logical_writes]
--ORDER BY ePS.total_worker_time/ISNULL(ePS.execution_count, 1) DESC;           -- [avg_cpu]
--ORDER BY ePS.total_elapsed_time/ISNULL(ePS.execution_count, 1) DESC;          -- [avg_duration]
--ORDER BY ePS.total_rows/ISNULL(ePS.execution_count, 1) DESC;                          -- [avg_rows]

As was the case above, running this we see the results (which I broke-down by relative section of metrics.  The first two columns repeat in the display only for ease of reading the comprehensive results. 

Conclusion

Once again the Dynamic Management Views provide a comprehensive source for diagnosing performance issues on SQL Server - since SQL Server 2005 of course.  It's why these constructs are my favorite part of SQL Server and why Microsoft continues to enhance and expand the Dynamic Management Objects each release.