Investigate SQL Server tempdb I/O latency

In my post about increasing your SQL Server performance with tempdb, I talked about performance tuning steps related to the tempdb system database. But of course: as proof or even part of such tuning steps, it is crucial to measure the differences between current and previous implementations. That is why I would like to highlight some useful techniques and provide you with functional insights on how to analyze tempdb I/O stalls or bottlenecks. This information is especially useful if you’re a database administrator!

Stall all statistics

To investigate the I/O bottlenecks, you can use the Dynamic Management Views from SQL Server and demand to stall statistics on all database and log files.

Next, execute the following query:

SELECT mf.name, io_stall_write_ms, num_of_writes,
format(( io_stall_write_ms / ( 1.0 + num_of_writes ) ),'#.0000') as write_score, /**lower means less stall**/
format(( sum(io_stall_write_ms) over ()/ ( 1.0 + sum(num_of_writes) over () ) ),'#.0000') as overal_write_score,
io_stall_read_ms, num_of_reads,
format(( io_stall_read_ms / ( 1.0 + num_of_reads ) ),'#.0000') as read_score,
format(( sum(io_stall_read_ms) over ()/ ( 1.0 + sum(num_of_reads) over () ) ),'#.0000') as overal_read_score
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS fs
INNER JOIN sys.master_files AS mf ON fs.database_id = mf.database_id
AND fs.[file_id] = mf.[file_id]
and mf.type = 0

Query explained

A short explanation of all columns:

  • name: Logical name of the file in the database
  • io_stall_write_ms: Total time, in milliseconds, that the users waited for write issued on the file
  • num_of_writes: Number of writes made on this file
  • write_score: ratio of io_stall_write_ms and num_of_writes on this file (lower is better)
  • overall_write_score: ratio of io_stall_write_ms and num_of_writes for all files (lower is better)
  • io_stall_read_ms: Total time, in milliseconds, that the users waited for reads issued on the file
  • num_of_reads: Number of reads made on this file
  • read_score: ratio of io_stall_read_ms and num_of_reads on this file (lower is better)
  • overall_read_score: ratio of io_stall_read_ms and num_of_reads on all files (lower is better)

The query gave me the following result:

temp io stall

The most important columns here in scope of I/O stall times is overall_write_score and overall_read_score. A lower value means less average I/O stall time on this database file. After performing some tempdb optimization, the same query gave me the following results:

temp io stall

I love how the different I/O stall times have been decreased! Values for write stalls have been decreased in average by 80%. Values for read stalls have been decreased by a staggering 223%. Please note that – since the tempdb database is recreated at every restart of SQL server – the statistics date from the moment SQL server has been restarted. The creation time can be verified using the query below:

SELECT crdate FROM sysdatabases WHERE NAME='tempdb

Let’s go!

After performing some disk optimizations and applying the necessary configuration changes to the tempdb database in SQL server, I am very satisfied with the result. The techniques used are no rocket science and does not cover all facets in database performance measurement. But it is good practice to keep an eye on, for example,  I/O access times as well. With this post, I’ve hopefully given you some guidance as to how stall times of a database file can be checked. Finally, I would recommend all database administrators to keep an eye on these values on a regular basis.