SQL Server and Disk IO

Many aspects of a database system’s configuration will affect the performance of queries running on that DBMS. However, there is one single component that has the greatest impact on DBMS performance, and that, of course, is disk (IO) access.

It’s a broad topic and all aspects of it are pretty well covered in various blogs and technical articles, but there’s not much out there that gathers it all together. The intention of this series of posts is to try consolidate some of this information and focus in on some of the functions and procedures that can be used to monitor, troubleshoot and configure SQL Server disk configuration and most, importantly, disk performance. It won’t be exhaustive and cover every conceivable option, but will be based on experience gained whilst working at Sybase and Microsoft and sprinkling of ‘real world’ database administration experience.

I’ve split the topic up into smaller sub-topics in order to make it more manageable:

1. Partition offsets
2. Wait Stats
3. Tracking SQL Server IO performance
4. Physical Fragmentation
5. System Monitor (Perfmon)
6. Tempdb

Kinda ambitious, I know, but I don’t blog often and it gives me something to aim for!

There are plenty of excellent articles which cover IO in some depth, so I’m going to try my best to avoid regurgitation and try keep things DBA-centric.  (I highly recommend reading the links at the end of the relevant posts to anyone wanting a deeper understanding of IO issues, from configuration to monitoring and troubleshooting).

Until solid state disk prices come down and become ubiquitous, disk access is still a painfully slow (relatively speaking) mechanical process. An IO (read or write) request is issued by SQL Server which is passed onto the OS. The OS then passes this onto your disk controller or HBA and eventually this data request translates into a mechanical arm sitting above a spinning disk which will read or write the data that SQL Server requested.

What this means in performance terms is that disk access will be measured in milliseconds.

Access times to the other major components of a DBMS that directly affect performance such as CPU and RAM can be measured in microseconds or nanoseconds.

As you can see, it’s an order of magnitude difference.

This single factor alone highlights why reviewing a SQL Server database installation’s disk configuration can be such a valuable and rewarding exercise.  In fact, I begin many of my SQL Server troubleshooting exercises by eliminating IO performance first.

Prior to SQL Server 2000 SP4 it was difficult to identify IO performance issues without a lot of digging around. The biggest clue SQL Server would give would be the occasional mysterious 17883 (or even 17884) errors, and these were not always guaranteed to be IO related, although IO was usually heavily implicated. SP4 for SQL Server 2000 introduced those ‘slow IO’ messages which most DBAs will be familiar with and was a step in the right direction.

SQL Server 2005 introduced various DMVs which would give much greater insight into IO performance, and SQL Server 2008 built on this with the Performance DataWarehouse. These posts are going to cover methods that can be applied from SQL Server 2000 onwards, so will not delve deeply into features and properties only available in later versions, but focus instead on functions and features available across the last three versions of SQL Server as I know there’s still a lot of SQL Server 2000 installations out there.