What is my latency anyway?

So yes I work for a storage company and I am passionate about latency for databases. Does it fix bad code? No. Is it the be all end all for performance? No. But storage historically has been the biggest infrastructure bottleneck for databases. So ignoring storage from my perspective is a mistake. It is one of the things that we can easily change and optimize that effects everything. I know this topic has been addressed before but everyday I meet someone who is having latency issues.

Whenever I started optimizing systems at a new job this was one of the first things I looked at. Why? Because fixing storage issues affected all the databases that resided on that storage. It was a big hammer. After I fixed my storage issues, I was able to then have enough breathing room to address other things.

So I encourage everyone to know their latency. It is even more important if you are virtualized. You could have something going on between you and the hyper-visor. DBAs should never ask do we have a problem. They should know. So Microsoft has generally guided that latency should be 25ms or below. My opinion in the age of all flash is that it should be close to what your storage array states it can do. At Pure we expect our customers to be sub 1 millisecond.

So what is your latency? Well the best way to figure that out is to log your dm_io_virtual_filestats or leverage a monitoring tool that does that for you. I will cover logging the dm_io_virtual_filestats in a future blog. But if you want to do a quick gut check run the following query.

SELECT
[ReadLatency] = CASE WHEN sum([num_of_reads]) = 0 THEN 0 ELSE (SUM([io_stall_read_ms]) / SUM([num_of_reads])) END,
[WriteLatency] = CASE WHEN SUM([num_of_writes]) = 0 THEN 0 ELSE (SUM([io_stall_write_ms]) / SUM([num_of_writes])) END,
[Latency] = CASE WHEN SUM([num_of_reads]) = 0 AND SUM([num_of_writes]) = 0 THEN 0 ELSE (SUM([io_stall]) / (SUM([num_of_reads]) + SUM([num_of_writes]))) END,
[AvgBPerRead] = CASE WHEN SUM([num_of_reads]) = 0 THEN 0 ELSE (SUM([num_of_bytes_read]) / SUM([num_of_reads])) END,
[AvgBPerWrite] = CASE WHEN SUM([num_of_writes]) = 0 THEN 0 ELSE (SUM([num_of_bytes_written]) / SUM([num_of_writes])) END,
[AvgBPerTransfer] = CASE WHEN (SUM([num_of_reads]) = 0 AND SUM([num_of_writes]) = 0)
THEN 0 ELSE
((SUM([num_of_bytes_read]) + SUM([num_of_bytes_written])) /
(SUM([num_of_reads]) + SUM([num_of_writes]))) END,
volume_mount_point
FROM sys.dm_io_virtual_file_stats (NULL,NULL) AS [vfs]
JOIN sys.master_files AS [mf] ON [vfs].[database_id] = [mf].[database_id] AND [vfs].[file_id] = [mf].[file_id]
CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id)
group by volume_mount_point
ORDER BY volume_mount_point

Now this shows the average since the last restart of the sql server service. Any peaks in latency will be averaged out. From my experience if you are around the 20ms mark with this query, you are absolutely exceeding the acceptable 25 ms at times. Also if you are on modern storage you should be below a 1 ms. Stay flashy!