SQL Server performance tuning – how to look for memory issue

We in the first post saw how capturing and analyzing wait statistics can help a SQL DBA to look for potential performance issue.

In the second post we looked at some of the example and potential scenario of wait stats which were similar to a production environment.

In this post, we will see how we can find whether we have any memory issue on our SQL instance. This post will feature some of the queries which we can use in our environment and step by step approach to troubleshoot our issue.

If we are observing high RESOURCE_SEMAPHORE waits in our environment, it can possibly indicate a memory issue. Though, this is not the case always. You should look for other indicators like memory grants pending, PLE count, Buffer cache hit ratio, target and total server memory etc.

You can start by checking available memory in the SQL instance. Use below code to check the current memory status. Focus on system_memory_state_desc which should be ideally HIGH. STEADY or LOW indicates memory pressure.

SELECT total_physical_memory_kb/1024 AS Total_Physical_Memory_MB, 
available_physical_memory_kb/1024 AS Available_Physical_Memory_MB,
 physical_memory_in_use_kb/1024 AS Physical_memory_in_Use_MB,
       locked_page_allocations_kb/1024 AS Locked_Page_Allocations_MB, 
        Memory_utilization_percentage,
	   process_physical_memory_low, 
       process_virtual_memory_low,
       system_memory_state_desc
FROM sys.dm_os_sys_memory, sys.dm_os_process_memory

Once you have run the above query, next step is to find PLE count, Memory grants pending and other OS performance counters using below code:

SELECT * 
FROM sys.dm_os_performance_counters
WHERE OBJECT_NAME = N'SQLServer:Buffer Manager' -- Modify this if you have named instances
AND counter_name IN('Page life expectancy','Buffer cache hit ratio', 'Buffer cache hit ratio base',
'Free list stalls/sec','Lazy writes/sec','Total pages','Target pages','stolen pages',
'page reads/sec','page writes/sec','Memory Grants Pending')
go

We are looking here a good PLE count, which normally should be 300 per 4 GB of memory. So if you have 16 GB as max server memory set for your SQL instance, PLE count should be near to 1200.

Please note that this a generic idea for a good PLE count value. However, this value may vary on your environment. You can refer this good article by Paul Randal on PLE count for understanding the concept.

Also, total pages should be less than or equal to target pages. Other value that this query shows is Buffer cache hit ratio, Free list stalls/sec, Lazy writes/sec. Total pages

Similarly, Memory grants pending should be ideally ZERO. Please remember that you should run this query few times to get a holistic picture of memory.

Once you gather data from above query, you will get a good idea of whether there is memory issue or not. Assuming that there is low PLE count, or, Memory Grants Pending is not ZERO, you should find what in SQL is consuming the memory.

This leads us to check the cache store of SQL. You can use below query in the SQL instance from SQL 2012 onwards:

select name, type, SUM(pages_kb)/1024
as cache_size_mb
from sys.dm_os_memory_cache_counters
where type like 'CACHESTORE%'
group by name, type
order by cache_size_mb desc
go

The result of above query will show you which cache store is consuming your memory. We would ideally want to have a low value of “CACHESTORE_SQLCP”. This indicates memory getting consumed by ad hoc queries.

If CACHESTORE_SQLCP value is high, ensure that Optimize for ad hoc workload option is enabled at SQL instance level.

“CACHESTORE_OBJCP” indicates memory consumed by compiled plans for stored procedures.

Up to this level, you would be quite sure that you have a memory pressure or not. If all the values from above mentioned queries are giving you output that you think is good for your environment, then memory is not the problem here. You should focus your energy to a different area.

But, if you think that memory is the issue, then move to the next step to check which database is consuming the buffer pool. This would help you to focus your energy on that database. You can use below query to check the buffer pool usage:

--Each of these pages are present in the Buffer Cache, meaning they are IN_RAM pages.

DECLARE @total_buffer INT; 
SELECT @total_buffer = cntr_value 
   FROM sys.dm_os_performance_counters 
   WHERE RTRIM([object_name]) LIKE'%Buffer Manager'
   AND counter_name = 'Total Pages';

;WITH BufCount AS 
( 
  SELECT 
       database_id, db_buffer_pages = COUNT_BIG(*) 
       FROM sys.dm_os_buffer_descriptors 
       WHERE database_id BETWEEN 5 AND 32766 
       GROUP BY database_id 
) 
SELECT 
   [Database_Name] = CASE [database_id] WHEN 32767 
       THEN'MSSQL System Resource DB' 
       ELSE DB_NAME([database_id]) END, 
   [Database_ID], 
   db_buffer_pages as [Buffer Count (8KB Pages)], 
   [Buffer Size (MB)] = db_buffer_pages / 128, 
   [Buffer Size (%)] = CONVERT(DECIMAL(6,3), 
       db_buffer_pages * 100.0 / @total_buffer) 
FROM BufCount 
ORDER BY [Buffer Size (MB)] DESC; 

Once you get the database which is consuming memory. It is now time to break the buffer usage used by current database by object (table, index) in the buffer cache. Please note this is a fairly expensive query.

Use <DatabaseName>
SELECT OBJECT_NAME(p.[object_id]) AS [ObjectName], 
p.index_id, COUNT(*)/128 AS [Buffer size(MB)],  COUNT(*) AS [BufferCount], 
p.data_compression_desc AS [CompressionType]
FROM sys.allocation_units AS a
INNER JOIN sys.dm_os_buffer_descriptors AS b
ON a.allocation_unit_id = b.allocation_unit_id
INNER JOIN sys.partitions AS p
ON a.container_id = p.hobt_id
WHERE b.database_id = CONVERT(int,DB_ID())
AND p.[object_id] > 100
GROUP BY p.[object_id], p.index_id, p.data_compression_desc
ORDER BY [BufferCount] DESC;

At this moment, you will have the data which will tell you which table or index or any other object is consuming the memory and you can shift your focus on that. There is still one more thing, you would want to check.

Which is the most expensive cached stored procedures in that particular database. It could be the case, that, a particular SP is making all the memory consumption and it will allow us to further drill down our issue.

-- Top Cached SPs By Total Logical Reads. Logical reads relate to memory pressure
SELECT TOP(25) p.name AS [SP Name], qs.total_logical_reads AS [TotalLogicalReads], 
qs.total_logical_reads/qs.execution_count AS [AvgLogicalReads],qs.execution_count, 
ISNULL(qs.execution_count/DATEDIFF(Second, qs.cached_time, GETDATE()), 0) AS [Calls/Second], 
qs.total_elapsed_time, qs.total_elapsed_time/qs.execution_count 
AS [avg_elapsed_time], qs.cached_time
FROM sys.procedures AS p
INNER JOIN sys.dm_exec_procedure_stats AS qs
ON p.[object_id] = qs.[object_id]
WHERE qs.database_id = DB_ID()
ORDER BY qs.total_logical_reads DESC;

If you have filtered down to a stored procedure, then, check the execution plan of that stored procedure and try to figure out the real reason behind it. We would take up execution plan in future blogs.

Till this point, you will be pretty sure about what is causing the memory crunch. However, if you want to monitor the SQL instance for a longer duration, a good way will be to set up performance counter. We will talk about these performance counter in our future blogs.

I am sure this approach would help you in troubleshooting memory related issue.

Again, as we discussed in our earlier blogs, there is no absolute approach for performance tuning. Every DBA has his own way to doing things. Still, what we are discussing here are tried and tested method which I have successfully used over my 12 years career and will be really helpful for you as well.

Do let me know if you have some different style to troubleshoot the memory issue.