Avoid deadlock scheduler situation in AlwaysOn setup

Scenario

Kindly consider below AlwaysOn setup between three servers.

Server A — Primary Replica
Server B — Secondary synchronous commit replica
Server C — Secondary asynchronous commit replica
Each server is having 8 logical CPUs thus having 576 max worker threads.
SQL instance is having 150 user databases across 5 AGs with 30 databases in each AG.

With above configuration you may notice below messages in SQL server error logs.

The thread pool for AlwaysOn Availability Groups was unable to start a new worker thread because there are not enough available worker threads.  This may degrade AlwaysOn Availability Groups performance.  Use the "max worker threads" configuration option to increase number of allowable threads.

Although above messages will not cause outage but they are indicative of performance issues. However you may experience an outage in any of below cases due to deadlock scheduler situation.

  1. As part of decommissioning existing server A and server B and migrating all databases on new server E and F, two more secondary replicas (E & F) were added in all AGs.
  2. More new databases are getting created and being added into AG.

When deadlock schedulers situation occurs, SQL server will not allow any new connections to SQL server (although you can still use DAC) and below message will be logged in SQL server error logs.

*******************************************************************************
*
* BEGIN STACK DUMP:
*   03/29/21 02:58:16 spid 2408
*
* Deadlocked Schedulers
*
* *******************************************************************************
New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 300 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 1%. System Idle: 91%.

Cause:

Worker threads are responsible for executing the tasks assigned to them by schedulers. Deadlock scheduler is a situation when no more worker threads remain available to pick any tasks. In this situation SQL server becomes unresponsive and no further work is carried out.

In any alwayson setup, below threads are required for alwayson to function.

  1. Log capture thread per database on primary replica
  2. Log send queue thread per database on primary replica per secondary replica
  3. Redo thread per database on secondary replica
  4. At least one message handler thread
  5. Backup thread on primary replica for duration of backup on secondary replica

Maximum number of threads used by availability groups are capped to “max worker thread -40”. For example on a 8 CPU server, maximum threads available to availability groups will be 576-40 = 536

Note: Threads are not held permanently. Threads are released after ~15 seconds of inactivity.

Formula to calculate minimum number of threads required on primary replica


Threads required = No of databases * (LogCaptureThread + (LogSendThread*NumberofSecondaryReplicas))+1 message handler thread


Formula to calculate minimum number of threads required on secondary replica

As per MSBOL "A SQL Server instance uses up to 100 threads for parallel redo for secondary replicas. Each database uses up to one-half of the total number of CPU cores, but not more than 16 threads per database. If the total number of required threads for a single instance exceeds 100, SQL Server uses a single redo thread for every remaining database. Serial Redo threads are released after ~15 seconds of inactivity."

Considering all DBs are active, below will be minimum number of threads required.

Case 1: SQL instance is having 150 user databases across 5 AGs with 30 databases in each AG with two secondary replicas
Threads Required = 150*(1+(1*2))+1 = 451

Case 2: SQL instance is having 300 user databases across 5 AGs with 60 databases in each AG with two secondary replicas
Threads Required = 300*(1+(1*2))+1 = 901

Case 3: SQL instance is having 150 user databases across 5 AGs with 30 databases in each AG with four secondary replicas
Threads Required = 150*(1+(1*4))+1 = 751

As you can see on a 8 CPU server, threads required by availability groups are exceeding maximum available threads for availability groups i.e. 536. In case all databases does not remain active, SQL will be able to manage workload as threads will be released after inactivity of ~15 seconds. However if all databases remain active then you may experience deadlock scheduler situation where SQL will become unresponsive for any new tasks.

Hence we should always keep in mind the thread requirement in a AG setup before adding new secondary replicas, new databases to AG.

Workaround

  1. Kindly distribute the AGs across available replicas so that thread requirement for primary replica databases can be distributed and not leading to thread starvation. For example if we distribute 5 AGs one on each replica in case-3 above, thread requirement will come down to 151 threads for primary replica DBs and 101 threads for secondary replica databases for redo operations i.e. total 252 threads which is well below 536 threads available for AGs.
  2. Consider increasing the CPUs on the server.

References and further reading: I will recommend to read below articles for better understanding on this topic.
Availability group: Prerequisites, restrictions, & recommendations – SQL Server Always On | Microsoft Docs
AlwaysON – HADRON Learning Series: Worker Pool Usage for HADRON Enabled Databases | Microsoft Docs
Monitoring SQL Server 2012 AlwaysOn Availability Groups Worker Thread Consumption – Microsoft Tech Community

Hope it helps!!
Happy learning!!