It’s a late Friday afternoon, this means that somebody is going to tell you that the server is unresponsive, users are blocked and there is a imperative deadline for the business at 6PM (which you think is probably because they don’t want to be late to the happy hour), so you do what you do best: sigh and log on to see what’s going on with this instance.
You fire up SSMS, and damn, it’s real, you can’t connect!
You check the service, it’s up; The network, it’s up; Weirdly enough, you ask your colleague if he’s connected to the instance, yep he’s in
What’s going on?
Turns out, that SQL Server has a finite number of threads that it can create (indeed, a finite Thread Pool): Each connection needs a thread, queries need threads, even intellisense needs a thread (because it opens up a different connection) to work.
If you have a lot of activity going on, you have a lot of supermassive parallel queries going on, for some reason connections aren’t closed by an application, or whatever other reason that may be consuming threads, you’ll end up with a THREADPOOL wait stat, during which you can’t even open up a new connection to the DB(*), because you don’t have enough threads
(*) Luckily, SQL Server always thinks for you, and provides by default a single (hidden) thread for you to connect with in case of this kind of emergencies, called DAC
DAC supports only one user at the time and only locally by default (but you can enable remote login), it’s a single thread in a dedicated and separate scheduler in SQL Server
How to troubleshoot?
You would think that increasing the dimension of the Thread Pool will help you with this, right? Well, yes, it may in the short term, but that’s not a route that you want to follow, because your server has way more problems, this is just a symptom
The maximum number of thready, by default, depends on your core count and can be discovered in this way:
SELECT max_workers_count FROM sys.dm_os_sys_info
The thread pool can be exhausted in a various number of ways, for example the following:
- A lot of connections, each one running a query that’s running in parallel
- And maybe these queries are still running themselves, causing the issue, because there is another. unrelated, issue that prevents them from completing in a decent time
- A transaction is locking a pretty busy object in the Database and every other query is piling up waiting for the lock to be released, eventually starving the thread pool
- Rare, but maybe someone messed with the Max Worker Threads parameter in your sp_configure
i.e. , connect via DAC, launch sp_WhoIsActive and see what’s actually going on.