So, nikitamalyavin , I fixed the Windows and threadpool implementations of the "wake" to address following limitation of your previous implementation
- Windows, thread-per-connection - rather than a unconventional, QueueUserAPC + self-pipe trick with own implementation of socketpair, use CancelIoEx, to cancel an IO, which is what was originally intendedn. Own socketpair is costly in that it needs 2 extra ports in the ephemeral range, on the server which reduces the max possible number of connection by factor of 3
- Threadpool - move the common logic into threadpool_common, introduce a new TP_STATE TP_INTERRUPTED , to mark that connection just needs to check for the APC, rather than reading and executing query. Unfortunately Linux needs yet another workaround, because unlike on Windows, where we can cancel IO, or elsewhere where we can disable monitoring of fd safely, Linux offers nothing like that for epoll, and attempts to fix it with EPOLL_CTL_CLOSE were abandoned.
With that, implementation of "wake" looks more or less ok to me. All this work is now in bb-10.10-MDEV-16440.
There needs to be much more investment into testing, because there is a lot of subtle scenarios, that might go wrong.
Apparently interesting cases that are not covered by your testing just yet, are
- interrupting named pipe connections on Windows
- Interrupting long running queries, in both thread-per-connection, and pool-of-thread such as
SELECT SLEEP(a lot)
SELECT SLEEP(a lot), which is done from for remote a table (federated and/or connect)
long operation (reads binlogs) in replication scenarios
stuck in some kind of locks, user lock (GET_LOCK), TABLE LOCK, FTWRL
- You need also a basic test for non-debug build, as it is tested much more often on buildbots.
I have revised the documentation related to the CriticalSection. Indeed, it does not make any clue on the CRITICAL_SECTION structure access order. While I think it's fraudish to put it this way (no sane implementation should make any accesses to a mutex structure after the ownership is released), let's see what the developers can say: https://github.com/MicrosoftDocs/sdk-api/pull/1212.
Anyway, if it turns out that winapi's CriticalSection violates such guaranteees, I think we should just stop using it, and implement a mutex for example based on spinlock+WaitOnAddress