thread_local is faster and simpler than a function call. https://godbolt.org/z/whHsjI
I ran the main suite tests in debug mode with -O2 optimizations. Before the change a run reports, for example, "Spent 1556.669 of 260 seconds executing testcases"; with thread_local it reports, for example, "Spent 1484.632 of 246 seconds executing testcases".
The patch requires converting dbug.c from C to C++. C11 would work too, but it's not used in MariaDB.
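To illustrate the difference, here is a minimal sketch, not the actual dbug.c code. DebugState, state_via_key, state_via_tls and depth_seen_by_other_thread are hypothetical names made up for this example. It assumes the per-thread state currently sits behind a pthread key, so every access goes through pthread_getspecific, while the thread_local variant is just the address of a variable.

```cpp
#include <pthread.h>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the per-thread debug state.
struct DebugState { int depth = 0; };

// Variant 1: state stored behind a pthread key.
// Every lookup costs at least one library function call.
static pthread_key_t state_key;
static pthread_once_t state_key_once = PTHREAD_ONCE_INIT;

static void destroy_state(void *p) { delete static_cast<DebugState *>(p); }
static void make_key() { pthread_key_create(&state_key, destroy_state); }

DebugState *state_via_key() {
  pthread_once(&state_key_once, make_key);   // create the key exactly once
  void *p = pthread_getspecific(state_key);  // per-thread lookup
  if (!p) {                                  // first use in this thread
    p = new DebugState;
    pthread_setspecific(state_key, p);
  }
  return static_cast<DebugState *>(p);
}

// Variant 2: C++11 thread_local.
// No key management, no lazy allocation; the compiler emits the TLS access.
static thread_local DebugState tls_state;

DebugState *state_via_tls() { return &tls_state; }

// Bumps the caller's copy, then returns the depth a freshly started
// thread observes through the same accessor (each thread has its own copy).
int depth_seen_by_other_thread(DebugState *(*get)()) {
  get()->depth++;
  int seen = -1;
  std::thread t([&] { seen = get()->depth; });
  t.join();
  return seen;
}
```

Both accessors give each thread its own DebugState, so a new thread still sees depth 0 after the calling thread incremented its copy. The thread_local version needs no key, no once-initialization and no destructor callback, which is the "simpler" part. Depending on the toolchain, building this may need -pthread.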
A similar technique could also be used for THD and PFS.