I attached a patch, lsn_lock_is_pthread_mutex.diff, where lsn_lock is just a pthread_mutex (with the non-portable adaptive attribute), applied to commit bbe99cd4e2d7c83a06dd93ea88af97f2d5796810 (current 10.9).
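For reference, the core of such a patch would look roughly like this. A minimal sketch, not the attached diff: the lsn_lock name is from the patch, the init helper name is illustrative, and the adaptive type (a glibc extension that spins briefly in userspace before blocking in the kernel) is guarded so the code still builds elsewhere:

    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE
    #endif
    #include <pthread.h>

    static pthread_mutex_t lsn_lock;

    static void lsn_lock_init()   /* illustrative helper name */
    {
      pthread_mutexattr_t attr;
      pthread_mutexattr_init(&attr);
    #ifdef PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP
      /* glibc-only: spin a little before sleeping on contention */
      pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
    #endif
      pthread_mutex_init(&lsn_lock, &attr);
      pthread_mutexattr_destroy(&attr);
    }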
It performs better for me: 141851.87 tps without the patch vs. 152514.50 tps with it, in a 30-second update_index benchmark run, and this is on Intel. Perhaps ARM can also benefit from it.
I used 1000 clients, 8 tables x 1500000 rows, a large redo log and buffer pool, --innodb-flush-log-at-trx-commit=2 --thread-handling=pool-of-threads.
And it performs well on Windows, too.
Update: However, if I run this on Linux on 2 NUMA nodes, the results are not so good: 83495.47 tps without the patch vs. 65638.71 with it. If you compare with the single-NUMA-node numbers: yes, that's how NUMA performs on the box I have. I give it 2x the CPUs, and it makes the server half as fast (which is why I almost never do any NUMA test). That's an old 3.10 kernel, so maybe things are better for someone else.
I think we might have more urgent problems than that spinlock (for example, getting rid of the lock-free transaction hash; the CPU spent in lf_find is alarming). If the lock turns out to be hot, which it seems to be, I'm always for an OS mutex: the pthread_mutex, or even std::mutex, which for me turns out to be a native SRWLOCK. While it does not seem to protect a large section of code, I'm not sure how often it is entered; it seems to be often, so maybe we can stay with whatever is just a normal mutex (if ARM likes spinning, there is this ADAPTIVE thing it may like). The mysql_mutex_t might give an idea of how hot it is, since it is perfschema-instrumented; on the other hand, it might make it even hotter, exactly because of the perfschema overhead.
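To illustrate the std::mutex option, a minimal sketch; lsn_mutex, current_lsn and reserve_lsn are hypothetical names standing in for the kind of short critical section lsn_lock guards:

    #include <cstdint>
    #include <mutex>

    using lsn_t = std::uint64_t;   // InnoDB's LSN is a 64-bit integer

    static std::mutex lsn_mutex;   // hypothetical stand-in for lsn_lock
    static lsn_t current_lsn;      // stand-in for the log sequence number

    // Hypothetical helper: advance the LSN under the mutex and return it
    static lsn_t reserve_lsn(lsn_t len)
    {
      std::lock_guard<std::mutex> g(lsn_mutex);
      current_lsn += len;
      return current_lsn;
    }

On Windows, with MSVC's standard library, that std::mutex ends up on an SRWLOCK, which matches the observation above.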