In MDEV-8684 we saw a large portion of CPU time spent in the mutex implementation.
The top CPU consumers, mutex_spin_wait, _raw_spin_lock (kernel) and ut_delay, are all due to a single global lock, trx_sys->mutex.
All of these trace back to the use of the mutex in:
- trx_start_low;
- read_view_open_now; and
- trx_commit_low.
Every read-only SELECT query hits this global mutex 3 times.
The first two are called from row_search_for_mysql, and trx_commit_low is the cleanup of JOIN::optimize_inner within the same SELECT handler.
trx_start_low needs the mutex for trx_sys_get_new_trx_id(), for adding the transaction to trx_sys->ro_trx_list (or trx_sys->rw_trx_list), and for trx_reserve_descriptor.
read_view_open_now (and its _low variant) needs the lock for trx_find_descriptor and for inserting into trx_sys->view_list. It is called via trx_assign_read_view shortly after trx_start_if_not_started (which ends up in trx_start_low), in Phase 3 of row_search_for_mysql.
trx_commit_low needs the mutex for removing from the above lists and also from trx->global_read_view.
It might be possible to group trx_start_low and read_view_open_now so that both updates happen under a single acquisition of the mutex.
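To make the grouping idea concrete, here is a minimal toy model of the difference. It is only a sketch: CountingMutex, ToyTrxSys and the three functions are hypothetical names invented for illustration, the lists hold bare ids, and none of this is InnoDB code. It only shows that folding the start and the read view creation into one critical section takes a SELECT from three acquisitions of the global mutex to two.

```cpp
// Toy model only: all names here are hypothetical stand-ins, not InnoDB code.
#include <cassert>
#include <cstdint>
#include <list>
#include <mutex>

// A mutex that counts how often it is acquired (BasicLockable, so it works
// with std::lock_guard).
struct CountingMutex {
    std::mutex m;
    unsigned acquisitions = 0;  // incremented while the mutex is held
    void lock() { m.lock(); ++acquisitions; }
    void unlock() { m.unlock(); }
};

struct ToyTrxSys {
    CountingMutex mutex;                   // plays the role of trx_sys->mutex
    std::uint64_t max_trx_id = 1;          // next id to hand out
    std::list<std::uint64_t> ro_trx_list;  // stand-in for trx_sys->ro_trx_list
    std::list<std::uint64_t> view_list;    // stand-in for trx_sys->view_list
};

// Pattern described above: start and view creation each take the mutex.
std::uint64_t start_then_open_view(ToyTrxSys& sys) {
    std::uint64_t id;
    {
        std::lock_guard<CountingMutex> guard(sys.mutex);  // "trx_start_low"
        id = sys.max_trx_id++;
        sys.ro_trx_list.push_back(id);
    }
    {
        std::lock_guard<CountingMutex> guard(sys.mutex);  // "read_view_open_now"
        sys.view_list.push_back(id);
    }
    return id;
}

// Grouped variant: both list updates under a single acquisition.
std::uint64_t start_and_open_view_grouped(ToyTrxSys& sys) {
    std::lock_guard<CountingMutex> guard(sys.mutex);
    std::uint64_t id = sys.max_trx_id++;
    sys.ro_trx_list.push_back(id);
    sys.view_list.push_back(id);
    return id;
}

// Commit removes the transaction from both lists ("trx_commit_low").
void commit(ToyTrxSys& sys, std::uint64_t id) {
    std::lock_guard<CountingMutex> guard(sys.mutex);
    assert(id < sys.max_trx_id);  // the id must have been handed out earlier
    sys.ro_trx_list.remove(id);
    sys.view_list.remove(id);
}
```

Running start_then_open_view followed by commit on a fresh ToyTrxSys leaves acquisitions at 3, while start_and_open_view_grouped followed by commit leaves it at 2. Whether the real code paths can be merged this way depends on what happens between trx_start_if_not_started and trx_assign_read_view, which this model ignores.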
MySQL-5.7 has mitigated this with http://dev.mysql.com/worklog/task/?id=6047
This removes the mutex acquisition in trx_start_low, trx_sys_get_new_trx_id and trx_commit_low for read-only queries by not allocating a trx_id.
It has fairly invasive patches like https://github.com/mysql/mysql-server/commit/ed460aae81f9897984157bfe7759075182efb2b7
and the worklog task references previous commits that I'll need to port too.
I'll attempt to do this porting.
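As a rough illustration of the read-only fast path that the worklog describes (no trx_id is allocated, so start and commit never touch the global mutex), here is a small toy model. It is a sketch under stated assumptions, not the WL#6047 implementation: SketchTrxSys, SketchTrx, sketch_trx_start and sketch_trx_commit are hypothetical names, an id of 0 is assumed to mean "no id allocated", and read view handling is left out entirely.

```cpp
// Toy model only: every type and function here is a hypothetical stand-in,
// not code from InnoDB or from the WL#6047 patches.
#include <cassert>
#include <cstdint>
#include <list>
#include <mutex>

struct SketchTrxSys {
    std::mutex mutex;                      // plays the role of trx_sys->mutex
    unsigned mutex_acquisitions = 0;       // incremented while the mutex is held
    std::uint64_t max_trx_id = 1;          // next id to hand out; 0 means "none"
    std::list<std::uint64_t> rw_trx_list;  // stand-in for trx_sys->rw_trx_list
};

struct SketchTrx {
    bool read_only = false;
    std::uint64_t id = 0;  // assumption: 0 means no id was allocated
};

void sketch_trx_start(SketchTrxSys& sys, SketchTrx& trx) {
    if (trx.read_only) {
        // Fast path: no id, no list insertion, no global mutex.
        trx.id = 0;
        return;
    }
    std::lock_guard<std::mutex> guard(sys.mutex);
    ++sys.mutex_acquisitions;
    trx.id = sys.max_trx_id++;
    sys.rw_trx_list.push_back(trx.id);
}

void sketch_trx_commit(SketchTrxSys& sys, SketchTrx& trx) {
    if (trx.id == 0) {
        // Nothing was registered at start, so there is nothing to remove.
        return;
    }
    std::lock_guard<std::mutex> guard(sys.mutex);
    ++sys.mutex_acquisitions;
    assert(trx.id < sys.max_trx_id);  // the id must have been handed out earlier
    sys.rw_trx_list.remove(trx.id);
    trx.id = 0;
}
```

In this model a transaction with read_only set goes through start and commit without changing mutex_acquisitions, while a read-write transaction still costs two acquisitions.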