[MDEV-21390] lock_print_info_summary() should work even when trx_sys.mutex is locked
Created: 2019-12-24  Updated: 2022-02-21  Resolved: 2022-02-21

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2.30
Fix Version/s: 10.3.5, 10.4.0

Type: Bug
Priority: Major
Reporter: Geoff Montee (Inactive)
Assignee: Marko Mäkelä
Resolution: Won't Fix
Votes: 1
Labels: None

Issue Links:
Relates
relates to MDEV-14756 Remove trx_sys_t::rw_trx_list Closed
relates to MDEV-17237 thread IDs are printed in different f... Open
relates to MDEV-17238 Document special thread IDs used in S... Open
relates to MDEV-18391 Print ENGINE INNODB STATUS in machine... Open
relates to MDEV-18429 Consistent non-locking reads do not a... Closed
relates to MDEV-18572 Thread executing DROP TABLE listed tw... Open
relates to MDEV-18582 Port status variables related to SHOW... Closed
relates to MDEV-18698 Show InnoDB's internal background thr... Open
relates to MDEV-21566 Lock monitor doesn't print a name for... Closed
relates to MDEV-22087 Increase buffer size for query in SHO... Open
relates to MDEV-21330 Lock monitor doesn't print a semaphor... Closed

 Description   

The lock_print_info_summary() function currently requires trx_sys.mutex, since it calls the trx_sys_get_max_trx_id() function.

https://github.com/mariadb/server/blob/mariadb-10.2.30/storage/innobase/lock/lock0lock.cc#L4910

https://github.com/MariaDB/server/blob/mariadb-10.2.30/storage/innobase/include/trx0sys.ic#L400

This causes two problems:

1.) SHOW ENGINE INNODB STATUS will hang if it cannot acquire trx_sys.mutex.

2.) If there is a long semaphore wait on trx_sys.mutex, the sync_array_print_long_waits() function is called and sets srv_print_innodb_monitor. This causes the InnoDB monitor thread to call lock_print_info_summary() in order to write the InnoDB status information to the error log. Because lock_print_info_summary() itself requires trx_sys.mutex, the monitor thread blocks on the same mutex and cannot write that information.

https://github.com/MariaDB/server/blob/mariadb-10.2.30/storage/innobase/sync/sync0arr.cc#L1081

https://github.com/MariaDB/server/blob/mariadb-10.2.30/storage/innobase/srv/srv0srv.cc#L1756

Both problems hinder users who are trying to diagnose long semaphore waits on trx_sys.mutex.

Maybe the lock_print_info_summary() function should print dirty data with a warning if it can't lock trx_sys.mutex?
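
To make the suggestion concrete, here is a minimal sketch of the "try the mutex, otherwise print dirty data with a warning" idea. It is not InnoDB code: TrxSysSketch, its fields, print_info_summary_sketch() and the message wording are hypothetical names invented for illustration. It assumes the printed values can be read individually without the mutex (modelled here with std::atomic), so the mutex is only needed to get a mutually consistent snapshot.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical stand-in for the transaction system; not the real trx_sys_t.
struct TrxSysSketch {
    std::mutex mutex;                            // plays the role of trx_sys.mutex
    std::atomic<std::uint64_t> max_trx_id{0};    // assumed readable without the mutex
    std::atomic<std::uint64_t> n_active_trx{0};  // hypothetical second field
};

// Builds the summary text. If the mutex is busy, the function does not
// wait: it emits a warning and reads the fields without the lock, so the
// two values may not belong to the same consistent snapshot.
std::string print_info_summary_sketch(TrxSysSketch& sys) {
    std::ostringstream out;
    std::unique_lock<std::mutex> guard(sys.mutex, std::try_to_lock);
    if (!guard.owns_lock()) {
        out << "WARNING: trx_sys.mutex is busy; the values below were read "
               "without it and may be inconsistent\n";
    }
    out << "Trx id counter "
        << sys.max_trx_id.load(std::memory_order_relaxed) << "\n";
    out << "Active transactions "
        << sys.n_active_trx.load(std::memory_order_relaxed) << "\n";
    return out.str();
}
```

With this shape, neither SHOW ENGINE INNODB STATUS nor the monitor thread would block on the mutex; the cost is that the output can be stale or internally inconsistent, which the warning line makes visible.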



 Comments   
Comment by Sergey Vojtovich [ 2019-12-24 ]

FWIW lock_print_info_summary() of 10.3+ doesn't take trx_sys.mutex.

Comment by Geoff Montee (Inactive) [ 2019-12-24 ]

Thanks, svoj. In that case, this problem might be specific to MariaDB 10.2.

Comment by Geoff Montee (Inactive) [ 2020-01-24 ]

To be specific, it looks like this was fixed in 10.3+ by MDEV-14756 starting with MariaDB 10.3.5. See here:

https://github.com/MariaDB/server/commit/7078203389b04e742de660d78c36034a3a4deb59

Comment by Marko Mäkelä [ 2022-02-21 ]

The 10.2 release series will soon reach its end of life, and it is not feasible to fix this bug there.

In MariaDB Server 10.3.5, MDEV-15104 introduced a mutex-free trx_sys.get_max_trx_id() that is based on 64-bit atomic memory access.
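
For readers unfamiliar with the approach, a minimal sketch of a mutex-free ID counter based on a 64-bit atomic could look like the following. This is an illustration only, not the MariaDB implementation: the class name TrxIdCounterSketch, the initial value, and the relaxed memory ordering are assumptions made for the example.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch of a transaction ID counter; not the real trx_sys_t.
class TrxIdCounterSketch {
    // Assumed starting value; the real counter is initialized elsewhere.
    std::atomic<std::uint64_t> max_trx_id_{1};

public:
    // Returns the next ID that would be assigned. No mutex is taken,
    // so a diagnostic printer can call this even while other locks are held.
    std::uint64_t get_max_trx_id() const noexcept {
        return max_trx_id_.load(std::memory_order_relaxed);
    }

    // Atomically allocates an ID and returns the value that was assigned.
    std::uint64_t get_new_trx_id() noexcept {
        return max_trx_id_.fetch_add(1, std::memory_order_relaxed);
    }
};
```

Because the read is a single atomic load, a function such as lock_print_info_summary() no longer needs a mutex just to print the counter.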
