[MDEV-5089] possible deadlocks between rwlocks and mutexes Created: 2013-10-01  Updated: 2014-10-02  Resolved: 2014-10-02

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Fix Version/s: 5.5.40

Type: Task
Priority: Major
Reporter: Sergei Golubchik
Assignee: Sergey Vojtovich
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
PartOf
includes MDEV-5345 Deadlock between mysql_change_user(),... Closed
includes MDEV-5607 Query cache destroys uninitialized rw... Closed
includes MDEV-5616 Deadlock between CREATE/DROP FUNCTION... Closed
includes MDEV-6162 TokuDB: multiple locks and unlock of ... Closed
includes MDEV-6749 Deadlock between GRANT/REVOKE, SELECT... Closed
includes MDEV-6774 Deadlock between SELECT, DROP TABLE, ... Closed

 Description   

safe_mutex has built-in deadlock detection, but it only works for mutexes.
However, compiling with -DUSE_MUTEX_INSTEAD_OF_RW_LOCKS turns all rwlocks into mutexes, which brings them under safe_mutex as well.

With such a build, safe_mutex reports new lock order violations, and the server doesn't even start.
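
As a rough illustration of why treating rwlocks as mutexes surfaces new reports, here is a minimal sketch in Python (a toy model, not MariaDB code) of an order checker that normally sees only mutexes. The function name, the lock names M and R, and the path format are all hypothetical; the real safe_mutex implementation is more involved.

```python
def find_order_violations(paths, rwlocks_as_mutexes):
    """Return (held, acquired) pairs whose order contradicts an earlier path.

    Each path is a list of (kind, name) acquisitions, nested in that order.
    This is a single-threaded model of order recording, not a real lock.
    """
    seen = set()        # (earlier, later) pairs observed so far
    violations = []
    for path in paths:
        held = []
        for kind, name in path:
            if kind == "rwlock" and not rwlocks_as_mutexes:
                continue  # rwlocks are invisible to a mutex-only checker
            for other in held:
                if (name, other) in seen:
                    # The opposite order was recorded before: report it.
                    violations.append((other, name))
                seen.add((other, name))
            held.append(name)
    return violations


# Hypothetical locks: mutex "M" and rwlock "R", taken in opposite orders.
PATHS = [
    [("mutex", "M"), ("rwlock", "R")],
    [("rwlock", "R"), ("mutex", "M")],
]

print(find_order_violations(PATHS, rwlocks_as_mutexes=False))
print(find_order_violations(PATHS, rwlocks_as_mutexes=True))
```

With rwlocks_as_mutexes=False the inversion between M and R goes unnoticed; with True the same two paths produce one reported violation.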



 Comments   
Comment by Sergey Vojtovich [ 2014-02-13 ]

SHOW VARIABLES may acquire the LOCK_system_variables_hash read lock twice: in fill_variables() and then in intern_sys_var_ptr(). According to the pthread_rwlock_rdlock() manual, this seems to be acceptable: "A thread may hold multiple concurrent read locks on rwlock (that is, successfully call the pthread_rwlock_rdlock() function n times)."

The same functions also have mixed lock order:
fill_variables(): mysql_rwlock_rdlock(&LOCK_system_variables_hash);
show_status_array(): mysql_mutex_lock(&LOCK_global_system_variables);
intern_sys_var_ptr(): mysql_rwlock_rdlock(&LOCK_system_variables_hash);

The above should also be acceptable, because we don't acquire any locks while holding wrlock(&LOCK_system_variables_hash).

Will keep this code intact.

Comment by Sergey Vojtovich [ 2014-03-14 ]

Pushed one minor revision to 5.5.37:
revno: 4109
revision-id: svoj@mariadb.org-20140213074049-wo2l3qdtgi0s2mjd
parent: psergey@askmonty.org-20140311180702-1pntx903p1df1fyn
committer: Sergey Vojtovich <svoj@mariadb.org>
branch nick: 5.5
timestamp: Thu 2014-02-13 11:40:49 +0400
message:
MDEV-5089 - possible deadlocks between rwlocks and mutexes

Pre-MDL versions had direct relationship between LOCK_open and
LOCK_global_system_variables, e.g.:
intern_sys_var_ptr // locks LOCK_global_system_variables
mysql_sys_var_char
create_options_are_valid
ha_innobase::create
handler::ha_create
ha_create_table
rea_create_table
mysql_create_table_no_lock // locks LOCK_open
mysql_create_table

With MDL this relationship was removed, but mutex order was still
recorded. In fact there is indirect relationship between LOCK_open
and LOCK_global_system_variables via rwlocks in reverse order.

Removed LOCK_open and LOCK_global_system_variables order recording,
instead assert that LOCK_open is never held in intern_sys_var_ptr().

This solves only one of many problems detected with MDEV-5089.
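
The approach described in the commit message (stop recording the order, assert instead) can be sketched as follows. This is a toy Python model, not the actual patch: OwnedMutex and intern_sys_var_ptr_model are made-up names, and the returned value is a placeholder. The server has a mysql_mutex_assert_not_owner() macro for checks of this kind; since the revision itself is not quoted here, treat that mapping as an assumption.

```python
import threading


class OwnedMutex:
    """Mutex that remembers its owning thread so callers can assert on it."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self._owner = None

    def lock(self):
        self._lock.acquire()
        self._owner = threading.get_ident()

    def unlock(self):
        self._owner = None
        self._lock.release()

    def assert_not_owner(self):
        # Fail loudly if the calling thread currently holds this mutex.
        if self._owner == threading.get_ident():
            raise AssertionError(f"{self.name} must not be held here")


LOCK_open = OwnedMutex("LOCK_open")
LOCK_global_system_variables = OwnedMutex("LOCK_global_system_variables")


def intern_sys_var_ptr_model():
    """Model of the check: never enter while holding LOCK_open."""
    LOCK_open.assert_not_owner()
    LOCK_global_system_variables.lock()
    try:
        return "value"  # placeholder for the variable pointer
    finally:
        LOCK_global_system_variables.unlock()
```

Calling intern_sys_var_ptr_model() with LOCK_open held by the same thread raises AssertionError immediately, instead of relying on a recorded order between the two mutexes.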

Comment by Sergey Vojtovich [ 2014-04-25 ]

There is a read lock taken in reverse order in Aria:

ha_maria::repair();
  lock(share->intern_lock);
  _ma_update_auto_increment_key()/maria_rlast()/maria_rprev();
    rdlock(keyinfo->root_lock);
    unlock(keyinfo->root_lock);
  unlock(share->intern_lock);

This may conflict with the write lock taken e.g. in maria_write():

maria_write();
  wrlock(keyinfo->root_lock);
  _ma_ck_write_btree()/.../_ma_new();
    lock(share->intern_lock);
    unlock(share->intern_lock);
  unlock(keyinfo->root_lock);

But since the repair code runs under the protection of an exclusive lock (TL_WRITE), a deadlock doesn't seem to be possible.
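
To illustrate the argument, the sketch below (Python, a simplified model rather than Aria code) runs two threads that take intern_lock and root_lock in opposite orders, but only after taking a common outer lock. table_lock stands in for the table-level lock and is an assumption of the model: both paths take it exclusively here, and root_lock is modelled as a plain mutex rather than an rwlock.

```python
import threading

table_lock = threading.Lock()   # stand-in for the exclusive table-level lock
intern_lock = threading.Lock()  # models share->intern_lock
root_lock = threading.Lock()    # models keyinfo->root_lock (rwlock in Aria)
finished = []


def repair_model():
    # Order: intern_lock, then root_lock.
    with table_lock:
        with intern_lock:
            with root_lock:
                finished.append("repair")


def write_model():
    # Opposite order: root_lock, then intern_lock.
    with table_lock:
        with root_lock:
            with intern_lock:
                finished.append("write")


def run_both():
    """Run both paths concurrently and return the names that completed."""
    finished.clear()
    threads = [
        threading.Thread(target=repair_model),
        threading.Thread(target=write_model),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    return sorted(finished)


print(run_both())
```

Because the outer lock serializes the two paths, the inner order inversion can never be exercised by both threads at once, so both complete. Without the outer lock the same two functions could deadlock.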

Comment by Sergey Vojtovich [ 2014-09-16 ]

In 5.5 there is a possible deadlock involving 3 mutexes and 2 rwlocks:

lock(LOCK_open)                    -> rdlock(LOCK_grant)                  SELECT * FROM INFORMATION_SCHEMA.COLUMNS
wrlock(LOCK_grant)                 -> lock(acl_cache->lock)               GRANT/REVOKE CREATE/DROP USER
lock(acl_cache->lock)              -> lock(LOCK_global_system_variables)  FLUSH PRIVILEGES
lock(LOCK_global_system_variables) -> wrlock(LOCK_logger)                 SET @@global.log_output="TABLE"
rdlock(LOCK_logger)                -> lock(LOCK_open)                     SELECT 1

But these threads are serialized by table-level locks: e.g. GRANT/etc. acquires a write lock and FLUSH PRIVILEGES acquires a read lock. No actual deadlock is possible.

In 10.0 things are even better: there is no LOCK_open -> LOCK_grant order.
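
The five orderings listed above form a cycle in the lock-order graph, which a short sketch can confirm. This is a Python illustration only: find_cycle and EDGES_55 are hypothetical names, the edges are copied from the list in this comment, and read/write modes and the table-level serialization are deliberately ignored.

```python
def find_cycle(edges):
    """Return one cycle as a list of nodes (first node repeated at the end),
    or None if the directed graph has no cycle. Depth-first search."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in graph[node]:
            if color[nxt] == GREY:
                # Back edge to a node on the current path: cycle found.
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# "A -> B" means B was acquired while A was held (5.5, from the list above).
EDGES_55 = [
    ("LOCK_open", "LOCK_grant"),
    ("LOCK_grant", "acl_cache->lock"),
    ("acl_cache->lock", "LOCK_global_system_variables"),
    ("LOCK_global_system_variables", "LOCK_logger"),
    ("LOCK_logger", "LOCK_open"),
]

print(find_cycle(EDGES_55))      # the cycle through all five locks
print(find_cycle(EDGES_55[1:]))  # without LOCK_open -> LOCK_grant
```

Dropping the LOCK_open -> LOCK_grant edge, as described for 10.0, leaves the graph acyclic, so the second call finds nothing.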

Comment by Sergey Vojtovich [ 2014-10-02 ]

All issues found in 5.5 are reported/fixed. No extra problems were detected in 10.0 and 10.1.
