[MDEV-22458] Server with WSREP hangs after INSERT, wrong usage of mutex 'LOCK_thd_data' and 'share->intern_lock' / 'lock->mutex'
Created: 2020-05-04  Updated: 2020-07-24  Resolved: 2020-07-24

Status: Closed
Project: MariaDB Server
Component/s: Galera, wsrep
Affects Version/s: 10.4, 10.5
Fix Version/s: 10.4.14, 10.5.5

Type: Bug
Priority: Major
Reporter: Elena Stepanova
Assignee: Jan Lindström (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Attachments: hang_all_threads.txt (text file)
Issue Links:
Relates
relates to MDEV-22460 safe_mutex: Found wrong usage of mute... Closed
relates to MDEV-22154 safe_mutex: Trying to lock mutex at <... Closed

 Description   

--source include/galera_cluster.inc
 
CREATE TABLE t1 (a INT);
 
--connect (con1,localhost,root,,test)
--let $con1= `SELECT CONNECTION_ID()`
INSERT INTO t1 VALUES (1),(2),(3),(4);
 
--connection default
--error ER_TARGET_NOT_EXPLAINABLE
eval SHOW EXPLAIN FOR $con1;
 
--connection con1
INSERT INTO t1 VALUES (5),(6),(7),(8);
 
# Cleanup
--disconnect con1
--connection default
--reap
DROP TABLE t1;

The test hangs on the second INSERT, seemingly forever.
While it hangs, the process list shows this:

10.4 7f03a933

+----+-------------+-----------------+------+---------+------+--------------------------+-----------------------+----------+
| Id | User        | Host            | db   | Command | Time | State                    | Info                  | Progress |
+----+-------------+-----------------+------+---------+------+--------------------------+-----------------------+----------+
|  1 | system user |                 | NULL | Sleep   |   83 | wsrep aborter idle       | NULL                  |    0.000 |
|  2 | system user |                 | NULL | Sleep   |   83 |                          | NULL                  |    0.000 |
|  4 | system user |                 | NULL | Daemon  | NULL | InnoDB purge worker      | NULL                  |    0.000 |
|  5 | system user |                 | NULL | Daemon  | NULL | InnoDB purge worker      | NULL                  |    0.000 |
|  6 | system user |                 | NULL | Daemon  | NULL | InnoDB purge worker      | NULL                  |    0.000 |
|  3 | system user |                 | NULL | Daemon  | NULL | InnoDB purge coordinator | NULL                  |    0.000 |
|  7 | system user |                 | NULL | Daemon  | NULL | InnoDB shutdown handler  | NULL                  |    0.000 |
| 17 | root        | localhost       | test | Sleep   |   73 |                          | NULL                  |    0.000 |
| 18 | root        | localhost:39086 | test | Sleep   |   73 |                          | NULL                  |    0.000 |
| 19 | root        | localhost       |      | Busy    |   73 | Init                     | NULL                  |    0.000 |
| 20 | root        | localhost:39096 | NULL | Query   |    0 | Init                     | show full processlist |    0.000 |
+----+-------------+-----------------+------+---------+------+--------------------------+-----------------------+----------+

Stack traces of all threads are attached as hang_all_threads.txt.

Reproducible on 10.4 and 10.5, on debug and non-debug builds alike, both in a cluster (e.g. inside the galera suite) and on a single node with wsrep enabled, with both InnoDB and MyISAM.

Couldn't reproduce on 10.3.
Couldn't reproduce without wsrep_on.

Occasionally, while hanging the same way, some builds also produce a mutex error in the log:

safe_mutex: Found wrong usage of mutex 'LOCK_thd_data' and 'share->intern_lock'
Mutex currently locked (in reverse order):
share->intern_lock                /data/src/10.5e/storage/myisam/mi_locking.c  line 57
LOCK_thd_data                     /data/src/10.5e/sql/sql_parse.cc  line 9096
safe_mutex: Found wrong usage of mutex 'LOCK_thd_data' and 'lock->mutex'
Mutex currently locked (in reverse order):
lock->mutex                       /data/src/10.5e/mysys/thr_lock.c  line 763
LOCK_thd_data                     /data/src/10.5e/sql/sql_parse.cc  line 9096

The error appears in the log sporadically. I assume it is related.
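
For readers unfamiliar with this class of warning: as I read the log, it reports that two mutexes were taken in an order that conflicts with an order recorded earlier, which is the classic precondition for a deadlock. The sketch below is a toy lock-order checker in Python, not MariaDB's safe_mutex code. The class LockOrderChecker, its methods, and the choice of which code path takes which lock first are all hypothetical and only serve to illustrate the idea.

```python
import threading


class LockOrderChecker:
    """Toy lock-order checker (hypothetical, NOT MariaDB's safe_mutex).

    It remembers every (held_first, taken_second) pair it has seen and
    reports when a later acquisition uses the opposite order.
    """

    def __init__(self):
        self._orders = set()            # (held_first, taken_second) pairs
        self._guard = threading.Lock()  # protects _orders across threads
        self._held = threading.local()  # per-thread list of held lock names

    def _stack(self):
        if not hasattr(self._held, "names"):
            self._held.names = []
        return self._held.names

    def acquire(self, name):
        """Record that this thread takes `name`; return inversion messages."""
        problems = []
        stack = self._stack()
        with self._guard:
            for held in stack:
                # Was `name` ever taken BEFORE `held`? Then this is an inversion.
                if (name, held) in self._orders:
                    problems.append(
                        f"wrong usage of mutex '{held}' and '{name}'"
                    )
                self._orders.add((held, name))
        stack.append(name)
        return problems

    def release(self, name):
        self._stack().remove(name)


checker = LockOrderChecker()

# Hypothetical path 1: table-share lock first, then the per-connection lock.
first = checker.acquire("share->intern_lock") + checker.acquire("LOCK_thd_data")
checker.release("LOCK_thd_data")
checker.release("share->intern_lock")

# Hypothetical path 2: the same two locks in the opposite order.
second = checker.acquire("LOCK_thd_data") + checker.acquire("share->intern_lock")
checker.release("share->intern_lock")
checker.release("LOCK_thd_data")

print(first)
print(second)
```

In this sketch the first path records an order without complaint and the second path is flagged, even though neither run actually deadlocks. A checker of this kind warns about the inconsistent ordering itself, which may explain why the message shows up only sporadically while the hang is reproducible.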



 Comments   
Comment by Mario Karuza (Inactive) [ 2020-06-29 ]

Fix PR: https://github.com/MariaDB/server/pull/1614
