[MDEV-23843] Assertions in Diagnostics_area upon table operations under FTWRL Created: 2020-09-29  Updated: 2021-03-02  Resolved: 2021-03-02

Status: Closed
Project: MariaDB Server
Component/s: Admin statements, Locking, Replication
Affects Version/s: 10.4, 10.5
Fix Version/s: 10.4.19, 10.5.10

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Michael Widenius
Resolution: Fixed Votes: 0
Labels: affects-tests, regression


 Description   

The failure appeared in 10.4 branch after this commit:

commit 16ea692ed490a1a2c6dc0b855247d5bd02bd2345
Author: Monty
Date:   Mon Sep 7 10:38:12 2020 +0300
 
    MDEV-23586 Mariabackup: GTID saved for replication in 10.4.14 is wrong

--source include/have_log_bin.inc
 
CREATE TABLE t (a INT);
FLUSH TABLES WITH READ LOCK;
--connect (con1,localhost,root,,)
SET lock_wait_timeout= 1;
OPTIMIZE TABLE t;
 
# Cleanup
--disconnect con1
--connection default
UNLOCK TABLES;
DROP TABLE t;

10.4 79e32e47

mysqld: /data/src/10.4/sql/sql_error.cc:445: void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*): Assertion `! is_set() || m_can_overwrite_status' failed.
200929 17:06:23 [ERROR] mysqld got signal 6 ;
 
#7  0x00007fea87695f36 in __GI___assert_fail (assertion=0x5622d20ab0e0 "! is_set() || m_can_overwrite_status", file=0x5622d20aae78 "/data/src/10.4/sql/sql_error.cc", line=445, function=0x5622d20ab108 "void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*)") at assert.c:101
#8  0x00005622d12a9330 in Diagnostics_area::set_error_status (this=0x7fea68006780, sql_errno=1205, message=0x7fea815cf280 "Lock wait timeout exceeded; try restarting transaction", sqlstate=0x5622d20d068d "HY000", ucid=..., error_condition=0x0) at /data/src/10.4/sql/sql_error.cc:445
#9  0x00005622d12834ed in THD::raise_condition (this=0x7fea68000d90, sql_errno=1205, sqlstate=0x5622d20d068d "HY000", level=Sql_state_errno_level::WARN_LEVEL_ERROR, ucid=..., msg=0x7fea815cf280 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.cc:1100
#10 0x00005622d11c50b2 in THD::raise_condition (this=0x7fea68000d90, sql_errno=1205, sqlstate=0x0, level=Sql_state_errno_level::WARN_LEVEL_ERROR, msg=0x7fea815cf280 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.h:4385
#11 0x00005622d11b7a43 in my_message_sql (error=1205, str=0x7fea815cf280 "Lock wait timeout exceeded; try restarting transaction", MyFlags=0) at /data/src/10.4/sql/mysqld.cc:3373
#12 0x00005622d1fc8bbd in my_error (nr=1205, MyFlags=0) at /data/src/10.4/mysys/my_error.c:125
#13 0x00005622d1494919 in MDL_context::acquire_lock (this=0x7fea68000eb0, mdl_request=0x7fea815cf750, lock_wait_timeout=1) at /data/src/10.4/sql/mdl.cc:2365
#14 0x00005622d17a5432 in MYSQL_BIN_LOG::write (this=0x5622d2b85a60 <mysql_bin_log>, event_info=0x7fea815cf990, with_annotate=0x0) at /data/src/10.4/sql/log.cc:6391
#15 0x00005622d12949b0 in THD::binlog_query (this=0x7fea68000d90, qtype=THD::STMT_QUERY_TYPE, query_arg=0x7fea68014348 "OPTIMIZE TABLE t", query_len=16, is_trans=false, direct=false, suppress_use=false, errcode=0) at /data/src/10.4/sql/sql_class.cc:7202
#16 0x00005622d13dd50e in write_bin_log (thd=0x7fea68000d90, clear_error=true, query=0x7fea68014348 "OPTIMIZE TABLE t", query_length=16, is_trans=false) at /data/src/10.4/sql/sql_table.cc:1988
#17 0x00005622d149cefe in Sql_cmd_optimize_table::execute (this=0x7fea68014ad0, thd=0x7fea68000d90) at /data/src/10.4/sql/sql_admin.cc:1391
#18 0x00005622d130718d in mysql_execute_command (thd=0x7fea68000d90) at /data/src/10.4/sql/sql_parse.cc:6098
#19 0x00005622d130c733 in mysql_parse (thd=0x7fea68000d90, rawbuf=0x7fea68014348 "OPTIMIZE TABLE t", length=16, parser_state=0x7fea815d0550, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:7896
#20 0x00005622d12f8c3e in dispatch_command (command=COM_QUERY, thd=0x7fea68000d90, packet=0x7fea680088a1 "", packet_length=16, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1834
#21 0x00005622d12f74a6 in do_command (thd=0x7fea68000d90) at /data/src/10.4/sql/sql_parse.cc:1352
#22 0x00005622d14852bb in do_handle_one_connection (connect=0x5622d535bf40) at /data/src/10.4/sql/sql_connect.cc:1412
#23 0x00005622d1485004 in handle_one_connection (arg=0x5622d535bf40) at /data/src/10.4/sql/sql_connect.cc:1316
#24 0x00005622d1e9dbf2 in pfs_spawn_thread (arg=0x5622d530b5f0) at /data/src/10.4/storage/perfschema/pfs.cc:1869
#25 0x00007fea87bad609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#26 0x00007fea87781293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Only reproducible on 10.4, not on 10.2-10.3 or 10.5 (yet?).
Also, only on 10.4 a non-debug build returns ER_LOCK_WAIT_TIMEOUT for OPTIMIZE in the test case above. On other versions it only prints "Lock wait timeout exceeded; try restarting transaction" in the optimize output, but doesn't throw the error.
No obvious problem on a non-debug build.
Not reproducible without binary logging.



 Comments   
Comment by Elena Stepanova [ 2020-09-29 ]

Obviously related problem, different assertion. All the same notes apply: appeared after the same commit, new ER_LOCK_WAIT_TIMEOUT, etc.

--source include/have_log_bin.inc
 
FLUSH TABLES WITH READ LOCK;
--connect (con1,localhost,root,,test)
SET lock_wait_timeout= 1;
FLUSH TABLES;

10.4 79e32e47

mysqld: /data/src/10.4/sql/sql_error.cc:335: void Diagnostics_area::set_ok_status(ulonglong, ulonglong, const char*): Assertion `!is_set() || (m_status == DA_OK_BULK && is_bulk_op())' failed.
200929 19:14:17 [ERROR] mysqld got signal 6 ;
 
#7  0x00007f49f19bdf36 in __GI___assert_fail (assertion=0x5556574c8ff0 "!is_set() || (m_status == DA_OK_BULK && is_bulk_op())", file=0x5556574c8e78 "/data/src/10.4/sql/sql_error.cc", line=335, function=0x5556574c9028 "void Diagnostics_area::set_ok_status(ulonglong, ulonglong, const char*)") at assert.c:101
#8  0x00005556566c6e02 in Diagnostics_area::set_ok_status (this=0x7f49c8006780, affected_rows=0, last_insert_id=0, message=0x0) at /data/src/10.4/sql/sql_error.cc:335
#9  0x00005556566686c4 in my_ok (thd=0x7f49c8000d90, affected_rows_arg=0, id=0, message=0x0) at /data/src/10.4/sql/sql_class.h:5040
#10 0x0000555656722c50 in mysql_execute_command (thd=0x7f49c8000d90) at /data/src/10.4/sql/sql_parse.cc:5500
#11 0x000055565672a733 in mysql_parse (thd=0x7f49c8000d90, rawbuf=0x7f49c8014348 "FLUSH TABLES", length=12, parser_state=0x7f49ec0f9550, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:7896
#12 0x0000555656716c3e in dispatch_command (command=COM_QUERY, thd=0x7f49c8000d90, packet=0x7f49c80088a1 "FLUSH TABLES", packet_length=12, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1834
#13 0x00005556567154a6 in do_command (thd=0x7f49c8000d90) at /data/src/10.4/sql/sql_parse.cc:1352
#14 0x00005556568a32bb in do_handle_one_connection (connect=0x55565aa29f40) at /data/src/10.4/sql/sql_connect.cc:1412
#15 0x00005556568a3004 in handle_one_connection (arg=0x55565aa29f40) at /data/src/10.4/sql/sql_connect.cc:1316
#16 0x00005556572bbbf2 in pfs_spawn_thread (arg=0x55565a9d95f0) at /data/src/10.4/storage/perfschema/pfs.cc:1869
#17 0x00007f49f1ed5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#18 0x00007f49f1aa9293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Elena Stepanova [ 2020-11-16 ]

Possibly the same or closely related problem, no binary log is needed.

CREATE TABLE t (a INT) ENGINE=Aria;
 
--connect (con1,localhost,root,,test)
FLUSH TABLES WITH READ LOCK;
 
--connection default
SET lock_wait_timeout= 1;
--error 0,ER_LOCK_WAIT_TIMEOUT
FLUSH TABLES t FOR EXPORT;
 
# Cleanup
--connection con1
UNLOCK TABLES;
--disconnect con1
--connection default
UNLOCK TABLES;
DROP TABLE t;

10.4 1bebc8de

mysqld: /data/src/10.4/sql/sql_error.cc:445: void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*): Assertion `! is_set() || m_can_overwrite_status' failed.
201116 19:16:51 [ERROR] mysqld got signal 6 ;
 
#7  0x00007fd1b1b72f36 in __GI___assert_fail (assertion=0x564a471cc220 "! is_set() || m_can_overwrite_status", file=0x564a471cbfb8 "/data/src/10.4/sql/sql_error.cc", line=445, function=0x564a471cc248 "void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*)") at assert.c:101
#8  0x0000564a46487296 in Diagnostics_area::set_error_status (this=0x7fd190006780, sql_errno=1205, message=0x7fd1ac342640 "Lock wait timeout exceeded; try restarting transaction", sqlstate=0x564a471f1b1d "HY000", ucid=..., error_condition=0x0) at /data/src/10.4/sql/sql_error.cc:445
#9  0x0000564a46461415 in THD::raise_condition (this=0x7fd190000d90, sql_errno=1205, sqlstate=0x564a471f1b1d "HY000", level=Sql_state_errno_level::WARN_LEVEL_ERROR, ucid=..., msg=0x7fd1ac342640 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.cc:1100
#10 0x0000564a463a2a94 in THD::raise_condition (this=0x7fd190000d90, sql_errno=1205, sqlstate=0x0, level=Sql_state_errno_level::WARN_LEVEL_ERROR, msg=0x7fd1ac342640 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.h:4393
#11 0x0000564a463953b8 in my_message_sql (error=1205, str=0x7fd1ac342640 "Lock wait timeout exceeded; try restarting transaction", MyFlags=0) at /data/src/10.4/sql/mysqld.cc:3373
#12 0x0000564a470e7d65 in my_error (nr=1205, MyFlags=0) at /data/src/10.4/mysys/my_error.c:125
#13 0x0000564a46673c75 in MDL_context::acquire_lock (this=0x7fd190000eb0, mdl_request=0x7fd1ac3429e0, lock_wait_timeout=1) at /data/src/10.4/sql/mdl.cc:2365
#14 0x0000564a468461f2 in ha_maria_implicit_commit (thd=0x7fd190000d90, new_trn=false) at /data/src/10.4/sql/handler.cc:144
#15 0x0000564a464e595e in mysql_execute_command (thd=0x7fd190000d90) at /data/src/10.4/sql/sql_parse.cc:6204
#16 0x0000564a464ea897 in mysql_parse (thd=0x7fd190000d90, rawbuf=0x7fd190013458 "FLUSH TABLES t FOR EXPORT", length=25, parser_state=0x7fd1ac343550, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:7938
#17 0x0000564a464d6b9d in dispatch_command (command=COM_QUERY, thd=0x7fd190000d90, packet=0x7fd1900087b1 "FLUSH TABLES t FOR EXPORT", packet_length=25, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1839
#18 0x0000564a464d5405 in do_command (thd=0x7fd190000d90) at /data/src/10.4/sql/sql_parse.cc:1357
#19 0x0000564a46664289 in do_handle_one_connection (connect=0x564a48be00b0) at /data/src/10.4/sql/sql_connect.cc:1412
#20 0x0000564a46663fd2 in handle_one_connection (arg=0x564a48be00b0) at /data/src/10.4/sql/sql_connect.cc:1316
#21 0x0000564a47082f46 in pfs_spawn_thread (arg=0x564a48b2d8e0) at /data/src/10.4/storage/perfschema/pfs.cc:1869
#22 0x00007fd1b23f3609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#23 0x00007fd1b1c5e293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Michael Widenius [ 2021-02-14 ]

MDEV-23843 Assertions in Diagnostics_area upon table operations under FTWRL

2 different problems:

  • MYSQL_BIN_LOG::write() did not check if mdl_context.acquire_lock() failed
  • Sql_cmd_optimize_table::execute() and Sql_cmd_repair_table::execute()
    called write_bin_log(), which could fail if sql_admin() had already
    called my_eof()

Fixed by adding check for aquire_lock() return status and protect
write_bin_log() in the above two functions with set_overwrite_status().

Comment by Elena Stepanova [ 2021-03-02 ]

The failure still happens after the patch, for example with the same test case from the description, only with ANALYZE instead of OPTIMIZE:

--source include/have_log_bin.inc
 
CREATE TABLE t (a INT);
FLUSH TABLES WITH READ LOCK;
--connect (con1,localhost,root,,)
SET lock_wait_timeout= 1;
ANALYZE TABLE t;
 
# Cleanup
--disconnect con1
--connection default
UNLOCK TABLES;
DROP TABLE t;

10.4 a6c6c4f4

mysqld: /data/src/10.4/sql/sql_error.cc:445: void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*): Assertion `! is_set() || m_can_overwrite_status' failed.
210302 13:02:06 [ERROR] mysqld got signal 6 ;
 
#7  0x00007fb572668f36 in __GI___assert_fail (assertion=0x56098f8f5408 "! is_set() || m_can_overwrite_status", file=0x56098f8f51a0 "/data/src/10.4/sql/sql_error.cc", line=445, function=0x56098f8f5430 "void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*)") at assert.c:101
#8  0x000056098ebab69e in Diagnostics_area::set_error_status (this=0x7fb554006780, sql_errno=1205, message=0x7fb56c5a2280 "Lock wait timeout exceeded; try restarting transaction", sqlstate=0x56098f91ad4d "HY000", ucid=..., error_condition=0x0) at /data/src/10.4/sql/sql_error.cc:445
#9  0x000056098eb8566d in THD::raise_condition (this=0x7fb554000d90, sql_errno=1205, sqlstate=0x56098f91ad4d "HY000", level=Sql_state_errno_level::WARN_LEVEL_ERROR, ucid=..., msg=0x7fb56c5a2280 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.cc:1103
#10 0x000056098eac6a26 in THD::raise_condition (this=0x7fb554000d90, sql_errno=1205, sqlstate=0x0, level=Sql_state_errno_level::WARN_LEVEL_ERROR, msg=0x7fb56c5a2280 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.h:4387
#11 0x000056098eab9386 in my_message_sql (error=1205, str=0x7fb56c5a2280 "Lock wait timeout exceeded; try restarting transaction", MyFlags=0) at /data/src/10.4/sql/mysqld.cc:3346
#12 0x000056098f8115cf in my_error (nr=1205, MyFlags=0) at /data/src/10.4/mysys/my_error.c:125
#13 0x000056098ed98bc1 in MDL_context::acquire_lock (this=0x7fb554000eb0, mdl_request=0x7fb56c5a2750, lock_wait_timeout=1) at /data/src/10.4/sql/mdl.cc:2365
#14 0x000056098f0abc36 in MYSQL_BIN_LOG::write (this=0x5609903ac300 <mysql_bin_log>, event_info=0x7fb56c5a2990, with_annotate=0x0) at /data/src/10.4/sql/log.cc:6393
#15 0x000056098eb96b4c in THD::binlog_query (this=0x7fb554000d90, qtype=THD::STMT_QUERY_TYPE, query_arg=0x7fb554015668 "ANALYZE TABLE t", query_len=15, is_trans=false, direct=false, suppress_use=false, errcode=0) at /data/src/10.4/sql/sql_class.cc:7209
#16 0x000056098ece0912 in write_bin_log (thd=0x7fb554000d90, clear_error=true, query=0x7fb554015668 "ANALYZE TABLE t", query_length=15, is_trans=false) at /data/src/10.4/sql/sql_table.cc:1987
#17 0x000056098eda1101 in Sql_cmd_analyze_table::execute (this=0x7fb554015de0, thd=0x7fb554000d90) at /data/src/10.4/sql/sql_admin.cc:1399
#18 0x000056098ec09869 in mysql_execute_command (thd=0x7fb554000d90) at /data/src/10.4/sql/sql_parse.cc:6161
#19 0x000056098ec0ee17 in mysql_parse (thd=0x7fb554000d90, rawbuf=0x7fb554015668 "ANALYZE TABLE t", length=15, parser_state=0x7fb56c5a3550, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:7959
#20 0x000056098ebfb121 in dispatch_command (command=COM_QUERY, thd=0x7fb554000d90, packet=0x7fb55400acc1 "", packet_length=15, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1855
#21 0x000056098ebf9989 in do_command (thd=0x7fb554000d90) at /data/src/10.4/sql/sql_parse.cc:1373
#22 0x000056098ed891a5 in do_handle_one_connection (connect=0x560992edbb60) at /data/src/10.4/sql/sql_connect.cc:1412
#23 0x000056098ed88eee in handle_one_connection (arg=0x560992edbb60) at /data/src/10.4/sql/sql_connect.cc:1316
#24 0x000056098f7ac958 in pfs_spawn_thread (arg=0x560992eaf7c0) at /data/src/10.4/storage/perfschema/pfs.cc:1869
#25 0x00007fb572ee9609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#26 0x00007fb572754293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Michael Widenius [ 2021-03-02 ]

Need to fix Elenas new test case

Comment by Michael Widenius [ 2021-03-02 ]

Fixed and pushed into 10.4

Generated at Thu Feb 08 09:25:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.