Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.3.7, 10.4.0, 10.5.0
Description
Assert hit during RQG testing.
|
# 2020-09-02T03:02:48 [4059144] | [rr 4061206 1280885]2020-09-02 02:59:21 0x7f08f187d300[rr 4061206 1280888] InnoDB: Assertion failure in file /home/mleich/Server/10.5A/storage/innobase/fil/fil0fil.cc line 2243
|
# 2020-09-02T03:02:48 [4059144] | [rr 4061206 1280890]InnoDB: Failing assertion: !space->referenced()
|
# 2020-09-02T03:02:48 [4059144] | [rr 4061206 1280892]InnoDB: We intentionally generate a memory trap.
|
...
|
Query (0x62b000126238): ALTER TABLE t2 ADD PRIMARY KEY ( col_int, col_text(9) ) /* E_R Thread11 QNO 444 CON_ID 62 */
|
# 2020-09-02T03:02:48 [4059144] | Connection ID (thread ID): 62
|
# 2020-09-02T03:02:48 [4059144] | [rr 4061206 1300994]Status: NOT_KILLED
|
|
(rr) bt
|
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
|
#1 0x00007f08efbd9859 in __GI_abort () at abort.c:79
|
#2 0x00005637b2e99aa8 in ut_dbg_assertion_failed (expr=expr@entry=0x5637b3f9ca00 "!space->referenced()", file=file@entry=0x5637b3fa3090 "/home/mleich/Server/10.5A/storage/innobase/fil/fil0fil.cc", line=line@entry=2243)
|
at /home/mleich/Server/10.5A/storage/innobase/ut/ut0dbg.cc:60
|
#3 0x00005637b316faf3 in fil_delete_tablespace (id=<optimized out>, if_exists=if_exists@entry=false, detached_handles=detached_handles@entry=0x7f08f18728c0) at /home/mleich/Server/10.5A/storage/innobase/fil/fil0fil.cc:2243
|
#4 0x00005637b2c487aa in row_drop_table_for_mysql (name=<optimized out>, trx=trx@entry=0x22773c0f2b90, sqlcom=sqlcom@entry=SQLCOM_DROP_DB, create_failed=create_failed@entry=false, nonatomic=<optimized out>, nonatomic@entry=false)
|
at /home/mleich/Server/10.5A/storage/innobase/row/row0mysql.cc:3722
|
#5 0x00005637b320ffef in fts_drop_table (trx=trx@entry=0x22773c0f2b90, table_name=table_name@entry=0x7f08f1873bc0 "test/FTS_", '0' <repeats 13 times>, "26a_DELETED") at /home/mleich/Server/10.5A/storage/innobase/fts/fts0fts.cc:1443
|
#6 0x00005637b321090d in fts_drop_common_tables (trx=trx@entry=0x22773c0f2b90, fts_table=fts_table@entry=0x7f08f1873f60) at /home/mleich/Server/10.5A/storage/innobase/fts/fts0fts.cc:1577
|
#7 0x00005637b3210b4c in fts_drop_tables (trx=trx@entry=0x22773c0f2b90, table=table@entry=0x619000274820) at /home/mleich/Server/10.5A/storage/innobase/fts/fts0fts.cc:1688
|
#8 0x00005637b2c47a8b in row_drop_ancillary_fts_tables (trx=0x22773c0f2b90, table=0x619000274820) at /home/mleich/Server/10.5A/storage/innobase/row/row0mysql.cc:3214
|
#9 row_drop_table_for_mysql (name=name@entry=0x7f08f1875730 "test/#sql-alter-3df816-3e", trx=trx@entry=0x22773c0f2b90, sqlcom=sqlcom@entry=SQLCOM_ALTER_TABLE, create_failed=create_failed@entry=false, nonatomic=<optimized out>, nonatomic@entry=true)
|
at /home/mleich/Server/10.5A/storage/innobase/row/row0mysql.cc:3682
|
#10 0x00005637b2840018 in ha_innobase::delete_table (this=this@entry=0x62b000128fb0, name=name@entry=0x61b0005a1140 "./test/#sql-alter-3df816-3e", sqlcom=sqlcom@entry=SQLCOM_ALTER_TABLE)
|
at /home/mleich/Server/10.5A/storage/innobase/handler/ha_innodb.cc:13211
|
#11 0x00005637b281c22e in ha_innobase::delete_table (this=0x62b000128fb0, name=0x61b0005a1140 "./test/#sql-alter-3df816-3e") at /home/mleich/Server/10.5A/storage/innobase/handler/ha_innodb.cc:13338
|
#12 0x00005637b1b65996 in hton_drop_table (hton=<optimized out>, path=<optimized out>) at /home/mleich/Server/10.5A/sql/handler.cc:564
|
#13 0x00005637b19acec9 in THD::rm_temporary_table (this=this@entry=0x62b00011f218, base=0x615000001e98, path=0x61b0005a1140 "./test/#sql-alter-3df816-3e") at /home/mleich/Server/10.5A/sql/temporary_tables.cc:703
|
#14 0x00005637b19af049 in THD::free_tmp_table_share (this=this@entry=0x62b00011f218, share=share@entry=0x61b0005a0b98, delete_table=delete_table@entry=true) at /home/mleich/Server/10.5A/sql/handler.h:1718
|
#15 0x00005637b19b4f8b in THD::drop_temporary_table (this=this@entry=0x62b00011f218, table=table@entry=0x619000956a98, is_trans=is_trans@entry=0x0, delete_table=delete_table@entry=true) at /home/mleich/Server/10.5A/sql/temporary_tables.cc:669
|
#16 0x00005637b15a3be7 in mysql_alter_table (thd=thd@entry=0x62b00011f218, new_db=<optimized out>, new_name=new_name@entry=0x62b000124060, create_info=create_info@entry=0x7f08f1878e50, table_list=<optimized out>, table_list@entry=0x62b0001263d0,
|
alter_info=alter_info@entry=0x7f08f1878d40, order_num=<optimized out>, order=<optimized out>, ignore=<optimized out>, if_exists=<optimized out>) at /home/mleich/Server/10.5A/sql/sql_table.cc:10997
|
#17 0x00005637b174eb72 in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x62b00011f218) at /home/mleich/Server/10.5A/sql/structs.h:559
|
#18 0x00005637b12b662a in mysql_execute_command (thd=thd@entry=0x62b00011f218) at /home/mleich/Server/10.5A/sql/sql_parse.cc:5952
|
#19 0x00005637b126454a in mysql_parse (thd=thd@entry=0x62b00011f218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7f08f187aed0, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false)
|
at /home/mleich/Server/10.5A/sql/sql_parse.cc:7994
|
#20 0x00005637b1295bb6 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b00011f218, packet=packet@entry=0x629001b26219 " ALTER TABLE t2 ADD PRIMARY KEY ( col_int, col_text(9) ) /* E_R Thread11 QNO 444 CON_ID 62 */ ",
|
packet_length=packet_length@entry=95, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /home/mleich/Server/10.5A/sql/sql_class.h:1251
|
#21 0x00005637b12a105d in do_command (thd=0x62b00011f218) at /home/mleich/Server/10.5A/sql/sql_parse.cc:1348
|
#22 0x00005637b173467c in do_handle_one_connection (connect=<optimized out>, connect@entry=0x6080000031b8, put_in_cache=put_in_cache@entry=true) at /home/mleich/Server/10.5A/sql/sql_connect.cc:1410
|
#23 0x00005637b17358b2 in handle_one_connection (arg=arg@entry=0x6080000031b8) at /home/mleich/Server/10.5A/sql/sql_connect.cc:1312
|
#24 0x00005637b25b3d03 in pfs_spawn_thread (arg=0x61500000a818) at /home/mleich/Server/10.5A/storage/perfschema/pfs.cc:2201
|
#25 0x000032542082d609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#26 0x00007f08efcd6103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
(rr)
|
|
origin/10.5, 10.5 0af6e525212f81c2da8185817c6ca2e7c3ac036a 2020-09-02T09:29:57+10:00
|
The server claims to be 10.5.6.
|
|
Getting RQG
|
==========
|
git clone https://github.com/mleich1/rqg --branch experimental RQG
|
origin/experimental 8762b0e26ec3c127695bd36b19a00aa6723705d4 2020-09-01T13:43:47+02:00
|
|
perl rqg.pl \
|
--no-mask \
|
--seed=random \
|
--engine=InnoDB \
|
--queries=100000000 \
|
--restart_timeout=120 \
|
--mysqld=--log_output=none \
|
--mysqld=--loose-innodb-use-atomic-writes \
|
--mysqld=--plugin-load-add=file_key_management.so \
|
--mysqld=--loose-file-key-management-filename=/RQG/conf/mariadb/encryption_keys.txt \
|
--mysqld=--innodb_lock_schedule_algorithm=fcfs \
|
--mysqld=--loose-idle_write_transaction_timeout=0 \
|
--mysqld=--loose-idle_transaction_timeout=0 \
|
--mysqld=--loose-idle_readonly_transaction_timeout=0 \
|
--mysqld=--connect_timeout=60 \
|
--mysqld=--interactive_timeout=28800 \
|
--mysqld=--slave_net_timeout=60 \
|
--mysqld=--net_read_timeout=30 \
|
--mysqld=--net_write_timeout=60 \
|
--mysqld=--loose-table_lock_wait_timeout=50 \
|
--mysqld=--wait_timeout=28800 \
|
--mysqld=--lock-wait-timeout=86400 \
|
--mysqld=--innodb-lock-wait-timeout=50 \
|
--reporters=Backtrace \
|
--reporters=ErrorLog \
|
--reporters=Deadlock1 \
|
--validators=None \
|
--mysqld=--log-bin \
|
--mysqld=--log_bin_trust_function_creators=1 \
|
--mysqld=--loose-debug_assert_on_not_freed_memory=0 \
|
--duration=300 \
|
--mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
|
--grammar=conf/mariadb/table_stress_innodb.yy \
|
--gendata=conf/mariadb/table_stress.zz \
|
--gendata_sql=conf/mariadb/table_stress.sql \
|
--mysqld=--loose-innodb_buffer_pool_instances=3 \
|
--mysqld=--loose-innodb_doublewrite=1 \
|
--mysqld=--loose-innodb-encryption-threads=7 \
|
--mysqld=--innodb_encryption_rotate_key_age=0 \
|
--redefine=conf/mariadb/redefine_innodb_undo.yy \
|
--threads=13 \
|
--mysqld=--innodb-encrypt-log \
|
--mysqld=--innodb-encrypt-tables \
|
--duration=300 \
|
--no_mask \
|
--workdir=<local settings< \
|
--vardir=<local settings< \
|
--mtr-build-thread=<local settings< \
|
--basedir1=<local settings< \
|
--script_debug=_nix_ \
|
--rr=Server \
|
--rr_options=--chaos
|
|
Error tagging pattern for RQG
|
[ 'TBR-671-MDEV-23651' , 'InnoDB: Assertion failure in file .{1,150}fil0fil.cc.{1,200}InnoDB: Failing assertion: \!space->referenced\(\)' ],
|
Attachments
Issue Links
- is caused by
-
MDEV-15983 Reduce fil_system.mutex contention further
-
- Closed
-
- relates to
-
MDEV-15949 InnoDB: Failing assertion: space->n_pending_ops == 0 in fil_delete_tablespace upon DROP TABLE
-
- Closed
-
-
MDEV-23776 Test encryption.create_or_replace fails intermittently with a warning
-
- Closed
-
The reference count was incremented from 0 (and never reached 0 after that) in
10.5
Thread 8 hit Hardware watchpoint 1: -location $2->n_pending_ops._M_i
Old value = 0
New value = 1
fil_system_t::keyrotate_next (encrypt=false, recheck=false,
space=<optimized out>, this=<optimized out>)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1498
1498 return space;
(rr) when
Current event: 1280868
(rr) bt
#0 fil_system_t::keyrotate_next (encrypt=false, recheck=false,
space=<optimized out>, this=<optimized out>)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1498
#1 fil_space_next (space=0x0, recheck=<optimized out>,
encrypt=<optimized out>)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1514
#2 0x00005637b31a4e19 in fil_crypt_find_space_to_rotate (
key_state=key_state@entry=0x6d50794c9d40,
state=state@entry=0x6d50794c9d60, recheck=recheck@entry=0x6d50794c9d30)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1575
#3 0x00005637b31a5846 in fil_crypt_thread ()
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1163
…
1594 state->space = fil_space_next(state->space, *recheck,
(rr) s
fil_space_next (space=0x6160020dab98, recheck=false, encrypt=false)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1510
1510 {
(rr) n
1511 mutex_enter(&fil_system.mutex);
(rr)
[Switching to Thread 4061206.4071781]
Thread 9 hit Hardware watchpoint 1: -location $2->n_pending_ops._M_i
Old value = 1
New value = 2
fil_system_t::keyrotate_next (encrypt=false, recheck=false,
space=<optimized out>, this=<optimized out>)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1498
1498 return space;
(rr) bt
#0 fil_system_t::keyrotate_next (encrypt=false, recheck=false,
space=<optimized out>, this=<optimized out>)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1498
#1 fil_space_next (space=0x0, recheck=<optimized out>,
encrypt=<optimized out>)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1514
#2 0x00005637b31a4e19 in fil_crypt_find_space_to_rotate (
key_state=key_state@entry=0x640001027d40,
state=state@entry=0x640001027d60, recheck=recheck@entry=0x640001027d30)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1575 0x00003254208317db <+43>: xchg %eax,(%rdi)
=> 0x00003254208317dd <+45>: cmp $0x1,%eax
0x00003254208317e0 <+48>: jg 0x325420831820 <__GI___pthread_mutex_unlock+112>
#3 0x00005637b31a5846 in fil_crypt_thread ()
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1163
…
(rr) when
Current event: 1280872
At the same time when Thread 8 is about to decrement the reference count for this tablespace, Thread 9 will acquire the same tablespace for processing. A bit later, Thread 9 will release the reference count and Thread 10 will increment and decrement it, all while Thread 8 is still waiting for fil_system.mutex:
Thread 8 (Thread 4061206.4071774):
#0 __pthread_mutex_unlock_usercnt (decr=1, mutex=0x6d50794c97a8) at pthread_mutex_unlock.c:58
#1 __GI___pthread_mutex_unlock (mutex=0x6d50794c97a8) at pthread_mutex_unlock.c:357
#2 0x00005637b2837118 in OSMutex::exit (this=0x6d50794c97a0) at /home/mleich/Server/10.5A/storage/innobase/include/sync0types.h:434
#3 MutexDebug<TTASEventMutex<GenericPolicy> >::set (thread_id=18446744073709551615, line=0, filename=0x0, mutex=0x0, this=0x6d50794c9770) at /home/mleich/Server/10.5A/storage/innobase/include/sync0policy.h:59
#4 MutexDebug<TTASEventMutex<GenericPolicy> >::enter (line=1511, filename=0x5637b3fa6b00 "/home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc", mutex=..., this=0x5637b4c20ab0 <fil_system+48>) at /home/mleich/Server/10.5A/storage/innobase/include/sync0policy.h:126
#5 PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x5637b4c20aa0 <fil_system+32>, n_spins=<optimized out>, n_delay=<optimized out>, name=name@entry=0x5637b3fa6b00 "/home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc", line=line@entry=1511) at /home/mleich/Server/10.5A/storage/innobase/include/ib0mutex.h:588
#6 0x00005637b31a052a in fil_space_next (space=0x6160020dab98, recheck=<optimized out>, encrypt=<optimized out>) at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1511
This situation continues until the assertion fails, at event 1280908. It is unclear why the function is stuck here:
0x00003254208317db <+43>: xchg %eax,(%rdi)
=> 0x00003254208317dd <+45>: cmp $0x1,%eax
0x00003254208317e0 <+48>: jg 0x325420831820 <__GI___pthread_mutex_unlock+112>
It looks like the failure occurs simply be because rr decided not to let this thread run:
Thread 9 hit Hardware watchpoint 1: -location $2->n_pending_ops._M_i
Old value = 1
New value = 2
fil_system_t::keyrotate_next (encrypt=false, recheck=false,
space=<optimized out>, this=<optimized out>)
at /home/mleich/Server/10.5A/storage/innobase/fil/fil0crypt.cc:1498
1498 return space;
1: x/i $pc
=> 0x5637b31a0b31 <fil_space_next(fil_space_t*, bool, bool)+1864>:
lea -0x1e0(%rbx),%rax
(rr) disa 1
(rr) thr 8
[Switching to thread 8 (Thread 4061206.4071774)]
#0 __pthread_mutex_unlock_usercnt (decr=1, mutex=0x6d50794c97a8)
at pthread_mutex_unlock.c:58
58 pthread_mutex_unlock.c: No such file or directory.
(rr) display
1: x/i $pc
=> 0x3254208317dd <__GI___pthread_mutex_unlock+45>: cmp $0x1,%eax
(rr) si
[rr 4061206 1280885]2020-09-02 02:59:21 0x7f08f187d300[rr 4061206 1280888] InnoDB: Assertion failure in file /home/mleich/Server/10.5A/storage/innobase/fil/fil0fil.cc line 2243
I think that this bug can only be fixed by removing the reliance on the tablespace reference count in background operations, and always acquiring proper MDL (or an InnoDB table reference).
MDEV-19514andMDEV-16678solved this problem in 10.5 in many other scenarios. But, even withMDEV-19514fixed, as noted inMDEV-11634, ibuf_insert_low() could invoke ibuf_read_merge_pages() to merge other buffered changes before buffering one more change, and that merge is not necessarily covered by any MDL or InnoDB table reference.