mleich, thank you. I checked each rr replay trace as well as the core dump.
First, let me mention failure 2, which is definitely not an InnoDB problem. Can you please file a separate bug for it, tentatively for the partitioning storage engine? The options was freed in:
bb-10.6-MDEV-11026 cfb6106e511cae168aaba4fa2964d2c110ed834a
|
#5 0x0000564287694428 in free_root (root=0x619001434150, MyFlags=0) at /data/Server/bb-10.10-MDEV-11026B/mysys/my_alloc.c:501
|
#6 0x0000564285f9c0ec in closefrm (table=0x619001433e98) at /data/Server/bb-10.10-MDEV-11026B/sql/table.cc:4557
|
#7 0x00005642862b9912 in intern_close_table (table=0x619001433e98) at /data/Server/bb-10.10-MDEV-11026B/sql/table_cache.cc:220
|
#8 0x00005642862b9b79 in tc_remove_table (table=0x619001433e98) at /data/Server/bb-10.10-MDEV-11026B/sql/table_cache.cc:258
|
#9 0x00005642862baf4d in tc_release_table (table=0x619001433e98) at /data/Server/bb-10.10-MDEV-11026B/sql/table_cache.cc:449
|
#10 0x0000564285aa2470 in close_thread_table (thd=0x62b00038e218, table_ptr=0x62b00038e308) at /data/Server/bb-10.10-MDEV-11026B/sql/sql_base.cc:999
|
#11 0x0000564285c84dde in handle_alter_part_error (lpt=0x7f9edddb33e0, action_completed=false, drop_partition=false, frm_install=false, reopen=true)
|
at /data/Server/bb-10.10-MDEV-11026B/sql/sql_partition.cc:7113
|
#12 0x0000564285c89461 in fast_alter_partition_table (thd=0x62b00038e218, table=0x619001433e98, alter_info=0x7f9edddb63f0, alter_ctx=0x7f9edddb5080, create_info=0x7f9edddb6520, table_list=0x6290012e83b0)
|
at /data/Server/bb-10.10-MDEV-11026B/sql/sql_partition.cc:7819
|
#13 0x0000564285eeae27 in mysql_alter_table (thd=0x62b00038e218, new_db=0x62b000392cd8, new_name=0x62b0003930e8, create_info=0x7f9edddb6520, table_list=0x6290012e83b0, alter_info=0x7f9edddb63f0, order_num=0,
|
order=0x0, ignore=false, if_exists=false) at /data/Server/bb-10.10-MDEV-11026B/sql/sql_table.cc:10409
|
#14 0x00005642860806b4 in Sql_cmd_alter_table::execute (this=0x6290012e8b50, thd=0x62b00038e218) at /data/Server/bb-10.10-MDEV-11026B/sql/sql_alter.cc:547
|
#15 0x0000564285c40758 in mysql_execute_command (thd=0x62b00038e218, is_called_from_prepared_stmt=false) at /data/Server/bb-10.10-MDEV-11026B/sql/sql_parse.cc:5996
|
#16 0x0000564285c4cc44 in mysql_parse (thd=0x62b00038e218, rawbuf=0x6290012e8238 "ALTER TABLE l REBUILD PARTITION p0,p1,p2,p3 /* E_R Thread31 QNO 219 CON_ID 43 */", length=80, parser_state=0x7f9edddb7b10)
|
at /data/Server/bb-10.10-MDEV-11026B/sql/sql_parse.cc:8036
|
A subsequent operation is attempting to read the freed memory:
bb-10.6-MDEV-11026 cfb6106e511cae168aaba4fa2964d2c110ed834a
|
#6 0x00007f9f0a6d108b in __asan_report_load4 () from /lib/x86_64-linux-gnu/libasan.so.5
|
#7 0x0000564286d307a7 in create_table_info_t::check_table_options (this=0x7f9edf2ba2c0) at /data/Server/bb-10.10-MDEV-11026B/storage/innobase/handler/ha_innodb.cc:11217
|
#8 0x0000564286d34637 in create_table_info_t::prepare_create_table (this=0x7f9edf2ba2c0, name=0x619001cf8038 "test/l#P#p0", strict=false)
|
at /data/Server/bb-10.10-MDEV-11026B/storage/innobase/handler/ha_innodb.cc:11988
|
#9 0x0000564286d757f3 in ha_innobase::create (this=0x61d0010ea0b8, name=0x619001cf8038 "test/l#P#p0", form=0x6190014f7d98, create_info=0x7f9edf2ba9c0, file_per_table=true, trx=0x7f9efdc5c240)
|
at /data/Server/bb-10.10-MDEV-11026B/storage/innobase/handler/ha_innodb.cc:13162
|
#10 0x0000564286d412ba in ha_innobase::truncate (this=0x61d0010ea0b8) at /data/Server/bb-10.10-MDEV-11026B/storage/innobase/handler/ha_innodb.cc:13930
|
#11 0x00005642864889ae in handler::ha_truncate (this=0x61d0010ea0b8) at /data/Server/bb-10.10-MDEV-11026B/sql/handler.cc:5070
|
#12 0x0000564286c1ab9b in ha_partition::truncate_partition (this=0x61d0010e96b8, alter_info=0x62b000354c20, binlog_stmt=0x7f9edf2baed0) at /data/Server/bb-10.10-MDEV-11026B/sql/ha_partition.cc:4913
|
#13 0x000056428606e4a6 in Sql_cmd_alter_table_truncate_partition::execute (this=0x6290011f8ac8, thd=0x62b00034f218) at /data/Server/bb-10.10-MDEV-11026B/sql/sql_partition_admin.cc:959
|
#14 0x0000564285c40758 in mysql_execute_command (thd=0x62b00034f218, is_called_from_prepared_stmt=false) at /data/Server/bb-10.10-MDEV-11026B/sql/sql_parse.cc:5996
|
#15 0x0000564285c4cc44 in mysql_parse (thd=0x62b00034f218, rawbuf=0x6290011f8238 "ALTER TABLE l TRUNCATE PARTITION p0 /* E_R Thread24 QNO 232 CON_ID 38 */", length=72, parser_state=0x7f9edf2bbb10)
|
at /data/Server/bb-10.10-MDEV-11026B/sql/sql_parse.cc:8036
|
I expect this bug to affect older releases. While the TRUNCATE in InnoDB was refactored in 10.6, it has been similar ever since MDEV-13564 in MariaDB Server 10.2.19.
The remaining failures are related to the InnoDB FULLTEXT INDEX implementation, of which thiru is the subsystem owner (and should file separate bug(s) for these).
In Failure 1, the table was not dropped, but the cached metadata was freed in innobase_reload_table() during ALTER TABLE. There is a race condition between two InnoDB threads that handle FULLTEXT INDEX:
(rr) reverse-continue
|
Continuing.
|
[Switching to Thread 3047615.3048245]
|
Thread 6 hit Hardware watchpoint 2: -location slot->table
|
|
Old value = (dict_table_t *) 0x0
|
New value = (dict_table_t *) 0x6180000ab908
|
0x000055ff2377997d in fts_optimize_del_table (remove=0x6100000184e8) at /data/Server/bb-10.10-MDEV-11026B/storage/innobase/fts/fts0opt.cc:2679
|
2679 slot->table = NULL;
|
(rr) bt
|
#0 0x000055ff2377997d in fts_optimize_del_table (remove=0x6100000184e8) at /data/Server/bb-10.10-MDEV-11026B/storage/innobase/fts/fts0opt.cc:2679
|
#1 0x000055ff2377a904 in fts_optimize_callback () at /data/Server/bb-10.10-MDEV-11026B/storage/innobase/fts/fts0opt.cc:2863
|
…
|
(rr) thread apply 3 backtrace
|
|
Thread 3 (Thread 3047615.3048290):
|
#0 0x0000000070000002 in syscall_traced ()
|
…
|
#22 0x000000006f6e31ab in __asan_report_load8 () from /lib/x86_64-linux-gnu/libasan.so.5
|
#23 0x000055ff23779e0b in fts_is_sync_needed () at /data/Server/bb-10.10-MDEV-11026B/storage/innobase/fts/fts0opt.cc:2741
|
#24 0x000055ff2377a5d1 in fts_optimize_callback () at /data/Server/bb-10.10-MDEV-11026B/storage/innobase/fts/fts0opt.cc:2816
|
That is, during the time ASAN already noticed that slot->table is pointing to freed memory, another thread is assigning slot->table to nullptr.
Failures 3 and 4 seem to be due to the same problem: the slot passed to fts_optimize_table_bk() points to freed memory. With the rr trace, it should be easy to debug this in more detail:
rr replay /data/results/1655481418/TBR-1554/1/rr/latest-trace
|
continue
|
watch -l *(char*)0x0c307fff8628
|
reverse-continue
|
reverse-continue
|
backtrace
|
The memory was freed during ha_innobase::delete_table(), which was executed as part of ddl_recovery.log processing. I think that this bug should affect 10.6 already. One possible fix to this failure might be to not schedule any FTS tasks before DDL recovery has been completed.
I have not seen these FTS related failures again.
origin/bb-10.10-MDEV-11026 c8e3bcf79b3931dad9ce16fecc714d39d2247f89 2022-06-27T12:00:31+02:00
performed well in RQG testing.