[MDEV-26636] Race conditions due to attempted updates of InnoDB defragmentation statistics for temporary tables Created: 2021-09-18  Updated: 2022-09-15  Resolved: 2021-09-18

Status: Closed
Project: MariaDB Server
Component/s: Data Definition - Alter Table, Data Manipulation - Insert, Storage Engine - InnoDB
Affects Version/s: 10.2, 10.3, 10.4, 10.5, 10.6, 10.7
Fix Version/s: 10.2.41, 10.3.32, 10.4.22, 10.5.13, 10.6.5

Type: Bug Priority: Major
Reporter: Roel Van de Paar Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: affects-tests, race, regression

Issue Links:
Relates
relates to MDEV-29277 On error, fts_sync_table() fails to r... Closed
relates to MDEV-5834 Merge Kakao Defragmentation implement... Closed
relates to MDEV-17794 Do not assign persistent ID for tempo... Closed
relates to MDEV-24258 Merge dict_sys.mutex into dict_sys.latch Closed

 Description   

Sporadic. Loop till crash is seen. Two attempts usually is enough. For some reason it seems to affect 10.3 and 10.5-10.7, but not 10.4. Stacks looks a bit odd. Not sure on real priority, if any. The TEMPORARY is required.

SET GLOBAL innodb_defragment_stats_accuracy=1;
SET GLOBAL innodb_limit_optimistic_insert_debug=2;
CREATE TEMPORARY TABLE t (c INT) ENGINE=InnoDB;
INSERT INTO t VALUES (0);
INSERT INTO t VALUES (0);
INSERT INTO t VALUES (0);
ALTER TABLE t ENGINE=InnoDB;
ALTER TABLE t ENGINE=InnoDB;
ALTER TABLE t ENGINE=InnoDB;
ALTER TABLE t ENGINE=InnoDB;
DROP TABLE t;

Leads to:

10.7.0 d552e092c9f3e20da078d1b62b976f629f73d3a4 (Debug)

Core was generated by `/test/MD180921-mariadb-10.7.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  dict_table_get_first_index (table=0x0)
    at /test/10.7_dbg/storage/innobase/include/dict0dict.ic:211
[Current thread is 1 (Thread 0x1455a0364700 (LWP 3119087))]
(gdb) bt
#0  dict_table_get_first_index (table=0x0) at /test/10.7_dbg/storage/innobase/include/dict0dict.ic:211
#1  dict_table_find_index_on_id (table=table@entry=0x0, id=id@entry=23) at /test/10.7_dbg/storage/innobase/dict/dict0dict.cc:1342
#2  0x00005588fc9881da in dict_stats_process_entry_from_defrag_pool () at /test/10.7_dbg/storage/innobase/include/dict0mem.h:1802
#3  dict_defrag_process_entries_from_defrag_pool () at /test/10.7_dbg/storage/innobase/dict/dict0defrag_bg.cc:212
#4  0x00005588fc985ce7 in dict_stats_func () at /test/10.7_dbg/storage/innobase/dict/dict0stats_bg.cc:394
#5  0x00005588fca28b38 in tpool::thread_pool_generic::timer_generic::run (this=0x5588ff5b4e90) at /test/10.7_dbg/tpool/tpool_generic.cc:313
#6  tpool::thread_pool_generic::timer_generic::execute (arg=0x5588ff5b4e90) at /test/10.7_dbg/tpool/tpool_generic.cc:333
#7  0x00005588fca29aa3 in tpool::task::execute (this=0x5588ff5b4ed0) at /test/10.7_dbg/tpool/task.cc:52
#8  0x00005588fca28653 in tpool::thread_pool_generic::worker_main (this=0x5588ff217e70, thread_var=0x5588ff227890) at /test/10.7_dbg/tpool/tpool_generic.cc:550
#9  0x00005588fca2898a in std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__t=<optimized out>, __f=<optimized out>) at /usr/include/c++/9/bits/invoke.h:89
#10 std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__fn=<optimized out>) at /usr/include/c++/9/bits/invoke.h:95
#11 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul> (this=<optimized out>) at /usr/include/c++/9/thread:244
#12 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator() (this=<optimized out>) at /usr/include/c++/9/thread:251
#13 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run (this=<optimized out>) at /usr/include/c++/9/thread:195
#14 0x00001455b8776de4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#15 0x00001455b8891609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#16 0x00001455b847f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

10.3.32 b112c9dfaacbcb7c3548414c6f402114663223dc (Debug)

InnoDB: Failing assertion: table->get_ref_count() == 0

10.3.32 b112c9dfaacbcb7c3548414c6f402114663223dc (Debug)

Core was generated by `/test/MD180921-mariadb-10.3.32-linux-x86_64-dbg/bin/mysqld --no-defaults --core'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
[Current thread is 1 (Thread 0x14c55c056700 (LWP 2206952))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000014c573312859 in __GI_abort () at abort.c:79
#2  0x000055bd02efbc79 in ut_dbg_assertion_failed (expr=expr@entry=0x55bd03525f52 "table->get_ref_count() == 0", file=file@entry=0x55bd0355e338 "/test/10.3_dbg/storage/innobase/dict/dict0dict.cc", line=line@entry=1844) at /test/10.3_dbg/storage/innobase/ut/ut0dbg.cc:60
#3  0x000055bd02fea50e in dict_table_remove_from_cache_low (table=0x14c51400c290, lru_evict=lru_evict@entry=0) at /test/10.3_dbg/storage/innobase/dict/dict0dict.cc:1844
#4  0x000055bd02feb349 in dict_table_remove_from_cache (table=<optimized out>) at /test/10.3_dbg/storage/innobase/dict/dict0dict.cc:1948
#5  0x000055bd02e10520 in row_drop_table_for_mysql (name=name@entry=0x14c55c051530 "data/#sql21a851_9_ac", trx=trx@entry=0x14c55e4a6178, sqlcom=sqlcom@entry=SQLCOM_ALTER_TABLE, create_failed=create_failed@entry=false, nonatomic=<optimized out>, nonatomic@entry=true) at /test/10.3_dbg/storage/innobase/row/row0mysql.cc:3413
#6  0x000055bd02ca017a in ha_innobase::delete_table (this=this@entry=0x14c514012698, name=<optimized out>, name@entry=0x14c51404c2a0 "/test/MD180921-mariadb-10.3.32-linux-x86_64-dbg/data/#sql21a851_9_ac", sqlcom=sqlcom@entry=SQLCOM_ALTER_TABLE) at /test/10.3_dbg/storage/innobase/handler/ha_innodb.cc:13215
#7  0x000055bd02c95d44 in ha_innobase::delete_table (this=0x14c514012698, name=0x14c51404c2a0 "/test/MD180921-mariadb-10.3.32-linux-x86_64-dbg/data/#sql21a851_9_ac") at /test/10.3_dbg/storage/innobase/handler/ha_innodb.cc:13340
#8  0x000055bd02a5aab7 in handler::ha_delete_table (this=this@entry=0x14c514012698, name=name@entry=0x14c51404c2a0 "/test/MD180921-mariadb-10.3.32-linux-x86_64-dbg/data/#sql21a851_9_ac") at /test/10.3_dbg/sql/handler.cc:4710
#9  0x000055bd029ae958 in THD::rm_temporary_table (this=this@entry=0x14c514000d90, base=0x55bd0581f4f0, path=0x14c51404c2a0 "/test/MD180921-mariadb-10.3.32-linux-x86_64-dbg/data/#sql21a851_9_ac") at /test/10.3_dbg/sql/temporary_tables.cc:706
#10 0x000055bd029af26f in THD::free_tmp_table_share (this=this@entry=0x14c514000d90, share=share@entry=0x14c51404bd50, delete_table=delete_table@entry=true) at /test/10.3_dbg/sql/handler.h:1630
#11 0x000055bd029b0dd1 in THD::drop_temporary_table (this=this@entry=0x14c514000d90, table=table@entry=0x14c51404c840, is_trans=is_trans@entry=0x0, delete_table=delete_table@entry=true) at /test/10.3_dbg/sql/temporary_tables.cc:669
#12 0x000055bd028867e1 in mysql_alter_table (thd=thd@entry=0x14c514000d90, new_db=new_db@entry=0x14c514005470, new_name=new_name@entry=0x14c514005858, create_info=create_info@entry=0x14c55c053490, table_list=<optimized out>, table_list@entry=0x14c514011658, alter_info=alter_info@entry=0x14c55c0533d0, order_num=0, order=0x0, ignore=false) at /test/10.3_dbg/sql/sql_table.cc:10200
#13 0x000055bd028f9754 in Sql_cmd_alter_table::execute (this=0x14c514011cc8, thd=0x14c514000d90) at /test/10.3_dbg/sql/sql_alter.cc:512
#14 0x000055bd027c65b1 in mysql_execute_command (thd=thd@entry=0x14c514000d90) at /test/10.3_dbg/sql/sql_parse.cc:6075
#15 0x000055bd027c85b2 in mysql_parse (thd=thd@entry=0x14c514000d90, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x14c55c055530, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.3_dbg/sql/sql_parse.cc:7870
#16 0x000055bd027cad23 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x14c514000d90, packet=packet@entry=0x14c5140198f1 "", packet_length=packet_length@entry=27, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.3_dbg/sql/sql_class.h:1152
#17 0x000055bd027cdeb3 in do_command (thd=0x14c514000d90) at /test/10.3_dbg/sql/sql_parse.cc:1398
#18 0x000055bd028f5709 in do_handle_one_connection (connect=connect@entry=0x55bd0635e7e0) at /test/10.3_dbg/sql/sql_connect.cc:1403
#19 0x000055bd028f5940 in handle_one_connection (arg=0x55bd0635e7e0) at /test/10.3_dbg/sql/sql_connect.cc:1308
#20 0x000014c5734e8609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#21 0x000014c57340f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.3.32 (dbg), 10.5.13 (dbg), 10.6.5 (dbg), 10.7.0 (dbg)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.2.41 (dbg), 10.2.41 (opt), 10.3.32 (opt), 10.4.22 (dbg), 10.4.22 (opt), 10.5.13 (opt), 10.6.5 (opt), 10.7.0 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.51 (dbg), 5.6.51 (opt), 5.7.35 (dbg), 5.7.35 (opt), 8.0.26 (dbg), 8.0.26 (opt)



 Comments   
Comment by Roel Van de Paar [ 2021-09-18 ]

There is a secondary related stack, which is significantly more sporadic and significantly harder to reproduce. This partially reduced testcase can produce a similar or the same stack (i.e. reduced UniqueID was the same), but it is unknown if the C connector is required, which options are specifically required (i.e. most of them would not be required) (testcase updated, options reduced, and bug can be replayed in CLI) and there are likely superfluous queries and idioms. Still, it is likely enough to work from/with in total.

# mysqld options required:  --sql_mode=
CREATE TEMPORARY TABLE t1 (i INT) ENGINE=RocksDB;
CREATE UNIQUE INDEX i ON t1 (i);
ALTER TABLE t1 ADD COLUMN e ENUM ('a', 'b') FIRST;
DROP TABLE t1;
CREATE TEMPORARY TABLE temp_ignore (c1 INT) ENGINE=RocksDB;
CREATE TEMPORARY TABLE t1 (KEYc INT, c1 CHAR(100), c2 CHAR(100), PRIMARY KEY(KEYc)) ENGINE=RocksDB;
SET GLOBAL innodb_defragment_stats_accuracy=1;
CREATE TEMPORARY TABLE tmp (c1 CHAR(100));
DROP TABLE t1;
CREATE TEMPORARY TABLE t1 (c1 INT NOT NULL, c2 INT NOT NULL, c3 CHAR(255) NOT NULL, c4 TEXT (6000) NOT NULL, c5 BLOB (6000) NOT NULL, c6 VARCHAR(2000) NOT NULL, c7 VARCHAR(2000) NOT NULL, c8 DATETIME, c9 DECIMAL(6,3), PRIMARY KEY(c1), INDEX (c3,c4 (50),c5 (50)), INDEX (c2)) ENGINE=RocksDB;
INSERT INTO t1 VALUES (3792709,107,'bx9gGjR4CiwHK9V9lFmJ','wAMO93uhIrkQzodNeR8T7QxZMVzKcsQeECbyIkXAUZMDTQI0ZAgyjYD7VITQDK4W2LnSh3ZepFQiwNUh13piM6L0AcuaSfRpLEomxcjx9v1Mqfn5hL6RLT5OaNZDNs6I9wK24bvcmP21udCh5WA7I4RzPHFBOWe1OWjyTIvVOTr5HZx1gToLfcDKVQuFtUcTjCZvwMeHGJtbwd3ExPnXeRtA9nyBMLzsFnfDnp67Djrr4aH','FbPyKA2resmBzjRGP','H4boLxbibNiRR3Cc6aozLNXd8PwXAb9WBwS8D3hT8wqY7lPRDEVVfW5L4QD3JsgEq2CO4LGpXq3pXYttIn5XBKXdwYEWzsJcsokqyrFHvMuOQdGhp4meIrQ6','v','Hv',6);
INSERT INTO t1 VALUES (1941919362,-89,'o25lFD6yZcK','SxXR20TCKRZw8CytWCR90UBGgxCTFlSX3AO5','ZEOnEBYqDhkKwtYV0LvHIn0xMlJ7e4peXzLT1EFX3MB','OA','9','Z',3);
INSERT INTO t1 VALUES (1444293002,648016786,'HbY4xbRx9wLRvDGE4gr5pBKusm','Ku0bNy4Gaw4fjzfvEYKnZbyxsfsUX0ytFgdEVb2d0RJoM1qnI6aLhfmJEVf8qw4olJcT30IF8Yn053hwLaqpaGhPajxVwnDfb6SFtGEvnnkcrx6f4uNn8keoohRauRZC','1','1AxmkbgvnEbxlNTlXMh','j','H',3);
INSERT INTO t1 VALUES (-1448915294450022741,-187701819,'RSDp3D9UCeg65YN','KMV8x9SdMnQRlCEIo','aNLmbAmmnLelbKA0goXWais','GTcUltWVAFP','t','m',4);
INSERT INTO t1 VALUES (10437286018070182185,30439,'j0HIPFTkthYOm0KU3Qvxhpo4js6ZZPbTC6887KD','isQ1rH5omSjfzTbpGIjsSGgF3b1Rh1toIwfd5yOfH6xyfMoXknBanXvt3c1DEBciU9AoPRcgLcgZbuKqlQvHaM4EKkucH0XlrCJxiXyCiTZEJQY4TvdhgolYyu','qGJlDODgup0S','9W5PnxVs6GVejucJgrOEuGWhgWYsSzt','G','Lv',3);
INSERT INTO t1 VALUES (7118530,4126170,'yh','WODRNFoqXGb30szrmv2sdYIFQAOpZidHdvgT4nKmKSPim8d8XQyVdgIkU1rkjpmxWoLwdzxOmRGGIFPYHjjXUHkSUY4jNZeiVtsHUteuyvXkBQkurlB7C3tiXKdLT97ftxE5J2pypX26Y50z1DSLsqsHxIwgXwNfQdTKFaWIxMDmxG8hfGAkXwFXjWZG54CXnU5r','SivJexqd0Ao','ETMOtmlawEMLYI9VR4GpbUctDlLFVBHBvYQVBUwR041SFqvhqLawxoN9ERFzYDNA618KcWzjnz1Rta9fjEEE0dhTrs9XKSmxYrj0xgbuBTOJq231HqmcxOBxrQpLbploSHXcRfUIcyelak75gmQuaRA','ME','CJ',14);
INSERT INTO t1 VALUES (-108,37,'fV','Jz8VHylg','RRUdtmZQ1sElrfO7phsakzy2rBCDmCu3HjkbbOolR','PDAYF1zW7KqFFYaZZLsEKKC','j','A',5);
UPDATE t1 SET c4=REPEAT ("c4", 1024);

The stack:

10.6.5 1e9c922fa726b22f4522f2a4de0fcb6595404086 (Optimized)

Core was generated by `/test/MD180921-mariadb-10.6.5-linux-x86_64-opt/bin/mysqld --no-defaults --max_a'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  dict_table_find_index_on_id (table=table@entry=0x0, id=id@entry=225)
    at /test/10.6_opt/storage/innobase/dict/dict0dict.cc:1342
[Current thread is 1 (Thread 0x14c07bbfd700 (LWP 356879))]
(gdb) bt
#0  dict_table_find_index_on_id (table=table@entry=0x0, id=id@entry=225) at /test/10.6_opt/storage/innobase/dict/dict0dict.cc:1342
#1  0x0000557e1e3ad507 in dict_stats_process_entry_from_defrag_pool () at /test/10.6_opt/storage/innobase/dict/dict0defrag_bg.cc:193
#2  dict_defrag_process_entries_from_defrag_pool () at /test/10.6_opt/storage/innobase/dict/dict0defrag_bg.cc:212
#3  0x0000557e1e3abcd9 in dict_stats_func () at /test/10.6_opt/storage/innobase/dict/dict0stats_bg.cc:394
#4  0x0000557e1e412289 in tpool::thread_pool_generic::timer_generic::run (this=0x557e21150a80) at /test/10.6_opt/tpool/tpool_generic.cc:313
#5  tpool::thread_pool_generic::timer_generic::execute (arg=0x557e21150a80) at /test/10.6_opt/tpool/tpool_generic.cc:332
#6  0x0000557e1e4132ba in tpool::task::execute (this=0x557e21150ac0) at /test/10.6_opt/tpool/task.cc:37
#7  tpool::task::execute (this=0x557e21150ac0) at /test/10.6_opt/tpool/task.cc:27
#8  0x0000557e1e411d77 in tpool::thread_pool_generic::worker_main (this=0x557e20e106a0, thread_var=0x557e20e20140) at /test/10.6_opt/tpool/tpool_generic.cc:549
#9  0x000014c0b5c12de4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x000014c0b5d2d609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#11 0x000014c0b591b293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Roel Van de Paar [ 2021-09-18 ]

Of note is that innodb_limit_optimistic_insert_debug is not used in the second testcase

Comment by Roel Van de Paar [ 2021-09-18 ]

Second testcase replays readily under a previous ASAN/UBSAN build. With markers inserted before/after testcase, we see:

10.7.0 1bc82aaf0a7746c0921a94034aff2d51f0d75cd0 (Debug, ASAN/UBSAN)

Version: '10.7.0-MariaDB-debug'  socket: '/test/UBASAN_MD060921-mariadb-10.7.0-linux-x86_64-dbg/socket.sock'  port: 28739  MariaDB Server
----
/test/10.7_dbg_san/storage/innobase/dict/dict0defrag_bg.cc:194:44: runtime error: null pointer passed as argument 1, which is declared to never be null
    #0 0x564fd9540592 in dict_stats_process_entry_from_defrag_pool /test/10.7_dbg_san/storage/innobase/dict/dict0defrag_bg.cc:194
    #1 0x564fd9540592 in dict_defrag_process_entries_from_defrag_pool() /test/10.7_dbg_san/storage/innobase/dict/dict0defrag_bg.cc:212
    #2 0x564fd9534aae in dict_stats_func /test/10.7_dbg_san/storage/innobase/dict/dict0stats_bg.cc:394
    #3 0x564fd9802d6b in tpool::thread_pool_generic::timer_generic::run() /test/10.7_dbg_san/tpool/tpool_generic.cc:313
    #4 0x564fd9802d6b in tpool::thread_pool_generic::timer_generic::execute(void*) /test/10.7_dbg_san/tpool/tpool_generic.cc:333
    #5 0x564fd980c9fc in tpool::task::execute() /test/10.7_dbg_san/tpool/task.cc:52
    #6 0x564fd97fffa2 in tpool::thread_pool_generic::worker_main(tpool::worker_data*) /test/10.7_dbg_san/tpool/tpool_generic.cc:550
    #7 0x564fd9802121 in void std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(std::__invoke_memfun_deref, void (tpool::thread_pool_generic::*&&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) /usr/include/c++/9/bits/invoke.h:73
    #8 0x564fd9802121 in std::__invoke_result<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>::type std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(void (tpool::thread_pool_generic::*&&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) /usr/include/c++/9/bits/invoke.h:95
    #9 0x564fd9802121 in void std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) /usr/include/c++/9/thread:244
    #10 0x564fd9802121 in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator()() /usr/include/c++/9/thread:251
    #11 0x564fd9802121 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run() /usr/include/c++/9/thread:195
    #12 0x148d3b658de3  (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd6de3)
    #13 0x148d3b775608 in start_thread /build/glibc-eX1tMB/glibc-2.31/nptl/pthread_create.c:477
    #14 0x148d3a9eb292 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)
 
/test/10.7_dbg_san/storage/innobase/include/dict0dict.ic:211:2: runtime error: member access within null pointer of type 'const struct dict_table_t'
    #0 0x564fd944896e in dict_table_get_first_index /test/10.7_dbg_san/storage/innobase/include/dict0dict.ic:211
    #1 0x564fd944896e in dict_table_find_index_on_id(dict_table_t const*, unsigned long) /test/10.7_dbg_san/storage/innobase/dict/dict0dict.cc:1340
    #2 0x564fd95402c5 in dict_stats_process_entry_from_defrag_pool /test/10.7_dbg_san/storage/innobase/dict/dict0defrag_bg.cc:194
    #3 0x564fd95402c5 in dict_defrag_process_entries_from_defrag_pool() /test/10.7_dbg_san/storage/innobase/dict/dict0defrag_bg.cc:212
    #4 0x564fd9534aae in dict_stats_func /test/10.7_dbg_san/storage/innobase/dict/dict0stats_bg.cc:394
    #5 0x564fd9802d6b in tpool::thread_pool_generic::timer_generic::run() /test/10.7_dbg_san/tpool/tpool_generic.cc:313
    #6 0x564fd9802d6b in tpool::thread_pool_generic::timer_generic::execute(void*) /test/10.7_dbg_san/tpool/tpool_generic.cc:333
    #7 0x564fd980c9fc in tpool::task::execute() /test/10.7_dbg_san/tpool/task.cc:52
    #8 0x564fd97fffa2 in tpool::thread_pool_generic::worker_main(tpool::worker_data*) /test/10.7_dbg_san/tpool/tpool_generic.cc:550
    #9 0x564fd9802121 in void std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(std::__invoke_memfun_deref, void (tpool::thread_pool_generic::*&&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) /usr/include/c++/9/bits/invoke.h:73
    #10 0x564fd9802121 in std::__invoke_result<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>::type std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(void (tpool::thread_pool_generic::*&&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) /usr/include/c++/9/bits/invoke.h:95
    #11 0x564fd9802121 in void std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) /usr/include/c++/9/thread:244
    #12 0x564fd9802121 in std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator()() /usr/include/c++/9/thread:251
    #13 0x564fd9802121 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run() /usr/include/c++/9/thread:195
    #14 0x148d3b658de3  (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd6de3)
    #15 0x148d3b775608 in start_thread /build/glibc-eX1tMB/glibc-2.31/nptl/pthread_create.c:477
    #16 0x148d3a9eb292 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)
 
210918 16:03:23 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.7.0-MariaDB-debug
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 468081 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x62b0000af288
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x100000
/test/UBASAN_MD060921-mariadb-10.7.0-linux-x86_64-dbg/bin/mysqld(+0x7e79790)[0x564fd496d790]
asan_interceptors.o:0(__interceptor_backtrace.part.0)[0x564fd9a33750]
mysys/stacktrace.c:212(my_print_stacktrace)[0x564fd6bc2bcd]
sigaction.c:0(__restore_rt)[0x148d3b7813c0]
include/dict0dict.ic:211(dict_table_find_index_on_id(dict_table_t const*, unsigned long))[0x564fd944891a]
dict/dict0defrag_bg.cc:194(dict_defrag_process_entries_from_defrag_pool())[0x564fd95402c6]
dict/dict0stats_bg.cc:395(dict_stats_func(void*))[0x564fd9534aaf]
tpool/tpool_generic.cc:314(tpool::thread_pool_generic::timer_generic::run())[0x564fd9802d6c]
tpool/task.cc:53(tpool::task::execute())[0x564fd980c9fd]
tpool/tpool_generic.cc:548(tpool::thread_pool_generic::worker_main(tpool::worker_data*))[0x564fd97fffa3]
9/thread:195(std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run())[0x564fd9802122]
??:0(std::error_code::default_error_condition() const)[0x148d3b658de4]
nptl/pthread_create.c:478(start_thread)[0x148d3b775609]
x86_64/clone.S:97(__GI___clone)[0x148d3a9eb293]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): (null)
Connection ID (thread ID): 0
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
 
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
 
We think the query pointer is invalid, but we will try to print it anyway.
Query:
 
Writing a core file...
Working directory at /test/UBASAN_MD060921-mariadb-10.7.0-linux-x86_64-dbg/data
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            unlimited            unlimited            bytes
Max core file size        0                    0                    bytes
Max resident set          unlimited            unlimited            bytes
Max processes             unlimited            unlimited            processes
Max open files            1048576              1048576              files
Max locked memory         unlimited            unlimited            bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       unlimited            unlimited            signals
Max msgqueue size         unlimited            unlimited            bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: |/usr/share/apport/apport %p %s %c %d %P %E
 
----

Comment by Marko Mäkelä [ 2021-09-18 ]

There are two problems here. The first one is that the defragmentation (MDEV-5834) should not have been enabled for temporary tables in the first place.

MariaDB 10.2.2 introduced proper temporary tables in InnoDB, by importing changes from MySQL 5.7.9.

In MariaDB 10.4, MDEV-17794 introduced a separate dict_sys.temp_id_hash for temporary tables. At that point, defragmentation must have ceased to work for temporary tables. (In my opinion, it never made sense or was safe with 10.2 or 10.3 either, because all code that deals with temporary tables assumes that the current thread has exclusive access to the temporary table.)

Even though you were not able to repeat an ill effect with 10.2 or 10.4, I think that we must disable the defragmentation for temporary tables everywhere.

Finally, in 10.6, I see that MDEV-24258 in 10.6 accidentally replaced an incorrect condition in the rewrite of dict_stats_process_entry_from_defrag_pool().

A simple analysis with ./mtr --rr revealed the following call, which I think is the root cause of all this. We must never add temporary tables to the defragmentation queue.

10.6 ae8c8d8874302a86de67f47ed70486e73204504f

Thread 11 hit Hardware watchpoint 1: -location defrag_pool._M_impl._M_finish
 
Old value = (defrag_pool_item_t *) 0x0
New value = (defrag_pool_item_t *) 0x7fe144112cd0
0x00005629fce86f1a in std::vector<defrag_pool_item_t, ut_allocator<defrag_pool_item_t, true> >::_M_realloc_insert<defrag_pool_item_t const&> (
    this=this@entry=0x5629fe1a8000 <defrag_pool>, __position=<optimized out>)
    at /usr/include/c++/11/bits/vector.tcc:504
504	      this->_M_impl._M_end_of_storage = __new_start + __len;
(rr) bt
#0  0x00005629fce86f1a in std::vector<defrag_pool_item_t, ut_allocator<defrag_pool_item_t, true> >::_M_realloc_insert<defrag_pool_item_t const&> (
    this=this@entry=0x5629fe1a8000 <defrag_pool>, __position=<optimized out>)
    at /usr/include/c++/11/bits/vector.tcc:504
#1  0x00005629fce85883 in std::vector<defrag_pool_item_t, ut_allocator<defrag_pool_item_t, true> >::push_back (__x=
      @0x7fe15d25bc80: {table_id = 10, index_id = 23}, 
    this=0x5629fe1a8000 <defrag_pool>)
    at /usr/include/c++/11/bits/stl_iterator.h:1010
#2  dict_stats_defrag_pool_add (index=0x7fe14417b0e0)
    at /mariadb/10.6/storage/innobase/dict/dict0defrag_bg.cc:132
#3  0x00005629fce04352 in btr_defragment_save_defrag_stats_if_needed (
    index=0x7fe14417b0e0)
    at /mariadb/10.6/storage/innobase/btr/btr0defragment.cc:230
#4  0x00005629fcdac5d6 in btr_page_split_and_insert (flags=flags@entry=2, 
    cursor=cursor@entry=0x7fe15d25bf80, offsets=offsets@entry=0x7fe15d25bf70, 
    heap=heap@entry=0x7fe15d25bf68, tuple=tuple@entry=0x7fe144119130, 
    n_ext=n_ext@entry=0, mtr=0x7fe15d25c2d0)
    at /mariadb/10.6/storage/innobase/btr/btr0btr.cc:2771

Comment by Marko Mäkelä [ 2021-09-18 ]

I fixed the 10.6 regression and also developed a follow-up fix for MDEV-25919, removing a race condition between DDL and saving the defragmentation statistics which is not easily fixable in earlier release series than 10.6.

Saving any defragmentation statistics for temporary tables was never meaningful to begin with, because ha_innobase::optimize() does not defragmentation for temporary tables. Besides, the table names would be ‘garbage’ consisting of the string #sql followed by an automatically generated part.

Comment by Roel Van de Paar [ 2021-09-20 ]

Summary of fix; defragmentation statistics of TEMPORARY tables disabled in 10.2+ (and defragmentation of TEMPORARY tables was already disabled)

Comment by Roel Van de Paar [ 2021-10-13 ]

I saw this in a 10.7 debug build from 9 October:

10.7.1 5cc9cf9a05582307abefd8f3b57548945da86b1a (Debug)

InnoDB: Failing assertion: table->get_ref_count() == 0

10.7.1 5cc9cf9a05582307abefd8f3b57548945da86b1a (Debug)

Core was generated by `/test/MD091021-mariadb-10.7.1-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
[Current thread is 1 (Thread 0x1525100e1700 (LWP 1625177))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00001525124bd859 in __GI_abort () at abort.c:79
#2  0x0000563bcd532312 in ut_dbg_assertion_failed (expr=expr@entry=0x563bcdbacbc0 "table->get_ref_count() == 0", file=file@entry=0x563bcdbad1c0 "/test/10.7_dbg/storage/innobase/dict/dict0dict.cc", line=line@entry=1823) at /test/10.7_dbg/storage/innobase/ut/ut0dbg.cc:60
#3  0x0000563bcd6040a0 in dict_sys_t::remove (this=0x563bce1ba4c0 <dict_sys>, table=table@entry=0x1524c697ba68, lru=lru@entry=false, keep=keep@entry=false) at /test/10.7_dbg/storage/innobase/dict/dict0dict.cc:1823
#4  0x0000563bcd636928 in trx_t::commit (this=this@entry=0x1524fd755390, deleted=std::vector of length 5, capacity 8 = {{m_file = 125, m_psi = 0x0}, {m_file = 140, m_psi = 0x0}, {m_file = 144, m_psi = 0x0}, {m_file = 146, m_psi = 0x0}, {m_file = 141, m_psi = 0x0}}) at /test/10.7_dbg/storage/innobase/dict/drop.cc:249
#5  0x0000563bcd2faef4 in ha_innobase::delete_table (this=<optimized out>, name=<optimized out>) at /test/10.7_dbg/storage/innobase/handler/ha_innodb.cc:13600
#6  0x0000563bccf4a731 in hton_drop_table (hton=<optimized out>, path=<optimized out>) at /test/10.7_dbg/sql/handler.cc:576
#7  0x0000563bccf53ff8 in ha_delete_table (thd=thd@entry=0x1524c4000db8, hton=hton@entry=0x563bd091f928, path=path@entry=0x1525100dd5d0 "./test/#sql-backup-18c39f-4", db=db@entry=0x1525100df180, alias=alias@entry=0x1525100dd980, generate_warning=generate_warning@entry=false) at /test/10.7_dbg/sql/handler.cc:3100
#8  0x0000563bccd24232 in quick_rm_table (thd=thd@entry=0x1524c4000db8, base=base@entry=0x563bd091f928, db=db@entry=0x1525100df180, table_name=table_name@entry=0x1525100dd980, flags=3, table_path=table_path@entry=0x0) at /test/10.7_dbg/sql/sql_table.cc:1919
#9  0x0000563bccd3c216 in mysql_alter_table (thd=thd@entry=0x1524c4000db8, new_db=new_db@entry=0x1524c4013d70, new_name=new_name@entry=0x1524c4013d80, create_info=create_info@entry=0x1525100dffc0, table_list=<optimized out>, table_list@entry=0x1524c4013d58, alter_info=alter_info@entry=0x1525100dfed0, order_num=<optimized out>, order=<optimized out>, ignore=<optimized out>, if_exists=<optimized out>) at /test/10.7_dbg/sql/sql_table.cc:10616
#10 0x0000563bccc4d381 in mysql_execute_command (thd=thd@entry=0x1524c4000db8, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/10.7_dbg/sql/structs.h:568
#11 0x0000563bccc38d6f in mysql_parse (thd=thd@entry=0x1524c4000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x1525100e0400) at /test/10.7_dbg/sql/sql_parse.cc:8028
#12 0x0000563bccc47968 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x1524c4000db8, packet=packet@entry=0x1524c55f1ed9 "", packet_length=packet_length@entry=31, blocking=blocking@entry=true) at /test/10.7_dbg/sql/sql_class.h:1358
#13 0x0000563bccc4ad6e in do_command (thd=0x1524c4000db8, blocking=blocking@entry=true) at /test/10.7_dbg/sql/sql_parse.cc:1402
#14 0x0000563bccdc0e88 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x563bd0f060b8, put_in_cache=put_in_cache@entry=true) at /test/10.7_dbg/sql/sql_connect.cc:1418
#15 0x0000563bccdc148d in handle_one_connection (arg=arg@entry=0x563bd0f060b8) at /test/10.7_dbg/sql/sql_connect.cc:1312
#16 0x0000563bcd22a83e in pfs_spawn_thread (arg=0x563bd0e45d98) at /test/10.7_dbg/storage/perfschema/pfs.cc:2201
#17 0x00001525129cc609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#18 0x00001525125ba293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

When testing (looping, so likely highly sporadic) an original longer input testcase of MDEV-26381. TEMPORARY tables are present in the testcase.

Comment by Roel Van de Paar [ 2021-10-13 ]

marko A core dump for the issue above is available. Let me know if needed. Also let me know if this bug should be reopened or a new one created.

Comment by Marko Mäkelä [ 2022-08-05 ]

Roel, yes, new tickets should be filed if a bug does not appear to be fixed. This assertion failure has many potential root causes.

In 10.6 and later, there is a known bug MDEV-29250 that affects tables that contain FULLTEXT INDEX.

Comment by Roel Van de Paar [ 2022-09-15 ]

Created MDEV-29277: MDEV-26636 Race conditions not fully fixed yet

Generated at Thu Feb 08 09:46:48 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.