[MDEV-26539] SIGSEGV in spider_check_and_set_trx_isolation and I_P_List_iterator from THD::drop_temporary_table (10.5.13 opt only) on ALTER Created: 2021-09-05  Updated: 2021-10-18  Resolved: 2021-10-18

Status: Closed
Project: MariaDB Server
Component/s: Data Definition - Alter Table, Storage Engine - Spider
Affects Version/s: 10.5, 10.6, 10.7
Fix Version/s: 10.5.13, 10.6.5

Type: Bug
Priority: Critical
Reporter: Roel Van de Paar
Assignee: Nayuta Yanagisawa (Inactive)
Resolution: Fixed
Votes: 0
Labels: affects-tests, not-10.2, not-10.3, not-10.4, regression

Issue Links:
Problem/Incident
is caused by MDEV-19002 Partition performance optimization Stalled
Relates
relates to MDEV-26540 Spider: Assertion `inited==RND' faile... Confirmed

 Description   

INSTALL PLUGIN spider SONAME 'ha_spider.so';
CREATE TABLE t (c INT) ENGINE=SPIDER PARTITION BY LIST COLUMNS (c) (PARTITION p DEFAULT ENGINE=SPIDER);
INSERT INTO t VALUES (1);
ALTER TABLE t CHECK PARTITION ALL;

Leads to:

10.7.0 1bc82aaf0a7746c0921a94034aff2d51f0d75cd0 (Debug)

Core was generated by `/test/MD040921-mariadb-10.7.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  spider_check_and_set_trx_isolation (conn=0x1464b008e868, 
    need_mon=<optimized out>) at /test/10.7_dbg/storage/spider/spd_trx.cc:1698
1698      if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL)
[Current thread is 1 (Thread 0x1464fc5ad700 (LWP 3331251))]
(gdb) bt
#0  spider_check_and_set_trx_isolation (conn=0x1464b008e868, need_mon=<optimized out>) at /test/10.7_dbg/storage/spider/spd_trx.cc:1698
#1  0x00001464c3b51bdf in ha_spider::dml_init (this=this@entry=0x1464b007c610) at /test/10.7_dbg/storage/spider/ha_spider.cc:16563
#2  0x00001464c3b55785 in ha_spider::rnd_init (this=0x1464b007c610, scan=<optimized out>) at /test/10.7_dbg/storage/spider/ha_spider.cc:7336
#3  0x000055b0885681f9 in handler::ha_rnd_init (scan=true, this=0x1464b007c610) at /test/10.7_dbg/sql/handler.h:3535
#4  ha_partition::check_misplaced_rows (this=this@entry=0x1464b002fc20, read_part_id=read_part_id@entry=0, do_repair=do_repair@entry=false) at /test/10.7_dbg/sql/ha_partition.cc:11089
#5  0x000055b088568b8a in ha_partition::handle_opt_part (this=this@entry=0x1464b002fc20, thd=thd@entry=0x1464b0000db8, check_opt=check_opt@entry=0x1464b00063b8, part_id=part_id@entry=0, flag=flag@entry=3) at /test/10.7_dbg/sql/ha_partition.cc:1378
#6  0x000055b088568e5e in ha_partition::handle_opt_partitions (this=this@entry=0x1464b002fc20, thd=thd@entry=0x1464b0000db8, check_opt=check_opt@entry=0x1464b00063b8, flag=flag@entry=3) at /test/10.7_dbg/sql/ha_partition.cc:1548
#7  0x000055b08856916c in ha_partition::check (this=0x1464b002fc20, thd=0x1464b0000db8, check_opt=0x1464b00063b8) at /test/10.7_dbg/sql/ha_partition.cc:1280
#8  0x000055b088298ad1 in handler::ha_check (this=0x1464b002fc20, thd=0x1464b0000db8, check_opt=0x1464b00063b8) at /test/10.7_dbg/sql/handler.cc:4922
#9  0x000055b08811b020 in mysql_admin_table (thd=thd@entry=0x1464b0000db8, tables=tables@entry=0x1464b0013d68, check_opt=check_opt@entry=0x1464b00063b8, operator_name=operator_name@entry=0x55b08934ac60 <msg_check>, lock_type=lock_type@entry=TL_READ_NO_INSERT, org_open_for_modify=org_open_for_modify@entry=false, repair_table_use_frm=false, extra_open_options=32, prepare_func=0x0, operator_func=(int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x55b088298a6a <handler::ha_check(THD*, st_ha_check_opt*)>, view_operator_func=0x55b0880b1f32 <view_check(THD*, TABLE_LIST*, st_ha_check_opt*)>, is_cmd_replicated=false) at /test/10.7_dbg/sql/sql_admin.cc:919
#10 0x000055b08811d47e in Sql_cmd_check_table::execute (this=this@entry=0x1464b0014450, thd=thd@entry=0x1464b0000db8) at /test/10.7_dbg/sql/sql_admin.cc:1517
#11 0x000055b088108abc in Sql_cmd_alter_table_check_partition::execute (this=0x1464b0014450, thd=0x1464b0000db8) at /test/10.7_dbg/sql/sql_partition_admin.cc:790
#12 0x000055b087f99029 in mysql_execute_command (thd=thd@entry=0x1464b0000db8, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/10.7_dbg/sql/sql_parse.cc:5997
#13 0x000055b087f7fccb in mysql_parse (thd=thd@entry=0x1464b0000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x1464fc5ac400) at /test/10.7_dbg/sql/sql_parse.cc:8036
#14 0x000055b087f8e8d0 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x1464b0000db8, packet=packet@entry=0x1464b000b739 "", packet_length=packet_length@entry=33, blocking=blocking@entry=true) at /test/10.7_dbg/sql/sql_class.h:1358
#15 0x000055b087f91cd6 in do_command (thd=0x1464b0000db8, blocking=blocking@entry=true) at /test/10.7_dbg/sql/sql_parse.cc:1404
#16 0x000055b0881080c8 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55b08c1831d8, put_in_cache=put_in_cache@entry=true) at /test/10.7_dbg/sql/sql_connect.cc:1418
#17 0x000055b0881086cd in handle_one_connection (arg=arg@entry=0x55b08c1831d8) at /test/10.7_dbg/sql/sql_connect.cc:1312
#18 0x000055b088571ade in pfs_spawn_thread (arg=0x55b08c082348) at /test/10.7_dbg/storage/perfschema/pfs.cc:2201
#19 0x000014650067a609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#20 0x0000146500268293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.5.13 (dbg), 10.5.13 (opt), 10.6.5 (dbg), 10.6.5 (opt), 10.7.0 (dbg), 10.7.0 (opt)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.2.41 (dbg), 10.2.41 (opt), 10.3.32 (dbg), 10.3.32 (opt), 10.4.22 (dbg), 10.4.22 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.51 (dbg), 5.6.51 (opt), 5.7.35 (dbg), 5.7.35 (opt), 8.0.26 (dbg), 8.0.26 (opt)

10.4 Works correctly:

10.4.22 46c3e7e3537c31a94289033bfeccf3faf8d4069e (Debug)

10.4.22-dbg>ALTER TABLE t CHECK PARTITION ALL;
+--------+-------+----------+----------+
| Table  | Op    | Msg_type | Msg_text |
+--------+-------+----------+----------+
| test.t | check | status   | OK       |
+--------+-------+----------+----------+
1 row in set (0.000 sec)

OR (Alternative testcase):

INSTALL PLUGIN spider SONAME 'ha_spider.so';
CREATE TABLE t (c INT) ENGINE=SPIDER PARTITION BY LIST COLUMNS (c) (PARTITION p DEFAULT ENGINE=SPIDER);
SELECT * FROM t;
ALTER TABLE t ENGINE=MEMORY;

Leads to:

10.7.0 1bc82aaf0a7746c0921a94034aff2d51f0d75cd0 (Debug)

Core was generated by `/test/MD040921-mariadb-10.7.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  spider_check_and_set_trx_isolation (conn=0x15532008e868, 
    need_mon=<optimized out>) at /test/10.7_dbg/storage/spider/spd_trx.cc:1698
1698	  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL)
[Current thread is 1 (Thread 0x1553749e0700 (LWP 1456230))]
(gdb) bt
#0  spider_check_and_set_trx_isolation (conn=0x15532008e868, need_mon=<optimized out>) at /test/10.7_dbg/storage/spider/spd_trx.cc:1698
#1  0x00001553602a3bdf in ha_spider::dml_init (this=this@entry=0x15532007c610) at /test/10.7_dbg/storage/spider/ha_spider.cc:16563
#2  0x00001553602a7785 in ha_spider::rnd_init (this=0x15532007c610, scan=<optimized out>) at /test/10.7_dbg/storage/spider/ha_spider.cc:7336
#3  0x00005575f65721a6 in handler::ha_rnd_init (scan=true, this=0x15532007c610) at /test/10.7_dbg/sql/handler.h:3535
#4  ha_partition::rnd_init (this=0x15532002fc20, scan=true) at /test/10.7_dbg/sql/ha_partition.cc:5133
#5  0x00005575f62a3435 in handler::ha_rnd_init (scan=true, this=0x15532002fc20) at /test/10.7_dbg/sql/handler.h:3535
#6  handler::ha_rnd_init_with_error (this=0x15532002fc20, scan=scan@entry=true) at /test/10.7_dbg/sql/handler.cc:3614
#7  0x00005575f5e96f79 in init_read_record (info=info@entry=0x1553749dc080, thd=thd@entry=0x155320000db8, table=table@entry=0x15532002f348, select=select@entry=0x0, filesort=filesort@entry=0x0, use_record_cache=use_record_cache@entry=1, print_error=true, disable_rr_cache=false) at /test/10.7_dbg/sql/records.cc:328
#8  0x00005575f608f1ce in copy_data_between_tables (alter_ctx=0x1553749dd6b0, keys_onoff=<optimized out>, deleted=<synthetic pointer>, copied=<synthetic pointer>, order=<optimized out>, order_num=<optimized out>, ignore=<optimized out>, create=@0x1553749dd960: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x0, last = 0x2f747365742f2e00, elements = 1919299188}, <No data fields>}, to=0x1553200ccff8, from=0x15532002f348, thd=0x155320000db8) at /test/10.7_dbg/sql/sql_table.cc:11000
#9  mysql_alter_table (thd=thd@entry=0x155320000db8, new_db=new_db@entry=0x1553200059b8, new_name=new_name@entry=0x155320005dd0, create_info=create_info@entry=0x1553749de4d0, table_list=<optimized out>, table_list@entry=0x155320013d60, alter_info=alter_info@entry=0x1553749de3e0, order_num=<optimized out>, order=<optimized out>, ignore=<optimized out>, if_exists=<optimized out>) at /test/10.7_dbg/sql/sql_table.cc:10356
#10 0x00005575f611d0bb in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x155320000db8) at /test/10.7_dbg/sql/structs.h:568
#11 0x00005575f5fa6029 in mysql_execute_command (thd=thd@entry=0x155320000db8, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/10.7_dbg/sql/sql_parse.cc:5997
#12 0x00005575f5f8cccb in mysql_parse (thd=thd@entry=0x155320000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x1553749df400) at /test/10.7_dbg/sql/sql_parse.cc:8036
#13 0x00005575f5f9b8d0 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x155320000db8, packet=packet@entry=0x15532000b739 "ALTER TABLE t ENGINE=MEMORY", packet_length=packet_length@entry=27, blocking=blocking@entry=true) at /test/10.7_dbg/sql/sql_class.h:1358
#14 0x00005575f5f9ecd6 in do_command (thd=0x155320000db8, blocking=blocking@entry=true) at /test/10.7_dbg/sql/sql_parse.cc:1404
#15 0x00005575f61150c8 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x5575f98d2078, put_in_cache=put_in_cache@entry=true) at /test/10.7_dbg/sql/sql_connect.cc:1418
#16 0x00005575f61156cd in handle_one_connection (arg=arg@entry=0x5575f98d2078) at /test/10.7_dbg/sql/sql_connect.cc:1312
#17 0x00005575f657eade in pfs_spawn_thread (arg=0x5575f97d12b8) at /test/10.7_dbg/storage/perfschema/pfs.cc:2201
#18 0x0000155377aad609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#19 0x000015537769b293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.5.13 (dbg), 10.5.13 (opt), 10.6.5 (dbg), 10.6.5 (opt), 10.7.0 (dbg), 10.7.0 (opt)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.2.41 (dbg), 10.2.41 (opt), 10.3.32 (dbg), 10.3.32 (opt), 10.4.22 (dbg), 10.4.22 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.51 (dbg), 5.6.51 (opt), 5.7.35 (dbg), 5.7.35 (opt), 8.0.26 (dbg), 8.0.26 (opt)



 Comments   
Comment by Roel Van de Paar [ 2021-09-05 ]

The second testcase above produces a different stack (also a SIGSEGV) on 10.5.13 optimized builds only:

10.5.13 0268b8712288d46fbd8a43fdef6bada399b68dff (Optimized)

Core was generated by `/test/MD160821-mariadb-10.5.13-linux-x86_64-opt/bin/mysqld --no-defaults --core'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055674869e0ab in I_P_List_iterator<TABLE, I_P_List<TABLE, All_share_tables, I_P_List_null_counter, I_P_List_no_push_back<TABLE> > >::operator++ (
    this=<synthetic pointer>) at /test/10.5_opt/sql/table.h:1798
[Current thread is 1 (Thread 0x145d60400700 (LWP 1473351))]
(gdb) bt
#0  0x000055674869e0ab in I_P_List_iterator<TABLE, I_P_List<TABLE, All_share_tables, I_P_List_null_counter, I_P_List_no_push_back<TABLE> > >::operator++ (this=<synthetic pointer>) at /test/10.5_opt/sql/table.h:1798
#1  THD::drop_temporary_table (this=this@entry=0x145d14000c58, table=table@entry=0x145d1405f0e8, is_trans=is_trans@entry=0x0, delete_table=delete_table@entry=false) at /test/10.5_opt/sql/temporary_tables.cc:634
#2  0x00005567485adc16 in mysql_alter_table (thd=thd@entry=0x145d14000c58, new_db=new_db@entry=0x145d140054f8, new_name=new_name@entry=0x145d14005918, create_info=create_info@entry=0x145d603fe5a0, table_list=<optimized out>, table_list@entry=0x145d140104d0, alter_info=alter_info@entry=0x145d603fe4d0, order_num=0, order=0x0, ignore=false, if_exists=false) at /test/10.5_opt/sql/sql_table.cc:10966
#3  0x000055674860dfa7 in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x145d14000c58) at /test/10.5_opt/sql/structs.h:559
#4  0x0000556748504bbe in mysql_execute_command (thd=0x145d14000c58) at /test/10.5_opt/sql/sql_parse.cc:6056
#5  0x00005567484f4143 in mysql_parse (thd=0x145d14000c58, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>, is_com_multi=<optimized out>, is_next_command=<optimized out>) at /test/10.5_opt/sql/sql_parse.cc:8100
#6  0x0000556748500925 in dispatch_command (command=COM_QUERY, thd=0x145d14000c58, packet=<optimized out>, packet_length=<optimized out>, is_com_multi=<optimized out>, is_next_command=<optimized out>) at /test/10.5_opt/sql/sql_class.h:1290
#7  0x0000556748502eb2 in do_command (thd=0x145d14000c58) at /test/10.5_opt/sql/sql_parse.cc:1370
#8  0x00005567486092e1 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55674b94b138, put_in_cache=put_in_cache@entry=true) at /test/10.5_opt/sql/sql_connect.cc:1418
#9  0x000055674860975d in handle_one_connection (arg=arg@entry=0x55674b94b138) at /test/10.5_opt/sql/sql_connect.cc:1312
#10 0x00005567489985c9 in pfs_spawn_thread (arg=0x55674b8c84a8) at /test/10.5_opt/storage/perfschema/pfs.cc:2201
#11 0x0000145d767bb609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#12 0x0000145d763a9293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Roel Van de Paar [ 2021-09-05 ]

The first testcase above sporadically produces a different stack; the following one was seen at least once on 10.7 debug:

10.7.0 1bc82aaf0a7746c0921a94034aff2d51f0d75cd0 (Debug)

mysqld: /test/10.7_dbg/sql/sql_class.cc:1556: void THD::cleanup(): Assertion `open_tables == __null' failed.

Core was generated by `/test/MD040921-mariadb-10.7.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
[Current thread is 1 (Thread 0x151f3aece800 (LWP 2208329))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x0000151f3b08a859 in __GI_abort () at abort.c:79
#2  0x0000151f3b08a729 in __assert_fail_base (fmt=0x151f3b220588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55c4c7809f85 "open_tables == __null", file=0x55c4c78078c8 "/test/10.7_dbg/sql/sql_class.cc", line=1556, function=<optimized out>) at assert.c:92
#3  0x0000151f3b09bf36 in __GI___assert_fail (assertion=assertion@entry=0x55c4c7809f85 "open_tables == __null", file=file@entry=0x55c4c78078c8 "/test/10.7_dbg/sql/sql_class.cc", line=line@entry=1556, function=function@entry=0x55c4c7809d48 "void THD::cleanup()") at assert.c:101
#4  0x000055c4c6b0fccf in THD::cleanup (this=this@entry=0x55c4ca62ec78) at /test/10.7_dbg/sql/sql_class.cc:1556
#5  0x000055c4c6b111db in THD::free_connection (this=this@entry=0x55c4ca62ec78) at /test/10.7_dbg/sql/sql_class.cc:1623
#6  0x000055c4c6b1aa91 in THD::~THD (this=0x55c4ca62ec78, __in_chrg=<optimized out>) at /test/10.7_dbg/sql/sql_class.cc:1702
#7  0x000055c4c6b1aea1 in THD::~THD (this=0x55c4ca62ec78, __in_chrg=<optimized out>) at /test/10.7_dbg/sql/sql_class.cc:1672
#8  0x000055c4c6ac8ca5 in grant_init () at /test/10.7_dbg/sql/sql_acl.cc:7780
#9  0x000055c4c6a42404 in mysqld_main (argc=<optimized out>, argv=<optimized out>) at /test/10.7_dbg/sql/mysqld.cc:5713
#10 0x000055c4c6a33b36 in main (argc=<optimized out>, argv=<optimized out>) at /test/10.7_dbg/sql/main.cc:34

Comment by Roel Van de Paar [ 2021-09-05 ]

Additional testcases, all to be added to MTR:

INSTALL PLUGIN spider SONAME 'ha_spider.so';
CREATE TABLE t (c INT) ENGINE=SPIDER;
INSERT INTO t VALUES (0);
ALTER TABLE t ENGINE=InnoDB;

And

INSTALL PLUGIN spider SONAME 'ha_spider.so';
CREATE TABLE t (c INT PRIMARY KEY) ENGINE=SPIDER PARTITION BY LIST (c) (PARTITION p0 VALUES IN (0),PARTITION p1 VALUES IN (1));
INSERT INTO t VALUES (0);
ALTER TABLE t ENGINE=InnoDB;

And

INSTALL PLUGIN spider SONAME 'ha_spider.so';
CREATE TABLE t (c INT,c2 INT,PRIMARY KEY(c,c2)) ENGINE=SPIDER PARTITION BY LIST (c) (PARTITION p0 VALUES IN (0),PARTITION p1 VALUES IN (1));
INSERT INTO t VALUES (0);
ALTER TABLE t ENGINE=InnoDB;

Three different stacks are produced, two already listed here and this one on optimized:

10.7.0 1bc82aaf0a7746c0921a94034aff2d51f0d75cd0 (Optimized)

Core was generated by `/test/MD040921-mariadb-10.7.0-linux-x86_64-opt/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  TABLE_SHARE::destroy (this=this@entry=0x14c5b4064268)
    at /test/10.7_opt/sql/table.cc:486
[Current thread is 1 (Thread 0x14c60413e700 (LWP 2060548))]
(gdb) bt
#0  TABLE_SHARE::destroy (this=this@entry=0x14c5b4064268) at /test/10.7_opt/sql/table.cc:486
#1  0x0000564e8e4458d9 in free_table_share (share=share@entry=0x14c5b4064268) at /test/10.7_opt/sql/table.cc:543
#2  0x0000564e8e523069 in THD::free_tmp_table_share (this=this@entry=0x14c5b4000c58, share=share@entry=0x14c5b4064268, delete_table=delete_table@entry=false) at /test/10.7_opt/sql/temporary_tables.cc:1464
#3  0x0000564e8e524327 in THD::drop_temporary_table (this=this@entry=0x14c5b4000c58, table=table@entry=0x14c5b4095828, is_trans=is_trans@entry=0x0, delete_table=delete_table@entry=false) at /test/10.7_opt/sql/temporary_tables.cc:669
#4  0x0000564e8e41d2cf in mysql_alter_table (thd=thd@entry=0x14c5b4000c58, new_db=new_db@entry=0x14c5b4005698, new_name=new_name@entry=0x14c5b4005ab0, create_info=create_info@entry=0x14c60413c5a0, table_list=<optimized out>, table_list@entry=0x14c5b4010880, alter_info=alter_info@entry=0x14c60413c4b0, order_num=0, order=0x0, ignore=false, if_exists=false) at /test/10.7_opt/sql/sql_table.cc:10465
#5  0x0000564e8e48592d in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x14c5b4000c58) at /test/10.7_opt/sql/structs.h:568
#6  0x0000564e8e36709e in mysql_execute_command (thd=0x14c5b4000c58, is_called_from_prepared_stmt=<optimized out>) at /test/10.7_opt/sql/sql_parse.cc:5997
#7  0x0000564e8e357456 in mysql_parse (thd=0x14c5b4000c58, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /test/10.7_opt/sql/sql_parse.cc:8036
#8  0x0000564e8e363345 in dispatch_command (command=COM_QUERY, thd=0x14c5b4000c58, packet=<optimized out>, packet_length=<optimized out>, blocking=<optimized out>) at /test/10.7_opt/sql/sql_class.h:1358
#9  0x0000564e8e365217 in do_command (thd=0x14c5b4000c58, blocking=blocking@entry=true) at /test/10.7_opt/sql/sql_parse.cc:1404
#10 0x0000564e8e480ae7 in do_handle_one_connection (connect=<optimized out>, put_in_cache=true) at /test/10.7_opt/sql/sql_connect.cc:1418
#11 0x0000564e8e480e2d in handle_one_connection (arg=arg@entry=0x564e90019548) at /test/10.7_opt/sql/sql_connect.cc:1312
#12 0x0000564e8e7d4298 in pfs_spawn_thread (arg=0x564e904bbe98) at /test/10.7_opt/storage/perfschema/pfs.cc:2201
#13 0x000014c605676609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#14 0x000014c605264293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The third testcase has a very common-looking stack, so other bugs may be masked by it.

SIGSEGV|TABLE_SHARE::destroy|free_table_share|THD::free_tmp_table_share|THD::drop_temporary_table

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-21 ]

MTR test case:

--disable_query_log
--disable_result_log
--source ../../t/test_init.inc
--enable_result_log
--enable_query_log
 
CREATE DATABASE auto_test_local;
USE auto_test_local;
 
CREATE TABLE t (c INT) ENGINE=SPIDER PARTITION BY LIST COLUMNS (c) (PARTITION p DEFAULT ENGINE=SPIDER);
--error 1429
INSERT INTO t VALUES (1);
ALTER TABLE t CHECK PARTITION ALL;
 
DROP DATABASE auto_test_local;
 
--disable_query_log
--disable_result_log
--source ../../t/test_deinit.inc
--enable_result_log
--enable_query_log

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-21 ]

I ran git-bisect with the above test case and got the following:

e954d9de886aebc68c39240304fe97ae88276dbb is the first bad commit
commit e954d9de886aebc68c39240304fe97ae88276dbb
Author: Kentoku SHIBA <kentokushiba@gmail.com>
Date:   Tue Mar 3 02:50:40 2020 +0900
 
    MDEV-19002 Spider performance optimization with partition
 
    Change the following function for batch call instead of each partition
    - store_lock
    - external_lock
    - start_stmt
    - extra
    - cond_push
    - info_push
    - top_table
 
 sql/ha_partition.h                                 |    4 +
 storage/spider/ha_spider.cc                        | 2975 +++++++++++---------
 storage/spider/ha_spider.h                         |   69 +-
 .../spider/bugfix/include/insert_select_deinit.inc |   16 +
 .../spider/bugfix/include/insert_select_init.inc   |   43 +
 .../mysql-test/spider/bugfix/t/insert_select.cnf   |    3 +
 .../mysql-test/spider/bugfix/t/insert_select.test  |   99 +
 storage/spider/spd_conn.cc                         |   43 +-
 storage/spider/spd_copy_tables.cc                  |   16 +-
 storage/spider/spd_db_conn.cc                      |  358 +--
 storage/spider/spd_db_include.h                    |    1 -
 storage/spider/spd_db_mysql.cc                     |  216 +-
 storage/spider/spd_db_oracle.cc                    |  146 +-
 storage/spider/spd_group_by_handler.cc             |   20 +-
 storage/spider/spd_include.h                       |  127 +-
 storage/spider/spd_table.cc                        |  456 +--
 storage/spider/spd_table.h                         |   26 +-
 storage/spider/spd_trx.cc                          |  132 +-
 storage/spider/spd_trx.h                           |    4 +
 19 files changed, 2808 insertions(+), 1946 deletions(-)
 create mode 100644 storage/spider/mysql-test/spider/bugfix/include/insert_select_deinit.inc
 create mode 100644 storage/spider/mysql-test/spider/bugfix/include/insert_select_init.inc
 create mode 100644 storage/spider/mysql-test/spider/bugfix/t/insert_select.cnf
 create mode 100644 storage/spider/mysql-test/spider/bugfix/t/insert_select.test
bisect run success

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-24 ]

I analyzed the execution trace of the above test case. The bug looks quite similar to MDEV-26582 (see the comments on that issue). The connection is freed when INSERT INTO t VALUES (1) is rolled back, and the freed connection is then reused by ALTER TABLE t CHECK PARTITION ALL.

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 877778.877799]
0x00007fcc59cfbdcf in spider_check_and_set_trx_isolation (conn=0x7fcc7412a3f8, need_mon=0x7fcc74128828) at /home/nayuta/repo/mariadb-server/storage/spider/spd_trx.cc:1640
1640      if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL)
(rr) rwatch -l conn->thd
Hardware read watchpoint 1: -location conn->thd
(rr) rc
Continuing.
 
Thread 2 received signal SIGSEGV, Segmentation fault.
0x00007fcc59cfbdcf in spider_check_and_set_trx_isolation (conn=0x7fcc7412a3f8, need_mon=0x7fcc74128828) at /home/nayuta/repo/mariadb-server/storage/spider/spd_trx.cc:1640
1640      if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL)
(rr) rc
Continuing.
 
Thread 2 hit Hardware read watchpoint 1: -location conn->thd
 
Value = (THD *) 0x8f8f8f8f8f8f8f8f
0x00007fcc59cfbda1 in spider_check_and_set_trx_isolation (conn=0x7fcc7412a3f8, need_mon=0x7fcc74128828) at /home/nayuta/repo/mariadb-server/storage/spider/spd_trx.cc:1637
1637      THD *thd = conn->thd;
(rr) rc
Continuing.
 
Thread 2 hit Hardware read watchpoint 1: -location conn->thd
 
Value = (THD *) 0x0
0x00007fcc843e3474 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(rr) bt
#0  0x00007fcc843e3474 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x000055e28d78f3c6 in free_memory (ptr=0x7fcc7412a3d0) at /home/nayuta/repo/mariadb-server/mysys/safemalloc.c:279
#2  0x000055e28d78f07c in sf_free (ptr=0x7fcc7412a3d0) at /home/nayuta/repo/mariadb-server/mysys/safemalloc.c:198
#3  0x000055e28d77c3cb in my_free (ptr=0x7fcc7412a3e8) at /home/nayuta/repo/mariadb-server/mysys/my_malloc.c:211
#4  0x00007fcc59d9f548 in spider_free_mem (trx=0x7fcc740f2d08, ptr=0x7fcc7412a3f8, my_flags=0) at /home/nayuta/repo/mariadb-server/storage/spider/spd_malloc.cc:188
#5  0x00007fcc59d41076 in spider_free_conn (conn=0x7fcc7412a3f8) at /home/nayuta/repo/mariadb-server/storage/spider/spd_conn.cc:1404
#6  0x00007fcc59d3ed77 in spider_free_conn_from_trx (trx=0x7fcc740f2d08, conn=0x7fcc7412a3f8, another=false, trx_free=false, roop_count=0x7fcc5a1fbb04)
    at /home/nayuta/repo/mariadb-server/storage/spider/spd_conn.cc:420
#7  0x00007fcc59cf61b3 in spider_free_trx_conn (trx=0x7fcc740f2d08, trx_free=false) at /home/nayuta/repo/mariadb-server/storage/spider/spd_trx.cc:117
#8  0x00007fcc59d01240 in spider_rollback (hton=0x7fcc74034d68, thd=0x7fcc74002718, all=false) at /home/nayuta/repo/mariadb-server/storage/spider/spd_trx.cc:3559
#9  0x000055e28ce608a3 in ha_rollback_trans (thd=0x7fcc74002718, all=false) at /home/nayuta/repo/mariadb-server/sql/handler.cc:2068
#10 0x000055e28ccb3ec6 in trans_rollback_stmt (thd=0x7fcc74002718) at /home/nayuta/repo/mariadb-server/sql/transaction.cc:535
#11 0x000055e28caf6188 in mysql_execute_command (thd=0x7fcc74002718) at /home/nayuta/repo/mariadb-server/sql/sql_parse.cc:6109
#12 0x000055e28cafbd5b in mysql_parse (thd=0x7fcc74002718, rawbuf=0x7fcc74015920 "INSERT INTO t VALUES (1)", length=24, parser_state=0x7fcc5a1fc3f0, is_com_multi=false,
    is_next_command=false) at /home/nayuta/repo/mariadb-server/sql/sql_parse.cc:8100
#13 0x000055e28cae7ce2 in dispatch_command (command=COM_QUERY, thd=0x7fcc74002718, packet=0x7fcc7400d039 "INSERT INTO t VALUES (1)", packet_length=24, is_com_multi=false,
    is_next_command=false) at /home/nayuta/repo/mariadb-server/sql/sql_parse.cc:1891
#14 0x000055e28cae64da in do_command (thd=0x7fcc74002718) at /home/nayuta/repo/mariadb-server/sql/sql_parse.cc:1370
#15 0x000055e28cc97693 in do_handle_one_connection (connect=0x55e290d009a8, put_in_cache=true) at /home/nayuta/repo/mariadb-server/sql/sql_connect.cc:1418
#16 0x000055e28cc97349 in handle_one_connection (arg=0x55e290d009a8) at /home/nayuta/repo/mariadb-server/sql/sql_connect.cc:1312
#17 0x000055e28d1bd116 in pfs_spawn_thread (arg=0x55e290c3c3a8) at /home/nayuta/repo/mariadb-server/storage/perfschema/pfs.cc:2201
#18 0x00007fcc847de450 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#19 0x00007fcc84377d53 in clone () from /lib/x86_64-linux-gnu/libc.so.6
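
The sequence in this trace (a connection freed at the end of one statement, then read by the next) can be modelled in a few lines. The following is only an illustrative sketch: `Conn`, `Trx` and `Handler` are hypothetical stand-ins and not the real Spider structures, and a `std::weak_ptr` is used in place of the raw pointer so that the stale reference can be detected rather than dereferenced.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a Spider connection; only the field that the
// crashing line reads (conn->thd) is modelled.
struct Conn {
  int thd_id;
};

// Hypothetical stand-in for the per-session transaction object that owns
// the connections.
struct Trx {
  std::vector<std::shared_ptr<Conn>> conns;

  std::shared_ptr<Conn> get_conn(int thd_id) {
    conns.push_back(std::make_shared<Conn>(Conn{thd_id}));
    return conns.back();
  }

  // Models the connections being freed when a statement is committed or
  // rolled back.
  void end_statement() { conns.clear(); }
};

// Hypothetical stand-in for a table handler that keeps a reference to a
// connection across statements.
struct Handler {
  std::weak_ptr<Conn> cached;

  void attach(const std::shared_ptr<Conn> &conn) {
    assert(conn);  // a handler is only ever attached to a live connection
    cached = conn;
  }

  // True only while the connection is still owned by the transaction.
  bool conn_is_alive() const { return !cached.expired(); }
};
```

In this model, calling `attach(trx.get_conn(1))` followed by `trx.end_statement()` leaves `conn_is_alive()` returning false, which corresponds to the state in which the next statement would have to obtain a fresh connection instead of using the cached one.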

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-29 ]

The present bug is not resolved by the fix for MDEV-26582, so it may be a different bug, although I still think the two bugs occur through a similar mechanism.

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-30 ]

The server crashes even if the INSERT prior to ALTER TABLE succeeds.

--disable_query_log
--disable_result_log
--source ../../t/test_init.inc
--enable_result_log
--enable_query_log
 
CREATE SERVER s FOREIGN DATA WRAPPER mysql OPTIONS (USER 'root', HOST '127.0.0.1', PORT 16000, DATABASE 'auto_test_remote');
CREATE DATABASE auto_test_remote;
USE auto_test_remote;
CREATE TABLE t (c INT) ENGINE=InnoDB;
SET @@session.spider_same_server_link = ON;
 
CREATE DATABASE auto_test_local;
USE auto_test_local;
 
CREATE TABLE t (c INT) ENGINE=SPIDER PARTITION BY LIST COLUMNS (c) (PARTITION p DEFAULT COMMENT='srv "s", table "t"');
INSERT INTO t VALUES (1);
ALTER TABLE t CHECK PARTITION ALL;
 
DROP DATABASE auto_test_local;
DROP DATABASE auto_test_remote;
 
--disable_query_log
--disable_result_log
--source ../../t/test_deinit.inc
--enable_result_log
--enable_query_log

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-30 ]

A connection is freed at the end of the INSERT and then is accessed by the ALTER TABLE.

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 61780.61825]
0x00007f323d865dcf in spider_check_and_set_trx_isolation (conn=0x7f323012a9f8, need_mon=0x7f3230128e20) at /home/nayuta/repo/mariadb-server/storage/spider/spd_trx.cc:1640
1640      if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL)
(rr) watch -l conn->thd
Hardware watchpoint 1: -location conn->thd
(rr) rc
Continuing.
 
Thread 2 received signal SIGSEGV, Segmentation fault.
0x00007f323d865dcf in spider_check_and_set_trx_isolation (conn=0x7f323012a9f8, need_mon=0x7f3230128e20) at /home/nayuta/repo/mariadb-server/storage/spider/spd_trx.cc:1640
1640      if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL)
(rr) 
Continuing.
 
Thread 2 hit Hardware watchpoint 1: -location conn->thd
 
Old value = (THD *) 0x8f8f8f8f8f8f8f8f
New value = (THD *) 0x0
0x00007f3255330474 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(rr) bt
#0  0x00007f3255330474 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x000055cae84323c6 in free_memory (ptr=0x7f323012a9d0) at /home/nayuta/repo/mariadb-server/mysys/safemalloc.c:279
#2  0x000055cae843207c in sf_free (ptr=0x7f323012a9d0) at /home/nayuta/repo/mariadb-server/mysys/safemalloc.c:198
#3  0x000055cae841f3cb in my_free (ptr=0x7f323012a9e8) at /home/nayuta/repo/mariadb-server/mysys/my_malloc.c:211
#4  0x00007f323d909570 in spider_free_mem (trx=0x7f32300f2c08, ptr=0x7f323012a9f8, my_flags=0) at /home/nayuta/repo/mariadb-server/storage/spider/spd_malloc.cc:188
#5  0x00007f323d8ab09e in spider_free_conn (conn=0x7f323012a9f8) at /home/nayuta/repo/mariadb-server/storage/spider/spd_conn.cc:1404
#6  0x00007f323d8a8d9f in spider_free_conn_from_trx (trx=0x7f32300f2c08, conn=0x7f323012a9f8, another=false, trx_free=false, roop_count=0x7f323dd65114)
    at /home/nayuta/repo/mariadb-server/storage/spider/spd_conn.cc:420
#7  0x00007f323d8601b3 in spider_free_trx_conn (trx=0x7f32300f2c08, trx_free=false) at /home/nayuta/repo/mariadb-server/storage/spider/spd_trx.cc:117
#8  0x00007f323d86ae9a in spider_commit (hton=0x7f3230035fb8, thd=0x7f3230001d28, all=false) at /home/nayuta/repo/mariadb-server/storage/spider/spd_trx.cc:3486
#9  0x000055cae7b033a8 in commit_one_phase_2 (thd=0x7f3230001d28, all=false, trans=0x7f32300054b0, is_real_trans=true) at /home/nayuta/repo/mariadb-server/sql/handler.cc:1956
#10 0x000055cae7b03298 in ha_commit_one_phase (thd=0x7f3230001d28, all=false) at /home/nayuta/repo/mariadb-server/sql/handler.cc:1935
#11 0x000055cae7b023d4 in ha_commit_trans (thd=0x7f3230001d28, all=false) at /home/nayuta/repo/mariadb-server/sql/handler.cc:1729
#12 0x000055cae7956976 in trans_commit_stmt (thd=0x7f3230001d28) at /home/nayuta/repo/mariadb-server/sql/transaction.cc:472
#13 0x000055cae77991df in mysql_execute_command (thd=0x7f3230001d28) at /home/nayuta/repo/mariadb-server/sql/sql_parse.cc:6116
#14 0x000055cae779ed5b in mysql_parse (thd=0x7f3230001d28, rawbuf=0x7f3230016ba0 "INSERT INTO tbl_a VALUES (1)", length=28, parser_state=0x7f323dd663f0, is_com_multi=false, 
    is_next_command=false) at /home/nayuta/repo/mariadb-server/sql/sql_parse.cc:8100
#15 0x000055cae778ace2 in dispatch_command (command=COM_QUERY, thd=0x7f3230001d28, packet=0x7f323000e2b9 "INSERT INTO tbl_a VALUES (1)", packet_length=28, is_com_multi=false, 
    is_next_command=false) at /home/nayuta/repo/mariadb-server/sql/sql_parse.cc:1891
#16 0x000055cae77894da in do_command (thd=0x7f3230001d28) at /home/nayuta/repo/mariadb-server/sql/sql_parse.cc:1370
#17 0x000055cae793a693 in do_handle_one_connection (connect=0x55caeb9e3778, put_in_cache=true) at /home/nayuta/repo/mariadb-server/sql/sql_connect.cc:1418
#18 0x000055cae793a349 in handle_one_connection (arg=0x55caeb9e3778) at /home/nayuta/repo/mariadb-server/sql/sql_connect.cc:1312
#19 0x000055cae7e60116 in pfs_spawn_thread (arg=0x55caeb9207e8) at /home/nayuta/repo/mariadb-server/storage/perfschema/pfs.cc:2201
#20 0x00007f325572b450 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#21 0x00007f32552c4d53 in clone () from /lib/x86_64-linux-gnu/libc.so.6
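
In both replays the watchpoint shows `conn->thd` holding `0x8f8f8f8f8f8f8f8f` after the block has gone through `free_memory()`. A pointer made of a single repeated byte is a useful hint that it was read from memory that had already been released. A minimal sketch of such a check follows; `is_repeated_byte` and `looks_like_freed_pointer` are hypothetical helpers written for this note, not MariaDB functions, and the fill byte is taken from the trace above rather than from the allocator source.

```cpp
#include <cassert>
#include <cstdint>

// Fill byte observed in the trace above (conn->thd == 0x8f8f8f8f8f8f8f8f).
const std::uint8_t kObservedFreeFill = 0x8f;

// Hypothetical helper: true when the low `width` bytes of `value` all
// equal `fill`.
inline bool is_repeated_byte(std::uint64_t value, std::uint8_t fill,
                             unsigned width = 8) {
  assert(width >= 1 && width <= 8);
  for (unsigned i = 0; i < width; ++i) {
    if (((value >> (8 * i)) & 0xffu) != fill) return false;
  }
  return true;
}

// True when a 64-bit pointer value consists only of the observed fill byte.
inline bool looks_like_freed_pointer(std::uint64_t value) {
  return is_repeated_byte(value, kObservedFreeFill);
}
```

For example, `looks_like_freed_pointer(0x8f8f8f8f8f8f8f8fULL)` is true, while a plausible heap address such as `0x7fcc74002718` or a null pointer is not flagged.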

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-30 ]

spider_check_trx_and_get_conn() does not allocate connections if sql_command == SQLCOM_ALTER_TABLE. So, the following patch fixes the bug.

However, I'm not convinced that this is the right fix, because I do not know why spider_check_trx_and_get_conn() behaves in this way. I need to check it.

diff --git a/storage/spider/spd_trx.cc b/storage/spider/spd_trx.cc
index 0eda9d31df6..a28f4a47f47 100644
--- a/storage/spider/spd_trx.cc
+++ b/storage/spider/spd_trx.cc
@@ -3745,8 +3745,7 @@ int spider_check_trx_and_get_conn(
   spider->wide_handler->trx = trx;
   spider->set_error_mode();
   if (
-    spider->wide_handler->sql_command != SQLCOM_DROP_TABLE &&
-    spider->wide_handler->sql_command != SQLCOM_ALTER_TABLE
+    spider->wide_handler->sql_command != SQLCOM_DROP_TABLE
   ) {
     SPIDER_TRX_HA *trx_ha = spider_check_trx_ha(trx, spider);
     if (!trx_ha || trx_ha->wait_for_reusing)

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-30 ]

Just dropping tbl_a does not crash the server because DROP TABLE does not access a data node. There is therefore no need to allocate connections before DROP TABLE, and the current Spider behavior for that statement seems to be correct. On the other hand, some ALTER TABLE statements, such as ALTER TABLE ... CHECK PARTITION, might access the data node, so a new connection needs to be allocated for them.
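
The reasoning above amounts to a small predicate on the statement type. The sketch below restates it under stated assumptions: `SqlCommand` is a hypothetical reduced enum (the server's own statement types are the `SQLCOM_*` values seen in the patch), and both function names are invented for illustration.

```cpp
#include <cassert>

// Hypothetical subset of statement types, standing in for the server's
// SQLCOM_* values.
enum class SqlCommand { Select, Insert, AlterTable, DropTable };

// Condition as in the proposed patch: only DROP TABLE skips connection
// allocation, because it does not access a data node.
inline bool needs_data_node_conn(SqlCommand cmd) {
  return cmd != SqlCommand::DropTable;
}

// Condition before the patch, for comparison: ALTER TABLE was skipped too.
inline bool needs_data_node_conn_before_patch(SqlCommand cmd) {
  return cmd != SqlCommand::DropTable && cmd != SqlCommand::AlterTable;
}
```

With the pre-patch condition, `SqlCommand::AlterTable` yields false, which matches the crash path in which ALTER TABLE ... CHECK PARTITION reached rnd_init() without a newly allocated connection.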

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-30 ]

serg Please review: https://github.com/MariaDB/server/commit/ad74dfc98a6d670802d6ca74476bf068e2779f5e

Comment by Sergei Golubchik [ 2021-10-15 ]

ad74dfc98a6d670802d6ca74476bf068e2779f5e is ok to push
