[MDEV-31463] SIGSEGV in server_mysql_send_query | server_mysql_real_query Created: 2023-06-12  Updated: 2023-09-12  Resolved: 2023-06-28

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: 11.0, 11.1
Fix Version/s: 11.0.3, 11.1.2

Type: Bug Priority: Critical
Reporter: Ramesh Sivaraman Assignee: Yuchen Pei
Resolution: Fixed Votes: 0
Labels: UBSAN

Issue Links:
Problem/Incident
causes MDEV-31586 The test spider/bugfix.mdev_31463 has... Closed
is caused by MDEV-29676 Dual thread hang in 'closing tables' ... Closed

 Description   

CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET '../socket.sock',DATABASE'',USER'',PASSWORD'');
SET GLOBAL wait_timeout=+1;
INSTALL PLUGIN Spider SONAME 'ha_spider.so';
CREATE TABLE t (c INT KEY,c1 BLOB,c2 TEXT) ENGINE=Spider COMMENT='WRAPPER "mysql",srv "srv"';
XA START 'a';
INSERT INTO t VALUES (0,0,0,0);
SELECT SLEEP (4);
INSERT INTO t VALUES (1,2,3),(4,5,6),(7,8,9);
SELECT * FROM information_schema.key_column_usage;

Leads to:

11.0.2 5ba3bafb834dbc56e6105809ded9a7ccef70ee54 (Optimized)

Core was generated by `/test/MD060623-mariadb-11.0.2-linux-x86_64-opt/bin/mariadbd --no-defaults --cor'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055e2abdce487 in server_mysql_send_query (mysql=mysql@entry=0x0, 
    query=0x14e250201918 "set @old_lock_wait_timeout=@@session.lock_wait_timeout;set session lock_wait_timeout=1;", length=87)
    at /test/11.0_opt/sql-common/client.c:3541
[Current thread is 1 (Thread 0x14e28c08f700 (LWP 64380))]
(gdb) bt
#0  0x000055e2abdce487 in server_mysql_send_query (mysql=mysql@entry=0x0, query=0x14e250201918 "set @old_lock_wait_timeout=@@session.lock_wait_timeout;set session lock_wait_timeout=1;", length=87) at /test/11.0_opt/sql-common/client.c:3541
#1  0x000055e2abdce516 in server_mysql_real_query (mysql=0x0, query=<optimized out>, length=<optimized out>) at /test/11.0_opt/sql-common/client.c:3558
#2  0x000014e278173738 in spider_db_mbase::exec_query (this=0x14e2500654e0, query=0x14e250201918 "set @old_lock_wait_timeout=@@session.lock_wait_timeout;set session lock_wait_timeout=1;", length=87, quick_mode=<optimized out>) at /test/11.0_opt/storage/spider/spd_db_mysql.cc:2083
#3  0x000014e278181c11 in spider_db_mbase::set_lock_wait_timeout (this=0x14e2500654e0, timeout=<optimized out>) at /test/11.0_opt/sql/sql_string.h:741
#4  0x000014e278182116 in spider_mbase_handler::show_table_status (this=0x14e250069390, link_idx=<optimized out>, sts_mode=1, flag=<optimized out>) at /test/11.0_opt/storage/spider/spd_db_mysql.cc:13595
#5  0x000014e278135411 in spider_get_sts (share=0x14e250061f38, link_idx=0, tmp_time=tmp_time@entry=1686553891, spider=spider@entry=0x14e2500247b0, sts_interval=sts_interval@entry=0, sts_mode=sts_mode@entry=1, sts_sync=sts_sync@entry=0, sts_sync_level=1, flag=86) at /test/11.0_opt/storage/spider/spd_table.cc:6822
#6  0x000014e278163b68 in ha_spider::info (this=0x14e2500247b0, flag=22) at /test/11.0_opt/storage/spider/ha_spider.cc:6690
#7  0x000055e2abc38b82 in get_schema_key_column_usage_record (res=false, table_name=0x14e250082008, db_name=0x14e250014670, table=0x14e250083d10, tables=<optimized out>, thd=0x14e250000c58) at /test/11.0_opt/sql/sql_show.cc:7448
#8  get_schema_key_column_usage_record (thd=0x14e250000c58, tables=<optimized out>, table=0x14e250083d10, res=<optimized out>, db_name=0x14e250014670, table_name=0x14e250082008) at /test/11.0_opt/sql/sql_show.cc:7426
#9  0x000055e2abc28218 in fill_schema_table_by_open (thd=0x14e250000c58, mem_root=<optimized out>, is_show_fields_or_keys=<optimized out>, table=0x14e250083d10, schema_table=0x55e2acc87ce0 <schema_tables+1280>, orig_db_name=0x14e250014670, orig_table_name=0x14e250082008, open_tables_state_backup=0x14e28c08bdf0, can_deadlock=true) at /test/11.0_opt/sql/sql_show.cc:4749
#10 0x000055e2abc3f7ea in get_all_tables (thd=0x14e250000c58, tables=0x14e250010ec8, cond=<optimized out>) at /test/11.0_opt/sql/sql_show.cc:5385
#11 0x000055e2abc40f2e in get_schema_tables_result (join=join@entry=0x14e250012560, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /test/11.0_opt/sql/sql_show.cc:9060
#12 0x000055e2abc23e97 in JOIN::exec_inner (this=0x14e250012560) at /test/11.0_opt/sql/sql_select.cc:4859
#13 0x000055e2abc247ae in JOIN::exec (this=this@entry=0x14e250012560) at /test/11.0_opt/sql/sql_select.cc:4680
#14 0x000055e2abc228ec in mysql_select (thd=0x14e250000c58, tables=0x14e250010ec8, fields=<optimized out>, conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=<optimized out>, result=0x14e250012538, unit=0x14e250004ce0, select_lex=0x14e250010830) at /test/11.0_opt/sql/sql_select.cc:5161
#15 0x000055e2abc23037 in handle_select (thd=thd@entry=0x14e250000c58, lex=lex@entry=0x14e250004c08, result=result@entry=0x14e250012538, setup_tables_done_option=setup_tables_done_option@entry=0) at /test/11.0_opt/sql/sql_select.cc:619
#16 0x000055e2abba01fe in execute_sqlcom_select (thd=0x14e250000c58, all_tables=0x14e250010ec8) at /test/11.0_opt/sql/sql_parse.cc:6279
#17 0x000055e2abbae0d4 in mysql_execute_command (thd=0x14e250000c58, is_called_from_prepared_stmt=<optimized out>) at /test/11.0_opt/sql/sql_parse.cc:3949
#18 0x000055e2abb9ae55 in mysql_parse (rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>, thd=0x14e250000c58) at /test/11.0_opt/sql/sql_parse.cc:8014
#19 mysql_parse (thd=0x14e250000c58, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /test/11.0_opt/sql/sql_parse.cc:7936
#20 0x000055e2abba7142 in dispatch_command (command=COM_QUERY, thd=0x14e250000c58, packet=<optimized out>, packet_length=<optimized out>, blocking=<optimized out>) at /test/11.0_opt/sql/sql_class.h:1370
#21 0x000055e2abba8f5e in do_command (thd=0x14e250000c58, blocking=blocking@entry=true) at /test/11.0_opt/sql/sql_parse.cc:1407
#22 0x000055e2abcc850f in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55e2adfbbaf8, put_in_cache=put_in_cache@entry=true) at /test/11.0_opt/sql/sql_connect.cc:1416
#23 0x000055e2abcc87fd in handle_one_connection (arg=0x55e2adfbbaf8) at /test/11.0_opt/sql/sql_connect.cc:1318
#24 0x000014e2a453a609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#25 0x000014e2a4126133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 11.0.2 (dbg), 11.0.2 (opt)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.4.30 (dbg), 10.4.30 (opt), 10.5.21 (dbg), 10.5.21 (opt), 10.6.14 (dbg), 10.6.14 (opt), 10.9.7 (dbg), 10.9.7 (opt), 10.10.5 (dbg), 10.10.5 (opt), 10.11.4 (dbg), 10.11.4 (opt), 11.1.0 (dbg), 11.1.0 (opt)



 Comments   
Comment by Roel Van de Paar [ 2023-06-13 ]

This bug also produces a new UBSAN error:

11.0.2 368dd22a816f3b437bccd0b9ff28b9de9b1abf0a (Debug, UBASAN)

/test/11.0_dbg_san/sql-common/client.c:3541:21: runtime error: member access within null pointer of type 'struct MYSQL'
    #0 0x555ef6996bce in server_mysql_send_query /test/11.0_dbg_san/sql-common/client.c:3541
    #1 0x555ef6996ce7 in server_mysql_real_query /test/11.0_dbg_san/sql-common/client.c:3558
    #2 0x14844b9d9be5 in spider_db_mbase::exec_query(char const*, unsigned int, int) /test/11.0_dbg_san/storage/spider/spd_db_mysql.cc:2083
    #3 0x14844ba3f7e0 in spider_db_mbase::set_lock_wait_timeout(unsigned int) /test/11.0_dbg_san/storage/spider/spd_db_mysql.cc:13457
    #4 0x14844ba427f9 in spider_mbase_handler::show_table_status(int, int, unsigned int) /test/11.0_dbg_san/storage/spider/spd_db_mysql.cc:13548
    #5 0x14844b7018a1 in spider_db_show_table_status(ha_spider*, int, int, unsigned int) /test/11.0_dbg_san/storage/spider/spd_db_conn.cc:5210
    #6 0x14844b7da51d in spider_get_sts(st_spider_share*, int, long, ha_spider*, double, int, int, int, unsigned int) /test/11.0_dbg_san/storage/spider/spd_table.cc:6822
    #7 0x14844b948b44 in ha_spider::info(unsigned int) /test/11.0_dbg_san/storage/spider/ha_spider.cc:6690
    #8 0x555ef599ff75 in get_schema_key_column_usage_record /test/11.0_dbg_san/sql/sql_show.cc:7448
    #9 0x555ef58cd43c in fill_schema_table_by_open /test/11.0_dbg_san/sql/sql_show.cc:4749
    #10 0x555ef59dec13 in get_all_tables(THD*, TABLE_LIST*, Item*) /test/11.0_dbg_san/sql/sql_show.cc:5385
    #11 0x555ef59e6dc4 in get_schema_tables_result(JOIN*, enum_schema_table_state) /test/11.0_dbg_san/sql/sql_show.cc:9060
    #12 0x555ef588283c in JOIN::exec_inner() /test/11.0_dbg_san/sql/sql_select.cc:4857
    #13 0x555ef5886916 in JOIN::exec() /test/11.0_dbg_san/sql/sql_select.cc:4677
    #14 0x555ef58750c1 in mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) /test/11.0_dbg_san/sql/sql_select.cc:5158
    #15 0x555ef587951c in handle_select(THD*, LEX*, select_result*, unsigned long long) /test/11.0_dbg_san/sql/sql_select.cc:616
    #16 0x555ef53eba01 in execute_sqlcom_select /test/11.0_dbg_san/sql/sql_parse.cc:6279
    #17 0x555ef544cef5 in mysql_execute_command(THD*, bool) /test/11.0_dbg_san/sql/sql_parse.cc:3949
    #18 0x555ef547c973 in mysql_parse(THD*, char*, unsigned int, Parser_state*) /test/11.0_dbg_san/sql/sql_parse.cc:8014
    #19 0x555ef548c707 in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool) /test/11.0_dbg_san/sql/sql_parse.cc:1894
    #20 0x555ef549a542 in do_command(THD*, bool) /test/11.0_dbg_san/sql/sql_parse.cc:1407
    #21 0x555ef5e6f8b5 in do_handle_one_connection(CONNECT*, bool) /test/11.0_dbg_san/sql/sql_connect.cc:1416
    #22 0x555ef5e70dd0 in handle_one_connection /test/11.0_dbg_san/sql/sql_connect.cc:1318
    #23 0x14846ee94b42 in start_thread nptl/pthread_create.c:442
    #24 0x14846ef269ff  (/lib/x86_64-linux-gnu/libc.so.6+0x1269ff)
 
230613 18:30:31 [ERROR] mysqld got signal 11 ;

Comment by Roel Van de Paar [ 2023-06-14 ]

MTR Testcase

--source include/have_innodb.inc
--source include/have_partition.inc
--let $SOCKET= `SELECT @@global.socket`
SET GLOBAL wait_timeout=+1;
INSTALL PLUGIN Spider SONAME 'ha_spider.so';
CREATE USER spider@localhost IDENTIFIED BY 'pwd';
GRANT ALL ON test.* TO spider@localhost;
eval CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET "$SOCKET",DATABASE 'test',USER 'spider',PASSWORD 'pwd');
CREATE TABLE t (c INT KEY,c1 BLOB,c2 TEXT) ENGINE=Spider COMMENT='WRAPPER "mysql",srv "srv"';
XA START 'a';
--error ER_WRONG_VALUE_COUNT_ON_ROW
INSERT INTO t VALUES (0,0,0,0);
SELECT SLEEP (4);
--error ER_NET_ERROR_ON_WRITE
INSERT INTO t VALUES (1,2,3),(4,5,6),(7,8,9);
SELECT * FROM information_schema.key_column_usage;

Comment by Yuchen Pei [ 2023-06-14 ]

Not surprisingly, the bad commit is

commit 6abafdbb7cb79aec4dda159f177948e1ea77b7aa (HEAD -> bb-11.0-mdev-29676, upstream/bb-11.0-mdev-29676)
Author: Yuchen Pei <yuchen.pei@mariadb.com>
Date:   Thu Apr 27 09:51:34 2023 +1000
 
    MDEV-29676 Add query to set lock wait timeout when getting sts crd
    
    Set the lock wait timeout to 1 beforehand, and reset it afterwards, to
    avoid lock conflict caused by opening the same table twice in case of
    self-reference.

An mtr testcase is as follows. It needs to be improved as on a good commit, it does not clean up and fails the internal post-test check

--echo #
--echo # MDEV-31463 SIGSEGV in server_mysql_send_query | server_mysql_real_query
--echo #
 
--disable_query_log
--disable_result_log
--source ../../t/test_init.inc
--enable_result_log
--enable_query_log
 
evalp CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET "$MASTER_1_MYSOCK", DATABASE 'test',user 'root');
 
SET GLOBAL wait_timeout=+1;
CREATE TABLE t (c INT KEY,c1 BLOB,c2 TEXT) ENGINE=Spider COMMENT='WRAPPER "mysql",srv "srv"';
XA START 'a';
--error ER_WRONG_VALUE_COUNT_ON_ROW
INSERT INTO t VALUES (0,0,0,0);
SELECT SLEEP (4);
--error ER_NET_ERROR_ON_WRITE
INSERT INTO t VALUES (1,2,3),(4,5,6),(7,8,9);
SELECT * FROM information_schema.key_column_usage;

Comment by Yuchen Pei [ 2023-06-14 ]

This is caused by set_lock_wait_timeout() not checking for errors like conn->server_lost before executing query. Compare this to spider_db_query() which does the check through spider_db_before_query(). So we need to do something similar with setting the lock wait timeout.

Comment by Yuchen Pei [ 2023-06-15 ]

Raising the prio because this is a new failure caused by a recent change

Comment by Yuchen Pei [ 2023-06-15 ]

Here's a PoC patch that fixes the issue and passes all existing tests. Will need to refine it:

https://github.com/MariaDB/server/commit/861f3f98865

Comment by Yuchen Pei [ 2023-06-16 ]

Hi holyfoot, ptal thanks:

https://github.com/MariaDB/server/commit/ba739969856

Comment by Alexey Botchkov [ 2023-06-27 ]

ok to push.
one recommendation there in the patch comment.

Comment by Yuchen Pei [ 2023-06-28 ]

Thanks for the review holyfoot. Applied your comment and pushed to 11.0

Generated at Thu Feb 08 10:24:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.