[MDEV-26158] SIGSEGV in spider_free_mem from ha_spider::open on INSERT Created: 2021-07-16  Updated: 2021-10-19  Resolved: 2021-10-19

Status: Closed
Project: MariaDB Server
Component/s: Partitioning, Storage Engine - Spider
Affects Version/s: 10.5, 10.6, 10.7
Fix Version/s: 10.5.13, 10.6.5

Type: Bug Priority: Critical
Reporter: Nayuta Yanagisawa (Inactive) Assignee: Nayuta Yanagisawa (Inactive)
Resolution: Fixed Votes: 2
Labels: affects-tests, memory, not-10.2, not-10.3, not-10.4, regression

Issue Links:
Problem/Incident
is caused by MDEV-19002 Partition performance optimization Stalled
Relates

 Description   

During the investigation for MDEV-24769, I hit another crash bug possibly related to the Spider SE.

How to reproduce:

INSTALL PLUGIN spider SONAME 'ha_spider.so';
SET @@session.spider_same_server_link = ON; 
 
use test;
 
GRANT ALL PRIVILEGES ON *.* TO 'spinne'@'127.0.0.1'  IDENTIFIED BY 'password';
 
CREATE SERVER IF NOT EXISTS data1
FOREIGN DATA WRAPPER mysql
OPTIONS(
HOST '127.0.0.1',
DATABASE 'test',
USER 'spinne',
PORT 16000,
PASSWORD 'password'
);
 
CREATE TABLE IF NOT EXISTS `spider_tab` (
  `col1`int,
  `col2`int,
  PRIMARY KEY (`col1`)
) ENGINE=SPIDER DEFAULT CHARSET=utf8 COMMENT='table "spider_data_tab"'
PARTITION BY LIST COLUMNS(`col1`)
(PARTITION `pt1` DEFAULT COMMENT = 'srv "data1"' ENGINE = SPIDER);
 
CREATE TABLE IF NOT EXISTS `spider_data_tab` (
  `col1` int,
  `col2` int,
  PRIMARY KEY (`col1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
 
insert into `spider_data_tab` VALUES (1,999), (2,999);
 
insert into spider_tab select col1+10, col2 from spider_tab;

Error log:

Version: '10.5.10-MariaDB-debug-log'  socket: '/home/vagrant/repo/mariadb-server/bld/mysql-test/var/tmp/mysqld.1.1.sock'  port: 16020  Source distribution
210716  6:47:18 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.5.10-MariaDB-debug-log
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=22
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63751 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f8830000db8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f885ee67d98 thread_stack 0x49000
mysys/stacktrace.c:212(my_print_stacktrace)[0x55b006c91891]
sql/signal_handler.cc:224(handle_fatal_signal)[0x55b0063668f2]
??:0(__restore_rt)[0x7f88c03c43c0]
spider/spd_malloc.cc:185(spider_free_mem(st_spider_transaction*, void*, unsigned long))[0x7f88bf94db66]
spider/ha_spider.cc:669(ha_spider::open(char const*, int, unsigned int))[0x7f88bf95df62]
sql/handler.cc:2983(handler::ha_open(TABLE*, char const*, int, unsigned int, st_mem_root*, List<String>*))[0x55b0063708a6]
sql/ha_partition.cc:8641(ha_partition::open_read_partitions(char*, unsigned long))[0x55b0066bb33c]
sql/ha_partition.cc:3714(ha_partition::open(char const*, int, unsigned int))[0x55b0066ad1d2]
sql/handler.cc:2983(handler::ha_open(TABLE*, char const*, int, unsigned int, st_mem_root*, List<String>*))[0x55b0063708a6]
sql/table.cc:4218(open_table_from_share(THD*, TABLE_SHARE*, st_mysql_const_lex_string const*, unsigned int, unsigned int, unsigned int, TABLE*, bool, List<String>*))[0x55b0061476c3]
sql/sql_base.cc:2001(open_table(THD*, TABLE_LIST*, Open_table_context*))[0x55b005f4f5b9]
sql/sql_base.cc:3801(open_and_process_table(THD*, TABLE_LIST*, unsigned int*, unsigned int, Prelocking_strategy*, bool, Open_table_context*))[0x55b005f534c4]
sql/sql_base.cc:4275(open_tables(THD*, DDL_options_st const&, TABLE_LIST**, unsigned int*, unsigned int, Prelocking_strategy*))[0x55b005f5478c]
sql/sql_base.cc:5211(open_and_lock_tables(THD*, DDL_options_st const&, TABLE_LIST*, bool, unsigned int, Prelocking_strategy*))[0x55b005f56731]
sql/sql_base.h:507(open_and_lock_tables(THD*, TABLE_LIST*, bool, unsigned int))[0x55b005f0c049]
sql/sql_parse.cc:4712(mysql_execute_command(THD*))[0x55b0060003e8]
sql/sql_parse.cc:8099(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55b00600b78a]
sql/sql_parse.cc:1894(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55b005ff7701]
sql/sql_parse.cc:1370(do_command(THD*))[0x55b005ff5ef3]
sql/sql_connect.cc:1410(do_handle_one_connection(CONNECT*, bool))[0x55b0061a5aff]
sql/sql_connect.cc:1314(handle_one_connection)[0x55b0061a5862]
perfschema/pfs.cc:2203(pfs_spawn_thread)[0x55b0066c93bb]
??:0(start_thread)[0x7f88c03b8609]
??:0(clone)[0x7f88bff8b293]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f8830013f20): insert into spider_tab select col1+10, col2 from spider_tab
 
Connection ID (thread ID): 24
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off



 Comments   
Comment by Nayuta Yanagisawa (Inactive) [ 2021-07-22 ]

Now, I cannot reproduce the bug even in my development environment. The crash might be due to my environment. Thus, I change the priority to 'major'.

Comment by Nayuta Yanagisawa (Inactive) [ 2021-08-13 ]

This seems to be due to my environment. So, I close the issue.

Comment by Roel Van de Paar [ 2021-09-04 ]

This crash is easily reproducible, thank you for a clear testcase.

Not sporadic it seems, at least on CS 10.7. I have also reduced the testcase.

INSTALL PLUGIN spider SONAME 'ha_spider.so';
CREATE SERVER d FOREIGN DATA WRAPPER mysql OPTIONS(HOST'',DATABASE'',USER'',PORT 16000,PASSWORD'');
CREATE TABLE t(c INT) ENGINE=SPIDER COMMENT='table "st"' PARTITION BY LIST COLUMNS(c)(PARTITION pt DEFAULT COMMENT='srv "d"' ENGINE=SPIDER);
INSERT INTO t SELECT * FROM t;

Leads to:

10.7.0 1bc82aaf0a7746c0921a94034aff2d51f0d75cd0 (Debug)

Core was generated by `/test/MD040921-mariadb-10.7.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  spider_free_mem (trx=0x14ca200545e8, ptr=ptr@entry=0x0, 
    my_flags=my_flags@entry=0)
    at /test/10.7_dbg/storage/spider/spd_malloc.cc:185
185	  size = *((uint *) tmp_ptr);
[Current thread is 1 (Thread 0x14ca64154700 (LWP 38827))]
(gdb) bt
#0  spider_free_mem (trx=0x14ca200545e8, ptr=ptr@entry=0x0, my_flags=my_flags@entry=0) at /test/10.7_dbg/storage/spider/spd_malloc.cc:185
#1  0x000014ca50140919 in ha_spider::open (this=0x14ca200cdd90, name=<optimized out>, mode=<optimized out>, test_if_locked=<optimized out>) at /test/10.7_dbg/storage/spider/ha_spider.cc:668
#2  0x000055f7810357eb in handler::ha_open (this=0x14ca200cdd90, table_arg=0x14ca200ccbf8, name=name@entry=0x14ca64152160 "./test/t#P#pt", mode=2, test_if_locked=1042, mem_root=mem_root@entry=0x0, partitions_to_open=0x0) at /test/10.7_dbg/sql/handler.cc:3313
#3  0x000055f781304f72 in ha_partition::open_read_partitions (this=this@entry=0x14ca200cd4d0, name_buff=name_buff@entry=0x14ca64152160 "./test/t#P#pt", name_buff_size=name_buff_size@entry=513) at /test/10.7_dbg/sql/ha_partition.cc:8726
#4  0x000055f7813058ea in ha_partition::open (this=0x14ca200cd4d0, name=0x14ca2002e9b8 "./test/t", mode=2, test_if_locked=18) at /test/10.7_dbg/sql/ha_partition.cc:3709
#5  0x000055f7810357eb in handler::ha_open (this=0x14ca200cd4d0, table_arg=table_arg@entry=0x14ca200ccbf8, name=0x14ca2002e9b8 "./test/t", mode=mode@entry=2, test_if_locked=test_if_locked@entry=18, mem_root=mem_root@entry=0x0, partitions_to_open=0x0) at /test/10.7_dbg/sql/handler.cc:3313
#6  0x000055f780e63bbc in open_table_from_share (thd=thd@entry=0x14ca20000db8, share=share@entry=0x14ca2002e420, alias=alias@entry=0x14ca20014a68, db_stat=db_stat@entry=33, prgflag=prgflag@entry=8, ha_open_flags=18, outparam=0x14ca200ccbf8, is_create_table=false, partitions_to_open=0x0) at /test/10.7_dbg/sql/table.cc:4327
#7  0x000055f780c9c60a in open_table (thd=thd@entry=0x14ca20000db8, table_list=table_list@entry=0x14ca20014a20, ot_ctx=ot_ctx@entry=0x14ca64152d30) at /test/10.7_dbg/sql/sql_base.cc:2003
#8  0x000055f780ca0d51 in open_and_process_table (ot_ctx=0x14ca64152d30, has_prelocking_list=false, prelocking_strategy=0x14ca64152fc0, flags=0, counter=0x14ca64152dcc, tables=0x14ca20014a20, thd=0x14ca20000db8) at /test/10.7_dbg/sql/sql_base.cc:3807
#9  open_tables (thd=thd@entry=0x14ca20000db8, options=@0x14ca20006610: {m_options = DDL_options_st::OPT_NONE}, start=start@entry=0x14ca64152db8, counter=counter@entry=0x14ca64152dcc, flags=flags@entry=0, prelocking_strategy=prelocking_strategy@entry=0x14ca64152fc0) at /test/10.7_dbg/sql/sql_base.cc:4290
#10 0x000055f780ca1e4e in open_and_lock_tables (thd=thd@entry=0x14ca20000db8, options=@0x14ca20006610: {m_options = DDL_options_st::OPT_NONE}, tables=<optimized out>, derived=derived@entry=true, flags=flags@entry=0, prelocking_strategy=prelocking_strategy@entry=0x14ca64152fc0) at /test/10.7_dbg/sql/sql_base.cc:5262
#11 0x000055f780d385ff in open_and_lock_tables (flags=0, derived=true, tables=<optimized out>, thd=0x14ca20000db8) at /test/10.7_dbg/sql/sql_base.h:509
#12 mysql_execute_command (thd=thd@entry=0x14ca20000db8, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/10.7_dbg/sql/sql_parse.cc:4649
#13 0x000055f780d22ccb in mysql_parse (thd=thd@entry=0x14ca20000db8, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x14ca64153400) at /test/10.7_dbg/sql/sql_parse.cc:8036
#14 0x000055f780d318d0 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x14ca20000db8, packet=packet@entry=0x14ca2000b739 "INSERT INTO t SELECT * FROM t", packet_length=packet_length@entry=29, blocking=blocking@entry=true) at /test/10.7_dbg/sql/sql_class.h:1358
#15 0x000055f780d34cd6 in do_command (thd=0x14ca20000db8, blocking=blocking@entry=true) at /test/10.7_dbg/sql/sql_parse.cc:1404
#16 0x000055f780eab0c8 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55f7848e94c8, put_in_cache=put_in_cache@entry=true) at /test/10.7_dbg/sql/sql_connect.cc:1418
#17 0x000055f780eab6cd in handle_one_connection (arg=arg@entry=0x55f7848e94c8) at /test/10.7_dbg/sql/sql_connect.cc:1312
#18 0x000055f781314ade in pfs_spawn_thread (arg=0x55f7847e82d8) at /test/10.7_dbg/storage/perfschema/pfs.cc:2201
#19 0x000014ca65cc8609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#20 0x000014ca658b6293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.5.13 (dbg), 10.5.13 (opt), 10.6.5 (dbg), 10.6.5 (opt), 10.7.0 (dbg), 10.7.0 (opt)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.2.41 (dbg), 10.2.41 (opt), 10.3.32 (dbg), 10.3.32 (opt), 10.4.22 (dbg), 10.4.22 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.51 (dbg), 5.6.51 (opt), 5.7.35 (dbg), 5.7.35 (opt), 8.0.26 (dbg), 8.0.26 (opt)

Comment by Roel Van de Paar [ 2021-09-04 ]

This bug terminates most of my initial Spider test runs with similar SQL.

Comment by Roel Van de Paar [ 2021-09-06 ]

Additional testcase to test/add to MTR

INSTALL PLUGIN spider SONAME 'ha_spider.so';
SET SESSION spider_same_server_link=ON;
CREATE SERVER d FOREIGN DATA WRAPPER mysql OPTIONS (HOST '127.0.0.1', DATABASE 'test', USER 'root', PORT 21562, PASSWORD '');
CREATE TABLE test_table (c1 int(11) NOT NULL, c2 TEXT DEFAULT NULL, PRIMARY KEY (c1)) ENGINE=SPIDER COMMENT='wrapper "mysql", table "test_table"'  PARTITION BY LIST (c1) (PARTITION pt0 VALUES IN (0) ENGINE=SPIDER COMMENT='srv "d"', PARTITION pt1 VALUES IN (1) ENGINE=SPIDER COMMENT='srv "d"');
INSERT INTO test_table VALUES (100,'a');

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-06 ]

I confirm the bug is reproducible on 10.5 HEAD.

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-21 ]

MTR test case:

--disable_query_log
--disable_result_log
--source ../../t/test_init.inc
--enable_result_log
--enable_query_log
 
CREATE DATABASE auto_test_local;
USE auto_test_local;
 
CREATE SERVER d FOREIGN DATA WRAPPER mysql OPTIONS(HOST'',DATABASE'',USER'',PORT 16000,PASSWORD'');
CREATE TABLE t(c INT) ENGINE=SPIDER COMMENT='table "st"' PARTITION BY LIST COLUMNS(c)(PARTITION pt DEFAULT COMMENT='srv "d"' ENGINE=SPIDER);
--error 1429
INSERT INTO t SELECT * FROM t;
 
DROP DATABASE auto_test_local;
 
--disable_query_log
--disable_result_log
--source ../../t/test_deinit.inc
--enable_result_log
--enable_query_log

Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-21 ]

I tried git-bisect and got the following result:

e954d9de886aebc68c39240304fe97ae88276dbb is the first bad commit
commit e954d9de886aebc68c39240304fe97ae88276dbb
Author: Kentoku SHIBA <kentokushiba@gmail.com>
Date:   Tue Mar 3 02:50:40 2020 +0900
 
    MDEV-19002 Spider performance optimization with partition
 
    Change the following function for batch call instead of each partition
    - store_lock
    - external_lock
    - start_stmt
    - extra
    - cond_push
    - info_push
    - top_table
 
 sql/ha_partition.h                                 |    4 +
 storage/spider/ha_spider.cc                        | 2975 +++++++++++---------
 storage/spider/ha_spider.h                         |   69 +-
 .../spider/bugfix/include/insert_select_deinit.inc |   16 +
 .../spider/bugfix/include/insert_select_init.inc   |   43 +
 .../mysql-test/spider/bugfix/t/insert_select.cnf   |    3 +
 .../mysql-test/spider/bugfix/t/insert_select.test  |   99 +
 storage/spider/spd_conn.cc                         |   43 +-
 storage/spider/spd_copy_tables.cc                  |   16 +-
 storage/spider/spd_db_conn.cc                      |  358 +--
 storage/spider/spd_db_include.h                    |    1 -
 storage/spider/spd_db_mysql.cc                     |  216 +-
 storage/spider/spd_db_oracle.cc                    |  146 +-
 storage/spider/spd_group_by_handler.cc             |   20 +-
 storage/spider/spd_include.h                       |  127 +-
 storage/spider/spd_table.cc                        |  456 +--
 storage/spider/spd_table.h                         |   26 +-
 storage/spider/spd_trx.cc                          |  132 +-
 storage/spider/spd_trx.h                           |    4 +
 19 files changed, 2808 insertions(\+), 1946 deletions(\-)
 create mode 100644 storage/spider/mysql-test/spider/bugfix/include/insert_select_deinit.inc
 create mode 100644 storage/spider/mysql-test/spider/bugfix/include/insert_select_init.inc
 create mode 100644 storage/spider/mysql-test/spider/bugfix/t/insert_select.cnf
 create mode 100644 storage/spider/mysql-test/spider/bugfix/t/insert_select.test
bisect run success

Comment by Nayuta Yanagisawa (Inactive) [ 2021-10-05 ]

A refined MTR test case:

--echo #
--echo # MDEV-26158 SIGSEGV in spider_free_mem from ha_spider::open on INSERT
--echo #
 
--disable_query_log
--disable_result_log
--source ../../t/test_init.inc
--enable_result_log
--enable_query_log
 
--connection master_1
CREATE DATABASE auto_test_local;
USE auto_test_local;
 
eval CREATE TABLE t (
    c INT
) $MASTER_1_ENGINE $MASTER_1_CHARSET COMMENT='table "tbl_a"'
PARTITION BY LIST COLUMNS(`c`) (
    PARTITION `pt1` DEFAULT COMMENT = 'srv "s_2_1"'
);
 
--error 1429
INSERT INTO t SELECT * FROM t;
 
DROP DATABASE auto_test_local;
 
--disable_query_log
--disable_result_log
--source ../../t/test_deinit.inc
--enable_result_log
--enable_query_log

Comment by Nayuta Yanagisawa (Inactive) [ 2021-10-05 ]

In the above MTR test case, spider_get_share() fails due to non-existing remote database. Then, Spider jums to the error_get_share label, calls the spider_free() macro, and then SIGSEGV.

Comment by Nayuta Yanagisawa (Inactive) [ 2021-10-05 ]

In the below code fragment, this == pt_handler_share_handlers[0]. So, it nullifies wide_handler and then tries to free it.

int ha_spider::open(
...
error_get_share:
  if (wide_handler_alloc)
  {
#ifdef WITH_PARTITION_STORAGE_ENGINE
    if (pt_handler_share_handlers)
    {
      pt_handler_share_handlers[0]->wide_handler = NULL;
    }
#endif
    spider_free(spider_current_trx, wide_handler, MYF(0));
    spider->wide_handler = NULL;
    owner->wide_handler = NULL;
    owner->wide_handler_owner = FALSE;
  }
  wide_handler = NULL;
...
}

Comment by Nayuta Yanagisawa (Inactive) [ 2021-10-05 ]

holyfoot Please review: https://github.com/MariaDB/server/commit/93a8c3304877d6355faad033301afaee0a3477dd

Comment by Alexey Botchkov [ 2021-10-19 ]

Patch is ok to push.

Generated at Thu Feb 08 09:43:11 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.