[MDEV-32558] ERROR 1429 (base) versus crash [SIGSEGV in spider_create_conn] (28856 patch) and ERROR 12719 infinite loop (base) versus ERROR 12518 table is read only (patch) on optimized builds in CLI Created: 2023-10-24  Updated: 2023-10-26  Resolved: 2023-10-26

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: N/A
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Roel Van de Paar Assignee: Yuchen Pei
Resolution: Duplicate Votes: 0
Labels: affects-tests

Issue Links:
Duplicate
duplicates MDEV-32492 SIGSEGV in spider_conn_first_link_idx... Confirmed
Problem/Incident
is caused by MDEV-28856 Spider: Implement more engine-defined... Closed
Relates
relates to MDEV-32492 SIGSEGV in spider_conn_first_link_idx... Confirmed

 Description   

Relates to MDEV-32492. This testcase:

INSTALL PLUGIN Spider SONAME 'ha_spider.so';
CREATE SERVER srv FOREIGN DATA WRAPPER mysql OPTIONS (SOCKET '../socket.sock',DATABASE 'test',user 'Spider',PASSWORD '');
CREATE TABLE t1 (a INT,b VARCHAR(255),PRIMARY KEY(a)) ENGINE=Spider COMMENT="srv 'srv', table 't1', read_only_mode '1'";
INSERT INTO t1 VALUES (1,'aaa'),(2,'bbb'),(3,'ccc'),(4,'ddd');
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (a INT) ENGINE=Spider COMMENT='port "123 456"';
INSERT IGNORE INTO t1 VALUES (42),(42);

Produces different outcomes when executed at the CLI on the latest MDEV-28856 patch versus the base (pre-patch). Base:

preview-11.3-preview 465f9beea1c43a1dad74330aa2dc30927bc224f5 (Optimized)

11.3.0-opt>INSTALL PLUGIN Spider SONAME 'ha_spider.so';
Query OK, 0 rows affected, 1 warning (0.007 sec)
 
11.3.0-opt>CREATE SERVER srv FOREIGN DATA WRAPPER mysql OPTIONS (SOCKET '../socket.sock',DATABASE 'test',user 'Spider',PASSWORD '');
Query OK, 0 rows affected (0.001 sec)
 
11.3.0-opt>CREATE TABLE t1 (a INT,b VARCHAR(255),PRIMARY KEY(a)) ENGINE=Spider COMMENT="srv 'srv', table 't1', read_only_mode '1'";
Query OK, 0 rows affected (0.300 sec)
 
11.3.0-opt>INSERT INTO t1 VALUES (1,'aaa'),(2,'bbb'),(3,'ccc'),(4,'ddd');
 
ERROR 12719 (HY000): An infinite loop is detected when opening table test.t1
11.3.0-opt>SHOW CREATE TABLE t1;
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `a` int(11) NOT NULL,
  `b` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`a`)
) ENGINE=SPIDER DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci COMMENT='srv ''srv'', table ''t1'', read_only_mode ''1''' |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set, 1 warning (0.034 sec)
 
11.3.0-opt>DROP TABLE t1;
Query OK, 0 rows affected (0.005 sec)
 
11.3.0-opt>CREATE TABLE t1 (a INT) ENGINE=Spider COMMENT='port "123 456"';
Query OK, 0 rows affected (0.007 sec)
 
11.3.0-opt>INSERT IGNORE INTO t1 VALUES (42),(42);
ERROR 1429 (HY000): Unable to connect to foreign data source: localhost

Versus patch:

bb-11.3-mdev-28856-and-fixes cc08a83ef4225960dccb46bd68fc549160d21841 (Optimized)

11.3.0-opt>INSTALL PLUGIN Spider SONAME 'ha_spider.so';
Query OK, 0 rows affected, 1 warning (0.006 sec)
 
11.3.0-opt>CREATE SERVER srv FOREIGN DATA WRAPPER mysql OPTIONS (SOCKET '../socket.sock',DATABASE 'test',user 'Spider',PASSWORD '');
Query OK, 0 rows affected (0.002 sec)
 
11.3.0-opt>CREATE TABLE t1 (a INT,b VARCHAR(255),PRIMARY KEY(a)) ENGINE=Spider COMMENT="srv 'srv', table 't1', read_only_mode '1'";
 
Query OK, 0 rows affected (0.291 sec)
 
11.3.0-opt>INSERT INTO t1 VALUES (1,'aaa'),(2,'bbb'),(3,'ccc'),(4,'ddd');
ERROR 12518 (HY000): Table 'test.t1' is read only
11.3.0-opt>SHOW CREATE TABLE t1;
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `a` int(11) NOT NULL,
  `b` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`a`)
) ENGINE=SPIDER DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci COMMENT='srv ''srv'', table ''t1'', read_only_mode ''1''' |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set, 1 warning (0.033 sec)
 
11.3.0-opt>DROP TABLE t1;
Query OK, 0 rows affected (0.005 sec)
 
11.3.0-opt>CREATE TABLE t1 (a INT) ENGINE=Spider COMMENT='port "123 456"';
Query OK, 0 rows affected (0.005 sec)
 
11.3.0-opt>INSERT IGNORE INTO t1 VALUES (42),(42);
ERROR 2026 (HY000): TLS/SSL error: The TLS connection was non-properly terminated.
# i.e. crashed instance

Besides the SIGSEGV in spider_create_conn crash on Optimized builds in the feature tree, please note the following difference:

ERROR 12719 (HY000): An infinite loop is detected when opening table test.t1

Versus

ERROR 12518 (HY000): Table 'test.t1' is read only



 Comments   
Comment by Roel Van de Paar [ 2023-10-24 ]

The stack is:

11.3.0 cc08a83ef4225960dccb46bd68fc549160d21841 (Optimized)

Core was generated by `/test/28856_P2_MD211023-mariadb-11.3.0-linux-x86_64-opt/bin/mariadbd --no-defau'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000150f2415054e in spider_create_conn (share=0x150ec0059098, 
    spider=0x150ec0046e70, link_idx=21874, base_link_idx=1, error_num=0x150f440c5f24)
    at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_conn.cc:405
[Current thread is 1 (Thread 0x150f440c8640 (LWP 3024573))]
(gdb) bt
#0  0x0000150f2415054e in spider_create_conn (share=0x150ec0059098, spider=0x150ec0046e70, link_idx=21874, base_link_idx=1, error_num=0x150f440c5f24) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_conn.cc:405
#1  0x0000150f24151cbe in spider_get_conn (share=share@entry=0x150ec0059098, link_idx=21874, link_idx@entry=1, conn_key=0x150ec00752cf "0mariadb", trx=trx@entry=0x150ec003dc28, spider=spider@entry=0x150ec0046e70, another=another@entry=false, thd_chg=true, error_num=0x150f440c5f24) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_conn.cc:767
#2  0x0000150f24135984 in spider_check_trx_and_get_conn (thd=<optimized out>, spider=spider@entry=0x150ec0046e70) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_trx.cc:3578
#3  0x0000150f24180ab9 in ha_spider::check_access_kind_for_connection (this=0x150ec0046e70, thd=<optimized out>, write_request=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/ha_spider.cc:589
#4  0x0000150f24192523 in ha_spider::dml_init (this=this@entry=0x150ec0046e70) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/ha_spider.cc:12181
#5  0x0000150f24195118 in ha_spider::write_row (this=0x150ec0046e70, buf=0x150ec00288c8 "\375*") at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/ha_spider.cc:7890
#6  0x00005572b5cb9f58 in handler::ha_write_row (this=0x150ec0046e70, buf=0x150ec00288c8 "\375*") at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/handler.cc:7840
#7  0x00005572b59ed172 in write_record (thd=thd@entry=0x150ec0000c68, table=table@entry=0x150ec00284b8, info=info@entry=0x150f440c62b0, sink=sink@entry=0x0) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_insert.cc:2204
#8  0x00005572b59f6e50 in mysql_insert (thd=thd@entry=0x150ec0000c68, table_list=<optimized out>, fields=@0x150ec0005ee8: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x5572b6d6a330 <end_of_list>, last = 0x150ec0005ee8, elements = 0}, <No data fields>}, values_list=@0x150ec0005f30: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x150ec0016f80, last = 0x150ec0017040, elements = 2}, <No data fields>}, update_fields=@0x150ec0005f18: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x5572b6d6a330 <end_of_list>, last = 0x150ec0005f18, elements = 0}, <No data fields>}, update_values=@0x150ec0005f00: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x5572b6d6a330 <end_of_list>, last = 0x150ec0005f00, elements = 0}, <No data fields>}, duplic=DUP_ERROR, ignore=true, result=0x0) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_insert.cc:1154
#9  0x00005572b5a2b88e in mysql_execute_command (thd=0x150ec0000c68, is_called_from_prepared_stmt=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:4416
#10 0x00005572b5a2fb66 in mysql_parse (thd=0x150ec0000c68, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:7734
#11 0x00005572b5a322fd in dispatch_command (command=COM_QUERY, thd=0x150ec0000c68, packet=<optimized out>, packet_length=<optimized out>, blocking=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:1990
#12 0x00005572b5a340a0 in do_command (thd=0x150ec0000c68, blocking=blocking@entry=true) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:1406
#13 0x00005572b5b5c0ff in do_handle_one_connection (connect=<optimized out>, put_in_cache=true) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_connect.cc:1445
#14 0x00005572b5b5c44d in handle_one_connection (arg=arg@entry=0x5572b83a96e8) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_connect.cc:1347
#15 0x00005572b5f064f1 in pfs_spawn_thread (arg=0x5572b835d7e8) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/perfschema/pfs.cc:2201
#16 0x0000150f48094ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#17 0x0000150f48126a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Which can notably also be produced in base, as described in MDEV-32492, in this comment. However, in that case, MTR is being used. The CLI here differs in that a crash can be produced in optimized builds of the patched tree, whereas the same testcase does not crash the pre-patch code.

Comment by Yuchen Pei [ 2023-10-24 ]

I cannot reproduce the difference of failure vs crash.

Release build, both at cc08a83ef4225960dccb46bd68fc549160d21841 and
465f9beea1c43a1dad74330aa2dc30927bc224f5 I get

ERROR 1429 (HY000): Unable to connect to foreign data source:
localhost

with ASAN on, at cc08a83ef4225960dccb46bd68fc549160d21841 I still get

ERROR 1429 (HY000): Unable to connect to foreign data source:
localhost

The difference of 12719 (infinite loop) vs 12518 (read only) can be
explained by the absence of the commit

7527a821d1b MDEV-31524 Fixing spider table param / variable
overriding

at 465f9beea1c43a1dad74330aa2dc30927bc224f5 (base). See the
description testcase in MDEV-31524.

Comment by Roel Van de Paar [ 2023-10-25 ]

There is another testcase:

INSTALL PLUGIN Spider SONAME 'ha_spider.so';
CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET '../socket.sock',DATABASE 'test',user 'Spider',PASSWORD '');
SET GLOBAL spider_same_server_link=ON;
CREATE TABLE t1 (c INT KEY,c1 BLOB,c2 TEXT) ENGINE=Spider COMMENT='WRAPPER "mysql",srv "srv",table "t", read_only_mode "maybe"';
INSERT INTO t1 VALUES (1,"aaa"),(2,"bbb"),(3,"ccc"),(4,"ddd");
SHOW CREATE TABLE t1;
CREATE TABLE t1 (a INT,b VARCHAR,PRIMARY KEY(a)) ENGINE=InnoDB REMOTE_SERVER="srv" REMOTE_TABLE="t1";
DROP TABLE t1;
CREATE TABLE t1 (a INT) ENGINE=Spider COMMENT='port "123 456"';
SELECT SQL_NO_CACHE * FROM t1;

Which crashes both BASE and PATCH, optimized builds, as:

SIGSEGV|spider_conn_first_link_idx|spider_check_trx_and_get_conn|ha_spider::info|make_join_statistics

(One of the stacks seen in MDEV-32492).
It also crashes both BASE and PATCH, debug builds, as:

SIGSEGV|spider_conn_first_link_idx|spider_check_trx_and_get_conn|ha_spider::info|TABLE_LIST::fetch_number_of_rows

However, when transforming the same testcase to the newer syntax:

INSTALL PLUGIN Spider SONAME 'ha_spider.so';
CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET '../socket.sock',DATABASE 'test',user 'Spider',PASSWORD '');
SET GLOBAL spider_same_server_link=ON;
CREATE TABLE t1 (c INT KEY,c1 BLOB,c2 TEXT) ENGINE=Spider COMMENT='WRAPPER "mysql",SRV "srv",TABLE "t"';
INSERT INTO t1 VALUES (1,"aaa"),(2,"bbb"),(3,"ccc"),(4,"ddd");
SHOW CREATE TABLE t1;
CREATE TABLE t1 (a INT,b VARCHAR,PRIMARY KEY(a)) ENGINE=InnoDB REMOTE_SERVER="srv" REMOTE_TABLE="t1" read_only=maybe;
DROP TABLE t1;
CREATE TABLE t1 (a INT) ENGINE=Spider REMOTE_PORT="123 456";
SELECT SQL_NO_CACHE * FROM t1;

We get a different stack on the feature tree (only place where the new SQL is valid), on optimized builds only:

SIGSEGV|spider_create_conn|spider_get_conn|spider_check_trx_and_get_conn|ha_spider::info

Debug feature tree build shows the previously seen:

SIGSEGV|spider_conn_first_link_idx|spider_check_trx_and_get_conn|ha_spider::info|TABLE_LIST::fetch_number_of_rows

The different crash for optimized builds in spider_create_conn aligns with the earlier seen spider_create_conn crash described in this bug.
Full stack for the crash in optimized feature tree builds:

11.3.0 cc08a83ef4225960dccb46bd68fc549160d21841 (Optimized)

Core was generated by `/test/28856_P2_MD211023-mariadb-11.3.0-linux-x86_64-opt/bin/mariadbd --no-defau'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000014800814c64f in spider_create_conn (share=0x147fb4059628, 
    spider=0x147fb4086690, link_idx=22042, base_link_idx=1, error_num=0x1480140ccb34)
    at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_conn.cc:411
[Current thread is 1 (Thread 0x1480140cf640 (LWP 2379681))]
(gdb) bt
#0  0x000014800814c64f in spider_create_conn (share=0x147fb4059628, spider=0x147fb4086690, link_idx=22042, base_link_idx=1, error_num=0x1480140ccb34) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_conn.cc:411
#1  0x000014800814dcbe in spider_get_conn (share=share@entry=0x147fb4059628, link_idx=22042, link_idx@entry=1, conn_key=0x147fb407696f "0mariadb", trx=trx@entry=0x147fb403f608, spider=spider@entry=0x147fb4086690, another=another@entry=false, thd_chg=true, error_num=0x1480140ccb34) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_conn.cc:767
#2  0x0000148008131984 in spider_check_trx_and_get_conn (thd=<optimized out>, spider=spider@entry=0x147fb4086690) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_trx.cc:3578
#3  0x0000148008190603 in ha_spider::info (this=0x147fb4086690, flag=18) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/ha_spider.cc:6653
#4  0x0000561acc5d0a82 in make_join_statistics (join=0x147fb4017a58, tables_list=@0x147fb40164c0: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x147fb4018210, last = 0x147fb4018210, elements = 1}, <No data fields>}, keyuse_array=0x147fb4017db8) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:5499
#5  0x0000561acc5d7852 in JOIN::optimize_inner (this=0x147fb4017a58) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:2624
#6  0x0000561acc5d7eaa in JOIN::optimize (this=this@entry=0x147fb4017a58) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:1944
#7  0x0000561acc5d7fa1 in mysql_select (thd=0x147fb4000c68, tables=0x147fb40168c8, fields=<optimized out>, conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=<optimized out>, result=0x147fb4017a30, unit=0x147fb4004fb8, select_lex=0x147fb40162a8) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:5235
#8  0x0000561acc5d87f4 in handle_select (thd=thd@entry=0x147fb4000c68, lex=lex@entry=0x147fb4004ed8, result=result@entry=0x147fb4017a30, setup_tables_done_option=setup_tables_done_option@entry=0) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:628
#9  0x0000561acc54c685 in execute_sqlcom_select (thd=0x147fb4000c68, all_tables=0x147fb40168c8) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:6012
#10 0x0000561acc55b792 in mysql_execute_command (thd=0x147fb4000c68, is_called_from_prepared_stmt=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:3911
#11 0x0000561acc55cb66 in mysql_parse (thd=0x147fb4000c68, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:7734
#12 0x0000561acc55f2fd in dispatch_command (command=COM_QUERY, thd=0x147fb4000c68, packet=<optimized out>, packet_length=<optimized out>, blocking=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:1990
#13 0x0000561acc5610a0 in do_command (thd=0x147fb4000c68, blocking=blocking@entry=true) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:1406
#14 0x0000561acc6890ff in do_handle_one_connection (connect=<optimized out>, put_in_cache=true) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_connect.cc:1445
#15 0x0000561acc68944d in handle_one_connection (arg=arg@entry=0x561aced07748) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_connect.cc:1347
#16 0x0000561acca334f1 in pfs_spawn_thread (arg=0x561acecbb7a8) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/perfschema/pfs.cc:2201
#17 0x000014802c894ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#18 0x000014802c926a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Comment by Yuchen Pei [ 2023-10-26 ]

> It could be that the missing MDEV-31524 commit in cc08a83ef4225960dccb46bd68fc549160d21841 is the cause for the offset in stacks besides the error message difference.

Maybe, but it doesn't matter, because...

> We get a different stack on the feature tree (only place where the new SQL is valid), on optimized builds only:
> [... 16 lines elided]
> (gdb) bt
> #0 0x000014800814c64f in spider_create_conn (share=0x147fb4059628, spider=0x147fb4086690, link_idx=22042, base_link_idx=1, error_num=0x1480140ccb34)
> #1 0x000014800814dcbe in spider_get_conn (share=share@entry=0x147fb4059628, link_idx=22042, link_idx@entry=1, conn_key=0x147fb407696f "0mariadb", trx=trx@entry=0x147fb403f608, spider=spider@entry=0x147fb4086690, another=another@entry=false, thd_chg=true, error_num=0x1480140ccb34) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_conn.cc:767

Note that the random-looking link_idx=22042 implies that this is the same
problem as MDEV-32492.

Indeed, in spider_check_trx_and_get_conn(), 22042 is passed as
roop_count which is obtained from spider_conn_link_idx_next() which
returns a number share->conn_link_idx, which is a value from
undefined memory, see analysis in MDEV-32492.

      for (
        roop_count = spider_conn_link_idx_next(share->link_statuses,
          spider->conn_link_idx, -1, share->link_count,
          SPIDER_LINK_STATUS_RECOVERY);
        roop_count < (int) share->link_count;
        roop_count = spider_conn_link_idx_next(share->link_statuses,
          spider->conn_link_idx, roop_count, share->link_count,
          SPIDER_LINK_STATUS_RECOVERY)
      ) {
        if (roop_count == spider->search_link_idx)
          search_link_idx_is_checked = TRUE;
        if (!spider->conns[roop_count])
        {
          *spider->conn_keys[roop_count] = first_byte;
          if (
            !(conn =
              spider_get_conn(share, roop_count,
                spider->conn_keys[roop_count], trx,
                spider, FALSE, TRUE,
                &error_num))

It could be the MDEV-31524 commit causing different random numbers in
some memory location, but the problem is an existing bug, namely
MDEV-32492, and more generally, the management of SPIDER_TRX across
statements.

Comment by Roel Van de Paar [ 2023-10-26 ]

ycp Thank you for the analysis and insights.

Comment by Roel Van de Paar [ 2023-10-26 ]

Discussed with ycp. Agreed that this is not a blocker to MDEV-28856 given the same source, the rationale re: MDEV-31524, and that the likelihood of this being seen is similar to what it was before the patch. We also discussed the randomness further and agreed to mark this as a duplicate of MDEV-32492 even though there are well-defined (but still random var) additional stacks here (will make a comment in that ticket to refer to this one for additional information).

Comment by Roel Van de Paar [ 2023-10-26 ]

Another UniqueID seen is

SIGSEGV|spider_get_conn|spider_check_trx_and_get_conn|ha_spider::info|make_join_statistics

Which has been reduced to:

FLUSH PRIVILEGES;
GRANT ALL ON test.* TO Spider@localhost;
INSTALL PLUGIN Spider SONAME 'ha_spider.so';
SET SESSION SPIDER_IGNORE_COMMENTS=1;
DROP TABLE t1;
CREATE TABLE t1 (c INT, PRIMARY KEY(c)) ENGINE=Spider COMMENT='WRAPPER "mysql", SRV "srv",TABLE "t2", PK_NAME "c"';
FLUSH PRIVILEGES;
SELECT * FROM t1;
INSERT INTO t1 VALUES (42);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (a INT) ENGINE=Spider REMOTE_PORT="123 456";
SELECT * FROM t1;

However this does not readily reproduce the issue. The stack is:

11.3.0 cc08a83ef4225960dccb46bd68fc549160d21841 (Optimized)

Core was generated by `/test/28856_P2_MD211023-mariadb-11.3.0-linux-x86_64-opt/bin/mariadbd --no-defau'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000014766c14f820 in spider_get_conn (share=share@entry=0x147588103008, 
    link_idx=21965, link_idx@entry=1, conn_key=0x14758808ed5f "0mariadb", 
    trx=trx@entry=0x147588016b08, spider=spider@entry=0x14758810e0a0, 
    another=another@entry=false, thd_chg=true, error_num=0x1476780b3b34)
    at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_conn.cc:705
[Current thread is 1 (Thread 0x1476780b6640 (LWP 920912))]
(gdb) bt
#0  0x000014766c14f820 in spider_get_conn (share=share@entry=0x147588103008, link_idx=21965, link_idx@entry=1, conn_key=0x14758808ed5f "0mariadb", trx=trx@entry=0x147588016b08, spider=spider@entry=0x14758810e0a0, another=another@entry=false, thd_chg=true, error_num=0x1476780b3b34) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_conn.cc:705
#1  0x000014766c133984 in spider_check_trx_and_get_conn (thd=<optimized out>, spider=spider@entry=0x14758810e0a0) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/spd_trx.cc:3578
#2  0x000014766c192603 in ha_spider::info (this=0x14758810e0a0, flag=18) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/spider/ha_spider.cc:6653
#3  0x000055cd15b3aa82 in make_join_statistics (join=0x147588012338, tables_list=@0x147588010da0: {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x147588012af0, last = 0x147588012af0, elements = 1}, <No data fields>}, keyuse_array=0x147588012698) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:5499
#4  0x000055cd15b41852 in JOIN::optimize_inner (this=0x147588012338) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:2624
#5  0x000055cd15b41eaa in JOIN::optimize (this=this@entry=0x147588012338) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:1944
#6  0x000055cd15b41fa1 in mysql_select (thd=0x147588000c68, tables=0x1475880111a8, fields=<optimized out>, conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=<optimized out>, result=0x147588012310, unit=0x147588004fb8, select_lex=0x147588010b88) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:5235
#7  0x000055cd15b427f4 in handle_select (thd=thd@entry=0x147588000c68, lex=lex@entry=0x147588004ed8, result=result@entry=0x147588012310, setup_tables_done_option=setup_tables_done_option@entry=0) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_select.cc:628
#8  0x000055cd15ab6685 in execute_sqlcom_select (thd=0x147588000c68, all_tables=0x1475880111a8) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:6012
#9  0x000055cd15ac5792 in mysql_execute_command (thd=0x147588000c68, is_called_from_prepared_stmt=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:3911
#10 0x000055cd15ac6b66 in mysql_parse (thd=0x147588000c68, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:7734
#11 0x000055cd15ac92fd in dispatch_command (command=COM_QUERY, thd=0x147588000c68, packet=<optimized out>, packet_length=<optimized out>, blocking=<optimized out>) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:1990
#12 0x000055cd15acb0a0 in do_command (thd=0x147588000c68, blocking=blocking@entry=true) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_parse.cc:1406
#13 0x000055cd15bf30ff in do_handle_one_connection (connect=<optimized out>, put_in_cache=true) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_connect.cc:1445
#14 0x000055cd15bf344d in handle_one_connection (arg=arg@entry=0x55cd194a31b8) at /test/bb-11.3-mdev-28856-and-fixes_opt/sql/sql_connect.cc:1347
#15 0x000055cd15f9d4f1 in pfs_spawn_thread (arg=0x55cd1948d0a8) at /test/bb-11.3-mdev-28856-and-fixes_opt/storage/perfschema/pfs.cc:2201
#16 0x000014769a694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#17 0x000014769a726a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

And again we see a random-looking link_idx (21965). Perhaps also noteworthy is the missing CREATE SERVER, which may connect it with the previously fixed MDEV-32486. Update after discussion with ycp: not connected with MDEV-32486, as even in that case it failed before noticing the missing CREATE SERVER.
Additionally, here, due to SPIDER_IGNORE_COMMENTS=1, the server definition is not evaluated.
