Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-36357

SEGV on the test spider/bg.basic_sql when run with --view-protocol

Details

    Description

      CURRENT_TEST: spider/bg.basic_sql
      mysqltest: At line 1042: query 'SELECT a.a, a.b, date_format(a.c, '%Y-%m-%d %H:%i:%s') FROM tb_l a WHERE
      EXISTS (SELECT * FROM ta_l b WHERE b.b = a.b) ORDER BY a.a' failed: 2013: Lost connection to MySQL server during query
       
      The result from queries just before the failure was:
      < snip >
      connection master_1;
      SELECT DISTINCT a, b, date_format(c, '%Y-%m-%d %H:%i:%s') FROM ta_l
      ORDER BY a;
      a	b	date_format(c, '%Y-%m-%d %H:%i:%s')
      1	f	2008-07-01 10:21:39
      2	g	2000-02-01 00:00:00
      3	j	2007-05-04 20:03:11
      4	i	2003-10-30 05:01:03
      5	h	2001-10-31 23:59:59
       
      select count
      connection master_1;
      SELECT count(*) FROM ta_l ORDER BY a;
      count(*)
      5
       
      select table join not use index
      connection master_1;
      SELECT a.a, a.b, date_format(a.c, '%Y-%m-%d %H:%i:%s') FROM tb_l a WHERE
      EXISTS (SELECT * FROM ta_l b WHERE b.b = a.b) ORDER BY a.a;
       
      More results from queries before failure can be found in /home/ycp/source/mariadb-server/10.5/build/mysql-test/var/log/basic_sql.log
       
       
      Server [mysqld.1.1 - pid: 1757560, winpid: 1757560, exit: 256] failed during test run
      Server log from this test:
      ----------SERVER LOG START-----------
      $ env _RR_TRACE_DIR=/home/ycp/source/mariadb-server/10.5/build/mysql-test/var/log/mysqld.1.1.rr rr record /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd --defaults-group-suffix=.1.1 --defaults-file=/home/ycp/source/mariadb-server/10.5/build/mysql-test/var/my.cnf --log-output=file --loose-innodb --loose-skip-performance-schema --core-file --loose-debug-sync-timeout=300 --loose-debug-gdb --loose-skip-stack-trace
      2025-03-20 16:15:14 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 32183)
      2025-03-20 16:15:14 0 [Warning] Changed limits: max_open_files: 1024  max_connections: 151 (was 151)  table_cache: 421 (was 2000)
      2025-03-20 16:15:14 0 [Note] Starting MariaDB 10.5.29-MariaDB-debug-log source revision 3a4c0295ae0973a068eaac5f1edef6c69bbf5e48 server_uid BaUiTqnaS0SDCOkwckWs2sBjduA= as process 1757569
      2025-03-20 16:15:14 0 [Note] Plugin 'SEQUENCE' is disabled.
      2025-03-20 16:15:14 0 [Note] InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
      2025-03-20 16:15:14 0 [Note] InnoDB: Uses event mutexes
      2025-03-20 16:15:14 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
      2025-03-20 16:15:14 0 [Note] InnoDB: Number of pools: 1
      2025-03-20 16:15:14 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
      2025-03-20 16:15:14 0 [Warning] InnoDB: Linux Native AIO disabled.
      2025-03-20 16:15:14 0 [Note] InnoDB: Initializing buffer pool, total size = 8388608, chunk size = 8388608
      2025-03-20 16:15:14 0 [Note] InnoDB: Completed initialization of buffer pool
      2025-03-20 16:15:14 0 [Note] InnoDB: 128 rollback segments are active.
      2025-03-20 16:15:14 0 [Note] InnoDB: Creating shared tablespace for temporary tables
      2025-03-20 16:15:14 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
      2025-03-20 16:15:14 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
      2025-03-20 16:15:14 0 [Note] InnoDB: 10.5.29 started; log sequence number 47807; transaction id 27
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_DATAFILES' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_TABLESTATS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_BUFFER_PAGE' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_TRX' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_CMP_PER_INDEX' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_METRICS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_LOCK_WAITS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_CMP' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'THREAD_POOL_WAITS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_CMP_RESET' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'THREAD_POOL_QUEUES' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_FIELDS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_BUFFER_PAGE_LRU' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'FEEDBACK' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_LOCKS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_FT_INDEX_TABLE' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_CMPMEM' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'THREAD_POOL_GROUPS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_CMP_PER_INDEX_RESET' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_FOREIGN_COLS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_FT_INDEX_CACHE' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_BUFFER_POOL_STATS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_FT_BEING_DELETED' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_FOREIGN' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_CMPMEM_RESET' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_FT_DEFAULT_STOPWORD' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_TABLES' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_COLUMNS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_FT_CONFIG' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_TABLESPACES' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_VIRTUAL' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_INDEXES' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_SYS_SEMAPHORE_WAITS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_MUTEXES' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'user_variables' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_TABLESPACES_ENCRYPTION' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'INNODB_FT_DELETED' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'THREAD_POOL_STATS' is disabled.
      2025-03-20 16:15:14 0 [Note] Plugin 'unix_socket' is disabled.
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown option '--loose-pam-debug'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown option '--loose-aria'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_port=20000'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_port_wr=20001'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_threads=2'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_threads_wr=1'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_support_merge_table=0'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_direct_update_mode=2'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_unlimited_boundary=65536'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_bulk_insert=0'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_bulk_insert_timeout=0'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_general_log=1'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_timeout=30'
      2025-03-20 16:15:14 0 [Warning] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: unknown variable 'loose_handlersocket_close_table_interval=2'
      2025-03-20 16:15:14 0 [Note] InnoDB: Loading buffer pool(s) from /dev/shm/var_auto_RcBs/mysqld.1.1/data/ib_buffer_pool
      2025-03-20 16:15:14 0 [Note] Server socket created on IP: '127.0.0.1'.
      2025-03-20 16:15:14 0 [Note] InnoDB: Buffer pool(s) load completed at 250320 16:15:14
      2025-03-20 16:15:14 0 [Note] Reading of all Master_info entries succeeded
      2025-03-20 16:15:14 0 [Note] Added new Master_info '' to hash table
      2025-03-20 16:15:14 0 [Note] /home/ycp/source/mariadb-server/10.5/build/sql/mariadbd: ready for connections.
      Version: '10.5.29-MariaDB-debug-log'  socket: '/home/ycp/source/mariadb-server/10.5/build/mysql-test/var/tmp/mysqld.1.1.sock'  port: 19000  Source distribution
      AddressSanitizer:DEADLYSIGNAL
      =================================================================
      ==1757569==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f4a039368d4 bp 0x7f49fe99ba80 sp 0x7f49fe99b430 T20)
      ==1757569==The signal is caused by a READ memory access.
      ==1757569==Hint: address points to the zero page.
          #0 0x7f4a039368d4 in spider_db_store_result(ha_spider*, int, TABLE*) /home/ycp/source/mariadb-server/10.5/src/storage/spider/spd_db_conn.cc:3617
          #1 0x7f4a03989324 in spider_bg_conn_action(void*) /home/ycp/source/mariadb-server/10.5/src/storage/spider/spd_conn.cc:2629
          #2 0x7f4a2065ae55 in asan_thread_start ../../../../src/libsanitizer/asan/asan_interceptors.cpp:234
          #3 0x7f4a1f8a8133 in start_thread nptl/pthread_create.c:442
          #4 0x7f4a1f927a3f in clone ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
       
      AddressSanitizer can not provide additional info.
      SUMMARY: AddressSanitizer: SEGV /home/ycp/source/mariadb-server/10.5/src/storage/spider/spd_db_conn.cc:3617 in spider_db_store_result(ha_spider*, int, TABLE*)
      Thread T20 created by T16 here:
          #0 0x7f4a206ebae1 in pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:245
          #1 0x55e214d872a7 in spawn_thread_noop /home/ycp/source/mariadb-server/10.5/src/mysys/psi_noop.c:200
          #2 0x7f4a03974844 in inline_mysql_thread_create /home/ycp/source/mariadb-server/10.5/src/include/mysql/psi/mysql_thread.h:1323
          #3 0x7f4a03982f75 in spider_create_conn_thread(st_spider_conn*) /home/ycp/source/mariadb-server/10.5/src/storage/spider/spd_conn.cc:1839
          #4 0x7f4a039826b5 in spider_set_conn_bg_param(ha_spider*) /home/ycp/source/mariadb-server/10.5/src/storage/spider/spd_conn.cc:1768
          #5 0x7f4a03a65369 in ha_spider::rnd_next_internal(unsigned char*) /home/ycp/source/mariadb-server/10.5/src/storage/spider/ha_spider.cc:5028
          #6 0x7f4a03a669ec in ha_spider::rnd_next(unsigned char*) /home/ycp/source/mariadb-server/10.5/src/storage/spider/ha_spider.cc:5215
          #7 0x55e21388255f in handler::ha_rnd_next(unsigned char*) /home/ycp/source/mariadb-server/10.5/src/sql/handler.cc:3191
          #8 0x55e21385989c in find_all_keys /home/ycp/source/mariadb-server/10.5/src/sql/filesort.cc:912
          #9 0x55e21385554a in filesort(THD*, TABLE*, Filesort*, Filesort_tracker*, JOIN*, unsigned long long) /home/ycp/source/mariadb-server/10.5/src/sql/filesort.cc:351
          #10 0x55e2131f488a in create_sort_index(THD*, JOIN*, st_join_table*, Filesort*) /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:24709
          #11 0x55e2131e5301 in st_join_table::sort_table() /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:22314
          #12 0x55e2131e481e in join_init_read_record(st_join_table*) /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:22253
          #13 0x55e2131dea61 in sub_select(JOIN*, st_join_table*, bool) /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:21307
          #14 0x55e2131dd2ed in do_select /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:20827
          #15 0x55e21316de95 in JOIN::exec_inner() /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:4664
          #16 0x55e21316b867 in JOIN::exec() /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:4444
          #17 0x55e21316f12c in mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:4921
          #18 0x55e213144511 in handle_select(THD*, LEX*, select_result*, unsigned long) /home/ycp/source/mariadb-server/10.5/src/sql/sql_select.cc:449
          #19 0x55e2130beb7b in execute_sqlcom_select /home/ycp/source/mariadb-server/10.5/src/sql/sql_parse.cc:6452
          #20 0x55e2130ad43a in mysql_execute_command(THD*) /home/ycp/source/mariadb-server/10.5/src/sql/sql_parse.cc:4043
          #21 0x55e2130c872f in mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) /home/ycp/source/mariadb-server/10.5/src/sql/sql_parse.cc:8252
          #22 0x55e21309fa4b in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) /home/ycp/source/mariadb-server/10.5/src/sql/sql_parse.cc:1891
          #23 0x55e21309cc26 in do_command(THD*) /home/ycp/source/mariadb-server/10.5/src/sql/sql_parse.cc:1375
          #24 0x55e21349b396 in do_handle_one_connection(CONNECT*, bool) /home/ycp/source/mariadb-server/10.5/src/sql/sql_connect.cc:1386
          #25 0x55e21349af2d in handle_one_connection /home/ycp/source/mariadb-server/10.5/src/sql/sql_connect.cc:1298
          #26 0x55e213fec21b in pfs_spawn_thread /home/ycp/source/mariadb-server/10.5/src/storage/perfschema/pfs.cc:2201
          #27 0x7f4a2065ae55 in asan_thread_start ../../../../src/libsanitizer/asan/asan_interceptors.cpp:234
       
      Thread T16 created by T0 here:
          #0 0x7f4a206ebae1 in pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:245
          #1 0x55e213fe7f50 in my_thread_create /home/ycp/source/mariadb-server/10.5/src/storage/perfschema/my_thread.h:52
          #2 0x55e213fec60a in pfs_spawn_thread_v1 /home/ycp/source/mariadb-server/10.5/src/storage/perfschema/pfs.cc:2252
          #3 0x55e212dccd32 in inline_mysql_thread_create /home/ycp/source/mariadb-server/10.5/src/include/mysql/psi/mysql_thread.h:1323
          #4 0x55e212de0ba8 in create_thread_to_handle_connection(CONNECT*) /home/ycp/source/mariadb-server/10.5/src/sql/mysqld.cc:6072
          #5 0x55e212de0eed in create_new_thread(CONNECT*) /home/ycp/source/mariadb-server/10.5/src/sql/mysqld.cc:6131
          #6 0x55e212de10cb in handle_accepted_socket(st_mysql_socket, st_mysql_socket) /home/ycp/source/mariadb-server/10.5/src/sql/mysqld.cc:6196
          #7 0x55e212de2056 in handle_connections_sockets() /home/ycp/source/mariadb-server/10.5/src/sql/mysqld.cc:6327
          #8 0x55e212ddf20f in run_main_loop /home/ycp/source/mariadb-server/10.5/src/sql/mysqld.cc:5313
          #9 0x55e212de0774 in mysqld_main(int, char**) /home/ycp/source/mariadb-server/10.5/src/sql/mysqld.cc:5724
          #10 0x55e212dcb8f8 in main /home/ycp/source/mariadb-server/10.5/src/sql/main.cc:25
          #11 0x7f4a1f846249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
       
      ==1757569==ABORTING
      ----------SERVER LOG END-------------
      

      Attachments

        Activity

          ycp Yuchen Pei added a comment -

          Comparing the query logs at the data node, at the crash query we have the following diff, which shows 4 extra queries with view protocol

           Query	select `b` from `auto_test_remote`.`ta_r` limit 2
           Query	select `b` from `auto_test_remote`.`ta_r` limit 2,2
          +Query	select `b` from `auto_test_remote`.`ta_r` limit 2
          +Query	select `b` from `auto_test_remote`.`ta_r` limit 2,2
          +Query	select `b` from `auto_test_remote`.`ta_r` limit 2
          +Query	select `b` from `auto_test_remote`.`ta_r` limit 2,2
           Query	select `b` from `auto_test_remote`.`ta_r` limit 4,2

          The function spider_bg_conn_search wakes up the spider bg thread which executes these queries at the data node. There are three callsites in the function where this waking happens, with the code block

                  pthread_mutex_lock(&conn->bg_conn_sync_mutex);
                  pthread_cond_signal(&conn->bg_conn_cond);
                  pthread_mutex_unlock(&conn->bg_conn_mutex);
                  pthread_cond_wait(&conn->bg_conn_sync_cond, &conn->bg_conn_sync_mutex);

          So one could place a break point at each of these three sites. Let's call them callsite A, B and C, which are at line 2112, 2229 and 2376 at my commit. Further, one could display the query in the debugger:

          disp *spider->dbton_handler[0]->sql->str.Ptr@spider->dbton_handler[0]->sql->str.str_length

          Without view-protocol:

          1. Reaches callsite A, with limit 2 in the query
          2. callsite B with limit 2,2
          3. callsite C with limit 4,2, no segv

          With view-protocol:

          1. Reaches callsite A, with limit 2 in query
          2. callsite B with limit 2,2
          3. callsite A with limit 2
          4. callsite B with limit 2,2
          5. callsite A with limit 2
          6. callsite B with limit 2,2
          7. callsite C with limit 4,2, then segv

          ycp Yuchen Pei added a comment - Comparing the query logs at the data node, at the crash query we have the following diff, which shows 4 extra queries with view protocol Query select `b` from `auto_test_remote`.`ta_r` limit 2 Query select `b` from `auto_test_remote`.`ta_r` limit 2,2 +Query select `b` from `auto_test_remote`.`ta_r` limit 2 +Query select `b` from `auto_test_remote`.`ta_r` limit 2,2 +Query select `b` from `auto_test_remote`.`ta_r` limit 2 +Query select `b` from `auto_test_remote`.`ta_r` limit 2,2 Query select `b` from `auto_test_remote`.`ta_r` limit 4,2 The function spider_bg_conn_search wakes up the spider bg thread which executes these queries at the data node. There are three callsites in the function where this waking happens, with the code block pthread_mutex_lock(&conn->bg_conn_sync_mutex); pthread_cond_signal(&conn->bg_conn_cond); pthread_mutex_unlock(&conn->bg_conn_mutex); pthread_cond_wait(&conn->bg_conn_sync_cond, &conn->bg_conn_sync_mutex); So one could place a break point at each of these three sites. Let's call them callsite A, B and C, which are at line 2112, 2229 and 2376 at my commit. Further, one could display the query in the debugger: disp *spider->dbton_handler[0]->sql->str.Ptr@spider->dbton_handler[0]->sql->str.str_length Without view-protocol: 1. Reaches callsite A, with limit 2 in the query 2. callsite B with limit 2,2 3. callsite C with limit 4,2 , no segv With view-protocol: 1. Reaches callsite A, with limit 2 in query 2. callsite B with limit 2,2 3. callsite A with limit 2 4. callsite B with limit 2,2 5. callsite A with limit 2 6. callsite B with limit 2,2 7. callsite C with limit 4,2 , then segv
          ycp Yuchen Pei added a comment - - edited

          A simple testcase, where we could do without view protocol by turning off exists_to_in optimization:

          --disable_query_log
          --disable_result_log
          --source ../../t/test_init.inc
          --enable_result_log
          --enable_query_log
          SET spider_bgs_mode= 1;
          # not needed, but a unique number here is useful for debugging
          SET spider_bgs_second_read= 3;
          # not needed, but a unique number here is useful for debugging
          SET spider_split_read= 6;
          set global spider_same_server_link= 1;
          set spider_same_server_link= 1;
          set optimizer_switch='exists_to_in=off';
           
          evalp CREATE SERVER srv FOREIGN DATA WRAPPER mysql
          OPTIONS (SOCKET "$MASTER_1_MYSOCK", DATABASE 'test',user 'root');
          create table t2 (c1 int);
          create table t1 (c1 int) ENGINE=Spider
          COMMENT='WRAPPER "mysql", srv "srv",TABLE "t2"';
          insert into t1 values (1), (3), (5), (7), (9);
          select a.c1 from t1 a where exists (select * from t1 b where a.c1 = b.c1);
          drop table t1, t2;
          drop server srv;
          --disable_query_log
          --disable_result_log
          --source ../../t/test_deinit.inc
          --enable_result_log
          --enable_query_log

          ycp Yuchen Pei added a comment - - edited A simple testcase, where we could do without view protocol by turning off exists_to_in optimization: --disable_query_log --disable_result_log --source ../../t/test_init.inc --enable_result_log --enable_query_log SET spider_bgs_mode= 1; # not needed, but a unique number here is useful for debugging SET spider_bgs_second_read= 3; # not needed, but a unique number here is useful for debugging SET spider_split_read= 6; set global spider_same_server_link= 1; set spider_same_server_link= 1; set optimizer_switch= 'exists_to_in=off' ;   evalp CREATE SERVER srv FOREIGN DATA WRAPPER mysql OPTIONS (SOCKET "$MASTER_1_MYSOCK" , DATABASE 'test' , user 'root' ); create table t2 (c1 int ); create table t1 (c1 int ) ENGINE=Spider COMMENT= 'WRAPPER "mysql", srv "srv",TABLE "t2"' ; insert into t1 values (1), (3), (5), (7), (9); select a.c1 from t1 a where exists ( select * from t1 b where a.c1 = b.c1); drop table t1, t2; drop server srv; --disable_query_log --disable_result_log --source ../../t/test_deinit.inc --enable_result_log --enable_query_log
          ycp Yuchen Pei added a comment -

          At a higher level, we annotate the queries sent to the data node as follows:

          # outer read_first_record, now a.c1 = 1
          select `c1` from `test`.`t2` limit 2 (q11)
          select `c1` from `test`.`t2` limit 2,3 (q10)
            # inner read_first_record for b.c1
            select `c1` from `test`.`t2` limit 2 (q9)
            select `c1` from `test`.`t2` limit 2,3 (q8)
          # outer read_record, now a.c1 = 3
            # inner read_first_record for b.c1
            select `c1` from `test`.`t2` limit 2 (q7)
            select `c1` from `test`.`t2` limit 2,3 (q6)
          # outer read_record (from spider_db_seek_next), now a.c1 = 5
          select `c1` from `test`.`t2` limit 5,6 (q5)
            # inner read_first_record for b.c1
            select `c1` from `test`.`t2` limit 2 (q4)
            select `c1` from `test`.`t2` limit 2,3 (q3)
            # inner read_record (from spider_db_seek_next)
            select `c1` from `test`.`t2` limit 5,2 (q2)
          # outer read_record, now a.c1 = 7
            # inner read_first_record, segv
            select `c1` from `test`.`t2` limit 2 (q1)

          At the lower level, for the SEGV to trigger, we need both result_list->current to be non-NULL and result_list->bgs_phase == 1. The bgs_phase field has the following comment

            /* 0:not use bg 1:first read 2:second read 3:after second read */
            volatile int            bgs_phase;

          So in query q4, bgs_phase is assigned value 1, then in q3 it is 2, and in q2 it is 3, and so on. In other words, bgs_phase = 1 in q1, q4, q7, q9 and q11.

          For the result_list->current to be non-NULL, the prev field of a specific SPIDER_RESULT needs to be non-NULL, as spider_db_free_one_result_for_start_next during rnd_init (called before "first read" queries q1, q4, q7, ...) assigns the latter to the former:

          // spider_db_free_one_result_for_start_next:
                    result_list->current = result->prev;

          For the prev field of a SPIDER_RESULT to be non-NULL, with watchpoint placement and rr reverse-continue, the following statement is the cause, and it happens when storing results after q8:

          // spider_db_store_result:
                    result_list->last->prev = result_list->bgs_current;

          So, with all the manipulation of the result_list linked list, it takes three iterations for the result_list->current != NULL condition to be met at the "right" time that causes the SEGV. So the next step could be to find out how the linked list got manipulated into this state.

          ycp Yuchen Pei added a comment - At a higher level, we annotate the queries sent to the data node as follows: # outer read_first_record, now a.c1 = 1 select `c1` from `test`.`t2` limit 2 (q11) select `c1` from `test`.`t2` limit 2,3 (q10) # inner read_first_record for b.c1 select `c1` from `test`.`t2` limit 2 (q9) select `c1` from `test`.`t2` limit 2,3 (q8) # outer read_record, now a.c1 = 3 # inner read_first_record for b.c1 select `c1` from `test`.`t2` limit 2 (q7) select `c1` from `test`.`t2` limit 2,3 (q6) # outer read_record (from spider_db_seek_next), now a.c1 = 5 select `c1` from `test`.`t2` limit 5,6 (q5) # inner read_first_record for b.c1 select `c1` from `test`.`t2` limit 2 (q4) select `c1` from `test`.`t2` limit 2,3 (q3) # inner read_record (from spider_db_seek_next) select `c1` from `test`.`t2` limit 5,2 (q2) # outer read_record, now a.c1 = 7 # inner read_first_record, segv select `c1` from `test`.`t2` limit 2 (q1) At the lower level, for the SEGV to trigger, we need both result_list->current to be non-NULL and result_list->bgs_phase == 1 . The bgs_phase field has the following comment /* 0:not use bg 1:first read 2:second read 3:after second read */ volatile int bgs_phase; So in query q4, bgs_phase is assigned value 1, then in q3 it is 2, and in q2 it is 3, and so on. In other words, bgs_phase = 1 in q1, q4, q7, q9 and q11. For the result_list->current to be non-NULL, the prev field of a specific SPIDER_RESULT needs to be non-NULL, as spider_db_free_one_result_for_start_next during rnd_init (called before "first read" queries q1, q4, q7, ...) assigns the latter to the former: // spider_db_free_one_result_for_start_next: result_list->current = result->prev; For the prev field of a SPIDER_RESULT to be non-NULL, with watchpoint placement and rr reverse-continue, the following statement is the cause, and it happens when storing results after q8: // spider_db_store_result: result_list->last->prev = result_list->bgs_current; So, with all the manipulation of the result_list linked list, it takes three iterations for the result_list->current != NULL condition to be met at the "right" time that causes the SEGV. So the next step could be to find out how the linked list got manipulated into this state.
          ycp Yuchen Pei added a comment - - edited

          This is taking too long... I wonder if we could simply take the easy way out and fiddling with the condition triggering SEGV:

                    if (result_list->low_mem_read &&
                      result_list->current->result->limit_mode() == 0)
          

          The NULL object is result_list->current->result, and the method call limit_mode() always returns 0 for mariadb backend, and 1 for odbc backend.

          If we comment out the second condition, then the simple test passes, as well as all the other tests in the spider suites without view protocol. The view protocol test still fails for spider/bg.basic_sql, but in a different way similar to certain other view protocol failures (MDEV-36478):

          CURRENT_TEST: spider/bg.basic_sql
          --- /home/ycp/source/mariadb-server/push-10.5/src/storage/spider/mysql-test/spider/bg/r/basic_sql.result	2024-10-03 14:55:57.954250528 +1000
          +++ /home/ycp/source/mariadb-server/push-10.5/src/storage/spider/mysql-test/spider/bg/r/basic_sql.reject	2025-03-28 21:53:55.843215178 +1100
          @@ -244,7 +244,7 @@
           connection master_1;
           SELECT found_rows();
           found_rows()
          -5
          +4
           
           select high_priority
           connection master_1;
          

          ycp Yuchen Pei added a comment - - edited This is taking too long... I wonder if we could simply take the easy way out and fiddling with the condition triggering SEGV: if (result_list->low_mem_read && result_list->current->result->limit_mode() == 0) The NULL object is result_list->current->result , and the method call limit_mode() always returns 0 for mariadb backend, and 1 for odbc backend. If we comment out the second condition, then the simple test passes, as well as all the other tests in the spider suites without view protocol. The view protocol test still fails for spider/bg.basic_sql, but in a different way similar to certain other view protocol failures ( MDEV-36478 ): CURRENT_TEST: spider/bg.basic_sql --- /home/ycp/source/mariadb-server/push-10.5/src/storage/spider/mysql-test/spider/bg/r/basic_sql.result 2024-10-03 14:55:57.954250528 +1000 +++ /home/ycp/source/mariadb-server/push-10.5/src/storage/spider/mysql-test/spider/bg/r/basic_sql.reject 2025-03-28 21:53:55.843215178 +1100 @@ -244,7 +244,7 @@ connection master_1; SELECT found_rows(); found_rows() -5 +4 select high_priority connection master_1;
          ycp Yuchen Pei added a comment - - edited

          I've pushed a commit f89f8aa313cb891b488a0911d9f7e90ad27d0c02 to temporarily disable view protocol for the offending statement until this issue is resolved

          ycp Yuchen Pei added a comment - - edited I've pushed a commit f89f8aa313cb891b488a0911d9f7e90ad27d0c02 to temporarily disable view protocol for the offending statement until this issue is resolved

          People

            ycp Yuchen Pei
            ycp Yuchen Pei
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.