Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29328

rocksdb.allow_to_start_after_corruption fails on ASAN build

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL)
    • 10.5, 10.6
    • None

    Description

      On a debug ASAN build,

      perl ./mtr rocksdb.allow_to_start_after_corruption
      

      The visible effect in MTR stdout is

      10.5 5fc172fd

      rocksdb.allow_to_start_after_corruption 'write_committed' [ fail ]
              Test ended at 2022-08-18 18:45:57
       
      CURRENT_TEST: rocksdb.allow_to_start_after_corruption
      2022-08-18 18:45:42 0 [Note] /mnt8t/bld/10.5-asan/bin/mariadbd (mysqld 10.5.18-MariaDB-debug-log) starting as process 1206303 ...
      2022-08-18 18:45:42 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 32198)
      Aborted
      mysqltest: At line 62: command "$MYSQLD_CMD --plugin_load=$HA_ROCKSDB_SO $_mysqld_option" failed with wrong error: 134
       
      The result from queries just before the failure was:
      < snip >
      1	1
      set session debug_dbug= "+d,rocksdb_return_status_corrupted";
      select * from t1 where pk=1;
      ERROR HY000: Lost connection to MySQL server during query
      FOUND 1 /data corruption detected/ in allow_to_start_after_corruption_debug.err
      #
      #  The same for scan queries
      #
      select * from t1;
      pk	col1
      1	1
      2	2
      3	3
      set session debug_dbug= "+d,rocksdb_return_status_corrupted";
      select * from t1;
      ERROR HY000: Lost connection to MySQL server during query
      FOUND 1 /data corruption detected/ in allow_to_start_after_corruption_debug.err
      #
      #  Test restart failure. The server is shutdown at this point.
      #
       
      More results from queries before failure can be found in /mnt8t/bld/10.5-asan/mysql-test/var/log/allow_to_start_after_corruption.log
       
       - found 'core' (0/5)
      Core generated by '/mnt8t/bld/10.5-asan/bin/mariadbd'
      Output from gdb follows. The first stack trace is from the failing thread.
      The following stack traces are from all threads (so the failing one is
      duplicated).
      --------------------------
      [New LWP 1206291]
      [New LWP 1206250]
      [New LWP 1206285]
      [New LWP 1206251]
      [New LWP 1206249]
      [New LWP 1206252]
      [New LWP 1206284]
      [New LWP 1206283]
      [New LWP 1206287]
      [New LWP 1206248]
      [New LWP 1206289]
      [New LWP 1206288]
      [New LWP 1206290]
      [New LWP 1206286]
      [Thread debugging using libthread_db enabled]
      Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
      Core was generated by `/mnt8t/bld/10.5-asan/bin/mariadbd --defaults-group-suffix=.1 --defaults-file=/m'.
      Program terminated with signal SIGABRT, Aborted.
      #0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      56	../sysdeps/unix/sysv/linux/pthread_kill.c: No such file or directory.
      

      [Current thread is 1 (Thread 0x7f3c48646700 (LWP 1206291))]
      #0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      #1  0x0000000003458f0b in my_write_core (sig=6) at /data/src/10.5/mysys/stacktrace.c:424
      #2  0x00000000017e8054 in handle_fatal_signal (sig=6) at /data/src/10.5/sql/signal_handler.cc:354
      #3  <signal handler called>
      #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #5  0x00007f3c5894c537 in __GI_abort () at abort.c:79
      #6  0x00007f3c4fdcbe4c in myrocks::rdb_handle_io_error (status=..., err_type=myrocks::RDB_IO_ERROR_GENERAL) at /data/src/10.5/storage/rocksdb/ha_rocksdb.cc:14046
      #7  0x00007f3c4fda0c23 in myrocks::is_valid (scan_it=0x608000017620) at /data/src/10.5/storage/rocksdb/ha_rocksdb.cc:5969
      #8  0x00007f3c4fdb0c3b in myrocks::ha_rocksdb::rnd_next_with_direction (this=0x61d000195128, buf=0x619000082b38 "\376\003", move_forward=true) at /data/src/10.5/storage/rocksdb/ha_rocksdb.cc:10687
      #9  0x00007f3c4fdbfa7e in myrocks::ha_rocksdb::rnd_next (this=0x61d000195128, buf=0x619000082b38 "\376\003") at /data/src/10.5/storage/rocksdb/ha_rocksdb.cc:10627
      #10 0x000000000180824a in handler::ha_rnd_next (this=0x61d000195128, buf=0x619000082b38 "\376\003") at /data/src/10.5/sql/handler.cc:3096
      #11 0x0000000001debdc3 in rr_sequential (info=0x62b00003b070) at /data/src/10.5/sql/records.cc:519
      #12 0x0000000000ac10f9 in READ_RECORD::read_record (this=0x62b00003b070) at /data/src/10.5/sql/records.h:80
      #13 0x0000000000efc629 in sub_select (join=0x62b0000399f0, join_tab=0x62b00003afa8, end_of_records=false) at /data/src/10.5/sql/sql_select.cc:20985
      #14 0x0000000000e93106 in do_select (join=0x62b0000399f0, procedure=0x0) at /data/src/10.5/sql/sql_select.cc:20498
      #15 0x0000000000e8f516 in JOIN::exec_inner (this=0x62b0000399f0) at /data/src/10.5/sql/sql_select.cc:4560
      #16 0x0000000000e8ba87 in JOIN::exec (this=0x62b0000399f0) at /data/src/10.5/sql/sql_select.cc:4340
      #17 0x0000000000e0714c in mysql_select (thd=0x62b000069288, tables=0x62b000038980, fields=..., conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2164525824, result=0x62b0000399c0, unit=0x62b00006d430, select_lex=0x62b000038338) at /data/src/10.5/sql/sql_select.cc:4817
      #18 0x0000000000e05d73 in handle_select (thd=0x62b000069288, lex=0x62b00006d368, result=0x62b0000399c0, setup_tables_done_option=0) at /data/src/10.5/sql/sql_select.cc:444
      #19 0x0000000000d36ee9 in execute_sqlcom_select (thd=0x62b000069288, all_tables=0x62b000038980) at /data/src/10.5/sql/sql_parse.cc:6315
      #20 0x0000000000d1b621 in mysql_execute_command (thd=0x62b000069288) at /data/src/10.5/sql/sql_parse.cc:4006
      #21 0x0000000000d02383 in mysql_parse (thd=0x62b000069288, rawbuf=0x62b0000382a8 "select * from t1", length=16, parser_state=0x7f3c48643a40, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:8101
      #22 0x0000000000cfa5ee in dispatch_command (command=COM_QUERY, thd=0x62b000069288, packet=0x6290000eb289 "", packet_length=16, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:1891
      #23 0x0000000000d04c15 in do_command (thd=0x62b000069288) at /data/src/10.5/sql/sql_parse.cc:1375
      #24 0x00000000012af1d0 in do_handle_one_connection (connect=0x611000034648, put_in_cache=true) at /data/src/10.5/sql/sql_connect.cc:1418
      #25 0x00000000012ae754 in handle_one_connection (arg=0x611000034648) at /data/src/10.5/sql/sql_connect.cc:1312
      #26 0x00000000023bf7a8 in pfs_spawn_thread (arg=0x616000122b08) at /data/src/10.5/storage/perfschema/pfs.cc:2201
      #27 0x00007f3c58cecea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #28 0x00007f3c58a24def in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      The actual ASAN error is apparently this (got it by adding --log-error option to the failing intermediate server startup):

      ==1206303==ERROR: AddressSanitizer: heap-use-after-free on address 0x625000000109 at pc 0x00000242c834 bp 0x7ffd16978840 sp 0x7ffd16978838
      WRITE of size 1 at 0x625000000109 thread T0
          #0 0x242c833 in PFS_buffer_scalable_container<PFS_mutex, 1024, 1024, PFS_buffer_default_array<PFS_mutex>, PFS_buffer_default_allocator<PFS_mutex> >::static_deallocate(PFS_m
      utex*) /data/src/10.5/storage/perfschema/pfs_buffer_container.h:747:22
          #1 0x241f8f8 in PFS_partitioned_buffer_scalable_container<PFS_buffer_scalable_container<PFS_mutex, 1024, 1024, PFS_buffer_default_array<PFS_mutex>, PFS_buffer_default_alloc
      ator<PFS_mutex> >, 2>::deallocate(PFS_mutex*) /data/src/10.5/storage/perfschema/pfs_buffer_container.h:1198:5
          #2 0x240f997 in destroy_mutex(PFS_mutex*) /data/src/10.5/storage/perfschema/pfs_instr.cc:339:26
          #3 0x23be190 in pfs_destroy_mutex_v1 /data/src/10.5/storage/perfschema/pfs.cc:1844:3
          #4 0x1446842 in inline_mysql_mutex_destroy(st_mysql_mutex*, char const*, unsigned int) /data/src/10.5/include/mysql/psi/mysql_thread.h:719:5
          #5 0x144d806 in rpl_binlog_state::free() /data/src/10.5/sql/rpl_gtid.cc:1516:5
          #6 0x144dfd4 in rpl_binlog_state::~rpl_binlog_state() /data/src/10.5/sql/rpl_gtid.cc:1565:3
          #7 0x7f0d23d744d6 in __run_exit_handlers stdlib/exit.c:108:8
          #8 0x7f0d23d74679 in exit stdlib/exit.c:139:3
          #9 0x7f0d1b1f837c in myrocks::rocksdb_init_func(void*) /data/src/10.5/storage/rocksdb/ha_rocksdb.cc:5255:7
          #10 0x17ee147 in ha_initialize_handlerton(st_plugin_int*) /data/src/10.5/sql/handler.cc:645:31
          #11 0xd64376 in plugin_initialize(st_mem_root*, st_plugin_int*, int*, char**, bool) /data/src/10.5/sql/sql_plugin.cc:1461:9
          #12 0xd62e5a in plugin_init(int*, char**, int) /data/src/10.5/sql/sql_plugin.cc:1753:15
          #13 0x8df5bf in init_server_components() /data/src/10.5/sql/mysqld.cc:4928:7
          #14 0x8d7fcd in mysqld_main(int, char**) /data/src/10.5/sql/mysqld.cc:5522:7
          #15 0x8cdc41 in main /data/src/10.5/sql/main.cc:25:10
          #16 0x7f0d23d5cd09 in __libc_start_main csu/../csu/libc-start.c:308:16
          #17 0x821629 in _start (/mnt8t/bld/10.5-asan/bin/mariadbd+0x821629)
      

      but there are also lots of non-ASAN errors before. Since the test is meant to cause corruption, it's hard to say where the intended failures end and the unexpected one starts.

      Reproducible on 10.5+. Not reproducible on 10.4.

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.