Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31354

SIGSEGV in log_sort_flush_list() in InnoDB crash recovery

Details

    Description

      While testing innodb.recovery_memory related to MDEV-31350 and MDEV-31353, I got a surprise crash near the end of my test run of a RelWithDebInfo executable:

      10.9 44c9008ba65686abf1c82c9166255a8c52d61f74

      innodb.recovery_memory 'innodb,release'  w76 [ 100 fail ]
              Test ended at 2023-05-26 12:43:08
       
      CURRENT_TEST: innodb.recovery_memory
       
       
      Server [mysqld.1 - pid: 1375448, winpid: 1375448, exit: 256] failed during test run
      Server log from this test:
      ----------SERVER LOG START-----------
      $ /dev/shm/10.9/sql/mariadbd --defaults-group-suffix=.1 --defaults-file=/dev/shm/10.9/mysql-test/var/76/my.cnf --log-output=file --innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx --innodb-locks --innodb-lock-waits --innodb-metrics --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --innodb-sys-columns --innodb-sys-fields --innodb-sys-foreign --innodb-sys-foreign-cols --innodb-sys-indexes --innodb-sys-tables --innodb-sys-virtual --disable-gdb --loose-sequence --innodb_buffer_pool_size=1073741824 --innodb_buffer_pool_size=5242880 --core-file --loose-debug-sync-timeout=300
      2023-05-26 12:42:51 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 32222)
      2023-05-26 12:42:51 0 [Warning] Changed limits: max_open_files: 1024  max_connections: 151 (was 151)  table_cache: 421 (was 2000)
      2023-05-26 12:42:51 0 [Note] Starting MariaDB 10.9.7-MariaDB-log source revision 44c9008ba65686abf1c82c9166255a8c52d61f74 as process 1375542
      2023-05-26 12:42:51 0 [Note] InnoDB: Compressed tables use zlib 1.2.13
      2023-05-26 12:42:51 0 [Note] InnoDB: Using transactional memory
      2023-05-26 12:42:51 0 [Note] InnoDB: Number of transaction pools: 1
      2023-05-26 12:42:51 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
      2023-05-26 12:42:51 0 [Note] InnoDB: Using liburing
      2023-05-26 12:42:51 0 [Note] InnoDB: Initializing buffer pool, total size = 5.000MiB, chunk size = 1.000MiB
      2023-05-26 12:42:51 0 [Note] InnoDB: Completed initialization of buffer pool
      2023-05-26 12:42:51 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed
      2023-05-26 12:42:51 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=691948272
      2023-05-26 12:42:51 0 [Note] InnoDB: Multi-batch recovery needed at LSN 692430929
      2023-05-26 12:42:51 0 [Note] InnoDB: End of log at LSN=694016960
      2023-05-26 12:42:51 0 [Note] InnoDB: To recover: LSN 692776501/694016960; 356 pages
      2023-05-26 12:42:51 0 [Note] InnoDB: To recover: LSN 693253964/694016960; 317 pages
      2023-05-26 12:42:51 0 [Note] InnoDB: To recover: LSN 693734818/694016960; 268 pages
      2023-05-26 12:42:52 0 [Note] InnoDB: To recover: 9 pages
      230526 12:42:52 [ERROR] mysqld got signal 11 ;
      ...
      #5  buf_page_t::oldest_modification (this=0x0) at /mariadb/11/storage/innobase/include/buf0buf.h:749
      #6  log_sort_flush_list()::$_1::operator()(buf_page_t const*, buf_page_t const*) const (lhs=0x0, rhs=<optimized out>, this=<optimized out>) at /mariadb/11/storage/innobase/log/log0recv.cc:3802
      #7  __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::$_1>::operator()<buf_page_t**, buf_page_t**>(buf_page_t**, buf_page_t**) (__it1=<optimized out>, __it2=0x5578776b7ca0, this=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/predefined_ops.h:158
      #8  std::__unguarded_partition<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::$_1> >(buf_page_t**, buf_page_t**, buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::$_1>) (__first=<optimized out>, __last=0x5578776b7fe8, __pivot=<optimized out>, __comp=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_algo.h:1867
      #9  std::__unguarded_partition_pivot<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::$_1> >(buf_page_t**, buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::$_1>) (__first=0x5578776b7ca0, __last=0x5578776b7ff0, __comp=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_algo.h:1889
      #10 std::__introsort_loop<buf_page_t**, long, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::$_1> >(buf_page_t**, buf_page_t**, long, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::$_1>) (__first=__first@entry=0x5578776b7ca0, __last=0x5578776b7ff0, __depth_limit=11, __comp=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_algo.h:1921
      #11 0x0000557875a531ce in std::__sort<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::$_1> >(buf_page_t**, buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::$_1>) (__first=0x5578776b7ca0, __last=0x5578776b7ff0, __comp=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_algo.h:1937
      #12 std::sort<buf_page_t**, log_sort_flush_list()::$_1>(buf_page_t**, buf_page_t**, log_sort_flush_list()::$_1) (__first=0x5578776b7ca0, __last=0x5578776b7ff0, __comp=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_algo.h:4853
      

      The crash occurs in the following code, which was added in MDEV-27022:

      static void log_sort_flush_list()
      {
        mysql_mutex_lock(&buf_pool.flush_list_mutex);
       
        const size_t size= UT_LIST_GET_LEN(buf_pool.flush_list);
        std::unique_ptr<buf_page_t *[]> list(new buf_page_t *[size]);
       
        size_t idx= 0;
        for (buf_page_t *p= UT_LIST_GET_FIRST(buf_pool.flush_list); p;
             p= UT_LIST_GET_NEXT(list, p))
          list.get()[idx++]= p;
       
        std::sort(list.get(), list.get() + size,
                  [](const buf_page_t *lhs, const buf_page_t *rhs) {
                    return rhs->oldest_modification() < lhs->oldest_modification();
                  });
      

      Based on the core dump, it looks like the comparison function may have been invoked on the first out-of-bounds element list.get() + size.

      The size is equal to buf_pool.flush_list.count, and all elements in the array list.get() are nonnull. The address immediately after the last element (the second argument passed to std::sort()) contains a null pointer.

      To me, this looks like a possible error in the implementation of std::sort() in this version of libstdc+. The executable had been compiled with clang-15, but the option -stdlib=libc+ was not used.

      I think that more investigation will be needed on this. There is also other similar use of std::sort() in InnoDB.

      Attachments

        Issue Links

          Activity

            marko Marko Mäkelä created issue -
            marko Marko Mäkelä made changes -
            Field Original Value New Value
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -

            One year ago, mleich had reproduced this exactly once, actually even with rr replay, when testing a deleted development branch of MDEV-13542. There were many failures with that branch because it was tested with some fault injection (pretending that pages are corrupted when they are not), and the files were removed in order to save storage space. Only a log file was left. I will copy some data from there:

            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 22656]2022-05-27 17:47:26 0 [Note] InnoDB: To recover: 345 pages from log
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 23039]=================================================================
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 23046][rr 3681439 23050]==3681439==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x621000053938 at pc 0x56178acbabf2 bp 0x7fffb454d2e0 sp 0x7fffb454d2d0
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 23053][rr 3681439 23055]READ of size 8 at 0x621000053938 thread T0
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28861]    #0 0x56178acbabf1 in operator()<buf_page_t**, buf_page_t**> /usr/include/c++/9/bits/predefined_ops.h:143
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28869]    #1 0x56178acbaab5 in __unguarded_partition<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > > /usr/include/c++/9/bits/stl_algo.h:1901
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28871]    #2 0x56178acba092 in __unguarded_partition_pivot<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > > /usr/include/c++/9/bits/stl_algo.h:1922
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28873]    #3 0x56178acb932d in __introsort_loop<buf_page_t**, long int, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > > /usr/include/c++/9/bits/stl_algo.h:1952
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28879]    #4 0x56178acb7497 in __sort<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > > /usr/include/c++/9/bits/stl_algo.h:1967
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28885]    #5 0x56178acb6047 in sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > /usr/include/c++/9/bits/stl_algo.h:4899
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28887]    #6 0x56178aca9887 in log_sort_flush_list /data/Server/bb-10.6-MDEV-13542N/storage/innobase/log/log0recv.cc:3092
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28889]    #7 0x56178acac10e in recv_sys_t::apply(bool) /data/Server/bb-10.6-MDEV-13542N/storage/innobase/log/log0recv.cc:3285
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28895]    #8 0x56178af2c41d in srv_start(bool) /data/Server/bb-10.6-MDEV-13542N/storage/innobase/srv/srv0start.cc:1486
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28926]    #9 0x56178aac124e in innodb_init /data/Server/bb-10.6-MDEV-13542N/storage/innobase/handler/ha_innodb.cc:4316
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28944]    #10 0x56178a1c8cde in ha_initialize_handlerton(st_plugin_int*) /data/Server/bb-10.6-MDEV-13542N/sql/handler.cc:659
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28954]    #11 0x561789a254be in plugin_initialize /data/Server/bb-10.6-MDEV-13542N/sql/sql_plugin.cc:1465
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28956]    #12 0x561789a27223 in plugin_init(int*, char**, int) /data/Server/bb-10.6-MDEV-13542N/sql/sql_plugin.cc:1758
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28968]    #13 0x56178972fc05 in init_server_components /data/Server/bb-10.6-MDEV-13542N/sql/mysqld.cc:5090
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28970]    #14 0x5617897315bc in mysqld_main(int, char**) /data/Server/bb-10.6-MDEV-13542N/sql/mysqld.cc:5707
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28972]    #15 0x56178971ffbc in main /data/Server/bb-10.6-MDEV-13542N/sql/main.cc:34
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28974]    #16 0x7f80dde7f0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28976]    #17 0x56178971fedd in _start (/data/Server_bin/bb-10.6-MDEV-13542N_asan/bin/mariadbd+0x13ffedd)
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28978]
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28988]0x621000053938 is located 0 bytes to the right of 4152-byte region [0x621000052900,0x621000053938)
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28994]allocated by thread T0 here:
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29000]    #0 0x7f80deb89b47 in operator new[](unsigned long) (/lib/x86_64-linux-gnu/libasan.so.5+0x10fb47)
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29002]    #1 0x56178aca9749 in log_sort_flush_list /data/Server/bb-10.6-MDEV-13542N/storage/innobase/log/log0recv.cc:3085
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29004]    #2 0x56178acac10e in recv_sys_t::apply(bool) /data/Server/bb-10.6-MDEV-13542N/storage/innobase/log/log0recv.cc:3285
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29006]    #3 0x56178af2c41d in srv_start(bool) /data/Server/bb-10.6-MDEV-13542N/storage/innobase/srv/srv0start.cc:1486
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29008]    #4 0x56178aac124e in innodb_init /data/Server/bb-10.6-MDEV-13542N/storage/innobase/handler/ha_innodb.cc:4316
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29010]    #5 0x56178a1c8cde in ha_initialize_handlerton(st_plugin_int*) /data/Server/bb-10.6-MDEV-13542N/sql/handler.cc:659
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29012]    #6 0x561789a254be in plugin_initialize /data/Server/bb-10.6-MDEV-13542N/sql/sql_plugin.cc:1465
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29014]    #7 0x561789a27223 in plugin_init(int*, char**, int) /data/Server/bb-10.6-MDEV-13542N/sql/sql_plugin.cc:1758
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29016]    #8 0x56178972fc05 in init_server_components /data/Server/bb-10.6-MDEV-13542N/sql/mysqld.cc:5090
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29018]    #9 0x5617897315bc in mysqld_main(int, char**) /data/Server/bb-10.6-MDEV-13542N/sql/mysqld.cc:5707
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29020]    #10 0x56178971ffbc in main /data/Server/bb-10.6-MDEV-13542N/sql/main.cc:34
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29022]    #11 0x7f80dde7f0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29024]
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29034]SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/include/c++/9/bits/predefined_ops.h:143 in operator()<buf_page_t**, buf_page_t**>
            # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29468]Shadow bytes around the buggy address:
            # 2022-05-27T17:48:04 [3595936] |   0x0c42800026d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            # 2022-05-27T17:48:04 [3595936] |   0x0c42800026e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            # 2022-05-27T17:48:04 [3595936] |   0x0c42800026f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            # 2022-05-27T17:48:04 [3595936] |   0x0c4280002700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            # 2022-05-27T17:48:04 [3595936] |   0x0c4280002710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            # 2022-05-27T17:48:04 [3595936] | =>0x0c4280002720: 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa fa
            

            It looks like exactly my SIGSEGV, with the AddressSanitizer shadow byte hinting that the element past the end of the buffer is being accessed. According to the above output, this happened with the C++ run-time library of GCC 9 already.

            marko Marko Mäkelä added a comment - One year ago, mleich had reproduced this exactly once, actually even with rr replay , when testing a deleted development branch of MDEV-13542 . There were many failures with that branch because it was tested with some fault injection (pretending that pages are corrupted when they are not), and the files were removed in order to save storage space. Only a log file was left. I will copy some data from there: # 2022-05-27T17:48:04 [3595936] | [rr 3681439 22656]2022-05-27 17:47:26 0 [Note] InnoDB: To recover: 345 pages from log # 2022-05-27T17:48:04 [3595936] | [rr 3681439 23039]================================================================= # 2022-05-27T17:48:04 [3595936] | [rr 3681439 23046][rr 3681439 23050]==3681439==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x621000053938 at pc 0x56178acbabf2 bp 0x7fffb454d2e0 sp 0x7fffb454d2d0 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 23053][rr 3681439 23055]READ of size 8 at 0x621000053938 thread T0 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28861] #0 0x56178acbabf1 in operator()<buf_page_t**, buf_page_t**> /usr/include/c++/9/bits/predefined_ops.h:143 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28869] #1 0x56178acbaab5 in __unguarded_partition<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > > /usr/include/c++/9/bits/stl_algo.h:1901 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28871] #2 0x56178acba092 in __unguarded_partition_pivot<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > > /usr/include/c++/9/bits/stl_algo.h:1922 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28873] #3 0x56178acb932d in __introsort_loop<buf_page_t**, long int, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > > /usr/include/c++/9/bits/stl_algo.h:1952 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28879] #4 0x56178acb7497 in __sort<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > > /usr/include/c++/9/bits/stl_algo.h:1967 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28885] #5 0x56178acb6047 in sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > /usr/include/c++/9/bits/stl_algo.h:4899 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28887] #6 0x56178aca9887 in log_sort_flush_list /data/Server/bb-10.6-MDEV-13542N/storage/innobase/log/log0recv.cc:3092 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28889] #7 0x56178acac10e in recv_sys_t::apply(bool) /data/Server/bb-10.6-MDEV-13542N/storage/innobase/log/log0recv.cc:3285 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28895] #8 0x56178af2c41d in srv_start(bool) /data/Server/bb-10.6-MDEV-13542N/storage/innobase/srv/srv0start.cc:1486 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28926] #9 0x56178aac124e in innodb_init /data/Server/bb-10.6-MDEV-13542N/storage/innobase/handler/ha_innodb.cc:4316 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28944] #10 0x56178a1c8cde in ha_initialize_handlerton(st_plugin_int*) /data/Server/bb-10.6-MDEV-13542N/sql/handler.cc:659 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28954] #11 0x561789a254be in plugin_initialize /data/Server/bb-10.6-MDEV-13542N/sql/sql_plugin.cc:1465 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28956] #12 0x561789a27223 in plugin_init(int*, char**, int) /data/Server/bb-10.6-MDEV-13542N/sql/sql_plugin.cc:1758 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28968] #13 0x56178972fc05 in init_server_components /data/Server/bb-10.6-MDEV-13542N/sql/mysqld.cc:5090 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28970] #14 0x5617897315bc in mysqld_main(int, char**) /data/Server/bb-10.6-MDEV-13542N/sql/mysqld.cc:5707 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28972] #15 0x56178971ffbc in main /data/Server/bb-10.6-MDEV-13542N/sql/main.cc:34 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28974] #16 0x7f80dde7f0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28976] #17 0x56178971fedd in _start (/data/Server_bin/bb-10.6-MDEV-13542N_asan/bin/mariadbd+0x13ffedd) # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28978] # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28988]0x621000053938 is located 0 bytes to the right of 4152-byte region [0x621000052900,0x621000053938) # 2022-05-27T17:48:04 [3595936] | [rr 3681439 28994]allocated by thread T0 here: # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29000] #0 0x7f80deb89b47 in operator new[](unsigned long) (/lib/x86_64-linux-gnu/libasan.so.5+0x10fb47) # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29002] #1 0x56178aca9749 in log_sort_flush_list /data/Server/bb-10.6-MDEV-13542N/storage/innobase/log/log0recv.cc:3085 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29004] #2 0x56178acac10e in recv_sys_t::apply(bool) /data/Server/bb-10.6-MDEV-13542N/storage/innobase/log/log0recv.cc:3285 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29006] #3 0x56178af2c41d in srv_start(bool) /data/Server/bb-10.6-MDEV-13542N/storage/innobase/srv/srv0start.cc:1486 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29008] #4 0x56178aac124e in innodb_init /data/Server/bb-10.6-MDEV-13542N/storage/innobase/handler/ha_innodb.cc:4316 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29010] #5 0x56178a1c8cde in ha_initialize_handlerton(st_plugin_int*) /data/Server/bb-10.6-MDEV-13542N/sql/handler.cc:659 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29012] #6 0x561789a254be in plugin_initialize /data/Server/bb-10.6-MDEV-13542N/sql/sql_plugin.cc:1465 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29014] #7 0x561789a27223 in plugin_init(int*, char**, int) /data/Server/bb-10.6-MDEV-13542N/sql/sql_plugin.cc:1758 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29016] #8 0x56178972fc05 in init_server_components /data/Server/bb-10.6-MDEV-13542N/sql/mysqld.cc:5090 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29018] #9 0x5617897315bc in mysqld_main(int, char**) /data/Server/bb-10.6-MDEV-13542N/sql/mysqld.cc:5707 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29020] #10 0x56178971ffbc in main /data/Server/bb-10.6-MDEV-13542N/sql/main.cc:34 # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29022] #11 0x7f80dde7f0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29024] # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29034]SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/include/c++/9/bits/predefined_ops.h:143 in operator()<buf_page_t**, buf_page_t**> # 2022-05-27T17:48:04 [3595936] | [rr 3681439 29468]Shadow bytes around the buggy address: # 2022-05-27T17:48:04 [3595936] | 0x0c42800026d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 # 2022-05-27T17:48:04 [3595936] | 0x0c42800026e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 # 2022-05-27T17:48:04 [3595936] | 0x0c42800026f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 # 2022-05-27T17:48:04 [3595936] | 0x0c4280002700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 # 2022-05-27T17:48:04 [3595936] | 0x0c4280002710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 # 2022-05-27T17:48:04 [3595936] | =>0x0c4280002720: 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa fa It looks like exactly my SIGSEGV, with the AddressSanitizer shadow byte hinting that the element past the end of the buffer is being accessed. According to the above output, this happened with the C++ run-time library of GCC 9 already.
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] In Progress [ 3 ]

            I tested the following C++11 program:

            #include<stdio.h>
            #include<algorithm>
             
            int main()
            {
              const char* strings[3]= {"foo","bar",nullptr};
              std::sort(&strings[0], &strings[2],
                        [](const char *a, const char *b)
                        { printf("%s<%s\n", a, b); return *a < *b; });
              printf("%s, %s, %s\n", strings[0], strings[1], strings[2]);
              return 0;
            }
            

            When compiled with g++-10 (the oldest compiler that I happened to have available on my system), this will display the following:

            bar<foo
            bar, foo, (null)
            

            printf("%s",nullptr) may be undefined behaviour according to the standard, but in GNU libc it displays (null).

            According to the output, there was no out-of-bounds access beyond the first two elements of strings in the sort routine. But, the call stack would involve std::__insertion_sort instead of any introsort or unguarded_partition_pivot. I will need to experiment a little more.

            marko Marko Mäkelä added a comment - I tested the following C++11 program: #include<stdio.h> #include<algorithm>   int main() { const char * strings[3]= { "foo" , "bar" ,nullptr}; std::sort(&strings[0], &strings[2], []( const char *a, const char *b) { printf ( "%s<%s\n" , a, b); return *a < *b; }); printf ( "%s, %s, %s\n" , strings[0], strings[1], strings[2]); return 0; } When compiled with g++-10 (the oldest compiler that I happened to have available on my system), this will display the following: bar<foo bar, foo, (null) printf("%s",nullptr) may be undefined behaviour according to the standard, but in GNU libc it displays (null) . According to the output, there was no out-of-bounds access beyond the first two elements of strings in the sort routine. But, the call stack would involve std::__insertion_sort instead of any introsort or unguarded_partition_pivot . I will need to experiment a little more.

            I checked libstdc++ 11 and 13, and both seem to apply an additional pre-sorting phase when the array is larger than _S_threshold or 16 elements. I tried the following program, but it did not access the out-of-bounds null pointer on my system:

            #include<stdio.h>
            #include<algorithm>
             
            int main()
            {
              const char* strings[22]=
                {"a","b","c","d","e","f","g","z","h","j","i",
                 "l","k","m","n","p","q","o","s","t","r",nullptr};
              std::sort(&strings[0], &strings[21],
                        [](const char *a, const char *b)
                        { printf("%s<%s\n", a, b); return *a < *b; });
              return 0;
            }
            

            With this program, printf() is being called from within std::__unguarded_partition_pivot, so we seem to be near the right track. Still, the null pointer was not being dereferenced.

            Perhaps I should programmatically try all permutations of the array. I sadly had not recorded more details of the original failure. I vaguely remember that there were some tens of elements to be sorted. Some of them may compare equal to each other. The initial permutation of the array should be available via the contents of buf_pool.flush_list in a core dump; we should not need rr replay for debugging this.

            marko Marko Mäkelä added a comment - I checked libstdc++ 11 and 13, and both seem to apply an additional pre-sorting phase when the array is larger than _S_threshold or 16 elements. I tried the following program, but it did not access the out-of-bounds null pointer on my system: #include<stdio.h> #include<algorithm>   int main() { const char * strings[22]= { "a" , "b" , "c" , "d" , "e" , "f" , "g" , "z" , "h" , "j" , "i" , "l" , "k" , "m" , "n" , "p" , "q" , "o" , "s" , "t" , "r" ,nullptr}; std::sort(&strings[0], &strings[21], []( const char *a, const char *b) { printf ( "%s<%s\n" , a, b); return *a < *b; }); return 0; } With this program, printf() is being called from within std::__unguarded_partition_pivot , so we seem to be near the right track. Still, the null pointer was not being dereferenced. Perhaps I should programmatically try all permutations of the array. I sadly had not recorded more details of the original failure. I vaguely remember that there were some tens of elements to be sorted. Some of them may compare equal to each other. The initial permutation of the array should be available via the contents of buf_pool.flush_list in a core dump; we should not need rr replay for debugging this.
            marko Marko Mäkelä made changes -

            nikitamalyavin pointed out that the buf_page_t::oldest_modification_ is not fully protected by buf_pool.flush_list_mutex. This is due to the performance optimization MDEV-25113: on page write completion, pages will not immediately be removed from buf_pool.flush_list but they are marked as clean by setting oldest_modification_=1. The ‘garbage’ (blocks that are actually clean) will be removed later from buf_pool.flush_list.

            In other words, the underlying data may change during the execution of std::sort(). I am trying to reproduce this claim by using the following patch:

            diff --git a/mysql-test/suite/innodb/r/recovery_memory.result b/mysql-test/suite/innodb/r/recovery_memory.result
            index 9aba9bccdb3..34bf6f0a5b0 100644
            --- a/mysql-test/suite/innodb/r/recovery_memory.result
            +++ b/mysql-test/suite/innodb/r/recovery_memory.result
            @@ -12,7 +12,7 @@ END LOOP
             connect con1,localhost,root,,,;
             CALL dorepeat();
             connection default;
            -# restart: --innodb_buffer_pool_size=5242880
            +# restart: --innodb_max_dirty_pages_pct=0.00001
             DROP TABLE t1;
             DROP PROCEDURE dorepeat;
             #
            @@ -20,10 +20,10 @@ DROP PROCEDURE dorepeat;
             #        handling scenario
             #
             SET DEBUG_DBUG="+d,ib_log_checkpoint_avoid_hard";
            +SET GLOBAL innodb_max_dirty_pages_pct=0;
             CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB;
             INSERT INTO t1 SELECT * FROM seq_1_to_65536;
             # restart: with restart_parameters
            -# restart
             SHOW CREATE TABLE t1;
             Table	Create Table
             t1	CREATE TABLE `t1` (
            diff --git a/mysql-test/suite/innodb/t/recovery_memory.test b/mysql-test/suite/innodb/t/recovery_memory.test
            index 145b39d56f6..92bc962a826 100644
            --- a/mysql-test/suite/innodb/t/recovery_memory.test
            +++ b/mysql-test/suite/innodb/t/recovery_memory.test
            @@ -22,7 +22,7 @@ send CALL dorepeat();
             connection default;
             sleep 10;
             let $shutdown_timeout=0;
            -let $restart_parameters=--innodb_buffer_pool_size=5242880;
            +let $restart_parameters=--innodb_max_dirty_pages_pct=0.00001;
             --source include/restart_mysqld.inc
             DROP TABLE t1;
             DROP PROCEDURE dorepeat;
            @@ -37,15 +37,13 @@ let $restart_parameters=--innodb_buffer_pool_size=5242880 --debug_dbug="+d,ibuf_
             }
             if (!$have_debug) {
             --echo SET DEBUG_DBUG="+d,ib_log_checkpoint_avoid_hard";
            -let $restart_parameters=--innodb_buffer_pool_size=5242880;
            +#let $restart_parameters=--innodb_buffer_pool_size=5242880;
             }
            +SET GLOBAL innodb_max_dirty_pages_pct=0;
             CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB;
             INSERT INTO t1 SELECT * FROM seq_1_to_65536;
             let $restart_noprint=1;
             let $shutdown_timeout=0;
             --source include/restart_mysqld.inc
            -let $restart_noprint=0;
            -let $restart_parameters=;
            ---source include/restart_mysqld.inc
             SHOW CREATE TABLE t1;
             DROP TABLE t1;
            diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
            index 66b763fdf2e..ec38ac960a2 100644
            --- a/storage/innobase/log/log0recv.cc
            +++ b/storage/innobase/log/log0recv.cc
            @@ -3680,25 +3680,61 @@ inline fil_space_t *fil_system_t::find(const char *path) const
             /** Thread-safe function which sorts flush_list by oldest_modification */
             static void log_sort_flush_list()
             {
            +#if 0
            +  /* Ensure that oldest_modification() cannot change during std::sort() */
            +  for (;;)
            +  {
            +    os_aio_wait_until_no_pending_writes(false);
            +    mysql_mutex_lock(&buf_pool.flush_list_mutex);
            +    if (!buf_pool.flush_list_active())
            +      break;
            +    my_cond_wait(&buf_pool.done_flush_list,
            +                 &buf_pool.flush_list_mutex.m_mutex);
            +    mysql_mutex_unlock(&buf_pool.flush_list_mutex);
            +  }
            +#else
               mysql_mutex_lock(&buf_pool.flush_list_mutex);
            +#endif
             
               const size_t size= UT_LIST_GET_LEN(buf_pool.flush_list);
               std::unique_ptr<buf_page_t *[]> list(new buf_page_t *[size]);
             
            +  /* Copy the dirty blocks from buf_pool.flush_list to an array for sorting. */
               size_t idx= 0;
            +#if 1
            +  for (buf_page_t *p= UT_LIST_GET_FIRST(buf_pool.flush_list); p; )
            +  {
            +    const lsn_t lsn{p->oldest_modification()};
            +    ut_ad(lsn > 2 || lsn == 1);
            +    buf_page_t *n= UT_LIST_GET_NEXT(list, p);
            +    if (lsn > 1)
            +      list.get()[idx++]= p;
            +    else
            +      buf_pool.delete_from_flush_list(p);
            +    p= n;
            +  }
            +#else
               for (buf_page_t *p= UT_LIST_GET_FIRST(buf_pool.flush_list); p;
                    p= UT_LIST_GET_NEXT(list, p))
                 list.get()[idx++]= p;
            +#endif
             
            -  std::sort(list.get(), list.get() + size,
            +  sql_print_information("sorting %zu,%zu", idx, size);
            +
            +  std::sort(list.get(), list.get() + idx,
                         [](const buf_page_t *lhs, const buf_page_t *rhs) {
            -              return rhs->oldest_modification() < lhs->oldest_modification();
            +              lsn_t l{lhs->oldest_modification()},r{rhs->oldest_modification()};
            +              ut_a(l > 2); ut_a(r > 2);
            +              return r < l;
                         });
             
               UT_LIST_INIT(buf_pool.flush_list, &buf_page_t::list);
             
            -  for (size_t i= 0; i < size; i++)
            +  for (size_t i= 0; i < idx; i++)
            +  {
            +    ut_a(list[i]->oldest_modification() > 2);
                 UT_LIST_ADD_LAST(buf_pool.flush_list, list[i]);
            +  }
             
               mysql_mutex_unlock(&buf_pool.flush_list_mutex);
             }
            

            The #if 0 code section above would be the attempted fix. The rest of the code patch is trying to catch a situation where a buf_page_t::oldest_modification_ is being changed to the dummy value 1 while the routine is running. The test has been changed to not reduce the size of the buffer pool, so that there will be more pages in buf_pool.flush_list to sort.

            I have been unsuccessful so far. No matter how low values of innodb_max_dirty_pages_pct I try, the empty output of the following suggests that no concurrent page writes are taking place:

            ./mtr --parallel=100 --repeat=100 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,}
            grep -w sorting var/*/log/mysqld.1.err|grep -v '\([0-9][0-9]*\),\1$'
            

            I observe see the following size=buf_pool.flush_list.count values during various phases of the test: 0, 283 to 291, 314 to 335, 359, 420, 461 to 466.

            marko Marko Mäkelä added a comment - nikitamalyavin pointed out that the buf_page_t::oldest_modification_ is not fully protected by buf_pool.flush_list_mutex . This is due to the performance optimization MDEV-25113 : on page write completion, pages will not immediately be removed from buf_pool.flush_list but they are marked as clean by setting oldest_modification_=1 . The ‘garbage’ (blocks that are actually clean) will be removed later from buf_pool.flush_list . In other words, the underlying data may change during the execution of std::sort() . I am trying to reproduce this claim by using the following patch: diff --git a/mysql-test/suite/innodb/r/recovery_memory.result b/mysql-test/suite/innodb/r/recovery_memory.result index 9aba9bccdb3..34bf6f0a5b0 100644 --- a/mysql-test/suite/innodb/r/recovery_memory.result +++ b/mysql-test/suite/innodb/r/recovery_memory.result @@ -12,7 +12,7 @@ END LOOP connect con1,localhost,root,,,; CALL dorepeat(); connection default; -# restart: --innodb_buffer_pool_size=5242880 +# restart: --innodb_max_dirty_pages_pct=0.00001 DROP TABLE t1; DROP PROCEDURE dorepeat; # @@ -20,10 +20,10 @@ DROP PROCEDURE dorepeat; # handling scenario # SET DEBUG_DBUG="+d,ib_log_checkpoint_avoid_hard"; +SET GLOBAL innodb_max_dirty_pages_pct=0; CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB; INSERT INTO t1 SELECT * FROM seq_1_to_65536; # restart: with restart_parameters -# restart SHOW CREATE TABLE t1; Table Create Table t1 CREATE TABLE `t1` ( diff --git a/mysql-test/suite/innodb/t/recovery_memory.test b/mysql-test/suite/innodb/t/recovery_memory.test index 145b39d56f6..92bc962a826 100644 --- a/mysql-test/suite/innodb/t/recovery_memory.test +++ b/mysql-test/suite/innodb/t/recovery_memory.test @@ -22,7 +22,7 @@ send CALL dorepeat(); connection default; sleep 10; let $shutdown_timeout=0; -let $restart_parameters=--innodb_buffer_pool_size=5242880; +let $restart_parameters=--innodb_max_dirty_pages_pct=0.00001; --source include/restart_mysqld.inc DROP TABLE t1; DROP PROCEDURE dorepeat; @@ -37,15 +37,13 @@ let $restart_parameters=--innodb_buffer_pool_size=5242880 --debug_dbug="+d,ibuf_ } if (!$have_debug) { --echo SET DEBUG_DBUG="+d,ib_log_checkpoint_avoid_hard"; -let $restart_parameters=--innodb_buffer_pool_size=5242880; +#let $restart_parameters=--innodb_buffer_pool_size=5242880; } +SET GLOBAL innodb_max_dirty_pages_pct=0; CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB; INSERT INTO t1 SELECT * FROM seq_1_to_65536; let $restart_noprint=1; let $shutdown_timeout=0; --source include/restart_mysqld.inc -let $restart_noprint=0; -let $restart_parameters=; ---source include/restart_mysqld.inc SHOW CREATE TABLE t1; DROP TABLE t1; diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc index 66b763fdf2e..ec38ac960a2 100644 --- a/storage/innobase/log/log0recv.cc +++ b/storage/innobase/log/log0recv.cc @@ -3680,25 +3680,61 @@ inline fil_space_t *fil_system_t::find(const char *path) const /** Thread-safe function which sorts flush_list by oldest_modification */ static void log_sort_flush_list() { +#if 0 + /* Ensure that oldest_modification() cannot change during std::sort() */ + for (;;) + { + os_aio_wait_until_no_pending_writes(false); + mysql_mutex_lock(&buf_pool.flush_list_mutex); + if (!buf_pool.flush_list_active()) + break; + my_cond_wait(&buf_pool.done_flush_list, + &buf_pool.flush_list_mutex.m_mutex); + mysql_mutex_unlock(&buf_pool.flush_list_mutex); + } +#else mysql_mutex_lock(&buf_pool.flush_list_mutex); +#endif const size_t size= UT_LIST_GET_LEN(buf_pool.flush_list); std::unique_ptr<buf_page_t *[]> list(new buf_page_t *[size]); + /* Copy the dirty blocks from buf_pool.flush_list to an array for sorting. */ size_t idx= 0; +#if 1 + for (buf_page_t *p= UT_LIST_GET_FIRST(buf_pool.flush_list); p; ) + { + const lsn_t lsn{p->oldest_modification()}; + ut_ad(lsn > 2 || lsn == 1); + buf_page_t *n= UT_LIST_GET_NEXT(list, p); + if (lsn > 1) + list.get()[idx++]= p; + else + buf_pool.delete_from_flush_list(p); + p= n; + } +#else for (buf_page_t *p= UT_LIST_GET_FIRST(buf_pool.flush_list); p; p= UT_LIST_GET_NEXT(list, p)) list.get()[idx++]= p; +#endif - std::sort(list.get(), list.get() + size, + sql_print_information("sorting %zu,%zu", idx, size); + + std::sort(list.get(), list.get() + idx, [](const buf_page_t *lhs, const buf_page_t *rhs) { - return rhs->oldest_modification() < lhs->oldest_modification(); + lsn_t l{lhs->oldest_modification()},r{rhs->oldest_modification()}; + ut_a(l > 2); ut_a(r > 2); + return r < l; }); UT_LIST_INIT(buf_pool.flush_list, &buf_page_t::list); - for (size_t i= 0; i < size; i++) + for (size_t i= 0; i < idx; i++) + { + ut_a(list[i]->oldest_modification() > 2); UT_LIST_ADD_LAST(buf_pool.flush_list, list[i]); + } mysql_mutex_unlock(&buf_pool.flush_list_mutex); } The #if 0 code section above would be the attempted fix. The rest of the code patch is trying to catch a situation where a buf_page_t::oldest_modification_ is being changed to the dummy value 1 while the routine is running. The test has been changed to not reduce the size of the buffer pool, so that there will be more pages in buf_pool.flush_list to sort. I have been unsuccessful so far. No matter how low values of innodb_max_dirty_pages_pct I try, the empty output of the following suggests that no concurrent page writes are taking place: . /mtr --parallel=100 --repeat=100 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,} grep -w sorting var/* /log/mysqld .1.err| grep - v '\([0-9][0-9]*\),\1$' I observe see the following size=buf_pool.flush_list.count values during various phases of the test: 0, 283 to 291, 314 to 335, 359, 420, 461 to 466.

            I got a failure with unmodified code and test, which might be due to this suspected race condition:

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93

            innodb.recovery_memory 'innodb,release'  w49 [ 49 fail ]  Found warnings/errors in server log file!
            …
            2023-07-27  9:54:36 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=260087921
            2023-07-27  9:54:36 0 [Note] InnoDB: Multi-batch recovery needed at LSN 260979042
            2023-07-27  9:54:36 0 [Note] InnoDB: End of log at LSN=263778199
            2023-07-27  9:54:36 0 [Note] InnoDB: To recover: LSN 260979042/263778199; 68 pages
            2023-07-27  9:54:36 0 [Note] InnoDB: To recover: LSN 261905455/263778199; 54 pages
            2023-07-27  9:54:36 0 [Note] InnoDB: To recover: LSN 262802915/263778199; 54 pages
            2023-07-27  9:54:36 0 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=1929773056, page number=3179216896], should be [page id: space=1, page number=629]
            2023-07-27  9:54:36 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
            2023-07-27  9:54:36 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=1, page number=629]
            2023-07-27  9:54:36 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
            

            The corruption was reported during the second kill+restart of the test. That recovery did not reach the final batch yet, but the recovery for the previous kill+restart cycle must have executed log_sort_flush_list(), which could have corrupted buf_pool.flush_list and caused the error.

            CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB;
            INSERT INTO t1 SELECT * FROM seq_1_to_65536;
            let $restart_noprint=1;
            let $shutdown_timeout=0;
            --source include/restart_mysqld.inc
            

            I will run this a couple more times to assess the probability of the failure, and then try to see if it goes away after I apply the attempted fix.

            marko Marko Mäkelä added a comment - I got a failure with unmodified code and test, which might be due to this suspected race condition: 11.2 e81fa345020ec6a067583db6a7019d6404b26f93 innodb.recovery_memory 'innodb,release' w49 [ 49 fail ] Found warnings/errors in server log file! … 2023-07-27 9:54:36 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=260087921 2023-07-27 9:54:36 0 [Note] InnoDB: Multi-batch recovery needed at LSN 260979042 2023-07-27 9:54:36 0 [Note] InnoDB: End of log at LSN=263778199 2023-07-27 9:54:36 0 [Note] InnoDB: To recover: LSN 260979042/263778199; 68 pages 2023-07-27 9:54:36 0 [Note] InnoDB: To recover: LSN 261905455/263778199; 54 pages 2023-07-27 9:54:36 0 [Note] InnoDB: To recover: LSN 262802915/263778199; 54 pages 2023-07-27 9:54:36 0 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=1929773056, page number=3179216896], should be [page id: space=1, page number=629] 2023-07-27 9:54:36 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages. 2023-07-27 9:54:36 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=1, page number=629] 2023-07-27 9:54:36 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages. The corruption was reported during the second kill+restart of the test. That recovery did not reach the final batch yet, but the recovery for the previous kill+restart cycle must have executed log_sort_flush_list() , which could have corrupted buf_pool.flush_list and caused the error. CREATE TABLE t1(f1 INT NOT NULL )ENGINE=InnoDB; INSERT INTO t1 SELECT * FROM seq_1_to_65536; let $restart_noprint=1; let $shutdown_timeout=0; --source include/restart_mysqld.inc I will run this a couple more times to assess the probability of the failure, and then try to see if it goes away after I apply the attempted fix.

            Another failure with the unmodified code and test:

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93

            innodb.recovery_memory 'innodb,release'  w89 [ 39 fail ]
                    Test ended at 2023-07-27 10:20:23
             
            CURRENT_TEST: innodb.recovery_memory
            mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
            …
            2023-07-27 10:20:23 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=207549525
            2023-07-27 10:20:23 0 [Note] InnoDB: End of log at LSN=211885217
            2023-07-27 10:20:23 0 [Note] InnoDB: To recover: 493 pages
            2023-07-27 10:20:23 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=160, page number=77]
            2023-07-27 10:20:23 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
            

            This occurred on the last kill+restart of the test, during the only recovery batch, before executing log_sort_flush_list(). That function would have been executed 2 times by the 2 preceding kill+restart steps.

            marko Marko Mäkelä added a comment - Another failure with the unmodified code and test: 11.2 e81fa345020ec6a067583db6a7019d6404b26f93 innodb.recovery_memory 'innodb,release' w89 [ 39 fail ] Test ended at 2023-07-27 10:20:23   CURRENT_TEST: innodb.recovery_memory mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB' … 2023-07-27 10:20:23 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=207549525 2023-07-27 10:20:23 0 [Note] InnoDB: End of log at LSN=211885217 2023-07-27 10:20:23 0 [Note] InnoDB: To recover: 493 pages 2023-07-27 10:20:23 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=160, page number=77] 2023-07-27 10:20:23 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption. This occurred on the last kill+restart of the test, during the only recovery batch, before executing log_sort_flush_list() . That function would have been executed 2 times by the 2 preceding kill+restart steps.

            A third run of the unmodified code and test failed to fail:

            ./mtr --parallel=100 --repeat=100 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,}
            

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93

            The servers were restarted 0 times
            Spent 160474.624 of 1667 seconds executing testcases
             
            Completed: All 10000 tests were successful.
            

            I will keep running the test to get an idea of failure probability.

            marko Marko Mäkelä added a comment - A third run of the unmodified code and test failed to fail: . /mtr --parallel=100 --repeat=100 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,} 11.2 e81fa345020ec6a067583db6a7019d6404b26f93 The servers were restarted 0 times Spent 160474.624 of 1667 seconds executing testcases   Completed: All 10000 tests were successful. I will keep running the test to get an idea of failure probability.

            No failure on 4th attempt either:

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93

            The servers were restarted 0 times
            Spent 158546.192 of 1645 seconds executing testcases
             
            Completed: All 10000 tests were successful.
            

            marko Marko Mäkelä added a comment - No failure on 4th attempt either: 11.2 e81fa345020ec6a067583db6a7019d6404b26f93 The servers were restarted 0 times Spent 158546.192 of 1645 seconds executing testcases   Completed: All 10000 tests were successful.

            Finally, it failed near the start of a campaign:

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93

            innodb.recovery_memory 'innodb,release'  w90 [ 5 fail ]
                    Test ended at 2023-07-27 11:36:02
             
            CURRENT_TEST: innodb.recovery_memory
            mysqltest: At line 42: query 'CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
            …
            2023-07-27 11:36:01 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=19716180
            2023-07-27 11:36:01 0 [Note] InnoDB: Multi-batch recovery needed at LSN 20464797
            2023-07-27 11:36:01 0 [Note] InnoDB: End of log at LSN=21565225
            2023-07-27 11:36:01 0 [Note] InnoDB: To recover: LSN 20624857/21565225; 281 pages
            2023-07-27 11:36:01 0 [Note] InnoDB: To recover: LSN 21116161/21565225; 339 pages
            2023-07-27 11:36:01 0 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=4294967295, page number=4294967295], should be [page id: space=2, page number=210]
            

            This failed on the first kill+restart attempt of the test. It is a little hard to believe that this corruption of an undo log page could have been caused by an earlier successfully completed kill+restart step, because in order for there to be redo log for the undo log page, the page should have been successfully loaded into the buffer pool of the server instance that wrote the redo log record for it. But, perhaps that previous server instance had somehow skipped the page write because it had mis-sorted the buf_pool.flush_list?
            Another failure:

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93

            innodb.recovery_memory 'innodb,release'  w21 [ 20 fail ]
                    Test ended at 2023-07-27 11:45:54
             
            CURRENT_TEST: innodb.recovery_memory
            mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
            …
            2023-07-27 11:45:53 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=101087382
            2023-07-27 11:45:53 0 [Note] InnoDB: End of log at LSN=105059354
            2023-07-27 11:45:53 0 [Note] InnoDB: To recover: 246 pages
            2023-07-27 11:45:53 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=84, page number=145]
            

            This occurred on the third kill+restart, just like the 2nd failure.

            The summary so far is that 4 out of 6 test runs failed, but none with the originally observed SIGSEGV crash. Next, I will try applying the fix.

            marko Marko Mäkelä added a comment - Finally, it failed near the start of a campaign: 11.2 e81fa345020ec6a067583db6a7019d6404b26f93 innodb.recovery_memory 'innodb,release' w90 [ 5 fail ] Test ended at 2023-07-27 11:36:02   CURRENT_TEST: innodb.recovery_memory mysqltest: At line 42: query 'CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB' … 2023-07-27 11:36:01 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=19716180 2023-07-27 11:36:01 0 [Note] InnoDB: Multi-batch recovery needed at LSN 20464797 2023-07-27 11:36:01 0 [Note] InnoDB: End of log at LSN=21565225 2023-07-27 11:36:01 0 [Note] InnoDB: To recover: LSN 20624857/21565225; 281 pages 2023-07-27 11:36:01 0 [Note] InnoDB: To recover: LSN 21116161/21565225; 339 pages 2023-07-27 11:36:01 0 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=4294967295, page number=4294967295], should be [page id: space=2, page number=210] This failed on the first kill+restart attempt of the test. It is a little hard to believe that this corruption of an undo log page could have been caused by an earlier successfully completed kill+restart step, because in order for there to be redo log for the undo log page, the page should have been successfully loaded into the buffer pool of the server instance that wrote the redo log record for it. But, perhaps that previous server instance had somehow skipped the page write because it had mis-sorted the buf_pool.flush_list ? Another failure: 11.2 e81fa345020ec6a067583db6a7019d6404b26f93 innodb.recovery_memory 'innodb,release' w21 [ 20 fail ] Test ended at 2023-07-27 11:45:54   CURRENT_TEST: innodb.recovery_memory mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB' … 2023-07-27 11:45:53 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=101087382 2023-07-27 11:45:53 0 [Note] InnoDB: End of log at LSN=105059354 2023-07-27 11:45:53 0 [Note] InnoDB: To recover: 246 pages 2023-07-27 11:45:53 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=84, page number=145] This occurred on the third kill+restart, just like the 2nd failure. The summary so far is that 4 out of 6 test runs failed, but none with the originally observed SIGSEGV crash. Next, I will try applying the fix.

            Now I got some proof of the suspected race condition:

            11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patch

            innodb.recovery_memory 'innodb,release'  w75 [ 9 fail ]
            …
            InnoDB: Failing assertion: list[i]->oldest_modification() > 2
            

            The patch was as follows:

            diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
            index 66b763fdf2e..14590476511 100644
            --- a/storage/innobase/log/log0recv.cc
            +++ b/storage/innobase/log/log0recv.cc
            @@ -3680,25 +3680,51 @@ inline fil_space_t *fil_system_t::find(const char *path) const
             /** Thread-safe function which sorts flush_list by oldest_modification */
             static void log_sort_flush_list()
             {
            -  mysql_mutex_lock(&buf_pool.flush_list_mutex);
            +  /* Ensure that oldest_modification() cannot change during std::sort() */
            +  for (;;)
            +  {
            +    os_aio_wait_until_no_pending_writes(false);
            +    mysql_mutex_lock(&buf_pool.flush_list_mutex);
            +    if (!buf_pool.flush_list_active())
            +      break;
            +    my_cond_wait(&buf_pool.done_flush_list,
            +                 &buf_pool.flush_list_mutex.m_mutex);
            +    mysql_mutex_unlock(&buf_pool.flush_list_mutex);
            +  }
             
               const size_t size= UT_LIST_GET_LEN(buf_pool.flush_list);
               std::unique_ptr<buf_page_t *[]> list(new buf_page_t *[size]);
             
            +  /* Copy the dirty blocks from buf_pool.flush_list to an array for sorting. */
               size_t idx= 0;
            -  for (buf_page_t *p= UT_LIST_GET_FIRST(buf_pool.flush_list); p;
            -       p= UT_LIST_GET_NEXT(list, p))
            -    list.get()[idx++]= p;
            +  for (buf_page_t *p= UT_LIST_GET_FIRST(buf_pool.flush_list); p; )
            +  {
            +    const lsn_t lsn{p->oldest_modification()};
            +    ut_ad(lsn > 2 || lsn == 1);
            +    buf_page_t *n= UT_LIST_GET_NEXT(list, p);
            +    if (lsn > 1)
            +      list.get()[idx++]= p;
            +    else
            +      buf_pool.delete_from_flush_list(p);
            +    p= n;
            +  }
            +
            +  sql_print_information("sorting %zu,%zu", idx, size);
             
            -  std::sort(list.get(), list.get() + size,
            +  std::sort(list.get(), list.get() + idx,
                         [](const buf_page_t *lhs, const buf_page_t *rhs) {
            -              return rhs->oldest_modification() < lhs->oldest_modification();
            +              lsn_t l{lhs->oldest_modification()},r{rhs->oldest_modification()};
            +              ut_a(l > 2); ut_a(r > 2);
            +              return r < l;
                         });
             
               UT_LIST_INIT(buf_pool.flush_list, &buf_page_t::list);
             
            -  for (size_t i= 0; i < size; i++)
            +  for (size_t i= 0; i < idx; i++)
            +  {
            +    ut_a(list[i]->oldest_modification() > 2);
                 UT_LIST_ADD_LAST(buf_pool.flush_list, list[i]);
            +  }
             
               mysql_mutex_unlock(&buf_pool.flush_list_mutex);
             }
            

            Unfortunately, all variables are optimized out in the core dump. I suspect that we had oldest_modification()==1. I will move the assertion after the UT_LIST_ADD_LAST so that I can find the entry that way. The assertion failed on the very first sorted element.

            marko Marko Mäkelä added a comment - Now I got some proof of the suspected race condition: 11.2 e81fa345020ec6a067583db6a7019d6404b26f93 with patch innodb.recovery_memory 'innodb,release' w75 [ 9 fail ] … InnoDB: Failing assertion: list[i]->oldest_modification() > 2 The patch was as follows: diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc index 66b763fdf2e..14590476511 100644 --- a/storage/innobase/log/log0recv.cc +++ b/storage/innobase/log/log0recv.cc @@ -3680,25 +3680,51 @@ inline fil_space_t *fil_system_t::find(const char *path) const /** Thread-safe function which sorts flush_list by oldest_modification */ static void log_sort_flush_list() { - mysql_mutex_lock(&buf_pool.flush_list_mutex); + /* Ensure that oldest_modification() cannot change during std::sort() */ + for (;;) + { + os_aio_wait_until_no_pending_writes(false); + mysql_mutex_lock(&buf_pool.flush_list_mutex); + if (!buf_pool.flush_list_active()) + break; + my_cond_wait(&buf_pool.done_flush_list, + &buf_pool.flush_list_mutex.m_mutex); + mysql_mutex_unlock(&buf_pool.flush_list_mutex); + } const size_t size= UT_LIST_GET_LEN(buf_pool.flush_list); std::unique_ptr<buf_page_t *[]> list(new buf_page_t *[size]); + /* Copy the dirty blocks from buf_pool.flush_list to an array for sorting. */ size_t idx= 0; - for (buf_page_t *p= UT_LIST_GET_FIRST(buf_pool.flush_list); p; - p= UT_LIST_GET_NEXT(list, p)) - list.get()[idx++]= p; + for (buf_page_t *p= UT_LIST_GET_FIRST(buf_pool.flush_list); p; ) + { + const lsn_t lsn{p->oldest_modification()}; + ut_ad(lsn > 2 || lsn == 1); + buf_page_t *n= UT_LIST_GET_NEXT(list, p); + if (lsn > 1) + list.get()[idx++]= p; + else + buf_pool.delete_from_flush_list(p); + p= n; + } + + sql_print_information("sorting %zu,%zu", idx, size); - std::sort(list.get(), list.get() + size, + std::sort(list.get(), list.get() + idx, [](const buf_page_t *lhs, const buf_page_t *rhs) { - return rhs->oldest_modification() < lhs->oldest_modification(); + lsn_t l{lhs->oldest_modification()},r{rhs->oldest_modification()}; + ut_a(l > 2); ut_a(r > 2); + return r < l; }); UT_LIST_INIT(buf_pool.flush_list, &buf_page_t::list); - for (size_t i= 0; i < size; i++) + for (size_t i= 0; i < idx; i++) + { + ut_a(list[i]->oldest_modification() > 2); UT_LIST_ADD_LAST(buf_pool.flush_list, list[i]); + } mysql_mutex_unlock(&buf_pool.flush_list_mutex); } Unfortunately, all variables are optimized out in the core dump. I suspect that we had oldest_modification()==1 . I will move the assertion after the UT_LIST_ADD_LAST so that I can find the entry that way. The assertion failed on the very first sorted element.

            Another failure with the above buggy patch:

            CURRENT_TEST: innodb.recovery_memory
            mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
            …
            2023-07-27 12:34:36 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=311491401
            2023-07-27 12:34:36 0 [Note] InnoDB: End of log at LSN=315636465
            2023-07-27 12:34:36 0 [Note] InnoDB: To recover: 279 pages
            2023-07-27 12:34:36 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=239, page number=110]
            2023-07-27 12:34:36 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
            

            Hopefully this revised wait loop does the trick and ensures that no page writes are in progress while we sort the buf_pool.flush_list:

              for (;;)
              {
                os_aio_wait_until_no_pending_writes(false);
                mysql_mutex_lock(&buf_pool.flush_list_mutex);
                if (buf_pool.flush_list_active())
                  my_cond_wait(&buf_pool.done_flush_list,
                               &buf_pool.flush_list_mutex.m_mutex);
                else if (!os_aio_pending_writes())
                  break;
                mysql_mutex_unlock(&buf_pool.flush_list_mutex);
              }
            

            marko Marko Mäkelä added a comment - Another failure with the above buggy patch: CURRENT_TEST: innodb.recovery_memory mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB' … 2023-07-27 12:34:36 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=311491401 2023-07-27 12:34:36 0 [Note] InnoDB: End of log at LSN=315636465 2023-07-27 12:34:36 0 [Note] InnoDB: To recover: 279 pages 2023-07-27 12:34:36 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=239, page number=110] 2023-07-27 12:34:36 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption. Hopefully this revised wait loop does the trick and ensures that no page writes are in progress while we sort the buf_pool.flush_list : for (;;) { os_aio_wait_until_no_pending_writes( false ); mysql_mutex_lock(&buf_pool.flush_list_mutex); if (buf_pool.flush_list_active()) my_cond_wait(&buf_pool.done_flush_list, &buf_pool.flush_list_mutex.m_mutex); else if (!os_aio_pending_writes()) break ; mysql_mutex_unlock(&buf_pool.flush_list_mutex); }

            Unfortunately, also with the revised patch I got a failure:

            innodb.recovery_memory 'innodb,release'  w59 [ 53 fail ]
                    Test ended at 2023-07-27 12:52:00
             
            CURRENT_TEST: innodb.recovery_memory
            mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
            …
            2023-07-27 12:51:59 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=280364554
            2023-07-27 12:51:59 0 [Note] InnoDB: End of log at LSN=284055215
            2023-07-27 12:51:59 0 [Note] InnoDB: To recover: 198 pages
            2023-07-27 12:51:59 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=218, page number=113]
            2023-07-27 12:51:59 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.
            2023-07-27 12:51:59 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages.
            2023-07-27 12:51:59 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=218, page number=113]
            2023-07-27 12:51:59 0 [Note] sorting 194,194
            

            The kill+restart before this failure processed much fewer pages in the last batch:

            2023-07-27 12:51:55 0 [Note] sorting 11,11
            

            This means that the test is reproducing two independent bugs. I started one more campaign, hoping to produce an rr replay trace:

            while ./mtr --rr=-h --parallel=60 innodb.recovery_memory{,,,,,,,,,}{,,,,,}; do :; done
            

            marko Marko Mäkelä added a comment - Unfortunately, also with the revised patch I got a failure: innodb.recovery_memory 'innodb,release' w59 [ 53 fail ] Test ended at 2023-07-27 12:52:00   CURRENT_TEST: innodb.recovery_memory mysqltest: At line 50: query 'SHOW CREATE TABLE t1' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB' … 2023-07-27 12:51:59 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=280364554 2023-07-27 12:51:59 0 [Note] InnoDB: End of log at LSN=284055215 2023-07-27 12:51:59 0 [Note] InnoDB: To recover: 198 pages 2023-07-27 12:51:59 0 [ERROR] InnoDB: Not applying INSERT_HEAP_DYNAMIC due to corruption on [page id: space=218, page number=113] 2023-07-27 12:51:59 0 [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption. 2023-07-27 12:51:59 0 [Note] InnoDB: Set innodb_force_recovery=1 to ignore corrupted pages. 2023-07-27 12:51:59 0 [ERROR] InnoDB: Unable to apply log to corrupted page [page id: space=218, page number=113] 2023-07-27 12:51:59 0 [Note] sorting 194,194 The kill+restart before this failure processed much fewer pages in the last batch: 2023-07-27 12:51:55 0 [Note] sorting 11,11 This means that the test is reproducing two independent bugs. I started one more campaign, hoping to produce an rr replay trace: while . /mtr --rr=-h --parallel=60 innodb.recovery_memory{,,,,,,,,,}{,,,,,}; do :; done

            I failed to reproduce this with rr so far. I got the idea to test with simulated asynchronous I/O:

            ./mtr --parallel=100 --repeat=100 --mysqld=--innodb-use-native-aio=0 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,}
            

            This did fail as well, so a bug like MDEV-29610 should not play a role.

            innodb.recovery_memory 'innodb,release'  w87 [ 62 fail ]
                    Test ended at 2023-07-27 17:32:06
             
            CURRENT_TEST: innodb.recovery_memory
            mysqltest: At line 42: query 'CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB'
            …
            2023-07-27 17:32:05 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=324014657
            2023-07-27 17:32:05 0 [Note] InnoDB: Multi-batch recovery needed at LSN 324529178
            2023-07-27 17:32:05 0 [Note] InnoDB: End of log at LSN=325255352
            2023-07-27 17:32:05 0 [Note] InnoDB: To recover: LSN 324637270/325255352; 287 pages
            2023-07-27 17:32:06 0 [Note] InnoDB: To recover: LSN 325135744/325255352; 383 pages
            2023-07-27 17:32:06 0 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=2398, page number=729743360], should be [page id: space=1, page number=556]
            

            marko Marko Mäkelä added a comment - I failed to reproduce this with rr so far. I got the idea to test with simulated asynchronous I/O: . /mtr --parallel=100 --repeat=100 --mysqld=--innodb-use-native-aio=0 innodb.recovery_memory{,,,,,,,,,}{,,,,,,,,,} This did fail as well, so a bug like MDEV-29610 should not play a role. innodb.recovery_memory 'innodb,release' w87 [ 62 fail ] Test ended at 2023-07-27 17:32:06   CURRENT_TEST: innodb.recovery_memory mysqltest: At line 42: query 'CREATE TABLE t1(f1 INT NOT NULL)ENGINE=InnoDB' failed: ER_UNKNOWN_STORAGE_ENGINE (1286): Unknown storage engine 'InnoDB' … 2023-07-27 17:32:05 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=324014657 2023-07-27 17:32:05 0 [Note] InnoDB: Multi-batch recovery needed at LSN 324529178 2023-07-27 17:32:05 0 [Note] InnoDB: End of log at LSN=325255352 2023-07-27 17:32:05 0 [Note] InnoDB: To recover: LSN 324637270/325255352; 287 pages 2023-07-27 17:32:06 0 [Note] InnoDB: To recover: LSN 325135744/325255352; 383 pages 2023-07-27 17:32:06 0 [ERROR] InnoDB: Space id and page no stored in the page, read in are [page id: space=2398, page number=729743360], should be [page id: space=1, page number=556]
            marko Marko Mäkelä made changes -

            I pushed a fix for the SIGSEGV issue (likely due to a data race with std::sort). The remaining failures will be addressed in MDEV-31791.

            marko Marko Mäkelä added a comment - I pushed a fix for the SIGSEGV issue (likely due to a data race with std::sort ). The remaining failures will be addressed in MDEV-31791 .
            marko Marko Mäkelä made changes -
            issue.field.resolutiondate 2023-07-28 10:20:43.0 2023-07-28 10:20:43.158
            marko Marko Mäkelä made changes -
            Fix Version/s 10.6.15 [ 29013 ]
            Fix Version/s 10.9.8 [ 29015 ]
            Fix Version/s 10.10.6 [ 29017 ]
            Fix Version/s 10.11.5 [ 29019 ]
            Fix Version/s 11.0.3 [ 28920 ]
            Fix Version/s 11.1.2 [ 28921 ]
            Fix Version/s 11.2.1 [ 29034 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.9 [ 26905 ]
            Fix Version/s 10.10 [ 27530 ]
            Fix Version/s 10.11 [ 27614 ]
            Fix Version/s 11.0 [ 28320 ]
            Fix Version/s 11.1 [ 28549 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]

            mleich was able to reproduce this in a branch where the fix had not been merged yet:

            ssh pluto
            rr replay /data/results/1690831257/TBR-2024/1/rr/latest-trace/
            

            break std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >
            continue
            break buf_page_write_complete
            ignore 2 1000
            continue
            info breakpoints
            

            Thread 1 hit Breakpoint 1, std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > (__last=0x5581033f8720, __first=0x5581033f8670, __comp=...)
                at /usr/include/c++/9/bits/stl_algo.h:4899
            4899	      std::__sort(__first, __last, __gnu_cxx::__ops::__iter_comp_iter(__comp));
            (rr) break buf_page_write_complete
            Breakpoint 2 at 0x5581000b83d1: buf_page_write_complete. (2 locations)
            (rr) ign 2 10000
            Will ignore next 10000 crossings of breakpoint 2.
            (rr) continue
            Continuing.
             
            Thread 1 received signal SIGSEGV, Segmentation fault.
            0x00005581006f1b82 in std::__atomic_base<unsigned long>::load (__m=std::memory_order_seq_cst, this=0xd1) at /usr/include/c++/9/bits/atomic_base.h:413
            413	      load(memory_order __m = memory_order_seq_cst) const noexcept
            (rr) info breakpoints
            Num     Type           Disp Enb Address            What
            1       breakpoint     keep y   0x00005581006f4a55 in std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > at /usr/include/c++/9/bits/stl_algo.h:1962
            	breakpoint already hit 1 time
            2       breakpoint     keep y   <MULTIPLE>         
            	breakpoint already hit 22 times
            	ignore next 9978 hits
            2.1                         y   0x00005581000b83d1 in buf_page_write_complete(IORequest const&) at /data/Server/bb-11.2-MDEV-14795_1E/storage/innobase/buf/buf0flu.cc:337
            2.2                         y   0x00005581007cf270 in buf_page_write_complete(IORequest const&) at /data/Server/bb-11.2-MDEV-14795_1E/storage/innobase/buf/buf0flu.cc:321
            

            I observed one page for which oldest_modification_ was modified from 56630890 to 1 while the std::sort() was in progress.

            marko Marko Mäkelä added a comment - mleich was able to reproduce this in a branch where the fix had not been merged yet: ssh pluto rr replay /data/results/1690831257/TBR-2024/1/rr/latest-trace/ break std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > continue break buf_page_write_complete ignore 2 1000 continue info breakpoints Thread 1 hit Breakpoint 1, std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > (__last=0x5581033f8720, __first=0x5581033f8670, __comp=...) at /usr/include/c++/9/bits/stl_algo.h:4899 4899 std::__sort(__first, __last, __gnu_cxx::__ops::__iter_comp_iter(__comp)); (rr) break buf_page_write_complete Breakpoint 2 at 0x5581000b83d1: buf_page_write_complete. (2 locations) (rr) ign 2 10000 Will ignore next 10000 crossings of breakpoint 2. (rr) continue Continuing.   Thread 1 received signal SIGSEGV, Segmentation fault. 0x00005581006f1b82 in std::__atomic_base<unsigned long>::load (__m=std::memory_order_seq_cst, this=0xd1) at /usr/include/c++/9/bits/atomic_base.h:413 413 load(memory_order __m = memory_order_seq_cst) const noexcept (rr) info breakpoints Num Type Disp Enb Address What 1 breakpoint keep y 0x00005581006f4a55 in std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > at /usr/include/c++/9/bits/stl_algo.h:1962 breakpoint already hit 1 time 2 breakpoint keep y <MULTIPLE> breakpoint already hit 22 times ignore next 9978 hits 2.1 y 0x00005581000b83d1 in buf_page_write_complete(IORequest const&) at /data/Server/bb-11.2-MDEV-14795_1E/storage/innobase/buf/buf0flu.cc:337 2.2 y 0x00005581007cf270 in buf_page_write_complete(IORequest const&) at /data/Server/bb-11.2-MDEV-14795_1E/storage/innobase/buf/buf0flu.cc:321 I observed one page for which oldest_modification_ was modified from 56630890 to 1 while the std::sort() was in progress.
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              1 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.