[MDEV-32029] Assertion failures in log_sort_flush_list upon crash recovery Created: 2023-08-28  Updated: 2023-10-03  Resolved: 2023-08-31

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6, 10.10, 10.11, 11.0, 11.1, 11.2
Fix Version/s: 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3, 11.2.2, 11.3.0

Type: Bug Priority: Critical
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: race, recovery, rr-profile-analysed

Issue Links:
Problem/Incident
is caused by MDEV-26827 Make page flushing even faster Closed
Relates
relates to MDEV-25113 Reduce effect of parallel background ... Closed
relates to MDEV-29911 InnoDB recovery and mariadb-backup --... Closed
relates to MDEV-31354 SIGSEGV in log_sort_flush_list() in I... Closed
relates to MDEV-32042 Special handling of crash recovery in... Closed
relates to MDEV-32134 InnoDB hang in buf_flush_wait_LRU_bat... Closed

 Description   

The failure happens upon crash recovery.

2023-08-28 14:46:25 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54805
2023-08-28 14:46:26 0 [Note] InnoDB: End of log at LSN=30467662
2023-08-28 14:46:26 0 [Note] InnoDB: To recover: 6392 pages
mysqld: /data/MDEV-32029/11.2/storage/innobase/log/log0recv.cc:3721: log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)>: Assertion `l > 2' failed.
230828 14:46:29 [ERROR] mysqld got signal 6 ;

11.2 9ad7c899ac51ee7959f312a84402bb2082fa5e56

(rr) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00001f1f49180859 in __GI_abort () at abort.c:79
#2  0x00001f1f49180729 in __assert_fail_base (fmt=0x1f1f49316588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x56341f5358a0 "l > 2", 
    file=0x56341f52fd80 "/data/MDEV-32029/11.2/storage/innobase/log/log0recv.cc", line=3721, function=<optimized out>) at assert.c:92
#3  0x00001f1f49191fd6 in __GI___assert_fail (assertion=0x56341f5358a0 "l > 2", file=0x56341f52fd80 "/data/MDEV-32029/11.2/storage/innobase/log/log0recv.cc", line=3721, 
    function=0x56341f5358e0 "log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)>") at assert.c:101
#4  0x000056341df7de41 in <lambda(const buf_page_t*, const buf_page_t*)>::operator()(const buf_page_t *, const buf_page_t *) const (__closure=0x7ffc83eaa110, lhs=0x35a95e269c20, rhs=0x4f9c5ff60a30)
    at /data/MDEV-32029/11.2/storage/innobase/log/log0recv.cc:3721
#5  0x000056341df8b3e2 in __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >::operator()<buf_page_t**, buf_page_t**>(buf_page_t **, buf_page_t **) (
    this=0x7ffc83eaa110, __it1=0x62f000000408, __it2=0x62f0000046d8) at /usr/include/c++/9/bits/predefined_ops.h:143
#6  0x000056341df8b044 in std::__move_median_to_first<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > >(buf_page_t **, buf_page_t **, buf_page_t **, buf_page_t **, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >) (__result=0x62f000000400, __a=0x62f000000408, __b=0x62f0000046d8, 
    __c=0x62f0000089a8, __comp=...) at /usr/include/c++/9/bits/stl_algo.h:81
#7  0x000056341df8a84e in std::__unguarded_partition_pivot<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > >(buf_page_t **, buf_page_t **, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >) (__first=0x62f000000400, __last=0x62f0000089b0, 
    __comp=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>) at /usr/include/c++/9/bits/stl_algo.h:1926
#8  0x000056341df89c20 in std::__introsort_loop<buf_page_t**, long int, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > >(buf_page_t **, buf_page_t **, long, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >) (__first=0x62f000000400, __last=0x62f0000089b0, __depth_limit=23, 
    __comp=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>) at /usr/include/c++/9/bits/stl_algo.h:1958
#9  0x000056341df88280 in std::__sort<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > >(buf_page_t **, buf_page_t **, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >) (__first=0x62f000000400, __last=0x62f0000089b0, 
    __comp=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>) at /usr/include/c++/9/bits/stl_algo.h:1973
#10 0x000056341df871da in std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >(buf_page_t **, buf_page_t **, <lambda(const buf_page_t*, const buf_page_t*)>) (
    __first=0x62f000000400, __last=0x62f0000089b0, __comp=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>) at /usr/include/c++/9/bits/stl_algo.h:4905
#11 0x000056341df7e273 in log_sort_flush_list () at /data/MDEV-32029/11.2/storage/innobase/log/log0recv.cc:3717
#12 0x000056341df7fe69 in recv_sys_t::apply (this=0x56342061aa40 <recv_sys>, last_batch=true) at /data/MDEV-32029/11.2/storage/innobase/log/log0recv.cc:3881
#13 0x000056341e22fc49 in srv_start (create_new_db=false) at /data/MDEV-32029/11.2/storage/innobase/srv/srv0start.cc:1577
#14 0x000056341ddb876c in innodb_init (p=0x615000002618) at /data/MDEV-32029/11.2/storage/innobase/handler/ha_innodb.cc:4171
#15 0x000056341d5f8c15 in ha_initialize_handlerton (plugin=0x6210000c1f00) at /data/MDEV-32029/11.2/sql/handler.cc:693
#16 0x000056341cdc06e3 in plugin_do_initialize (plugin=0x6210000c1f00, state=@0x7ffc83ead2c0: 4) at /data/MDEV-32029/11.2/sql/sql_plugin.cc:1453
#17 0x000056341cdc0ed3 in plugin_initialize (tmp_root=0x7ffc83ead9d0, plugin=0x6210000c1f00, argc=0x563420b59720 <remaining_argc>, argv=0x616000000788, options_only=false)
    at /data/MDEV-32029/11.2/sql/sql_plugin.cc:1506
#18 0x000056341cdc2822 in plugin_init (argc=0x563420b59720 <remaining_argc>, argv=0x616000000788, flags=0) at /data/MDEV-32029/11.2/sql/sql_plugin.cc:1765
#19 0x000056341ca0f707 in init_server_components () at /data/MDEV-32029/11.2/sql/mysqld.cc:5271
#20 0x000056341ca11213 in mysqld_main (argc=30, argv=0x616000000788) at /data/MDEV-32029/11.2/sql/mysqld.cc:5900
#21 0x000056341c9ff1dd in main (argc=29, argv=0x7ffc83eb0528) at /data/MDEV-32029/11.2/sql/main.cc:34

Other observed apparently related failures are

11.1 e6ec2b3b

mysqld: /home/vsts/src/storage/innobase/log/log0recv.cc:3715: log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)>: Assertion `r > 2' failed.
230828  7:00:22 [ERROR] mysqld got signal 6 ;
 
#7  0x00007fdeeb4edfd6 in __GI___assert_fail (assertion=0x5584a80ddfc0 "r > 2", file=0x5584a80d8800 "/home/vsts/src/storage/innobase/log/log0recv.cc", line=3715, function=0x5584a80ddf40 "log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)>") at assert.c:101
#8  0x00005584a6afc56c in <lambda(const buf_page_t*, const buf_page_t*)>::operator()(const buf_page_t *, const buf_page_t *) const (__closure=0x7ffc55722770, lhs=0x7fded01aed30, rhs=0x7fded17ab6c0) at /home/vsts/src/storage/innobase/log/log0recv.cc:3715
#9  0x00005584a6b09386 in __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >::operator()<buf_page_t**, buf_page_t**>(buf_page_t **, buf_page_t **) (this=0x7ffc55722770, __it1=0x630000010400, __it2=0x630000019008) at /usr/include/c++/9/bits/predefined_ops.h:143
#10 0x00005584a6b0926b in std::__unguarded_partition<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > >(buf_page_t **, buf_page_t **, buf_page_t **, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >) (__first=0x630000017db8, __last=0x630000019008, __pivot=0x630000010400, __comp=...) at /usr/include/c++/9/bits/stl_algo.h:1910
#11 0x00005584a6b08816 in std::__unguarded_partition_pivot<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > >(buf_page_t **, buf_page_t **, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >) (__first=0x630000010400, __last=0x63000001c1c8, __comp=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>) at /usr/include/c++/9/bits/stl_algo.h:1928
#12 0x00005584a6b07bc0 in std::__introsort_loop<buf_page_t**, long int, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > >(buf_page_t **, buf_page_t **, long, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >) (__first=0x630000010400, __last=0x63000001c1c8, __depth_limit=23, __comp=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>) at /usr/include/c++/9/bits/stl_algo.h:1958
#13 0x00005584a6b06206 in std::__sort<buf_page_t**, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> > >(buf_page_t **, buf_page_t **, __gnu_cxx::__ops::_Iter_comp_iter<log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >) (__first=0x630000010400, __last=0x63000001c1c8, __comp=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>) at /usr/include/c++/9/bits/stl_algo.h:1973
#14 0x00005584a6b0510a in std::sort<buf_page_t**, log_sort_flush_list()::<lambda(const buf_page_t*, const buf_page_t*)> >(buf_page_t **, buf_page_t **, <lambda(const buf_page_t*, const buf_page_t*)>) (__first=0x630000010400, __last=0x63000001c1c8, __comp=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>) at /usr/include/c++/9/bits/stl_algo.h:4905
#15 0x00005584a6afc92f in log_sort_flush_list () at /home/vsts/src/storage/innobase/log/log0recv.cc:3711
#16 0x00005584a6afe009 in recv_sys_t::apply (this=0x5584a93e7240 <recv_sys>, last_batch=true) at /home/vsts/src/storage/innobase/log/log0recv.cc:3851
#17 0x00005584a6db14e6 in srv_start (create_new_db=false) at /home/vsts/src/storage/innobase/srv/srv0start.cc:1577
#18 0x00005584a6930386 in innodb_init (p=0x615000003298) at /home/vsts/src/storage/innobase/handler/ha_innodb.cc:4171
#19 0x00005584a603faf7 in ha_initialize_handlerton (plugin=0x6210000c0b00) at /home/vsts/src/sql/handler.cc:693
#20 0x00005584a58119f5 in plugin_do_initialize (plugin=0x6210000c0b00, state=@0x7ffc55725910: 4) at /home/vsts/src/sql/sql_plugin.cc:1453
#21 0x00005584a58121f1 in plugin_initialize (tmp_root=0x7ffc55726020, plugin=0x6210000c0b00, argc=0x5584a995f2e0 <remaining_argc>, argv=0x616000000a88, options_only=false) at /home/vsts/src/sql/sql_plugin.cc:1506
#22 0x00005584a5813c10 in plugin_init (argc=0x5584a995f2e0 <remaining_argc>, argv=0x616000000a88, flags=0) at /home/vsts/src/sql/sql_plugin.cc:1765
#23 0x00005584a54463fb in init_server_components () at /home/vsts/src/sql/mysqld.cc:5252
#24 0x00005584a5448352 in mysqld_main (argc=25, argv=0x616000000a88) at /home/vsts/src/sql/mysqld.cc:5881
#25 0x00005584a54320cd in main (argc=24, argv=0x7ffc55728b88) at /home/vsts/src/sql/main.cc:34

11.1 e6ec2b3b

mysqld: /home/vsts/src/storage/innobase/log/log0recv.cc:3724: void log_sort_flush_list(): Assertion `list[i]->oldest_modification() > 2' failed.
230828  5:52:31 [ERROR] mysqld got signal 6 ;
 
#7  0x00007fd9c9930fd6 in __GI___assert_fail (assertion=0x55598ad4f0c0 "list[i]->oldest_modification() > 2", file=0x55598ad49800 "/home/vsts/src/storage/innobase/log/log0recv.cc", line=3724, function=0x55598ad4f080 "void log_sort_flush_list()") at assert.c:101
#8  0x000055598976dbce in log_sort_flush_list () at /home/vsts/src/storage/innobase/log/log0recv.cc:3724
#9  0x000055598976f009 in recv_sys_t::apply (this=0x55598c058240 <recv_sys>, last_batch=true) at /home/vsts/src/storage/innobase/log/log0recv.cc:3851
#10 0x0000555989a224e6 in srv_start (create_new_db=false) at /home/vsts/src/storage/innobase/srv/srv0start.cc:1577
#11 0x00005559895a1386 in innodb_init (p=0x615000003298) at /home/vsts/src/storage/innobase/handler/ha_innodb.cc:4171
#12 0x0000555988cb0af7 in ha_initialize_handlerton (plugin=0x6210000c0b00) at /home/vsts/src/sql/handler.cc:693
#13 0x00005559884829f5 in plugin_do_initialize (plugin=0x6210000c0b00, state=@0x7fffd82af9d0: 4) at /home/vsts/src/sql/sql_plugin.cc:1453
#14 0x00005559884831f1 in plugin_initialize (tmp_root=0x7fffd82b0120, plugin=0x6210000c0b00, argc=0x55598c5d02e0 <remaining_argc>, argv=0x616000000a88, options_only=false) at /home/vsts/src/sql/sql_plugin.cc:1506
#15 0x0000555988484c10 in plugin_init (argc=0x55598c5d02e0 <remaining_argc>, argv=0x616000000a88, flags=0) at /home/vsts/src/sql/sql_plugin.cc:1765
#16 0x00005559880b73fb in init_server_components () at /home/vsts/src/sql/mysqld.cc:5252
#17 0x00005559880b9352 in mysqld_main (argc=26, argv=0x616000000a88) at /home/vsts/src/sql/mysqld.cc:5881
#18 0x00005559880a30cd in main (argc=25, argv=0x7fffd82b2c88) at /home/vsts/src/sql/main.cc:34



 Comments   
Comment by Marko Mäkelä [ 2023-08-28 ]

The debug assertions were added as part of the fix of MDEV-31354, to help catch a race condition during std::sort(). Apparently, we must try harder to ensure that there are no writes "in-flight".

Before MDEV-25113, buf_pool.flush_list was fully protected by buf_pool.flush_list_mutex. After that change, the buf_page_t::oldest_modification_ can be changed to 1 (denoting a clean persistent page) while not holding any mutex.

Comment by Marko Mäkelä [ 2023-08-29 ]

Starting with MDEV-26827, buf_flush_page_cleaner() may initiate writes of data pages without holding buf_pool.flush_list. In the rr replay trace that I analyzed, a page write is being initiated by buf_flush_page_cleaner() invoking buf_flush_LRU() after it had called buf_pool_t::page_cleaner_set_idle(). At the time this page write completed, log_sort_flush_list() was copying buf_pool.flush_list into an array for sorting it by buf_page_t::oldest_modification_.

I think that either log_sort_flush_list() must hold buf_pool.mutex in addition to buf_pool.flush_list_mutex, or the condition buf_pool.flush_list_active() needs to be replaced with something that is guaranteed to hold as long as buf_flush_page_cleaner() may be initiating writes.

Comment by Marko Mäkelä [ 2023-08-29 ]

In the rr replay trace, I see that the buf_do_flush_list_batch() in the buf_flush_page_cleaner() thread kept writing pages even though it had already exceeded innodb_io_capacity limit. There are two reasons to that: First, innodb_flush_neighbors=ON would ignore the innodb_io_capacity limit. Second, the innodb_io_capacity limit was being ignored if recv_recovery_is_on(). I believe that the latter condition was made redundant by MDEV-29911. The actual problem still remains that log_sort_flush_list() needs to ensure that the buf_flush_page_cleaner() thread is idle.

Comment by Marko Mäkelä [ 2023-08-29 ]

In the end, the innodb_io_capacity rate limit will remain ignored during crash recovery. We do have some calls to buf_page_get_gen() from srv_start() while the last recovery batch is running. This could be fixed later. This would allow us to rename the function buf_page_get_low() to buf_page_get_gen() and to remove some conditional branches from a busy function.

Comment by Marko Mäkelä [ 2023-08-30 ]

I filed MDEV-32042 for a future performance fix. I don’t think that it is feasible to implement that in MariaDB Server versions that support the change buffer.

Comment by Matthias Leich [ 2023-08-31 ]

origin/bb-10.6-MDEV-32029, bb-10.6-MDEV-32029 9d1466522ea92963ac6ca16b597392714280c9f1 2023-08-30T14:40:13+03:00
which contains also a fix for MDEV-30986 performed well in RQG testing.
It needs to be mentioned that my RQG test battery has an extreme low likelihood to replay MDEV-32029.
I replayed that only two times on some other development tree.

Generated at Thu Feb 08 10:28:16 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.