[MDEV-24709] Assertion !recv_no_ibuf_operations failed in ibuf_page_low() Created: 2021-01-27  Updated: 2021-09-30  Resolved: 2021-01-27

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.2.37, 10.3.28, 10.4.18, 10.5.9

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: CS0156770, corruption, rr-profile-analyzed

Issue Links:
Relates
relates to MDEV-24449 Corruption of system tablespace or la... Closed
relates to MDEV-9663 InnoDB assertion failure: *cursor->in... Closed

 Description   

In a 10.5 tree, we got ut_ad(!recv_no_ibuf_operations) failing during startup:

#3  0x00005cd70c7c1f36 in __GI___assert_fail (assertion=0x564aece01e00 "!recv_no_ibuf_operations", 
    file=0x564aece010e0 "/Server/bb-10.5-MDEV-24693/storage/innobase/ibuf/ibuf0ibuf.cc", line=968, 
    function=0x564aece01e40 "bool ibuf_page_low(page_id_t, ulint, bool, const char*, unsigned int, mtr_t*)") at assert.c:101
#4  0x0000564aeb550450 in ibuf_page_low (page_id=..., zip_size=0, x_latch=false, 
    file=0x564aece010e0 "/Server/bb-10.5-MDEV-24693/storage/innobase/ibuf/ibuf0ibuf.cc", line=4157, mtr=0x0)
    at /Server/bb-10.5-MDEV-24693/storage/innobase/ibuf/ibuf0ibuf.cc:968
#5  0x0000564aeba1f76a in buf_page_get_low (page_id=..., zip_size=0, rw_latch=2, guess=0x0, mode=10, 
    file=0x564aece010e0 "/Server/bb-10.5-MDEV-24693/storage/innobase/ibuf/ibuf0ibuf.cc", line=4157, mtr=0x7ffc791e47e0, err=0x7ffc791e46f0, 
    allow_ibuf_merge=false) at /Server/bb-10.5-MDEV-24693/storage/innobase/buf/buf0buf.cc:2946
#6  0x0000564aeba219ba in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=2, guess=0x0, mode=10, 
    file=0x564aece010e0 "/Server/bb-10.5-MDEV-24693/storage/innobase/ibuf/ibuf0ibuf.cc", line=4157, mtr=0x7ffc791e47e0, err=0x7ffc791e46f0, 
    allow_ibuf_merge=false) at /Server/bb-10.5-MDEV-24693/storage/innobase/buf/buf0buf.cc:3479
#7  0x0000564aeb54f489 in ibuf_bitmap_get_map_page_func (page_id=..., zip_size=0, 
    file=0x564aece010e0 "/Server/bb-10.5-MDEV-24693/storage/innobase/ibuf/ibuf0ibuf.cc", line=4157, mtr=0x7ffc791e47e0)
    at /Server/bb-10.5-MDEV-24693/storage/innobase/ibuf/ibuf0ibuf.cc:694
#8  0x0000564aeb561049 in ibuf_page_exists (id=..., zip_size=0) at /Server/bb-10.5-MDEV-24693/storage/innobase/ibuf/ibuf0ibuf.cc:4157
#9  0x0000564aeba216bf in buf_page_get_gen (page_id=..., zip_size=0, rw_latch=1, guess=0x0, mode=10, 
    file=0x564aed127ec0 "/Server/bb-10.5-MDEV-24693/storage/innobase/dict/dict0load.cc", line=3141, mtr=0x7ffc791e65b0, err=0x7ffc791e50d0, 
    allow_ibuf_merge=true) at /Server/bb-10.5-MDEV-24693/storage/innobase/buf/buf0buf.cc:3458
#10 0x0000564aeb9b0b07 in btr_cur_search_to_nth_level_func (index=0x6170000051a0, level=0, tuple=0x6140000058e0, mode=PAGE_CUR_GE, latch_mode=1, 
    cursor=0x7ffc791e6470, ahi_latch=0x0, file=0x564aed127ec0 "/Server/bb-10.5-MDEV-24693/storage/innobase/dict/dict0load.cc", line=3141, 
    mtr=0x7ffc791e65b0, autoinc=0) at /Server/bb-10.5-MDEV-24693/storage/innobase/btr/btr0cur.cc:1609
#11 0x0000564aeb9e6096 in btr_pcur_open_low (index=0x6170000051a0, level=0, tuple=0x6140000058e0, mode=PAGE_CUR_GE, latch_mode=1, 
    cursor=0x7ffc791e6470, file=0x564aed127ec0 "/Server/bb-10.5-MDEV-24693/storage/innobase/dict/dict0load.cc", line=3141, autoinc=0, 
    mtr=0x7ffc791e65b0) at /Server/bb-10.5-MDEV-24693/storage/innobase/include/btr0pcur.ic:441
#12 0x0000564aeb9ea853 in btr_pcur_open_on_user_rec_func (index=0x6170000051a0, tuple=0x6140000058e0, mode=PAGE_CUR_GE, latch_mode=1, 
    cursor=0x7ffc791e6470, file=0x564aed127ec0 "/Server/bb-10.5-MDEV-24693/storage/innobase/dict/dict0load.cc", line=3141, mtr=0x7ffc791e65b0)
    at /Server/bb-10.5-MDEV-24693/storage/innobase/btr/btr0pcur.cc:652
#13 0x0000564aebaebf09 in dict_load_table_on_id (table_id=1300, ignore_err=DICT_ERR_IGNORE_RECOVER_LOCK)
    at /Server/bb-10.5-MDEV-24693/storage/innobase/dict/dict0load.cc:3141
#12 0x0000564aeb9ea853 in btr_pcur_open_on_user_rec_func (index=0x6170000051a0, tuple=0x6140000058e0, mode=PAGE_CUR_GE, latch_mode=1, 
    cursor=0x7ffc791e6470, file=0x564aed127ec0 "/Server/bb-10.5-MDEV-24693/storage/innobase/dict/dict0load.cc", line=3141, mtr=0x7ffc791e65b0)
    at /Server/bb-10.5-MDEV-24693/storage/innobase/btr/btr0pcur.cc:652
#13 0x0000564aebaebf09 in dict_load_table_on_id (table_id=1300, ignore_err=DICT_ERR_IGNORE_RECOVER_LOCK)
    at /Server/bb-10.5-MDEV-24693/storage/innobase/dict/dict0load.cc:3141
#14 0x0000564aebaa929e in dict_table_open_on_id_low (table_id=1300, ignore_err=DICT_ERR_IGNORE_RECOVER_LOCK, cached_only=false)
--Type <RET> for more, q to quit, c to continue without paging--
    at /Server/bb-10.5-MDEV-24693/storage/innobase/dict/dict0dict.cc:222
#15 0x0000564aebaabcb6 in dict_table_open_on_id (table_id=1300, dict_locked=false, table_op=DICT_TABLE_OP_LOAD_TABLESPACE, thd=0x0, mdl=0x0)
    at /Server/bb-10.5-MDEV-24693/storage/innobase/dict/dict0dict.cc:948
#16 0x0000564aeb91ed52 in trx_resurrect_table_locks (trx=0x7daf620f69b8, undo=0x6120000277d8)
    at /Server/bb-10.5-MDEV-24693/storage/innobase/trx/trx0trx.cc:615
#17 0x0000564aeb91ff4c in trx_resurrect (undo=0x6120000277d8, rseg=0x613000006418, start_time=1611693453, start_time_micro=1079325810688, 
    rows_to_undo=0x7ffc791e78c0, is_old_insert=false) at /Server/bb-10.5-MDEV-24693/storage/innobase/trx/trx0trx.cc:711
#18 0x0000564aeb920673 in trx_lists_init_at_db_start () at /Server/bb-10.5-MDEV-24693/storage/innobase/trx/trx0trx.cc:772
#19 0x0000564aeb873542 in srv_start (create_new_db=false) at /Server/bb-10.5-MDEV-24693/storage/innobase/srv/srv0start.cc:1479
#20 0x0000564aeb3cd78d in innodb_init (p=0x615000002b18) at /Server/bb-10.5-MDEV-24693/storage/innobase/handler/ha_innodb.cc:3909

thiru pointed out that this is not a sign of a real problem, because at the time when recv_recovery_from_checkpoint_start() returned to this thread, the last-but-one recovery batches had already been completed and writes to the redo log had been enabled. The problem merely is that we were missing the opportunity to clear this flag:

diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index 95179ec2271..f8b2cb95a3e 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -4001,6 +4001,7 @@ recv_recovery_from_checkpoint_start(lsn_t flush_lsn)
 	mutex_enter(&recv_sys->mutex);
 
 	recv_sys->apply_log_recs = TRUE;
+	recv_no_ibuf_operations = is_mariabackup_restore_or_export();
 
 	mutex_exit(&recv_sys->mutex);
 

This omission may have caused some corruption (MDEV-9663), because we could wrongly skip some change buffer merges:

		if (!access_time && !recv_no_ibuf_operations) {
			ibuf_merge_or_delete_for_page(
				block, page_id, page_size);
		}



 Comments   
Comment by Marko Mäkelä [ 2021-01-27 ]

I am not fully convinced that this could cause any corruption of user tables. We are only directly accessing the secondary indexes of SYS_TABLES during the last recovery batch.

It might be the case that some secondary index pages of user tables that are read into the buffer pool by read-ahead will miss change buffer merge and thus become corrupted. We only ever reproduced the assertion failure.

Comment by Marko Mäkelä [ 2021-01-28 ]

This bug is similar to MDEV-24449, which to my understanding could not cause any corruption in 10.5. This one could.

Comment by Alice Sherepa [ 2021-02-02 ]

I am getting this asserion on current 10.5 (which does not contain the fix yet), during recovery after crash.

10.5 b5dab19efaf9dfc

mysqld:/10.5/storage/innobase/ibuf/ibuf0ibuf.cc:968: bool ibuf_page_low(page_id_t, ulint, bool, const char*, unsigned int, mtr_t*): Assertion `!recv_no_ibuf_operations' failed.
210202 11:55:11 [ERROR] mysqld got signal 6 ;
 
linux/raise.c:51(__GI_raise)[0x7fc2d15267bb]
stdlib/abort.c:81(__GI_abort)[0x7fc2d1511535]
intl/loadmsgcat.c:1177(_nl_load_domain)[0x7fc2d151140f]
??:0(__assert_fail)[0x7fc2d151f102]
ibuf/ibuf0ibuf.cc:969(ibuf_page_low(page_id_t, unsigned long, bool, char const*, unsigned int, mtr_t*))[0x563794f07121]
buf/buf0buf.cc:2946(buf_page_get_low(page_id_t, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*, bool))[0x5637953beaa1]
buf/buf0buf.cc:3479(buf_page_get_gen(page_id_t, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*, bool))[0x5637953c0c6b]
ibuf/ibuf0ibuf.cc:694(ibuf_bitmap_get_map_page_func(page_id_t, unsigned long, char const*, unsigned int, mtr_t*))[0x563794f061ef]
ibuf/ibuf0ibuf.cc:4157(ibuf_page_exists(page_id_t, unsigned long))[0x563794f178ac]
buf/buf0buf.cc:3457(buf_page_get_gen(page_id_t, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned int, mtr_t*, dberr_t*, bool))[0x5637953c097e]
btr/btr0cur.cc:1609(btr_cur_search_to_nth_level_func(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_cur_t*, rw_lock_t*, char const*, unsigned int, mtr_t*, unsigned long))[0x563795351f42]
include/btr0pcur.ic:441(btr_pcur_open_low(dict_index_t*, unsigned long, dtuple_t const*, page_cur_mode_t, unsigned long, btr_pcur_t*, char const*, unsigned int, unsigned long, mtr_t*))[0x563795385d0b]
btr/btr0pcur.cc:652(btr_pcur_open_on_user_rec_func(dict_index_t*, dtuple_t const*, page_cur_mode_t, unsigned long, btr_pcur_t*, char const*, unsigned int, mtr_t*))[0x56379538a3e5]
dict/dict0load.cc:3141(dict_load_table_on_id(unsigned long, dict_err_ignore_t))[0x563795487ad4]
dict/dict0dict.cc:222(dict_table_open_on_id_low(unsigned long, dict_err_ignore_t, bool))[0x5637954459a5]
dict/dict0dict.cc:948(dict_table_open_on_id(unsigned long, bool, dict_table_op_t, THD*, MDL_ticket**))[0x563795448322]
trx/trx0trx.cc:615(trx_resurrect_table_locks(trx_t*, trx_undo_t const*))[0x5637952c32b4]
trx/trx0trx.cc:712(trx_resurrect(trx_undo_t*, trx_rseg_t*, long, unsigned long long, unsigned long*, bool))[0x5637952c443b]
trx/trx0trx.cc:772(trx_lists_init_at_db_start())[0x5637952c4b2a]
srv/srv0start.cc:1480(srv_start(bool))[0x56379521c103]
handler/ha_innodb.cc:3934(innodb_init(void*))[0x563794d89e09]
sql/handler.cc:645(ha_initialize_handlerton(st_plugin_int*))[0x5637942a4657]
sql/sql_plugin.cc:1459(plugin_initialize(st_mem_root*, st_plugin_int*, int*, char**, bool))[0x563793b00f03]
sql/sql_plugin.cc:1751(plugin_init(int*, char**, int))[0x563793b02dda]
sql/mysqld.cc:4915(init_server_components())[0x5637937d0626]
sql/mysqld.cc:5497(mysqld_main(int, char**))[0x5637937d25cb]
sql/main.cc:25(main)[0x5637937bc775]
csu/libc-start.c:342(__libc_start_main)[0x7fc2d151309b]
??:0(_start)[0x5637937bc69a]

Generated at Thu Feb 08 09:32:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.