[MDEV-22611] Assertion btr_search_enabled failed during buffer pool resizing Created: 2020-05-18  Updated: 2021-09-30  Resolved: 2020-05-18

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.5.4, 10.2.33, 10.3.24, 10.4.14

Type: Bug Priority: Blocker
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: regression

Issue Links:
Problem/Incident
is caused by MDEV-22456 Dropping the adaptive hash index may ... Closed

 Description   

After I merged the MDEV-22456 changes to 10.4, I observed a crash during InnoDB buffer pool resizing.

It is unclear whether this is a true regression of MDEV-22456. Here are two examples:

10.4 66f1e288a12c96b1306e204cca37ffee09e97a64

CURRENT_TEST: innodb.innodb_buffer_pool_resize_with_chunks
mysqltest: In included file "./include/wait_condition.inc": 
included from /mariadb/10.4/mysql-test/suite/innodb/t/innodb_buffer_pool_resize_with_chunks.test at line 45:
At line 54: query 'let $success= `$wait_condition`' failed with wrong errno 2013: 'Lost connection to MySQL server during query', instead of 0...
2020-05-18 12:07:31 9 [Note] InnoDB: Resizing buffer pool from 838860 (new size: 16777216 bytes)
2020-05-18 12:07:31 0 [Note] InnoDB: Disabling adaptive hash index.
2020-05-18 12:07:31 0 [Note] InnoDB: disabled adaptive hash index.
2020-05-18 12:07:31 0 [Note] InnoDB: Withdrawing blocks to be shrunken.
2020-05-18 12:07:31 0 [Note] InnoDB: Latching whole of buffer pool.
2020-05-18 12:07:31 0 [Note] InnoDB: buffer pool 0 : resizing with chunks 4 to 8.
mysqld: /mariadb/10.4/storage/innobase/ha/ha0ha.cc:233: ulint ha_insert_for_fold_func(hash_table_t *, ulint, buf_block_t *, const rec_t *): Assertion `btr_search_enabled' failed.
200518 12:07:31 [ERROR] mysqld got signal 6 ;
#7  0x00005636497fd0a9 in ha_insert_for_fold_func (table=0x56364cb75200, fold=3381765660571125693, block=0x7fda948f1d98, data=0x7fda94a3f529 "\200\004x\250") at /mariadb/10.4/storage/innobase/ha/ha0ha.cc:233
#8  0x0000563649710c9a in btr_search_update_hash_ref (info=<optimized out>, block=<optimized out>, cursor=<optimized out>) at /mariadb/10.4/storage/innobase/btr/btr0sea.cc:722
#9  btr_search_info_update_slow (info=<optimized out>, cursor=<optimized out>) at /mariadb/10.4/storage/innobase/btr/btr0sea.cc:1580
#10 0x00005636496ddbfd in btr_search_info_update (index=0x7fda4011b740, cursor=0x56364ccc80c8) at /mariadb/10.4/storage/innobase/include/btr0sea.ic:84
#11 btr_cur_search_to_nth_level_func (index=<optimized out>, level=0, tuple=<optimized out>, mode=<optimized out>, latch_mode=2, cursor=<optimized out>, ahi_latch=0x0, file=0x563649c74ef5 "/mariadb/10.4/storage/innobase/row/row0row.cc", line=1212, mtr=0x7fda80ff8298, autoinc=0) at /mariadb/10.4/storage/innobase/btr/btr0cur.cc:2477
#12 0x000056364960bf76 in btr_pcur_open_low (index=<optimized out>, level=0, tuple=0x7fda687d0270, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x56364ccc80c8, file=0x0, line=1212, autoinc=0, mtr=0x8) at /mariadb/10.4/storage/innobase/include/btr0pcur.ic:441
#13 0x000056364960aa13 in row_search_on_row_ref (pcur=0x56364ccc80c8, mode=2, table=0x7fda40118660, ref=0x7fda687d0270, mtr=0x7fda80ff8298) at /mariadb/10.4/storage/innobase/row/row0row.cc:1212
#14 0x00005636495f8e7b in row_purge_reposition_pcur (mode=2, node=0x56364ccc8020, mtr=0x7fda80ff8298) at /mariadb/10.4/storage/innobase/row/row0purge.cc:78
#15 0x00005636495fc541 in row_purge_reset_trx_id (node=0x56364ccc8020, mtr=0x7fda80ff8298) at /mariadb/10.4/storage/innobase/row/row0purge.cc:798

and another, from the same test run:

10.4 66f1e288a12c96b1306e204cca37ffee09e97a64

CURRENT_TEST: innodb.innodb_buffer_pool_resize
mysqltest: In included file "./include/wait_condition.inc": 
included from /mariadb/10.4/mysql-test/suite/innodb/t/innodb_buffer_pool_resize.test at line 58:
At line 54: query 'let $success= `$wait_condition`' failed with wrong errno 2013: 'Lost connection to MySQL server during query', instead of 0...
2020-05-18 12:07:47 0 [Note] InnoDB: buffer pool 0 : 1 chunks (1786 blocks) were freed.
2020-05-18 12:07:47 0 [Note] InnoDB: Completed to resize buffer pool from 16777216 to 8388608.
2020-05-18 12:07:47 0 [Note] InnoDB: Re-enabled adaptive hash index.
2020-05-18 12:07:47 0 [Note] InnoDB: Completed resizing buffer pool at 200518 12:07:47.
mysqld: /mariadb/10.4/storage/innobase/ha/ha0ha.cc:233: ulint ha_insert_for_fold_func(hash_table_t *, ulint, buf_block_t *, const rec_t *): Assertion `btr_search_enabled' failed.
#7  0x0000560a71d540a9 in ha_insert_for_fold_func (table=0x560a743a7200, fold=3381765663485158439, block=0x7fd0496da8b0, data=0x7fd049adf5ab "\200\004~\b") at /mariadb/10.4/storage/innobase/ha/ha0ha.cc:233
#8  0x0000560a71c67c9a in btr_search_update_hash_ref (info=<optimized out>, block=<optimized out>, cursor=<optimized out>) at /mariadb/10.4/storage/innobase/btr/btr0sea.cc:722
#9  btr_search_info_update_slow (info=<optimized out>, cursor=<optimized out>) at /mariadb/10.4/storage/innobase/btr/btr0sea.cc:1580
#10 0x0000560a71c34bfd in btr_search_info_update (index=0x7fcff01555b0, cursor=0x560a744797f8) at /mariadb/10.4/storage/innobase/include/btr0sea.ic:84
#11 btr_cur_search_to_nth_level_func (index=<optimized out>, level=0, tuple=<optimized out>, mode=<optimized out>, latch_mode=2, cursor=<optimized out>, ahi_latch=0x0, file=0x560a721cbef5 "/mariadb/10.4/storage/innobase/row/row0row.cc", line=1212, mtr=0x7fd02dffa298, autoinc=0) at /mariadb/10.4/storage/innobase/btr/btr0cur.cc:2477
#12 0x0000560a71b62f76 in btr_pcur_open_low (index=<optimized out>, level=0, tuple=0x7fd0247ceec0, mode=PAGE_CUR_LE, latch_mode=2, cursor=0x560a744797f8, file=0x0, line=1212, autoinc=0, mtr=0x8) at /mariadb/10.4/storage/innobase/include/btr0pcur.ic:441
#13 0x0000560a71b61a13 in row_search_on_row_ref (pcur=0x560a744797f8, mode=2, table=0x7fcff01185d0, ref=0x7fd0247ceec0, mtr=0x7fd02dffa298) at /mariadb/10.4/storage/innobase/row/row0row.cc:1212
#14 0x0000560a71b4fe7b in row_purge_reposition_pcur (mode=2, node=0x560a74479750, mtr=0x7fd02dffa298) at /mariadb/10.4/storage/innobase/row/row0purge.cc:78
#15 0x0000560a71b53541 in row_purge_reset_trx_id (node=0x560a74479750, mtr=0x7fd02dffa298) at /mariadb/10.4/storage/innobase/row/row0purge.cc:798

It looks like we should re-evaluate btr_search_enabled after acquiring the adaptive hash index latch.



 Comments   
Comment by Marko Mäkelä [ 2020-05-18 ]

This is reasonably repeatable with

./mtr --parallel=auto --big-test --repeat=100 innodb.innodb_buffer_pool_resize_with_chunks innodb.innodb_buffer_pool_resize

I checked my previous test log, and it looks like the server was restarted immediately before running either of these tests.

Comment by Marko Mäkelä [ 2020-05-18 ]

I confirm that this is a regression of MDEV-22456. Before that, we invoked btr_search_update_hash_ref() while the adaptive hash index latch was already being held.
mleich, can you please test the adaptive hash index and the buffer pool resizing under RQG with the following patch on 10.4 or 10.2?

diff --git a/storage/innobase/btr/btr0sea.cc b/storage/innobase/btr/btr0sea.cc
index fbd3d5a4b1f..d09cbd8943d 100644
--- a/storage/innobase/btr/btr0sea.cc
+++ b/storage/innobase/btr/btr0sea.cc
@@ -685,7 +685,7 @@ btr_search_update_hash_ref(
 
 	dict_index_t* index = block->index;
 
-	if (!index) {
+	if (!index || !info->n_hash_potential) {
 		return;
 	}
 
@@ -695,7 +695,7 @@ btr_search_update_hash_ref(
 	rw_lock_t* const latch = btr_get_search_latch(index);
 	rw_lock_x_lock(latch);
 
-	if ((info->n_hash_potential > 0)
+	if (block->index
 	    && (block->curr_n_fields == info->n_fields)
 	    && (block->curr_n_bytes == info->n_bytes)
 	    && (block->curr_left_side == info->left_side)) {

Comment by Matthias Leich [ 2020-05-18 ]

I was able to replay that bug with RQG on 10.4 commit ea912d1605b463a16a0c186bf10e61c4af519439 compiled with debug.
perl rqg.pl \ 
--mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
--grammar=conf/mariadb/table_stress_innodb.yy \
--gendata=conf/mariadb/table_stress.zz \
--gendata_sql=conf/mariadb/table_stress.sql \
--engine=Innodb \
--reporters=Deadlock1,ErrorLog,Backtrace \
--mysqld=--connect_timeout=60 \
--mysqld=--net_read_timeout=30 \
--mysqld=--net_write_timeout=60 \
--mysqld=--loose-idle_readonly_transaction_timeout=0 \
--mysqld=--loose-idle_transaction_timeout=0 \
--mysqld=--loose-idle_write_transaction_timeout=0 \
--mysqld=--interactive_timeout=28800 \
--mysqld=--lock_wait_timeout=86400 \
--mysqld=--innodb-lock-wait-timeout=50 \
--mysqld=--loose-table_lock_wait_timeout=50 \
--mysqld=--wait_timeout=28800 \
--mysqld=--slave_net_timeout=60 \
--mysqld=--loose-max-statement-time=30 \
--mysqld=--loose-debug_assert_on_not_freed_memory=0 \
--mysqld=--log-output=none \
--duration=300 \
--seed=random \
--sqltrace=MarkErrors \
--threads=1 \
 --mysqld=--innodb_adaptive_hash_index=ON \
--mysqld=--innodb_page_size=4K \
--mysqld=--innodb-buffer-pool-size=8M \
--no_mask \
--workdir=<local settings> \
--vardir=<local settings> \
--mtr-build-thread=<local settings>3 \
--basedir1=<local settings> \
--basedir2=<local settings> \
--script_debug=_nix_ \
--rr=Server
But only one of 282 tests hit that problem.

Comment by Matthias Leich [ 2020-05-18 ]

The bug did no more show up after applying the patch.

Generated at Thu Feb 08 09:16:05 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.