[MDEV-30148] Race condition between non-persistent statistics and purge of InnoDB history Created: 2022-12-02  Updated: 2022-12-06  Resolved: 2022-12-05

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.6, 10.7, 10.8, 10.9, 10.10, 10.11
Fix Version/s: 10.11.2, 10.6.12, 10.7.8, 10.8.7, 10.9.5, 10.10.3

Type: Bug Priority: Critical
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: crash, race, rr-profile-analyzed, statistics

Issue Links:
Relates
relates to MDEV-21136 InnoDB's records_in_range estimates c... Closed
relates to MDEV-29694 Remove the InnoDB change buffer Closed

Description

mleich provided an rr record trace in which several invocations of btr_estimate_number_of_different_key_vals() were accessing the same page (innodb_page_size=64k) while a purge thread was holding an exclusive page latch and executing a page reorganize:

Thread 7 (Thread 511063.511299 (mariadbd)):
#0  0x0000560e2dbac68c in page_offset (ptr=0x7f4f15ba02fa) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/include/page0page.h:216
#1  page_cur_insert_rec_low (cur=cur@entry=0x7f4f024406a0, rec=<optimized out>, offsets=offsets@entry=0x7f4f024406e0, mtr=mtr@entry=0x7f4f02441c50) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/page/page0cur.cc:1479
#2  0x0000560e2dbc3c44 in page_copy_rec_list_end_no_locks (new_block=new_block@entry=0x7f4f1596fba0, block=block@entry=0x7f4f14d6ed00, rec=<optimized out>, index=index@entry=0x6160007d1a08, mtr=mtr@entry=0x7f4f02441c50) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/page/page0page.cc:477
#3  0x0000560e2ded41a9 in btr_page_reorganize_low (cursor=cursor@entry=0x7f4f02440b40, mtr=mtr@entry=0x7f4f02441c50) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/btr/btr0btr.cc:1174
#4  0x0000560e2ded730f in btr_page_reorganize_block (z_level=<optimized out>, block=block@entry=0x7f4f1596fba0, index=index@entry=0x6160007d1a08, mtr=mtr@entry=0x7f4f02441c50) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/btr/btr0btr.cc:1417
#5  0x0000560e2ded78ce in btr_can_merge_with_page (cursor=cursor@entry=0x7f4f02441980, page_no=page_no@entry=33, merge_block=merge_block@entry=0x7f4f02440d10, mtr=mtr@entry=0x7f4f02441c50) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/btr/btr0btr.cc:5071
#6  0x0000560e2deec714 in btr_compress (cursor=cursor@entry=0x7f4f02441980, adjust=adjust@entry=false, mtr=mtr@entry=0x7f4f02441c50) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/btr/btr0btr.cc:3421
#7  0x0000560e2df25e1f in btr_cur_compress_if_useful (cursor=cursor@entry=0x7f4f02441980, adjust=adjust@entry=false, mtr=mtr@entry=0x7f4f02441c50) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/btr/btr0cur.cc:4624
#8  0x0000560e2df448f4 in btr_cur_pessimistic_delete (err=err@entry=0x7f4f02441890, has_reserved_extents=has_reserved_extents@entry=0, cursor=cursor@entry=0x7f4f02441980, flags=flags@entry=0, rollback=rollback@entry=false, mtr=mtr@entry=0x7f4f02441c50) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/btr/btr0cur.cc:5053
#9  0x0000560e2dd4b30c in row_purge_remove_sec_if_poss_tree (node=node@entry=0x61a00000cd08, index=index@entry=0x6160007d1a08, entry=entry@entry=0x619000046608) at /data/Server/bb-10.11-MDEV-29694/storage/innobase/row/row0purge.cc:392

The crashing thread would have been blocked if it had tried to acquire any page latch. It is only holding a buffer-fix, which does not prevent concurrent modification of the page:

#5  0x0000560e2deb6f21 in ut_dbg_assertion_failed (
    expr=expr@entry=0x560e2ede6380 "page_offset(rec) <= page_header_get_field(page, PAGE_HEAP_TOP)", 
    file=file@entry=0x560e2ede6240 "/data/Server/bb-10.11-MDEV-29694/storage/innobase/include/page0page.inl", line=line@entry=310)
    at /data/Server/bb-10.11-MDEV-29694/storage/innobase/ut/ut0dbg.cc:60
#6  0x0000560e2e0b2eea in page_rec_check (rec=0x7f4f15ba24f3 "")
    at /data/Server/bb-10.11-MDEV-29694/storage/innobase/include/page0page.inl:310
#7  page_rec_is_supremum (rec=0x7f4f15ba24f3 "")
    at /data/Server/bb-10.11-MDEV-29694/storage/innobase/include/page0page.inl:165
#8  btr_estimate_number_of_different_key_vals (
    index=index@entry=0x6160007d1a08, bulk_trx_id=<optimized out>)
    at /data/Server/bb-10.11-MDEV-29694/storage/innobase/dict/dict0stats.cc:1378
#9  0x0000560e2e0b4866 in dict_stats_update_transient_for_index (
    index=index@entry=0x6160007d1a08)
    at /data/Server/bb-10.11-MDEV-29694/storage/innobase/dict/dict0stats.cc:1573
#20 0x0000560e2c64e5b0 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0001ab218, packet=packet@entry=0x6290015cc219 "INSERT IGNORE INTO `oltp3` ( `id`, `k`) VALUES ( NULL, 245760000 ) /* E_R Thread4 QNO 507 CON_ID 19 */ ", packet_length=packet_length@entry=103, blocking=blocking@entry=true) at /data/Server/bb-10.11-MDEV-29694/sql/sql_parse.cc:1894

It turns out that there is a lot of dead or unnecessary code in btr_cur_open_at_rnd_pos(). Each caller only requires a shared latch to be held on the returned leaf page.
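To illustrate why a buffer-fix alone is insufficient here, below is a minimal single-threaded sketch of the two protection levels. ToyPage and all of its members are hypothetical names made up for this illustration; this is not InnoDB's buf_block_t or latching code, and it only models the admission rules: a buffer-fix is ignored when an exclusive latch is requested, while a shared latch causes the request to be refused.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical single-threaded model of a buffer pool page.
// The names are illustrative only; this is not InnoDB's buf_block_t API.
struct ToyPage {
  uint32_t fix_count = 0;        // buffer-fix: only prevents eviction
  uint32_t shared_latches = 0;   // readers holding a shared page latch
  bool exclusive_latch = false;  // a writer, such as a page reorganize

  void fix() { ++fix_count; }
  void unfix() { assert(fix_count > 0); --fix_count; }

  // A shared latch is granted unless a writer holds the page.
  bool try_latch_shared() {
    if (exclusive_latch) return false;
    ++shared_latches;
    return true;
  }
  void unlatch_shared() { assert(shared_latches > 0); --shared_latches; }

  // An exclusive latch is granted only when no latch of any kind is held.
  // fix_count is deliberately not consulted: a buffer-fix does not
  // protect the page contents.
  bool try_latch_exclusive() {
    if (exclusive_latch || shared_latches > 0) return false;
    exclusive_latch = true;
    return true;
  }
  void unlatch_exclusive() { assert(exclusive_latch); exclusive_latch = false; }
};

// The reader holds only a buffer-fix: can a writer start modifying the page?
bool writer_admitted_with_fix_only() {
  ToyPage page;
  page.fix();
  bool admitted = page.try_latch_exclusive();
  if (admitted) page.unlatch_exclusive();
  page.unfix();
  return admitted;
}

// The reader holds a buffer-fix and a shared latch: can a writer start?
bool writer_admitted_with_shared_latch() {
  ToyPage page;
  page.fix();
  bool latched = page.try_latch_shared();
  bool admitted = page.try_latch_exclusive();
  if (admitted) page.unlatch_exclusive();
  if (latched) page.unlatch_shared();
  page.unfix();
  return admitted;
}
```

In this model, writer_admitted_with_fix_only() returns true and writer_admitted_with_shared_latch() returns false. That matches the situation in the trace: the statistics code held only a buffer-fix, so the purge thread was free to reorganize the page underneath it.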

MDEV-21136 fixed something similar, but the fix did not touch this code.

We have not tested whether older versions are affected by this. It is possible or even likely, but the fix would be hard to port, because it depends on MDEV-29603, which is only in 10.6.



Comments
Comment by Vladislav Lesin [ 2022-12-05 ]

The fix looks good to me. btr_cur_t::open_random_leaf() looks much simpler than btr_cur_open_at_rnd_pos().
