[MDEV-25051] Race condition between persistent statistics and RENAME TABLE or TRUNCATE
Created: 2021-03-04
Updated: 2021-03-04
Resolved: 2021-03-04

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6
Fix Version/s: 10.2.38, 10.3.29, 10.4.19, 10.5.10

Type: Bug
Priority: Major
Reporter: Marko Mäkelä
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: ASAN, race, rr-profile-analyzed, upstream

Issue Links:
Relates
relates to MDEV-13564 TRUNCATE TABLE and undo tablespace tr... Closed

Description

We observed the following:

10.6 71d30d01aa1426183526f9bdbc6d2a718fac75ac

==ERROR: AddressSanitizer: heap-use-after-free on address 0x604000027fe8 at pc 0x560c718f8cbd bp 0x7c15337d4220 sp 0x7c15337d39c8
READ of size 5 at 0x604000027fe8 thread T29
    #0 0x560c718f8cbc  (/lib/x86_64-linux-gnu/libasan.so.5+0x74cbc)
    #1 0x560c6e43ea51 in dict_stats_update(dict_table_t*, dict_stats_upd_option_t) /Server/bb-10.6-MDEV-25016A/storage/innobase/dict/dict0stats.cc:3213
    #2 0x560c6e44b0cc in dict_stats_process_entry_from_recalc_pool /Server/bb-10.6-MDEV-25016A/storage/innobase/dict/dict0stats_bg.cc:374

At the same time, TRUNCATE TABLE was executing on the table. The heap-use-after-free occurred because TRUNCATE had renamed the original table to a temporary name, and the realloc() call in dict_table_rename_in_cache() had freed the originally allocated memory (ptr=0x604000027fe8, the same address that AddressSanitizer reported):

#3  0x0000560c719920b2 in realloc () from /lib/x86_64-linux-gnu/libasan.so.5
#4  0x0000560c6e3f7d3a in ut_allocator<unsigned char, true>::reallocate (
    this=0x4a7451c5a170, ptr=0x604000027fe8, n_elements=655, autoevent_idx=9)
    at /Server/bb-10.6-MDEV-25016A/storage/innobase/include/ut0new.h:513
#5  0x0000560c6e3db2fb in dict_table_rename_in_cache (table=0x61800008f520, 
    new_name=0x4a7451c5cc30 "test/#sql-ib849", rename_also_foreigns=false, 
    replace_new_file=false)
    at /Server/bb-10.6-MDEV-25016A/storage/innobase/dict/dict0dict.cc:1681
#6  0x0000560c6e10ca9d in row_rename_table_for_mysql (
    old_name=0x4a7451c5ce70 "test/unrelated", 
    new_name=0x4a7451c5cc30 "test/#sql-ib849", trx=0x44b705a03d00, 
    commit=false, use_fk=false)
    at /Server/bb-10.6-MDEV-25016A/storage/innobase/row/row0mysql.cc:4420
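
The hazard in the trace above can be illustrated with a minimal sketch. It assumes that the reallocated buffer holds the table name (the report does not say so explicitly) and that a reader can avoid a stale pointer by copying the name while holding a latch. All identifiers here (toy_table, toy_rename, toy_name_snapshot, and so on) are hypothetical and are not InnoDB code, and this is not the fix that was applied for this bug:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <mutex>
#include <string>

// Hypothetical stand-in for a cached table definition; not dict_table_t.
struct toy_table {
    std::mutex latch;      // protects 'name'
    char* name = nullptr;  // heap buffer that a rename may move
};

// Allocate the initial name buffer.
void toy_init(toy_table& t, const char* name) {
    std::lock_guard<std::mutex> g(t.latch);
    const std::size_t len = std::strlen(name) + 1;
    t.name = static_cast<char*>(std::malloc(len));
    if (!t.name) std::abort();
    std::memcpy(t.name, name, len);
}

// Rename in place. realloc() may free the old block and return a new one,
// so any raw pointer to the old name saved by another thread becomes stale.
void toy_rename(toy_table& t, const char* new_name) {
    std::lock_guard<std::mutex> g(t.latch);
    const std::size_t len = std::strlen(new_name) + 1;
    char* p = static_cast<char*>(std::realloc(t.name, len));
    if (!p) std::abort();
    t.name = p;
    std::memcpy(t.name, new_name, len);
}

// Reader side: copy the name under the latch instead of keeping a raw
// pointer into a buffer that a concurrent rename could reallocate.
std::string toy_name_snapshot(toy_table& t) {
    std::lock_guard<std::mutex> g(t.latch);
    return std::string(t.name);
}

// Release the name buffer.
void toy_free(toy_table& t) {
    std::lock_guard<std::mutex> g(t.latch);
    std::free(t.name);
    t.name = nullptr;
}
```

A reader that calls toy_name_snapshot() before a rename keeps a private copy of the old name, so a later toy_rename() cannot invalidate it.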

The TRUNCATE TABLE itself was still waiting to drop the original table:

#3  0x0000560c6e44aa45 in dict_stats_wait_bg_to_stop_using_table (table=0x61800008f520, trx=0x44b705a03d00) at /Server/bb-10.6-MDEV-25016A/storage/innobase/dict/dict0stats_bg.cc:280
#4  0x0000560c6e105cd9 in row_drop_table_for_mysql (name=0x4a7451c5cc30 "test/#sql-ib849", trx=0x44b705a03d00, sqlcom=SQLCOM_TRUNCATE, create_failed=false, nonatomic=true) at /Server/bb-10.6-MDEV-25016A/storage/innobase/row/row0mysql.cc:3344

Because the persistent statistics code is not properly protected by metadata locks (MDL), the work-around dict_stats_wait_bg_to_stop_using_table() must be invoked in all code that is about to modify or free a table definition. Other DDL operations (including ALTER TABLE...DISCARD TABLESPACE) seem to do the right thing, but both RENAME TABLE and the MDEV-13564 implementation of TRUNCATE TABLE omit the call before renaming the table.
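
The general shape of such a "wait for the background task to stop using the table" handshake can be sketched as follows. This is an illustration under stated assumptions, not the implementation of dict_stats_wait_bg_to_stop_using_table(), which this report does not show. The names toy_table_stats, toy_bg_recalc, toy_wait_bg_to_stop_using_table and toy_ddl_done are hypothetical:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical per-table coordination state; not an InnoDB structure.
struct toy_table_stats {
    std::mutex m;
    std::condition_variable cv;
    bool bg_in_use = false;    // background statistics task is reading the table
    bool ddl_pending = false;  // a DDL operation is about to modify the table
    int recalcs = 0;           // number of completed recalculations
};

// Background side: refuse to start while DDL is pending, otherwise mark
// the table as in use for the duration of the work.
// Returns false if the recalculation was skipped.
bool toy_bg_recalc(toy_table_stats& t) {
    {
        std::lock_guard<std::mutex> g(t.m);
        if (t.ddl_pending) return false;
        t.bg_in_use = true;
    }
    // ... a real task would read the table definition here ...
    {
        std::lock_guard<std::mutex> g(t.m);
        ++t.recalcs;
        t.bg_in_use = false;
    }
    t.cv.notify_all();  // wake any DDL thread waiting for us to finish
    return true;
}

// DDL side: announce the pending change, then block until the background
// task no longer uses the table. Call this BEFORE renaming or freeing.
void toy_wait_bg_to_stop_using_table(toy_table_stats& t) {
    std::unique_lock<std::mutex> lk(t.m);
    t.ddl_pending = true;
    t.cv.wait(lk, [&t] { return !t.bg_in_use; });
}

// DDL side: allow background statistics again once the change is complete.
void toy_ddl_done(toy_table_stats& t) {
    std::lock_guard<std::mutex> g(t.m);
    t.ddl_pending = false;
}
```

In this sketch, a rename would call toy_wait_bg_to_stop_using_table() first, perform the rename, and then call toy_ddl_done(). The ordering is the point: in the traces above, the wait was reached only in row_drop_table_for_mysql(), after the rename had already reallocated the memory.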

It seems that table eviction from the dictionary cache is protected by the table reference count.

