[MDEV-30941] ASAN: heap-use-after-free in std::__atomic_base from trx_t::commit_tables on XA COMMIT with pseudo_slave_mode enabled Created: 2023-03-28  Updated: 2023-11-28

Status: Confirmed
Project: MariaDB Server
Component/s: Replication, XA
Affects Version/s: 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0
Fix Version/s: 10.6, 10.11, 11.0

Type: Bug Priority: Major
Reporter: Roel Van de Paar Assignee: Brandon Nesterenko
Resolution: Unresolved Votes: 0
Labels: ASAN, memory_corruption, not-10.5, regression-10.6, sporadic

Issue Links:
Relates
relates to MDEV-22733 XA PREPARE breaks MDL in pseudo_slave... Stalled
relates to MDEV-32257 Assertion `thd->is_killed()' failed i... Closed
relates to MDEV-32262 InnoDB: Failing assertion: m_prebuilt... Confirmed
relates to MDEV-32272 lock_release_on_prepare_try() does no... Closed

 Description   

Reduced testcase is lightly sporadic. Repeat if necessary. Original longer testcase was much more sporadic. i.e. sporadicity seems to be affected by other surrounding SQL.

CREATE TABLE t (c INT) ENGINE=InnoDB;
XA BEGIN 'x';
INSERT INTO t VALUES (0);
CREATE TEMPORARY TABLE IF NOT EXISTS t (c INT) ENGINE=InnoDB;
INSERT INTO t VALUES (0);
SET SESSION pseudo_slave_mode=TRUE;
XA END 'x';
XA PREPARE 'x';
ALTER TABLE t ENGINE=InnoDB;
XA COMMIT 'x';

Leads to:

11.0.2 a79abb6517f2fa68b48e61aa3354a0631e3a63f7 (Debug)

==1169874==ERROR: AddressSanitizer: heap-use-after-free on address 0x618000061388 at pc 0x5608d1fac498 bp 0x14a9b8959ad0 sp 0x14a9b8959ac0
WRITE of size 8 at 0x618000061388 thread T15
    #0 0x5608d1fac497 in std::__atomic_base<long>::store(long, std::memory_order) /usr/include/c++/11/bits/atomic_base.h:457
    #1 0x5608d1fac497 in Atomic_relaxed<long>::store(long, std::memory_order) /test/11.0_dbg_san/include/my_atomic_wrapper.h:47
    #2 0x5608d1fac497 in Atomic_relaxed<long>::operator=(long) /test/11.0_dbg_san/include/my_atomic_wrapper.h:49
    #3 0x5608d1fac497 in trx_t::commit_tables() /test/11.0_dbg_san/storage/innobase/trx/trx0trx.cc:1189
    #4 0x5608d1fac497 in trx_t::commit_in_memory(mtr_t const*) /test/11.0_dbg_san/storage/innobase/trx/trx0trx.cc:1294
    #5 0x5608d1fac497 in trx_t::commit_low(mtr_t*) /test/11.0_dbg_san/storage/innobase/trx/trx0trx.cc:1454
    #6 0x5608d1fada2a in trx_t::commit_persist() /test/11.0_dbg_san/storage/innobase/trx/trx0trx.cc:1468
    #7 0x5608d1fadc73 in trx_t::commit() /test/11.0_dbg_san/storage/innobase/trx/trx0trx.cc:1477
    #8 0x5608d1faea8c in trx_commit_for_mysql(trx_t*) /test/11.0_dbg_san/storage/innobase/trx/trx0trx.cc:1597
    #9 0x5608d179c376 in innobase_commit_low(trx_t*) /test/11.0_dbg_san/storage/innobase/handler/ha_innodb.cc:4267
    #10 0x5608d179d51f in innobase_commit_by_xid /test/11.0_dbg_san/storage/innobase/handler/ha_innodb.cc:17038
    #11 0x5608d0063e45 in xacommit_handlerton /test/11.0_dbg_san/sql/handler.cc:2326
    #12 0x5608cea536fd in plugin_foreach_with_mask(THD*, char (*)(THD*, st_plugin_int**, void*), int, unsigned int, void*) /test/11.0_dbg_san/sql/sql_plugin.cc:2515
    #13 0x5608d0088967 in ha_commit_or_rollback_by_xid(xid_t*, bool) /test/11.0_dbg_san/sql/handler.cc:2360
    #14 0x5608cfafec51 in trans_xa_commit(THD*) /test/11.0_dbg_san/sql/xa.cc:635
    #15 0x5608ce9aa4ce in mysql_execute_command(THD*, bool) /test/11.0_dbg_san/sql/sql_parse.cc:5869
    #16 0x5608ce9b55e6 in mysql_parse(THD*, char*, unsigned int, Parser_state*) /test/11.0_dbg_san/sql/sql_parse.cc:7999
    #17 0x5608ce9c537a in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool) /test/11.0_dbg_san/sql/sql_parse.cc:1894
    #18 0x5608ce9d317f in do_command(THD*, bool) /test/11.0_dbg_san/sql/sql_parse.cc:1407
    #19 0x5608cf397459 in do_handle_one_connection(CONNECT*, bool) /test/11.0_dbg_san/sql/sql_connect.cc:1416
    #20 0x5608cf398974 in handle_one_connection /test/11.0_dbg_san/sql/sql_connect.cc:1318
    #21 0x14a9db38db42 in start_thread nptl/pthread_create.c:442
    #22 0x14a9db41f9ff  (/lib/x86_64-linux-gnu/libc.so.6+0x1269ff)
 
0x618000061388 is located 776 bytes inside of 808-byte region [0x618000061080,0x6180000613a8)
freed by thread T15 here:
    #0 0x5608ce03afe7 in __interceptor_free (/test/UBASAN_MD250323-mariadb-11.0.2-linux-x86_64-dbg/bin/mariadbd+0x7929fe7)
    #1 0x5608d1afcca3 in mem_heap_block_free(mem_block_info_t*, mem_block_info_t*) /test/11.0_dbg_san/storage/innobase/mem/mem0mem.cc:416
    #2 0x5608d22b50e8 in mem_heap_free /test/11.0_dbg_san/storage/innobase/include/mem0mem.inl:419
    #3 0x5608d22b50e8 in dict_mem_table_free(dict_table_t*) /test/11.0_dbg_san/storage/innobase/dict/dict0mem.cc:234
    #4 0x5608d2248d70 in dict_sys_t::remove(dict_table_t*, bool, bool) /test/11.0_dbg_san/storage/innobase/dict/dict0dict.cc:1954
    #5 0x5608d1825778 in ha_innobase::delete_table(char const*) /test/11.0_dbg_san/storage/innobase/handler/ha_innodb.cc:13362
    #6 0x5608d008db2f in hton_drop_table /test/11.0_dbg_san/sql/handler.cc:578
    #7 0x5608cfaba795 in THD::rm_temporary_table(handlerton*, char const*) /test/11.0_dbg_san/sql/temporary_tables.cc:706
    #8 0x5608cfabd86b in THD::free_tmp_table_share(TMP_TABLE_SHARE*, bool) /test/11.0_dbg_san/sql/temporary_tables.cc:1477
    #9 0x5608cfaca31f in THD::drop_temporary_table(TABLE*, bool*, bool) /test/11.0_dbg_san/sql/temporary_tables.cc:674
    #10 0x5608cf0559cb in mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, Table_specification_st*, TABLE_LIST*, Recreate_info*, Alter_info*, unsigned int, st_order*, bool, bool) /test/11.0_dbg_san/sql/sql_table.cc:11071
    #11 0x5608cf3c8595 in Sql_cmd_alter_table::execute(THD*) /test/11.0_dbg_san/sql/sql_alter.cc:558
    #12 0x5608ce9abcc7 in mysql_execute_command(THD*, bool) /test/11.0_dbg_san/sql/sql_parse.cc:6003
    #13 0x5608ce9b55e6 in mysql_parse(THD*, char*, unsigned int, Parser_state*) /test/11.0_dbg_san/sql/sql_parse.cc:7999
    #14 0x5608ce9c537a in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool) /test/11.0_dbg_san/sql/sql_parse.cc:1894
    #15 0x5608ce9d317f in do_command(THD*, bool) /test/11.0_dbg_san/sql/sql_parse.cc:1407
    #16 0x5608cf397459 in do_handle_one_connection(CONNECT*, bool) /test/11.0_dbg_san/sql/sql_connect.cc:1416
    #17 0x5608cf398974 in handle_one_connection /test/11.0_dbg_san/sql/sql_connect.cc:1318
    #18 0x14a9db38db42 in start_thread nptl/pthread_create.c:442
 
previously allocated by thread T15 here:
    #0 0x5608ce03b337 in __interceptor_malloc (/test/UBASAN_MD250323-mariadb-11.0.2-linux-x86_64-dbg/bin/mariadbd+0x792a337)
    #1 0x5608d1afd00c in mem_heap_create_block_func(mem_block_info_t*, unsigned long, char const*, unsigned int, unsigned long) /test/11.0_dbg_san/storage/innobase/mem/mem0mem.cc:277
    #2 0x5608d1afdb8f in mem_heap_add_block(mem_block_info_t*, unsigned long) /test/11.0_dbg_san/storage/innobase/mem/mem0mem.cc:378
    #3 0x5608d22a0efb in mem_heap_alloc /test/11.0_dbg_san/storage/innobase/include/mem0mem.inl:193
    #4 0x5608d22acf3e in mem_heap_zalloc /test/11.0_dbg_san/storage/innobase/include/mem0mem.inl:162
    #5 0x5608d22acf3e in dict_table_t::create(st_::span<char const> const&, fil_space_t*, unsigned long, unsigned long, unsigned long, unsigned long) /test/11.0_dbg_san/storage/innobase/dict/dict0mem.cc:148
    #6 0x5608d185b082 in create_table_info_t::create_table_def() (/test/UBASAN_MD250323-mariadb-11.0.2-linux-x86_64-dbg/bin/mariadbd+0xb14a082)
    #7 0x5608d1835bd6 in create_table_info_t::create_table(bool) /test/11.0_dbg_san/storage/innobase/handler/ha_innodb.cc:12639
    #8 0x5608d18381d4 in ha_innobase::create(char const*, TABLE*, HA_CREATE_INFO*, bool, trx_t*) /test/11.0_dbg_san/storage/innobase/handler/ha_innodb.cc:13145
    #9 0x5608d1839764 in ha_innobase::create(char const*, TABLE*, HA_CREATE_INFO*) /test/11.0_dbg_san/storage/innobase/handler/ha_innodb.cc:13189
    #10 0x5608d00bfbce in handler::ha_create(char const*, TABLE*, HA_CREATE_INFO*) /test/11.0_dbg_san/sql/handler.cc:5656
    #11 0x5608d00c53c4 in ha_create_table(THD*, char const*, char const*, char const*, HA_CREATE_INFO*, st_mysql_const_unsigned_lex_string*, bool) /test/11.0_dbg_san/sql/handler.cc:6125
    #12 0x5608cf02139f in create_table_impl /test/11.0_dbg_san/sql/sql_table.cc:4671
    #13 0x5608cf022821 in mysql_create_table_no_lock(THD*, st_ddl_log_state*, st_ddl_log_state*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, Table_specification_st*, Alter_info*, bool*, int, TABLE_LIST*) /test/11.0_dbg_san/sql/sql_table.cc:4772
    #14 0x5608cf02e08f in mysql_create_table /test/11.0_dbg_san/sql/sql_table.cc:4888
    #15 0x5608cf02e08f in Sql_cmd_create_table_like::execute(THD*) /test/11.0_dbg_san/sql/sql_table.cc:12479
    #16 0x5608ce9abcc7 in mysql_execute_command(THD*, bool) /test/11.0_dbg_san/sql/sql_parse.cc:6003
    #17 0x5608ce9b55e6 in mysql_parse(THD*, char*, unsigned int, Parser_state*) /test/11.0_dbg_san/sql/sql_parse.cc:7999
    #18 0x5608ce9c537a in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool) /test/11.0_dbg_san/sql/sql_parse.cc:1894
    #19 0x5608ce9d317f in do_command(THD*, bool) /test/11.0_dbg_san/sql/sql_parse.cc:1407
    #20 0x5608cf397459 in do_handle_one_connection(CONNECT*, bool) /test/11.0_dbg_san/sql/sql_connect.cc:1416
    #21 0x5608cf398974 in handle_one_connection /test/11.0_dbg_san/sql/sql_connect.cc:1318
    #22 0x14a9db38db42 in start_thread nptl/pthread_create.c:442
 
Thread T15 created by T0 here:
    #0 0x5608cdfdf175 in __interceptor_pthread_create (/test/UBASAN_MD250323-mariadb-11.0.2-linux-x86_64-dbg/bin/mariadbd+0x78ce175)
    #1 0x5608ce09567b in create_thread_to_handle_connection(CONNECT*) /test/11.0_dbg_san/sql/mysqld.cc:6125
    #2 0x5608ce0a2c6b in create_new_thread(CONNECT*) /test/11.0_dbg_san/sql/mysqld.cc:6184
    #3 0x5608ce0a34c7 in handle_accepted_socket(st_mysql_socket, st_mysql_socket) /test/11.0_dbg_san/sql/mysqld.cc:6246
    #4 0x5608ce0a4518 in handle_connections_sockets() /test/11.0_dbg_san/sql/mysqld.cc:6370
    #5 0x5608ce0abc8e in mysqld_main(int, char**) /test/11.0_dbg_san/sql/mysqld.cc:6020
    #6 0x5608ce080eca in main /test/11.0_dbg_san/sql/main.cc:34
    #7 0x14a9db322d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
 
SUMMARY: AddressSanitizer: heap-use-after-free /usr/include/c++/11/bits/atomic_base.h:457 in std::__atomic_base<long>::store(long, std::memory_order)
Shadow bytes around the buggy address:
  0x0c3080004220: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c3080004230: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c3080004240: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c3080004250: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c3080004260: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c3080004270: fd[fd]fd fd fd fa fa fa fa fa fa fa fa fa fa fa
  0x0c3080004280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c3080004290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c30800042a0: f7 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c30800042b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c30800042c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1169874==ABORTING

Setup:

Compiled with GCC >=7.5.0 (I use GCC 11.3.0) and:
    -DWITH_ASAN=ON -DWITH_ASAN_SCOPE=ON -DWITH_UBSAN=ON -DWITH_RAPID=OFF -DWSREP_LIB_WITH_ASAN=ON
Set before execution:
    export ASAN_OPTIONS=quarantine_size_mb=512:atexit=0:detect_invalid_pointer_pairs=3:dump_instruction_bytes=1:abort_on_error=1:allocator_may_return_null=1

Bug confirmed present in:
MariaDB: 10.6.13 (dbg), 10.6.13 (opt), 10.7.8 (dbg), 10.7.8 (opt), 10.8.8 (dbg), 10.8.8 (opt), 10.9.6 (dbg), 10.9.6 (opt), 10.10.4 (dbg), 10.10.4 (opt), 10.11.3 (dbg), 10.11.3 (opt), 11.0.2 (dbg), 11.0.2 (opt)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.3.39 (dbg), 10.3.39 (opt), 10.4.29 (dbg), 10.4.29 (opt), 10.5.20 (dbg), 10.5.20 (opt)



 Comments   
Comment by Andrei Elkin [ 2023-03-29 ]

Salve Roel.
Thanks for the finding!

This case feels to fall into MDEV-22733 XA PREPARE breaks MDL in pseudo_slave_mode=1 group for which I still need to find time to complete.

The common part is a DDL that runs after XA PREPARE which should not be allowed at all
just as your description mentions

10.4.14>TRUNCATE TABLE t;
 
ERROR 1399 (XAE07): XAER_RMFAIL: The command cannot be executed when global transaction is in the  PREPARED state

Could you please test both TRUNCATE and ALTER when they are run by a separate connection?
Specifically for ALTER, I hope, the memory issue won't show up.

Good luck !

/ndrei

Comment by Roel Van de Paar [ 2023-03-29 ]

Hi Elkin

When the full testcase is executed with only ALTER TABLE t ENGINE=InnoDB; executed in a secondary connection we see ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction in that secondary client connection, and no ASAN issue shows even after finalizing the XA COMMIT 'x'; in the first session and shutting down.

When XA COMMIT 'x'; is executed in the primary client connection before the lock timeout happens, both the XA commit in the primary as well as the ALTER TABLE in the secondary connection, succeed.

For the TRUNCATE test ref here.

Comment by Andrei Elkin [ 2023-03-30 ]

The external ALTER wait behaviour is correct.
Apparently both cases of running the local commands while XA in the prepared state should be fixed with restoring the 10.4.14 XAER_RMFAIL.
Maybe bnestere will find time to fix it in this release term.

Generated at Thu Feb 08 10:20:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.