[MDEV-15791] XA: Server crashes in lock_release upon closing connection Created: 2018-04-06  Updated: 2018-05-07  Resolved: 2018-05-07

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, XA
Affects Version/s: 10.0, 10.1, 10.2, 10.3
Fix Version/s: 10.2.15, 10.3.7

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-14693 XA: Assertion `!clust_index->online_l... Closed
relates to MDEV-14825 Assertion `col->ord_part' in row_buil... Closed
relates to MDEV-15030 Add ASAN instrumentation Closed
relates to MDEV-15490 XA: ASAN heap-use-after-free or valgr... Closed

 Description   

Note: the test case is non-deterministic, run with --repeat=N.

--source include/have_innodb.inc
CREATE TABLE t1 (col1 INT, col2 INT) ENGINE = InnoDB;
INSERT INTO t1 VALUES (1,1);
 
--connect(con1,localhost,root,,test)
XA BEGIN 'xid';
INSERT INTO t1 VALUES (1,0);
 
--connection default
--send
DELETE FROM t1;
 
--connection con1
--send
DELETE FROM t1;
 
--connect(con2,localhost,root,,test)
SET innodb_lock_wait_timeout= 1;
--send
ALTER TABLE t1 ADD UNIQUE KEY uidx(col1);
 
--connection con1
--error 0,ER_LOCK_DEADLOCK
--reap
--error 0,ER_DUP_ENTRY
INSERT INTO t1 VALUES (2,2),(2,2);
--error ER_XAER_RMFAIL
ALTER TABLE t1 FORCE;
UPDATE t1 SET col2 = 2;
 
# Cleanup
--disconnect con1
--connection con2
--error 0,ER_LOCK_WAIT_TIMEOUT,ER_DUP_ENTRY
--reap
--disconnect con2
--connection default
--error 0,ER_LOCK_DEADLOCK
--reap
DROP TABLE t1;

10.0 6aff5fa27ae863670608ae88b134453fe53c3e17

#3  <signal handler called>
#4  0x00007fcd2cb9cb55 in lock_release (trx=0x7fcd20832478) at /data/src/10.0/storage/innobase/lock/lock0lock.cc:4803
#5  0x00007fcd2cba2095 in lock_trx_release_locks (trx=0x7fcd20832478) at /data/src/10.0/storage/innobase/lock/lock0lock.cc:7047
#6  0x00007fcd2ccb9c9c in trx_commit_in_memory (trx=0x7fcd20832478, lsn=1652367) at /data/src/10.0/storage/innobase/trx/trx0trx.cc:1182
#7  0x00007fcd2ccba6f0 in trx_commit_low (trx=0x7fcd20832478, mtr=0x7fcd36312660) at /data/src/10.0/storage/innobase/trx/trx0trx.cc:1389
#8  0x00007fcd2ccba761 in trx_commit (trx=0x7fcd20832478) at /data/src/10.0/storage/innobase/trx/trx0trx.cc:1410
#9  0x00007fcd2ccaf3a4 in trx_rollback_finish (trx=0x7fcd20832478) at /data/src/10.0/storage/innobase/trx/trx0roll.cc:1339
#10 0x00007fcd2ccac946 in trx_rollback_to_savepoint_low (trx=0x7fcd20832478, savept=0x0) at /data/src/10.0/storage/innobase/trx/trx0roll.cc:114
#11 0x00007fcd2ccacc7f in trx_rollback_for_mysql_low (trx=0x7fcd20832478) at /data/src/10.0/storage/innobase/trx/trx0roll.cc:169
#12 0x00007fcd2ccacfa9 in trx_rollback_for_mysql (trx=0x7fcd20832478) at /data/src/10.0/storage/innobase/trx/trx0roll.cc:200
#13 0x00007fcd2cb41728 in innobase_rollback (hton=0x7fcd2ed11270, thd=0x7fcd2878a070, rollback_trx=true) at /data/src/10.0/storage/innobase/handler/ha_innodb.cc:4018
#14 0x000000000083ce93 in ha_rollback_trans (thd=0x7fcd2878a070, all=true) at /data/src/10.0/sql/handler.cc:1644
#15 0x000000000077cd90 in trans_rollback (thd=0x7fcd2878a070) at /data/src/10.0/sql/transaction.cc:309
#16 0x0000000000610769 in THD::cleanup (this=0x7fcd2878a070) at /data/src/10.0/sql/sql_class.cc:1536
#17 0x000000000058f620 in thd_cleanup (thd=0x7fcd2878a070) at /data/src/10.0/sql/mysqld.cc:2633
#18 0x000000000058f741 in unlink_thd (thd=0x7fcd2878a070) at /data/src/10.0/sql/mysqld.cc:2691
#19 0x000000000058fb38 in one_thread_per_connection_end (thd=0x7fcd2878a070, put_in_cache=true) at /data/src/10.0/sql/mysqld.cc:2817
#20 0x0000000000769d94 in do_handle_one_connection (thd_arg=0x7fcd2878a070) at /data/src/10.0/sql/sql_connect.cc:1388
#21 0x0000000000769a64 in handle_one_connection (arg=0x7fcd2878a070) at /data/src/10.0/sql/sql_connect.cc:1292
#22 0x0000000000aca5dc in pfs_spawn_thread (arg=0x7fcd2871b970) at /data/src/10.0/storage/perfschema/pfs.cc:1861
#23 0x00007fcd35fe2494 in start_thread (arg=0x7fcd36313700) at pthread_create.c:333
#24 0x00007fcd3439b93f in clone () from /lib/x86_64-linux-gnu/libc.so.6

10.0 ASAN c051eaba46

==14178==ERROR: AddressSanitizer: heap-use-after-free on address 0x61600005e918 at pc 0x7f35d8c4d828 bp 0x7f35cb63ae60 sp 0x7f35cb63ae58
READ of size 8 at 0x61600005e918 thread T22
    #0 0x7f35d8c4d827 in lock_release /data/src/10.0/storage/innobase/lock/lock0lock.cc:4803
    #1 0x7f35d8c4d827 in lock_trx_release_locks(trx_t*) /data/src/10.0/storage/innobase/lock/lock0lock.cc:7047
    #2 0x7f35d8ea322e in trx_commit_in_memory /data/src/10.0/storage/innobase/trx/trx0trx.cc:1182
    #3 0x7f35d8ea322e in trx_commit_low(trx_t*, mtr_t*) /data/src/10.0/storage/innobase/trx/trx0trx.cc:1389
    #4 0x7f35d8ea555c in trx_commit(trx_t*) /data/src/10.0/storage/innobase/trx/trx0trx.cc:1410
    #5 0x7f35d8e80bba in trx_rollback_finish /data/src/10.0/storage/innobase/trx/trx0roll.cc:1339
    #6 0x7f35d8e89db2 in trx_rollback_to_savepoint_low /data/src/10.0/storage/innobase/trx/trx0roll.cc:114
    #7 0x7f35d8e8a6f8 in trx_rollback_for_mysql_low /data/src/10.0/storage/innobase/trx/trx0roll.cc:169
    #8 0x7f35d8e8ab47 in trx_rollback_for_mysql(trx_t*) /data/src/10.0/storage/innobase/trx/trx0roll.cc:200
    #9 0x7f35d8b9ab36 in innobase_rollback /data/src/10.0/storage/innobase/handler/ha_innodb.cc:4018
    #10 0xb67976 in ha_rollback_trans(THD*, bool) /data/src/10.0/sql/handler.cc:1642
    #11 0x9a5c56 in trans_rollback(THD*) /data/src/10.0/sql/transaction.cc:309
    #12 0x64802e in THD::cleanup() /data/src/10.0/sql/sql_class.cc:1536
    #13 0x52239a in thd_cleanup(THD*) /data/src/10.0/sql/mysqld.cc:2633
    #14 0x522a03 in unlink_thd(THD*) /data/src/10.0/sql/mysqld.cc:2691
    #15 0x523026 in one_thread_per_connection_end(THD*, bool) /data/src/10.0/sql/mysqld.cc:2817
    #16 0x978ccf in do_handle_one_connection(THD*) /data/src/10.0/sql/sql_connect.cc:1388
    #17 0x978e5e in handle_one_connection /data/src/10.0/sql/sql_connect.cc:1292
    #18 0x11b294a in pfs_spawn_thread /data/src/10.0/storage/perfschema/pfs.cc:1861
    #19 0x7f35e32f6493 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x7493)
    #20 0x7f35e16af93e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xe893e)
 
0x61600005e918 is located 152 bytes inside of 568-byte region [0x61600005e880,0x61600005eab8)
freed by thread T23 here:
    #0 0x7f35e3560527 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x54527)
    #1 0x7f35d8c83042 in mem_area_free(void*, mem_pool_t*) /data/src/10.0/storage/innobase/mem/mem0pool.cc:519
    #2 0x7f35d8ecaba8 (/data/bld/10.0-asan/lib/plugin/ha_innodb.so+0x660ba8)
 
previously allocated by thread T23 here:
    #0 0x7f35e356073f in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x5473f)
    #1 0x7f35d8c82609 in mem_area_alloc(unsigned long*, mem_pool_t*) /data/src/10.0/storage/innobase/mem/mem0pool.cc:381
 
Thread T22 created by T0 here:
    #0 0x7f35e352fbba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
    #1 0x11be1a1 in spawn_thread_v1 /data/src/10.0/storage/perfschema/pfs.cc:1911
 
Thread T23 created by T0 here:
    #0 0x7f35e352fbba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
    #1 0x11be1a1 in spawn_thread_v1 /data/src/10.0/storage/perfschema/pfs.cc:1911
 
SUMMARY: AddressSanitizer: heap-use-after-free /data/src/10.0/storage/innobase/lock/lock0lock.cc:4803 lock_release
Shadow bytes around the buggy address:
  0x0c2c80003cd0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c80003ce0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c80003cf0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x0c2c80003d00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2c80003d10: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c2c80003d20: fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c80003d30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c80003d40: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c80003d50: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
  0x0c2c80003d60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c2c80003d70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Contiguous container OOB:fc
  ASan internal:           fe
==14178==ABORTING



 Comments   
Comment by Marko Mäkelä [ 2018-04-27 ]

I repeated this in 10.3. The error code ER_DUP_UNKNOWN_IN_INDEX was missing.
With one ALTER TABLE commented out, this should not be a duplicate of MDEV-15490 or MDEV-14693 (which indicate that there is something wrong with XA transactions and MDL):

--source include/have_innodb.inc
CREATE TABLE t1 (col1 INT, col2 INT) ENGINE = InnoDB;
INSERT INTO t1 VALUES (1,1);
 
--connect(con1,localhost,root,,test)
XA BEGIN 'xid';
INSERT INTO t1 VALUES (1,0);
 
--connection default
--send
DELETE FROM t1;
 
--connection con1
--send
DELETE FROM t1;
 
--connect(con2,localhost,root,,test)
SET innodb_lock_wait_timeout= 1;
--send
ALTER TABLE t1 ADD UNIQUE KEY uidx(col1);
 
--connection con1
--error 0,ER_LOCK_DEADLOCK
--reap
--error 0,ER_DUP_ENTRY
INSERT INTO t1 VALUES (2,2),(2,2);
#--error ER_XAER_RMFAIL
#ALTER TABLE t1 FORCE;
UPDATE t1 SET col2 = 2;
 
# Cleanup
--disconnect con1
--connection con2
--error 0,ER_LOCK_WAIT_TIMEOUT,ER_DUP_ENTRY,ER_DUP_UNKNOWN_IN_INDEX
--reap
--disconnect con2
--connection default
--error 0,ER_LOCK_DEADLOCK
--reap
DROP TABLE t1;

In that case, I occasionally get a failure that looks different, and is thanks to the Pool poisoning that I recently added in MDEV-15030:

CURRENT_TEST: innodb.xaxa
mysqltest: At line 35: query 'reap' failed with wrong errno 2013: 'Lost connection to MySQL server during query', instead of 0
...
==19852==ERROR: AddressSanitizer: use-after-poison on address 0x7f3b7de70fa0 at pc 0x55a7780eef9d bp 0x7f3b70c123a0 sp 0x7f3b70c12398
READ of size 8 at 0x7f3b7de70fa0 thread T27
Shadow bytes around the buggy address:
  0x0fe7efbc61a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe7efbc61b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe7efbc61c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe7efbc61d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe7efbc61e0: 00 00 00 00 00 f7 00 00 00 00 00 00 00 00 00 00
=>0x0fe7efbc61f0: 00 00 00 00[f7]f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0fe7efbc6200: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0fe7efbc6210: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0fe7efbc6220: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0fe7efbc6230: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0fe7efbc6240: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Poisoned by user:        f7
#9  0x00007f3b88533a48 in __asan_report_load8 () from /usr/lib/x86_64-linux-gnu/libasan.so.5
#10 0x000055a7780eef9d in trx_undo_create (trx=0x7f3b7de70f28, rseg=0x6120000076c0, undo=0x7f3b7de711b0, err=0x7f3b70c12600, mtr=0x7f3b70c12700) at /mariadb/10.3/storage/innobase/trx/trx0undo.cc:1285
#11 0x000055a7780f0abd in trx_undo_assign_low (trx=0x7f3b7de70f28, rseg=0x6120000076c0, undo=0x7f3b7de711b0, err=0x7f3b70c12600, mtr=0x7f3b70c12700) at /mariadb/10.3/storage/innobase/trx/trx0undo.cc:1477
#12 0x000055a7780a3258 in trx_undo_report_row_operation (thr=0x61b0000bd9b0, index=0x617000007108, clust_entry=0x61700007ac88, update=0x0, cmpl_info=0, rec=0x0, offsets=0x0, roll_ptr=0x7f3b70c12d30) at /mariadb/10.3/storage/innobase/trx/trx0rec.cc:2010
#13 0x000055a77817060f in btr_cur_ins_lock_and_undo (flags=0, cursor=0x7f3b70c13900, entry=0x61700007ac88, thr=0x61b0000bd9b0, mtr=0x7f3b70c13d60, inherit=0x7f3b70c12ee0) at /mariadb/10.3/storage/innobase/btr/btr0cur.cc:3026
#14 0x000055a778171933 in btr_cur_optimistic_insert (flags=0, cursor=0x7f3b70c13900, offsets=0x7f3b70c13800, heap=0x7f3b70c137c0, entry=0x61700007ac88, rec=0x7f3b70c13880, big_rec=0x7f3b70c13780, n_ext=0, thr=0x61b0000bd9b0, mtr=0x7f3b70c13d60) at /mariadb/10.3/storage/innobase/btr/btr0cur.cc:3233
#15 0x000055a777e8aea3 in row_ins_clust_index_entry_low (flags=0, mode=2, index=0x617000007108, n_uniq=2, entry=0x61700007ac88, n_ext=0, thr=0x61b0000bd9b0, dup_chk_only=false) at /mariadb/10.3/storage/innobase/row/row0ins.cc:2744
#16 0x000055a777e8e049 in row_ins_clust_index_entry (index=0x617000007108, entry=0x61700007ac88, thr=0x61b0000bd9b0, n_ext=0, dup_chk_only=false) at /mariadb/10.3/storage/innobase/row/row0ins.cc:3249
#17 0x000055a777e8ec91 in row_ins_index_entry (index=0x617000007108, entry=0x61700007ac88, thr=0x61b0000bd9b0) at /mariadb/10.3/storage/innobase/row/row0ins.cc:3376
#18 0x000055a777e8fad9 in row_ins_index_entry_step (node=0x61700007a998, thr=0x61b0000bd9b0) at /mariadb/10.3/storage/innobase/row/row0ins.cc:3526
#19 0x000055a777e90503 in row_ins (node=0x61700007a998, thr=0x61b0000bd9b0) at /mariadb/10.3/storage/innobase/row/row0ins.cc:3668
#20 0x000055a777e91dc7 in row_ins_step (thr=0x61b0000bd9b0) at /mariadb/10.3/storage/innobase/row/row0ins.cc:3923
#21 0x000055a777e0690f in que_thr_step (thr=0x61b0000bd9b0) at /mariadb/10.3/storage/innobase/que/que0que.cc:1025
#22 0x000055a777e06f4f in que_run_threads_low (thr=0x61b0000bd9b0) at /mariadb/10.3/storage/innobase/que/que0que.cc:1108
#23 0x000055a777e07338 in que_run_threads (thr=0x61b0000bd9b0) at /mariadb/10.3/storage/innobase/que/que0que.cc:1148
#24 0x000055a777c27bab in create_index_dict (trx=0x7f3b7de70f28, index=0x61700007a588, add_v=0x0) at /mariadb/10.3/storage/innobase/handler/handler0alter.cc:4754
#25 0x000055a777c312f9 in prepare_inplace_alter_table_dict (ha_alter_info=0x7f3b70c16400, altered_table=0x61f0000896f0, old_table=0x61f000093ef0, table_name=0x61b0000dc045 "t1", flags=33, flags2=80, fts_doc_id_col=18446744073709551615, add_fts_doc_id=false, add_fts_doc_id_idx=false) at /mariadb/10.3/storage/innobase/handler/handler0alter.cc:5555
#26 0x000055a777c3c330 in ha_innobase::prepare_inplace_alter_table (this=0x61d0003be308, altered_table=0x61f0000896f0, ha_alter_info=0x7f3b70c16400) at /mariadb/10.3/storage/innobase/handler/handler0alter.cc:6908
#27 0x000055a77763172b in handler::ha_prepare_inplace_alter_table (this=0x61d0003be308, altered_table=0x61f0000896f0, ha_alter_info=0x7f3b70c16400) at /mariadb/10.3/sql/handler.cc:4386
#28 0x000055a777129453 in mysql_inplace_alter_table (thd=0x62a0000ae270, table_list=0x62b000000388, table=0x61f000093ef0, altered_table=0x61f0000896f0, ha_alter_info=0x7f3b70c16400, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f3b70c16620, alter_ctx=0x7f3b70c174a0) at /mariadb/10.3/sql/sql_table.cc:7456
#29 0x000055a777138e94 in mysql_alter_table (thd=0x62a0000ae270, new_db=0x62a0000b28d8, new_name=0x62a0000b2c88, create_info=0x7f3b70c181e0, table_list=0x62b000000388, alter_info=0x7f3b70c18100, order_num=0, order=0x0, ignore=false) at /mariadb/10.3/sql/sql_table.cc:9668
#30 0x000055a7772a2dde in Sql_cmd_alter_table::execute (this=0x62b000000ab0, thd=0x62a0000ae270) at /mariadb/10.3/sql/sql_alter.cc:328
#31 0x000055a776eee0ba in mysql_execute_command (thd=0x62a0000ae270) at /mariadb/10.3/sql/sql_parse.cc:6282
#32 0x000055a776ef9948 in mysql_parse (thd=0x62a0000ae270, rawbuf=0x62b000000288 "ALTER TABLE t1 ADD UNIQUE KEY uidx(col1)", length=40, parser_state=0x7f3b70c1ac60, is_com_multi=false, is_next_command=false) at /mariadb/10.3/sql/sql_parse.cc:8001

I will have to figure out if this is a false alarm (bug in the Pool memory poisoning).

Comment by Marko Mäkelä [ 2018-04-30 ]

Unfortunately I cannot reproduce the non-XA problem (accessing a freed transaction object) any more. That bug definitely is out there.

Comment by Elena Stepanova [ 2018-04-30 ]

I've filed my (incomplete so far) version of the problem in the comment as MDEV-16063. I suggest to keep this report, MDEV-15791, for the legacy XA issue, as it was originally filed.

Comment by Marko Mäkelä [ 2018-05-07 ]

It is possible that the fix of MDEV-14825 addressed this issue for the most part. Without the broken MDL for XA, the issue would likely not exist in the first place.

I do not see any failures when running with ASAN_OPTIONS=allow_user_poisoning=0,poison_heap=0. With either of those, there are some occasional failures (AddressSanitizer use-after-poison or heap-use-after-free; sorry, no resolved stack traces this time):

10.3 1a4c355a1c5e6b0e323be450fb38633921b42324

mysqltest: At line 40: query 'DROP TABLE t1' failed: 2013: Lost connection to MySQL server during query

Comment by Marko Mäkelä [ 2018-05-07 ]

It looks like the fix of MDEV-14825 also fixed this crash.
The ASAN heap-use-after-poison will be tracked as MDEV-16063.

Generated at Thu Feb 08 08:24:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.