Details
Bug
Status: Closed
Critical
Resolution: Fixed
10.6, 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL), 11.3(EOL), 11.4
Description
The following failure is reported by MemorySanitizer on our CI systems every now and then, and I can reproduce it locally as well:
10.6 73291de74e49a84700ce4e2aa2c7ec6769d884dc
innodb_gis.rtree_compress '16k,innodb' w1 [ 3 fail ]
        Test ended at 2024-04-08 17:14:37

CURRENT_TEST: innodb_gis.rtree_compress
mysqltest: At line 43: query 'rollback' failed: <Unknown> (2013): Lost connection to server during query
...
==932423==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x5641413d732e in rtr_pcur_getnext_from_path(dtuple_t const*, page_cur_mode_t, btr_cur_t*, unsigned long, unsigned long, bool, mtr_t*) /mariadb/10.6/storage/innobase/gis/gis0sea.cc:292:3
    #1 0x5641413c817e in rtr_search(dtuple_t const*, btr_latch_mode, btr_pcur_t*, mtr_t*) /mariadb/10.6/storage/innobase/gis/gis0sea.cc:1087:8
    #2 0x5641418bbc90 in row_undo_ins_remove_sec_low(btr_latch_mode, dict_index_t*, dtuple_t*, que_thr_t*) /mariadb/10.6/storage/innobase/row/row0uins.cc:283:7
    #3 0x5641418b6c14 in row_undo_ins_remove_sec(dict_index_t*, dtuple_t*, que_thr_t*) /mariadb/10.6/storage/innobase/row/row0uins.cc:353:8
    #4 0x5641418b6c14 in row_undo_ins_remove_sec_rec(undo_node_t*, que_thr_t*) /mariadb/10.6/storage/innobase/row/row0uins.cc:547:10
    #5 0x5641418b6c14 in row_undo_ins(undo_node_t*, que_thr_t*) /mariadb/10.6/storage/innobase/row/row0uins.cc:599:9
    #6 0x5641418b3429 in row_undo(undo_node_t*, que_thr_t*) /mariadb/10.6/storage/innobase/row/row0undo.cc:399:5
    #7 0x5641418b3429 in row_undo_step(que_thr_t*) /mariadb/10.6/storage/innobase/row/row0undo.cc:440:8
    #8 0x5641417021fe in que_thr_step(que_thr_t*) /mariadb/10.6/storage/innobase/que/que0que.cc:586:9
    #9 0x5641417021fe in que_run_threads_low(que_thr_t*) /mariadb/10.6/storage/innobase/que/que0que.cc:644:25
    #10 0x5641417021fe in que_run_threads(que_thr_t*) /mariadb/10.6/storage/innobase/que/que0que.cc:664:2
    #11 0x564141988b70 in trx_t::rollback_low(trx_savept_t*) /mariadb/10.6/storage/innobase/trx/trx0roll.cc:125:5
    #12 0x564141982dcd in trx_rollback_for_mysql_low(trx_t*) /mariadb/10.6/storage/innobase/trx/trx0roll.cc:196:7
    #13 0x564141982dcd in trx_rollback_for_mysql(trx_t*) /mariadb/10.6/storage/innobase/trx/trx0roll.cc
    #14 0x5641410d3d85 in innobase_rollback(handlerton*, THD*, bool) /mariadb/10.6/storage/innobase/handler/ha_innodb.cc:4697:11
    #15 0x56413f9c4faf in ha_rollback_trans(THD*, bool) /mariadb/10.6/sql/handler.cc:2237:17
    #16 0x5641408bdfae in trans_rollback(THD*) /mariadb/10.6/sql/transaction.cc:387:8
    #17 0x5641403aed19 in mysql_execute_command(THD*, bool) /mariadb/10.6/sql/sql_parse.cc:5777:27
    #18 0x564140395f52 in mysql_parse(THD*, char*, unsigned int, Parser_state*) /mariadb/10.6/sql/sql_parse.cc:8139:18
    #19 0x56414038e690 in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool) /mariadb/10.6/sql/sql_parse.cc:1896:7
    #20 0x564140397014 in do_command(THD*, bool) /mariadb/10.6/sql/sql_parse.cc:1409:17
    #21 0x56414087e4f5 in do_handle_one_connection(CONNECT*, bool) /mariadb/10.6/sql/sql_connect.cc:1415:11
    #22 0x56414087df65 in handle_one_connection /mariadb/10.6/sql/sql_connect.cc:1317:5
    #23 0x564140e72c40 in pfs_spawn_thread /mariadb/10.6/storage/perfschema/pfs.cc:2201:3
    #24 0x7f586a6a645b in start_thread nptl/pthread_create.c:444:8
    #25 0x7f586a726bbb in clone3 misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

  Uninitialized value was stored to memory at
    #0 0x5641413d7327 in page_is_leaf(unsigned char const*) /mariadb/10.6/storage/innobase/include/page0page.h:263:10
    #1 0x5641413d7327 in rtr_pcur_getnext_from_path(dtuple_t const*, page_cur_mode_t, btr_cur_t*, unsigned long, unsigned long, bool, mtr_t*) /mariadb/10.6/storage/innobase/gis/gis0sea.cc:292:3

  Memory was marked as uninitialized
    #0 0x56413f8dc971 in __msan_allocated_memory (/dev/shm/10.6msan/sql/mariadbd+0xfd9971) (BuildId: 963d974e3992c2b5)
    #1 0x5641412af2fa in buf_LRU_block_free_non_file_page(buf_block_t*) /mariadb/10.6/storage/innobase/buf/buf0lru.cc:975:2
|
A possible reason why earlier versions are not affected could be that MDEV-23484 was fixed in 10.6.
The problem seems to be inadequate locking in rtr_pcur_getnext_from_path(). I have a patch with which the following run passes on my local system (see MDEV-20377 for how to build and run tests with MemorySanitizer):
LD_LIBRARY_PATH=~/libmsan-18 MSAN_SYMBOLIZER_PATH=~/bin/llvm-symbolizer-msan ./mtr --parallel=auto --repeat=10 innodb_gis.rtree_compress{,,,,,,,,,,}
10.6 73291de74e49a84700ce4e2aa2c7ec6769d884dc with patch
innodb_gis.rtree_compress '4k,innodb' w14 [ 10 pass ] 17794
innodb_gis.rtree_compress '4k,innodb' w17 [ 10 pass ] 17397
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 8710.576 of 305 seconds executing testcases

Completed: All 330 tests were successful.
Issue Links
- is caused by: MDEV-24142 rw_lock_t has unnecessarily complex wait logic (Closed)
- is part of: MDEV-33073 always green buildbot (Stalled)
-
marko Thanks for the change. I think the debug assert removal is correct, but AFAICS the latching changes are not required.
We release all the page latches (mtr->rollback_to_savepoint(1)) when rtr_pcur_getnext_from_path() is called. This looks fine, as we are holding either an X (pessimistic) or an SX (optimistic) latch at the index level and already have all the possible page IDs cached.
The only issue is that the cursor now points to a possibly invalid block, "btr_cur->page_cur.block". We eventually set the cursor to the next candidate page after latching it, and there doesn't seem to be any real issue other than this debug assert trying to access the block.
The block can be freed, or even reused, concurrently, and MSAN appears to catch the case where it has indeed been freed. We should not access the block/page at this point, which justifies the removal of the assert.
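The hazard described in this comment can be illustrated with a toy model. This is only a minimal sketch and not InnoDB code: ToyBlock, ToyPool, ToyCursor, move_to_next() and demo_stale_then_relatch() are hypothetical names invented for illustration, and concurrency is modelled sequentially. It shows a cursor whose block pointer outlives its latch, so that an assertion reading the page through that pointer could see a block that has already been freed, whereas latching the next cached page ID first and inspecting only that block is safe.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Toy model only: none of these types or functions exist in InnoDB.
struct ToyBlock {
  uint32_t page_id = 0;
  bool is_leaf = false;
  bool in_use = false;  // false once the block is back on the free list
  int latches = 0;      // > 0 while a mini-transaction holds a latch on it
};

class ToyPool {
 public:
  ToyPool(std::size_t n_blocks, std::map<uint32_t, bool> disk)
      : blocks_(n_blocks), disk_(std::move(disk)) {}

  // Return a latched block holding page_id, loading it into a free block
  // if it is not cached. Returns nullptr if the page or a free block is missing.
  ToyBlock* latch_page(uint32_t page_id) {
    for (ToyBlock& b : blocks_)
      if (b.in_use && b.page_id == page_id) { ++b.latches; return &b; }
    auto it = disk_.find(page_id);
    if (it == disk_.end()) return nullptr;
    for (ToyBlock& b : blocks_) {
      if (b.in_use) continue;
      b.page_id = page_id;
      b.is_leaf = it->second;
      b.in_use = true;
      b.latches = 1;
      return &b;
    }
    return nullptr;
  }

  // Toy counterpart of releasing page latches back to a savepoint.
  void release(ToyBlock* b) { if (b->latches > 0) --b->latches; }

  // Eviction may free any block that nobody has latched.
  void evict_unlatched() {
    for (ToyBlock& b : blocks_)
      if (b.in_use && b.latches == 0) b.in_use = false;
  }

 private:
  std::vector<ToyBlock> blocks_;
  std::map<uint32_t, bool> disk_;  // page_id -> is_leaf
};

struct ToyCursor {
  ToyBlock* block = nullptr;   // stale as soon as its latch is released
  std::vector<uint32_t> path;  // candidate page IDs cached while latched
};

// Drop the current block without reading it afterwards, then latch the next
// cached candidate. Only the freshly latched block may be inspected.
bool move_to_next(ToyPool& pool, ToyCursor& cur) {
  if (cur.block) pool.release(cur.block);
  cur.block = nullptr;  // forget the pointer instead of asserting on it
  if (cur.path.empty()) return false;
  uint32_t next = cur.path.back();
  cur.path.pop_back();
  cur.block = pool.latch_page(next);
  return cur.block != nullptr;
}

// Returns true if the stale block was freed in between and the re-latched
// block is the expected leaf page.
bool demo_stale_then_relatch() {
  ToyPool pool(1, {{1, false}, {2, true}});  // page 1 non-leaf, page 2 leaf
  ToyCursor cur;
  cur.block = pool.latch_page(1);
  cur.path.push_back(2);

  ToyBlock* stale = cur.block;
  pool.release(stale);     // latches dropped, pointer kept
  pool.evict_unlatched();  // eviction that could happen concurrently
  bool was_freed = !stale->in_use;  // checking is_leaf here would be meaningless
  cur.block = nullptr;

  bool ok = move_to_next(pool, cur);
  return was_freed && ok && cur.block->page_id == 2 && cur.block->is_leaf &&
         cur.block->latches == 1;
}
```

In this sketch demo_stale_then_relatch() returns true: the block that the cursor pointed to has been freed by the time the latch is dropped, while the block obtained by latching the cached page ID is valid to read. In the toy the freed block is still addressable memory; in a real buffer pool it may be poisoned or reused, which is what a sanitizer would flag.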