[MDEV-21344] Valgrind uninitialised value warnings in dict_acquire_mdl_shared / dict_table_open_on_id Created: 2019-12-18  Updated: 2020-03-27  Resolved: 2020-01-23

Status: Closed
Project: MariaDB Server
Component/s: Locking, Tests
Affects Version/s: 10.5
Fix Version/s: 10.5.1

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-16678 Use MDL for innodb background threads... Closed
relates to MDEV-21400 encryption.innochecksum fails during ... Closed
relates to MDEV-22060 MSAN use-of-uninitialized-value in ma... Closed

 Description   

10.5 c24253d0

main.partition_innodb_semi_consistent 'innodb' [ fail ]  Found warnings/errors in server log file!
        Test ended at 2019-12-18 02:27:00
line
==22680== Thread 10:
==22680== Conditional jump or move depends on uninitialised value(s)
==22680==    at 0x4C31D21: __memcmp_sse4_1 (vg_replace_strmem.c:1094)
==22680==    by 0x148D14A: dict_table_t* dict_acquire_mdl_shared<false>(dict_table_t*, THD*, MDL_ticket**, dict_table_op_t) (dict0dict.cc:890)
==22680==    by 0x147DC67: dict_table_open_on_id(unsigned long, bool, dict_table_op_t, THD*, MDL_ticket**) (dict0dict.cc:945)
==22680==    by 0x12DD951: row_purge_parse_undo_rec(purge_node_t*, unsigned char*, que_thr_t*, bool*) (row0purge.cc:936)
==22680==    by 0x12DE27E: row_purge(purge_node_t*, unsigned char*, que_thr_t*) (row0purge.cc:1110)
==22680==    by 0x12DE41B: row_purge_step(que_thr_t*) (row0purge.cc:1162)
==22680==    by 0x1246282: que_thr_step(que_thr_t*) (que0que.cc:1038)
==22680==    by 0x1246519: que_run_threads_low(que_thr_t*) (que0que.cc:1100)
==22680==    by 0x1246769: que_run_threads(que_thr_t*) (que0que.cc:1140)
==22680==    by 0x132B5AE: srv_task_execute() (srv0srv.cc:2022)
==22680==    by 0x132BD35: purge_worker_callback(void*) (srv0srv.cc:2172)
==22680==    by 0x157D5A7: tpool::task_group::execute(tpool::task*) (task_group.cc:55)
==22680==    by 0x157D923: tpool::task::execute() (task.cc:32)
==22680==    by 0x157714D: tpool::thread_pool_generic::worker_main(tpool::worker_data*) (tpool_generic.cc:509)
==22680==    by 0x157D3B8: void std::__invoke_impl<void, void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(std::__invoke_memfun_deref, void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) (functional:227)
==22680==    by 0x157D32A: std::result_of<void (tpool::thread_pool_generic::* const&(tpool::thread_pool_generic*&&, tpool::worker_data*&&))(tpool::worker_data*)>::type std::__invoke<void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) (functional:251)
==22680==    by 0x157D2BE: decltype (__invoke((*this)._M_pmf, (forward<tpool::thread_pool_generic*>)({parm#1}), (forward<tpool::worker_data*>)({parm#1}))) std::_Mem_fn_base<void (tpool::thread_pool_generic::*)(tpool::worker_data*), true>::operator()<tpool::thread_pool_generic*, tpool::worker_data*>(tpool::thread_pool_generic*&&, tpool::worker_data*&&) const (functional:604)
==22680==    by 0x157D270: void std::_Bind_simple<std::_Mem_fn<void (tpool::thread_pool_generic::*)(tpool::worker_data*)> (tpool::thread_pool_generic*, tpool::worker_data*)>::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (functional:1391)
==22680==    by 0x157D16E: std::_Bind_simple<std::_Mem_fn<void (tpool::thread_pool_generic::*)(tpool::worker_data*)> (tpool::thread_pool_generic*, tpool::worker_data*)>::operator()() (functional:1380)
==22680==    by 0x157D14D: std::thread::_State_impl<std::_Bind_simple<std::_Mem_fn<void (tpool::thread_pool_generic::*)(tpool::worker_data*)> (tpool::thread_pool_generic*, tpool::worker_data*)> >::_M_run() (thread:197)
==22680== Thread 12:
==22680== Conditional jump or move depends on uninitialised value(s)
==22680==    at 0x148D14D: dict_table_t* dict_acquire_mdl_shared<false>(dict_table_t*, THD*, MDL_ticket**, dict_table_op_t) (dict0dict.cc:889)
==22680==    by 0x147DC67: dict_table_open_on_id(unsigned long, bool, dict_table_op_t, THD*, MDL_ticket**) (dict0dict.cc:945)
==22680==    by 0x12DD951: row_purge_parse_undo_rec(purge_node_t*, unsigned char*, que_thr_t*, bool*) (row0purge.cc:936)
==22680==    by 0x12DE27E: row_purge(purge_node_t*, unsigned char*, que_thr_t*) (row0purge.cc:1110)
==22680==    by 0x12DE41B: row_purge_step(que_thr_t*) (row0purge.cc:1162)
==22680==    by 0x1246282: que_thr_step(que_thr_t*) (que0que.cc:1038)
==22680==    by 0x1246519: que_run_threads_low(que_thr_t*) (que0que.cc:1100)
==22680==    by 0x1246769: que_run_threads(que_thr_t*) (que0que.cc:1140)
==22680==    by 0x132B5AE: srv_task_execute() (srv0srv.cc:2022)
==22680==    by 0x132BD35: purge_worker_callback(void*) (srv0srv.cc:2172)
==22680==    by 0x157D5A7: tpool::task_group::execute(tpool::task*) (task_group.cc:55)
==22680==    by 0x157D923: tpool::task::execute() (task.cc:32)
==22680==    by 0x157714D: tpool::thread_pool_generic::worker_main(tpool::worker_data*) (tpool_generic.cc:509)
==22680==    by 0x157D3B8: void std::__invoke_impl<void, void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(std::__invoke_memfun_deref, void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) (functional:227)
==22680==    by 0x157D32A: std::result_of<void (tpool::thread_pool_generic::* const&(tpool::thread_pool_generic*&&, tpool::worker_data*&&))(tpool::worker_data*)>::type std::__invoke<void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) (functional:251)
==22680==    by 0x157D2BE: decltype (__invoke((*this)._M_pmf, (forward<tpool::thread_pool_generic*>)({parm#1}), (forward<tpool::worker_data*>)({parm#1}))) std::_Mem_fn_base<void (tpool::thread_pool_generic::*)(tpool::worker_data*), true>::operator()<tpool::thread_pool_generic*, tpool::worker_data*>(tpool::thread_pool_generic*&&, tpool::worker_data*&&) const (functional:604)
==22680==    by 0x157D270: void std::_Bind_simple<std::_Mem_fn<void (tpool::thread_pool_generic::*)(tpool::worker_data*)> (tpool::thread_pool_generic*, tpool::worker_data*)>::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (functional:1391)
==22680==    by 0x157D16E: std::_Bind_simple<std::_Mem_fn<void (tpool::thread_pool_generic::*)(tpool::worker_data*)> (tpool::thread_pool_generic*, tpool::worker_data*)>::operator()() (functional:1380)
==22680==    by 0x157D14D: std::thread::_State_impl<std::_Bind_simple<std::_Mem_fn<void (tpool::thread_pool_generic::*)(tpool::worker_data*)> (tpool::thread_pool_generic*, tpool::worker_data*)> >::_M_run() (thread:197)
==22680==    by 0x606CE6E: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22)
^ Found warnings in /data/bld/10.5-valgrind-nightly/mysql-test/var/log/mysqld.1.err

Many more tests in 10.5 fail the same way.



 Comments   
Comment by Marko Mäkelä [ 2019-12-18 ]

I suspect that this was caused by MDEV-16678. Perhaps this even explains the occasional hangs that I have observed, in innodb.alter_large_dml and some other tests.

I was able to repeat this with MemorySanitizer (following the procedure laid out in MDEV-20377). With Valgrind, you probably want to use vgdb and the monitor get_vbits command to find out which bits are claimed to be uninitialized.

Comment by Elena Stepanova [ 2020-01-17 ]

The test mentioned in the description now passes, but another one fails with the same error:

10.5 Valgrind e7558d47

innodb.innodb-alter-debug '32k,innodb'   [ fail ]  Found warnings/errors in server log file!
        Test ended at 2020-01-17 15:54:18
line
==27512== Thread 8:
==27512== Conditional jump or move depends on uninitialised value(s)
==27512==    at 0x4C31D21: __memcmp_sse4_1 (vg_replace_strmem.c:1094)
==27512==    by 0x14AA60C: dict_table_t* dict_acquire_mdl_shared<false>(dict_table_t*, THD*, MDL_ticket**, dict_table_op_t) (dict0dict.cc:893)
==27512==    by 0x149B107: dict_table_open_on_id(unsigned long, bool, dict_table_op_t, THD*, MDL_ticket**) (dict0dict.cc:948)
==27512==    by 0x12F9C64: row_purge_parse_undo_rec(purge_node_t*, unsigned char*, que_thr_t*, bool*) (row0purge.cc:936)
==27512==    by 0x12FA591: row_purge(purge_node_t*, unsigned char*, que_thr_t*) (row0purge.cc:1110)
==27512==    by 0x12FA72E: row_purge_step(que_thr_t*) (row0purge.cc:1162)
==27512==    by 0x126247C: que_thr_step(que_thr_t*) (que0que.cc:1038)
==27512==    by 0x1262713: que_run_threads_low(que_thr_t*) (que0que.cc:1100)
==27512==    by 0x1262963: que_run_threads(que_thr_t*) (que0que.cc:1140)
==27512==    by 0x1347D00: srv_task_execute() (srv0srv.cc:2033)
==27512==    by 0x1348487: purge_worker_callback(void*) (srv0srv.cc:2183)
==27512==    by 0x159AC1F: tpool::task_group::execute(tpool::task*) (task_group.cc:55)
==27512==    by 0x159AFD5: tpool::task::execute() (task.cc:47)
==27512==    by 0x1594859: tpool::thread_pool_generic::worker_main(tpool::worker_data*) (tpool_generic.cc:513)
==27512==    by 0x159AA30: void std::__invoke_impl<void, void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(std::__invoke_memfun_deref, void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) (functional:227)
==27512==    by 0x159A9A2: std::result_of<void (tpool::thread_pool_generic::* const&(tpool::thread_pool_generic*&&, tpool::worker_data*&&))(tpool::worker_data*)>::type std::__invoke<void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>(void (tpool::thread_pool_generic::* const&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) (functional:251)
==27512==    by 0x159A936: decltype (__invoke((*this)._M_pmf, (forward<tpool::thread_pool_generic*>)({parm#1}), (forward<tpool::worker_data*>)({parm#1}))) std::_Mem_fn_base<void (tpool::thread_pool_generic::*)(tpool::worker_data*), true>::operator()<tpool::thread_pool_generic*, tpool::worker_data*>(tpool::thread_pool_generic*&&, tpool::worker_data*&&) const (functional:604)
==27512==    by 0x159A8E8: void std::_Bind_simple<std::_Mem_fn<void (tpool::thread_pool_generic::*)(tpool::worker_data*)> (tpool::thread_pool_generic*, tpool::worker_data*)>::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (functional:1391)
==27512==    by 0x159A7E6: std::_Bind_simple<std::_Mem_fn<void (tpool::thread_pool_generic::*)(tpool::worker_data*)> (tpool::thread_pool_generic*, tpool::worker_data*)>::operator()() (functional:1380)
==27512==    by 0x159A7C5: std::thread::_State_impl<std::_Bind_simple<std::_Mem_fn<void (tpool::thread_pool_generic::*)(tpool::worker_data*)> (tpool::thread_pool_generic*, tpool::worker_data*)> >::_M_run() (thread:197)

Comment by Marko Mäkelä [ 2020-01-23 ]

I can repeat this WITH_MSAN:

10.5 700e010309ea9d0f7c32d0973ec88c3140c67956

CURRENT_TEST: innodb.innodb-alter-debug
mysqltest: At line 22: query 'ALTER TABLE t1ć DROP FOREIGN KEY t1c2, RENAME TO ②' failed with wrong errno 2013: 'Lost connection to MySQL server during query', instead of 1114...
Version: '10.5.1-MariaDB-debug-log'  socket: '/dev/shm/10.5m/mysql-test/var/tmp/11/mysqld.1.sock'  port: 16220  Source distribution
Uninitialized bytes in MemcmpInterceptorCommon at offset 4 inside [0x7f7485209810, 5)
==507723==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55585958193e in bcmp (/dev/shm/10.5m/sql/mysqld+0x6d793e)
    #1 0x55585c55a887 in dict_table_t* dict_acquire_mdl_shared<false>(dict_table_t*, THD*, MDL_ticket**, dict_table_op_t) /mariadb/10.5-MDEV-12353/storage/innobase/dict/dict0dict.cc:893:10
    #2 0x55585c508c1b in dict_table_open_on_id(unsigned long, bool, dict_table_op_t, THD*, MDL_ticket**) /mariadb/10.5-MDEV-12353/storage/innobase/dict/dict0dict.cc:947:12
    #3 0x55585bed35fd in row_purge_parse_undo_rec(purge_node_t*, unsigned char*, que_thr_t*, bool*) /mariadb/10.5-MDEV-12353/storage/innobase/row/row0purge.cc:933:16
    #4 0x55585bed35fd in row_purge(purge_node_t*, unsigned char*, que_thr_t*) /mariadb/10.5-MDEV-12353/storage/innobase/row/row0purge.cc:1107:10
    #5 0x55585bed35fd in row_purge_step(que_thr_t*) /mariadb/10.5-MDEV-12353/storage/innobase/row/row0purge.cc:1159:3
    #6 0x55585bca0ed5 in que_thr_step(que_thr_t*) /mariadb/10.5-MDEV-12353/storage/innobase/que/que0que.cc:1038:9
    #7 0x55585bca0ed5 in que_run_threads_low(que_thr_t*) /mariadb/10.5-MDEV-12353/storage/innobase/que/que0que.cc:1100:14
    #8 0x55585bca0ed5 in que_run_threads(que_thr_t*) /mariadb/10.5-MDEV-12353/storage/innobase/que/que0que.cc:1140:2
    #9 0x55585c0047d9 in srv_task_execute() /mariadb/10.5-MDEV-12353/storage/innobase/srv/srv0srv.cc:2033:3
    #10 0x55585c0047d9 in purge_worker_callback(void*) /mariadb/10.5-MDEV-12353/storage/innobase/srv/srv0srv.cc:2183:10
    #11 0x55585c8c6745 in tpool::task_group::execute(tpool::task*) /mariadb/10.5-MDEV-12353/tpool/task_group.cc:55:9
    #12 0x55585c8b9f04 in tpool::thread_pool_generic::worker_main(tpool::worker_data*) /mariadb/10.5-MDEV-12353/tpool/tpool_generic.cc:513:11
    #13 0x55585c8c3ca9 in decltype(*(std::__1::forward<tpool::thread_pool_generic*>(fp0)).*fp(std::__1::forward<tpool::worker_data*>(fp1))) std::__1::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*, void>(void (tpool::thread_pool_generic::*&&)(tpool::worker_data*), tpool::thread_pool_generic*&&, tpool::worker_data*&&) /usr/lib/llvm-9/bin/../include/c++/v1/type_traits:3471:1
    #14 0x55585c8c3ca9 in void std::__1::__thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*, 2ul, 3ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*>&, std::__1::__tuple_indices<2ul, 3ul>) /usr/lib/llvm-9/bin/../include/c++/v1/thread:277:5
    #15 0x55585c8c3ca9 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >(void*) /usr/lib/llvm-9/bin/../include/c++/v1/thread:287:5
    #16 0x7f749b04dfb6 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x8fb6)
    #17 0x7f749a89f2de in clone (/lib/x86_64-linux-gnu/libc.so.6+0xfa2de)
 
  Uninitialized value was created by an allocation of 'tbl_buf' in the stack frame of function '_Z23dict_acquire_mdl_sharedILb0EEP12dict_table_tS1_P3THDPP10MDL_ticket15dict_table_op_t'
    #0 0x55585c5599b0 in dict_table_t* dict_acquire_mdl_shared<false>(dict_table_t*, THD*, MDL_ticket**, dict_table_op_t) /mariadb/10.5-MDEV-12353/storage/innobase/dict/dict0dict.cc:793
 
SUMMARY: MemorySanitizer: use-of-uninitialized-value (/dev/shm/10.5m/sql/mysqld+0x6d793e) in bcmp
Exiting
200123 14:18:01 [ERROR] mysqld got signal 6 ;
#9  0x000055c17abba93f in bcmp ()
    at /mariadb/10.5-MDEV-12353/sql/sql_string.h:137
#10 0x000055c17db93888 in dict_acquire_mdl_shared<false> (
    table=0x71800007ec88, thd=0x0, mdl=<optimized out>, 
    table_op=DICT_TABLE_OP_NORMAL)
    at /mariadb/10.5-MDEV-12353/storage/innobase/dict/dict0dict.cc:893
 
(gdb) f 10
#10 0x000055c17db93888 in dict_acquire_mdl_shared<false> (
    table=0x71800007ec88, thd=0x0, mdl=<optimized out>, 
    table_op=DICT_TABLE_OP_NORMAL)
    at /mariadb/10.5-MDEV-12353/storage/innobase/dict/dict0dict.cc:893
893	        !memcmp(tbl_buf, tbl_buf1, tbl_len))
(gdb) p *table
$1 = {id = 21, id_hash = 0x0, name = {m_name = 0x7010000042d0 "test/@2460"}, 
  name_hash = 0x0, heap = 0x70f000017ac0, data_dir_path = 0x0, 
  space = 0x713000030200, space_id = 6, flags = 33, flags2 = 80, 

The table name is test.① (U+2460). I will try to rerun with --manual-gdb so that I will be able to debug by invoking __msan_print_shadow() on a live process.

Comment by Marko Mäkelä [ 2020-01-23 ]

With a patch provided by thiru, the test passes for me:

LD_LIBRARY_PATH=/mariadb/llvm-toolchain-9-9.0.1/libc++msan/lib MSAN_OPTIONS=abort_on_error=1 ./mtr --big-test --parallel=auto innodb.innodb-alter-debug 

innodb.innodb-alter-debug '32k,innodb'   w5 [ pass ]    393
innodb.innodb-alter-debug '16k,innodb'   w7 [ pass ]    535
innodb.innodb-alter-debug '64k,innodb'   w9 [ pass ]    535
innodb.innodb-alter-debug '8k,innodb'    w12 [ pass ]    527
innodb.innodb-alter-debug '4k,innodb'    w10 [ pass ]    521

The patch:

diff --git a/storage/innobase/dict/dict0dict.cc b/storage/innobase/dict/dict0dict.cc
index 98d0f85ccd8..7ae8e65e8e5 100644
--- a/storage/innobase/dict/dict0dict.cc
+++ b/storage/innobase/dict/dict0dict.cc
@@ -754,7 +754,6 @@ bool dict_table_t::parse_name(char (&db_name)[NAME_LEN + 1],
     mutex_exit(&dict_sys.mutex);
 
   *db_name_len= db_len;
-  *tbl_name_len= tbl_len;
 
   filename_to_tablename(db_buf, db_name, MAX_DATABASE_NAME_LEN + 1, true);
 
@@ -763,12 +762,10 @@ bool dict_table_t::parse_name(char (&db_name)[NAME_LEN + 1],
     return false;
 
   if (char* is_part= strchr(tbl_buf, '#'))
-  {
     *is_part = '\0';
-    *tbl_name_len= is_part - tbl_buf;
-  }
 
   filename_to_tablename(tbl_buf, tbl_name, MAX_TABLE_NAME_LEN + 1, true);
+  *tbl_name_len= strlen(tbl_name);
   return true;
 }
 

Generated at Thu Feb 08 09:06:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.