[MDEV-24549] ASAN failure or SEGV during DROP TABLE IF EXISTS Created: 2021-01-08  Updated: 2021-01-11

Status: Open
Project: MariaDB Server
Component/s: Locking, Storage Engine - InnoDB
Affects Version/s: 10.3.28
Fix Version/s: 10.3

Type: Bug Priority: Major
Reporter: Matthias Leich Assignee: Vladislav Vaintroub
Resolution: Unresolved Votes: 0
Labels: rr-profile

Attachments: File MDEV-17092.cfg     File MDEV-17092.sql     File MDEV-17092.yy     File os_thread_sleep.patch    

 Description   

 
This is a concurrency bug.
 
HEAD, origin/bb-10.3-MDEV-23536 f35c2e65a7a3df5ee71b0a48474835337f843872 2020-12-29T12:42:01+02:00
 
Effect 1)
pluto:/data/Results/1610048719/Marko1
cd dev/shm/vardir/1610048719/170/1/rr
_RR_TRACE_DIR="." rr replay --mark-stdio
 
Thread 3 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2800615.2812796]
0x0000561a434158bc in MDL_lock::incompatible_granted_types_bitmap (this=0x61600005eba8) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:474
474       { return m_strategy->incompatible_granted_types_bitmap(); }
(rr) bt
#0  0x0000561a434158bc in MDL_lock::incompatible_granted_types_bitmap (this=0x61600005eba8) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:474
#1  0x0000561a4340e0ca in MDL_ticket::has_stronger_or_equal_type (this=0x6060003fd820, type=MDL_SHARED) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:1722
#2  0x0000561a4340e5bb in MDL_context::find_ticket (this=0x62a000210328, mdl_request=0x35c73ac62e10, result_duration=0x35c73ac62e00) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:1773
#3  0x0000561a4341337c in MDL_context::is_lock_owner (this=0x62a000210328, mdl_namespace=MDL_key::TABLE, db=0x62b0000e09d8 "test", name=0x62b0000e0328 "t1", mdl_type=MDL_SHARED) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:2771
#4  0x0000561a4326229b in mysql_rm_table_no_locks (thd=0x62a000210208, tables=0x62b0000e0378, if_exists=true, drop_temporary=false, drop_view=false, drop_sequence=false, dont_log_query=false, dont_free_locks=false)
    at /Server/bb-10.3-MDEV-23536/sql/sql_table.cc:2428
#5  0x0000561a432609e4 in mysql_rm_table (thd=0x62a000210208, tables=0x62b0000e0378, if_exists=true, drop_temporary=false, drop_sequence=false) at /Server/bb-10.3-MDEV-23536/sql/sql_table.cc:2130
#6  0x0000561a430402da in mysql_execute_command (thd=0x62a000210208) at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:4824
#7  0x0000561a43055674 in mysql_parse (thd=0x62a000210208, rawbuf=0x62b0000e0228 "DROP TABLE IF EXISTS t1 /* E_R Thread4 QNO 134 CON_ID 141 */", length=60, parser_state=0x35c73ac658b0, is_com_multi=false, is_next_command=false)
    at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:7840
#8  0x0000561a4302c4b8 in dispatch_command (command=COM_QUERY, thd=0x62a000210208, packet=0x629001581209 "DROP TABLE IF EXISTS t1 /* E_R Thread4 QNO 134 CON_ID 141 */ ", packet_length=61, is_com_multi=false, is_next_command=false)
    at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:1852
#9  0x0000561a43028fef in do_command (thd=0x62a000210208) at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:1398
#10 0x0000561a433f3872 in do_handle_one_connection (connect=0x608000001028) at /Server/bb-10.3-MDEV-23536/sql/sql_connect.cc:1403
#11 0x0000561a433f312c in handle_one_connection (arg=0x608000001028) at /Server/bb-10.3-MDEV-23536/sql/sql_connect.cc:1308
#12 0x000028880cfa6609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#13 0x000042c83a78f293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(rr) 
 
Effect 2)
pluto:/data/Results/1610048719/ASAN-11
cd dev/shm/vardir/1610048719/268/1/rr
_RR_TRACE_DIR="." rr replay --mark-stdio
 
Thread 3 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2571156.2582658]
0x0000564556f948bc in MDL_lock::incompatible_granted_types_bitmap (this=0x616000185aa8) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:474
474       { return m_strategy->incompatible_granted_types_bitmap(); }
...
(rr) bt
#0  0x0000564556f948bc in MDL_lock::incompatible_granted_types_bitmap (this=0x616000185aa8) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:474
#1  0x0000564556f8d0ca in MDL_ticket::has_stronger_or_equal_type (this=0x606000083b40, type=MDL_SHARED) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:1722
#2  0x0000564556f8d5bb in MDL_context::find_ticket (this=0x62a000210328, mdl_request=0x2c296b54fe10, result_duration=0x2c296b54fe00) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:1773
#3  0x0000564556f9237c in MDL_context::is_lock_owner (this=0x62a000210328, mdl_namespace=MDL_key::TABLE, db=0x62b0000e09d8 "test", name=0x62b0000e0328 "t1", mdl_type=MDL_SHARED) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:2771
#4  0x0000564556de129b in mysql_rm_table_no_locks (thd=0x62a000210208, tables=0x62b0000e0378, if_exists=true, drop_temporary=false, drop_view=false, drop_sequence=false, dont_log_query=false, dont_free_locks=false)
    at /Server/bb-10.3-MDEV-23536/sql/sql_table.cc:2428
#5  0x0000564556ddf9e4 in mysql_rm_table (thd=0x62a000210208, tables=0x62b0000e0378, if_exists=true, drop_temporary=false, drop_sequence=false) at /Server/bb-10.3-MDEV-23536/sql/sql_table.cc:2130
#6  0x0000564556bbf2da in mysql_execute_command (thd=0x62a000210208) at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:4824
#7  0x0000564556bd4674 in mysql_parse (thd=0x62a000210208, rawbuf=0x62b0000e0228 "DROP TABLE IF EXISTS t1 /* E_R Thread2 QNO 1548 CON_ID 1055 */", length=62, parser_state=0x2c296b5528b0, is_com_multi=false, is_next_command=false)
    at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:7840
#8  0x0000564556bab4b8 in dispatch_command (command=COM_QUERY, thd=0x62a000210208, packet=0x629000e4c209 "DROP TABLE IF EXISTS t1 /* E_R Thread2 QNO 1548 CON_ID 1055 */ ", packet_length=63, is_com_multi=false, is_next_command=false)
    at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:1852
#9  0x0000564556ba7fef in do_command (thd=0x62a000210208) at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:1398
#10 0x0000564556f72872 in do_handle_one_connection (connect=0x608000001028) at /Server/bb-10.3-MDEV-23536/sql/sql_connect.cc:1403
#11 0x0000564556f7212c in handle_one_connection (arg=0x608000001028) at /Server/bb-10.3-MDEV-23536/sql/sql_connect.cc:1308
#12 0x0000000067fa0609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#13 0x00007a5e75af3293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(rr)
 
SUMMARY: AddressSanitizer: heap-use-after-free /Server/bb-10.3-MDEV-23536/sql/sql_plist.h:129 in I_P_List<MDL_ticket, I_P_List_adapter<MDL_ticket, &MDL_ticket::next_in_lock, &MDL_ticket::prev_in_lock>, I_P_List_null_counter, I_P_List_fast_push_back<MDL_ticket> >::remove(MDL_ticket*)
 
 
RQG
====
git clone https://github.com/mleich1/rqg --branch experimental RQG
 
rqg.pl \
--duration=300 \
--queries=10000000 \
--no_mask \
--seed=random \
--engine=InnoDB \
--gendata=conf/mariadb/table_stress.zz \
--gendata_sql=MDEV-17092.sql \
--rpl_mode=none \
--mysqld=--net_read_timeout=30 \
--mysqld=--log-bin \
--mysqld=--net_write_timeout=60 \
--mysqld=--log_bin_trust_function_creators=1 \
--mysqld=--wait_timeout=28800 \
--mysqld=--loose-debug_assert_on_not_freed_memory=0 \
--mysqld=--interactive_timeout=28800 \
--mysqld=--loose-idle_transaction_timeout=0 \
--mysqld=--slave_net_timeout=60 \
--mysqld=--log-output=none \
--mysqld=--loose-idle_readonly_transaction_timeout=0 \
--mysqld=--innodb-lock-wait-timeout=50 \
--mysqld=--loose-idle_write_transaction_timeout=0 \
--mysqld=--loose-table_lock_wait_timeout=50 \
--mysqld=--lock-wait-timeout=86400 \
--mysqld=--innodb_stats_persistent=off \
--mysqld=--connect_timeout=60 \
--mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \
--reporters=Backtrace,Deadlock1,ErrorLog \
--validators=None \
--grammar=
--threads=10 \
--grammar=MDEV-17092.yy \
--workdir=<local settings> \
--vardir=<local settings> \
--mtr-build-thread=<local settings> \
--basedir1=<local settings> \
--script_debug=_nix_ \
--rr=Extended \
--rr_options=--chaos



 Comments   
Comment by Matthias Leich [ 2021-01-08 ]

Using the os_thread_sleep patch is highly recommended.
 
If going with the RQG call from the first comment you might need several attempts till one of the described effects shows up.
 
Alternative way for replaying
cd <top directory of RQG install>
Edit REPLAY.sh and raise the number of TRIALS to a value > 500.
./REPLAY.sh MDEV-17092.cfg <path to mariadb binaries>
Output:
... replay ...     One of the three error message patterns which I observed when applying the MDEV-17092 test to some old 10.2 tree.
...interest...   Just some different bad effect and most probably the one MDEV-24549 is about.
 

Comment by Marko Mäkelä [ 2021-01-08 ]

I debugged an rr replay of a SIGSEGV on a 10.3 branch of the MDEV-23536 fix (which I think is innocent). It seems to be a race condition between two threads in the SQL layer. The SIGSEGV occurred because m_strategy==NULL. It was cleared in this thread:

#4  0x0000561a43413a6c in MDL_context::release_transactional_locks (
    this=0x62a0000c0328) at /Server/bb-10.3-MDEV-23536/sql/mdl.cc:2852
#5  0x0000561a42e30476 in THD::release_transactional_locks (this=
    0x62a0000c0208) at /Server/bb-10.3-MDEV-23536/sql/sql_class.h:4478
#6  0x0000561a43049e3b in mysql_execute_command (thd=0x62a0000c0208)
    at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:6176
#7  0x0000561a43055674 in mysql_parse (thd=0x62a0000c0208, 
    rawbuf=0x62b00001c228 "DROP TABLE IF EXISTS t1 /* E_R Thread8 QNO 161 CON_ID 150 */", length=60, parser_state=0x383053bea8b0, is_com_multi=false, 
    is_next_command=false) at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:7840

At the time of the above execution, the thread that would ultimately fail with the SIGSEGV (if I remember correctly, also deep inside MDL_context::release_transactional_locks()) was executing the following:

#6  0x0000561a4326157f in mysql_rm_table_no_locks (thd=0x62a000210208, tables=0x62b0000e0378, if_exists=true, drop_temporary=false, drop_view=false, drop_sequence=false, dont_log_query=false, dont_free_locks=false) at /Server/bb-10.3-MDEV-23536/sql/sql_table.cc:2310
#7  0x0000561a432609e4 in mysql_rm_table (thd=0x62a000210208, tables=0x62b0000e0378, if_exists=true, drop_temporary=false, drop_sequence=false) at /Server/bb-10.3-MDEV-23536/sql/sql_table.cc:2130
#8  0x0000561a430402da in mysql_execute_command (thd=0x62a000210208) at /Server/bb-10.3-MDEV-23536/sql/sql_parse.cc:4824
#9  0x0000561a43055674 in mysql_parse (thd=0x62a000210208, rawbuf=0x62b0000e0228 "DROP TABLE IF EXISTS t1 /* E_R Thread4 QNO 134 CON_ID 141 */", length=60, pars…

To me, it looks like a race condition between two DROP TABLE IF EXISTS. I did not check whether the table actually existed at that point of time.

Note: Without the MDEV-23536 fix applied to 10.3, you are likely to encounter other errors.

Comment by Marko Mäkelä [ 2021-01-11 ]

The MDEV-23536 fix has been merged to 10.3.

Generated at Thu Feb 08 09:30:49 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.