Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13316

Unresolved conflict between OPTIMIZE TABLE and DML transaction, causing cluster hang

    XMLWordPrintable

Details

    Description

      MariaDB Galera Cluster node hangs for the duration of innodb_lock_wait_timeout, if there is an OPTIMIZE TABLE statement conflicting with regular DML transaction. The problem appears to be that OPTIMIZE TABLE tried to abort the victim DML transaction, but DML transaction is in innodb lock wait, and aborts only after the lock wait times out. Stack traces for these threads are as follows:

      Thread 3 (Thread 0x7f1e7024fb00 (LWP 11581)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 ()
          at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      #1  0x00007f1e838bc6e6 in os_cond_wait (fast_mutex=0x7f1e200273a0, 
          cond=0x7f1e200273d8)
          at /home/seppo/work/MariaDB/server/storage/xtradb/os/os0sync.cc:196
      #2  os_event_wait_low (event=0x7f1e200273a0, reset_sig_count=326, 
          reset_sig_count@entry=0)
          at /home/seppo/work/MariaDB/server/storage/xtradb/os/os0sync.cc:477
      #3  0x00007f1e8388945d in lock_wait_suspend_thread (thr=thr@entry=0x7f1e4804e718)
          at /home/seppo/work/MariaDB/server/storage/xtradb/lock/lock0wait.cc:342
      #4  0x00007f1e8392f6af in row_mysql_handle_errors (
          new_err=new_err@entry=0x7f1e7024bbfc, trx=trx@entry=0x7f1e0406af88, 
          thr=thr@entry=0x7f1e4804e718, savept=savept@entry=0x0)
          at /home/seppo/work/MariaDB/server/storage/xtradb/row/row0mysql.cc:1173
      #5  0x00007f1e8395f74a in row_search_for_mysql (
          buf=buf@entry=0x7f1e4804a7e8 "\341\302\001", mode=<optimized out>, 
          mode@entry=0, prebuilt=0x7f1e4804dfb8, match_mode=match_mode@entry=0, 
          direction=direction@entry=1)
          at /home/seppo/work/MariaDB/server/storage/xtradb/row/row0sel.cc:5198
      #6  0x00007f1e8383c62e in ha_innodb::general_fetch (this=0x7f1e4804d378, 
          buf=0x7f1e4804a7e8 "\341\302\001", direction=direction@entry=1, 
          match_mode=match_mode@entry=0)
          at /home/seppo/work/MariaDB/server/storage/xtradb/handler/ha_innodb.cc:10437
      #7  0x00007f1e8383c8d7 in ha_innodb::index_next (this=<optimized out>, 
          buf=<optimized out>)
          at /home/seppo/work/MariaDB/server/storage/xtradb/handler/ha_innodb.cc:10507
      #8  0x000056500ab9fc32 in handler::ha_index_next (this=this@entry=0x7f1e4804d378, 
          buf=0x7f1e4804a7e8 "\341\302\001")
          at /home/seppo/work/MariaDB/server/sql/handler.cc:2666
      #9  0x000056500aba527f in handler::read_range_next (this=0x7f1e4804d378)
          at /home/seppo/work/MariaDB/server/sql/handler.cc:5331
      #10 0x000056500ab01e74 in handler::multi_range_read_next (this=0x7f1e4804d378, 
          range_info=range_info@entry=0x7f1e7024c658)
          at /home/seppo/work/MariaDB/server/sql/multi_range_read.cc:257
      #11 0x000056500ab01f78 in Mrr_simple_index_reader::get_next (this=0x7f1e4804d980, 
          range_info=0x7f1e7024c658)
          at /home/seppo/work/MariaDB/server/sql/multi_range_read.cc:322
      #12 0x000056500ab03806 in DsMrr_impl::dsmrr_next (this=0x7f1e4804d840, 
          range_info=0x7f1e7024c658)
          at /home/seppo/work/MariaDB/server/sql/multi_range_read.cc:1408
      #13 0x00007f1e83827b92 in ha_innodb::multi_range_read_next (this=<optimized out>, 
          range_info=<optimized out>)
          at /home/seppo/work/MariaDB/server/storage/xtradb/handler/ha_innodb.cc:22465
      #14 0x000056500ac8b6b0 in QUICK_RANGE_SELECT::get_next (this=0x7f1e0405fa70)
          at /home/seppo/work/MariaDB/server/sql/opt_range.cc:11184
      #15 0x000056500acab554 in rr_quick (info=0x7f1e7024c830)
          at /home/seppo/work/MariaDB/server/sql/records.cc:349
      #16 0x000056500aa89847 in mysql_update (thd=thd@entry=0x56500df30a50, 
          table_list=0x7f1e040c9fc0, fields=..., values=..., conds=0x7f1e040cd300, 
          order_num=<optimized out>, order=<optimized out>, limit=18446744073709551611, 
          handle_duplicates=DUP_ERROR, ignore=false, found_return=0x7f1e7024cc00, 
          updated_return=0x7f1e7024cc90)
          at /home/seppo/work/MariaDB/server/sql/sql_update.cc:639
      #17 0x000056500a9dfeaa in mysql_execute_command (thd=thd@entry=0x56500df30a50)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:3746
      #18 0x000056500a9e6688 in mysql_parse (thd=thd@entry=0x56500df30a50, 
          rawbuf=rawbuf@entry=0x7f1e040c9d68 "UPDATE comm38 SET x = (x + 1) % 65537, y = ((y*x % 65537) + (41504*y % 65537) + (y*z % 65537)) % 65537, z = ((z*x % 65537) + (z*y % 65537) + (41504*z % 65537)) % 65537, r = RAND() WHERE r > 0.329837 A"..., 
          length=length@entry=215, parser_state=parser_state@entry=0x7f1e7024e5e0)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:7319
      #19 0x000056500a9e71c6 in wsrep_mysql_parse (thd=thd@entry=0x56500df30a50, 
          rawbuf=0x7f1e040c9d68 "UPDATE comm38 SET x = (x + 1) % 65537, y = ((y*x % 65537) + (41504*y % 65537) + (y*z % 65537)) % 65537, z = ((z*x % 65537) + (z*y % 65537) + (41504*z % 65537)) % 65537, r = RAND() WHERE r > 0.329837 A"..., length=215, 
          parser_state=parser_state@entry=0x7f1e7024e5e0)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:7142
      #20 0x000056500a9e8705 in dispatch_command (command=command@entry=COM_QUERY, 
          thd=thd@entry=0x56500df30a50, 
          packet=packet@entry=0x56500de10541 "UPDATE comm38 SET x = (x + 1) % 65537, y = ((y*x % 65537) + (41504*y % 65537) + (y*z % 65537)) % 65537, z = ((z*x % 65537) + (z*y % 65537) + (41504*z % 65537)) % 65537, r = RAND() WHERE r > 0.329837 A"..., 
          packet_length=packet_length@entry=215)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:1485
      #21 0x000056500a9ead14 in do_command (thd=0x56500df30a50)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:1108
      #22 0x000056500aac92d0 in do_handle_one_connection (
          thd_arg=thd_arg@entry=0x56500df30a50)
          at /home/seppo/work/MariaDB/server/sql/sql_connect.cc:1350
      #23 0x000056500aac94bc in handle_one_connection (arg=arg@entry=0x56500df30a50)
          at /home/seppo/work/MariaDB/server/sql/sql_connect.cc:1262
      #24 0x000056500ad26587 in pfs_spawn_thread (arg=0x56500de2d2e0)
          at /home/seppo/work/MariaDB/server/storage/perfschema/pfs.cc:1860
      #25 0x00007f1e91df66ba in start_thread (arg=0x7f1e7024fb00)
          at pthread_create.c:333
      #26 0x00007f1e914a13dd in clone ()
          at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
       
      Thread 2 (Thread 0x7f1e70299b00 (LWP 11591)):
      #0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
          at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
      #1  0x000056500ae6daf9 in safe_cond_timedwait (cond=cond@entry=0x7f1e08014f98, 
          mp=mp@entry=0x7f1e4802e008, abstime=abstime@entry=0x7f1e70295f80, 
          file=file@entry=0x56500aefe398 "/home/seppo/work/MariaDB/server/include/mysql/psi/mysql_thread.h", line=line@entry=1191)
          at /home/seppo/work/MariaDB/server/mysys/thr_mutex.c:547
      #2  0x000056500ae6acb7 in inline_mysql_cond_timedwait (src_line=586, 
          src_file=0x56500b03f710 "/home/seppo/work/MariaDB/server/mysys/thr_lock.c", 
          abstime=0x7f1e70295f80, mutex=0x7f1e4802e008, that=0x7f1e08014f98)
          at /home/seppo/work/MariaDB/server/include/mysql/psi/mysql_thread.h:1191
      #3  wait_for_lock (wait=wait@entry=0x7f1e4802e0d8, 
          data=data@entry=0x7f1e140a8140, in_wait_list=in_wait_list@entry=1 '\001', 
          lock_wait_timeout=lock_wait_timeout@entry=31536000)
          at /home/seppo/work/MariaDB/server/mysys/thr_lock.c:586
      #4  0x000056500ae6bbae in thr_lock (lock_wait_timeout=31536000, 
          owner=0x56500df278d8, data=0x7f1e140a8140)
          at /home/seppo/work/MariaDB/server/mysys/thr_lock.c:1021
      #5  thr_multi_lock (data=0x7f1e08077120, count=<optimized out>, 
          owner=owner@entry=0x56500df278d8, 
          lock_wait_timeout=lock_wait_timeout@entry=31536000)
          at /home/seppo/work/MariaDB/server/mysys/thr_lock.c:1302
      #6  0x000056500ac650ed in mysql_lock_tables (thd=thd@entry=0x56500df26070, 
          sql_lock=sql_lock@entry=0x7f1e08077100, flags=flags@entry=0)
          at /home/seppo/work/MariaDB/server/sql/lock.cc:320
      #7  0x000056500ac65ea1 in mysql_lock_tables (thd=thd@entry=0x56500df26070, 
          tables=tables@entry=0x7f1e080770f8, count=count@entry=1, flags=flags@entry=0)
          at /home/seppo/work/MariaDB/server/sql/lock.cc:275
      #8  0x000056500a98e5f5 in lock_tables (thd=thd@entry=0x56500df26070, 
          tables=0x7f1e080768a0, count=<optimized out>, flags=flags@entry=0)
          at /home/seppo/work/MariaDB/server/sql/sql_base.cc:5438
      #9  0x000056500a9918b6 in open_and_lock_tables (thd=thd@entry=0x56500df26070, 
          options=..., tables=tables@entry=0x7f1e080768a0, derived=derived@entry=true, 
          flags=flags@entry=0, 
          prelocking_strategy=prelocking_strategy@entry=0x7f1e70296440)
          at /home/seppo/work/MariaDB/server/sql/sql_base.cc:5267
      #10 0x000056500aad5c66 in open_and_lock_tables (flags=0, derived=true, 
          tables=0x7f1e080768a0, thd=0x56500df26070)
          at /home/seppo/work/MariaDB/server/sql/sql_base.h:534
      #11 mysql_admin_table(THD *, TABLE_LIST *, HA_CHECK_OPT *, const char *, thr_lock_type, bool, bool, uint, int (*)(THD *, TABLE_LIST *, HA_CHECK_OPT *), struct {...}, int (*)(THD *, TABLE_LIST *, HA_CHECK_OPT *)) (thd=thd@entry=0x56500df26070, 
          tables=tables@entry=0x7f1e080768a0, check_opt=check_opt@entry=0x56500df2a8d0, 
          operator_name=operator_name@entry=0x56500af2b42b "optimize", 
          lock_type=lock_type@entry=TL_WRITE, 
          open_for_modify=open_for_modify@entry=true, repair_table_use_frm=false, 
          extra_open_options=0, prepare_func=0x0, operator_func=
          (int (handler::*)(handler * const, THD *, HA_CHECK_OPT *)) 0x56500aba295e <handler::ha_optimize(THD*, st_ha_check_opt*)>, view_operator_func=0x0)
          at /home/seppo/work/MariaDB/server/sql/sql_admin.cc:465
      #12 0x000056500aad7df1 in Sql_cmd_optimize_table::execute (this=<optimized out>, 
          thd=0x56500df26070) at /home/seppo/work/MariaDB/server/sql/sql_admin.cc:1308
      #13 0x000056500a9e511c in mysql_execute_command (thd=thd@entry=0x56500df26070)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:5671
      #14 0x000056500a9e6688 in mysql_parse (thd=thd@entry=0x56500df26070, 
          rawbuf=rawbuf@entry=0x7f1e080767b8 "optimize table test.comm38", 
          length=length@entry=26, parser_state=parser_state@entry=0x7f1e702985e0)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:7319
      #15 0x000056500a9e71c6 in wsrep_mysql_parse (thd=thd@entry=0x56500df26070, 
          rawbuf=0x7f1e080767b8 "optimize table test.comm38", length=26, 
          parser_state=parser_state@entry=0x7f1e702985e0)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:7142
      #16 0x000056500a9e8705 in dispatch_command (command=command@entry=COM_QUERY, 
          thd=thd@entry=0x56500df26070, packet=packet@entry=0x56500df35f41 "", 
          packet_length=packet_length@entry=26)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:1485
      #17 0x000056500a9ead14 in do_command (thd=0x56500df26070)
          at /home/seppo/work/MariaDB/server/sql/sql_parse.cc:1108
      #18 0x000056500aac92d0 in do_handle_one_connection (
          thd_arg=thd_arg@entry=0x56500df26070)
          at /home/seppo/work/MariaDB/server/sql/sql_connect.cc:1350
      #19 0x000056500aac94bc in handle_one_connection (arg=arg@entry=0x56500df26070)
          at /home/seppo/work/MariaDB/server/sql/sql_connect.cc:1262
      #20 0x000056500ad26587 in pfs_spawn_thread (arg=0x56500def29e0)
          at /home/seppo/work/MariaDB/server/storage/perfschema/pfs.cc:1860
      #21 0x00007f1e91df66ba in start_thread (arg=0x7f1e70299b00)
          at pthread_create.c:333
      #22 0x00007f1e914a13dd in clone ()
          at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      

      I have a test case which can reproduce the issue reliably, but I cannot share it. For reproducing following sinmple scenario will make the issue surface:

      • Start one MariaDB Galera Cluster 10.1.9 node
      • create ~30 simple innodb tables, and populate tables with ~100 rows
      • start 10 client sessions running transactions issuing a few single row updates randomly on the tables. Use proper MST transaction, not AC
      • start one client session cycling through tables issuing OPTIMIZE TABLE for each table in turn

      The OPTIMIZE TABLE session will abort DML transactions, which usually abort gracefully. But, in my test rig, after ̃ ~10 minute test run I can observe a total hang for the duration of innodb_lock_wait_timeout. It makes sense to set innodb_lock_wait_timeout to high value so that the hang is easier to diagnose.

      I have reproduced the issue with MariaDB 10.1.19, so far

      Attachments

        Issue Links

          Activity

            People

              jplindst Jan Lindström (Inactive)
              seppo Seppo Jaakola
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.