Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4178

Galera Docs: Deadlock in galera_pre_commit on concurrent DML flow with 2 masters due to wrong innodb_autoinc_lock_mode

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Won't Fix
    • None
    • None
    • None

    Description

      I have a 3-node cluster. On two nodes I run 1-thread DML flow, the third node is a slave only.
      Soon after the workflow starts, the nodes hang, seemingly forever.

      Node 1 process list:

      +----+-------------+-----------------+------+---------+------+---------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
      | Id | User        | Host            | db   | Command | Time | State                                 | Info                                                                                                                                                                                                                                                                  | Progress |
      +----+-------------+-----------------+------+---------+------+---------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
      |  1 | system user |                 | NULL | Sleep   |   98 | Write_rows_log_event::write_row(1493) | NULL                                                                                                                                                                                                                                                                  |    0.000 |
      |  2 | system user |                 | NULL | Sleep   | 1269 | wsrep aborter idle                    | NULL                                                                                                                                                                                                                                                                  |    0.000 |
      | 10 | root        | localhost:41086 | test | Query   |   92 | query end                             | INSERT LOW_PRIORITY IGNORE INTO `table0_int_autoinc` ( `col_int_key` ) SELECT `col_int_key` FROM `table2_int_autoinc` AS X WHERE X . `col_int_key` IN ( 83 , 'l' , 1757151232 , 5 , 9 ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 6 |    0.000 |
      | 19 | root        | localhost:41101 | NULL | Query   |    0 | sleeping                              | show full processlist                                                                                                                                                                                                                                                 |    0.000 |
      +----+-------------+-----------------+------+---------+------+---------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+

      Node 2 process list:

      +----+-------------+-----------------+------+---------+------+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
      | Id | User        | Host            | db   | Command | Time | State              | Info                                                                                                                                                                                  | Progress |
      +----+-------------+-----------------+------+---------+------+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
      |  1 | system user |                 | NULL | Sleep   |  151 | committed 1494     | NULL                                                                                                                                                                                  |    0.000 |
      |  2 | system user |                 | NULL | Sleep   | 1296 | wsrep aborter idle | NULL                                                                                                                                                                                  |    0.000 |
      |  7 | root        | localhost:42574 | test | Query   |  157 | query end          | DELETE LOW_PRIORITY QUICK  FROM `table0_int_autoinc` WHERE `col_int_key` IN ( 'o' , 2 , 0 , 'm' , 'l' ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 1 |    0.000 |
      |  9 | root        | localhost:42584 | NULL | Query   |    0 | sleeping           | show full processlist                                                                                                                                                                 |    0.000 |
      +----+-------------+-----------------+------+---------+------+--------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+

      Node 1 stack trace (selected threads):

      Thread 21 (Thread 0x7fd40959a700 (LWP 14628)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
      #1  0x0000000000cfc88e in safe_cond_wait (cond=0x26bd3e0, mp=0x26bd330, file=0xeb8cb8 "maria-5.5-galera/storage/xtradb/os/os0sync.c", line=207) at maria-5.5-galera/mysys/thr_mutex.c:493
      #2  0x0000000000c607a6 in os_cond_wait (cond=0x26bd3e0, mutex=0x26bd330) at maria-5.5-galera/storage/xtradb/os/os0sync.c:207
      #3  0x0000000000c60fbb in os_event_wait_low (event=0x26bd330, reset_sig_count=40) at maria-5.5-galera/storage/xtradb/os/os0sync.c:609
      #4  0x0000000000b32f74 in srv_suspend_mysql_thread (thr=0x7fd3c02475e0) at maria-5.5-galera/storage/xtradb/srv/srv0srv.c:1917
      #5  0x0000000000b07ab2 in row_mysql_handle_errors (new_err=0x7fd409596510, trx=0x7fd4040182c8, thr=0x7fd3c02475e0, savept=0x0) at maria-5.5-galera/storage/xtradb/row/row0mysql.c:599
      #6  0x0000000000b086dd in row_lock_table_autoinc_for_mysql (prebuilt=0x7fd3c0246ba8) at maria-5.5-galera/storage/xtradb/row/row0mysql.c:1063
      #7  0x0000000000acca7b in ha_innobase::innobase_lock_autoinc (this=0x7fd3c0244798) at maria-5.5-galera/storage/xtradb/handler/ha_innodb.cc:6364
      #8  0x0000000000accb44 in ha_innobase::innobase_set_max_autoinc (this=0x7fd3c0244798, auto_inc=1162) at maria-5.5-galera/storage/xtradb/handler/ha_innodb.cc:6415
      #9  0x0000000000acd5db in ha_innobase::write_row (this=0x7fd3c0244798, record=0x7fd3c0242b18  <incomplete sequence \347>) at maria-5.5-galera/storage/xtradb/handler/ha_innodb.cc:6705
      #10 0x00000000007efe87 in handler::ha_write_row (this=0x7fd3c0244798, buf=0x7fd3c0242b18  <incomplete sequence \347>) at maria-5.5-galera/sql/handler.cc:5232
      #11 0x00000000008d4d79 in Rows_log_event::write_row (this=0x7fd40402b260, rli=0x7fd404010aa0, overwrite=false) at maria-5.5-galera/sql/log_event.cc:9949
      #12 0x00000000008d4ee7 in Write_rows_log_event::do_exec_row (this=0x7fd40402b260, rli=0x7fd404010aa0) at maria-5.5-galera/sql/log_event.cc:10120
      #13 0x00000000008d0f61 in Rows_log_event::do_apply_event (this=0x7fd40402b260, rli=0x7fd404010aa0) at maria-5.5-galera/sql/log_event.cc:8516
      #14 0x0000000000593196 in Log_event::apply_event (this=0x7fd40402b260, rli=0x7fd404010aa0) at maria-5.5-galera/sql/log_event.h:1230
      #15 0x000000000062948d in wsrep_apply_rbr (thd=0x7fd404000af0, rbr_buf=0x7fd404023cf0 "x1!Q\023", buf_len=0) at maria-5.5-galera/sql/sql_parse.cc:8131
      #16 0x0000000000629a79 in wsrep_apply_cb (ctx=0x7fd404000af0, buf=0x7fd404023cf0, buf_len=125, global_seqno=1493) at maria-5.5-galera/sql/sql_parse.cc:8210
      #17 0x00007fd4147e2abf in apply_wscoll (trx=..., apply_cb=0x6299cb <wsrep_apply_cb(void*, void const*, unsigned long, long)>, recv_ctx=0x7fd404000af0) at galera/src/replicator_smm.cpp:37
      #18 apply_trx_ws (recv_ctx=0x7fd404000af0, apply_cb=0x6299cb <wsrep_apply_cb(void*, void const*, unsigned long, long)>, commit_cb=0x629cd6 <wsrep_commit_cb(void*, long, bool)>, trx=...) at galera/src/replicator_smm.cpp:81
      #19 0x00007fd4147e8030 in galera::ReplicatorSMM::apply_trx (this=<optimized out>, recv_ctx=0x7fd404000af0, trx=0x7fd404023460) at galera/src/replicator_smm.cpp:470
      #20 0x00007fd4147e87c0 in galera::ReplicatorSMM::process_trx (this=0x21dc510, recv_ctx=0x7fd404000af0, trx=0x7fd404023460) at galera/src/replicator_smm.cpp:1056
      #21 0x00007fd4147c5d34 in galera::GcsActionSource::dispatch (this=0x21dca98, recv_ctx=0x7fd404000af0, act=...) at galera/src/gcs_action_source.cpp:110
      #22 0x00007fd4147c69b3 in galera::GcsActionSource::process (this=0x21dca98, recv_ctx=0x7fd404000af0) at galera/src/gcs_action_source.cpp:168
      #23 0x00007fd4147ea37d in async_recv (recv_ctx=0x7fd404000af0, this=0x21dc510) at galera/src/replicator_smm.cpp:375
      #24 galera::ReplicatorSMM::async_recv (this=0x21dc510, recv_ctx=0x7fd404000af0) at galera/src/replicator_smm.cpp:356
      #25 0x00007fd4147fdcae in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at galera/src/wsrep_provider.cpp:201
      #26 0x0000000000629ff0 in wsrep_replication_process (thd=0x7fd404000af0) at maria-5.5-galera/sql/sql_parse.cc:8363
      #27 0x00000000005693ff in start_wsrep_THD (arg=0x629f6d) at maria-5.5-galera/sql/mysqld.cc:4832
      #28 0x00007fd415e32e9a in start_thread (arg=0x7fd40959a700) at pthread_create.c:308
      #29 0x00007fd415563cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
      #30 0x0000000000000000 in ?? ()
       
      Thread 2 (Thread 0x7fd4083b5700 (LWP 14913)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
      #1  0x00007fd4147e4985 in wait (cond=..., this=<optimized out>) at galerautils/src/gu_lock.hpp:56
      #2  galera::Monitor<galera::ReplicatorSMM::CommitOrder>::enter (this=0x21dcfa8, obj=...) at galera/src/monitor.hpp:126
      #3  0x00007fd4147e73b0 in galera::ReplicatorSMM::pre_commit (this=0x21dc510, trx=0x7fd3c0111320) at galera/src/replicator_smm.cpp:739
      #4  0x00007fd4147ff47f in galera_pre_commit (gh=<optimized out>, conn_id=<optimized out>, trx_handle=<optimized out>, rbr_data=<optimized out>, rbr_data_len=139, flags=1, global_seqno=0x35a5128) at galera/src/wsrep_provider.cpp:396
      #5  0x0000000000771833 in wsrep_run_wsrep_commit (thd=0x35a2d40, hton=0x2202160, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:338
      #6  0x0000000000770685 in wsrep_prepare (hton=0x2202160, thd=0x35a2d40, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:119
      #7  0x00000000007e7988 in ha_commit_trans (thd=0x35a2d40, all=false) at maria-5.5-galera/sql/handler.cc:1294
      #8  0x0000000000730221 in trans_commit_stmt (thd=0x35a2d40) at maria-5.5-galera/sql/transaction.cc:325
      #9  0x0000000000621031 in mysql_execute_command (thd=0x35a2d40) at maria-5.5-galera/sql/sql_parse.cc:4903
      #10 0x0000000000624da9 in mysql_parse (thd=0x35a2d40, rawbuf=0x7fd3c0015558 "INSERT LOW_PRIORITY IGNORE INTO `table0_int_autoinc` ( `col_int_key` ) SELECT `col_int_key` FROM `table2_int_autoinc` AS X WHERE X . `col_int_key` IN ( 83 , 'l' , 1757151232 , 5 , 9 ) ORDER BY `col_ch"..., length=261, parser_state=0x7fd4083b4550) at maria-5.5-galera/sql/sql_parse.cc:6304
      #11 0x0000000000623e70 in wsrep_mysql_parse (thd=0x35a2d40, rawbuf=0x7fd3c0015558 "INSERT LOW_PRIORITY IGNORE INTO `table0_int_autoinc` ( `col_int_key` ) SELECT `col_int_key` FROM `table2_int_autoinc` AS X WHERE X . `col_int_key` IN ( 83 , 'l' , 1757151232 , 5 , 9 ) ORDER BY `col_ch"..., length=261, parser_state=0x7fd4083b4550) at maria-5.5-galera/sql/sql_parse.cc:6069
      #12 0x0000000000616335 in dispatch_command (command=COM_QUERY, thd=0x35a2d40, packet=0x35c7c21 "INSERT LOW_PRIORITY IGNORE INTO `table0_int_autoinc` ( `col_int_key` ) SELECT `col_int_key` FROM `table2_int_autoinc` AS X WHERE X . `col_int_key` IN ( 83 , 'l' , 1757151232 , 5 , 9 ) ORDER BY `col_ch"..., packet_length=261) at maria-5.5-galera/sql/sql_parse.cc:1245
      #13 0x0000000000615114 in do_command (thd=0x35a2d40) at maria-5.5-galera/sql/sql_parse.cc:891
      #14 0x000000000071ddd7 in do_handle_one_connection (thd_arg=0x35a2d40) at maria-5.5-galera/sql/sql_connect.cc:1291
      #15 0x000000000071d7af in handle_one_connection (arg=0x35a2d40) at maria-5.5-galera/sql/sql_connect.cc:1199
      #16 0x00007fd415e32e9a in start_thread (arg=0x7fd4083b5700) at pthread_create.c:308
      #17 0x00007fd415563cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
      #18 0x0000000000000000 in ?? ()
       

      Node 2 stack trace:

      Thread 2 (Thread 0x7f9b1c53e700 (LWP 14933)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
      #1  0x00007f9b287017de in _gcs_sm_enqueue_common (cond=0x7f9b1c53b980, sm=0x7f9b2a6e0010) at gcs/src/gcs_sm.h:160
      #2  gcs_sm_enter (scheduled=<optimized out>, cond=0x7f9b1c53b980, sm=0x7f9b2a6e0010) at gcs/src/gcs_sm.h:238
      #3  gcs_repl (conn=0x3626590, act=0x7f9b1c53bcf0, scheduled=<optimized out>) at gcs/src/gcs.c:1490
      #4  0x00007f9b28753b1c in repl (scheduled=true, act=..., this=0x3620a30) at galera/src/gcs.hpp:106
      #5  galera::ReplicatorSMM::replicate (this=0x3620540, trx=0x7f9ae02f6cf0) at galera/src/replicator_smm.cpp:552
      #6  0x00007f9b2876b3ff in galera_pre_commit (gh=<optimized out>, conn_id=<optimized out>, trx_handle=<optimized out>, rbr_data=0x7f9ae02a7f40, rbr_data_len=107, flags=1, global_seqno=0x4aa35d8) at galera/src/wsrep_provider.cpp:388
      #7  0x0000000000771833 in wsrep_run_wsrep_commit (thd=0x4aa11f0, hton=0x36468b0, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:338
      #8  0x0000000000770685 in wsrep_prepare (hton=0x36468b0, thd=0x4aa11f0, all=false) at maria-5.5-galera/sql/wsrep_hton.cc:119
      #9  0x00000000007e7988 in ha_commit_trans (thd=0x4aa11f0, all=false) at maria-5.5-galera/sql/handler.cc:1294
      #10 0x0000000000730221 in trans_commit_stmt (thd=0x4aa11f0) at maria-5.5-galera/sql/transaction.cc:325
      #11 0x0000000000621031 in mysql_execute_command (thd=0x4aa11f0) at maria-5.5-galera/sql/sql_parse.cc:4903
      #12 0x0000000000624da9 in mysql_parse (thd=0x4aa11f0, rawbuf=0x7f9ae02cb358 "DELETE LOW_PRIORITY QUICK  FROM `table0_int_autoinc` WHERE `col_int_key` IN ( 'o' , 2 , 0 , 'm' , 'l' ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 1", length=181, parser_state=0x7f9b1c53d550) at maria-5.5-galera/sql/sql_parse.cc:6304
      #13 0x0000000000623e70 in wsrep_mysql_parse (thd=0x4aa11f0, rawbuf=0x7f9ae02cb358 "DELETE LOW_PRIORITY QUICK  FROM `table0_int_autoinc` WHERE `col_int_key` IN ( 'o' , 2 , 0 , 'm' , 'l' ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 1", length=181, parser_state=0x7f9b1c53d550) at maria-5.5-galera/sql/sql_parse.cc:6069
      #14 0x0000000000616335 in dispatch_command (command=COM_QUERY, thd=0x4aa11f0, packet=0x4aa4ed1 "DELETE LOW_PRIORITY QUICK  FROM `table0_int_autoinc` WHERE `col_int_key` IN ( 'o' , 2 , 0 , 'm' , 'l' ) ORDER BY `col_char_12`,`col_char_12_key`,`col_int`,`col_int_key`,`pk` LIMIT 1", packet_length=181) at maria-5.5-galera/sql/sql_parse.cc:1245
      #15 0x0000000000615114 in do_command (thd=0x4aa11f0) at maria-5.5-galera/sql/sql_parse.cc:891
      #16 0x000000000071ddd7 in do_handle_one_connection (thd_arg=0x4aa11f0) at maria-5.5-galera/sql/sql_connect.cc:1291
      #17 0x000000000071d7af in handle_one_connection (arg=0x4aa11f0) at maria-5.5-galera/sql/sql_connect.cc:1199
      #18 0x00007f9b29d9ee9a in start_thread (arg=0x7f9b1c53e700) at pthread_create.c:308
      #19 0x00007f9b294cfcbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
      #20 0x0000000000000000 in ?? ()

      Server command lines:

      /home/elenst/maria-5.5-galera/sql/mysqld --no-defaults --basedir=/home/elenst/maria-5.5-galera --lc-messages-dir=/home/elenst/maria-5.5-galera/sql/share/ --core --datadir=/home/elenst/maria-5.5-galera/data1 --tmpdir=/home/elenst/maria-5.5-galera/data1/tmp --port=8306 --socket=/home/elenst/maria-5.5-galera/data1/tmp/node1.sock --wsrep-provider=/home/elenst/galera/libgalera_smm.so --wsrep-cluster-address=gcomm:// --binlog-format=row --wsrep-sst-method=rsync --log-error=/home/elenst/maria-5.5-galera/data1/log.err
       
      /home/elenst/maria-5.5-galera/sql/mysqld --no-defaults --basedir=/home/elenst/maria-5.5-galera --lc-messages-dir=/home/elenst/maria-5.5-galera/sql/share/ --core --datadir=/home/elenst/maria-5.5-galera/data2 --tmpdir=/home/elenst/maria-5.5-galera/data2/tmp --port=8307 --socket=/home/elenst/maria-5.5-galera/data2/tmp/node2.sock --wsrep-provider=/home/elenst/galera/libgalera_smm.so --wsrep-cluster-address=gcomm://127.0.0.1:4567?gmcast.listen_addr=tcp://127.0.0.1:4566 --binlog-format=row --wsrep-sst-method=rsync --log-error=/home/elenst/maria-5.5-galera/data2/log.err
       
      /home/elenst/maria-5.5-galera/sql/mysqld --no-defaults --basedir=/home/elenst/maria-5.5-galera --lc-messages-dir=/home/elenst/maria-5.5-galera/sql/share/ --core --datadir=/home/elenst/maria-5.5-galera/data3 --tmpdir=/home/elenst/maria-5.5-galera/data3/tmp --port=8308 --socket=/home/elenst/maria-5.5-galera/data2/tmp/node3.sock --wsrep-provider=/home/elenst/galera/libgalera_smm.so --wsrep-cluster-address=gcomm://127.0.0.1:4567?gmcast.listen_addr=tcp://127.0.0.1:4565 --binlog-format=row --wsrep-sst-method=rsync --log-error=/home/elenst/maria-5.5-galera/data3/log.err

      To reproduce with RQG:

      # Node 1:
      perl ./gentest.pl --gendata=conf/engines/engine_stress.zz --threads=1 --queries=100M --duration=600 --dsn=dbi:mysql:host=127.0.0.1:port=8306:user=root:database=test --grammar=conf/engines/engine_stress.yy
       
      # Node 2 (after node 1 created the tables and started the flow):
      perl ./gentest.pl  --threads=1 --queries=100M --duration=600 --dsn=dbi:mysql:host=127.0.0.1:port=8307:user=root:database=test --grammar=conf/engines/engine_stress.yy

      revision-id: seppo.jaakola@codership.com-20130216222240-syypg8kc355qcpsp
      revno: 3380
      branch-nick: maria-5.5-galera

      Built as

      cmake . -DCMAKE_BUILD_TYPE=Debug && make

      Attachments

        1. log.err
          608 kB
          Elena Stepanova
        2. log.err
          37 kB
          Elena Stepanova
        3. node1.threads
          34 kB
          Elena Stepanova
        4. node2.threads
          31 kB
          Elena Stepanova

        Activity

          People

            seppo Seppo Jaakola
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.