Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-22216

Assertion `xid_seqno > wsrep_seqno' failed in trx_rseg_update_wsrep_checkpoint upon concurrent DML/DDL

    XMLWordPrintable

Details

    Description

      Note: Even with the injection below, the test is far from deterministic, the injection only makes the failure more probable. There is still a lot of race involved. On a rotating disk it fails fairly quickly for me (within a minute, usually less), but doesn't fail either on SSD or in shm. I suppose really fast rotating disks can also make reproducing problematic. It seems to be a very good case for debugging through an rr profile.

      10.4 edc3899d9 with provided injection

      mysqld: /data/src/10.4-bug/storage/innobase/trx/trx0rseg.cc:102: void trx_rseg_update_wsrep_checkpoint(trx_rsegf_t*, const XID*, mtr_t*): Assertion `xid_seqno > wsrep_seqno' failed.
      200410 22:01:26 [ERROR] mysqld got signal 6 ;
       
      #15 0x00007f86ceea6102 in __GI___assert_fail (assertion=0x55fcc5daf66c "xid_seqno > wsrep_seqno", file=0x55fcc5daf5a0 "/data/src/10.4-bug/storage/innobase/trx/trx0rseg.cc", line=102, function=0x55fcc5dafde0 <trx_rseg_update_wsrep_checkpoint(unsigned char*, xid_t const*, mtr_t*)::__PRETTY_FUNCTION__> "void trx_rseg_update_wsrep_checkpoint(trx_rsegf_t*, const XID*, mtr_t*)") at assert.c:101
      #16 0x000055fcc55ebc6f in trx_rseg_update_wsrep_checkpoint (rseg_header=0x7f868c664026 "", xid=0x55fcc7b7cfd0, mtr=0x7f86880ad5d0) at /data/src/10.4-bug/storage/innobase/trx/trx0rseg.cc:102
      #17 0x000055fcc55cd60c in trx_purge_add_undo_to_history (trx=0x7f8698bf8390, undo=@0x7f8698bf93b0: 0x7f86142bcb30, mtr=0x7f86880ad5d0) at /data/src/10.4-bug/storage/innobase/trx/trx0purge.cc:281
      #18 0x000055fcc55f7be7 in trx_write_serialisation_history (trx=0x7f8698bf8390, mtr=0x7f86880ad5d0) at /data/src/10.4-bug/storage/innobase/trx/trx0trx.cc:1135
      #19 0x000055fcc55fa015 in trx_commit_low (trx=0x7f8698bf8390, mtr=0x7f86880ad5d0) at /data/src/10.4-bug/storage/innobase/trx/trx0trx.cc:1574
      #20 0x000055fcc55fa175 in trx_commit (trx=0x7f8698bf8390) at /data/src/10.4-bug/storage/innobase/trx/trx0trx.cc:1642
      #21 0x000055fcc55fa65a in trx_commit_for_mysql (trx=0x7f8698bf8390) at /data/src/10.4-bug/storage/innobase/trx/trx0trx.cc:1788
      #22 0x000055fcc53843bc in innobase_commit_low (trx=0x7f8698bf8390) at /data/src/10.4-bug/storage/innobase/handler/ha_innodb.cc:4401
      #23 0x000055fcc53847e6 in innobase_commit_ordered_2 (trx=0x7f8698bf8390, thd=0x7f8614000d50) at /data/src/10.4-bug/storage/innobase/handler/ha_innodb.cc:4523
      #24 0x000055fcc5384ab2 in innobase_commit_ordered (hton=0x55fcc7b57f30, thd=0x7f8614000d50, all=false) at /data/src/10.4-bug/storage/innobase/handler/ha_innodb.cc:4583
      #25 0x000055fcc527a356 in TC_LOG::run_commit_ordered (this=0x55fcc65b4e00 <mysql_bin_log>, thd=0x7f8614000d50, all=false) at /data/src/10.4-bug/sql/log.cc:8944
      #26 0x000055fcc5277ff2 in MYSQL_BIN_LOG::trx_group_commit_leader (this=0x55fcc65b4e00 <mysql_bin_log>, leader=0x7f86880adfe0) at /data/src/10.4-bug/sql/log.cc:8178
      #27 0x000055fcc5276995 in MYSQL_BIN_LOG::write_transaction_to_binlog_events (this=0x55fcc65b4e00 <mysql_bin_log>, entry=0x7f86880adfe0) at /data/src/10.4-bug/sql/log.cc:7767
      #28 0x000055fcc5275e81 in MYSQL_BIN_LOG::write_transaction_to_binlog (this=0x55fcc65b4e00 <mysql_bin_log>, thd=0x7f8614000d50, cache_mngr=0x7f86141f9320, end_ev=0x7f86880ae180, all=false, using_stmt_cache=true, using_trx_cache=true) at /data/src/10.4-bug/sql/log.cc:7415
      #29 0x000055fcc5264def in binlog_flush_cache (thd=0x7f8614000d50, cache_mngr=0x7f86141f9320, end_ev=0x7f86880ae180, all=false, using_stmt=true, using_trx=true) at /data/src/10.4-bug/sql/log.cc:1786
      #30 0x000055fcc5265472 in binlog_commit_flush_xid_caches (thd=0x7f8614000d50, cache_mngr=0x7f86141f9320, all=false, xid=42) at /data/src/10.4-bug/sql/log.cc:1894
      #31 0x000055fcc527c8f9 in MYSQL_BIN_LOG::log_and_order (this=0x55fcc65b4e00 <mysql_bin_log>, thd=0x7f8614000d50, xid=42, all=false, need_prepare_ordered=false, need_commit_ordered=true) at /data/src/10.4-bug/sql/log.cc:9727
      #32 0x000055fcc514292f in ha_commit_trans (thd=0x7f8614000d50, all=false) at /data/src/10.4-bug/sql/handler.cc:1624
      #33 0x000055fcc4f7faf3 in trans_commit_stmt (thd=0x7f8614000d50) at /data/src/10.4-bug/sql/transaction.cc:436
      #34 0x000055fcc4df10ce in mysql_execute_command (thd=0x7f8614000d50) at /data/src/10.4-bug/sql/sql_parse.cc:6158
      #35 0x000055fcc4df6135 in mysql_parse (thd=0x7f8614000d50, rawbuf=0x7f8614014288 "INSERT /* QNO 21 CON_ID 17 */ INTO t1 (k) VALUES (0)", length=52, parser_state=0x7f86880b0190, is_com_multi=false, is_next_command=false) at /data/src/10.4-bug/sql/sql_parse.cc:7900
      #36 0x000055fcc4df5735 in wsrep_mysql_parse (thd=0x7f8614000d50, rawbuf=0x7f8614014288 "INSERT /* QNO 21 CON_ID 17 */ INTO t1 (k) VALUES (0)", length=52, parser_state=0x7f86880b0190, is_com_multi=false, is_next_command=false) at /data/src/10.4-bug/sql/sql_parse.cc:7704
      #37 0x000055fcc4de12fe in dispatch_command (command=COM_QUERY, thd=0x7f8614000d50, packet=0x7f8614008751 "INSERT /* QNO 21 CON_ID 17 */ INTO t1 (k) VALUES (0)", packet_length=52, is_com_multi=false, is_next_command=false) at /data/src/10.4-bug/sql/sql_parse.cc:1827
      #38 0x000055fcc4ddfad3 in do_command (thd=0x7f8614000d50) at /data/src/10.4-bug/sql/sql_parse.cc:1359
      #39 0x000055fcc4f66a9c in do_handle_one_connection (connect=0x55fcc8823e90) at /data/src/10.4-bug/sql/sql_connect.cc:1412
      #40 0x000055fcc4f667eb in handle_one_connection (arg=0x55fcc8823e90) at /data/src/10.4-bug/sql/sql_connect.cc:1316
      #41 0x00007f86cf720fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
      #42 0x00007f86cef6f4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      All threads' stack traces are attached.

      injection

      index 39035dc545e..c0d53381be0 100644
      --- a/storage/innobase/trx/trx0purge.cc
      +++ b/storage/innobase/trx/trx0purge.cc
      @@ -277,6 +277,7 @@ trx_purge_add_undo_to_history(const trx_t* trx, trx_undo_t*& undo, mtr_t* mtr)
       
       #ifdef WITH_WSREP
              if (wsrep_is_wsrep_xid(trx->xid)) {
      +        DBUG_EXECUTE_IF("sleep_before_update_checkpoint",my_sleep(100000););
                      trx_rseg_update_wsrep_checkpoint(rseg_header, trx->xid, mtr);
              }
       #endif
      

      To reproduce:

      • apply the debug injection above and build with -DCMAKE_BUILD_TYPE=Debug;
      • Run the RQG test (put real paths in the command line instead of placeholders):

        git clone https://github.com/MariaDB/randgen --branch mdev22216 rqg-mdev22216
        cd rqg-mdev22216
        . ./mdev22216.cmd --basedir=<basedir> --vardir=<vardir on the slowest disk that you have> --mysqld=--wsrep-provider=<location of Galera-4 library>
        

      The grammar is already in the RQG branch and in command-line parameters. For the record, the contents of the grammar:

      thread1_init:
          CREATE DATABASE IF NOT EXISTS db
          ; CREATE TABLE IF NOT EXISTS t1 (
              id INT NOT NULL auto_increment,
              k INT,
              PRIMARY KEY (id)
          ) ENGINE=InnoDB
          ; SET debug_dbug= '+d,sleep_before_update_checkpoint'
      ;
       
      thread1:
          INSERT INTO t1 (k) VALUES (0);
       
      thread2:
          CREATE TABLE db.t SELECT id FROM t1 LIMIT 0
          ; DROP TABLE IF EXISTS db.t
      ;
       
      thread3:
          CREATE DATABASE IF NOT EXISTS db
          ; DROP TABLE IF EXISTS x
      ;
      

      Reproducible on at least 10.4 and 10.5. I don't have information about earlier versions or about the effect on non-debug builds.

      rr-packed profile created on Debian 10 Buster: ftp://ftp.askmonty.org/public/rr-mdev22216-debian10-x86_64.tar.gz

      Attachments

        Activity

          People

            jplindst Jan Lindström (Inactive)
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.