MDEV-16428 (MariaDB Server): Simple concurrent DML on RocksDB tables makes optimistic parallel replication abort

Details

    Description

      Note: run the test case below with --mysqld=--slave_parallel_mode=optimistic --mysqld=--slave-parallel-threads=2 --mysqld=--plugin-load-add=ha_rocksdb. It usually fails on the first attempt for me, but it is a race condition, so it occasionally passes. If it doesn't fail right away, rerun it with --repeat=N.
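      As a rough sketch, the invocation described in the note could be assembled as follows. The test name `rocksdb_rpl.mdev16428` and the repeat count of 10 are hypothetical placeholders, not taken from the report; the three `--mysqld` options are the ones listed in the note. The script only prints the command line rather than running it.

```shell
#!/bin/sh
# Dry-run sketch: build and print the mysql-test-run.pl command line.
# TEST_NAME and REPEAT are assumptions for illustration only.
TEST_NAME="rocksdb_rpl.mdev16428"   # hypothetical suite.test name
REPEAT=10                           # hypothetical repeat count

MTR_CMD="./mysql-test-run.pl \
--mysqld=--slave_parallel_mode=optimistic \
--mysqld=--slave-parallel-threads=2 \
--mysqld=--plugin-load-add=ha_rocksdb \
--repeat=$REPEAT $TEST_NAME"

# Print instead of executing; run the printed line from the mysql-test directory.
echo "$MTR_CMD"
```

      Save the test case under whatever name you pass as the last argument, then run the printed command from the mysql-test directory of the build.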

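      --source include/have_binlog_format_row.inc
      --source include/master-slave.inc

      CREATE TABLE t1 (pk INT AUTO_INCREMENT PRIMARY KEY, a INT, KEY(a)) ENGINE=RocksDB;
      CREATE TABLE t2 (pk INT AUTO_INCREMENT PRIMARY KEY, b INT) ENGINE=RocksDB;

      # Seed t1 from the default master connection
      INSERT INTO t1 (pk) VALUES (NULL);

      # Second client on the master adds another row
      --connect (con1,localhost,root,,test)
      INSERT INTO t1 (pk) VALUES (NULL);

      # Open a transaction on master1 and leave it open
      --connection master1
      BEGIN;

      # Meanwhile con1 keeps inserting into t1 in autocommit mode
      --connection con1
      INSERT INTO t1 (pk) VALUES (NULL);
      INSERT INTO t1 (pk) VALUES (NULL);

      # Inside the open transaction: write to t2, then start updating every row of t1
      --connection master1
      INSERT INTO t2 (pk) VALUES (NULL),(NULL);
      --send
        UPDATE t1 SET a = 1;

      # Start a DELETE on t1 from another connection while master1's transaction is still open
      --connection master
      --send
        DELETE FROM t1;

      # Finish the UPDATE and commit the transaction
      --connection master1
      --reap
      COMMIT;

      # Finish the DELETE
      --connection master
      --reap

      # Wait for the slave to catch up; with this bug the slave SQL thread stops first
      --sync_slave_with_master

      # Cleanup
      --disconnect con1
      --connection master
      DROP TABLE t1, t2;
      --source include/rpl_end.inc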
      

      10.2 a31e99a89c

      2018-06-08  0:12:38 140524761007872 [Note] Slave SQL thread initialized, starting replication in log 
      'master-bin.000001' at position 4, relay log './slave-relay-bin.000001' position: 4
      2018-06-08  0:12:49 140524760401664 [ERROR] Slave worker thread retried transaction 10 time(s) in
       vain, giving up. Consider raising the value of the slave_transaction_retries variable.
      2018-06-08  0:12:49 140524760401664 [ERROR] Slave SQL: Lock wait timeout exceeded; try 
      restarting transaction, Gtid 0-1-7, Internal MariaDB error code: 1205
      2018-06-08  0:12:49 140524760401664 [Warning] Slave: Lock wait timeout exceeded; try restarting 
      transaction Error_code: 1205
      2018-06-08  0:12:49 140524760401664 [ERROR] Error running query, slave SQL thread aborted. 
      Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at
       log 'master-bin.000001' position 1553
      2018-06-08  0:12:49 140524761007872 [Note] Error reading relay log event: slave SQL thread was killed
      2018-06-08  0:12:49 140524760704768 [ERROR] Slave (additional info): Commit failed due to failure
       of an earlier commit on which this one depends Error_code: 1964
      2018-06-08  0:12:49 140524760704768 [Warning] Slave: Commit failed due to failure of an earlier 
      commit on which this one depends Error_code: 1964
      2018-06-08  0:12:49 140524760704768 [ERROR] Error running query, slave SQL thread aborted. 
      Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log
       'master-bin.000001' position 1553
      2018-06-08  0:12:49 140524761007872 [Note] Slave SQL thread exiting, replication stopped 
      in log 'master-bin.000001' at position 1553
      

      Increasing rocksdb_lock_wait_timeout or slave_transaction_retries doesn't help; it only makes the test run longer.
      The test doesn't fail with InnoDB.
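
      For reference, one way to raise both limits for a run is through additional `--mysqld` options, sketched below. The values 100 and 10 are illustrative assumptions; the report does not say which values were tried. The log above shows the slave giving up after 10 retries.

```shell
#!/bin/sh
# Sketch: extra mtr options that raise the two limits mentioned above.
# Both values are assumptions chosen for illustration.
RETRIES=100        # slave_transaction_retries
LOCK_TIMEOUT=10    # rocksdb_lock_wait_timeout, in seconds

EXTRA_OPTS="--mysqld=--slave_transaction_retries=$RETRIES --mysqld=--rocksdb_lock_wait_timeout=$LOCK_TIMEOUT"

# Print the options so they can be appended to the mtr command line.
echo "$EXTRA_OPTS"
```

      To compare against InnoDB, change ENGINE=RocksDB to ENGINE=InnoDB in both CREATE TABLE statements of the test case.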

            People

              psergei Sergei Petrunia
              elenst Elena Stepanova