[MDEV-16428] Simple concurrent DML on RocksDB tables makes optimistic parallel replication abort

Details

Type: Bug
Status: Confirmed (View Workflow)
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.4, 10.5
Component/s: Replication, Storage Engine - RocksDB
Labels: None

Description

Note: run the test case below with --mysqld=--slave_parallel_mode=optimistic --mysqld=--slave-parallel-threads=2 --mysqld=--plugin-load-add=ha_rocksdb. It usually fails on the first attempt for me, but the failure depends on a race condition, so the test occasionally passes. If it does not fail right away, run it with --repeat=N.

--source include/have_binlog_format_row.inc
--source include/master-slave.inc

CREATE TABLE t1 (pk INT AUTO_INCREMENT PRIMARY KEY, a INT, KEY(a)) ENGINE=RocksDB;
CREATE TABLE t2 (pk INT AUTO_INCREMENT PRIMARY KEY, b INT) ENGINE=RocksDB;
INSERT INTO t1 (pk) VALUES (NULL);

--connect (con1,localhost,root,,test)
INSERT INTO t1 (pk) VALUES (NULL);

--connection master1
BEGIN;

--connection con1
INSERT INTO t1 (pk) VALUES (NULL);
INSERT INTO t1 (pk) VALUES (NULL);

--connection master1
INSERT INTO t2 (pk) VALUES (NULL),(NULL);
--send
  UPDATE t1 SET a = 1;

--connection master
--send
  DELETE FROM t1;

--connection master1
--reap
COMMIT;

--connection master
--reap

--sync_slave_with_master

# Cleanup
--disconnect con1
--connection master
DROP TABLE t1, t2;
--source include/rpl_end.inc
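
The invocation described in the note above can be scripted so that the race is retried automatically. The following is a minimal Python sketch, not part of the original report. It assumes the test case has been saved as a test named rpl.mdev16428 (a hypothetical name) and that the command is run from the mysql-test directory, where ./mtr is available. The helper name build_mtr_command is also made up for illustration.

```python
import shlex


def build_mtr_command(test_name, repeat=1, mtr="./mtr"):
    """Assemble an mtr command line using the server options from the note.

    test_name and the mtr path are assumptions; adjust them to your tree.
    """
    cmd = [
        mtr,
        # Server options quoted verbatim from the bug description.
        "--mysqld=--slave_parallel_mode=optimistic",
        "--mysqld=--slave-parallel-threads=2",
        "--mysqld=--plugin-load-add=ha_rocksdb",
    ]
    if repeat > 1:
        # Re-run the test several times, since the failure is a race.
        cmd.append(f"--repeat={repeat}")
    cmd.append(test_name)
    return cmd


if __name__ == "__main__":
    # "rpl.mdev16428" is a hypothetical test name used only for illustration.
    print(shlex.join(build_mtr_command("rpl.mdev16428", repeat=20)))
```

The list form can be passed directly to subprocess.run; printing it with shlex.join gives a line that can be pasted into a shell.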

10.2 a31e99a89c
2018-06-08 0:12:38 140524761007872 [Note] Slave SQL thread initialized, starting replication in log
'master-bin.000001' at position 4, relay log './slave-relay-bin.000001' position: 4
2018-06-08 0:12:49 140524760401664 [ERROR] Slave worker thread retried transaction 10 time(s) in
vain, giving up. Consider raising the value of the slave_transaction_retries variable.
2018-06-08 0:12:49 140524760401664 [ERROR] Slave SQL: Lock wait timeout exceeded; try
restarting transaction, Gtid 0-1-7, Internal MariaDB error code: 1205
2018-06-08 0:12:49 140524760401664 [Warning] Slave: Lock wait timeout exceeded; try restarting
transaction Error_code: 1205
2018-06-08 0:12:49 140524760401664 [ERROR] Error running query, slave SQL thread aborted.
Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at
log 'master-bin.000001' position 1553
2018-06-08 0:12:49 140524761007872 [Note] Error reading relay log event: slave SQL thread was killed
2018-06-08 0:12:49 140524760704768 [ERROR] Slave (additional info): Commit failed due to failure
of an earlier commit on which this one depends Error_code: 1964
2018-06-08 0:12:49 140524760704768 [Warning] Slave: Commit failed due to failure of an earlier
commit on which this one depends Error_code: 1964
2018-06-08 0:12:49 140524760704768 [ERROR] Error running query, slave SQL thread aborted.
Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log
'master-bin.000001' position 1553
2018-06-08 0:12:49 140524761007872 [Note] Slave SQL thread exiting, replication stopped
in log 'master-bin.000001' at position 1553

Increasing rocksdb_lock_wait_timeout or slave_transaction_retries does not help; it only makes the test run longer.
The test does not fail with InnoDB.
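
When reading a log like the one above, it can help to group the error codes by the thread ID in the third column: one worker gives up with 1205 (lock wait timeout), and a second worker then fails with 1964 because its commit depends on the first. The following Python sketch does that grouping. It is an illustration only; the function name, the regular expressions, and the assumption that wrapped lines belong to the preceding timestamped entry are mine, not part of MariaDB.

```python
import re
from collections import defaultdict

# Timestamped entry: date, time, thread id, [Level], message.
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}\s+\d{1,2}:\d{2}:\d{2}\s+(\d+)\s+\[\w+\]\s+(.*)$"
)
# Matches both "Error_code: 1205" and "error code: 1205".
CODE_RE = re.compile(r"(?:Error_code|error code):\s*(\d+)")

# Two entries copied from the log in this report, including the line wraps.
SAMPLE_LOG = """\
2018-06-08 0:12:49 140524760401664 [ERROR] Slave SQL: Lock wait timeout exceeded; try
restarting transaction, Gtid 0-1-7, Internal MariaDB error code: 1205
2018-06-08 0:12:49 140524760704768 [Warning] Slave: Commit failed due to failure of an earlier
commit on which this one depends Error_code: 1964
"""


def error_codes_by_thread(log_text):
    """Return {thread_id: sorted error codes} for entries that carry a code."""
    codes = defaultdict(set)
    thread = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            thread, text = match.group(1), match.group(2)
        else:
            # Assumption: an untimestamped line continues the previous entry.
            text = line
        if thread is None:
            continue
        for code in CODE_RE.findall(text):
            codes[thread].add(int(code))
    return {t: sorted(c) for t, c in codes.items()}


if __name__ == "__main__":
    for thread_id, found in error_codes_by_thread(SAMPLE_LOG).items():
        print(thread_id, found)
```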

Issue Links

relates to

MDEV-16242 MyRocks: parallel slave on a table without PK can stop with ER_KEY_NOT_FOUND (Confirmed)

MDEV-24401 Deadlocks on Rocksdb when there should not be one (Open)

People

Assignee: Sergei Petrunia

Reporter: Elena Stepanova

Votes: 0

Watchers: 5

Dates

Created: 2018-06-07 21:16

Updated: 2023-04-27 14:24
