Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.2.31, 10.3.22, 10.4.12, 10.4.19
-
None
-
ubuntu 18.04, mariadb galera 3-nodes cluster
Description
Steps to reproduce:
1. Run reproduce script (if the issue was not reproduced after first run of the script, drop database tests and try to run script again):
#!/bin/bash
|
|
set -x
|
|
mysql_cmd="mysql -h $1 -u $2 -p$3" |
|
$mysql_cmd -e "create database tests;" |
|
$mysql_cmd tests << EOF
|
|
CREATE TABLE test1 (
|
id int(11) NOT NULL AUTO_INCREMENT, |
environments varchar(100) COLLATE utf8_unicode_ci NOT NULL, |
test_start datetime NOT NULL,
|
test_end datetime DEFAULT NULL,
|
test_status varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL, |
cmd varchar(2000) COLLATE utf8_unicode_ci DEFAULT NULL, |
scheduled_date date DEFAULT NULL,
|
scheduled_time time DEFAULT NULL,
|
status_message varchar(500) COLLATE utf8_unicode_ci DEFAULT '', |
PRIMARY KEY (id,environments,test_start)
|
) DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
|
|
CREATE TABLE test2 (
|
test_id int(11) NOT NULL, |
environment varchar(100) COLLATE utf8_unicode_ci NOT NULL, |
PRIMARY KEY (test_id,environment),
|
CONSTRAINT fk_test_env FOREIGN KEY (test_id) REFERENCES test1 (id)
|
) DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
|
|
EOF
|
|
for j in {1..9}; do |
for i in {10..50}; do |
c_start="2020-05-27 0${j}:${i}:05" |
c_end="2020-05-27 0${j}:${i}:21" |
$mysql_cmd tests -e "INSERT INTO test1(environments,test_start,test_end,test_status,cmd) VALUES ('[\'test1a\']', '${c_start}', NULL, 'STARTED', '[\'testservices\']');" |
id=$($mysql_cmd -s tests -e "SELECT id from test1 WHERE test_start = '${c_start}';" | tail -1) |
$mysql_cmd tests -e "INSERT INTO test2 (test_id, environment) VALUES(((${id})), 'test1a');" |
done
|
done
|
|
Expected result:
All data is written successfully, all mariadb nodes are synced and in primary state in galera cluster
Actual result:
All data is written successfully, 1 or 2 mariadb nodes have state non-primary
Node(s) which dropped from cluster has following in logs:
2020-06-01 14:50:35 12 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table tests.test2; Cannot add or update a child row: a foreign key constraint fails (`tests`.`test2`, CONSTRAINT `fk_test_env` FOREIGN KEY (`test_id`) REFERENCES `test1` (`id`)), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 174, Internal MariaDB error code: 1452 |
2020-06-01 14:50:35 12 [Note] WSREP: ha_rollback_trans(12, FALSE) rolled back: INSERT INTO test2 (test_id, environment) VALUES(((92)), 'test1a')\BB\D5^: Cannot add or update a child row: a foreign key constraint fails (`tests`.`test2`, CONSTRAINT `fk_test_env` FOREIGN KEY (`test_id`) REFERENCES `test1` (`id`)); is_real 0 |
|
2020-06-01 14:50:35 15 [Note] InnoDB: SQL1: INSERT INTO test2 (test_id, environment) VALUES(((95)), 'test1a')\BB\D5^ SQL2: NULL |
2020-06-01 14:50:35 15 [Note] InnoDB: BF conflict, modes: 1026:1059 idx: PRIMARY table: tests/test1 n_uniq: 3 n_user: 3 SQL1: INSERT INTO test2 (test_id, environment) VALUES(((95)), 'test1a')\BB\D5^ SQL2: NULL |
It was found that when setting wsrep_slave_threads variable to 1, the issue cannot be reproduced.
Detailed logs are attached. They contain script for reproduce as well as config, variables and status from all 3 nodes.
UPDATE:
On maria 10.2.31, often such conflicts are resolved by retrying (which is absent in maria 10.4.12), and galera node remains in cluster. This is fragment of error and retry on maria 10.2.31:
2020-06-02 8:55:02 139866759632640 [Note] WSREP: Set WSREPXid for InnoDB: 648b7d16-a49f-11ea-9698-9e3da8a1d73f:251 |
2020-06-02 8:55:02 139866760247040 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table tests.test2; Cannot add or update a child row: a foreign key constraint fails (`tests`.`test2`, CONSTRAINT `fk_test_env` FOREIGN KEY (`test_id`) REFERENCES `test1` (`id`)), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 207, Internal MariaDB error code: 1452 |
2020-06-02 8:55:02 139866760247040 [Warning] WSREP: RBR event 3 Write_rows_v1 apply warning: 151, 254 |
2020-06-02 8:55:02 139866760247040 [Warning] WSREP: Failed to apply app buffer: seqno: 254, status: 1 at galera/src/trx_handle.cpp:apply():351 |
Retrying 2th time
|
2020-06-02 8:55:02 139866760247040 [Note] InnoDB: BF-BF lock conflict, locking: 1 |
RECORD LOCKS space id 101 page no 3 n bits 72 index PRIMARY of table "tests"."test1" trx id 6308 lock_mode X locks rec but not gap |
Record lock, heap no 2 PHYSICAL RECORD: n_fields 11; compact format; info bits 0 |
0: len 4; hex 80000002; asc ;; |
1: len 10; hex 5b27737475623161275d; asc ['test1a'];; |
2: len 5; hex 99a6848daf; asc ;; |
3: len 6; hex 0000000018a4; asc ;; |
4: len 7; hex fe000002020110; asc ;; |
5: SQL NULL; |
6: len 7; hex 53544152544544; asc STARTED;; |
7: len 12; hex 5b277365727669636573275d; asc ['services'];; |
8: SQL NULL; |
b' 9: SQL NULL; |
10: len 0; hex ; asc ;; |
|
2020-06-02 8:55:02 139866760247040 [Note] InnoDB: SQL1: INSERT INTO test2 (test_id, environment) VALUES(((2)), 'test1a')\xe4\x13\xd6^\x13\x01 |
2020-06-02 8:55:02 139866760247040 [Note] InnoDB: SQL2: NULL |
2020-06-02 8:55:02 139866760247040 [Note] WSREP: waiting for BF, trx order: 254 253 |
|
2020-06-02 8:55:02 139866760247040 [Note] InnoDB: BF conflict, modes: 1026:1059 idx: PRIMARY table: tests/test1 n_uniq: 3 n_user: 3' |
2020-06-02 8:55:02 139866760247040 [Note] InnoDB: SQL1: INSERT INTO test2 (test_id, environment) VALUES(((2)), 'test1a')\xe4\x13\xd6^\x13\x01 |
2020-06-02 8:55:02 139866760247040 [Note] InnoDB: SQL2: NULL |
2020-06-02 8:55:02 139866760554240 [Note] InnoDB: BF-BF lock conflict, locking: 1 |
RECORD LOCKS space id 101 page no 3 n bits 72 index PRIMARY of table "tests"."test1" trx id 6310 lock mode S locks rec but not gap |
Record lock, heap no 2 PHYSICAL RECORD: n_fields 11; compact format; info bits 0 |
0: len 4; hex 80000002; asc ;; |
1: len 10; hex 5b27737475623161275d; asc ['test1a'];; |
However in some cases, retries are unsuccessful, and node crashes:
2020-06-02 9:21:25 139643550910208 [Note] InnoDB: SQL2: NULL |
2020-06-02 9:21:25 139643950552832 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table tests.test2; Cannot add or update a child row: a foreign key constraint fails (`tests`.`test2`, CONSTRAINT `fk_test_env` FOREIGN KEY (`test_id`) REFERENCES `test1` (`id`)), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 176, Internal MariaDB error code: 1452 |
2020-06-02 9:21:25 139643950552832 [Warning] WSREP: RBR event 3 Write_rows_v1 apply warning: 151, 6050 |
2020-06-02 9:21:25 139643950552832 [Warning] WSREP: Failed to apply app buffer: seqno: 6050, status: 1 |
\t at galera/src/trx_handle.cpp:apply():351 |
Retrying 2th time
|
2020-06-02 9:21:25 139643950552832 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table tests.test2; Cannot add or update a child row: a foreign key constraint fails (`tests`.`test2`, CONSTRAINT `fk_test_env` FOREIGN KEY (`test_id`) REFERENCES `test1` (`id`)), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 176, Internal MariaDB error code: 1452 |
2020-06-02 9:21:25 139643950552832 [Warning] WSREP: RBR event 3 Write_rows_v1 apply warning: 151, 6050 |
2020-06-02 9:21:25 139643950552832 [Warning] WSREP: Failed to apply app buffer: seqno: 6050, status: 1 |
\t at galera/src/trx_handle.cpp:apply():351 |
Retrying 3th time
|
2020-06-02 9:21:25 139643950552832 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table tests.test2; Cannot add or update a child row: a foreign key constraint fails (`tests`.`test2`, CONSTRAINT `fk_test_env` FOREIGN KEY (`test_id`) REFERENCES `test1` (`id`)), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 176, Internal MariaDB error code: 1452 |
2020-06-02 9:21:25 139643950552832 [Warning] WSREP: RBR event 3 Write_rows_v1 apply warning: 151, 6050 |
2020-06-02 9:21:25 139643950552832 [Warning] WSREP: Failed to apply app buffer: seqno: 6050, status: 1 |
\t at galera/src/trx_handle.cpp:apply():351' |
Retrying 4th time'
|
2020-06-02 9:21:25 139643950552832 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table tests.test2; Cannot add or update a child row: a foreign key constraint fails (`tests`.`test2`, CONSTRAINT `fk_test_env` FOREIGN KEY (`test_id`) REFERENCES `test1` (`id`)), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 176, Internal MariaDB error code: 1452 |
2020-06-02 9:21:25 139643950552832 [Warning] WSREP: RBR event 3 Write_rows_v1 apply warning: 151, 6050 |
2020-06-02 9:21:25 139643950552832 [ERROR] WSREP: Failed to apply trx: source: 98174c9b-a4a8-11ea-904d-9f9eb4b1de70 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 10274 trx_id: 17928 seqnos (l: 3651, g: 6050, s: 6049, d: 5935, ts: 1562760915024458) |
2020-06-02 9:21:25 139643950552832 [ERROR] WSREP: Failed to apply trx 6050 4 times |
2020-06-02 9:21:25 139643950552832 [ERROR] WSREP: Node consistency compromized, aborting... |
Attachments
Activity
Transition | Time In Source Status | Execution Times |
---|
|
69d 19h 16m | 1 |
|
88d 2h 38m | 1 |
|
4s | 1 |
|
422d 19h 39m | 1 |
|
8m 28s | 1 |