[MDEV-13577] slave_parallel_mode=optimistic should not report the mode's specific temporary errors Created: 2017-08-18 Updated: 2020-08-25 Resolved: 2018-06-12 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Affects Version/s: | 10.1, 10.1.22, 10.1.25, 10.2 |
| Fix Version/s: | 10.1.34, 10.2.16, 10.3.8 |
| Type: | Bug | Priority: | Major |
| Reporter: | Sergey Chernomorets | Assignee: | Andrei Elkin |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None |
| Environment: | centos7 x86_64 |
| Sprint: | 10.2.14, 10.3.6-1 |
| Description |
|
I have a mariadb-10.1.22 server with slave_parallel_mode=optimistic and see (bogus?) errors in the log:
Replication does not crash and still works.
binlog:
|
| Comments |
| Comment by Elena Stepanova [ 2017-08-31 ] |
|
The error as such is easily reproducible with a simple concurrent test which inserts / updates rows with the same PK (see the test below). The result seems to be all right at the end, with no inconsistencies; so I think that the appearance of these errors in optimistic mode is most likely expected – by definition any DML is allowed to run in parallel, so it can happen that an UPDATE is attempted before the INSERT, causes an error and is retried later. I will assign it to Elkin to confirm (or not) that this behavior is indeed expected. Also, if it is, maybe it makes sense to convert the error into a warning or even a note. It is printed on warning level 2, which has become the default in 10.2.
To reproduce,
(change paths as needed). |
| Comment by Kristian Nielsen [ 2017-11-27 ] |
|
Elena is right, these errors will occur regularly internally for optimistic parallel replication, and will be handled by retrying the failed transactions. But I think the error should never appear in the error log. An error that is handled by optimistic parallel replication as a temporary error that merely causes a transaction retry should not be put into the logs. There is no actual error, and the user cannot do anything about it anyway - and there is a retry count status available to monitor this behaviour. Maybe historically, slave retries (in single-threaded replication) were logged like this, and so optimistic parallel replication logs retries the same way - but it seems wrong, and should be safe to silence. |
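A minimal sketch of the behaviour proposed above. It is an illustration, not the server's code: a transaction that fails only because it ran ahead of an earlier one is retried silently and counted, and an error is logged only once the retry limit is exhausted. Every name here (`TemporaryApplyError`, `insert_row`, `update_row`, `apply_optimistic`, `max_retries`) is hypothetical, and a Python dict stands in for the replicated table.

```python
import logging


class TemporaryApplyError(Exception):
    """Hypothetical marker for a conflict caused only by out-of-order execution."""


def insert_row(table, pk, value):
    # An INSERT fails temporarily if the key already exists.
    if pk in table:
        raise TemporaryApplyError(f"duplicate key {pk}")
    table[pk] = value


def update_row(table, pk, value):
    # An UPDATE fails temporarily if the row is not there yet.
    if pk not in table:
        raise TemporaryApplyError(f"row {pk} not found")
    table[pk] = value


def apply_optimistic(table, events, start_order, max_retries=10, logger=None):
    """Apply events in start_order, retrying temporary failures silently.

    Returns the number of retries, which plays the role of the retry count
    status mentioned above. Nothing is logged unless an event still fails
    after max_retries attempts.
    """
    logger = logger or logging.getLogger("slave_sketch")
    retried = 0
    attempts = {idx: 0 for idx in start_order}
    pending = list(start_order)
    while pending:
        still_pending = []
        for idx in pending:
            op, pk, value = events[idx]
            try:
                op(table, pk, value)
            except TemporaryApplyError as err:
                attempts[idx] += 1
                if attempts[idx] > max_retries:
                    # Only a failure that retries could not resolve is reported.
                    logger.error("event %d failed after %d retries: %s",
                                 idx, max_retries, err)
                    raise
                retried += 1  # counted for monitoring, not logged
                still_pending.append(idx)
        pending = still_pending
    return retried
```

With `events = [(insert_row, 1, "a"), (update_row, 1, "b")]` and `start_order = [1, 0]`, the UPDATE is attempted first, fails once, and succeeds on the next pass after the INSERT has been applied. The call returns a retry count of 1 and writes nothing to the log. The sketch deliberately ignores commit ordering and rollback, which the real replication code has to handle.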
| Comment by Elena Stepanova [ 2017-11-27 ] |
|
I am also voting for not logging these pseudo-errors; we have enough noise in the error log as it is, and if there is nothing users are expected to do, there is no point alerting them. (I also have my own agenda, since the error messages create false positives in my tests, which I have to remember and work around.) |
| Comment by Andrei Elkin [ 2018-06-06 ] |
|
Sergei, salute. I start with offering you to review the patch whose main accent is I hope you'll find time to check out. Cheers, Andrei |
| Comment by Andrei Elkin [ 2018-06-12 ] |
|
Fixes are pushed as commit 7bbe324fc17. |