[MDEV-7323] Replica with GTID stopping replication on some transactions when no gtid replica didn't stop Created: 2014-12-15 Updated: 2020-02-13 Resolved: 2015-01-06 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Affects Version/s: | 10.0.14 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Peter McLarty | Assignee: | Kristian Nielsen |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Environment: |
Redhat Linux 2.6.32-431.29.2.el6.x86_64 |
||
| Description |
|
We have had replication stop on our GTID-enabled slave when the same events are passing through. Last_Errno: 1396. A DROP USER has stalled, and an error from what I believe to be a stored procedure that inserts some records into a table has caused issues. |
| Comments |
| Comment by Kristian Nielsen [ 2014-12-15 ] |
|
We need some more details to understand what the problem here is.
1. You mention "slave when the same events are passing through". Do you mean that you have a setup S1->S2->S3, where S2 is using GTID but S3 is not? Or something else? What makes you think the problem is related to GTID?
2. Might there be some reasonable explanation for the CREATE USER statement to fail, like the user existing before or something like that? What is in the error log? What happens if the CREATE USER statement is run manually on the server?
3. With respect to the "Drop user has stalled and ..." - we really need a detailed, precise description of each problem to be able to say anything meaningful. |
| Comment by Peter McLarty [ 2014-12-15 ] |
|
I have
My thinking around GTID is that the problems are only occurring on the GTID-enabled slave.
I cannot access the FTP server from the internal network at present; I will upload the error log later this evening. That log contains a number of errors that have stopped the slave replication. |
| Comment by Peter McLarty [ 2014-12-15 ] |
|
Uploaded the error log to ftp.askmonty.org. Let me know what else to investigate, and with what tools, and I will endeavour to find the information. |
| Comment by Kristian Nielsen [ 2014-12-15 ] |
|
Thank you for uploading the error log. I took a quick look at it.
The error log contains a lot of replication errors, like "Can't find record in
A diverged slave can in general lead to various replication failures, if some
If the problem can be reproduced on a slave that has been re-provisioned to be
Or alternatively, the actual problem needs to be narrowed down more
From the information given, the most likely explanation is just that the slave |
| Comment by Kristian Nielsen [ 2015-01-06 ] |
|
Without further information, I will have to assume that this is caused by the slave being out of sync with the master. So closing ... |
| Comment by Peter McLarty [ 2015-01-07 ] |
|
Thanks, guys. It was a replication issue that pt-table-sync was not able to repair, so we have rebuilt the slave. |