[MDEV-4992] 'DROP DATABASE IF EXISTS' causes replication to break (does not break in MySQL) Created: 2013-09-04 Updated: 2022-09-08 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 10.0.4, 5.5.33 |
| Fix Version/s: | 5.5 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Frank Flynn | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | upstream | ||
| Environment: |
Ubuntu 5.5.31-MariaDB-1~lucid-log and 5.5.32-MariaDB-1~precise-log |
||
| Issue Links: |
|
||||||||||||||||
| Description |
|
We have a script that refreshes our staging DB environment. Our staging environment is a master <-> master with each master having another slave. I have been replacing mysql database servers with mariadb servers (5.5) and have run in to this most interesting issue. masterdb1 - mysql This refresh script takes a dump of the production DB and loads into masterdb1 BUT it changes the database name used in the DROP DATABASE, CREATE DATABASE and USE commands so that we will not interrupt the automatic testing until we have the complete database loaded. Then we drop all of the tables in the target DB and rename all of the tables from the freshly loaded database so they are now in the correctly named database. This is only background for the bug. The problem is the script issues this command (in case it exited prematurely)
The first server I changed to mariadb was slavedb2. This worked fine. Then I changed masterdb2 (the master for slavedb2). The DORP DB command is issued on masterdb1 and it works (the db does not exist) and it is replicated to masterdb2 where the command is also successful. The command is replicated to slavedb2 where it fails:
They are not exactly the same version as the os is newer on masterdb2. masterdb2:
slavedb2:
The replication setting appear to be identical. This is running in AWS and I could arrange access or run whatever diagnostics that might help you (us) figure this out. |
| Comments |
| Comment by Sergei Golubchik [ 2013-09-09 ] | |||||||||||||||||||||||||||||||||||||||||||
|
The error says that the DROP DATABASE failed on the master, with the error "Cannot load from mysql.%s. The table is probably corrupted". Which happens if a stored routine or an event cannot be loaded, because the corresponding table is corrupted or has the wrong structure. Make sure all your mysql.* tables on master2 are not corrupted and properly upgraded (with mysql_upgrade). | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Frank Flynn [ 2013-09-09 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Exactly the problem - the drop database command did not fail on the master but the slave thinks it did. This is what happens when I run the same command on the master: mysql> DROP DATABASE IF EXISTS temp_vstreams; mysql> show warnings;
------
------ | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2013-09-10 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Hi, When replication aborted, it also said which position of which binary log it stopped at. Could you please provide the binary log and the record from the error log? Thanks. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Frank Flynn [ 2013-09-10 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Because this has been going on for a while the slave is stuck at a bin-log which was expired off of the master a while ago. But I think I can recreate this with a new log file - but it might take me a couple of days. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Frank Flynn [ 2013-09-12 ] | |||||||||||||||||||||||||||||||||||||||||||
|
This is the full show slave status\G message from the slave - please confirm that you want the binary log file 'mariadb-bin.004352' from the master 'masterdb2.staging.ooyala.com' as it is currently about 4 hours behind and a lot of data has changed since. Also there are no errors in either the master or slave error logs.
| |||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2013-09-12 ] | |||||||||||||||||||||||||||||||||||||||||||
|
That's right, please upload the binary log mariadb-bin.004352. If it's very big or contains some sensitive data, you can instead parse it yourself with mysqlbinlog, find the position 16760054 in the log and copy-paste several events before and after this position. Thanks. As I understand, it's not really MariaDB-Galera server that you use, right? | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Frank Flynn [ 2013-09-13 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Correct - this is just the standard MariaDB not Galera. But in preparing the files to send to you I have resolved the issue. I still believe there is a bug which can be fixed (perhaps through a better message). What was happening was DROP DATABASE on the master was not causing any errors (the error log is empty) but it was generating 2 warnings - these broke replication on the slave. use fmf_temp; Database changed MariaDB [(none)]> show warnings;
------
------ We never saw any warnings because the statement was run on masterdb1 (which had no warnings) then replicated to masterdb2 (where it would generate a warning but never show it to anyone) then replicated to slavedb2 (where it broke replication because the error messages were different). running mysql_upgrade --force (needed because it said it was already upgraded to 5.5.32-MariaDB) did fix the issue because now there were no warnings on masterdb2. I'm not sure how you'd like to proceed - the issues is resolved for me. The bug is there is a statement that will break replication but it is not being logged anywhere making it difficult to track down. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2013-09-15 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Hi Frank, > use fmf_temp; Could you please comment on this fragment? So, first you switched to fmf_temp, the response was "Database changed". That's expected. Database changed Where do they come from? What were you executing here? Thanks. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Frank Flynn [ 2013-09-15 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Not exactly - since this is a script I ran three commands on the same line and got the results for all commands in one block - I should have written it like this: MariaDB [(none)]>create database fmf_temp; MariaDB [(none)]>use fmf_temp; MariaDB [(fmf_temp)]> drop database if exists fmf_temp; MariaDB [(none)]> show warnings;
------
------ The issues being that the masterdb2 had not have mysqlupgrade run on it (at least running mysqlupgrade on masterdb2 solved the issue) and the bigger issue was that since masterdb1 ran this with no warnings and replicated to to masterdb2 where the warnings did happen shouldn't masterdb2 have stopped replication saying the commands produced different errors? Instead masterdb2 ran these commands and produced warnings (which no one saw because it was the replication thread - they were not written to the error log either) then slavedb2 got the command and ran it with no warnings and broke because it's errors / warnings were different (none) than masterdb2. Trivial for me to fix - difficult to find. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2013-09-16 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Thanks Frank, I'd say there's definitely a bug here. The bug is in the confusion between errors and warnings of level "error". When the server executes the DROP, it considers it successful, but generates warnings. Thus, replication failure is caused by the asymmetry between master and slave behavior. Either the master shouldn't write the error code into the binary log when it's actually a code of a warning, or the slave should compare the master error code from the binary log not only to its own query error code, but also to warning codes. It works exactly the same way on MySQL (checked 5.5, 5.6, 5.7). You might want to file a bug report at bugs.mysql.com as well. | |||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2013-09-16 ] | |||||||||||||||||||||||||||||||||||||||||||
|
Here is also serg's quick analysis: <serg> with DROP there's an error when trying to drop SPs |