Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-717

LP:1003679 - Wrong binlog order on concurrent DROP schema and CREATE function

Details

    • 10.1.13, 10.1.14

    Description

      Also filed as http://bugs.mysql.com/bug.php?id=65428

      Concurrent execution of DROP DATABASE and CREATE FUNCTION | PROCEDURE | EVENT causes a wrong order of events in the binary log and, consequently, replication failure.

      Provided MTR test case is non-deterministic, so I am also adding an RQG grammar to reproduce.
      The problem is that on master CREATE FUNCTION gets executed without an error, but in the binlog DROP DATABASE is written before CREATE FUNCTION.

      Reproducible on maria/5.5 revno 3413, mysql-5.5 revno 3737, mysql-trunk revno 3827.
      Could not reproduce on maria/5.3, although due to the non-deterministic nature of the test there is no guarantee the problem is not present there.

      # MTR test case t1.test
      # run as perl ./mtr --mysqld=--binlog-format=mixed t1
       
      --source include/master-slave.inc
       
      DROP DATABASE IF EXISTS db1;
       
      CREATE DATABASE db1;
      CREATE FUNCTION db1.f1() RETURNS INT RETURN 1;
      --send
      DROP DATABASE db1;
       
      --connection master1
      --error 0,ER_BAD_DB_ERROR
      CREATE FUNCTION db1.f2() RETURNS INT RETURN 1;
       
      --connection master
      --reap
      SHOW BINLOG EVENTS;
       
      --sync_slave_with_master
       

      # Binlog contents:
       
      Log_name        Pos     Event_type      Server_id       End_log_pos     Info
      slave-relay-bin.000002  4       Format_desc     2       245     Server ver: 5.5.23-MariaDB-debug-log, Binlog ver: 4
      slave-relay-bin.000002  245     Rotate  1       0       master-bin.000001;pos=4
      slave-relay-bin.000002  289     Format_desc     1       245     Server ver: 5.5.23-MariaDB-debug-log, Binlog ver: 4
      slave-relay-bin.000002  530     Query   1       334     DROP DATABASE IF EXISTS db1
      slave-relay-bin.000002  619     Query   1       415     CREATE DATABASE db1
      slave-relay-bin.000002  700     Query   1       574     use `test`; CREATE DEFINER=`root`@`localhost` FUNCTION `db1`.`f1`() RETURNS int(11)
      RETURN 1
      slave-relay-bin.000002  859     Query   1       653     DROP DATABASE db1
      slave-relay-bin.000002  938     Query   1       812     use `test`; CREATE DEFINER=`root`@`localhost` FUNCTION `db1`.`f2`() RETURNS int(11)
      RETURN 1

      # RQG grammar t1.yy
       
      query:
           CREATE DATABASE IF NOT EXISTS db1; CREATE FUNCTION db1.f1() RETURNS INT RETURN 1; DROP DATABASE db1;
       
      thread1:
           CREATE FUNCTION db1.f2() RETURNS INT RETURN 1; DROP FUNCTION db1.f2;
       

      # RQG command line:
       
      perl runall.pl \
      --queries=100M \
      --threads=2 \
      --duration=600 \
      --grammar=t1.yy \
      --rpl_mode=mixed \
      --basedir1=<your basedir> \
      --vardir1=<your vardir>

      Attachments

        Activity

          Wrong binlog order on concurrent DROP schema and CREATE function
          Concurrent execution of DROP DATABASE and CREATE FUNCTION | PROCEDURE | EVENT causes a wrong order of events in the binary log and, consequently, replication failure.

          Provided MTR test case is non-deterministic, so I am also adding an RQG grammar to reproduce.
          The problem is that on master CREATE FUNCTION gets executed without an error, but in the binlog DROP DATABASE is written before CREATE FUNCTION.

          Reproducible on maria/5.5 revno 3413, mysql-5.5 revno 3737, mysql-trunk revno 3827.
          Could not reproduce on maria/5.3, although due to the non-deterministic nature of the test there is no guarantee the problem is not present there.

          1. MTR test case t1.test
          2. run as perl ./mtr -mysqld=-binlog-format=mixed t1

          --source include/master-slave.inc

          DROP DATABASE IF EXISTS db1;

          CREATE DATABASE db1;
          CREATE FUNCTION db1.f1() RETURNS INT RETURN 1;
          --send
          DROP DATABASE db1;

          --connection master1
          --error 0,ER_BAD_DB_ERROR
          CREATE FUNCTION db1.f2() RETURNS INT RETURN 1;

          --connection master
          --reap
          SHOW BINLOG EVENTS;

          --sync_slave_with_master

          1. End of MTR test case
          1. Binlog contents:

          Log_name Pos Event_type Server_id End_log_pos Info
          slave-relay-bin.000002 4 Format_desc 2 245 Server ver: 5.5.23-MariaDB-debug-log, Binlog ver: 4
          slave-relay-bin.000002 245 Rotate 1 0 master-bin.000001;pos=4
          slave-relay-bin.000002 289 Format_desc 1 245 Server ver: 5.5.23-MariaDB-debug-log, Binlog ver: 4
          slave-relay-bin.000002 530 Query 1 334 DROP DATABASE IF EXISTS db1
          slave-relay-bin.000002 619 Query 1 415 CREATE DATABASE db1
          slave-relay-bin.000002 700 Query 1 574 use `test`; CREATE DEFINER=`root`@`localhost` FUNCTION `db1`.`f1`() RETURNS int(11)
          RETURN 1
          slave-relay-bin.000002 859 Query 1 653 DROP DATABASE db1
          slave-relay-bin.000002 938 Query 1 812 use `test`; CREATE DEFINER=`root`@`localhost` FUNCTION `db1`.`f2`() RETURNS int(11)
          RETURN 1

          1. RQG grammar t1.yy

          query:
          CREATE DATABASE IF NOT EXISTS db1; CREATE FUNCTION db1.f1() RETURNS INT RETURN 1; DROP DATABASE db1;

          thread1:
          CREATE FUNCTION db1.f2() RETURNS INT RETURN 1; DROP FUNCTION db1.f2;

          1. End of RQG grammar
          1. RQG command line:

          perl runall.pl \
          --queries=100M \
          --threads=2 \
          --duration=600 \
          --grammar=t1.yy \
          --rpl_mode=mixed \
          --basedir1=<your basedir> \
          --vardir1=<your vardir>

          elenst Elena Stepanova added a comment - Wrong binlog order on concurrent DROP schema and CREATE function Concurrent execution of DROP DATABASE and CREATE FUNCTION | PROCEDURE | EVENT causes a wrong order of events in the binary log and, consequently, replication failure. Provided MTR test case is non-deterministic, so I am also adding an RQG grammar to reproduce. The problem is that on master CREATE FUNCTION gets executed without an error, but in the binlog DROP DATABASE is written before CREATE FUNCTION. Reproducible on maria/5.5 revno 3413, mysql-5.5 revno 3737, mysql-trunk revno 3827. Could not reproduce on maria/5.3, although due to the non-deterministic nature of the test there is no guarantee the problem is not present there. MTR test case t1.test run as perl ./mtr - mysqld= -binlog-format=mixed t1 --source include/master-slave.inc DROP DATABASE IF EXISTS db1; CREATE DATABASE db1; CREATE FUNCTION db1.f1() RETURNS INT RETURN 1; --send DROP DATABASE db1; --connection master1 --error 0,ER_BAD_DB_ERROR CREATE FUNCTION db1.f2() RETURNS INT RETURN 1; --connection master --reap SHOW BINLOG EVENTS; --sync_slave_with_master End of MTR test case Binlog contents: Log_name Pos Event_type Server_id End_log_pos Info slave-relay-bin.000002 4 Format_desc 2 245 Server ver: 5.5.23-MariaDB-debug-log, Binlog ver: 4 slave-relay-bin.000002 245 Rotate 1 0 master-bin.000001;pos=4 slave-relay-bin.000002 289 Format_desc 1 245 Server ver: 5.5.23-MariaDB-debug-log, Binlog ver: 4 slave-relay-bin.000002 530 Query 1 334 DROP DATABASE IF EXISTS db1 slave-relay-bin.000002 619 Query 1 415 CREATE DATABASE db1 slave-relay-bin.000002 700 Query 1 574 use `test`; CREATE DEFINER=`root`@`localhost` FUNCTION `db1`.`f1`() RETURNS int(11) RETURN 1 slave-relay-bin.000002 859 Query 1 653 DROP DATABASE db1 slave-relay-bin.000002 938 Query 1 812 use `test`; CREATE DEFINER=`root`@`localhost` FUNCTION `db1`.`f2`() RETURNS int(11) RETURN 1 RQG grammar t1.yy query: CREATE DATABASE IF NOT EXISTS db1; CREATE FUNCTION db1.f1() RETURNS INT RETURN 1; DROP DATABASE db1; thread1: CREATE FUNCTION db1.f2() RETURNS INT RETURN 1; DROP FUNCTION db1.f2; End of RQG grammar RQG command line: perl runall.pl \ --queries=100M \ --threads=2 \ --duration=600 \ --grammar=t1.yy \ --rpl_mode=mixed \ --basedir1=<your basedir> \ --vardir1=<your vardir>

          Launchpad bug id: 1003679

          ratzpo Rasmus Johansson (Inactive) added a comment - Launchpad bug id: 1003679

          fixed in mysql-5.7.0

          serg Sergei Golubchik added a comment - fixed in mysql-5.7.0
          elenst Elena Stepanova added a comment - - edited

          Still reproducible on current 10.0 (10.0.14+, revno 4471).
          The MTR test case should be run with --repeat=... to reproduce; or use the RQG test.

          Fixed in 5.7.0:

          revno: 3940
          revision-id: thayumanavar.x.sachithanantha@oracle.com-20120702054613-d6e02ydictgn46ph
          parent: mysql-builder@oracle.com-20120701165403-bacrn8oguctyglw4
          committer: Thayumanavar <thayumanavar.x.sachithanantha@oracle.com>
          branch nick: mysql-trunk
          timestamp: Mon 2012-07-02 11:16:13 +0530
          message:
            Bug#14127220 - WRONG BINLOG ORDER ON CONCURRENT DROP SCHEMA
                           AND CREATE FUNCTION. 
            PROBLEM: Concurrent execution of DROP DATABASE and CREATE
                     FUNCTION causes wrong order of events in the binary
                     log and this results in an replication failure.
            FIX: The root cause of the issue is when DROP DATABASE takes
                 metadata lock and is in progress through it's execution,
                 a concurrently running CREATE FUNCTION checks for the
                 existence of database which it succeeds and then it
                 waits on the metadata lock. Once DROP DATABASE writes to
                 BINLOG and finally releases the metadata lock on schema
                 object, the CREATE FUNCTION waiting on metadata lock gets
                 in it's code path and succeeds and writes to binlog. When
                 on slave, the sequence of events in binlog cause a failure
                 since after DROP DATABASE and we attempt to create a function
                 under the dropped database object which causes the replication
                 failure. The fix moves the condition to check for existence
                 of database object under the metadata lock in the CREATE
                 FUNCTION code path so that the check for existence of db
                 name happens after the DROP DATABASE has completed it's
                 execution. This patch fixes similar issues when concurrent
                 execution of DROP DATABASE and CREATE EVENT or ALTER EVENT.

          elenst Elena Stepanova added a comment - - edited Still reproducible on current 10.0 (10.0.14+, revno 4471). The MTR test case should be run with --repeat=... to reproduce; or use the RQG test. Fixed in 5.7.0: revno: 3940 revision-id: thayumanavar.x.sachithanantha@oracle.com-20120702054613-d6e02ydictgn46ph parent: mysql-builder@oracle.com-20120701165403-bacrn8oguctyglw4 committer: Thayumanavar <thayumanavar.x.sachithanantha@oracle.com> branch nick: mysql-trunk timestamp: Mon 2012-07-02 11:16:13 +0530 message: Bug#14127220 - WRONG BINLOG ORDER ON CONCURRENT DROP SCHEMA AND CREATE FUNCTION. PROBLEM: Concurrent execution of DROP DATABASE and CREATE FUNCTION causes wrong order of events in the binary log and this results in an replication failure. FIX: The root cause of the issue is when DROP DATABASE takes metadata lock and is in progress through it's execution, a concurrently running CREATE FUNCTION checks for the existence of database which it succeeds and then it waits on the metadata lock. Once DROP DATABASE writes to BINLOG and finally releases the metadata lock on schema object, the CREATE FUNCTION waiting on metadata lock gets in it's code path and succeeds and writes to binlog. When on slave, the sequence of events in binlog cause a failure since after DROP DATABASE and we attempt to create a function under the dropped database object which causes the replication failure. The fix moves the condition to check for existence of database object under the metadata lock in the CREATE FUNCTION code path so that the check for existence of db name happens after the DROP DATABASE has completed it's execution. This patch fixes similar issues when concurrent execution of DROP DATABASE and CREATE EVENT or ALTER EVENT.
          holyfoot Alexey Botchkov added a comment - Proposed fix: http://lists.askmonty.org/pipermail/commits/2016-March/009141.html

          The fix was pushed, but i still need to add a proper test for this.

          holyfoot Alexey Botchkov added a comment - The fix was pushed, but i still need to add a proper test for this.
          holyfoot Alexey Botchkov added a comment - Test case added: http://lists.askmonty.org/pipermail/commits/2016-May/009339.html
          holyfoot Alexey Botchkov added a comment - Final patch: http://lists.askmonty.org/pipermail/commits/2016-May/009352.html

          People

            holyfoot Alexey Botchkov
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.