Details

    Description

      We use Xid to failover a asynchronous slave to a different Galera master. Some users have automated scripts to do this as well.

      With RBR events the Xid is shown at the bottom of a transaction. With SBR events it it is not shown. Please change this so Xid is always shown in mysqlbinlog output.

      #151222 17:51:03 server id 10  end_log_pos 1166         Query   thread_id=35728 exec_time=0     error_code=0
      use `mariadb_test_7`/*!*/;
      SET TIMESTAMP=1450803063/*!*/;
      create table testy (id serial)
      /*!*/;
      # at 1166
      #151222 17:51:08 server id 10  end_log_pos 1204         GTID 0-10-7
      /*!100001 SET @@session.gtid_seq_no=7*//*!*/;
      BEGIN
      /*!*/;
      # at 1204
      # at 1258
      #151222 17:51:08 server id 10  end_log_pos 1258         Table_map: `mariadb_test_7`.`testy` mapped to number 80
      #151222 17:51:08 server id 10  end_log_pos 1296         Write_rows: table id 80 flags: STMT_END_F
       
      BINLOG '
      fH95VhMKAAAANgAAAOoEAAAAAFAAAAAAAAEADm1hcmlhZGJfdGVzdF83AAV0ZXN0eQABCAAA
      fH95VhcKAAAAJgAAABAFAAAAAFAAAAAAAAEAAf/+AwAAAAAAAAA=
      '/*!*/;
      ### INSERT INTO `mariadb_test_7`.`testy`
      ### SET
      ###   @1=3 /* LONGINT meta=0 nullable=0 is_null=0 */
      # at 1296
      #151222 17:51:08 server id 10  end_log_pos 1323         Xid = 153192
      COMMIT/*!*/;
      # at 1323

      Attachments

        Issue Links

          Activity

            It is not about STATEMENT vs ROW format, it is about being transactional vs non-transactional.
            DML on an InnoDB table is transactional, so it's written with Xid, both in row- and statement format.
            DDL is non-transactional, so it's written without Xid.

            Why do you need it for DDL, how do you use it for failover?

            elenst Elena Stepanova added a comment - It is not about STATEMENT vs ROW format, it is about being transactional vs non-transactional. DML on an InnoDB table is transactional, so it's written with Xid, both in row- and statement format. DDL is non-transactional, so it's written without Xid. Why do you need it for DDL, how do you use it for failover?

            Aha that explains! It does not remove the issue though.

            For asynchronous slave to galera cluster failover the easiest way (it does not require SSH access) to automate the failover is:
            1. Maintain a table in Galera where slaves can write their desired position
            2. When a slave notices it's replication is down, stop the SQL thread and find the galera last comitted id
            3. Save this last committed id in the Galera system
            4. On the remaining masters, monitor this table every second
            5. If a slave requests the binary log position, find out which Galera node is the new master (first try wins?)
            6. Look up the binary log file and position, save this in the table
            7. On the slave, monitor the response every second and change the master to the new master

            If there are frequent DDL statements going on in the system, this means the script would find the Xid before the DDL statement. For example a INDEX creation is already executed on the slave. When failing over to a galera master, it will find the position of that Xid. This, the CREATE INDEX will be executed again and fail the replication.

            Is this something in mysqlbinlog or in Galera? If it's in Galera: How does the galera cluster itself know if the DDL has already been executed in case of a failing node doing IST?

            michaeldg Michaël de groot added a comment - Aha that explains! It does not remove the issue though. For asynchronous slave to galera cluster failover the easiest way (it does not require SSH access) to automate the failover is: 1. Maintain a table in Galera where slaves can write their desired position 2. When a slave notices it's replication is down, stop the SQL thread and find the galera last comitted id 3. Save this last committed id in the Galera system 4. On the remaining masters, monitor this table every second 5. If a slave requests the binary log position, find out which Galera node is the new master (first try wins?) 6. Look up the binary log file and position, save this in the table 7. On the slave, monitor the response every second and change the master to the new master If there are frequent DDL statements going on in the system, this means the script would find the Xid before the DDL statement. For example a INDEX creation is already executed on the slave. When failing over to a galera master, it will find the position of that Xid. This, the CREATE INDEX will be executed again and fail the replication. Is this something in mysqlbinlog or in Galera? If it's in Galera: How does the galera cluster itself know if the DDL has already been executed in case of a failing node doing IST?

            The above algorithm does not sound either easy or reliable to me.
            If it's some hand-made script this kind of failove automation, I don't think that the server should print fake information to the binary log just to facilitate it.
            I don't have a really strong opinion on the subject though, if nirbhay_c and knielsen decide to go for it, then fine. Reassigning to nirbhay_c.

            elenst Elena Stepanova added a comment - The above algorithm does not sound either easy or reliable to me. If it's some hand-made script this kind of failove automation, I don't think that the server should print fake information to the binary log just to facilitate it. I don't have a really strong opinion on the subject though, if nirbhay_c and knielsen decide to go for it, then fine. Reassigning to nirbhay_c .

            I never said it was pretty

            If MariaDB GTID was kept in sync in the galera cluster this would not be needed by the way.

            michaeldg Michaël de groot added a comment - I never said it was pretty If MariaDB GTID was kept in sync in the galera cluster this would not be needed by the way.

            This issue is a work-around for the issue that MDEV-10715 fixes - keeping GTID in sync within a galera cluster.

            michaeldg Michaël de groot added a comment - This issue is a work-around for the issue that MDEV-10715 fixes - keeping GTID in sync within a galera cluster.

            In my opinion, this change does not need to be implemented. The xid refers to xa transactions and ddl is not transactional in mariadb.

            The underlying reason of requesting this work around has been fixed in the related issue

            michaeldg Michaël de groot added a comment - In my opinion, this change does not need to be implemented. The xid refers to xa transactions and ddl is not transactional in mariadb. The underlying reason of requesting this work around has been fixed in the related issue

            People

              sachin.setiya.007 Sachin Setiya (Inactive)
              michaeldg Michaël de groot
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.