[MDEV-9315] Xid is only shown for RBR events - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Won't Fix
Affects Version/s: 10.0.23
Fix Version/s: N/A
Component/s: Galera, Replication, Scripts & Clients
Labels:
- galera

Description

We use Xid to failover a asynchronous slave to a different Galera master. Some users have automated scripts to do this as well.

With RBR events the Xid is shown at the bottom of a transaction. With SBR events it it is not shown. Please change this so Xid is always shown in mysqlbinlog output.

#151222 17:51:03 server id 10  end_log_pos 1166         Query   thread_id=35728 exec_time=0     error_code=0

use `mariadb_test_7`/*!*/;

SET TIMESTAMP=1450803063/*!*/;

create table testy (id serial)

/*!*/;

# at 1166

#151222 17:51:08 server id 10  end_log_pos 1204         GTID 0-10-7

/*!100001 SET @@session.gtid_seq_no=7*//*!*/;

BEGIN

/*!*/;

# at 1204

# at 1258

#151222 17:51:08 server id 10  end_log_pos 1258         Table_map: `mariadb_test_7`.`testy` mapped to number 80

#151222 17:51:08 server id 10  end_log_pos 1296         Write_rows: table id 80 flags: STMT_END_F

BINLOG '

fH95VhMKAAAANgAAAOoEAAAAAFAAAAAAAAEADm1hcmlhZGJfdGVzdF83AAV0ZXN0eQABCAAA

fH95VhcKAAAAJgAAABAFAAAAAFAAAAAAAAEAAf/+AwAAAAAAAAA=

'/*!*/;

### INSERT INTO `mariadb_test_7`.`testy`

### SET

###   @1=3 /* LONGINT meta=0 nullable=0 is_null=0 */

# at 1296

#151222 17:51:08 server id 10  end_log_pos 1323         Xid = 153192

COMMIT/*!*/;

# at 1323

Attachments

Issue Links

relates to

MDEV-20720 Galera: Replicate MariaDB GTID to other nodes in the cluster

Closed

Activity

Ascending order - Click to sort in descending order

Elena Stepanova added a comment - 2015-12-26 17:39

It is not about STATEMENT vs ROW format, it is about being transactional vs non-transactional.
DML on an InnoDB table is transactional, so it's written with Xid, both in row- and statement format.
DDL is non-transactional, so it's written without Xid.

Why do you need it for DDL, how do you use it for failover?

Elena Stepanova added a comment - 2015-12-26 17:39 It is not about STATEMENT vs ROW format, it is about being transactional vs non-transactional. DML on an InnoDB table is transactional, so it's written with Xid, both in row- and statement format. DDL is non-transactional, so it's written without Xid. Why do you need it for DDL, how do you use it for failover?

Michaël de groot added a comment - 2015-12-27 18:57

Aha that explains! It does not remove the issue though.

For asynchronous slave to galera cluster failover the easiest way (it does not require SSH access) to automate the failover is:
1. Maintain a table in Galera where slaves can write their desired position
2. When a slave notices it's replication is down, stop the SQL thread and find the galera last comitted id
3. Save this last committed id in the Galera system
4. On the remaining masters, monitor this table every second
5. If a slave requests the binary log position, find out which Galera node is the new master (first try wins?)
6. Look up the binary log file and position, save this in the table
7. On the slave, monitor the response every second and change the master to the new master

If there are frequent DDL statements going on in the system, this means the script would find the Xid before the DDL statement. For example a INDEX creation is already executed on the slave. When failing over to a galera master, it will find the position of that Xid. This, the CREATE INDEX will be executed again and fail the replication.

Is this something in mysqlbinlog or in Galera? If it's in Galera: How does the galera cluster itself know if the DDL has already been executed in case of a failing node doing IST?

Michaël de groot added a comment - 2015-12-27 18:57 Aha that explains! It does not remove the issue though. For asynchronous slave to galera cluster failover the easiest way (it does not require SSH access) to automate the failover is: 1. Maintain a table in Galera where slaves can write their desired position 2. When a slave notices it's replication is down, stop the SQL thread and find the galera last comitted id 3. Save this last committed id in the Galera system 4. On the remaining masters, monitor this table every second 5. If a slave requests the binary log position, find out which Galera node is the new master (first try wins?) 6. Look up the binary log file and position, save this in the table 7. On the slave, monitor the response every second and change the master to the new master If there are frequent DDL statements going on in the system, this means the script would find the Xid before the DDL statement. For example a INDEX creation is already executed on the slave. When failing over to a galera master, it will find the position of that Xid. This, the CREATE INDEX will be executed again and fail the replication. Is this something in mysqlbinlog or in Galera? If it's in Galera: How does the galera cluster itself know if the DDL has already been executed in case of a failing node doing IST?

Elena Stepanova added a comment - 2015-12-29 02:53

The above algorithm does not sound either easy or reliable to me.
If it's some hand-made script this kind of failove automation, I don't think that the server should print fake information to the binary log just to facilitate it.
I don't have a really strong opinion on the subject though, if nirbhay_c and knielsen decide to go for it, then fine. Reassigning to nirbhay_c.

Elena Stepanova added a comment - 2015-12-29 02:53 The above algorithm does not sound either easy or reliable to me. If it's some hand-made script this kind of failove automation, I don't think that the server should print fake information to the binary log just to facilitate it. I don't have a really strong opinion on the subject though, if nirbhay_c and knielsen decide to go for it, then fine. Reassigning to nirbhay_c .

Michaël de groot added a comment - 2015-12-29 04:19

I never said it was pretty

If MariaDB GTID was kept in sync in the galera cluster this would not be needed by the way.

Michaël de groot added a comment - 2015-12-29 04:19 I never said it was pretty If MariaDB GTID was kept in sync in the galera cluster this would not be needed by the way.

Michaël de groot added a comment - 2017-04-18 20:21

This issue is a work-around for the issue that ~~MDEV-10715~~ fixes - keeping GTID in sync within a galera cluster.

Michaël de groot added a comment - 2017-04-18 20:21 This issue is a work-around for the issue that MDEV-10715 fixes - keeping GTID in sync within a galera cluster.

Michaël de groot added a comment - 2017-11-05 17:40

In my opinion, this change does not need to be implemented. The xid refers to xa transactions and ddl is not transactional in mariadb.

The underlying reason of requesting this work around has been fixed in the related issue

Michaël de groot added a comment - 2017-11-05 17:40 In my opinion, this change does not need to be implemented. The xid refers to xa transactions and ddl is not transactional in mariadb. The underlying reason of requesting this work around has been fixed in the related issue

People

Assignee:: Sachin Setiya (Inactive)

Reporter:: Michaël de groot

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2015-12-22 19:00

Updated:: 2017-11-05 18:16

Resolved:: 2017-11-05 17:40

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server