
MDEV-23586: Mariabackup: GTID saved for replication in 10.4.14 is wrong

Details

    Description

      It appears that the fix for https://jira.mariadb.org/browse/MDEV-21953 introduced a serious regression: any backup taken with MariaDB Server 10.4.14 and above saves the wrong GTID.

      To reproduce, take a mariabackup from a 10.4.14 instance, then notice that the reported position refers to an incorrect GTID. In the example below, the PIT corresponds to position 25035029, but the reported GTID 0-1-18691 is wrong and, if used, yields several duplicate key errors. The correct GTID for that position would be 0-1-18699, as reported by the BINLOG_GTID_POS() function.

      This wrong GTID is saved to xtrabackup_slave_info, and since the binlog position is no longer saved there (why was that removed?), the resulting backup no longer carries valid PIT information.
      This is hitting a customer of ours and makes all backups inconsistent due to wrong PIT info, hence the critical priority.

      Thanks to aponzo who found and verified this issue.

      Workaround: downgrade server to 10.4.13

      [00] 2020-08-26 12:25:52 Backup created in directory '/root/backup/andre2/xtrabackup_backupfiles/'
      [00] 2020-08-26 12:25:52 MySQL binlog position: filename 'm2-bin.000007', position '25035029', GTID of the last change '0-1-18691'
      [00] 2020-08-26 12:25:52 MySQL slave binlog position: master host '192.168.56.101', gtid_slave_pos 0-1-18691
      [00] 2020-08-26 12:25:52 Streaming backup-my.cnf
      [00] 2020-08-26 12:25:52         ...done
      [00] 2020-08-26 12:25:52 Streaming xtrabackup_info
      [00] 2020-08-26 12:25:52         ...done
      [00] 2020-08-26 12:25:52 Redo log (from LSN 29523906 to 31884588) was copied.
      [00] 2020-08-26 12:25:52 completed OK!
      Output of mariabackup --prepare:
      2020-08-26 12:26:53 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=30269374
      2020-08-26 12:26:53 0 [Note] InnoDB: Starting final batch to recover 286 pages from redo log.
      2020-08-26 12:26:53 0 [Note] InnoDB: Last binlog file './m2-bin.000007', position 25035029
      [00] 2020-08-26 12:26:54 Last binlog file ./m2-bin.000007, position 25035029
      [00] 2020-08-26 12:26:54 completed OK!
      MariaDB [test]> SELECT BINLOG_GTID_POS("m2-bin.000007",25035029 );
      +--------------------------------------------+
      | BINLOG_GTID_POS("m2-bin.000007",25035029 ) |
      +--------------------------------------------+
      | 0-1-18699                                  |
      +--------------------------------------------+
      1 row in set (0.133 sec)
      


          Activity

            rpizzi Rick Pizzi (Inactive) added a comment - - edited

            From further testing:

            • the problem hits with slave_parallel_threads = 0 too
            • the drift between the incorrect and the correct GTID equals the number of parallel threads (with no parallel threads there is a single thread, so the drift is 1)

            GeoffMontee Geoff Montee (Inactive) added a comment -

            In the example below, the PIT corresponds to position 25035029, but the reported GTID 0-1-18691 is wrong and, if used, yields several duplicate key errors. In fact the correct GTID for that position would be 0-1-18699, as reported by the BINLOG_GTID_POS() function.

            Are you using the position printed by mariabackup --prepare to validate the correctness of the position in xtrabackup_binlog_info, i.e. this position?

            2020-08-26 12:26:53 0 [Note] InnoDB: Last binlog file './m2-bin.000007', position 25035029
            [00] 2020-08-26 12:26:54 Last binlog file ./m2-bin.000007, position 25035029
            

            If so, then I want to point out that the position printed by mariabackup --prepare is not guaranteed to be consistent, so it is not a good idea to use this position to validate the position in xtrabackup_binlog_info. In fact, the position printed by mariabackup --prepare should never be used at all. For more information about that, see MDEV-21611.

            Only the positions in xtrabackup_binlog_info and xtrabackup_slave_info are currently guaranteed to be consistent by Mariabackup.

            In other words, it is totally normal and expected if the position saved in xtrabackup_binlog_info is different from the position printed by Mariabackup, because only the former position is supposed to be consistent.

            Is your position in xtrabackup_binlog_info actually incorrect? I don't know, so I can't comment on that. I just want to point out that it is unsafe to use the position printed by mariabackup --prepare to make any judgement about the correctness of the position in xtrabackup_binlog_info.


            rpizzi Rick Pizzi (Inactive) added a comment -

            The printed position and the position saved in the file are identical.


            monty Michael Widenius added a comment -

            What is PIT? Point in time? (Binlog position would maybe be a better term in that case.)

            From what I understand from above, the printed binlog file and positions are correct.
            Only the GTID printed by mariabackup is wrong. Is this correct?

            rpizzi Rick Pizzi (Inactive) added a comment - - edited

            Yes, sorry, I meant the binlog position - which is correct; only the GTID is wrong.

            The issue is that in the file xtrabackup_slave_info only the GTID is saved. I think it would be best to go back to also printing the position...

            Rick

            rpizzi Rick Pizzi (Inactive) added a comment - - edited

            julien.fritsch A test case is more than trivial to build for this issue... just set up one master and one replica, with GTID enabled and slave_parallel_threads > 1, then take a mariabackup on the replica and verify that the GTID saved in the slave_info file does not correspond to the GTID of the saved position as reported by BINLOG_GTID_POS(); a sketch of the replica setup follows below.
            You need some traffic flowing on the master for this, of course; we used sysbench.
            As additional proof: if the sysbench workload is inserting rows, using the recommended GTID to set up replication will result in duplicate key errors.
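
            A minimal sketch of the replica-side setup for this reproduction; the host is taken from the backup log above, while the replication user, password, and thread count are illustrative placeholders:

            STOP SLAVE;
            SET GLOBAL slave_parallel_threads = 4;   -- > 1 makes the GTID drift visible
            CHANGE MASTER TO
              MASTER_HOST = '192.168.56.101',        -- master A's address (from the log above)
              MASTER_USER = 'repl',                  -- hypothetical replication user
              MASTER_PASSWORD = 'secret',            -- placeholder
              MASTER_USE_GTID = slave_pos;
            START SLAVE;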


            vlad.lesin Vladislav Lesin added a comment -

            The binlog position is obtained with SHOW MASTER STATUS in mariabackup. The GTID of the last change is the gtid_current_pos variable, which is obtained with a SHOW VARIABLES statement (see write_binlog_info()). Both statements are executed under the BACKUP STAGE BLOCK_COMMIT lock (if the --no-lock option is not used).
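
            Expressed as plain SQL, the sequence is roughly the following (a sketch based on the description above; the exact statements issued by write_binlog_info() may differ):

            BACKUP STAGE START;
            BACKUP STAGE FLUSH;
            BACKUP STAGE BLOCK_DDL;
            BACKUP STAGE BLOCK_COMMIT;
            SHOW MASTER STATUS;                        -- binlog file + position
            SHOW VARIABLES LIKE 'gtid_current_pos';    -- "GTID of the last change"
            BACKUP STAGE END;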

            I failed to repeat it with an mtr test. I started replication with several slave threads and set watchpoints on the variables which are used to compute the binlog position (mysql_bin_log.log_file.pos_in_file, mysql_bin_log.log_file.write_pos, mysql_bin_log.log_file.request_pos; see also MYSQL_BIN_LOG::raw_get_current_log(), the function which is used to compute the binlog position on SHOW MASTER STATUS execution). I also set breakpoints on the function which is used to update gtid_current_pos (rpl_binlog_state::update_nolock()). Then I took the BLOCK_COMMIT lock on the slave and executed a batch of INSERTs on the master. There were no hits after BLOCK_COMMIT on the slave.

            I am trying to repeat the bug with sysbench.


            aponzo Andrea Ponzo (Inactive) added a comment -

            vlad.lesin, let me explain this issue better:
            we have a server A (MariaDB version is not important)
            and a server B (slave of A, MariaDB version 10.4.14);
            we need to create a server C (version 10.4.14) as a slave of A by performing a mariabackup on B.
            During the backup, traffic is flowing on master A.
            At the end of the backup we need to use the file xtrabackup_slave_info to get the CHANGE MASTER statement in order to connect C as a slave of A,

            and we have

            [root@m2 andre2]# cat xtrabackup_slave_info
            SET GLOBAL gtid_slave_pos = '0-1-18691';
            CHANGE MASTER TO master_use_gtid = slave_pos

            of course we will add the MASTER_HOST of A.
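
            Spelled out, the statements run on C would look like this (a sketch; only A's address, taken from the backup log above, is added to what the file provides):

            SET GLOBAL gtid_slave_pos = '0-1-18691';
            CHANGE MASTER TO
              MASTER_HOST = '192.168.56.101',
              MASTER_USE_GTID = slave_pos;
            START SLAVE;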

            Doing this, we get duplicate key errors because the above gtid_slave_pos is wrong.

            if we open the file
            [root@m2 andre2]# cat xtrabackup_binlog_info
            m2-bin.000007 25035029 0-1-18691

            this is the binlog file and position on server B, and if we calculate the GTID for that position:

            MariaDB [test]> SELECT BINLOG_GTID_POS("m2-bin.000007",25035029 );
            +--------------------------------------------+
            | BINLOG_GTID_POS("m2-bin.000007",25035029 ) |
            +--------------------------------------------+
            | 0-1-18699                                  |
            +--------------------------------------------+

            it is different from the one reported in xtrabackup_slave_info.

            So, MASTER_LOG_FILE and MASTER_LOG_POS are correct if we want to set up server C as a slave of B, but the GTID is not.

            The sequence-number difference is due to the number of parallel workers used:
            the difference is 1 for slave_parallel_threads = 0 or 1,
            and n for slave_parallel_threads = n on server B.
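
            Given that the file/position pair is correct, a possible workaround (our suggestion, not verified here) is to derive the GTID from that pair on B instead of trusting xtrabackup_slave_info:

            -- on server B, using the values from xtrabackup_binlog_info:
            SELECT BINLOG_GTID_POS('m2-bin.000007', 25035029);   -- returns 0-1-18699
            -- then on server C, use the returned GTID:
            SET GLOBAL gtid_slave_pos = '0-1-18699';
            CHANGE MASTER TO MASTER_USE_GTID = slave_pos;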

            Can you please report, from your test, the output of:
            cat xtrabackup_binlog_info
            SELECT BINLOG_GTID_POS(file, pos);

            thanks
            Andrea


            vlad.lesin Vladislav Lesin added a comment -

            My current assumption, which I am trying to reproduce in an MTR test, is the following:

            Consider the following code:

            int ha_commit_trans(THD *thd, bool all)                                         
            {
            ...
              if (!is_real_trans)                                                           
              {                                                                             
                error= commit_one_phase_2(thd, all, trans, is_real_trans, rw_trans);        
                goto done;                                                                  
              }
            ...
              cookie= tc_log->log_and_order(thd, xid, all, need_prepare_ordered,            
                                            need_commit_ordered);
            ...
              error= commit_one_phase_2(thd, all, trans, is_real_trans, rw_trans) ? 2 : 0;  
            ... 
            }
            

            tc_log->log_and_order() flushes the binlog and advances the binlog position. commit_one_phase_2() acquires the backup block-commit lock. As we can see, in the case when is_real_trans == true, the binlog is flushed before the block-commit lock is acquired.

            So, there can be the following scenario:
            T2 - a slave worker, which has started a transaction and executed some Write_rows events;
            T1 - the connection from mariabackup, which holds the backup block-commit lock;
            T2 - executes Xid_log_event in ha_commit_trans(), and as is_real_trans == true for Xid_log_event, it executes tc_log->log_and_order() first, i.e. flushes the binlog and advances the binlog position, but is then blocked in commit_one_phase_2() because the backup block-commit lock is held by T1;
            T1 - executes 'SHOW MASTER STATUS' and 'SHOW VARIABLES LIKE "gtid_current_pos"', and receives a binlog position which is ahead of gtid_current_pos.
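
            The race can be sketched with two client sessions (a hand-run illustration of the scenario above, not the MTR test itself; timing is what matters):

            -- session T1 (plays the mariabackup connection):
            BACKUP STAGE START;
            BACKUP STAGE FLUSH;
            BACKUP STAGE BLOCK_DDL;
            BACKUP STAGE BLOCK_COMMIT;
            -- session T2 (any committing connection): COMMIT a row change here.
            -- Its binlog flush in tc_log->log_and_order() completes, then it
            -- blocks on the MDL_BACKUP_COMMIT lock in commit_one_phase_2().
            -- back in session T1, the pair now disagrees:
            SHOW MASTER STATUS;                        -- already-advanced position
            SHOW VARIABLES LIKE 'gtid_current_pos';    -- not yet advanced
            BACKUP STAGE END;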


            vlad.lesin Vladislav Lesin added a comment -

            monty, I have just confirmed the above scenario with a test case: https://github.com/MariaDB/server/commit/cbcb921bb83be8b84142dfa73583724c8af843a8.

            vlad.lesin Vladislav Lesin added a comment - - edited

            I have just added one more test case that reproduces the scenario without replication: https://github.com/MariaDB/server/commit/f19ff5b68e70214c4d75a2165ce8f5ef8cd2739f. The problem is the same: the binlog flush in ha_commit_trans() is not covered by the backup lock.


            monty Michael Widenius added a comment -

            Code to review is in bb-10.4-monty.


            monty Michael Widenius added a comment -

            Fixed by partly reverting MDEV-21953 to put back MDL_BACKUP_COMMIT locking
            before log_and_order.

            The original problem in MDEV-21953 was that while a thread was waiting
            for other threads to commit in 'log_and_order', it held the
            MDL_BACKUP_COMMIT lock. The backup thread was waiting to get the
            MDL_BACKUP_WAIT_COMMIT lock, which blocks all new MDL_BACKUP_COMMIT locks.
            This causes a deadlock, as the waited-for thread can never get past the
            MDL_BACKUP_COMMIT lock in ha_commit_trans.

            The main part of the bug fix is to release the MDL_BACKUP_COMMIT lock while
            a thread is waiting for other 'previous' threads to commit. This ensures
            that no transactional thread keeps MDL_BACKUP_COMMIT while waiting, which
            ensures that there are no deadlocks any more.


            People

              Assignee: monty Michael Widenius
              Reporter: rpizzi Rick Pizzi (Inactive)