[MDEV-4694] Changes to mysql.gtid_slave_pos table become visible too early - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Minor
Resolution: Not a Bug
Affects Version/s: None
Fix Version/s: 10.0.4
Component/s: None
Labels:
None

Description

The summary might be not quite accurate looking from inside, but it describes the effect I observe. Please feel free to modify it if needed.

I have master=>slave setup with master_use_gtid=current_pos. The slave does not have log-slave-updates.
A test executes DML flow (1 thread only) on master, simultaneously checking gtid* values on slave every 10 seconds, and crashing and restarting the slave every 30 seconds.
To check the gtid* values, it uses the following query:

SELECT GROUP_CONCAT(domain_id,"-",server_id,"-",seq_no separator ","), @@gtid_current_pos, @@gtid_slave_pos, @@gtid_binlog_pos, MAX(seq_no)

FROM mysql.gtid_slave_pos

WHERE (domain_id, sub_id) IN (SELECT domain_id, MAX(sub_id) FROM mysql.gtid_slave_pos GROUP BY domain_id)

and prints the values like this:

table mysql.gtid_slave_pos: 0-1-1511

gtid_current_pos: 0-1-1511

gtid_slave_pos:   0-1-1511

gtid_binlog_pos:

Normally, I see either 3 equal values, as above, or

table mysql.gtid_slave_pos: 0-1-3197

gtid_current_pos: 0-1-3196

gtid_slave_pos:   0-1-3196

gtid_binlog_pos:

which I attributed to a delay at updating the variables.
However, every once in a while I see the following picture:

Slave GTID status:

  table mysql.gtid_slave_pos: 0-1-14040

  gtid_current_pos: 0-1-14039

  gtid_slave_pos:   0-1-14039

  gtid_binlog_pos:

Sending SIGKILL to server with pid 42395 in order to force crash recovery

 Trying to restart the server ...

Slave GTID status after server restart:

  table mysql.gtid_slave_pos: 0-1-14039

  gtid_current_pos: 0-1-14039

  gtid_slave_pos:   0-1-14039

  gtid_binlog_pos:

That is, mysql.gtid_slave_pos already had seq_no 14040 before server crash, but after restart it's back to 14039.
At the same time, I see no indication of the event 14040 being re-executed. So, it looks like no harm is done.
Transaction isolation level is default REPEATABLE-READ.
I find it bizarre, because to my understanding the table is supposed to be updated in the same transaction as the change the new GTID belongs to, so if I see the change, the transaction should be already committed; and then I don't understand how it can be rolled back after restart...

I can provide more information, including how to reproduce it with a stress test, or I can setup the test on perro, or modify the test in any manner, or do whatever it takes to debug it, but I'm also quite prepared to learn that it works as expected.

bzr version-info

revision-id: knielsen@knielsen-hq.org-20130621192324-544gs6nw2gj4h45x

revno: 3664

branch-nick: 10.0-base

The effect is not related to two recent bugfixes, I observed it before as well.

Attachments

Issue Links

relates to

MDEV-26 Global transaction ID

Closed

Activity

People

Assignee:: Kristian Nielsen

Reporter:: Elena Stepanova

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 2013-06-22 02:54

Updated:: 2013-06-22 12:50

Resolved:: 2013-06-22 12:46

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server