[MDEV-4694] Changes to mysql.gtid_slave_pos table become visible too early Created: 2013-06-22  Updated: 2013-06-22  Resolved: 2013-06-22

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: 10.0.4

Type: Bug Priority: Minor
Reporter: Elena Stepanova Assignee: Kristian Nielsen
Resolution: Not a Bug Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-26 Global transaction ID Closed

 Description   

The summary might not be entirely accurate from an internals perspective, but it describes the effect I observe. Please feel free to modify it if needed.

I have a master => slave setup with master_use_gtid=current_pos. The slave does not have log-slave-updates enabled.
A test executes a DML flow (single thread) on the master, while simultaneously checking gtid* values on the slave every 10 seconds and crashing and restarting the slave every 30 seconds.
To check the gtid* values, it uses the following query:

SELECT GROUP_CONCAT(domain_id,"-",server_id,"-",seq_no separator ","), @@gtid_current_pos, @@gtid_slave_pos, @@gtid_binlog_pos, MAX(seq_no)
FROM mysql.gtid_slave_pos 
WHERE (domain_id, sub_id) IN (SELECT domain_id, MAX(sub_id) FROM mysql.gtid_slave_pos GROUP BY domain_id)

and prints the values like this:

table mysql.gtid_slave_pos: 0-1-1511
gtid_current_pos: 0-1-1511
gtid_slave_pos:   0-1-1511
gtid_binlog_pos:  
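
In effect, the query selects, per replication domain, the row with the greatest sub_id, which holds the most recently applied transaction. A toy Python equivalent of that filter (the rows below are hypothetical, not actual test data):

```python
# Toy model of the monitoring query: mysql.gtid_slave_pos keeps one row
# per applied transaction; per domain, the row with the highest sub_id
# holds the current position. Rows here are made-up examples.
rows = [
    # (domain_id, sub_id, server_id, seq_no)
    (0, 100, 1, 1509),
    (0, 101, 1, 1510),
    (0, 102, 1, 1511),  # highest sub_id in domain 0 -> current position
]

def current_positions(rows):
    """Per domain, pick the row with the greatest sub_id, like the
    WHERE (domain_id, sub_id) IN (SELECT domain_id, MAX(sub_id) ...)
    subquery filter in the SQL above."""
    latest = {}
    for domain_id, sub_id, server_id, seq_no in rows:
        if domain_id not in latest or sub_id > latest[domain_id][0]:
            latest[domain_id] = (sub_id, server_id, seq_no)
    return {d: f"{d}-{srv}-{seq}" for d, (sub, srv, seq) in latest.items()}

print(current_positions(rows))  # {0: '0-1-1511'}
```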

Normally, I see either 3 equal values, as above, or

table mysql.gtid_slave_pos: 0-1-3197
gtid_current_pos: 0-1-3196
gtid_slave_pos:   0-1-3196
gtid_binlog_pos:  

which I attributed to a delay in updating the variables.
However, every once in a while I see the following picture:

Slave GTID status:
  table mysql.gtid_slave_pos: 0-1-14040
  gtid_current_pos: 0-1-14039
  gtid_slave_pos:   0-1-14039
  gtid_binlog_pos:  
Sending SIGKILL to server with pid 42395 in order to force crash recovery
 Trying to restart the server ...
Slave GTID status after server restart:
  table mysql.gtid_slave_pos: 0-1-14039
  gtid_current_pos: 0-1-14039
  gtid_slave_pos:   0-1-14039
  gtid_binlog_pos:  

That is, mysql.gtid_slave_pos already had seq_no 14040 before server crash, but after restart it's back to 14039.
At the same time, I see no indication of the event 14040 being re-executed. So, it looks like no harm is done.
Transaction isolation level is default REPEATABLE-READ.
I find it bizarre, because to my understanding the table is supposed to be updated in the same transaction as the change the new GTID belongs to; so if I can see the change, the transaction should already be committed, and then I don't understand how it can be rolled back after restart...

I can provide more information, including how to reproduce it with a stress test, or I can set up the test on perro, or modify the test in any manner, or do whatever it takes to debug it; but I'm also quite prepared to learn that it works as expected.

bzr version-info

revision-id: knielsen@knielsen-hq.org-20130621192324-544gs6nw2gj4h45x
revno: 3664
branch-nick: 10.0-base

The effect is not related to the two recent bugfixes; I observed it before them as well.



 Comments   
Comment by Kristian Nielsen [ 2013-06-22 ]

This is actually InnoDB behaviour, as per this comment from the source:

-----------------------------------------------------------------------
/* The following assignment makes the transaction committed in memory
and makes its changes to data visible to other transactions.
NOTE that there is a small discrepancy from the strict formal
visibility rules here: a human user of the database can see
modifications made by another transaction T even before the necessary
log segment has been flushed to the disk. If the database happens to
crash before the flush, the user has seen modifications from T which
will never be a committed transaction. However, any transaction T2
which sees the modifications of the committing transaction T, and
which also itself makes modifications to the database, will get an lsn
larger than the committing transaction T. In the case where the log
flush fails, and T never gets committed, also T2 will never get
committed. */
-----------------------------------------------------------------------
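
The window the comment describes can be modeled with a toy commit sequence (a sketch, not InnoDB code): a transaction is first marked committed in memory, and thus visible to readers, and only then is its log flushed; a crash between those two steps rolls it back at recovery.

```python
# Toy model of the visibility window described in the InnoDB comment
# above. Not InnoDB code: just two ordered commit steps, "visible in
# memory" then "log flushed", with recovery keeping only flushed
# transactions.

class ToyEngine:
    def __init__(self):
        self.visible = set()   # committed in memory: readers can see these
        self.flushed = set()   # log flushed to disk: these survive a crash

    def commit_in_memory(self, trx):
        self.visible.add(trx)  # step 1: changes become visible to others

    def flush_log(self, trx):
        self.flushed.add(trx)  # step 2: the commit becomes durable

    def crash_and_recover(self):
        # after recovery, only durably committed transactions remain
        self.visible = set(self.flushed)

engine = ToyEngine()
engine.commit_in_memory("T")   # readers now see T's changes...
engine.crash_and_recover()     # ...but the crash hits before flush_log("T")
print("T" in engine.visible)   # False: T was visible, yet rolled back
```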

I was able to repeat the same behaviour with just a standard InnoDB table, no
replication or GTID involved. Test case as follows; note that it uses a sleep
and is thus racy, so it may not behave the same on your machine:

-----------------------------------------------------------------------

rpl_mdev4694.test:

--source include/have_innodb.inc
--source include/have_debug.inc

--connect(con1,localhost,root,,)

--connection default
CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;

INSERT INTO t1 VALUES (1, 1);
INSERT INTO t1 VALUES (2, 2);

--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
wait
EOF

--connection con1
BEGIN;
INSERT INTO t1 VALUES (3,3);
DELETE FROM t1 WHERE a=1;
send COMMIT;

--connection default
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
SELECT * FROM t1 ORDER BY a;
SET SESSION debug_dbug="+d,dummy_disable_default_dbug_output";
SET SESSION debug_dbug="+d,crash_dispatch_command_before";
--error 2006,2013
SELECT 1;
--source include/wait_until_disconnected.inc

--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
restart
EOF

--enable_reconnect
--source include/wait_until_connected_again.inc

SELECT * FROM t1 ORDER BY a;

--connection con1
--enable_reconnect
--error 0,2006,2013
reap;

--connection default
DROP TABLE t1;
-----------------------------------------------------------------------

You will need this .opt file:

-----------------------------------------------------------------------

rpl_mdev4694-master.opt:

--skip-stack-trace --skip-core-file
-----------------------------------------------------------------------

And this patch:

-----------------------------------------------------------------------
=== modified file 'storage/xtradb/trx/trx0trx.c'
--- storage/xtradb/trx/trx0trx.c	2013-06-06 15:51:28 +0000
+++ storage/xtradb/trx/trx0trx.c	2013-06-22 09:20:43 +0000
@@ -1200,7 +1200,7 @@ trx_commit_off_kernel(
 	ulint	flush_log_at_trx_commit;
 
 	mutex_exit(&kernel_mutex);
-
+	os_thread_sleep(3000000);
 	if (trx->insert_undo != NULL) {
 
 		trx_undo_insert_cleanup(trx);
-----------------------------------------------------------------------

I get the following output:

-----------------------------------------------------------------------
CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1, 1);
INSERT INTO t1 VALUES (2, 2);
BEGIN;
INSERT INTO t1 VALUES (3,3);
DELETE FROM t1 WHERE a=1;
COMMIT;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
SELECT * FROM t1 ORDER BY a;
a b
2 2
3 3
SET SESSION debug_dbug="+d,dummy_disable_default_dbug_output";
SET SESSION debug_dbug="+d,crash_dispatch_command_before";
SELECT 1;
Got one of the listed errors
SELECT * FROM t1 ORDER BY a;
a b
1 1
2 2
DROP TABLE t1;
main.rpl_mdev4694 'xtradb' [ pass ] 13829
-----------------------------------------------------------------------

So the transaction is visible before the crash, but gone after the crash.

So this seems to be intended InnoDB behaviour. Transactions seen just before a
crash may still be rolled back during crash recovery, if the crash happens in
the middle of the commit of said transactions.

I believe it will not cause any issues for GTID, as you also seem to
observe. The change to mysql.gtid_slave_pos will be recovered or rolled back
together with the replicated transaction during InnoDB crash recovery, so the
result will be consistent and correct irrespective of what was visible before
the crash.
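
Why this stays consistent can be sketched as follows (a toy model, not MariaDB code, under the assumption stated above that the position update and the replicated change commit atomically): recovery keeps or discards them together, so the table never points past the actual data.

```python
# Toy model (not MariaDB code): the replicated change and the
# mysql.gtid_slave_pos update are written in one transaction, so crash
# recovery keeps both or rolls back both, never just one.

def apply_event(durable_state, seq_no, crash_before_flush=False):
    # stage both updates as part of the same transaction
    staged = dict(durable_state, data=seq_no, gtid_slave_pos=seq_no)
    if crash_before_flush:
        return durable_state  # whole transaction rolled back at recovery
    return staged             # whole transaction durable

state = {"data": 14039, "gtid_slave_pos": 14039}
state = apply_event(state, 14040, crash_before_flush=True)
# After restart both values are 14039 again: position and data agree.
print(state)  # {'data': 14039, 'gtid_slave_pos': 14039}
```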

So I will mark this as "not a bug". But please re-open if you think there is
more related to this to consider or investigate. I am quite happy that you
have reached this deep a level of testing of the GTID code. Please continue
reporting anything that appears strange; even if this one is not a bug, the
next one may well be.

Comment by Elena Stepanova [ 2013-06-22 ]

Thank you, I suspected there was a catch in there somewhere, but it looked so weird that I just had to find out. I agree that it's not a bug, at least as long as we don't find a scenario where it actually causes any trouble.

Generated at Thu Feb 08 06:58:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.