[MDEV-10282] Queries which read from INFORMATION_SCHEMA and such should be marked as unsafe for SBR Created: 2016-06-24  Updated: 2016-06-24

Status: Open
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 5.5, 10.0, 10.1
Fix Version/s: 10.2

Type: Bug Priority: Minor
Reporter: Elena Stepanova Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: upstream


 Description   

It's an old upstream bug https://bugs.mysql.com/bug.php?id=48382. It says:

The tables in the mysql and information_schema databases are filled in and updated by the server automatically, without replicating the changes. The contents may generally vary depending on server version, execution history and many other parameters that may vary between master and slave.
<...>
Such statements should be marked unsafe, so that they get logged in row format.

I'm re-filing it because I hit this problem from time to time in my concurrent tests and spend a lot of time analyzing the failure.

For example, it represents in this rather obscure way:

# 2016-06-24T21:03:29 [21878] Slave has stopped with error: Could not execute Delete_rows_v1 event on table test.is_schemata; Can't find record in 'is_schemata', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000001, end_log_pos 66972

Master runs 2 threads.
One repeatedly executes CREATE DATABASE and DROP DATABASE.
Another thread reads from INFORMATION_SCHEMA.SCHEMATA and inserts into a regular table; or deletes from the table.
Importantly, it does INSERT with REPEATABLE READ, and does DELETE with READ COMMITTED. It makes INSERT be written into the binlog in the statement format, but DELETE is written in row format.

At some point, the binlog ends up with, essentially, the following sequence:

# at 65107
#160624 21:02:19 server id 1  end_log_pos 65224         Query   thread_id=11    exec_time=1     error_code=0
use `test`/*!*/;
SET TIMESTAMP=1466791339/*!*/;
CREATE TABLE `test`.`is_schemata` (`name` VARCHAR(64))
/*!*/;
# at 65224
...
# at 66009
#160624 21:02:20 server id 1  end_log_pos 66118         Query   thread_id=10    exec_time=0     error_code=0
SET TIMESTAMP=1466791340/*!*/;
DROP SCHEMA IF EXISTS test_21879_105
/*!*/;
# at 66118
...
# at 66615
#160624 21:02:20 server id 1  end_log_pos 66804         Query   thread_id=11    exec_time=0     error_code=0
use `test`/*!*/;
SET TIMESTAMP=1466791340/*!*/;
INSERT INTO `test`.`is_schemata` SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME LIKE 'test%' ORDER BY 1
/*!*/;
# at 66804
...
# at 66869
#160624 21:02:20 server id 1  end_log_pos 66972         Delete_rows: table id 82 flags: STMT_END_F
### DELETE FROM `test`.`is_schemata`
### WHERE
###   @1='test'
### DELETE FROM `test`.`is_schemata`
### WHERE
###   @1='test_21879_105'
# at 66972

Either it's because the statements are written in a wrong order, or because INFORMATION_SCHEMA is not updated momentarily upon DDL – all in all, even though test_21879_105 has already been dropped, it is still returned upon a select from INFORMATION_SCHEMA.SCHEMATA. Since DELETE is written in the ROW format, the row explicitly appears and needs to exist on the slave, which does not happen.

perl ./runall-new.pl \
--threads=2 \
--duration=600 \
--queries=100M \
--grammar=/data/tmp/rpl.yy \
--skip-gendata \
--rpl_mode=mixed \
--basedir1=<basedir> --vardir1=<vardir>

rpl.yy

thread1_init:
    { $max_db = 1; $current_db = 0; '' } ;
    
thread2_init:
	CREATE TABLE `test`.`is_schemata` (`name` VARCHAR(64));
 
thread1:
	  CREATE SCHEMA IF NOT EXISTS { $db = ( $current_db ? $current_db : $max_db++ ); $current_db = $db; 'test_'.abs($$).'_'.$db  }
	| DROP SCHEMA IF EXISTS { $db = ( $current_db ? $current_db : $max_db-1 ); $current_db = 0; 'test_'.abs($$).'_'.$db } 
;
 
thread2:
      SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;  DELETE FROM `test`.`is_schemata` 
    | SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ; INSERT INTO `test`.`is_schemata` SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME LIKE 'test%' ORDER BY 1
;

The above is mostly for me to save time in future.

But the problem with INFORMATION_SCHEMA is general and can be demonstrated in dozens various ways, simply because contents INFORMATION_SCHEMA on master and slave are is not supposed to be identical.


Generated at Thu Feb 08 07:41:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.