[MDEV-14760] Replication does not take into account SET STATEMENT Created: 2017-12-24  Updated: 2023-12-19

Status: Stalled
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.1, 10.2, 10.6
Fix Version/s: 10.4

Type: Bug Priority: Critical
Reporter: Elena Stepanova Assignee: Andrei Elkin
Resolution: Unresolved Votes: 1
Labels: decide_logging_format

Issue Links:
Blocks
blocks MDEV-32930 mysqlbinlog --verbose shows incorrect... Open
Relates
relates to MDEV-27462 SET STATEMENT allows variables that c... Stalled

 Description   

DDL statements sent to slave does not take into account the original environment where the query was executed.
This is true when using SET STATEMENT or having a non standard value for any variable that is internally used by a DDL, like default_storage_engine.

--source include/master-slave.inc
 
SET STATEMENT storage_engine=Aria FOR CREATE TABLE t (i INT);
SHOW CREATE TABLE t;
 
--sync_slave_with_master
SHOW CREATE TABLE t;
 
--connection master
DROP TABLE t;
--source include/rpl_end.inc

Result

[connection master]
SET STATEMENT storage_engine=Aria FOR CREATE TABLE t (i INT);
SHOW CREATE TABLE t;
Table	Create Table
t	CREATE TABLE `t` (
  `i` int(11) DEFAULT NULL
) ENGINE=Aria DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1
connection slave;
SHOW CREATE TABLE t;
Table	Create Table
t	CREATE TABLE `t` (
  `i` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1



 Comments   
Comment by Sujatha Sivakumar (Inactive) [ 2019-03-12 ]

IMHO the reported issue is not a bug, because of the following reasons.

Reason 1:-
The MySQL manual states that 'default_storage_engine' and 'storage_engine'
variables are never replicated. Please refer below.

https://dev.mysql.com/doc/refman/5.7/en/replication-features-variables.html
The default_storage_engine and storage_engine system variables are not
replicated, regardless of the logging mode; this is intended to facilitate
replication between different storage engines.

I hope MariaDB also has the same behaviour.

Reason 2:-
SET STATEMENT feature was added as part MDEV-5231. This feature is a port of
"Per query variables from Percona Server".

The ported patch added a test named :rpl_set_statement.test

This test has a scenario do demonstrate that following variables are never
replicated. as part of SET STATEMENTS.

--echo 4) variables that are not replicated at all:
--echo default_storage_engine, storage_engine, max_heap_table_size

Hence the documentation needs to be updated saying the above SET STATEMENTS will
not be replicated.

Please let me know your thoughts.

Comment by Sujatha Sivakumar (Inactive) [ 2019-03-12 ]

Hello Andrei,

Please review the latest comments on the JIRA page and provide your thoughts.

Thank you.

Comment by Andrei Elkin [ 2019-05-29 ]

sujatha.sivakumar, howdy.

Thanks for briefing me with the feature's history, which really saved a little time !
Your point about not replicated SET of default_storage_engine variable is naturally fair. However two facts should be noted in order to decide about the case.
The first is that the current value of default_storage_engine still plays a role, that is to evaluate
the engine type for CREATE TABLE t (a int);, like here:

MariaDB [test]> SET @@SESSION.default_storage_engine=aria;
 
MariaDB [test]> CREATE TABLE ta (a INT);
 
MariaDB [test]> SHOW CREATE TABLE t_a;
+-------+------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                         |
+-------+------------------------------------------------------------------------------------------------------+
| t_a   | CREATE TABLE `t_a` (
  `a` int(11) DEFAULT NULL
) ENGINE=Aria DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1 |
+-------+------------------------------------------------------------------------------------------------------+

After the evaluation the chosen value is also present in Query-log-event (the evidence part is omitted though).

And secondly the whole idea of the SET-STATEMENT
feature is to provide one statement time "dynamic" exception to any current default.
Notice that unlike to the "static" default evaluation case the binlog event does not match in the engine attribute,
which I can read only as a bug.

I hope this makes it more clear.

Cheers,
Andrei

Comment by Andrei Elkin [ 2019-05-29 ]

It took time to arrive @ the question, Sujatha.

Comment by Sujatha Sivakumar (Inactive) [ 2019-05-31 ]

Hello Andrei,

Good morning. Thank you for the review comments.

The system variables which are set using "SET STATEMENT" are never replicated.
But we just write the query along with the "SET STATEMENT" into the
"Query_log_event". On slave the "SET STATEMENT" syntax is ignored. Only the
actual query gets executed.

But if we look at the 'rpl_set_statement.test' the second test case states that
the system variables which ARE replicated, They must be replicated correctly
with "SET STATEMENT" too.

For example:
-------------------
MariaDB [test]> SET STATEMENT auto_increment_increment=40 FOR INSERT INTO t1
VALUES(@@auto_increment_increment); Query OK, 1 row affected (0.01 sec)

In general 'auto_increment_increment' system variable is replicated. Please have
a look at the following binlog output. "SET @@session.auto_increment_increment=40"
gets added to binary log as part of regular replication. On the slave side this
"SET @@session.auto_increment_increment=40" command will be executed as part of
Query_log_event::do_apply_event. It is not because of
"SET STATEMENT auto_increment_increment=40 ...".

BEGIN
/*!*/;
# at 631
#190531 12:30:07 server id 1  end_log_pos 794     Query    thread_id=5    exec_time=0    error_code=0
SET TIMESTAMP=1559286007/*!*/;
SET @@session.auto_increment_increment=40, @@session.auto_increment_offset=2/*!*/;
SET STATEMENT auto_increment_increment=40 FOR INSERT INTO t1 VALUES(@@auto_increment_increment)
/*!*/;
# at 794
#190531 12:30:07 server id 1  end_log_pos 868     Query    thread_id=5    exec_time=0    error_code=0
SET TIMESTAMP=1559286007/*!*/;

"SET STATEMENT" syntax is ignored on slave. The reason why the implementer chose
it that way is, he/she didn't want to introduce any change of behavior
with respect to system variables which should/should not be replicated.

Hence the original code looks like this.

File: sql/sql_parse.cc
 
 if (!lex->stmt_var_list.is_empty() && !thd->slave_thread)
  {
    Query_arena backup;
    DBUG_PRINT("info", ("SET STATEMENT %d vars", lex->stmt_var_list.elements));

The internal implementation of 'SET STATEMENT' looks like this

SET @save_value=@@var1;
SET SESSION var1=value1;
stmt;
SET SESSION var1=@save_value;

For example:
-------------------
If user sets 'storage_engine' at session level, it is never replicated.

MariaDB [test]> set @@session.storage_engine=aria;
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> create table t ( f int);
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [test]> show create table t;
+-------+----------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                       |
+-------+----------------------------------------------------------------------------------------------------+
| t     | CREATE TABLE `t` (
  `f` int(11) DEFAULT NULL
) ENGINE=Aria DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1 |
+-------+----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Binary log output:
---------------------------

SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
create table t ( f int)
/*!*/;
DELIMITER ;
# End of log file

Conclusion: Since "SET STATEMENT" also acts like above it should not replicate
the "storage_engine" variable. Hence we should not change this behavior.

Please let me know your thoughts.

Comment by Andrei Elkin [ 2019-07-01 ]

sujatha.sivakumar, thanks for a thorough analysis and detailing! The matter finally settled in my mind. Like you say in The internal implementation it's fair to think of SET STATEMENT var1,var2,..., var_n FOR Query as a shortcut: save old values; set var:s; exec Query; restore var:s. Now we notice that among the var:s we have ones that are present in Query_log_event::"context" (replicated), and those that are not. Those that are currently not should not get into the context. Those that are in already don't need {{SET STATEMENT var1,var2,..., var_n }} preamble be replicated.
Hence the original design decision makes sense. Secondly I have to agree with not-a-bug status.
Yet we need to acknowledge and tackle a use cases where a not replicatable SET-STATEMENT var is referred in the Query explicitly like SET STATEMENT var=value INSERT INTO t SET a=@@var. The case is clear consistency issue as var=value is not set into the exec context of Query on slave.

I suggest we consider such cases are replication unsafe. Now the question is how to implement that technically, and also when/how urgently.

Comment by Joffrey MICHAIE [ 2021-03-12 ]

Hello,

the same behavior is observed with old_alter_table:

SET STATEMENT old_alter_table = 1 ALTER XXX

While in show processlist on the replica, the SET STATEMENT is displayed in the command, under the hood the ALTER is being performed without setting the variable as requested (State: ALTERING TABLE) and taking much longer than planned. This can be confusing for the User/DBA.

| 948403 | system user | | db_name | Slave_SQL | 329 | altering table | SET STATEMENT old_alter_table = 1 FOR alter table my_table add column_name smallint defaul | 0.000 |

Maybe this should be documented more explicitly as not supported, and some downtime can be avoided.

Cheers,
Joffrey

Comment by Andrei Elkin [ 2021-03-12 ]

sujatha.sivakumar: Could we try something like Annotate_log_event that is to embed the original SET-query into a new char[] Query_log_event::status_var::set_query. The variable would be replicated along with others see Query_log_event::write to replace Query_log_event::query on slave.

Comment by Andrei Elkin [ 2021-03-12 ]

joffrey92 Indeed! Thank you. We're still considering if/and how to fix.

Comment by Michael Widenius [ 2023-09-05 ]

This is not only related to SET STATEMENT but with the whole environment on the master.

set default_storage_engine=aria;
CREATE TABLE t (i INT);
SHOW CREATE TABLE t;
Table Create Table
t CREATE TABLE `t` (
`i` int(11) DEFAULT NULL
) ENGINE=Aria DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci PAGE_CHECKSUM=1
connection slave;
SHOW CREATE TABLE t;
Table Create Table
t CREATE TABLE `t` (
`i` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci

A better way would be to send together with the query the whole environment that the query is using, like default storage engine, default character set etc.
We could do this included in the events (var1=x,var2=y etc) or just always use SET STATEMENT .. with any DDL query in the binary log.

Another options is to send to the slave not the original query but the query one gets from 'show create table' as this includes all information needed.

Comment by Elena Stepanova [ 2023-09-05 ]

This is not only related to SET STATEMENT but with the whole environment on the master

Sure, but it has always been so, as a known legacy limitation that we only replicate a tiny part of the environment.
At the point of filing this report, SET STATEMENT was a new addition, and in my opinion it wasn't obvious at all (and still isn't really) that it should behave the same way as if the master were executing consequently SET var=new_value; <query>; SET var=old_value. While I understand the logic induced by the internal implementation, intuitively, it could be expected that if the statement is executed as a whole and is written in the binlog as a whole, it should also be executed by the slave as a whole.

Comment by Michael Widenius [ 2023-10-23 ]

Agree that the simple solution is to include the original SET STATEMENT into the binary log.

Generated at Thu Feb 08 08:16:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.