Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14760

Replication does not take into account SET STATEMENT

Details

    Description

      Refined and complete description
      ================================

      While SET STATEMENT @@session.var1=v1, ..., @@session.var_n=vn Query,
      when binlog_format = STATEMENT, is binlogged in verbatim the results of the Query execution are different between master (primary) and slave (replica).
      Unlike master, the slave side refuses to adopt var_1, ..., var_n customization, and
      which is the case of either DML and DDL queries.

      Even though that is by design the behavior does not allow for replicating correctly queries that either implicitly depend on session variables whose values are not included into Query_log_event, like

      SET STATEMENT storage_engine=Aria FOR CREATE TABLE t (i INT);

      (on slave the replicated table may not be of Aria type)
      or the set-statement's query is composed with referencing variables explicitly e.g

      SET STATEMENT query_alloc_block_size = 1024 FOR INSERT INTO t_heap SET i=@@session.query_alloc_block_size + 1;

      (on slave the inserted value of i will be the slave's @@global.query_alloc_block_size + 1)
      Note that employing a separate SET @@session.var = value to prepend Query may not be (it *is* not for the two above cases) a workaround for statement binlog format replication either.

      To fix the issue beyond suggested Query re-writing, Monty and others considered effectively lift the design's constrain.

      After all it looks to be grounded solely on pessimistic assessments of 'security issues' at the slave side execution

      F-10: Replication: Slave threads will ignore SET_VAR hints to avoids security
      issues since slave threads run as root, avoids potential problems with
      variables replicated with SBR.

      Apparently it's far from being so dramatic as the above examples may hint. Specifically storage_engine,query_alloc_block_size and lots of other are security safe.

      I suggest to indeed *revise* that decision, and when that's done *introduce* for backward compatibility a slave side switch. Along that let's also *attend* MDEV-27462 which must review the current list of disallowed variables (in particular consider to append to it binlog_format which is disallowed to change within transaction context which may not be the case on master but it may be so on slave).

      With this measure we also achieve (addresses knielsen's note) consistent results between the slave applier and the binlog "direct" applier executions.
      The replicated with session-variables SET-STATEMENT-FOR-Query could be engaged further on cases when correctness can be accomplished (addresses monty's note) only with more reach execution context than currently Query_log_event is provided with.


      *Former Elena's description:*

      DDL statements sent to slave does not take into account the original environment where the query was executed.
      This is true when using SET STATEMENT or having a non standard value for any variable that is internally used by a DDL, like default_storage_engine.

      --source include/master-slave.inc
       
      SET STATEMENT storage_engine=Aria FOR CREATE TABLE t (i INT);
      SHOW CREATE TABLE t;
       
      --sync_slave_with_master
      SHOW CREATE TABLE t;
       
      --connection master
      DROP TABLE t;
      --source include/rpl_end.inc
      


      Result

      [connection master]
      SET STATEMENT storage_engine=Aria FOR CREATE TABLE t (i INT);
      SHOW CREATE TABLE t;
      Table	Create Table
      t	CREATE TABLE `t` (
        `i` int(11) DEFAULT NULL
      ) ENGINE=Aria DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1
      connection slave;
      SHOW CREATE TABLE t;
      Table	Create Table
      t	CREATE TABLE `t` (
        `i` int(11) DEFAULT NULL
      ) ENGINE=MyISAM DEFAULT CHARSET=latin1
      

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - - edited

            This is not only related to SET STATEMENT but with the whole environment on the master

            Sure, but it has always been so, as a known legacy limitation that we only replicate a tiny part of the environment.
            At the point of filing this report, SET STATEMENT was a new addition, and in my opinion it wasn't obvious at all (and still isn't really) that it should behave the same way as if the master were executing consequently SET var=new_value; <query>; SET var=old_value. While I understand the logic induced by the internal implementation, intuitively, it could be expected that if the statement is executed as a whole and is written in the binlog as a whole, it should also be executed by the slave as a whole.

            elenst Elena Stepanova added a comment - - edited This is not only related to SET STATEMENT but with the whole environment on the master Sure, but it has always been so, as a known legacy limitation that we only replicate a tiny part of the environment. At the point of filing this report, SET STATEMENT was a new addition, and in my opinion it wasn't obvious at all (and still isn't really) that it should behave the same way as if the master were executing consequently SET var=new_value; <query>; SET var=old_value . While I understand the logic induced by the internal implementation, intuitively, it could be expected that if the statement is executed as a whole and is written in the binlog as a whole, it should also be executed by the slave as a whole.

            Agree that the simple solution is to include the original SET STATEMENT into the binary log.

            monty Michael Widenius added a comment - Agree that the simple solution is to include the original SET STATEMENT into the binary log.

            I agree with the idea that SET STATEMENT X=Y FOR <stmt> should be identical to SET old=X; SET X=Y; <stmt>; SET X=old.

            I think this is needed to ensure that the semantics is clear and correct. This way, it is easy to describe and for the user to understand what it does. And making it work differently opens all kinds of potential corner cases.

            That still leaves an issue. It is then a bit strange that the <stmt> is not binlogged the same way, but the entire SET STATEMENT ... original client SQL string appears in the Query_log_event (in statement-based binlogging). And it currently means that mysqlbinlog | mysql does not handle it identical to explicit set/restore, as the SET STATEMENT ... takes effect there.

            The best might be to remove the initial SET STATEMENT ... FOR part from the query string entirely before binlogging. We do have some query string rewriting before binlogging already (INSERT DELAYED...), though I'd still prefer to keep such rewriting to an absolute minimum.

            This is a good example of how important it is to carefully think through all aspects of a new feature (SET STATEMENT ... in this case). The MariaDB code is so stuffed with features and options that there are almost always tricky interactions that need to be carefully thought through in the design before coding and pushing. At least SET STATEMENT ... doesn't allow subqueries or stored function calls to get the temporary values to set.

            knielsen Kristian Nielsen added a comment - I agree with the idea that SET STATEMENT X=Y FOR <stmt> should be identical to SET old=X; SET X=Y; <stmt>; SET X=old. I think this is needed to ensure that the semantics is clear and correct. This way, it is easy to describe and for the user to understand what it does. And making it work differently opens all kinds of potential corner cases. That still leaves an issue. It is then a bit strange that the <stmt> is not binlogged the same way, but the entire SET STATEMENT ... original client SQL string appears in the Query_log_event (in statement-based binlogging). And it currently means that mysqlbinlog | mysql does not handle it identical to explicit set/restore, as the SET STATEMENT ... takes effect there. The best might be to remove the initial SET STATEMENT ... FOR part from the query string entirely before binlogging. We do have some query string rewriting before binlogging already (INSERT DELAYED...), though I'd still prefer to keep such rewriting to an absolute minimum. This is a good example of how important it is to carefully think through all aspects of a new feature (SET STATEMENT ... in this case). The MariaDB code is so stuffed with features and options that there are almost always tricky interactions that need to be carefully thought through in the design before coding and pushing. At least SET STATEMENT ... doesn't allow subqueries or stored function calls to get the temporary values to set.
            Elkin Andrei Elkin added a comment -

            knielsen, also considering you were last to comment on the case, I set you as the prime reviewer of the updated description. Looking forward to read your input!

            Elkin Andrei Elkin added a comment - knielsen , also considering you were last to comment on the case, I set you as the prime reviewer of the updated description. Looking forward to read your input!
            knielsen Kristian Nielsen added a comment - - edited

            I read the updated description. Unfortunately, it is not clear to me what
            exactly is proposed?

            I see "effectively lift the design's constrain" and "I suggest to indeed
            revise that decision". But I do not see exactly what is proposed as the
            implementation.

            Also, I do not understand what is meant with this: "The replicated with
            session-variables SET-STATEMENT-FOR-Query could be engaged further on cases
            when correctness can be accomplished (addresses Michael Widenius's note)
            only with more reach execution context than currently Query_log_event is
            provided with."

            It sounds to me that what is suggested involves creating a list of further
            session variables that must be replicated (for Query_log_event) along with
            those already replicated (like current timestamp and so on). That is
            probably appropriate. Note that such list must be a white-list; each new
            variable must be added after suitable consideration, as opposed to
            replicating everything except a list of exceptions. This is already how it
            is done for the existing variables replicated with the Query_log_event.

            As I wrote earlier, it is important that these two behave the same:

            SET STATEMENT var= value FOR <query>
            

            and

            SET @old= @@SESSION.var;
            SET SESSION var= value;
            <query>
            SET SESSION var= @old;
            

            This is achieved by added more variables to the list of what is replicated
            for Query_log_event.

            This does not solve the problem that `mysqlbinlog | mysql` will work
            differently, obeying all SET STATEMENT ... FOR regardless of which
            variable. I am not super fond of my earlier suggestion about query string
            rewriting for the Query_log_event. I think the more general problem of
            getting `mysqlbinlog | mysql` to be consistent with replication should be
            solved in a more general manner, by implementing something that makes this
            use the same code paths inside the server as the replication slave, so that
            the various quirks and workarounds are the same between the two. Such
            general solution is outside the scope of this particular MDEV.

            knielsen Kristian Nielsen added a comment - - edited I read the updated description. Unfortunately, it is not clear to me what exactly is proposed? I see "effectively lift the design's constrain" and "I suggest to indeed revise that decision". But I do not see exactly what is proposed as the implementation. Also, I do not understand what is meant with this: "The replicated with session-variables SET-STATEMENT-FOR-Query could be engaged further on cases when correctness can be accomplished (addresses Michael Widenius's note) only with more reach execution context than currently Query_log_event is provided with." It sounds to me that what is suggested involves creating a list of further session variables that must be replicated (for Query_log_event) along with those already replicated (like current timestamp and so on). That is probably appropriate. Note that such list must be a white-list; each new variable must be added after suitable consideration, as opposed to replicating everything except a list of exceptions. This is already how it is done for the existing variables replicated with the Query_log_event. As I wrote earlier, it is important that these two behave the same: SET STATEMENT var= value FOR <query> and SET @old= @@SESSION.var; SET SESSION var= value; <query> SET SESSION var= @old; This is achieved by added more variables to the list of what is replicated for Query_log_event. This does not solve the problem that `mysqlbinlog | mysql` will work differently, obeying all SET STATEMENT ... FOR regardless of which variable. I am not super fond of my earlier suggestion about query string rewriting for the Query_log_event. I think the more general problem of getting `mysqlbinlog | mysql` to be consistent with replication should be solved in a more general manner, by implementing something that makes this use the same code paths inside the server as the replication slave, so that the various quirks and workarounds are the same between the two. Such general solution is outside the scope of this particular MDEV.

            People

              bnestere Brandon Nesterenko
              elenst Elena Stepanova
              Votes:
              1 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.