Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25587

RENAME on namesake of temporary table breaks binlog replay

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6
    • 10.5
    • Replication
    • None

    Description

      If a temporary table creation was written into binary log and then another connection did RENAME on a base table with the same name, replaying such a binlog via mysqlbinlog => client causes data discrepancy – the base table remains intact (apparently RENAME is applied to the temporary table from a wrong thread instead).
      It doesn't happen in a replication setup, the slave applies RENAME correctly.

      --source include/have_log_bin.inc
      --source include/have_binlog_format_mixed.inc
       
      RESET MASTER;
      CREATE TABLE t1 (a INT);
       
      --connect (con1,localhost,root,,)
      CREATE TEMPORARY TABLE t1 (b INT);
       
      --connection default
      RENAME TABLE t1 TO t2;
       
      --disconnect con1
      --connection default
       
      --echo ######
      --echo # Tables on the original server
      --echo #
      SHOW TABLES IN test;
      --echo ######
       
      FLUSH BINARY LOGS;
      DROP TABLE IF EXISTS t1, t2;
       
      --let $datadir= `select @@datadir`
      --exec $MYSQL_BINLOG $datadir/master-bin.000001 | $MYSQL test
       
      --echo ######
      --echo # Tables after binlog replay
      --echo #
      SHOW TABLES IN test;
      --echo ######
       
      DROP TABLE IF EXISTS t1, t2;
      

      10.2 e788738e

      ######
      # Tables on the original server
      #
      SHOW TABLES IN test;
      Tables_in_test
      t2
      ######
      

      ######
      # Tables after binlog replay
      #
      SHOW TABLES IN test;
      Tables_in_test
      t1
      ######
      

      The difference started showing up long time ago in 10.0 after this commit:

      commit bf2e31500c9a0a7fcdccfb724e9447347a3ab684
      Author: Alexey Botchkov
      Date:   Mon Jul 18 11:50:08 2016 +0400
       
          MDEV-8569 build_table_filename() doesn't support temporary tables.
          
                  Temporary tables support added for RENAME and ALTER TABLE.
      

      However, I wouldn't go as far as calling it a regression – most likely it wouldn't work anyway, just with different symptoms/outcome.

      Attachments

        Issue Links

          Activity

            Hm, this seems tricky.

            The problem seems much bigger than just RENAME. Anytime a temporary table is binlogged (in mixed/statement mode) that conflicts with the name of a global table (or even a temporary table in another concurrent connection), there will be ambiguity in mysqlbinlog | mysql about which table is referred to. Eg:

            --source include/have_binlog_format_mixed.inc
             
            --let $datadir= `select @@datadir`
            RESET MASTER;
             
            connect (con1,localhost,root,,);
            CREATE TEMPORARY TABLE t1 (a INT PRIMARY KEY);
            INSERT INTO t1 VALUES (1);
             
            --connection default
            CREATE TABLE t1 (a INT PRIMARY KEY);
            INSERT INTO t1 VALUES (2);
             
            --connection con1
            INSERT INTO t1 VALUES (3);
            SELECT * FROM t1 ORDER BY a;
             
            --connection default
            INSERT INTO t1 VALUES (4);
            SELECT * FROM t1 ORDER BY a;
             
            --disconnect con1
             
            FLUSH BINARY LOGS;
            DROP TABLE t1;
             
            --exec $MYSQL_BINLOG $datadir/master-bin.000001 | $MYSQL test
            SELECT * FROM t1 ORDER BY a;
             
            DROP TABLE t1;
            

            Here applying the binlog with mysqlbinlog | mysql leaves the table empty, because all rows go to the temporary table (which is discarded when mysql disconnects).

            I don't remember exactly how replication deals with this problem, I think it uses the thread id in the binlogged query (@@pseudo_thread_id?) to distinguish between different temporary tables with the same name.

            I urge whoever ends up fixing this bug to use extreme caution and not to rush it unnecessarily.
            I don't remember a case when a change around temporary tables replication hadn't broken something, it's probably the most error-prone replication-related area there is.

            Indeed. Like, we could try to rewrite all references to temporary table names in binlogged queries to some unique name containing the thread id maybe. That seems tremendously hard to get right without breaking anything.

            A loose idea that might be feasible is to implement some special mode (through some internal system variable maybe?) that makes temporary table name resolution do some magic so that @@pseudo_thread_id can be used to select between different temporary tables with the same name in the same client connection, that mysqlbinlog can then use in its output. But I'm still sceptical, replication of temporary tables in statement mode is a huge can of worms, and unlikely to ever become 100% robust, much less in mysqlbinlog | mysql.

            Maybe the best way forward is to make the change (in a non-GA version) that in MIXED mode, all use of temporary tables cause a switch to ROW mode, avoiding binlogging any temporary table names at all. This would be default, though we could provide some @@tmptable_unsafe_for_stmt_binlogging or whatever setting to enable the old backwards-compatible behaviour. This would fix a lot of problems with temporary tables in one go. But this is a task with larger scope than this bug.

            Maybe reasonable to document it as a restriction for mysqlbinlog | mysql that temporary table names used in STATEMENT or MIXED mode need to be globally unique. Though this is a somewhat severe restriction, still better to document it at least.

            Hope this helps,

            - Kristian.

            knielsen Kristian Nielsen added a comment - Hm, this seems tricky. The problem seems much bigger than just RENAME. Anytime a temporary table is binlogged (in mixed/statement mode) that conflicts with the name of a global table (or even a temporary table in another concurrent connection), there will be ambiguity in mysqlbinlog | mysql about which table is referred to. Eg: --source include/have_binlog_format_mixed.inc   --let $datadir= `select @@datadir` RESET MASTER;   connect (con1,localhost,root,,); CREATE TEMPORARY TABLE t1 (a INT PRIMARY KEY); INSERT INTO t1 VALUES (1);   --connection default CREATE TABLE t1 (a INT PRIMARY KEY); INSERT INTO t1 VALUES (2);   --connection con1 INSERT INTO t1 VALUES (3); SELECT * FROM t1 ORDER BY a;   --connection default INSERT INTO t1 VALUES (4); SELECT * FROM t1 ORDER BY a;   --disconnect con1   FLUSH BINARY LOGS; DROP TABLE t1;   --exec $MYSQL_BINLOG $datadir/master-bin.000001 | $MYSQL test SELECT * FROM t1 ORDER BY a;   DROP TABLE t1; Here applying the binlog with mysqlbinlog | mysql leaves the table empty, because all rows go to the temporary table (which is discarded when mysql disconnects). I don't remember exactly how replication deals with this problem, I think it uses the thread id in the binlogged query (@@pseudo_thread_id?) to distinguish between different temporary tables with the same name. I urge whoever ends up fixing this bug to use extreme caution and not to rush it unnecessarily. I don't remember a case when a change around temporary tables replication hadn't broken something, it's probably the most error-prone replication-related area there is. Indeed. Like, we could try to rewrite all references to temporary table names in binlogged queries to some unique name containing the thread id maybe. That seems tremendously hard to get right without breaking anything. A loose idea that might be feasible is to implement some special mode (through some internal system variable maybe?) that makes temporary table name resolution do some magic so that @@pseudo_thread_id can be used to select between different temporary tables with the same name in the same client connection, that mysqlbinlog can then use in its output. But I'm still sceptical, replication of temporary tables in statement mode is a huge can of worms, and unlikely to ever become 100% robust, much less in mysqlbinlog | mysql . Maybe the best way forward is to make the change (in a non-GA version) that in MIXED mode, all use of temporary tables cause a switch to ROW mode, avoiding binlogging any temporary table names at all. This would be default, though we could provide some @@tmptable_unsafe_for_stmt_binlogging or whatever setting to enable the old backwards-compatible behaviour. This would fix a lot of problems with temporary tables in one go. But this is a task with larger scope than this bug. Maybe reasonable to document it as a restriction for mysqlbinlog | mysql that temporary table names used in STATEMENT or MIXED mode need to be globally unique. Though this is a somewhat severe restriction, still better to document it at least. Hope this helps, - Kristian.

            People

              Elkin Andrei Elkin
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.