  1. MariaDB Server
  2. MDEV-38619

Crash in DDL around binlog rotation can result in DDL that is rolled back but still written to the binlog


Details

    Description

      This issue is a bug in the atomic/crash-safe DDL feature.

      The idea in atomic DDL is during crash recovery to check if a DDL has already been written to the binlog before the crash. If so, the DDL is completed; if not the DDL is aborted and rolled back. This way the state of the table(s) and of the binlog are supposed to be kept in sync.
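
      The recovery rule above can be written as a tiny decision function. This is only an illustrative model, not server code: `Action`, `recover_ddl`, and the integer XIDs are hypothetical names introduced for this sketch.

```python
from enum import Enum


class Action(Enum):
    COMPLETE = "complete"
    ROLL_BACK = "roll back"


def recover_ddl(pending_xid: int, binlogged_xids: set) -> Action:
    """Decide the fate of a DDL that was still pending when the server crashed.

    binlogged_xids stands in for the set of DDL XIDs that crash recovery
    found in the binlog (hypothetical representation).
    """
    if pending_xid in binlogged_xids:
        # The statement reached the binlog, so the DDL must be completed.
        return Action.COMPLETE
    # The statement never reached the binlog, so the DDL is rolled back.
    return Action.ROLL_BACK
```

      The rule is only as good as the set it is given: if an XID that is in the binlog is missing from `binlogged_xids`, the function picks the wrong branch.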

      In the implementation, the DDL log code relies on the binlog scan that happens during crash recovery to build a hash of all DDL statements contained in the binlog. The problem is that nothing in the code guarantees that this scan starts early enough in the binlog to include all pending DDLs.
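
      A toy model of the failure mode, assuming the recovery scan begins in the binlog file created by a rotation while the pending DDL was written to the previous file. The list-of-files layout, the event tuples, and `collect_ddl_xids` are hypothetical simplifications for illustration, not the server's data structures.

```python
def collect_ddl_xids(binlog_files, start_index):
    """Return the XIDs of all DDL events found from file start_index onward."""
    xids = set()
    for events in binlog_files[start_index:]:
        for kind, xid in events:
            if kind == "ddl":
                xids.add(xid)
    return xids


# Hypothetical binlog contents: each file is a list of (kind, xid) events.
binlog_files = [
    [("ddl", 1), ("ddl", 2)],  # first file: CREATE TABLE, then the RENAME (xid 2)
    [],                        # second file: created by the rotation, no DDL
]
pending_xid = 2  # XID of the RENAME recorded in the DDL log

# Scan that starts too late (assumed: only the newest file is scanned).
late_scan = collect_ddl_xids(binlog_files, start_index=1)
# Scan that starts early enough to cover the pending DDL.
full_scan = collect_ddl_xids(binlog_files, start_index=0)

print(pending_xid in late_scan)  # the RENAME looks "not binlogged"
print(pending_xid in full_scan)  # the RENAME is found
```

      With the late scan, the pending XID is absent from the hash even though the statement is in the binlog, so recovery would roll the DDL back.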

      The following test case demonstrates the problem. With a binlog rotation running in parallel with a RENAME TABLE right around the point of a server crash, the server ends up in a state where the RENAME TABLE is in the binlog but the table has not been renamed. This inconsistency could, e.g., break replication. The test case requires a small patch to the code to insert a debug_sync point; the patch is shown first, followed by the test case and the resulting output:

      diff --git a/sql/sql_rename.cc b/sql/sql_rename.cc
      index 421c7198c10..4b8c23d8978 100644
      --- a/sql/sql_rename.cc
      +++ b/sql/sql_rename.cc
      @@ -182,6 +182,7 @@ bool mysql_rename_tables(THD *thd, TABLE_LIST *table_list, bool silent,
           thd->binlog_xid= thd->query_id;
           ddl_log_update_xid(&ddl_log_state, thd->binlog_xid);
           binlog_error= write_bin_log(thd, TRUE, thd->query(), thd->query_length());
      +    DEBUG_SYNC(thd, "ddl_log_rename_after_binlog");
           if (binlog_error)
             error= 1;
           thd->binlog_xid= 0;
      

      --source include/have_debug_sync.inc
      --source include/have_binlog_format_mixed.inc
       
      CREATE TABLE t1 (a INT PRIMARY KEY);
      connect (con1,localhost,root,,);
      SET debug_sync= 'ddl_log_rename_after_binlog SIGNAL ready WAIT_FOR ever';
      send RENAME TABLE t1 TO t2;
      --connection default
      SET debug_sync= 'now WAIT_FOR ready';
      FLUSH BINARY LOGS;
      --source include/wait_for_binlog_checkpoint.inc
       
      --let $shutdown_timeout=0
      --source include/restart_mysqld.inc
       
      SHOW BINLOG EVENTS IN 'master-bin.000001';
      SHOW BINLOG EVENTS IN 'master-bin.000002';
      query_vertical
      SHOW CREATE TABLE t2;
       
      DROP TABLE t2;
      

      CREATE TABLE t1 (a INT PRIMARY KEY);
      connect  con1,localhost,root,,;
      SET debug_sync= 'ddl_log_rename_after_binlog SIGNAL ready WAIT_FOR ever';
      RENAME TABLE t1 TO t2;
      connection default;
      SET debug_sync= 'now WAIT_FOR ready';
      FLUSH BINARY LOGS;
      # restart
      SHOW BINLOG EVENTS IN 'master-bin.000001';
      Log_name	Pos	Event_type	Server_id	End_log_pos	Info
      master-bin.000001	4	Format_desc	1	256	Server ver: 11.4.10-MariaDB-debug-log, Binlog ver: 4
      master-bin.000001	256	Gtid_list	1	285	[]
      master-bin.000001	285	Binlog_checkpoint	1	329	master-bin.000001
      master-bin.000001	329	Gtid	1	371	GTID 0-1-1
      master-bin.000001	371	Query	1	482	use `test`; CREATE TABLE t1 (a INT PRIMARY KEY)
      master-bin.000001	482	Gtid	1	524	GTID 0-1-2
      master-bin.000001	524	Query	1	621	use `test`; RENAME TABLE t1 TO t2
      master-bin.000001	621	Rotate	1	669	master-bin.000002;pos=4
      SHOW BINLOG EVENTS IN 'master-bin.000002';
      Log_name	Pos	Event_type	Server_id	End_log_pos	Info
      master-bin.000002	4	Format_desc	1	256	Server ver: 11.4.10-MariaDB-debug-log, Binlog ver: 4
      master-bin.000002	256	Gtid_list	1	299	[0-1-2]
      master-bin.000002	299	Binlog_checkpoint	1	343	master-bin.000001
      master-bin.000002	343	Binlog_checkpoint	1	387	master-bin.000002
      SHOW CREATE TABLE t2;
      main.elenst                              [ fail ]
              Test ended at 2026-01-21 13:59:44
       
      CURRENT_TEST: main.elenst
      mysqltest: At line 18: query 'SHOW CREATE TABLE t2' failed: ER_NO_SUCH_TABLE (1146): Table 'test.t2' doesn't exist
      

      As a fix, I suggest that the DDL log code save the server's current GTID position in the DDL log, along with the query XID, before binlogging. During crash recovery, the DDL log code should then do its own binlog scan from the saved GTID position to look for the query XID. This ensures that the scan starts from exactly the point in the binlog that is needed. It will also integrate well with the new binlog-in-engine code (which normally doesn't need to scan the whole binlog file after a crash).
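
      A minimal sketch of the suggested approach, under strong simplifying assumptions: the GTID position is reduced to a single sequence number (real GTID positions are per-domain), and the binlog is a flat list of `(seq_no, xid)` pairs spanning all files. `DdlLogEntry` and `was_binlogged` are hypothetical names, not existing server functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DdlLogEntry:
    xid: int          # query XID stored in the DDL log
    gtid_seq_no: int  # GTID position saved just before binlogging (simplified)


def was_binlogged(entry, binlog_events):
    """Scan events after the saved GTID position, looking for entry.xid.

    binlog_events is a list of (seq_no, xid) pairs covering every binlog
    file, so a rotation between binlogging and the crash does not matter.
    """
    for seq_no, xid in binlog_events:
        if seq_no > entry.gtid_seq_no and xid == entry.xid:
            return True
    return False


# Position saved when only the CREATE TABLE (seq_no 1) had been binlogged.
entry = DdlLogEntry(xid=102, gtid_seq_no=1)

crash_after_binlog = [(1, 101), (2, 102)]  # RENAME reached the binlog
crash_before_binlog = [(1, 101)]           # RENAME never reached the binlog

print(was_binlogged(entry, crash_after_binlog))
print(was_binlogged(entry, crash_before_binlog))
```

      Because the starting point comes from the DDL log entry itself rather than from the general recovery scan, the lookup covers the pending DDL regardless of where the general scan would have started.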

          People

            monty Michael Widenius
            knielsen Kristian Nielsen