Details

    Description

      The purpose of this task is to ensure that the following internal operations
      are atomic (all or nothing) for all storage engines:

      • Management of .frm files
      • Updates to the storage engine data dictionary
      • Writes to the binary log

      This will solve the following problems:

      • .frm contents and the storage engine dictionary are always in sync
      • No #sql-xxxx files are left behind if the server crashes during ALTER TABLE
      • The binary log will always contain the DDL statement if the DDL was successful and committed to the storage engine. If the DDL
        was not successful, all traces of the DDL (like temporary files) will be deleted or rolled back.
        For example, if an ALTER TABLE was successfully done and committed and we get a crash just before the write to the binary
        log, crash recovery will write the ALTER TABLE command to the binary log. In the case of a crash in the middle of a
        multi-table drop, crash recovery will write to the binary log those tables that were actually dropped.
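
      The multi-table drop case can be illustrated with a minimal sketch. It assumes recovery already knows
      which tables were dropped before the crash. The function name build_drop_for_binlog() is hypothetical
      and not part of the server code, and the exact statement text the server writes may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: build the statement that crash recovery would write
// to the binary log, naming only the tables that were really dropped.
// Identifier escaping is left out to keep the sketch short.
inline std::string build_drop_for_binlog(const std::vector<std::string> &dropped) {
  if (dropped.empty())
    return "";  // nothing was dropped before the crash, so nothing to log
  std::string stmt = "DROP TABLE ";
  for (std::size_t i = 0; i < dropped.size(); ++i) {
    if (i > 0)
      stmt += ", ";
    stmt += "`" + dropped[i] + "`";
  }
  return stmt;
}
```

      For a DROP TABLE t1, t2, t3 that crashed after t1 and t3 were removed, the helper would be called with
      only those two names, so a replica replaying the binary log ends up in the same state as the primary.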

      The cost of doing the above should not be more than 1-4 syncs per DDL.

      High level architecture

      When doing a DDL, store somewhere (either through write_ddl_log_entry() or
      some new method):

      • Operation
      • Number of tables
      • Table name
      • Table id for original table
      • Table id for resulting table (in case of rename)
      • SQL command (for the binary log)

      If there is no table id (for example for CSV) we would use the
      timestamp of the files.

      With the above information we would be able to continue from the place
      where the operation failed.
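
      As a rough illustration, the fields listed above could be grouped as follows. This is only a sketch:
      all type and member names are hypothetical and do not reflect the actual DDL log layout, and the
      timestamp fallback is modelled as a plain integer.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical names throughout; not the real ddl_log.cc structures.
enum class DdlOperation { CreateTable, DropTable, RenameTable, AlterTable };

struct DdlTableRef {
  std::string name;                       // table name
  std::optional<std::uint64_t> table_id;  // engine-provided id, if the engine has one
  std::int64_t file_timestamp = 0;        // fallback identity (for example for CSV)
};

struct DdlLogRecord {
  DdlOperation operation;
  std::vector<DdlTableRef> tables;          // "number of tables" is tables.size()
  std::optional<DdlTableRef> result_table;  // resulting table, only for rename
  std::string sql_command;                  // statement text for the binary log
};

// Recovery needs to know whether the table found on disk is the one the log
// record talks about. Compare ids when both sides have one, otherwise fall
// back to the file timestamp.
inline bool same_table(const DdlTableRef &logged, const DdlTableRef &on_disk) {
  if (logged.table_id && on_disk.table_id)
    return *logged.table_id == *on_disk.table_id;
  return logged.file_timestamp == on_disk.file_timestamp;
}
```

      With such a record, recovery could compare the logged identity against what is on disk to decide
      whether a step (for example the rename) already happened before the crash.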

      The low-level architecture is worked out separately in each sub-project.

      Some requirements for a storage engine to be 'atomic compliant':

      • drop table and rename_table need to be either atomic or safe to retry if there was a crash in the middle of the
        operation.
      • Engines supporting inplace alter table must also support the handlerton->check_version() call to allow the
        DDL recovery code to check if the inplace alter table succeeded. This is only needed if there was a crash between
        the inplace alter table commit and the rename of the .frm file.
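
      The "can be retried" requirement can be sketched as below for an engine that keeps several files per
      table. The data directory is simulated with an in-memory set, and the names FakeDir and
      drop_table_files() are hypothetical. The check_version() call is not shown, since its signature is
      not given here.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a data directory; a real engine would touch the filesystem.
using FakeDir = std::set<std::string>;

// Retry-safe drop: a file that is already gone counts as success, so calling
// this again after a crash simply finishes the job.
// Returns the number of files removed by this call.
inline int drop_table_files(FakeDir &dir, const std::string &table,
                            const std::vector<std::string> &extensions) {
  int removed = 0;
  for (const auto &ext : extensions)
    removed += static_cast<int>(dir.erase(table + ext));
  return removed;
}
```

      If a crash happened after the first file of a table was deleted, running the same drop again during
      recovery removes the remaining files and leaves other tables untouched.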

      Supported engines in 10.6 are (among others)

      • InnoDB
      • MyRocks
      • Aria (transactional and non transactional tables)
      • MyISAM
      • Any engine that has only one table file (as then drop and rename will be atomic)

          Activity

            I hope that this will address the failure scenario of MDEV-23741.

            Comment added by marko (Marko Mäkelä).

            I do not think that DDL operations in InnoDB can be truly crash-safe before MDEV-18518 has been implemented. See MDEV-24569 for my analysis of corruption that was caused by killing the server during DROP TABLE.

            Comment added by marko (Marko Mäkelä).

            As noted in MDEV-24589, I think that we can implement a lighter-weight fix to have crash-safe DROP TABLE in InnoDB.

            Comment added by marko (Marko Mäkelä).

            It seems that we are currently missing the recovery step for the embedded server library.

            Comment added by marko (Marko Mäkelä).

            As noted in MDEV-25506, some error message output needs to be suppressed during recovery. The scenario is that the server was killed during CREATE TABLE tt13 in such a way that InnoDB rolled back the transaction:

            bb-10.6-monty 387d673edb5899adb31695f02032b060ed7574f7

            2021-05-04 19:40:28 0 [ERROR] InnoDB: Table `test`.`tt13` does not exist in the InnoDB internal data dictionary though MariaDB is trying to drop it. Have you copied the .frm file of the table to the MariaDB database directory from another database? Please refer to https://mariadb.com/kb/en/innodb-troubleshooting/ for how to resolve the issue.
            

            There is no point in issuing such messages while DDL log recovery is in progress:

            bb-10.6-monty 387d673edb5899adb31695f02032b060ed7574f7

            (rr) bt
            #0  sql_print_error (format=0x561255a94148 "InnoDB: %s")
                at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/log.cc:9177
            #1  0x00005612553fb730 in ib::error::~error (this=0x7fff4e168310, __in_chrg=<optimized out>)
                at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/storage/innobase/ut/ut0ut.cc:508
            #2  0x00005612551b9e9a in ha_innobase::delete_table (this=0x5612573d54c0, 
                name=0x561256ecdad2 "./test/tt13", sqlcom=SQLCOM_END)
                at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/storage/innobase/handler/ha_innodb.cc:13142
            #3  0x00005612551a2616 in ha_innobase::delete_table (this=0x5612573d54c0, 
                name=0x561256ecdad2 "./test/tt13")
                at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/storage/innobase/handler/ha_innodb.cc:13214
            #4  0x0000561254de755f in hton_drop_table (hton=0x561256eeed08, path=0x561256ecdad2 "./test/tt13")
                at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/handler.cc:577
            #5  0x0000561254bb4248 in ddl_log_execute_action (thd=0x561257109408, mem_root=0x7fff4e168db0, 
                ddl_log_entry=0x7fff4e168df0) at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/ddl_log.cc:1711
            #6  0x0000561254bb5d67 in ddl_log_execute_entry_no_lock (thd=0x561257109408, first_entry=2)
                at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/ddl_log.cc:2358
            #7  0x0000561254bb679c in ddl_log_execute_recovery ()
                at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/ddl_log.cc:2714
            

            Comment added by marko (Marko Mäkelä).
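
            One way to read the suggestion above is a guard around the error report. The following is only
            a sketch of that idea: the DropContext type, the ddl_recovery_in_progress flag and
            report_missing_table() are hypothetical and are not taken from the server or from InnoDB.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical context: a recovery flag plus an in-memory stand-in for the error log.
struct DropContext {
  bool ddl_recovery_in_progress = false;
  std::vector<std::string> error_log;
};

// Report a table that is missing from the engine dictionary, unless DDL log
// recovery is running. During recovery a missing table is an expected outcome
// of a rolled-back CREATE TABLE, so nothing is logged.
inline void report_missing_table(DropContext &ctx, const std::string &name) {
  if (ctx.ddl_recovery_in_progress)
    return;
  ctx.error_log.push_back("Table " + name +
                          " does not exist in the engine data dictionary");
}
```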

            People

              monty Michael Widenius