[MDEV-17567] Atomic DDL Created: 2018-10-30  Updated: 2023-01-07  Resolved: 2021-05-25

Status: Closed
Project: MariaDB Server
Component/s: Data Definition - Alter Table, Data Definition - Create Table
Fix Version/s: 10.6.1

Type: Task Priority: Critical
Reporter: Michael Widenius Assignee: Michael Widenius
Resolution: Fixed Votes: 8
Labels: None

Issue Links:
Blocks
blocks MDEV-4259 transactional DDL Open
blocks MDEV-11633 Make the InnoDB system tablespace opt... Open
blocks MDEV-11655 Transactional data dictionary Open
blocks MDEV-15140 Implement Partial / Filtered Indexes Open
blocks MDEV-16417 Store Foreign Key metadata outside of... In Review
blocks MDEV-21652 FK migration from old version Stalled
blocks MDEV-22165 CONVERT TABLE: move in partition from... Closed
blocks MDEV-22166 CONVERT PARTITION: move out partition... Closed
blocks MDEV-23570 deprecate keep_files_on_create Closed
is blocked by MDEV-24589 DROP TABLE is not crash-safe Closed
PartOf
includes MDEV-23842 Atomic RENAME TABLE Closed
includes MDEV-23844 Atomic DROP TABLE (single table) Closed
includes MDEV-24395 Atomic DROP TRIGGER Closed
includes MDEV-24408 Crash-safe DROP DATABASE Closed
includes MDEV-24576 Atomic CREATE TABLE Closed
includes MDEV-24607 ATOMIC CREATE VIEW Closed
includes MDEV-24746 Atomic CREATE TRIGGER Closed
includes MDEV-25180 Atomic ALTER TABLE Closed
Problem/Incident
causes MDEV-25683 Atomic DDL: With innodb_force_recover... Closed
Relates
relates to MDEV-18518 Implement atomic multi-table (or mult... Closed
relates to MDEV-22928 SUMMARY: AddressSanitizer: SEGV stora... Closed
relates to MDEV-23741 InnoDB: Table ... contains <n> indexe... Closed
relates to MDEV-24755 Rebase FK tasks on top of Atomic DDL Closed
relates to MDEV-25588 Atomic DDL: Binlog query event writte... Closed
relates to MDEV-25606 Concurrent CREATE TRIGGER statements ... Closed
relates to MDEV-25655 Atomic DDL: InnoDB: Cannot replay ren... Closed
relates to MDEV-25656 Atomic DDL: Assertion `!full_crc32(fl... Closed
relates to MDEV-25657 Atomic DDL: InnoDB: Cannot apply log,... Closed
relates to MDEV-25662 Atomic DDL: InnoDB: Expected tablespa... Closed
relates to MDEV-25683 Atomic DDL: With innodb_force_recover... Closed
relates to MDEV-25705 Atomic DDL: atomic.rename_case and at... Open
relates to MDEV-25920 Atomic DROP DATABASE Open
relates to MDEV-26966 The parameter innodb_force_load_corru... Closed
relates to MDEV-29566 Failed to read from the .par file upo... Stalled
relates to MDEV-6642 Server crashed with assertion failure... Open
relates to MDEV-11675 Lag Free Alter On Slave Closed
relates to MDEV-16290 ALTER TABLE ... RENAME COLUMN syntax Closed
relates to MDEV-21053 Crash safety of foreign key DDL In Review
relates to MDEV-22056 Rocks db corrupts data when disk is o... Open
relates to MDEV-22165 CONVERT TABLE: move in partition from... Closed
relates to MDEV-22166 CONVERT PARTITION: move out partition... Closed
relates to MDEV-25604 Atomic DDL: Binlog event written upon... Closed
relates to MDEV-25646 Atomic DDL: InnoDB: Datafile './test... Closed
relates to MDEV-25666 Atomic DDL: InnoDB: Operating system ... Closed

 Description   

The purpose of this task is to ensure that the following internal operations
are atomic (either all or nothing) for all storage engines:

Manage .frm
Storage engine data dictionary
Binary log

This will solve the following things:

  • .frm context and storage engine dictionary is always in sync
  • no #sql-xxxx files if server crashes during alter table
  • binary log will always contain the DDL statement if the DDL successfully and committed to the storage engine. If the DDL
    was not successful, all traces of the DDL (like temporary files) will be deleted/rolled back.
    For example, if an alter table was successfully done and committed and we get a crash just before the write to the binary
    log, crash recovery will write the ALTER TABLE command to the binary log. In the case of a crash in the middle of a multi-table drop, crash recovery will write to the binary log those tables that was actually dropped.

The cost of doing the above should not be more than 1-4 sync per DDL.

High level architecture

When doing a DDL, store somewhere (either trough write_ddl_log_entry() or
some new method):

  • Operation
  • Number of tables
  • Table name
  • Table id for original table
  • Table id for resulting table (in case of rename)
  • sql command (for binary log)

If there is no table id (for example for CSV) we would use the
timestamp of the files.

With the above information we would be able to continue from the place
where the operation failed.

Low level architecture is done for each sub project

Some requirements for a storage engine to be 'Atomic compliment':

  • drop table and rename_table needs to be either atomic or can be retried if there was a crash in middle of the
    operation.
  • Engines supporting inplace alter table, must also support the handlerton->check_version() call to allow the
    ddl recovery code to check if the inline alter table succeed. This is only needed if there was a crash between
    the inplace alter table commit and the rename of the .frm file.

Supported engines in 10.6 are (among others)

  • InnoDB
  • MyRocks
  • Aria (transactional and non transactional tables)
  • MyISAM
  • Any engine that has only one table file (as then drop and rename will be atomic)


 Comments   
Comment by Marko Mäkelä [ 2020-09-16 ]

I hope that this will address the failure scenario of MDEV-23741.

Comment by Marko Mäkelä [ 2021-01-14 ]

I do not think that DDL operations in InnoDB can be truly crash-safe before MDEV-18518 has been implemented. See MDEV-24569 for my analysis of corruption that was caused by killing the server during DROP TABLE.

Comment by Marko Mäkelä [ 2021-01-14 ]

As noted in MDEV-24589, I think that we can implement a lighter-weight fix to have crash-safe DROP TABLE in InnoDB.

Comment by Marko Mäkelä [ 2021-05-06 ]

It seems that we are currently missing the recovery step for the embedded server library.

Comment by Marko Mäkelä [ 2021-05-06 ]

As noted in MDEV-25506, some error message output needs to be suppressed during recovery. The scenario is that the server was killed during CREATE TABLE tt13 in such a way that InnoDB rolled back the transaction:

bb-10.6-monty 387d673edb5899adb31695f02032b060ed7574f7

2021-05-04 19:40:28 0 [ERROR] InnoDB: Table `test`.`tt13` does not exist in the InnoDB internal data dictionary though MariaDB is trying to drop it. Have you copied the .frm file of the table to the MariaDB database directory from another database? Please refer to https://mariadb.com/kb/en/innodb-troubleshooting/ for how to resolve the issue.

There is no point to issue such messages when the DDL log recovery is in progress:

bb-10.6-monty 387d673edb5899adb31695f02032b060ed7574f7

(rr) bt
#0  sql_print_error (format=0x561255a94148 "InnoDB: %s")
    at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/log.cc:9177
#1  0x00005612553fb730 in ib::error::~error (this=0x7fff4e168310, __in_chrg=<optimized out>)
    at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/storage/innobase/ut/ut0ut.cc:508
#2  0x00005612551b9e9a in ha_innobase::delete_table (this=0x5612573d54c0, 
    name=0x561256ecdad2 "./test/tt13", sqlcom=SQLCOM_END)
    at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/storage/innobase/handler/ha_innodb.cc:13142
#3  0x00005612551a2616 in ha_innobase::delete_table (this=0x5612573d54c0, 
    name=0x561256ecdad2 "./test/tt13")
    at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/storage/innobase/handler/ha_innodb.cc:13214
#4  0x0000561254de755f in hton_drop_table (hton=0x561256eeed08, path=0x561256ecdad2 "./test/tt13")
    at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/handler.cc:577
#5  0x0000561254bb4248 in ddl_log_execute_action (thd=0x561257109408, mem_root=0x7fff4e168db0, 
    ddl_log_entry=0x7fff4e168df0) at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/ddl_log.cc:1711
#6  0x0000561254bb5d67 in ddl_log_execute_entry_no_lock (thd=0x561257109408, first_entry=2)
    at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/ddl_log.cc:2358
#7  0x0000561254bb679c in ddl_log_execute_recovery ()
    at /home/mdbe/atomic_ddl/bb-10.6-monty-for-rr/sql/ddl_log.cc:2714

Generated at Thu Feb 08 08:37:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.