[MDEV-13564] TRUNCATE TABLE and undo tablespace truncation are not compatible with Mariabackup - Jira

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 10.2.2
Fix Version/s: 10.3.10, 10.4.0, 10.2.19
Component/s: Backup, Storage Engine - InnoDB
Labels:
- backup
- ddl
- performance
- recovery

Description

MariaDB 10.2.2 imported MySQL 5.7.9, which introduced separate log files that server startup uses to determine whether any tables or undo tablespaces need a "truncate fixup".

There is no logic in Mariabackup to deal with this.

A cleaner solution would be to remove the separate log files and to make the InnoDB redo log self-contained with respect to the truncate operations. This would likely require writing a new redo log record type MLOG_FILE_CREATE that would cause the file to be initialized from scratch, followed by some page-level redo log records that would initialize the page contents.
This would also remove the need for a redo log checkpoint during the truncate operations.

~~MDEV-13563~~ proposes a Mariabackup option that could be used to prevent TRUNCATE TABLE from occurring during backups. It would not prevent undo tablespace truncation from happening.

Attachments

truncate.patch
79 kB
2018-08-02 13:30

Issue Links

blocks

MDEV-14481 Execute InnoDB crash recovery in the background

Closed

causes

MDEV-17816 InnoDB: Failing assertion: trx->dict_operation_lock_mode == RW_X_LATCH upon TRUNCATE TABLE after converting to REDUNDANT

Closed

MDEV-17849 Undo tablespace truncation recovery fails to shrink file

Closed

MDEV-17885 TRUNCATE on temporary table causes ER_GET_ERRNO and "Could not remove temporary table" in the log

Closed

MDEV-18836 Race conditions in TRUNCATE TABLE

Closed

MDEV-19449 1030: Got error 168 "Unknown (generic) error from engine" for valid TRUNCATE (temporary) TABLE

Closed

MDEV-21496 Downgrade from current 10.2 to 10.2.19: InnoDB: Failing assertion: trx->dict_operation_lock_mode == RW_X_LATCH

Closed

MDEV-23705 Assertion `table->data_dir_path || !space' failed in row_drop_table_for_mysql on TRUNCATE after DISCARD TABLESPACE

Closed

MDEV-24532 Table corruption ER_NO_SUCH_TABLE_IN_ENGINE or ER_CRASHED_ON_USAGE after ALTER on table with foreign key

Closed

MDEV-26450 Corruption due to innodb_undo_log_truncate

Closed

is blocked by

MDEV-14717 RENAME TABLE in InnoDB is not crash-safe

Closed

is duplicated by

MDEV-9459 Truncate table causes innodb stalls

Closed

relates to

MDEV-9459 Truncate table causes innodb stalls

Closed

MDEV-14585 Automatically remove #sql- tables in innodb dictionary during recovery

Closed

MDEV-16557 Remove INNOBASE_SHARE::idx_trans_tbl

Closed

MDEV-17049 Enable --suite=innodb_undo on buildbot

Closed

MDEV-17138 Reduce redo log volume for undo tablespace initialization

Closed

MDEV-17158 TRUNCATE is not atomic after MDEV-13564

Closed

MDEV-17780 innodb.truncate_recover crashes in recovery due to out-of-bounds page read

Closed

MDEV-17794 Do not assign persistent ID for temporary tables

Closed

MDEV-17831 Assertion `supports_instant()' failed in dict_table_t::prepare_instant upon ADD COLUMN on table with KEY_BLOCK_SIZE

Closed

MDEV-18739 crash (long semaphore wait)

Closed

MDEV-19769 Mariabackup should write warning during backup if server does not have innodb_safe_truncate=ON set

Open

MDEV-22733 XA PREPARE breaks MDL in pseudo_slave_mode=1

Stalled

MDEV-24532 Table corruption ER_NO_SUCH_TABLE_IN_ENGINE or ER_CRASHED_ON_USAGE after ALTER on table with foreign key

Closed

MDEV-25051 Race condition between persistent statistics and RENAME TABLE or TRUNCATE

Closed

MDEV-25710 Dead code os_file_opendir() in the server

Closed

MDEV-33112 innodb_undo_log_truncate=ON is blocking page writes

Closed

MDEV-9459 Truncate table causes innodb stalls

Closed

MDEV-13563 lock DDL for mariabackup in 10.2+

Closed

MDEV-14481 Execute InnoDB crash recovery in the background

Closed

MDEV-14545 Backup fails due to MLOG_INDEX_LOAD record

Closed

MDEV-15154 WSREP: BF lock wait long after a TRUNCATE TABLE

Closed

MDEV-15522 Change galera suite MTR tests to use mariabackup instead of xtrabackup

Closed

MDEV-16306 TRUNCATE waits for metadata lock on the tables when a SELECT is executing on it

Open

MDEV-16465 Invalid (old?) table or database name or hang in ha_innobase::delete_table and log semaphore wait upon concurrent DDL with foreign keys

Closed

MDEV-17043 Purge of indexed virtual columns may cause hang on table-rebuilding DDL

Closed

MDEV-17304 Replace use of XtraBackup with MariaDB Backup

Closed

MDEV-18654 Failing assertion: sym_node->table != NULL in buildbot with innodb_fts.sync_ddl and outside

Closed

MDEV-18960 Assertion `!omits_virtual_cols(*form->s)' failed after upgrade from 10.0/10.1 with PERSISTENT generated column

Closed


Activity

Marko Mäkelä created issue - 2017-08-17 14:12

Marko Mäkelä made changes - 2017-08-17 14:12

Field	Original Value	New Value
Link		This issue relates to ~~MDEV-13563~~ [ ~~MDEV-13563~~ ]

Marko Mäkelä made changes - 2017-12-09 08:46

Link

This issue relates to ~~MDEV-14585~~ [ ~~MDEV-14585~~ ]

Marko Mäkelä made changes - 2018-02-02 06:53

Link

This issue relates to ~~MDEV-14545~~ [ ~~MDEV-14545~~ ]

Marko Mäkelä made changes - 2018-04-23 11:58

Fix Version/s		10.4 [ 22408 ]
Fix Version/s	10.3 [ 22126 ]

Marko Mäkelä added a comment - 2018-04-23 12:02

monty mentioned that a customer would like to have non-locking TRUNCATE TABLE: Old transactions that are reading from the table would continue to see the table contents. The TRUNCATE action would basically rename the old table to an internal #sql name so that ~~MDEV-14585~~ can take care of crash recovery and create an empty table. The table would be dropped when the last reader closes the old table handle.

This could be refined further by implementing a multi-versioned data dictionary cache (which is work mostly outside InnoDB). In that case, old transactions would continue to see the table contents as they were before the TRUNCATE, even when the first access to the table is after the TRUNCATE was executed. (Write transactions would always refer to the newest table definition.)
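
The basic idea of the non-locking TRUNCATE described in this comment (readers keep the old copy, which disappears when the last handle is closed) can be illustrated with reference counting. This is only a toy sketch and not InnoDB code: `Table`, `Catalog`, and all of their member functions are hypothetical names, and an in-memory vector stands in for the table contents.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for a table; not an InnoDB structure.
struct Table {
  std::vector<int> rows;
};

// Hypothetical catalog that maps a table name to its current version.
class Catalog {
public:
  void create(const std::string& name, std::vector<int> rows) {
    auto table = std::make_shared<Table>();
    table->rows = std::move(rows);
    tables_[name] = std::move(table);
  }

  // A reader opens a handle; the handle keeps its table version alive.
  std::shared_ptr<const Table> open(const std::string& name) const {
    auto it = tables_.find(name);
    if (it == tables_.end()) return nullptr;
    return it->second;
  }

  // Swap in an empty table under the original name. The old version is
  // not destroyed here; it lives until the last reader drops its handle.
  // The returned weak_ptr only lets the caller observe that moment.
  std::weak_ptr<const Table> truncate(const std::string& name) {
    auto it = tables_.find(name);
    assert(it != tables_.end());
    std::shared_ptr<Table> old_version = it->second;
    it->second = std::make_shared<Table>();
    return std::weak_ptr<const Table>(old_version);
  }

private:
  std::map<std::string, std::shared_ptr<Table>> tables_;
};
```

A reader that opened the table before `truncate()` keeps seeing the old rows, while any handle opened afterwards sees the empty table. A multi-versioned dictionary cache would go further than this sketch, because it would also serve the old version to old transactions that open the table only after the TRUNCATE.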

Marko Mäkelä made changes - 2018-05-28 12:37

Link

This issue relates to MDEV-16306 [ MDEV-16306 ]

Marko Mäkelä made changes - 2018-07-05 09:17

Link

This issue relates to ~~MDEV-15154~~ [ ~~MDEV-15154~~ ]

Marko Mäkelä made changes - 2018-07-11 19:07

Link

This issue relates to ~~MDEV-9459~~ [ ~~MDEV-9459~~ ]

Marko Mäkelä made changes - 2018-07-11 19:08

Status

Open [ 1 ]

Confirmed [ 10101 ]

Marko Mäkelä made changes - 2018-07-30 08:49

Link

This issue blocks ~~MDEV-16850~~ [ ~~MDEV-16850~~ ]

Marko Mäkelä added a comment - 2018-08-02 06:21

I believe that we need a twofold fix:

- Implement TRUNCATE TABLE as a combination of renaming the table to a #sql name and creating one with the original name, in a single transaction. Then, issue DROP TABLE for the old copy (this can be executed in the background).
- Implement undo tablespace truncation as a single mini-transaction that rewrites the first few pages (including FSP_SIZE in the first page), then trims the file size. Make sure that recovery (and backup) will ignore old redo log records for pages that were after the trimmed end of the file. To do this, we can write an MLOG_FILE_CREATE2 record with the new size as the page number (instead of writing 0). The MLOG_FILE_CREATE2 records were previously parsed but ignored during recovery and backup.

In this way, the TruncateLogger and some related code can be removed. But we will have to keep TruncateLogParser in order to be able to crash-upgrade from MariaDB Server 10.2 or 10.3 prior to this fix.
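
The rule that recovery and backup must ignore old redo log records for pages past the trimmed end of the file can be sketched as a filter over a simplified log. This is an illustration under stated assumptions, not the InnoDB record format: `RedoRecord`, its `TRUNCATE` kind (standing in for an MLOG_FILE_CREATE2 record that carries the new size as its page number), and `filter_for_recovery` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified redo log record.
struct RedoRecord {
  enum Kind { PAGE_WRITE, TRUNCATE } kind;
  uint32_t space_id;
  uint32_t page_no;  // for TRUNCATE: the new file size in pages
};

// Return the records that should be applied, in their original order.
// A PAGE_WRITE is dropped if a later TRUNCATE of the same tablespace
// trimmed the file to a size that no longer contains that page.
std::vector<RedoRecord> filter_for_recovery(const std::vector<RedoRecord>& log) {
  // Smallest size that any later TRUNCATE record set for each tablespace.
  std::map<uint32_t, uint32_t> trimmed_size;
  std::vector<RedoRecord> keep;

  // Scan backwards so that "later in the log" is what we have seen so far.
  for (auto it = log.rbegin(); it != log.rend(); ++it) {
    auto found = trimmed_size.find(it->space_id);
    if (it->kind == RedoRecord::TRUNCATE) {
      if (found == trimmed_size.end()) {
        trimmed_size[it->space_id] = it->page_no;
      } else {
        found->second = std::min(found->second, it->page_no);
      }
      keep.push_back(*it);
    } else if (found == trimmed_size.end() || it->page_no < found->second) {
      keep.push_back(*it);
    }
    // Otherwise the page was beyond the trimmed end: skip the record.
  }

  std::reverse(keep.begin(), keep.end());
  return keep;
}
```

Records written after the truncation are kept even when they refer to pages beyond the trimmed size, because the file may have been extended again by then.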

In MariaDB Server 10.2, `TRUNCATE` will no longer be crash-safe

MariaDB Server 10.2 is affected by ~~MDEV-14717~~ RENAME TABLE in InnoDB is not crash-safe.

If the server is killed in the middle of the BEGIN; RENAME; CREATE; COMMIT; transaction, after recovery we could end up with the table not being truncated, and with the data file having been renamed to #sql-ib….ibd. Some manual recovery would then be needed, such as renaming the .frm file to match the .ibd file name and then issuing RENAME TABLE `#mysql50##sql-ib…` TO original_table_name;.

If the server is killed before the original table (#sql-ib….ibd) is dropped, then the table would remain orphaned after recovery. It could be dropped by copying the .frm file and then issuing DROP TABLE `#mysql50##sql-ib…`;.

MariaDB Server 10.3 is not affected by these issues, because there RENAME operations will be correctly rolled back, and #sql tables will be dropped on startup (~~MDEV-14585~~).

Marko Mäkelä made changes - 2018-08-02 12:58

Link

This issue blocks ~~MDEV-16850~~ [ ~~MDEV-16850~~ ]

Marko Mäkelä made changes - 2018-08-02 13:30

Attachment

truncate.patch [ 45980 ]

Marko Mäkelä added a comment - 2018-08-02 13:31

truncate.patch is a work-in-progress patch. The crash recovery for MLOG_FILE_CREATE2 records has not yet been implemented, and ALTER TABLE…TRUNCATE PARTITION is crashing due to a name mismatch in INNOBASE_SHARE. I think that we must remove INNOBASE_SHARE first, in ~~MDEV-16557~~.

Marko Mäkelä made changes - 2018-08-02 13:31

Link

This issue relates to ~~MDEV-16557~~ [ ~~MDEV-16557~~ ]

Marko Mäkelä added a comment - 2018-08-16 13:21

Mariabackup starting with 10.2.18 and 10.3.10 will refuse operation if any MLOG_TRUNCATE record was written (by the incompatible implementation of TRUNCATE TABLE). Unfortunately we cannot easily detect if the incompatible form of undo log tablespace truncation was attempted.
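
The check described here amounts to scanning the copied redo log for a marker record and refusing to continue if one is found. A minimal sketch follows; `RecordKind` and `backup_may_proceed` are hypothetical names, and `TRUNCATE_MARKER` merely stands in for MLOG_TRUNCATE.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record kinds; TRUNCATE_MARKER stands in for MLOG_TRUNCATE.
enum class RecordKind { PAGE_UPDATE, FILE_RENAME, TRUNCATE_MARKER };

// The backup may proceed only if no truncate marker was copied.
bool backup_may_proceed(const std::vector<RecordKind>& copied_log) {
  return std::none_of(copied_log.begin(), copied_log.end(),
                      [](RecordKind kind) {
                        return kind == RecordKind::TRUNCATE_MARKER;
                      });
}
```

No equivalent check is possible for the old form of undo tablespace truncation, because it leaves no marker in the redo log to scan for.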

I plan to implement both undo log tablespace truncation and TRUNCATE TABLE in a backup-safe way in the first affected series (MariaDB Server 10.2).

Marko Mäkelä made changes - 2018-08-22 19:21

Status

Confirmed [ 10101 ]

In Progress [ 3 ]

Marko Mäkelä made changes - 2018-08-23 10:16

Link

This issue relates to ~~MDEV-17043~~ [ ~~MDEV-17043~~ ]

Marko Mäkelä made changes - 2018-08-23 14:39

Link

This issue relates to ~~MDEV-17049~~ [ ~~MDEV-17049~~ ]

Marko Mäkelä added a comment - 2018-08-28 12:27

There was an issue with mysql.gtid_slave_pos. With the old truncate, it was not a problem to have open table handles lingering around:

diff --git a/sql/rpl_gtid.cc b/sql/rpl_gtid.cc
index 2a0ac9a465f..c933ad4a0ab 100644
--- a/sql/rpl_gtid.cc
+++ b/sql/rpl_gtid.cc
@@ -402,6 +402,8 @@ rpl_slave_state::truncate_state_table(THD *thd)
                        NULL, TL_WRITE);
   if (!(err= open_and_lock_tables(thd, &tlist, FALSE, 0)))
   {
+    tdc_remove_table(thd, TDC_RT_REMOVE_UNUSED, "mysql",
+                     rpl_gtid_slave_state_table_name.str, false);
     err= tlist.table->file->ha_truncate();
     if (err)

Also, mroonga must pass the table options, because ha_innobase::truncate() will be calling ha_innobase::create():

diff --git a/storage/mroonga/ha_mroonga.cpp b/storage/mroonga/ha_mroonga.cpp
index b4bfc152053..4c63e95a364 100644
--- a/storage/mroonga/ha_mroonga.cpp
+++ b/storage/mroonga/ha_mroonga.cpp
@@ -12859,13 +12859,22 @@ int ha_mroonga::delete_all_rows()
 int ha_mroonga::wrapper_truncate()
 {
   int error = 0;
+  MRN_SHARE *tmp_share;
   MRN_DBUG_ENTER_METHOD();
+
+  if (!(tmp_share = mrn_get_share(table->s->table_name.str, table, &error)))
+    DBUG_RETURN(error);
+
   MRN_SET_WRAP_SHARE_KEY(share, table->s);
   MRN_SET_WRAP_TABLE_KEY(this, table);
-  error = wrap_handler->ha_truncate();
+  error = parse_engine_table_options(ha_thd(), tmp_share->hton, table->s)
+    ? MRN_GET_ERROR_NUMBER
+    : wrap_handler->ha_truncate();
   MRN_SET_BASE_SHARE_KEY(share, table->s);
   MRN_SET_BASE_TABLE_KEY(this, table);
+  mrn_free_share(tmp_share);
+
   if (!error && wrapper_have_target_index()) {
     error = wrapper_truncate_index();
   }

Marko Mäkelä made changes - 2018-08-28 13:40

Fix Version/s

10.3 [ 22126 ]

Marko Mäkelä added a comment - 2018-08-28 16:39

I have pushed this to bb-10.2-marko for testing. There are 2 open issues:

- Undo tablespace truncation (which is disabled by default) generates a huge mini-transaction (more than 1 megabyte of log), which causes crash recovery to hang when using a small buffer pool (8 megabytes).
- TRUNCATE TABLE is not crash-safe before ~~MDEV-14717~~ (crash-safe RENAME TABLE inside InnoDB). If the server is killed, we might end up with tablename.ibd being called #sql-ib….ibd, and only DROP TABLE would be allowed. We might also end up with an orphan table #sql-ib….

These issues could be resolved by implementing changes that will break crash-downgrade to earlier versions in the 10.2 or 10.3 series:

- Port ~~MDEV-14717~~ to 10.2. (This change is already present in bb-10.2-ext and 10.3.) This will change the insert_undo log format, breaking crash-downgrade to earlier 10.2. Normal downgrade should not be affected, because DDL operations are always committed before shutdown, and the insert_undo log will be discarded at transaction commit.
- Introduce a more compact redo log format for undo tablespace truncation. This would break crash-downgrade to earlier 10.2 or 10.3 versions. Crash-downgrade with the current code should be possible; the only caveat is that the undo tablespace file size would not be shrunk in the recovery by earlier versions.

Marko Mäkelä added a comment - 2018-08-31 12:13

For the record, this would also fix the following hang, which I observed in innodb_zip.wl6501_scale_1:

10.2 206528f722799b04708c60a71b59d75bd32bdeb3
#7 0x00005654d44a76d0 in rw_lock_x_lock_wait_func (lock=0x5654d7b36c40,
pass=0, threshold=0,
file_name=0x5654d4ad0268 "/mariadb/10.2m/storage/innobase/btr/btr0sea.cc",
line=1259) at /mariadb/10.2m/storage/innobase/sync/sync0rw.cc:477
#8 0x00005654d44a7822 in rw_lock_x_lock_low (lock=0x5654d7b36c40, pass=0,
file_name=0x5654d4ad0268 "/mariadb/10.2m/storage/innobase/btr/btr0sea.cc",
line=1259) at /mariadb/10.2m/storage/innobase/sync/sync0rw.cc:541
#9 0x00005654d44a7be4 in rw_lock_x_lock_func (lock=0x5654d7b36c40, pass=0,
file_name=0x5654d4ad0268 "/mariadb/10.2m/storage/innobase/btr/btr0sea.cc",
line=1259) at /mariadb/10.2m/storage/innobase/sync/sync0rw.cc:692
#10 0x00005654d4540631 in btr_search_drop_page_hash_index (
block=0x7f26b36d3278)
at /mariadb/10.2m/storage/innobase/btr/btr0sea.cc:1259
#11 0x00005654d457cb46 in buf_LRU_free_page (bpage=0x7f26b36d3278, zip=true)
at /mariadb/10.2m/storage/innobase/buf/buf0lru.cc:1767
#12 0x00005654d455dc47 in buf_page_io_complete (bpage=0x7f26b36d3278,
dblwr=false, evict=true)
at /mariadb/10.2m/storage/innobase/buf/buf0buf.cc:6297
#13 0x00005654d45e4442 in fil_aio_wait (segment=5)
at /mariadb/10.2m/storage/innobase/fil/fil0fil.cc:5290
#14 0x00005654d449ab11 in io_handler_thread (arg=0x5654d5203368 <n+40>)
at /mariadb/10.2m/storage/innobase/srv/srv0start.cc:337

The above I/O thread is waiting for an adaptive hash index latch, which is being held by TRUNCATE (which appears to have initiated the flush and eviction):

#2  0x00005654d4579d31 in buf_flush_dirty_pages (buf_pool=0x5654d7ac6b30,
    id=55, observer=0x0) at /mariadb/10.2m/storage/innobase/buf/buf0lru.cc:694
#3  0x00005654d4579e32 in buf_LRU_flush_or_remove_pages (id=55, observer=0x0)
    at /mariadb/10.2m/storage/innobase/buf/buf0lru.cc:712
#4  0x00005654d45dd5b4 in fil_reinit_space_header_for_table (
    table=0x7f265005c298, size=7, trx=0x7f26b8172138)
    at /mariadb/10.2m/storage/innobase/fil/fil0fil.cc:3185
#5  0x00005654d446df5e in row_truncate_table_for_mysql (table=0x7f265005c298,
    trx=0x7f26b8172138)
    at /mariadb/10.2m/storage/innobase/row/row0trunc.cc:2036
#6  0x00005654d4302650 in ha_innobase::truncate (this=0x7f2650035f40)
    at /mariadb/10.2m/storage/innobase/handler/ha_innodb.cc:13073

Marko Mäkelä made changes - 2018-09-03 14:56

Link

This issue relates to ~~MDEV-16465~~ [ ~~MDEV-16465~~ ]

Marko Mäkelä made changes - 2018-09-05 10:35

Link

This issue blocks ~~MDEV-14481~~ [ ~~MDEV-14481~~ ]

Marko Mäkelä made changes - 2018-09-05 12:31

Link

This issue relates to ~~MDEV-17138~~ [ ~~MDEV-17138~~ ]

Marko Mäkelä added a comment - 2018-09-05 13:29

bb-10.4-marko removes code for supporting crash-upgrade of TRUNCATE TABLE or undo tablespace truncation from pre-~~MDEV-13564~~ 10.2 or 10.3 to 10.4.
To play it safe, I think that 10.4 should refuse crash-upgrade from 10.2 or 10.3 where the ~~MDEV-13564~~ fix is not present. While we can detect the occurrence of unsupported TRUNCATE TABLE by the presence of an MLOG_TRUNCATE record, undo tablespace truncation appears to be signalled by the presence of a separate log file only. We can implement this refusal by introducing a redo log format subtype that indicates whether the ~~MDEV-13564~~ fix is present. Older versions would ignore this subtype byte and keep working.
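
The proposed startup check could look roughly like the following sketch. Everything in it is an assumption made for illustration: the `RedoLogHeader` layout, the `clean_shutdown` flag, and the subformat values are hypothetical and do not reflect the actual redo log header encoding. It also assumes that an upgrade after a clean shutdown is always acceptable, since nothing needs to be recovered in that case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subformat values; the real encoding is not given here.
constexpr uint8_t SUBFORMAT_LEGACY_TRUNCATE = 0;       // fix not present
constexpr uint8_t SUBFORMAT_BACKUP_SAFE_TRUNCATE = 1;  // fix present

// Hypothetical view of what startup learns from the redo log header.
struct RedoLogHeader {
  bool clean_shutdown;  // true if no crash recovery is needed
  uint8_t subformat;    // the proposed format subtype byte
};

enum class StartupDecision { PROCEED, REFUSE_CRASH_UPGRADE };

// Refuse only when recovery would be needed from a server whose log may
// contain the old, unsupported truncate operations.
StartupDecision decide_startup(const RedoLogHeader& header) {
  if (header.clean_shutdown) {
    return StartupDecision::PROCEED;
  }
  return header.subformat == SUBFORMAT_BACKUP_SAFE_TRUNCATE
             ? StartupDecision::PROCEED
             : StartupDecision::REFUSE_CRASH_UPGRADE;
}
```

Because the subtype lives in a byte that older versions do not inspect, writing it does not affect servers that predate the check.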

Marko Mäkelä added a comment - 2018-09-06 11:14

To make the new TRUNCATE crash-safe in 10.2, I backported ~~MDEV-14717~~, ~~MDEV-14378~~, a follow-up to ~~MDEV-13407~~, and ~~MDEV-14585~~ to bb-10.2-marko.

With this, a crash-downgrade of a RENAME (or TRUNCATE or table-rebuilding ALTER TABLE or OPTIMIZE TABLE) operation to an earlier 10.2 version would trigger a debug assertion failure during rollback, in trx_roll_pop_top_rec_of_trx():

		ut_ad(undo == update || undo == temp);

In a non-debug build, this would cause the undo log record to be misinterpreted as an update. The table name would be misinterpreted as DB_TRX_ID,DB_ROLL_PTR and the PRIMARY KEY of the table. In the highly unlikely event that a record is found, the execution would be aborted in row_undo_mod(), on the switch (node->rec_type). Normally, the non-debug build would crash inside ha_innobase::open():

#2  0x00005555559702b7 in ut_dbg_assertion_failed (
    expr=expr@entry=0x555556101bac "table2 == NULL",
    file=file@entry=0x5555561005a8 "/mariadb/10.2m/storage/innobase/dict/dict0dict.cc", line=line@entry=1319)
    at /mariadb/10.2m/storage/innobase/ut/ut0dbg.cc:61
#3  0x0000555555e5a21e in dict_table_add_to_cache (
    table=table@entry=0x7fff98017790,
    can_be_evicted=can_be_evicted@entry=true, heap=heap@entry=0x7fff9801c8d0)
    at /mariadb/10.2m/storage/innobase/dict/dict0dict.cc:1319
#4  0x0000555555e6bec5 in dict_load_table_one(table_name_t&, bool, dict_err_ignore_t, std::deque<char const*, ut_allocator<char const*, true> >&) ()
    at /mariadb/10.2m/storage/innobase/dict/dict0load.cc:3013
#5  0x0000555555e6c4c2 in dict_load_table(char const*, bool, dict_err_ignore_t) () at /mariadb/10.2m/storage/innobase/dict/dict0load.cc:2810
#6  0x0000555555e5fd38 in dict_table_open_on_name(char const*, unsigned long, unsigned long, dict_err_ignore_t) ()
    at /mariadb/10.2m/storage/innobase/dict/dict0dict.cc:1170
#7  0x0000555555ceca9b in ha_innobase::open_dict_table (
    table_name=<optimized out>, norm_name=0x7ffff4fcad20 "test/t1",
    is_partition=<optimized out>, ignore_err=DICT_ERR_IGNORE_NONE)
    at /mariadb/10.2m/storage/innobase/handler/ha_innodb.cc:6552
#8  0x0000555555cfb00e in ha_innobase::open(char const*, int, unsigned int) ()
    at /mariadb/10.2m/storage/innobase/handler/ha_innodb.cc:6216

I am not sure if this really counts as a regression. 10.2 without ~~MDEV-14717~~ would not have crash-safe RENAME to begin with.

Marko Mäkelä made changes - 2018-09-06 20:14

Fix Version/s

10.2 [ 14601 ]

Marko Mäkelä added a comment - 2018-09-06 20:17

We can prevent a crash-downgrade to earlier MariaDB 10.2 versions by changing the InnoDB redo log format identifier to the 10.3 identifier, and by introducing a subformat identifier so that 10.2 can continue to refuse crash-downgrade from 10.3 or later. After a clean shutdown, a downgrade to MariaDB 10.2.13 or later would still be possible. Older MariaDB 10.2 versions are missing ~~MDEV-14909~~, and a downgrade to them would only be possible after removing the log files (not recommended).

Marko Mäkelä made changes - 2018-09-07 16:52

Link

This issue is blocked by ~~MDEV-14717~~ [ ~~MDEV-14717~~ ]

Marko Mäkelä made changes - 2018-09-07 18:59

Link

This issue relates to ~~MDEV-17158~~ [ ~~MDEV-17158~~ ]

Marko Mäkelä added a comment - 2018-09-07 19:24

Even though I prepared a fix for 10.2, I decided not to push it yet, because I fear that ~~MDEV-17158~~ could occasionally cause loss of data when InnoDB is killed during a table-rebuilding operation, such as TRUNCATE or ALTER or OPTIMIZE.

Marko Mäkelä made changes - 2018-09-07 19:24

issue.field.resolutiondate

2018-09-07 19:24:38.0

2018-09-07 19:24:38.377

Marko Mäkelä made changes - 2018-09-07 19:24

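| Field | Original Value | New Value |
|---|---|---|
| Fix Version/s | | 10.3.10 [ 23140 ] |
| Fix Version/s | | 10.4.0 [ 23115 ] |
| Fix Version/s | 10.2 [ 14601 ] | |
| Fix Version/s | 10.3 [ 22126 ] | |
| Fix Version/s | 10.4 [ 22408 ] | |
| Resolution | | Fixed [ 1 ] |
| Status | In Progress [ 3 ] | Closed [ 6 ] |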

Marko Mäkelä made changes - 2018-09-13 14:29

Link

This issue relates to ~~MDEV-15522~~ [ ~~MDEV-15522~~ ]

Marko Mäkelä added a comment - 2018-09-13 14:38

We decided not to push this to 10.2.18, because it backports a large amount of code to 10.2, which could be risky right before a 10.2 release. So, the backport could be merged to 10.2.19 at the earliest.

Note that the crash-downgrade prevention for 10.2 will prevent Percona Xtrabackup from working with MariaDB Server 10.2 with the backport included. (Xtrabackup already does not work with MariaDB Server 10.3 or later.)

Marko Mäkelä made changes - 2018-09-21 06:00

Link

This issue is duplicated by ~~MDEV-9459~~ [ ~~MDEV-9459~~ ]

Marko Mäkelä made changes - 2018-09-26 09:39

Fix Version/s

10.2.19 [ 23207 ]

Marko Mäkelä made changes - 2018-09-27 11:37

Link

This issue relates to ~~MDEV-17304~~ [ ~~MDEV-17304~~ ]

Marko Mäkelä added a comment - 2018-10-10 16:26 - edited

bb-10.2-marko
MariaDB 10.2.19 will support the backup-unsafe TRUNCATE TABLE by default, to retain perceived compatibility with xtrabackup.
Undo tablespace truncation will use the redo log, but older versions of the server or older backup tools will fail to shrink the undo tablespace files on recovery.
The backup-safe TRUNCATE can be enabled in MariaDB 10.2 by setting the start-up parameter loose_innodb_unsafe_truncate=OFF. This parameter will not be available in 10.3 or later releases.

Edit: the option was renamed to innodb_safe_truncate.

Marko Mäkelä added a comment - 2018-10-11 05:12

Buildbot seems to be OK with the change. I conducted a manual test of crash-downgrading to mariadb-10.2.18:

./mtr --manual-gdb innodb.truncate_crash

./mtr --manual-gdb innodb_zip.wl6501_crash_3

Once the server is killed by the test, switch to the 10.2.18 executable (using the file command in GDB) and restart (run in GDB).

With the first test (which uses the backup-safe mechanism), the InnoDB in MariaDB Server 10.2.18 would refuse to start up, after emitting the following message to the error log:

2018-10-11 7:58:30 140737330922240 [ERROR] InnoDB: Downgrade after a crash is not supported. The redo log was created with MariaDB 10.2.19.

With the second test, which uses MySQL 5.7’s backup-unsafe but crash-safe TRUNCATE TABLE, the MariaDB 10.2.18 server would parse and apply the old-format redo log and the ib_*_*_trunc.log just fine. That test restarts the server several times. I ran it twice: the first time, I switched to 10.2.18 on the first restart; the second time, I switched to 10.2.18 on the second restart and then alternated between 10.2.18 and this 10.2.19 revision on subsequent restarts.
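When repeating this kind of manual check, it may help to scan the error log for the refusal message automatically. The following is a minimal sketch that assumes the message wording is exactly as quoted above; the function name `find_downgrade_refusal` is hypothetical and not part of mtr or the server.

```python
import re

# Pattern based on the error log line quoted above; the wording is
# assumed to be stable, which may not hold across server versions.
_REFUSAL = re.compile(
    r"\[ERROR\] InnoDB: Downgrade after a crash is not supported\. "
    r"The redo log was created with (?P<creator>.+)\.\s*$"
)


def find_downgrade_refusal(log_lines):
    """Return the creator string from the first refusal message, or None.

    log_lines is any iterable of error log lines, for example an open
    file object.
    """
    for line in log_lines:
        match = _REFUSAL.search(line)
        if match:
            return match.group("creator")
    return None
```

A result of `None` means that no refusal was logged, which is the expected outcome for the second test, where 10.2.18 applies the old-format redo log.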

Marko Mäkelä added a comment - 2018-10-17 14:37

The default value for the 10.2-specific parameter will be innodb_safe_truncate=ON.

Marko Mäkelä made changes - 2018-11-01 09:38

Link

This issue relates to ~~MDEV-9459~~ [ ~~MDEV-9459~~ ]

Marko Mäkelä made changes - 2018-11-20 10:50

Link

This issue relates to ~~MDEV-17780~~ [ ~~MDEV-17780~~ ]

Marko Mäkelä made changes - 2018-11-22 08:18

Link

This issue relates to ~~MDEV-17794~~ [ ~~MDEV-17794~~ ]

Marko Mäkelä made changes - 2018-11-26 08:55

Link

This issue causes ~~MDEV-17816~~ [ ~~MDEV-17816~~ ]

Marko Mäkelä made changes - 2018-11-26 13:04

Link

This issue relates to ~~MDEV-17831~~ [ ~~MDEV-17831~~ ]

Marko Mäkelä made changes - 2018-11-27 09:21

Link

This issue causes ~~MDEV-17849~~ [ ~~MDEV-17849~~ ]

Elena Stepanova made changes - 2018-12-01 11:38

Link

This issue causes ~~MDEV-17885~~ [ ~~MDEV-17885~~ ]

Marko Mäkelä made changes - 2019-03-11 19:04

Link

This issue causes ~~MDEV-18836~~ [ ~~MDEV-18836~~ ]

Marko Mäkelä made changes - 2019-03-12 16:01

Link

This issue relates to ~~MDEV-18739~~ [ ~~MDEV-18739~~ ]

Marko Mäkelä made changes - 2019-03-19 07:48

Link

This issue relates to ~~MDEV-18960~~ [ ~~MDEV-18960~~ ]

Marko Mäkelä made changes - 2019-04-12 15:31

Link

This issue relates to ~~MDEV-14481~~ [ ~~MDEV-14481~~ ]

Marko Mäkelä made changes - 2019-05-06 07:21

Link

This issue relates to ~~MDEV-18654~~ [ ~~MDEV-18654~~ ]

Marko Mäkelä made changes - 2019-05-13 17:08

Link

This issue causes ~~MDEV-19449~~ [ ~~MDEV-19449~~ ]

Geoff Montee (Inactive) made changes - 2019-06-15 04:37

Link

This issue relates to MDEV-19769 [ MDEV-19769 ]

Marko Mäkelä made changes - 2020-01-16 07:22

Link

This issue causes ~~MDEV-21496~~ [ ~~MDEV-21496~~ ]

Marko Mäkelä made changes - 2020-06-04 09:57

Link

This issue relates to MDEV-22733 [ MDEV-22733 ]

Marko Mäkelä made changes - 2020-09-09 13:01

Link

This issue causes ~~MDEV-23705~~ [ ~~MDEV-23705~~ ]

Marko Mäkelä made changes - 2021-02-22 09:39

Link

This issue relates to ~~MDEV-24532~~ [ ~~MDEV-24532~~ ]

Marko Mäkelä made changes - 2021-03-01 17:36

Link

This issue causes ~~MDEV-24532~~ [ ~~MDEV-24532~~ ]

Marko Mäkelä made changes - 2021-03-04 14:02

Link

This issue relates to ~~MDEV-25051~~ [ ~~MDEV-25051~~ ]

Marko Mäkelä made changes - 2021-04-28 06:46

Link

This issue causes ~~MDEV-25524~~ [ ~~MDEV-25524~~ ]

Marko Mäkelä made changes - 2021-04-28 08:51

Link

This issue causes ~~MDEV-25524~~ [ ~~MDEV-25524~~ ]

Marko Mäkelä made changes - 2021-05-18 07:15

Link

This issue relates to ~~MDEV-25710~~ [ ~~MDEV-25710~~ ]

Marko Mäkelä made changes - 2021-09-21 10:43

Link

This issue causes ~~MDEV-26450~~ [ ~~MDEV-26450~~ ]

Sergei Golubchik made changes - 2021-12-06 21:45

Workflow

MariaDB v3 [ 82147 ]

MariaDB v4 [ 152648 ]

Marko Mäkelä made changes - 2023-12-22 08:38

Link

This issue relates to ~~MDEV-33112~~ [ ~~MDEV-33112~~ ]

People

Assignee: Marko Mäkelä

Reporter: Marko Mäkelä

Votes: 0

Watchers: 5

Dates

Created: 2017-08-17 14:12

Updated: 2023-12-22 08:38

Resolved: 2018-09-07 19:24
