[MDEV-25683] Atomic DDL: With innodb_force_recovery=3 InnoDB: Trying to load index but the index tree has been freed Created: 2021-05-14  Updated: 2022-11-02  Resolved: 2021-10-29

Status: Closed
Project: MariaDB Server
Component/s: Embedded Server, Server, Storage Engine - InnoDB
Affects Version/s: 10.6.1, 10.6.2
Fix Version/s: 10.6.5

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: regression

Issue Links:
Problem/Incident
is caused by MDEV-17567 Atomic DDL Closed
Relates
relates to MDEV-26966 The parameter innodb_force_load_corru... Closed
relates to MDEV-17567 Atomic DDL Closed

 Description   

The test was running ALTER TABLE in two threads when it was sigkill-ed. The standard crash recovery seems to work all right (no errors reported and the server starts, other than numerous [Note] records "InnoDB: At LSN: ...: unable to open file ... for tablespace ..."), as well as innodb_force_recovery=[12], but with innodb_force_recovery=3 it throws errors and ends with an assertion failure.

The original test was run on bb-10.6-monty-innodb 60f144ef1f. The recovery failure on a stored datadir is reproducible on bb-10.6-monty be1dcc8aa.

bb-10.6-monty be1dcc8aa

2021-05-15  1:39:51 0 [Note] InnoDB: Opened 2 undo tablespaces
2021-05-15  1:39:51 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
2021-05-15  1:39:51 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=54267,54267
2021-05-15  1:39:51 0 [Note] InnoDB: At LSN: 471698: unable to open file ./test/#sql-alter-27122f-f.ibd for tablespace 34
2021-05-15  1:39:51 0 [Note] InnoDB: At LSN: 492629: unable to open file ./test/#sql-backup-27122f-f.ibd for tablespace 15
2021-05-15  1:39:51 0 [Note] InnoDB: At LSN: 552122: unable to open file ./test/#sql-backup-27122f-10.ibd for tablespace 11
...
2021-05-15  1:39:52 0 [Note] InnoDB: Opened 2 undo tablespaces
2021-05-15  1:39:52 0 [Warning] InnoDB: innodb_undo_tablespaces=0 disables dedicated undo log tablespaces
2021-05-15  1:39:52 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 19 row operations to undo
2021-05-15  1:39:52 0 [Note] InnoDB: Trx id counter is 2959
2021-05-15  1:39:52 0 [Note] InnoDB: Starting final batch to recover 180 pages from redo log.
2021-05-15  1:39:58 0 [Note] InnoDB: 128 rollback segments in 2 undo tablespaces are active.
2021-05-15  1:39:58 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2021-05-15  1:39:58 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2021-05-15  1:39:58 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2021-05-15  1:39:58 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2021-05-15  1:39:58 0 [Note] InnoDB: 10.6.1 started; log sequence number 4332081; transaction id 2959
2021-05-15  1:39:58 0 [Note] InnoDB: !!! innodb_force_recovery is set to 3 !!!
2021-05-15  1:39:58 0 [Note] InnoDB: Loading buffer pool(s) from /mnt-hd8t/src/bb-10.6-monty/data/ib_buffer_pool
2021-05-15  1:39:58 0 [Note] Plugin 'FEEDBACK' is disabled.
2021-05-15  1:39:58 0 [Note] InnoDB: Buffer pool(s) load completed at 210515  1:39:58
2021-05-15  1:39:58 0 [ERROR] InnoDB: Trying to load index `PRIMARY` for table `test`.`#sql-alter-27122f-10`, but the index tree has been freed!
2021-05-15  1:39:58 0 [Note] InnoDB: Index is corrupt but forcing load into data dictionary
2021-05-15  1:39:58 0 [ERROR] InnoDB: A record lock wait happens in a dictionary operation. index `CLUST_IND` of table `SYS_TABLES`. Submit a detailed bug report to https://jira.mariadb.org/
mysqld: /data/src/bb-10.6-monty/storage/innobase/lock/lock0lock.cc:1262: dberr_t lock_rec_enqueue_waiting(ib_lock_t*, unsigned int, page_id_t, const ib_page_t*, ulint, dict_index_t*, que_thr_t*, lock_prdt_t*): Assertion `0' failed.
210515  1:39:58 [ERROR] mysqld got signal 6 ;
 
#7  0x00007fbcfbf8df36 in __GI___assert_fail (assertion=0x555581499806 "0", file=0x55558149b3d0 "/data/src/bb-10.6-monty/storage/innobase/lock/lock0lock.cc", line=1262, function=0x55558149c3b0 "dberr_t lock_rec_enqueue_waiting(ib_lock_t*, unsigned int, page_id_t, const ib_page_t*, ulint, dict_index_t*, que_thr_t*, lock_prdt_t*)") at assert.c:101
#8  0x000055558081aa66 in lock_rec_enqueue_waiting (c_lock=0x7fbcf90cd1f0, type_mode=3, id=..., page=0x7fbcd8d60000 "", heap_no=39, index=0x5555845794c8, thr=0x555584787938, prdt=0x0) at /data/src/bb-10.6-monty/storage/innobase/lock/lock0lock.cc:1262
#9  0x000055558081cb34 in lock_rec_lock (impl=false, mode=3, block=0x7fbcd8029f88, heap_no=39, index=0x5555845794c8, thr=0x555584787938) at /data/src/bb-10.6-monty/storage/innobase/lock/lock0lock.cc:1508
#10 0x000055558083d9b4 in lock_clust_rec_read_check_and_lock (flags=0, block=0x7fbcd8029f88, rec=0x7fbcd8d60bc5 "test/#sql-alter-27122f-10", index=0x5555845794c8, offsets=0x7ffd185c0cd0, mode=LOCK_X, gap_mode=0, thr=0x555584787938) at /data/src/bb-10.6-monty/storage/innobase/lock/lock0lock.cc:5261
#11 0x0000555580aacf29 in sel_set_rec_lock (pcur=0x555584786af0, rec=0x7fbcd8d60bc5 "test/#sql-alter-27122f-10", index=0x5555845794c8, offsets=0x7ffd185c0cd0, mode=3, type=0, thr=0x555584787938, mtr=0x7ffd185c0f30) at /data/src/bb-10.6-monty/storage/innobase/row/row0sel.cc:1282
#12 0x0000555580ab006c in row_sel (node=0x5555847866e0, thr=0x555584787938) at /data/src/bb-10.6-monty/storage/innobase/row/row0sel.cc:1829
#13 0x0000555580ab263c in row_sel_step (thr=0x555584787938) at /data/src/bb-10.6-monty/storage/innobase/row/row0sel.cc:2342
#14 0x0000555580963154 in que_thr_step (thr=0x555584787938) at /data/src/bb-10.6-monty/storage/innobase/que/que0que.cc:629
#15 0x00005555809639cf in que_run_threads_low (thr=0x555584787938) at /data/src/bb-10.6-monty/storage/innobase/que/que0que.cc:709
#16 0x0000555580963bd2 in que_run_threads (thr=0x555584787938) at /data/src/bb-10.6-monty/storage/innobase/que/que0que.cc:729
#17 0x0000555580964078 in que_eval_sql (info=0x5555845f8908, sql=0x5555814f3c00 "PROCEDURE DROP_VIRTUAL_PROC () IS\ntid CHAR;\nBEGIN\nSELECT ID INTO tid FROM SYS_TABLES\nWHERE NAME = :name FOR UPDATE;\nIF (SQL % NOTFOUND) THEN RETURN; END IF;\nDELETE FROM SYS_VIRTUAL WHERE TABLE_ID = ti"..., reserve_dict_mutex=false, trx=0x7fbcf90cf3c0) at /data/src/bb-10.6-monty/storage/innobase/que/que0que.cc:779
#18 0x0000555580a3d715 in row_drop_table_for_mysql (name=0x7ffd185c1cd0 "test/#sql-alter-27122f-10", trx=0x7fbcf90cf3c0, sqlcom=SQLCOM_END, create_failed=false, nonatomic=true) at /data/src/bb-10.6-monty/storage/innobase/row/row0mysql.cc:3517
#19 0x000055558071e2e7 in ha_innobase::delete_table (this=0x5555842cdea0, name=0x7ffd185c2740 "./test/#sql-alter-27122f-10", sqlcom=SQLCOM_END) at /data/src/bb-10.6-monty/storage/innobase/handler/ha_innodb.cc:13095
#20 0x00005555806ee8c1 in ha_innobase::delete_table (this=0x5555842cdea0, name=0x7ffd185c2740 "./test/#sql-alter-27122f-10") at /data/src/bb-10.6-monty/storage/innobase/handler/ha_innodb.cc:13218
#21 0x000055557fecf5e5 in hton_drop_table (hton=0x555584245318, path=0x7ffd185c2740 "./test/#sql-alter-27122f-10") at /data/src/bb-10.6-monty/sql/handler.cc:577
#22 0x000055557fa6ece9 in ddl_log_execute_action (thd=0x5555842c7288, mem_root=0x7ffd185c29b0, ddl_log_entry=0x7ffd185c29f0) at /data/src/bb-10.6-monty/sql/ddl_log.cc:2023
#23 0x000055557fa70f91 in ddl_log_execute_entry_no_lock (thd=0x5555842c7288, first_entry=4) at /data/src/bb-10.6-monty/sql/ddl_log.cc:2373
#24 0x000055557fa72efb in ddl_log_execute_recovery () at /data/src/bb-10.6-monty/sql/ddl_log.cc:2733
#25 0x000055557f5b15e5 in init_server_components () at /data/src/bb-10.6-monty/sql/mysqld.cc:5325
#26 0x000055557f5b2dd2 in mysqld_main (argc=14, argv=0x5555841c2ee8) at /data/src/bb-10.6-monty/sql/mysqld.cc:5652
#27 0x000055557f59a91f in main (argc=13, argv=0x7ffd185c4c58) at /data/src/bb-10.6-monty/sql/main.cc:34
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x555581d325e0): INTERNAL DDL LOG RECOVER IN PROGRESS
 
Connection ID (thread ID): 0
Status: NOT_KILLED

Additional remarks:

  • should all these [Note] records be there?
  • Query (0x555581d325e0): INTERNAL DDL LOG RECOVER IN PROGRESS doesn't look right

Data directory and logs are available. There is no rr profile for the initial run; the profile for recovery can always be created, since the failure is easily reproducible by starting the server on the stored datadir.



 Comments   
Comment by Marko Mäkelä [ 2021-05-15 ]

From the InnoDB point of view, everything works as designed. The parameter innodb_force_recovery=3 prevents any recovered incomplete transactions from being rolled back. Based on the stack trace, it looks like a recovered DDL transaction would exist and hold locks on the InnoDB dictionary tables. Transactional lock waits are not supposed to be possible for InnoDB dictionary tables: the dict_sys.latch is supposed to be covering all modifications to dictionary tables.

This problem was introduded with the MDEV-17567 recovery logic. It fails to account for a special case where one of the storage engines is in read-only mode.

In InnoDB, there is an internal variable high_level_read_only that would be set if innodb_force_recovery is set to 5 or 6, or innodb_read_only is set. Already the values 2 and 3 are somewhat problematic for the DDL log recovery:

  • With innodb_force_recovery=2, purge will not run, and will not delete files that would be eligible to deletion after a DDL transaction was committed. Failure to delete files could clash with some rename operations that could be part of replaying the SQL-layer DDL log.
  • With innodb_force_recovery=3, the rollback of any recovered DDL transaction will be blocked, and such incomplete transaction may hold locks on the dictionary tables, triggering an assertion failure as described.

I think that this needs to be addressed with an API change:

  • Before starting DDL recovery, a new function handlerton::start_ddl_recovery will be invoked. If a storage engine responds that it will be unable to run recovery, then recovery will be skipped if any table belonging to that storage engine would require recovery.
  • The InnoDB impelementation of the new function would refuse recovery if innodb_read_only is set, or innodb_force_recovery is set to the value of 2 or more.
Comment by Michael Widenius [ 2021-09-07 ]

I think we need to first understand why one would use a high value for innodb_force_recovery.
The 'only' reason I can think if is that InnoDB is corrupted and refuses to start and the user wants to start
the server and be able to run 'mariadb-dump' to recover as much data as possible.

One problem with this is that if ddl recovery is not done, we will have an issue with probably-correct tables that where used for DDL's during the last crash. I call them possible-correct as these tables where exclusively locked, fully committed (and hopefully flushed)) and has no open transactions on them.
The problems are that will cause mysqldump to find no data for the tables:

  • Non existing or not matching .frm (possible for ALTER TABLE & RENAME TABLE)
  • Tables that should not be there (possible with ALTER TABLE ... RENAME)
  • Half dropped tables (.frm exists but no data, which may confuse mariadb-dump)
  • Triggers may be orphan.

One way to fix this would be to separate innodb_force_recovery and ddl recovery a bit.

Some suggestions:

  • If innodb_force_recovery < 10, then temporary set innodb_force_recovery to 0 during ddl recovery
    (We can also do this by having another flag: '--log-ddl-force-recovery' (set to true by default) that will
    do the same if set).
  • If Innodb_force_recovery >= 10 (or ddl transactions are not rolled back), innodb_read_only is set or --log-ddl-force-recovery=false then on startup
    we could pre-check if there is any InnoDB tables in the recovery log. If yes, then refuse to start.
    Another option is to run only the ddl log entries that does not touch storage engines that are not
    enabled or marked as 'not touchable because of options like innodb_force_recovery'.
Comment by Marko Mäkelä [ 2021-09-07 ]

innodb_force_recovery is somewhat broken by design, and a collection of Boolean flags would be better. Let me enumerate the values:

  • 0: normal
  • 1: ignore missing files and corrupted pages on redo log recovery
  • 2: 1 + disable background tasks, such as the purge of transaction history
  • 3: 2 + disable the rollback of recovered transactions (dictionary tables will remain locked if an incomplete DDL transaction was recovered)
  • 4: like 3 (since MDEV-19514 in 10.5)
  • 5: 4 + disable the recovery of undo logs; all transactions will run at READ UNCOMMITTED isolation level
  • 6: 5 + disable the redo log recovery

As you can see, the settings 3 and 4 may prevent DDL recovery. With the settings 5 and 6, DDL recovery must really be refused.

A possible simple revision might be as follows:

  • Make innodb_force_recovery=3 roll back DDL transactions, but not any other recovered transactions. Thus, we can allow DDL log recovery normally.
  • Make innodb_force_recovery=4 (or larger) refuse to roll back DDL transactions, and refuse InnoDB startup if DDL recovery would be needed and some dictionary tables are recovered as locked.

Can InnoDB plugin initialization be informed that the ddl-recovery.log is not logically empty? If so, then it could just refuse startup when it is undesirable.

Comment by Marko Mäkelä [ 2021-10-26 ]

I got a patch that implements a predicate ddl_log_check_if_engine_is_used().

Comment by Marko Mäkelä [ 2021-10-26 ]

Because ddl_log_check_if_engine_is_used() only accepts the name of a storage engine and not a table, I think that it is simplest to refuse startup if DDL log for InnoDB exists and innodb_read_only=ON or innodb_force_recovery>2 has been specified.

Comment by Marko Mäkelä [ 2021-10-26 ]

Please fix the init_server_components() in my work-in-progress change. It does not actually fix the bug (by refusing InnoDB startup) because ddl_log_initialize() is being called after innodb_init() and not before it.

Comment by Marko Mäkelä [ 2021-10-27 ]

The test case innodb.alter_copy would cover this, but it currently exposes a problem:

2021-10-27 15:19:24 0 [ERROR] DDL_LOG: Got error 4182 when trying to execute action for entry 2 of type 'alter table'

In addition to this error, we would ‘leak’ a #sql-alter table from an ALTER TABLE operation that was interrupted by killing the server.

Because in the general case it is not possible to apply only part of the DDL recovery log, it may be simplest to make the entire server to refuse start up if some storage engine is unable to perform DDL recovery.

Comment by Marko Mäkelä [ 2021-10-29 ]

With the original dataset and with a recent 10.6 that includes a fix of MDEV-25919, I am seeing the following for both data directories:

sql/mariadbd --innodb-force-recovery=3 --datadir /dev/shm/atomic_ddl/MDEV-25683/data_orig
sql/mariadbd --innodb-force-recovery=3 --datadir /dev/shm/atomic_ddl/MDEV-25683/data

10.6 d8c6c53a0658ea8315252997f65d796b13182387

2021-10-29  8:19:21 0 [Note] InnoDB: At LSN: 4048912: unable to open file ./test/#sql-backup-27122f-10.ibd for tablespace 88
2021-10-29  8:19:21 0 [Note] InnoDB: At LSN: 4190176: unable to open file ./test/#sql-backup-27122f-10.ibd for tablespace 104
2021-10-29  8:19:22 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2021-10-29  8:19:22 0 [Note] InnoDB: 10.6.5 started; log sequence number 4332081; transaction id 2959
2021-10-29  8:19:22 0 [Note] InnoDB: Loading buffer pool(s) from /dev/shm/atomic_ddl/MDEV-25683/data_orig/ib_buffer_pool
2021-10-29  8:19:22 0 [Note] Plugin 'FEEDBACK' is disabled.
2021-10-29  8:19:22 0 [Note] InnoDB: Buffer pool(s) load completed at 211029  8:19:22
2021-10-29  8:19:22 0 [ERROR] mariadbd: Can't open shared library '/usr/local/mysql/lib/plugin/disks.so' (errno: 0, cannot open shared object file: No such file or directory)
2021-10-29  8:19:22 0 [Warning] 10.6m/sql/mariadbd: unknown option '--loose-enable-named-pipe'
2021-10-29  8:19:22 0 [ERROR] InnoDB: Trying to load index `PRIMARY` for table `test`.`#sql-alter-27122f-10`, but the index tree has been freed!
2021-10-29  8:19:22 0 [Note] InnoDB: Index is corrupt but forcing load into data dictionary

The server is hanging in ddl_log_execute_recovery() and not responding to SIGQUIT. The culprit is a lock wait on the InnoDB table SYS_TABLES when trying to drop the table #sql-alter-27122f-10. That lock wait will occur because a DDL transaction was recovered, but its rollback was prohibited by innodb_force_recovery=3.

The last messages about #sql-alter-27122f-10 are issued by dict_load_indexes() and should be removed by MDEV-26933.

Now, how would the situation change if I set innodb_force_recovery=5 or innodb_read_only=ON? In those modes, InnoDB should refuse any DDL or DML operation, except maybe to temporary tables.

sql/mariadbd --innodb-read-only --datadir /dev/shm/atomic_ddl/MDEV-25683/data_orig

10.6 d8c6c53a0658ea8315252997f65d796b13182387

2021-10-29  8:55:47 0 [Warning] InnoDB: innodb_read_only prevents crash recovery
2021-10-29  8:55:47 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1423] with error Read only transaction
2021-10-29  8:55:47 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2021-10-29  8:55:47 0 [ERROR] Aborting

MDEV-12700 has been filed for lifting this limitation.

sql/mariadbd --innodb-force-recovery=5 --datadir /dev/shm/atomic_ddl/MDEV-25683/data_orig

10.6 d8c6c53a0658ea8315252997f65d796b13182387

2021-10-29  8:52:32 0 [Note] DDL_LOG: Crash recovery executed 1 entries
2021-10-29  8:52:32 0 [Note] Server socket created on IP: '0.0.0.0'.
2021-10-29  8:52:32 0 [Note] Server socket created on IP: '::'.
2021-10-29  8:52:32 1 [ERROR] InnoDB: Failed to find tablespace for table `mysql`.`gtid_slave_pos` in the cache. Attempting to load the tablespace with space id 6
2021-10-29  8:52:32 0 [Note] 10.6/sql/mariadbd: ready for connections.
Version: '10.6.5-MariaDB-debug'  socket: '/dev/shm/mariadb.sock'  port: 3306  Source distribution

The server will start up, but not nicely. There is no log message saying that we failed to drop the table #sql-alter-27122f-10 from InnoDB. The .frm file will be removed, but the #sql-alter-27122f-10.ibd will remain, as will the InnoDB data dictionary entries for the table. This would not be corrected on the next restart without that InnoDB setting.

Note: I would not recommend to trust any dumped data if the server was started up with innodb_force_recovery=5. That would make all transactions use the READ UNCOMMITTED isolation level. Currently, a start-up with innodb_force_recovery=5 would avoid the DDL recovery hang and leave the InnoDB dictionary inconsistent with the DDL log. But, it would also allow a subsequent restart with innodb_force_recovery=3, which should be safer for dumping the data, because the proper REPEATABLE READ isolation level will be available. (If some pointers to undo log are corrupted, then the server might crash while reading some data.)

Now, one last attempt. What if we implement my earlier suggestion that we will start to distinguish innodb_force_recovery=4 from innodb_force_recovery=3 in 10.6? (That setting was effectively removed by MDEV-19514 in 10.5.) That is, innodb_force_recovery=4 would prevent the rollback of a recovered DDL transaction, while innodb_force_recovery=3 would only prevent the rollback of DML transactions:

diff --git a/storage/innobase/srv/srv0start.cc b/storage/innobase/srv/srv0start.cc
index 3f0ccf44b1e..2975bc10f51 100644
--- a/storage/innobase/srv/srv0start.cc
+++ b/storage/innobase/srv/srv0start.cc
@@ -1745,7 +1745,7 @@ dberr_t srv_start(bool create_new_db)
 		should guarantee that there is at most one data
 		dictionary transaction active at a time. */
 		if (!high_level_read_only
-		    && srv_force_recovery < SRV_FORCE_NO_TRX_UNDO) {
+		    && srv_force_recovery <= SRV_FORCE_NO_TRX_UNDO) {
 			/* If the following call is ever removed, the
 			first-time ha_innobase::open() must hold (or
 			acquire and release) a table lock that
diff --git a/storage/innobase/trx/trx0roll.cc b/storage/innobase/trx/trx0roll.cc
index a0582413d07..fd28dae4715 100644
--- a/storage/innobase/trx/trx0roll.cc
+++ b/storage/innobase/trx/trx0roll.cc
@@ -709,7 +709,7 @@ void trx_rollback_recovered(bool all)
 {
   std::vector<trx_t*> trx_list;
 
-  ut_a(srv_force_recovery < SRV_FORCE_NO_TRX_UNDO);
+  ut_a(srv_force_recovery < (all ? SRV_FORCE_NO_TRX_UNDO : SRV_FORCE_NO_IBUF_MERGE));
 
   /*
     Collect list of recovered ACTIVE transaction ids first. Once collected, no

The final fix would rename SRV_FORCE_NO_IBUF_MERGE to a more descriptive name SRV_FORCE_NO_DDL_UNDO and change the documentation of innodb_force_recovery. With the above patch, the server will start up and perform the DDL log recovery normally on both data directories:

sql/mariadbd --innodb-force-recovery=3 --datadir /dev/shm/atomic_ddl/MDEV-25683/data_orig
sql/mariadbd --innodb-force-recovery=3 --datadir /dev/shm/atomic_ddl/MDEV-25683/data

10.6 d8c6c53a0658ea8315252997f65d796b13182387 with patch

2021-10-29  9:21:19 0 [Note] DDL_LOG: Crash recovery executed 1 entries
2021-10-29  9:21:19 0 [Note] Server socket created on IP: '0.0.0.0'.
2021-10-29  9:21:19 0 [Note] Server socket created on IP: '::'.
2021-10-29  9:21:19 0 [Note] 10.6m/sql/mariadbd: ready for connections.
Version: '10.6.5-MariaDB-debug'  socket: '/dev/shm/mariadb.sock'  port: 3306  Source distribution

There will be no #sql files in the data directories afterwards.

With or without the above patch, the hang can be reproduced with innodb_force_recovery=4 just fine. And the message about #sql-alter- will be back too:

sql/mariadbd --innodb-force-recovery=4 --datadir /dev/shm/atomic_ddl/MDEV-25683/data_orig/

10.6 d8c6c53a0658ea8315252997f65d796b13182387 with or without patch

2021-10-29  9:26:35 0 [ERROR] InnoDB: Trying to load index `PRIMARY` for table `test`.`#sql-alter-27122f-10`, but the index tree has been freed!
2021-10-29  9:26:35 0 [Note] InnoDB: Index is corrupt but forcing load into data dictionary
[startup hangs here]

Comment by Marko Mäkelä [ 2021-10-29 ]

Fix: innodb_force_recovery=3 will no longer prevent the rollback of recovered DDL transactions. This allows the data set to recover correctly when using innodb_force_recovery=3.

Other settings than innodb_force_recovery=3 were unaffected by this change.
The setting innodb_force_recovery=4 will continue to prevent the rollback of recovered DDL transactions and likely make the DDL log recovery hang.
The setting innodb_force_recovery=5 allows the server to start up, but the DDL log recovery will be skipped inside InnoDB (causing inconsistency between InnoDB and the .frm files).

Comment by Sergei Golubchik [ 2021-10-29 ]

greenman, see above and in particular see the commit comment. This impacts the documentation of innodb_force_recovery

Comment by Marko Mäkelä [ 2021-11-04 ]

MDEV-26966 in 10.6.6 (not 10.6.5) removes these two messages that were previously output at innodb_force_recovery=4:

10.6 d8c6c53a0658ea8315252997f65d796b13182387

2021-10-29  9:26:35 0 [ERROR] InnoDB: Trying to load index `PRIMARY` for table `test`.`#sql-alter-27122f-10`, but the index tree has been freed!
2021-10-29  9:26:35 0 [Note] InnoDB: Index is corrupt but forcing load into data dictionary

After MDEV-26966, the startup will look like this:

10.6 667e71399ea5deba0b2530afe26a0699a59c56ec

2021-11-04 10:41:34 0 [Note] InnoDB: 10.6.5 started; log sequence number 4332197; transaction id 2959
2021-11-04 10:41:34 0 [Note] InnoDB: Loading buffer pool(s) from /dev/shm/atomic_ddl/MDEV-25683/data/ib_buffer_pool
2021-11-04 10:41:34 0 [Note] InnoDB: Buffer pool(s) load completed at 211104 10:41:34
2021-11-04 10:41:34 0 [Note] Plugin 'FEEDBACK' is disabled.
2021-11-04 10:41:34 0 [ERROR] mariadbd: Can't open shared library '/usr/local/mysql/lib/plugin/disks.so' (errno: 0, cannot open shared object file: No such file or directory)
2021-11-04 10:41:34 0 [Warning] /dev/shm/10.6m/sql/mariadbd: unknown option '--loose-enable-named-pipe'

Yes, it will still hang there due to the lock conflict on a dictionary table, because the recovered DDL transaction will not be rolled back at innodb_force_recovery=4. This MDEV-25683 changed it only for innodb_force_recovery=3.

Generated at Thu Feb 08 09:39:33 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.