[MDEV-12700] Allow innodb_read_only startup without prior slow shutdown - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Won't Do
Affects Version/s: 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL)
Fix Version/s: 11.0.1
Component/s: Storage Engine - InnoDB
Labels:

Description

The parameter innodb_read_only which first appeared in MySQL 5.6 and MariaDB Server 10.0 disallows startup if the redo log is not logically empty or if the change buffer is not empty.

It is technically possible to apply the needed changes in memory only, without writing them back to the data files. If MariaDB does that, it will be possible to check hot backups in a read-only fashion without any preparatory steps.

Attachments

Issue Links

includes

MDEV-19514 Defer change buffer merge until pages are requested

Closed

is blocked by

MDEV-14481 Execute InnoDB crash recovery in the background

Closed

is duplicated by

MDEV-13614 Allow --innodb-read-only startup even when change buffer or redo logs are not empty

Closed

relates to

MDEV-19176 Do not run out of InnoDB buffer pool during recovery

Closed

MDEV-11814 Refuse innodb_read_only startup if crash recovery is needed

Closed

MDEV-12699 Improve crash recovery of corrupted data pages

Closed

MDEV-14080 InnoDB shutdown sometimes hangs

Closed

MDEV-15202 innodb.log_file_size failed in buildbot - Can't initiate database recovery, running in read-only-mode

Closed

MDEV-29694 Remove the InnoDB change buffer

Closed

(4 relates to)

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2019-11-05 07:54

There are several things that slow shutdown would do, compared to simply killing the server:

Completely merge the change buffer.
Purge all old history of committed transactions. (innodb_read_only does not care whether this happened.)
Create a log checkpoint (effectively clear the recovery log).

Even with ~~MDEV-19514~~ implemented, we still refuse read-only startup when the change buffer is not empty. The test innodb.innodb-change-buffer-recovery covers this scenario.

Another stumbling stone would be that the redo log might not be empty.

We can fix both issues, by simply being prepared to apply buffered changes or redo log records at any time when an affected page is added to the buffer pool. We would not write any redo log or add anything to buf_pool_t::flush_list.

Note: Obviously, we cannot remove anything from recv_sys.pages in this mode, and neither would we modify any change buffer tree or bitmap pages, not even in the buffer pool. If a page is evicted from the buffer pool and reloaded later, we must apply the buffered changes or log records again.

We may have to refuse startup if the buffer pool is not sufficiently large for buffering all redo log records that were written since the latest checkpoint. The usable capacity of the buffer pool would depend on the size of log that was written since the previous checkpoint.

Marko Mäkelä added a comment - 2019-11-05 07:54 There are several things that slow shutdown would do, compared to simply killing the server: Completely merge the change buffer. Purge all old history of committed transactions. ( innodb_read_only does not care whether this happened.) Create a log checkpoint (effectively clear the recovery log). Even with MDEV-19514 implemented, we still refuse read-only startup when the change buffer is not empty. The test innodb.innodb-change-buffer-recovery covers this scenario. Another stumbling stone would be that the redo log might not be empty. We can fix both issues, by simply being prepared to apply buffered changes or redo log records at any time when an affected page is added to the buffer pool. We would not write any redo log or add anything to buf_pool_t::flush_list . Note: Obviously, we cannot remove anything from recv_sys.pages in this mode, and neither would we modify any change buffer tree or bitmap pages, not even in the buffer pool. If a page is evicted from the buffer pool and reloaded later, we must apply the buffered changes or log records again. We may have to refuse startup if the buffer pool is not sufficiently large for buffering all redo log records that were written since the latest checkpoint. The usable capacity of the buffer pool would depend on the size of log that was written since the previous checkpoint.

Marko Mäkelä added a comment - 2019-11-05 07:56

I think that the redo log related part of this depends on ~~MDEV-14481~~: Execute InnoDB crash recovery in the background.

Marko Mäkelä added a comment - 2019-11-05 07:56 I think that the redo log related part of this depends on MDEV-14481 : Execute InnoDB crash recovery in the background.

Marko Mäkelä added a comment - 2022-12-15 08:19

Note: This limitation can manifest itself in non-obvious ways. Here is a recent example:

10.6 4b2e7616f85c5297d64ce44f9202d39104c06df1
sql_sequence.read_only 'innodb' w4 [ fail ]
Test ended at 2022-12-14 05:02:02

CURRENT_TEST: sql_sequence.read_only
mysqltest: At line 26: query 'create sequence s2 cache 5 engine=innodb' failed with wrong errno ER_UNKNOWN_STORAGE_ENGINE (1286): 'Unknown storage engine 'innodb'', instead of ER_CANT_CREATE_TABLE (1005)...

The server error log shows a more complete picture:

CURRENT_TEST: sql_sequence.read_only

2022-12-14  5:00:48 0 [Note] /usr/sbin/mariadbd (initiated by: root[root] @ localhost []): Normal shutdown

221214  5:01:48 [ERROR] mysqld got signal 6 ;

…

2022-12-14  5:02:01 0 [Warning] InnoDB: innodb_read_only prevents crash recovery

2022-12-14  5:02:01 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1463] with error Read only transaction

For an unknown reason (due to some bug), the server process failed to shut down, and it the process was forcibly kill -ABRT’ed by the test harness.

Subsequently, InnoDB refused to start up, because the option innodb_read_only prevents all writes, including any that might be executed by crash recovery.

Once the change buffer is removed (~~MDEV-29694~~), this limitation could be easier to lift. I do not think that this limitation can be completely lifted. If the innodb_buffer_pool_size is too small to hold copies of all recovered pages (that is, multiple recovery batches would be needed), there is no way to start up the server without writing anything to the file system.

Marko Mäkelä added a comment - 2022-12-15 08:19 Note: This limitation can manifest itself in non-obvious ways. Here is a recent example : 10.6 4b2e7616f85c5297d64ce44f9202d39104c06df1 sql_sequence.read_only 'innodb' w4 [ fail ] Test ended at 2022-12-14 05:02:02 CURRENT_TEST: sql_sequence.read_only mysqltest: At line 26: query 'create sequence s2 cache 5 engine=innodb' failed with wrong errno ER_UNKNOWN_STORAGE_ENGINE (1286): 'Unknown storage engine 'innodb'', instead of ER_CANT_CREATE_TABLE (1005)... The server error log shows a more complete picture: CURRENT_TEST: sql_sequence.read_only 2022-12-14 5:00:48 0 [Note] /usr/sbin/mariadbd (initiated by: root[root] @ localhost []): Normal shutdown 221214 5:01:48 [ERROR] mysqld got signal 6 ; … 2022-12-14 5:02:01 0 [Warning] InnoDB: innodb_read_only prevents crash recovery 2022-12-14 5:02:01 0 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1463] with error Read only transaction For an unknown reason (due to some bug), the server process failed to shut down, and it the process was forcibly kill -ABRT ’ed by the test harness. Subsequently, InnoDB refused to start up, because the option innodb_read_only prevents all writes, including any that might be executed by crash recovery. Once the change buffer is removed ( MDEV-29694 ), this limitation could be easier to lift. I do not think that this limitation can be completely lifted. If the innodb_buffer_pool_size is too small to hold copies of all recovered pages (that is, multiple recovery batches would be needed), there is no way to start up the server without writing anything to the file system.

Marko Mäkelä added a comment - 2023-05-15 06:12

Here is a recent occurrence of sql_sequence.read_only failure:

10.5 c9eff1a144ba44846373660a30d342d3f0dc91a5
CURRENT_TEST: sql_sequence.read_only
mysqltest: At line 26: query 'create sequence s2 cache 5 engine=innodb' failed with wrong errno 1286: 'Unknown storage engine 'innodb'', instead of 1005...

The result from queries just before the failure was:
…
#4 0x000055ff3bc813c1 in debug_sync (thd=0x7f4b50000db8, sync_point_name=0x55ff3f54cee9 "innobase_connection_closed", name_len=26) at /home/buildbot/buildbot/build/mariadb-10.5.21/sql/debug_sync.cc:1580
#5 0x000055ff3e03ce4d in innobase_close_connection (hton=0x55ff45289a28, thd=0x7f4b50000db8) at /home/buildbot/buildbot/build/mariadb-10.5.21/storage/innobase/handler/ha_innodb.cc:4785
#6 0x000055ff3cbe0581 in ha_close_connection (thd=0x7f4b50000db8) at /home/buildbot/buildbot/build/mariadb-10.5.21/sql/handler.cc:902
#7 0x000055ff3b8e59cc in THD::free_connection (this=0x7f4b50000db8) at /home/buildbot/buildbot/build/mariadb-10.5.21/sql/sql_class.cc:1615
#8 0x000055ff3b575937 in unlink_thd (thd=0x7f4b50000db8) at /home/buildbot/buildbot/build/mariadb-10.5.21/sql/mysqld.cc:2625

This stack trace is from a hang on shutdown, caused by the earlier test main.kill_debug that had been executed on the same worker earlier:

main.kill_debug 'innodb'                 w3 [ pass ]    344

This is being tracked as ~~MDEV-30084~~.

Marko Mäkelä added a comment - 2023-05-15 06:12 Here is a recent occurrence of sql_sequence.read_only failure : 10.5 c9eff1a144ba44846373660a30d342d3f0dc91a5 CURRENT_TEST: sql_sequence.read_only mysqltest: At line 26: query 'create sequence s2 cache 5 engine=innodb' failed with wrong errno 1286: 'Unknown storage engine 'innodb'', instead of 1005... The result from queries just before the failure was: … #4 0x000055ff3bc813c1 in debug_sync (thd=0x7f4b50000db8, sync_point_name=0x55ff3f54cee9 "innobase_connection_closed", name_len=26) at /home/buildbot/buildbot/build/mariadb-10.5.21/sql/debug_sync.cc:1580 #5 0x000055ff3e03ce4d in innobase_close_connection (hton=0x55ff45289a28, thd=0x7f4b50000db8) at /home/buildbot/buildbot/build/mariadb-10.5.21/storage/innobase/handler/ha_innodb.cc:4785 #6 0x000055ff3cbe0581 in ha_close_connection (thd=0x7f4b50000db8) at /home/buildbot/buildbot/build/mariadb-10.5.21/sql/handler.cc:902 #7 0x000055ff3b8e59cc in THD::free_connection (this=0x7f4b50000db8) at /home/buildbot/buildbot/build/mariadb-10.5.21/sql/sql_class.cc:1615 #8 0x000055ff3b575937 in unlink_thd (thd=0x7f4b50000db8) at /home/buildbot/buildbot/build/mariadb-10.5.21/sql/mysqld.cc:2625 This stack trace is from a hang on shutdown, caused by the earlier test main.kill_debug that had been executed on the same worker earlier: main.kill_debug 'innodb' w3 [ pass ] 344 This is being tracked as MDEV-30084 .

Marko Mäkelä added a comment - 2023-06-07 07:13

The change buffer related part of this was fixed in ~~MDEV-29694~~, by removing the change buffer.

The redo log related part will not be done. See ~~MDEV-14481~~ for details.

Marko Mäkelä added a comment - 2023-06-07 07:13 The change buffer related part of this was fixed in MDEV-29694 , by removing the change buffer. The redo log related part will not be done. See MDEV-14481 for details.

MariaDB Server

Allow innodb_read_only startup without prior slow shutdown

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration