[MDEV-14481] Execute InnoDB crash recovery in the background - Jira

Details

Type: Task
Status: Closed
Priority: Major
Resolution: Won't Do
Fix Version/s: N/A
Component/s: Storage Engine - InnoDB
Labels:
- performance
- recovery

Description

InnoDB startup unnecessarily waits for recovered redo log records to be applied to the data files.

In fact, normally while the function trx_sys_init_at_db_start() is executing, the pages that it is requesting from the buffer pool will have any recovered redo log applied to them in the background.
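The on-demand behaviour described here can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not InnoDB code: `RecoveringBufferPool`, `RedoRecord`, `Page`, `buffer_record()` and `get()` are hypothetical names invented for the illustration. Redo records found during the log scan are only buffered, and they are applied to a page the first time that page is requested.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model; none of these types exist in InnoDB.
using page_id = std::uint32_t;
using lsn_t = std::uint64_t;

struct RedoRecord {
  lsn_t lsn;            // log sequence number of the change
  std::string payload;  // stands in for the logged page modification
};

struct Page {
  lsn_t page_lsn = 0;  // LSN of the newest change applied to this page
  std::string data;
};

class RecoveringBufferPool {
 public:
  // Called while scanning the redo log: remember the record, do not apply it.
  void buffer_record(page_id id, RedoRecord rec) {
    std::vector<RedoRecord>& queue = pending_[id];
    // This sketch assumes records for a page arrive in LSN order.
    assert(queue.empty() || queue.back().lsn <= rec.lsn);
    queue.push_back(std::move(rec));
  }

  // Page request: apply any buffered records before handing the page out.
  Page& get(page_id id) {
    Page& page = pages_[id];
    auto it = pending_.find(id);
    if (it != pending_.end()) {
      for (const RedoRecord& rec : it->second) {
        if (rec.lsn > page.page_lsn) {  // skip changes the page already has
          page.data += rec.payload;
          page.page_lsn = rec.lsn;
        }
      }
      pending_.erase(it);
    }
    return page;
  }

  // Number of pages that still have unapplied records.
  std::size_t pages_pending() const { return pending_.size(); }

 private:
  std::map<page_id, Page> pages_;
  std::map<page_id, std::vector<RedoRecord>> pending_;
};
```

In this model a caller that only touches a few pages during startup pays only for those pages; the remaining entries in `pending_` could be drained by a background task without blocking startup.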

Basically, we only need to remove or refactor some calls in the InnoDB server startup. Some of this was performed in ~~MDEV-19514~~ and ~~MDEV-21216~~.
Crash recovery would ‘complete’ when the next redo log checkpoint is written.

We should rewrite or remove ~~recv_recovery_from_checkpoint_finish()~~ recv_sys.apply(true) so that ~~it will not wait for any page flushing to complete~~ (already done in ~~MDEV-27022~~). While doing this, we must also ~~remove the buf_pool_t::flush_rbt~~ (removed in ~~MDEV-23399~~) and ~~use the normal flushing mechanism that strictly obeys the ARIES protocol for write-ahead logging~~ (implemented in ~~MDEV-24626~~).
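As a rough illustration of the write-ahead rule and of why recovery can be considered complete at the next checkpoint, here is a minimal sketch. It is a simplified model, not the InnoDB implementation: `DirtyPage`, `RedoLog`, `write_page()` and `checkpoint_lsn()` are hypothetical names, and the fields are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using lsn_t = std::uint64_t;

// Hypothetical model of a dirty page in the buffer pool.
struct DirtyPage {
  lsn_t oldest_modification;  // earliest change not yet written to the data file
  lsn_t newest_modification;  // latest change made to the page
};

// Hypothetical model of the redo log's durable position.
struct RedoLog {
  lsn_t durable_lsn = 0;  // everything up to this LSN is safely on disk

  // Stands in for writing and syncing the log up to `lsn`.
  void make_durable(lsn_t lsn) { durable_lsn = std::max(durable_lsn, lsn); }
};

// Write-ahead rule: the log covering a page's changes must be durable
// before the page itself may be written to its data file.
void write_page(const DirtyPage& page, RedoLog& log) {
  if (log.durable_lsn < page.newest_modification) {
    log.make_durable(page.newest_modification);
  }
  assert(log.durable_lsn >= page.newest_modification);
  // ... the actual page write would happen here ...
}

// A checkpoint may advance only to the oldest change that is still
// unwritten; with no dirty pages it may advance to the current LSN.
lsn_t checkpoint_lsn(const std::vector<DirtyPage>& dirty, lsn_t current_lsn) {
  lsn_t lsn = current_lsn;
  for (const DirtyPage& page : dirty) {
    lsn = std::min(lsn, page.oldest_modification);
  }
  return lsn;
}
```

Under this model, pages dirtied by applying recovered redo log are ordinary dirty pages. Once they have been written out and the checkpoint has advanced past the recovered log, nothing from the crash needs to be replayed again.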

Attachments

- 1.svg (457 kB, 2021-11-26 04:10)
- Screenshot from 2021-11-26 10-08-41.png (117 kB, 2021-11-26 04:10)

Issue Links

blocks

MDEV-12700 Allow innodb_read_only startup without prior slow shutdown

Closed

includes

MDEV-27022 Buffer pool is being flushed during recovery

Closed

is blocked by

MDEV-9843 InnoDB hangs on startup between "InnoDB: Apply batch completed" and "rollback segment(s) are active", various tests fail sporadically in buildbot on p8-rhel6-bintar-debug

Closed

MDEV-13564 TRUNCATE TABLE and undo tablespace truncation are not compatible with Mariabackup

Closed

MDEV-13869 MariaDB slow start

Closed

MDEV-19514 Defer change buffer merge until pages are requested

Closed

MDEV-21216 InnoDB performs dirty read of TRX_SYS page before crash recovery

Closed

MDEV-23399 10.5 performance regression with IO-bound tpcc

Closed

MDEV-24626 Remove synchronous write of page0 and flushing file during file creation

Closed

MDEV-27022 Buffer pool is being flushed during recovery

Closed

relates to

MDEV-12699 Improve crash recovery of corrupted data pages

Closed

MDEV-13542 Crashing on a corrupted page is unhelpful

Closed

MDEV-13564 TRUNCATE TABLE and undo tablespace truncation are not compatible with Mariabackup

Closed

MDEV-14935 Remove bogus conditions related to not redo-logging PAGE_MAX_TRX_ID changes

Closed

MDEV-18733 MariaDB slow start after crash recovery

Closed

MDEV-19229 Allow innodb_undo_tablespaces to be changed after database creation

Closed

MDEV-27610 Unnecessary wait in InnoDB crash recovery

Closed

MDEV-29911 InnoDB recovery and mariadb-backup --prepare fail to report detailed progress

Closed

MDEV-30069 InnoDB: Trying to write ... bytes at ... outside the bounds of the file ...

Closed

MDEV-9663 InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX, or !cursor->index->is_committed()

Closed

MDEV-14425 Change the InnoDB redo log format to reduce write amplification

Closed

MDEV-26326 MDEV-24626 (remove synchronous page0 write) seems to cause mariabackup to skip valid ibd file.

Closed

Activity

Marko Mäkelä created issue - 2017-11-23 09:05

Marko Mäkelä made changes - 2017-11-23 09:05

Field	Original Value	New Value
Link		This issue relates to ~~MDEV-14425~~ [ ~~MDEV-14425~~ ]

Marko Mäkelä made changes - 2018-01-12 14:54

Link

This issue relates to ~~MDEV-14935~~ [ ~~MDEV-14935~~ ]

Marko Mäkelä made changes - 2018-01-16 18:37

Link

This issue is blocked by ~~MDEV-9843~~ [ ~~MDEV-9843~~ ]

Marko Mäkelä made changes - 2018-02-13 06:45

Link

This issue is blocked by ~~MDEV-13869~~ [ ~~MDEV-13869~~ ]

Thirunarayanan Balathandayuthapani made changes - 2018-02-13 09:07

Assignee

Marko Mäkelä [ marko ]

Thirunarayanan B [ thiru ]

Marko Mäkelä made changes - 2018-03-09 10:10

Link

This issue relates to ~~MDEV-9663~~ [ ~~MDEV-9663~~ ]

Marko Mäkelä made changes - 2018-09-05 10:35

Link

This issue is blocked by ~~MDEV-13564~~ [ ~~MDEV-13564~~ ]

Marko Mäkelä made changes - 2019-02-04 16:17

Link

This issue relates to ~~MDEV-16526~~ [ ~~MDEV-16526~~ ]

Marko Mäkelä made changes - 2019-03-22 14:49

NRE Projects

RM_105_CANDIDATE

Marko Mäkelä made changes - 2019-04-05 17:59

Link

This issue relates to ~~MDEV-12699~~ [ ~~MDEV-12699~~ ]

Marko Mäkelä made changes - 2019-04-09 06:26

Link

This issue relates to ~~MDEV-11634~~ [ ~~MDEV-11634~~ ]

Marko Mäkelä made changes - 2019-04-09 08:52

Link

This issue relates to ~~MDEV-18733~~ [ ~~MDEV-18733~~ ]

Marko Mäkelä made changes - 2019-04-12 15:31

Link

This issue relates to ~~MDEV-19229~~ [ ~~MDEV-19229~~ ]

Marko Mäkelä made changes - 2019-04-12 15:31

Link

This issue relates to ~~MDEV-13564~~ [ ~~MDEV-13564~~ ]

Marko Mäkelä made changes - 2019-05-17 13:31

Link

This issue is blocked by ~~MDEV-19514~~ [ ~~MDEV-19514~~ ]

Marko Mäkelä made changes - 2019-05-17 13:32

Link

This issue relates to ~~MDEV-11634~~ [ ~~MDEV-11634~~ ]

Sergei Golubchik made changes - 2019-11-03 08:35

Fix Version/s		10.5 [ 23123 ]
Fix Version/s	10.4 [ 22408 ]

Marko Mäkelä made changes - 2019-11-05 07:56

Link

This issue blocks ~~MDEV-12700~~ [ ~~MDEV-12700~~ ]

Marko Mäkelä made changes - 2019-11-28 11:40

Link

This issue is blocked by ~~MDEV-16526~~ [ ~~MDEV-16526~~ ]

Thirunarayanan Balathandayuthapani made changes - 2019-12-03 11:09

Status

Open [ 1 ]

In Progress [ 3 ]

Marko Mäkelä made changes - 2019-12-04 13:01

Link

This issue is blocked by ~~MDEV-21216~~ [ ~~MDEV-21216~~ ]

Marko Mäkelä made changes - 2019-12-04 13:46

Description

InnoDB startup unnecessarily waits for recovered redo log records to be applied to the data files.

In fact, normally while the function trx_sys_init_at_db_start() is executing, the pages that it is requesting from the buffer pool will have any recovered redo log applied to them in the background.

Basically, we only need to remove or refactor some calls in the InnoDB server startup. Crash recovery would ‘complete’ when the next redo log checkpoint is written.

One of the code pieces that must be removed is this one:
{code:diff}
commit 5103b38fc3cbc73d4a66a1b21740c4d7498f4a67
Author: Marko Mäkelä <marko.makela@mariadb.com>
Date: Wed Nov 15 06:51:19 2017 +0200

    Remove one more trace of innodb_file_format

    innobase_start_or_create_for_mysql(): Remove an unnecessary call
    that should have been removed as part of
    commit 0c92794db3026cda03218caf4918b996baab6ba6
    when the dirty read of the innodb_file_format tag was removed.

diff --git a/storage/innobase/srv/srv0start.cc b/storage/innobase/srv/srv0start.cc
index 988ddf1a759..4f8823ef36c 100644
--- a/storage/innobase/srv/srv0start.cc
+++ b/storage/innobase/srv/srv0start.cc
@@ -2194,13 +2194,6 @@ innobase_start_or_create_for_mysql()
         return(srv_init_abort(err));
     }
 } else {
-    /* Invalidate the buffer pool to ensure that we reread
-    the page that we read above, during recovery.
-    Note that this is not as heavy weight as it seems. At
-    this point there will be only ONE page in the buf_LRU
-    and there must be no page in the buf_flush list. */
-    buf_pool_invalidate();
-
     /* Scan and locate truncate log files. Parsed located files
     and add table to truncate information to central vector for
     truncate fix-up action post recovery. */
{code}
However, if we apply this patch, crash recovery will start failing. The reason could be that we are accessing the InnoDB system and undo tablespace files before opening the redo logs:
{code:c}
fil_open_log_and_system_tablespace_files();
ut_d(fil_space_get(0)->recv_size = srv_sys_space_size_debug);

err = srv_undo_tablespaces_init(create_new_db);
…
err = recv_recovery_from_checkpoint_start(flushed_lsn);
{code}
If we first open the redo log files and then the persistent data files, then there should be no need to invalidate the buffer pool contents.

Similarly, later in the startup, we should rewrite or remove recv_recovery_from_checkpoint_finish() so that it will not wait for any page flushing to complete. While doing this, we must also remove the buf_pool_t::flush_rbt and use the normal flushing mechanism that strictly obeys the ARIES protocol for write-ahead logging.

InnoDB startup unnecessarily waits for recovered redo log records to be applied to the data files.

In fact, normally while the function {{trx_sys_init_at_db_start()}} is executing, the pages that it is requesting from the buffer pool will have any recovered redo log applied to them in the background.

Basically, we only need to remove or refactor some calls in the InnoDB server startup. Some of this was performed in ~~MDEV-19514~~ and ~~MDEV-21216~~.
Crash recovery would ‘complete’ when the next redo log checkpoint is written.

We should rewrite or remove {{recv_recovery_from_checkpoint_finish()}} so that it will not wait for any page flushing to complete. While doing this, we must also remove the {{buf_pool_t::flush_rbt}} and use the normal flushing mechanism that strictly obeys the ARIES protocol for write-ahead logging.

Marko Mäkelä made changes - 2019-12-04 14:01

Status

In Progress [ 3 ]

Stalled [ 10000 ]

Marko Mäkelä made changes - 2019-12-31 09:29

Link

This issue relates to ~~MDEV-16526~~ [ ~~MDEV-16526~~ ]

Julien Fritsch made changes - 2020-02-10 11:10

Due Date

2020-03-31

Sergei Golubchik made changes - 2020-02-18 14:19

Fix Version/s		10.6 [ 24028 ]
Fix Version/s	10.5 [ 23123 ]

Marko Mäkelä made changes - 2020-02-26 08:09

Link

This issue relates to ~~MDEV-13542~~ [ ~~MDEV-13542~~ ]

Ralf Gebhardt made changes - 2020-03-27 17:15

Due Date

2020-03-31

Sergei Golubchik made changes - 2020-08-16 20:02

Rank

Ranked higher

Marko Mäkelä made changes - 2020-08-26 15:34

Link

This issue is blocked by ~~MDEV-23399~~ [ ~~MDEV-23399~~ ]

Marko Mäkelä made changes - 2020-10-28 14:38

Link

This issue is blocked by ~~MDEV-16526~~ [ ~~MDEV-16526~~ ]

Ralf Gebhardt made changes - 2021-02-26 14:40

Fix Version/s		N/A [ 14700 ]
Fix Version/s	10.6 [ 24028 ]

Julien Fritsch made changes - 2021-03-03 08:00

Fix Version/s		10.7 [ 24805 ]
Fix Version/s	N/A [ 14700 ]

Ralf Gebhardt made changes - 2021-10-25 12:40

Fix Version/s		10.8 [ 26121 ]
Fix Version/s	10.7 [ 24805 ]

Marko Mäkelä made changes - 2021-10-27 17:32

Assignee

Thirunarayanan Balathandayuthapani [ thiru ]

Eugene Kosov [ kevg ]

Sergei Golubchik made changes - 2021-10-29 13:43

Priority

Major [ 3 ]

Critical [ 2 ]

Eugene Kosov (Inactive) made changes - 2021-11-11 09:18

Link

This issue includes ~~MDEV-27022~~ [ ~~MDEV-27022~~ ]

Marko Mäkelä made changes - 2021-11-11 15:12

Link

This issue relates to ~~MDEV-26326~~ [ ~~MDEV-26326~~ ]

Eugene Kosov (Inactive) made changes - 2021-11-26 04:10

Attachment

Screenshot from 2021-11-26 10-08-41.png [ 60930 ]

Eugene Kosov (Inactive) made changes - 2021-11-26 04:10

Attachment

1.svg [ 60931 ]

Sergei Golubchik made changes - 2021-12-06 21:22

Workflow

MariaDB v3 [ 83949 ]

MariaDB v4 [ 131680 ]

Marko Mäkelä made changes - 2022-01-25 07:53

Link

This issue relates to ~~MDEV-27610~~ [ ~~MDEV-27610~~ ]

Sergei Golubchik made changes - 2022-02-01 11:32

Fix Version/s		10.9 [ 26905 ]
Fix Version/s	10.8 [ 26121 ]

Marko Mäkelä made changes - 2022-02-21 14:18

Assignee

Eugene Kosov [ kevg ]

Vladislav Lesin [ vlad.lesin ]

Sergei Golubchik made changes - 2022-03-15 19:40

Fix Version/s		10.10 [ 27530 ]
Fix Version/s	10.9 [ 26905 ]

Sergei Golubchik made changes - 2022-05-19 10:33

Priority

Critical [ 2 ]

Major [ 3 ]

Sergei Golubchik made changes - 2022-06-15 13:35

Fix Version/s		10.11 [ 27614 ]
Fix Version/s	10.10 [ 27530 ]

Sergei Golubchik made changes - 2022-08-02 19:58

Fix Version/s		10.12 [ 28320 ]
Fix Version/s	10.11 [ 27614 ]

AirFocus made changes - 2022-08-09 16:11

Description

Marko Mäkelä made changes - 2022-11-22 09:00

Link

This issue relates to ~~MDEV-30069~~ [ ~~MDEV-30069~~ ]

Vladislav Lesin made changes - 2023-02-09 11:37

Status

Stalled [ 10000 ]

In Progress [ 3 ]

Ralf Gebhardt made changes - 2023-02-09 12:16

Fix Version/s		11.1 [ 28549 ]
Fix Version/s	11.0 [ 28320 ]

Marko Mäkelä made changes - 2023-02-09 14:00

Link

This issue is blocked by ~~MDEV-24626~~ [ ~~MDEV-24626~~ ]

Marko Mäkelä made changes - 2023-02-09 14:00

Link

This issue is blocked by ~~MDEV-27022~~ [ ~~MDEV-27022~~ ]

Marko Mäkelä made changes - 2023-02-09 14:00

Description

InnoDB startup unnecessarily waits for recovered redo log records to be applied to the data files.

In fact, normally while the function {{trx_sys_init_at_db_start()}} is executing, the pages that it is requesting from the buffer pool will have any recovered redo log applied to them in the background.

Basically, we only need to remove or refactor some calls in the InnoDB server startup. Some of this was performed in ~~MDEV-19514~~ and ~~MDEV-21216~~.
Crash recovery would ‘complete’ when the next redo log checkpoint is written.

We should rewrite or remove -{{recv_recovery_from_checkpoint_finish()}}- {{recv_sys.apply(true)}} so that -it will not wait for any page flushing to complete- (already done in ~~MDEV-27022~~). While doing this, we must also -remove the {{buf_pool_t::flush_rbt}}- (removed in ~~MDEV-23399~~) and -use the normal flushing mechanism that strictly obeys the ARIES protocol for write\-ahead logging- (implemented in ~~MDEV-24626~~).

Marko Mäkelä made changes - 2023-04-12 14:02

Resolution Date

2023-04-12 14:02:22.0

2023-04-12 14:02:22.471

Marko Mäkelä made changes - 2023-04-12 14:02

Fix Version/s		N/A [ 14700 ]
Fix Version/s	11.1 [ 28549 ]
Assignee	Vladislav Lesin [ vlad.lesin ]	Marko Mäkelä [ marko ]
Resolution		Won't Do [ 10201 ]
Status	In Progress [ 3 ]	Closed [ 6 ]

Marko Mäkelä made changes - 2023-04-12 14:04

Link

This issue relates to ~~MDEV-29911~~ [ ~~MDEV-29911~~ ]

People

Assignee: Marko Mäkelä

Reporter: Marko Mäkelä

Votes: 2

Watchers: 14

Dates

Created: 2017-11-23 09:05

Updated: 2023-04-13 10:57

Resolved: 2023-04-12 14:02
