Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 10.2.6, 10.2.17, 10.3(EOL)
- None
- Ubuntu 14.04.4 LTS
Description
Hello,
I've been using MariaDB 10.2.6 with slave_run_triggers_for_rbr for some aggregation tables.
However, there's a big memory leak with this setup and the server regularly hangs.
The last time it hung, InnoDB crash recovery could not start because MLOG_CHECKPOINT was not found.
My server has a BBWC, so any fsync should be persisted to disk. I don't understand how that could happen.
Issue Links
- duplicates
  - MDEV-17680 innodb.undo_truncate_recover fails in buildbot with Missing MLOG_CHECKPOINT (Closed)
- relates to
  - MDEV-15282 innodb.autoinc_persist failed in buildbot, Assertion failed: recv_sys->mlog_checkpoint_lsn <= recv_sys->recovered_lsn (Closed)
  - MDEV-18038 Assertion failure in innodb.undo_truncate_recover: "pad_len >= len || i * 512U >= len - pad_len || log_block_get_hdr_no( buf + i * 512U) == log_block_get_hdr_no(buf) + i" (Closed)
  - MDEV-13830 Assertion failed: recv_sys->mlog_checkpoint_lsn <= recv_sys->recovered_lsn (Closed)
  - MDEV-19346 Assertion `recv_sys->mlog_checkpoint_lsn <= recv_sys->recovered_lsn' failed in recv_parse_log_recs during mariabackup --prepare (Closed)
kevg, that is interesting. Could it be that we sometimes update last_checkpoint_lsn before the corresponding write to the checkpoint header becomes persistent? Could it be that fil_aio_wait() is invoked on the redo log pseudo-tablespace for something other than the latest pending checkpoint write? After all, we do allow multiple log_checkpoint() executions in parallel, interleaved with each other.
Could you add an assertion to log_complete_checkpoint() or its call path, to verify that the completed write was for the expected checkpoint?
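To illustrate the kind of check being asked for, here is a minimal toy model in C++. It is not InnoDB code: `CheckpointTracker`, `begin_checkpoint()` and `complete_checkpoint()` are hypothetical names, and the sketch assumes that checkpoint header writes are expected to complete in the order they were submitted. The point is only that last_checkpoint_lsn is advanced when, and only when, the completed write matches the checkpoint that was expected to finish.

```cpp
#include <cstdint>
#include <deque>

using lsn_t = std::uint64_t;

// Hypothetical toy model of checkpoint bookkeeping; not the InnoDB implementation.
struct CheckpointTracker {
    // Advanced only after the matching header write has completed.
    lsn_t last_checkpoint_lsn = 0;

    // LSNs of checkpoint header writes still in flight, in submission order.
    std::deque<lsn_t> pending;

    // Record that a checkpoint header write for `lsn` has been submitted.
    void begin_checkpoint(lsn_t lsn) { pending.push_back(lsn); }

    // Called when an I/O completion is reported for a header write.
    // Returns false (and changes nothing) if the completed write is not the
    // oldest pending checkpoint. In a debug build this is where an
    // assertion would fire.
    bool complete_checkpoint(lsn_t written_lsn) {
        if (pending.empty() || pending.front() != written_lsn) {
            return false;
        }
        pending.pop_front();
        last_checkpoint_lsn = written_lsn;
        return true;
    }
};
```

With checkpoints begun at LSN 100 and then 200, a completion reported for 200 first is rejected and last_checkpoint_lsn stays at 0; only after the write for 100 completes does it advance. Whether the real code should expect the oldest or the latest pending write is exactly the open question in the comment above.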
Note: The redo log is not being extended while the server is running. Only the redo log write buffer is extended during this in 10.2 and 10.3. In 10.4, less redo log is being written thanks to the MLOG_MEMSET record (MDEV-17138).