[MDEV-36288] sequences are not quite durable in the presence of crashes

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.5, 10.6, 10.11, 11.4, 11.8
Fix Version/s: 10.11, 11.4, 11.8
Component/s: Sequences, Storage Engine - InnoDB
Labels:
None

Description

This test case

--source include/have_innodb.inc

create sequence t1 engine=innodb;

select next value for t1;

select next value for t1;

--source include/kill_mysqld.inc

--source include/start_mysqld.inc

select next value for t1;

drop sequence t1;

shows

create sequence t1 engine=innodb;

select next value for t1;

next value for t1

select next value for t1;

next value for t1

# Kill the server

# restart

select next value for t1;

next value for t1

drop sequence t1;

That is, sequence updates aren't always durable.

Attachments

Activity

Marko Mäkelä added a comment - 2025-03-14 07:43

InnoDB tries to defer log writes as much as possible, so that the group commit would work as efficiently as possible. If the log of every mini-transaction were written immediately from log_sys.buf to ib_logfile0, there would be frequent overwrites of the last physical block of the log file.
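The cost of writing every mini-transaction immediately can be illustrated with a small model. This is only a sketch under stated assumptions, not InnoDB code: `kBlockSize` and `block_writes()` are hypothetical names, and the 512-byte block size is chosen for illustration. It counts how many physical block writes are needed when the log is flushed after every `batch` records. A partially filled last block has to be written again by the next flush.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block size for this model only; the real log layout
// depends on the server version and configuration.
constexpr std::size_t kBlockSize = 512;

// Count the physical block writes needed to persist a series of log
// records when the buffer is flushed after every `batch` records.
// A flush that ends inside a block forces the next flush to rewrite it.
std::size_t block_writes(const std::vector<std::size_t>& record_sizes,
                         std::size_t batch)
{
  assert(batch > 0);
  std::size_t written = 0;   // bytes already persisted
  std::size_t buffered = 0;  // bytes waiting in the in-memory buffer
  std::size_t pending = 0;   // records added since the last flush
  std::size_t writes = 0;    // physical block writes so far

  auto flush = [&] {
    pending = 0;
    if (buffered == 0) return;
    const std::size_t first_block = written / kBlockSize;
    const std::size_t last_block = (written + buffered - 1) / kBlockSize;
    writes += last_block - first_block + 1;
    written += buffered;
    buffered = 0;
  };

  for (std::size_t size : record_sizes) {
    buffered += size;
    if (++pending == batch) flush();
  }
  flush();  // persist whatever is left in the buffer
  return writes;
}
```

With 64 records of 32 bytes each, flushing after every record issues 64 block writes in this model, most of them rewriting the same block, while a single deferred flush issues 4.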

If a transaction does not have any pending undo log records in trx_t::commit(), normally it means that the transaction had been rolled back. We do not want to durably write log for such transactions. If the server were killed and after crash recovery we observed that the transaction was not committed (also rolled-back transactions will be committed after undoing all changes), then we would roll back the transaction in the background, implicitly holding exclusive locks on the modified rows until the rollback is completed.

If we wanted to make any operations on sequences durable, we could introduce a flag in trx_t that would be set to indicate that even if there are no undo log records associated with the transaction, we would make the “empty” commit durable.

I think that we must consider replication as well. How would these changes be replicated, and what is the expected result? Can you or perhaps knielsen help write a test that covers the replication-related aspects of this?

Marko Mäkelä added a comment - 2025-03-14 08:09

If we introduced a flag to make “empty” transactions durable, it seems that we should set this flag also when an XA distributed transaction ID is set, and revise the condition:

diff --git a/storage/innobase/trx/trx0trx.cc b/storage/innobase/trx/trx0trx.cc
index ade8dea929a..04f12fbf802 100644
--- a/storage/innobase/trx/trx0trx.cc
+++ b/storage/innobase/trx/trx0trx.cc
@@ -1468,7 +1468,7 @@ TRANSACTIONAL_INLINE inline void trx_t::commit_in_memory(const mtr_t *mtr)
     serialize all commits and prevent a group of transactions from
     gathering. */
-    commit_lsn= undo_no || !xid.is_null() ? mtr->commit_lsn() : 0;
+    commit_lsn= undo_no || durable_when_empty ? mtr->commit_lsn() : 0;
     if (commit_lsn && !flush_log_later && srv_flush_log_at_trx_commit)
     {
       trx_flush_log_if_needed(commit_lsn, this);

serg, I think that this needs to be assigned to you, because you are working on an alternative fix of ~~MDEV-35813~~ based on this idea.
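The effect of the revised condition can be sketched with a standalone model. This is not the actual `trx_t`: `trx_model`, `set_xid()` and `mark_sequence_write()` are hypothetical names used only for illustration, and `durable_when_empty` is the proposed flag, which does not exist yet. In both versions the `||` binds more tightly than `?:`, so the whole disjunction selects between the mini-transaction's commit LSN and 0.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Simplified stand-in for the trx_t fields that the condition reads.
struct trx_model {
  std::uint64_t undo_no = 0;        // number of undo log records written
  bool xid_set = false;             // stands in for !xid.is_null()
  bool durable_when_empty = false;  // proposed flag (assumption)
};

// Condition before the change: only transactions with undo log records
// or an XA transaction ID get a nonzero commit LSN.
lsn_t commit_lsn_current(const trx_model& trx, lsn_t mtr_commit_lsn)
{
  return (trx.undo_no || trx.xid_set) ? mtr_commit_lsn : 0;
}

// Condition after the change: the flag replaces the XID check.
lsn_t commit_lsn_proposed(const trx_model& trx, lsn_t mtr_commit_lsn)
{
  return (trx.undo_no || trx.durable_when_empty) ? mtr_commit_lsn : 0;
}

// Hypothetical places where the flag would be set.
void set_xid(trx_model& trx)
{
  trx.xid_set = true;
  trx.durable_when_empty = true;  // keeps XA commits durable as before
}

void mark_sequence_write(trx_model& trx)
{
  trx.durable_when_empty = true;  // a sequence update writes no undo log
}
```

In this model a transaction that only advanced a sequence gets a commit LSN of 0 under the current condition, so the log flush is skipped, and a nonzero one under the proposed condition once `mark_sequence_write()` has been called. A transaction with neither undo records nor the flag still gets 0 in both versions.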

People

Assignee: Sergei Golubchik

Reporter: Sergei Golubchik

Votes: 0

Watchers: 3

Dates

Created: 2025-03-13 14:06

Updated: 2025-03-14 08:09

MariaDB Server