[MDEV-22718] trx_purge_truncate_history(): head.trx_no() >= purge_sys.low_limit_no()

Details

Type: Bug
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5(EOL)
Fix Version/s: 13.1
Component/s: Storage Engine - InnoDB
Labels: None
Epic Link: InnoDB trx_sys improvements

Description

trx_purge_truncate_history() contains the following fix-up for the condition named in the title:

  if (head.trx_no >= purge_sys.low_limit_no())
  {
    /* This is sometimes necessary. TODO: find out why. */
    head.trx_no= purge_sys.low_limit_no();
    head.undo_no= 0;
  }

This fix-up addresses a situation in the purge system that we think should not happen. The fact that it does happen, and that we do not know why, is a risk that should be investigated, either to fix the root cause or to understand it as a legitimate scenario and document it.
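
One way to start such an investigation is to make the fix-up observable. The following is a minimal sketch, not InnoDB code: `purge_iterator` and `clamp_head()` are hypothetical stand-ins, and `trx_id_t` is modeled as a plain 64-bit integer. It applies the same clamp as the snippet above and reports whether it was needed, so that a caller could count or log the occurrences.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the InnoDB types; not the real definitions.
using trx_id_t = std::uint64_t;

struct purge_iterator
{
  trx_id_t trx_no;
  std::uint64_t undo_no;
};

// Clamp head to low_limit_no, mirroring the fix-up in the description.
// Returns true if the fix-up had to be applied, so that the caller can
// count or log how often the unexpected state arises.
inline bool clamp_head(purge_iterator &head, trx_id_t low_limit_no)
{
  if (head.trx_no < low_limit_no)
    return false;              // expected case: nothing to fix
  head.trx_no= low_limit_no;
  head.undo_no= 0;
  return true;
}
```

A debug build could, for example, increment a counter whenever `clamp_head()` returns true and record the call site, which would show whether the condition is rare or routine.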

NOTE: The bug's scope has shifted from the original description (left below for reference). The current diagnosis of the original issue is that there is no data race: the data accessed in the function is written by only one thread (the purge coordinator, holding the latch in exclusive mode), and that thread alone may legally read the data without holding any latch. The known cases where the assertion was violated were all from that thread, and at the time of writing there is no reason to believe that this causes any problems. The comment in the affected code has since been updated to reflect this.
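
The single-writer reasoning in this note can be illustrated with a small model. This is a sketch under stated assumptions, not the InnoDB implementation: `purge_view_model` and all of its members are hypothetical names, and `std::shared_mutex` (C++17) stands in for the InnoDB latch. The writer thread updates the value under an exclusive lock; because no other thread ever writes, the writer may read without a lock, while every other thread must take a shared lock.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <thread>

using trx_id_t = std::uint64_t;  // assumption: modeled as a 64-bit integer

// Hypothetical model of a value with exactly one writer thread.
class purge_view_model
{
  mutable std::shared_mutex latch;  // stands in for the InnoDB latch
  std::thread::id writer;           // the only thread allowed to write
  trx_id_t limit= 0;
public:
  explicit purge_view_model(std::thread::id w) : writer(w) {}

  bool is_writer() const { return std::this_thread::get_id() == writer; }

  // Writes always happen on the writer thread, under the exclusive lock.
  void set_limit(trx_id_t no)
  {
    assert(is_writer());
    std::unique_lock<std::shared_mutex> lock(latch);
    limit= no;
  }

  // Unlatched read: safe only on the writer thread, because no other
  // thread can modify the value concurrently.
  trx_id_t low_limit_no_unlatched() const
  {
    assert(is_writer());
    return limit;
  }

  // Read for any other thread: the shared lock excludes the writer.
  trx_id_t low_limit_no_shared() const
  {
    std::shared_lock<std::shared_mutex> lock(latch);
    return limit;
  }
};
```

In this model, the disabled assertion from the original description corresponds to requiring the shared lock on every read. A relaxed form would accept either a held latch or `is_writer()`, which is the invariant this note describes.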

The mention in the comments of MDEV-16260 changing the design and getting rid of the purge coordinator should be treated with care. This is a separate issue: at the time of writing, the design for MDEV-16260 is not final, and it is not known whether there will be a purge coordinator or which threads will access which data. When implementing MDEV-16260, the developer should take care not to introduce data races and should pay particular attention to this code (low_limit_no()).

ORIGINAL DESCRIPTION:

Check if this is fixable and re-enable this assertion:

  /** A wrapper around ReadView::low_limit_no(). */
  trx_id_t low_limit_no() const
  {
#if 0 /* Unfortunately we don't hold this assertion, see MDEV-22718. */
    ut_ad(rw_lock_own(&latch, RW_LOCK_S));
#endif
    return view.low_limit_no();
  }

Issue Links

relates to

MDEV-16260 Scale the purge effort according to the workload (Open)
MDEV-30671 innodb_undo_log_truncate=ON fails to wait for purge of transaction history (Closed)
MDEV-35227 Executing CHECK TABLE...EXTENDED right after server startup may attempt to access too old history (Confirmed)
MDEV-36845 InnoDB: Failing assertion: tail.trx_no <= last_trx_no (Closed)
MDEV-31234 InnoDB does not free UNDO after the fix of MDEV-30671, thus shared tablespace (ibdata1) may grow indefinitely for no good reason (Closed)

People

Assignee: Andrzej Jarzabek

Reporter: Sergey Vojtovich

Votes: 1

Watchers: 5

Dates

Created: 2020-05-26 13:09

Updated: 2026-03-30 11:03
