  1. MariaDB Server
  2. MDEV-22718

trx_purge_truncate_history(): head.trx_no() >= purge_sys.low_limit_no()


Details

    • Bug
    • Status: In Progress
    • Major
    • Resolution: Unresolved
    • 5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5(EOL)
    • 13.1
    • None

    Description

      The if statement for this condition in trx_purge_truncate_history() reads:

        if (head.trx_no >= purge_sys.low_limit_no())
        {
          /* This is sometimes necessary. TODO: find out why. */
          head.trx_no= purge_sys.low_limit_no();
          head.undo_no= 0;
        }
      

      This is a fix-up for a situation in the purge system that we think should not happen. The fact that it does happen, and that we do not know why, is a risk that should be investigated: either fix the root cause, or establish that it is a legitimate scenario and document it.
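
      One way to investigate how often the fix-up fires is to isolate it in a helper that reports whether it had to adjust anything. The following is only a sketch under assumptions: purge_head_sketch and clamp_head() are hypothetical names for illustration and are not part of the InnoDB source.

```cpp
#include <cstdint>

// Hypothetical stand-in for the purge head position (not the InnoDB type).
struct purge_head_sketch
{
  std::uint64_t trx_no;
  std::uint64_t undo_no;
};

// Applies the same fix-up as the snippet in the description and reports
// whether it fired, so that a debug build could log or count occurrences.
inline bool clamp_head(purge_head_sketch &head, std::uint64_t low_limit_no)
{
  if (head.trx_no < low_limit_no)
    return false;             // expected case: head is below the limit
  head.trx_no= low_limit_no;  // unexplained case: pull head back to the limit
  head.undo_no= 0;
  return true;
}
```

      A caller could count the true results in a debug build to see which workloads trigger the adjustment. The sketch does not change what the fix-up does.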

      NOTE: The scope of this bug has shifted from the original description (kept below for reference). The current diagnosis of the original report is that there is no data race: the data accessed in the function is written by only one thread (the purge coordinator, holding the latch in exclusive mode), and that thread alone may legally read the data without holding any latch. The known violations of the assertion came from that thread, and at the time of writing there is no reason to believe that this causes any problems. The comment in the affected code has since been updated to reflect this.
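
      The access rule in this diagnosis can be written down as a relaxed assertion: the read is legal if the caller is the single writer thread, or if the caller holds the latch in shared mode. The following is a minimal sketch under assumptions, not the actual InnoDB code. purge_state_sketch and its members are hypothetical, std::shared_mutex stands in for the InnoDB rw-lock, and a thread_local counter is a crude substitute for the rw_lock_own() debug check.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical stand-in for the purge state; not the InnoDB classes.
class purge_state_sketch
{
  mutable std::shared_mutex latch;
  std::uint64_t limit_no= 0;             // written only by the coordinator
  const std::thread::id coordinator;     // the single writer thread
  // Debug-only bookkeeping of shared holds by the current thread.
  // It is per thread, not per object, which is enough for this sketch.
  static thread_local int shared_holds;

public:
  explicit purge_state_sketch(std::thread::id writer) : coordinator(writer) {}

  bool is_coordinator() const
  { return std::this_thread::get_id() == coordinator; }

  // Writer path: only the coordinator, holding the latch exclusively.
  void set_low_limit_no(std::uint64_t no)
  {
    assert(is_coordinator());
    std::unique_lock<std::shared_mutex> x(latch);
    limit_no= no;
  }

  // Relaxed check: the coordinator may read unlatched, because no other
  // thread ever writes; every other caller must hold the shared latch.
  std::uint64_t low_limit_no() const
  {
    assert(is_coordinator() || shared_holds > 0);
    return limit_no;
  }

  // Reader path for any thread: take the latch in shared mode first.
  std::uint64_t low_limit_no_latched() const
  {
    std::shared_lock<std::shared_mutex> s(latch);
    ++shared_holds;
    const std::uint64_t no= low_limit_no();
    --shared_holds;
    return no;
  }
};

thread_local int purge_state_sketch::shared_holds= 0;
```

      The unlatched read is free of data races only as long as the coordinator remains the sole writer. If a design change adds a second writer, the first half of the assertion no longer justifies the read.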

      The mention of MDEV-16260 in the comments, about changing the design and getting rid of the purge coordinator, should be treated with care. That is a separate issue: at the time of writing, its design is not final, and it is not known whether there will be a purge coordinator or which threads will access which data. When implementing MDEV-16260, the developer should take care not to introduce data races and should pay particular attention to this code (low_limit_no()).

      ORIGINAL DESCRIPTION:

      Check if this is fixable and re-enable this assertion:

        /** A wrapper around ReadView::low_limit_no(). */
        trx_id_t low_limit_no() const
        {
      #if 0 /* Unfortunately we don't hold this assertion, see MDEV-22718. */
          ut_ad(rw_lock_own(&latch, RW_LOCK_S));
      #endif
          return view.low_limit_no();
        }
      

            People

              andrzej.jarzabek Andrzej Jarzabek
              svoj Sergey Vojtovich