Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-22718

InnoDB: purge_sys.low_limit_no() is not protected

Details

    • Bug
    • Status: Stalled (View Workflow)
    • Critical
    • Resolution: Unresolved
    • 5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
    • 10.5
    • None

    Description

      Check if this is fixable and re-enable this assertion:

        /** A wrapper around ReadView::low_limit_no(). */
        trx_id_t low_limit_no() const
        {
      #if 0 /* Unfortunately we don't hold this assertion, see MDEV-22718. */
          ut_ad(rw_lock_own(&latch, RW_LOCK_S));
      #endif
          return view.low_limit_no();
        }
      

      Attachments

        Issue Links

          Activity

            I debugged the server execution corresponding to the following section of innodb.instant_alter_purge:

            connection prevent_purge;
            COMMIT;
            START TRANSACTION WITH CONSISTENT SNAPSHOT;
            connection default;
             
            ALTER TABLE t1 ADD COLUMN extra TINYINT UNSIGNED NOT NULL DEFAULT 42;
            let $wait_all_purged= 1;
            --source include/wait_all_purged.inc
            

            The assertion would fail during the wait. The purge coordinator would be invoked after the ALTER TABLE transaction commit invoked trx_purge_add_undo_to_history(), which invoked rseg->set_last_commit(86,39). The purge coordinator would first update purge_sys.view.m_low_limit_no to 40 (corresponding to the time after the ALTER TABLE) and then adjust it to 36 corresponding to the read view of the older started transaction. Shortly after this, the assertion failed within the same invocation of purge_coordinator_callback(). The state would be as follows:

            (rr) p purge_sys.tail
            $10 = {trx_no = 39, undo_no = 0}
            (rr) p purge_sys.head
            $11 = {trx_no = 39, undo_no = 0}
            (rr) p purge_sys.view
            $12 = {m_low_limit_id = 36, m_up_limit_id = 36, 
              m_ids = std::vector of length 0, capacity 32, m_low_limit_no = 36}
            

            Simply put, TrxUndoRsegsIterator::set_next() is ‘overshooting’ the purge_sys.view when updating purge_sys.tail. I tried the following patch:

            diff --git a/storage/innobase/trx/trx0purge.cc b/storage/innobase/trx/trx0purge.cc
            index c0fafe1ec6b..8ee1bd688cd 100644
            --- a/storage/innobase/trx/trx0purge.cc
            +++ b/storage/innobase/trx/trx0purge.cc
            @@ -110,7 +110,7 @@ TRANSACTIONAL_INLINE inline bool TrxUndoRsegsIterator::set_next()
             	ut_ad(purge_sys.rseg->space->id == TRX_SYS_SPACE
             	      || srv_is_undo_tablespace(purge_sys.rseg->space->id));
             
            -	trx_id_t last_trx_no, tail_trx_no;
            +	trx_id_t last_trx_no;
             	{
             #ifdef SUX_LOCK_GENERIC
             		purge_sys.rseg->latch.rd_lock(SRW_LOCK_CALL);
            @@ -119,9 +119,7 @@ TRANSACTIONAL_INLINE inline bool TrxUndoRsegsIterator::set_next()
             			{purge_sys.rseg->latch};
             #endif
             		last_trx_no = purge_sys.rseg->last_trx_no();
            -		tail_trx_no = purge_sys.tail.trx_no;
             
            -		purge_sys.tail.trx_no = last_trx_no;
             		purge_sys.hdr_offset = purge_sys.rseg->last_offset();
             		purge_sys.hdr_page_no = purge_sys.rseg->last_page_no;
             
            @@ -131,10 +129,14 @@ TRANSACTIONAL_INLINE inline bool TrxUndoRsegsIterator::set_next()
             	}
             
             	/* Only the purge coordinator task will access
            -	purge_sys.rseg_iter or purge_sys.hdr_page_no. */
            +	purge_sys.rseg_iter, purge_sys.hdr_page_no,
            +	or modify purge_sys.view. */
             	ut_ad(last_trx_no == m_rsegs.trx_no);
             	ut_a(purge_sys.hdr_page_no != FIL_NULL);
            -	ut_a(tail_trx_no <= last_trx_no);
            +	ut_a(purge_sys.tail.trx_no <= last_trx_no);
            +	if (last_trx_no <= purge_sys.low_limit_no()) {
            +		purge_sys.tail.trx_no = last_trx_no;
            +	}
             
             	return(true);
             }
            @@ -607,6 +609,9 @@ TRANSACTIONAL_TARGET static void trx_purge_truncate_history()
             
               if (head.trx_no >= purge_sys.low_limit_no())
               {
            +    ut_a(head.undo_no
            +         ? head.trx_no < purge_sys.low_limit_no()
            +         : head.trx_no == purge_sys.low_limit_no());
                 /* This is sometimes necessary. TODO: find out why. */
                 head.trx_no= purge_sys.low_limit_no();
                 head.undo_no= 0;
            

            Not only would the added assertion in trx_purge_truncate_history() fail in the test innodb.instant_alter_bugs, but the test innodb.innodb-isolation would show result differences.

            vlad.lesin, since you are familiar with read views and purge, do you have any suggestions? The options are to replace the FIXME comment with a good explanation, or to adjust the code so that no adjustment in trx_purge_truncate_history() would be needed.

            marko Marko Mäkelä added a comment - I debugged the server execution corresponding to the following section of innodb.instant_alter_purge : connection prevent_purge; COMMIT ; START TRANSACTION WITH CONSISTENT SNAPSHOT; connection default ;   ALTER TABLE t1 ADD COLUMN extra TINYINT UNSIGNED NOT NULL DEFAULT 42; let $wait_all_purged= 1; --source include/wait_all_purged.inc The assertion would fail during the wait. The purge coordinator would be invoked after the ALTER TABLE transaction commit invoked trx_purge_add_undo_to_history() , which invoked rseg->set_last_commit(86,39) . The purge coordinator would first update purge_sys.view.m_low_limit_no to 40 (corresponding to the time after the ALTER TABLE ) and then adjust it to 36 corresponding to the read view of the older started transaction. Shortly after this, the assertion failed within the same invocation of purge_coordinator_callback() . The state would be as follows: (rr) p purge_sys.tail $10 = {trx_no = 39, undo_no = 0} (rr) p purge_sys.head $11 = {trx_no = 39, undo_no = 0} (rr) p purge_sys.view $12 = {m_low_limit_id = 36, m_up_limit_id = 36, m_ids = std::vector of length 0, capacity 32, m_low_limit_no = 36} Simply put, TrxUndoRsegsIterator::set_next() is ‘overshooting’ the purge_sys.view when updating purge_sys.tail . I tried the following patch: diff --git a/storage/innobase/trx/trx0purge.cc b/storage/innobase/trx/trx0purge.cc index c0fafe1ec6b..8ee1bd688cd 100644 --- a/storage/innobase/trx/trx0purge.cc +++ b/storage/innobase/trx/trx0purge.cc @@ -110,7 +110,7 @@ TRANSACTIONAL_INLINE inline bool TrxUndoRsegsIterator::set_next() ut_ad(purge_sys.rseg->space->id == TRX_SYS_SPACE || srv_is_undo_tablespace(purge_sys.rseg->space->id)); - trx_id_t last_trx_no, tail_trx_no; + trx_id_t last_trx_no; { #ifdef SUX_LOCK_GENERIC purge_sys.rseg->latch.rd_lock(SRW_LOCK_CALL); @@ -119,9 +119,7 @@ TRANSACTIONAL_INLINE inline bool TrxUndoRsegsIterator::set_next() {purge_sys.rseg->latch}; #endif last_trx_no = purge_sys.rseg->last_trx_no(); - tail_trx_no = purge_sys.tail.trx_no; - purge_sys.tail.trx_no = last_trx_no; purge_sys.hdr_offset = purge_sys.rseg->last_offset(); purge_sys.hdr_page_no = purge_sys.rseg->last_page_no; @@ -131,10 +129,14 @@ TRANSACTIONAL_INLINE inline bool TrxUndoRsegsIterator::set_next() } /* Only the purge coordinator task will access - purge_sys.rseg_iter or purge_sys.hdr_page_no. */ + purge_sys.rseg_iter, purge_sys.hdr_page_no, + or modify purge_sys.view. */ ut_ad(last_trx_no == m_rsegs.trx_no); ut_a(purge_sys.hdr_page_no != FIL_NULL); - ut_a(tail_trx_no <= last_trx_no); + ut_a(purge_sys.tail.trx_no <= last_trx_no); + if (last_trx_no <= purge_sys.low_limit_no()) { + purge_sys.tail.trx_no = last_trx_no; + } return(true); } @@ -607,6 +609,9 @@ TRANSACTIONAL_TARGET static void trx_purge_truncate_history() if (head.trx_no >= purge_sys.low_limit_no()) { + ut_a(head.undo_no + ? head.trx_no < purge_sys.low_limit_no() + : head.trx_no == purge_sys.low_limit_no()); /* This is sometimes necessary. TODO: find out why. */ head.trx_no= purge_sys.low_limit_no(); head.undo_no= 0; Not only would the added assertion in trx_purge_truncate_history() fail in the test innodb.instant_alter_bugs , but the test innodb.innodb-isolation would show result differences. vlad.lesin , since you are familiar with read views and purge, do you have any suggestions? The options are to replace the FIXME comment with a good explanation, or to adjust the code so that no adjustment in trx_purge_truncate_history() would be needed.

            The ‘overshooting’ might be related to innodb_purge_batch_size. In MDEV-24402 I had to tweak the purge_sys.end_view creation so that it will not be a copy of purge_sys.view, but capped by purge_sys.head.trx_no or purge_sys.tail.trx_no. Few mtr tests that would invoke CHECK TABLE are generating enough undo logs so that the default innodb_purge_batch_size=300 would start to matter. A few tests for partitioned tables do that.

            marko Marko Mäkelä added a comment - The ‘overshooting’ might be related to innodb_purge_batch_size . In MDEV-24402 I had to tweak the purge_sys.end_view creation so that it will not be a copy of purge_sys.view , but capped by purge_sys.head.trx_no or purge_sys.tail.trx_no . Few mtr tests that would invoke CHECK TABLE are generating enough undo logs so that the default innodb_purge_batch_size=300 would start to matter. A few tests for partitioned tables do that.

            In MDEV-24402, I found out that it is not easy to reconstruct the state of purge after server restart. The tweaks might also be related to that.

            marko Marko Mäkelä added a comment - In MDEV-24402 , I found out that it is not easy to reconstruct the state of purge after server restart. The tweaks might also be related to that.
            marko Marko Mäkelä added a comment - - edited

            The MDEV-24402 CHECK TABLE…EXTENDED sometimes produces warnings when enabling innodb_undo_log_truncate. I observed it for 2 tests:

            parts.partition_alter4_innodb 'innodb'   w13 [ fail ]
                    Test ended at 2022-12-08 11:12:16
             
            CURRENT_TEST: parts.partition_alter4_innodb
            --- /home/buildbot/maria-slave/fulltest-debug-big-only/build/mysql-test/suite/parts/r/partition_alter4_innodb.result	2022-12-08 10:44:53.105837636 +0000
            +++ /home/buildbot/maria-slave/fulltest-debug-big-only/build/mysql-test/suite/parts/r/partition_alter4_innodb.reject	2022-12-08 11:12:15.246208692 +0000
            @@ -42328,6 +42328,8 @@
             test.t1	analyze	status	OK
             CHECK    TABLE t1 EXTENDED;
             Table	Op	Msg_type	Msg_text
            +test.t1	check	Warning	InnoDB: Unpurged clustered index record in table `test`.`t1` /* Partition `part_2` */: COMPACT RECORD(info_bits=32, 8 fields): {[6]    X8(0x000000005838),[6]    G.(0x00000001472E),[7](      (0x2800000182079C),[4]    (0x8000000A),[4]    (0x8000000A),[20]10                  (0x3130202020202020202020202020202020202020),[20]10                  (0x3130202020202020202020202020202020202020),[13]just inserted(0x6A75737420696E736572746564)}
            +test.t1	check	Warning	InnoDB: Unpurged clustered index record in table `test`.`t1` /* Partition `part_5` */: COMPACT RECORD(info_bits=32, 8 fields): {[6]    X7(0x000000005837),[6]    G.(0x00000001472E),[7](      (0x280000018207D2),[4]    (0x80000009),[4]    (0x80000009),[20]9                   (0x3920202020202020202020202020202020202020),[20]9                   (0x3920202020202020202020202020202020202020),[13]just inserted(0x6A75737420696E736572746564)}
             test.t1	check	status	OK
             CHECKSUM TABLE t1 EXTENDED;
             Table	Checksum
             
            mysqltest: Result length mismatch
            

            and

            CURRENT_TEST: parts.partition_alter1_2_innodb
            --- /home/buildbot/maria-slave/fulltest-debug-big-only/build/mysql-test/suite/parts/r/partition_alter1_2_innodb.result	2022-12-07 21:27:06.327078615 +0000
            +++ /home/buildbot/maria-slave/fulltest-debug-big-only/build/mysql-test/suite/parts/r/partition_alter1_2_innodb.reject	2022-12-08 11:11:17.827001454 +0000
            @@ -30611,6 +30611,8 @@
             test.t1	analyze	status	OK
             CHECK    TABLE t1 EXTENDED;
             Table	Op	Msg_type	Msg_text
            +test.t1	check	Warning	InnoDB: Unpurged clustered index record in table `test`.`t1` /* Partition `part0` */: COMPACT RECORD(info_bits=32, 8 fields): {[6]    B (0x00000000421A),[6]    Ff(0x000000014666),[7]    0!z(0x0B00000130217A),[4]    (0x8000000E),[4]    (0x8000000E),[20]                    (0x2020202020202020202020202020202020202020),[20]                    (0x2020202020202020202020202020202020202020),[12]was inserted(0x77617320696E736572746564)}
            +test.t1	check	Warning	InnoDB: Unpurged clustered index record in table `test`.`t1` /* Partition `part2` */: COMPACT RECORD(info_bits=32, 8 fields): {[6]    B (0x000000004219),[6]    Ff(0x000000014666),[7]    0"=(0x0B00000130223D),[4]    (0x8000000F),[4]    (0x8000000F),[20]                    (0x2020202020202020202020202020202020202020),[20]                    (0x2020202020202020202020202020202020202020),[12]was inserted(0x77617320696E736572746564)}
             test.t1	check	status	OK
             CHECKSUM TABLE t1 EXTENDED;
             Table	Checksum
             
            mysqltest: Result length mismatch
            

            This is not easily reproducible, and I never reproduced it with rr record. To avoid CI failures due to this, in MDEV-29986 I had changed the CHECK TABLE t1 EXTENDED to CHECK TABLE t1.

            Edit: Due to a performance impact, MDEV-29986 will not change the default from innodb_undo_log_truncate=OFF after all; hence, the tests will not have to be touched.

            marko Marko Mäkelä added a comment - - edited The MDEV-24402 CHECK TABLE…EXTENDED sometimes produces warnings when enabling innodb_undo_log_truncate . I observed it for 2 tests: parts.partition_alter4_innodb 'innodb' w13 [ fail ] Test ended at 2022-12-08 11:12:16   CURRENT_TEST: parts.partition_alter4_innodb --- /home/buildbot/maria-slave/fulltest-debug-big-only/build/mysql-test/suite/parts/r/partition_alter4_innodb.result 2022-12-08 10:44:53.105837636 +0000 +++ /home/buildbot/maria-slave/fulltest-debug-big-only/build/mysql-test/suite/parts/r/partition_alter4_innodb.reject 2022-12-08 11:12:15.246208692 +0000 @@ -42328,6 +42328,8 @@ test.t1 analyze status OK CHECK TABLE t1 EXTENDED; Table Op Msg_type Msg_text +test.t1 check Warning InnoDB: Unpurged clustered index record in table `test`.`t1` /* Partition `part_2` */: COMPACT RECORD(info_bits=32, 8 fields): {[6] X8(0x000000005838),[6] G.(0x00000001472E),[7]( (0x2800000182079C),[4] (0x8000000A),[4] (0x8000000A),[20]10 (0x3130202020202020202020202020202020202020),[20]10 (0x3130202020202020202020202020202020202020),[13]just inserted(0x6A75737420696E736572746564)} +test.t1 check Warning InnoDB: Unpurged clustered index record in table `test`.`t1` /* Partition `part_5` */: COMPACT RECORD(info_bits=32, 8 fields): {[6] X7(0x000000005837),[6] G.(0x00000001472E),[7]( (0x280000018207D2),[4] (0x80000009),[4] (0x80000009),[20]9 (0x3920202020202020202020202020202020202020),[20]9 (0x3920202020202020202020202020202020202020),[13]just inserted(0x6A75737420696E736572746564)} test.t1 check status OK CHECKSUM TABLE t1 EXTENDED; Table Checksum   mysqltest: Result length mismatch and CURRENT_TEST: parts.partition_alter1_2_innodb --- /home/buildbot/maria-slave/fulltest-debug-big-only/build/mysql-test/suite/parts/r/partition_alter1_2_innodb.result 2022-12-07 21:27:06.327078615 +0000 +++ /home/buildbot/maria-slave/fulltest-debug-big-only/build/mysql-test/suite/parts/r/partition_alter1_2_innodb.reject 2022-12-08 11:11:17.827001454 +0000 @@ -30611,6 +30611,8 @@ test.t1 analyze status OK CHECK TABLE t1 EXTENDED; Table Op Msg_type Msg_text +test.t1 check Warning InnoDB: Unpurged clustered index record in table `test`.`t1` /* Partition `part0` */: COMPACT RECORD(info_bits=32, 8 fields): {[6] B (0x00000000421A),[6] Ff(0x000000014666),[7] 0!z(0x0B00000130217A),[4] (0x8000000E),[4] (0x8000000E),[20] (0x2020202020202020202020202020202020202020),[20] (0x2020202020202020202020202020202020202020),[12]was inserted(0x77617320696E736572746564)} +test.t1 check Warning InnoDB: Unpurged clustered index record in table `test`.`t1` /* Partition `part2` */: COMPACT RECORD(info_bits=32, 8 fields): {[6] B (0x000000004219),[6] Ff(0x000000014666),[7] 0"=(0x0B00000130223D),[4] (0x8000000F),[4] (0x8000000F),[20] (0x2020202020202020202020202020202020202020),[20] (0x2020202020202020202020202020202020202020),[12]was inserted(0x77617320696E736572746564)} test.t1 check status OK CHECKSUM TABLE t1 EXTENDED; Table Checksum   mysqltest: Result length mismatch This is not easily reproducible, and I never reproduced it with rr record . To avoid CI failures due to this, in MDEV-29986 I had changed the CHECK TABLE t1 EXTENDED to CHECK TABLE t1 . Edit: Due to a performance impact, MDEV-29986 will not change the default from innodb_undo_log_truncate=OFF after all; hence, the tests will not have to be touched.
            marko Marko Mäkelä added a comment - - edited

            I believe that the above mentioned bugs when using innodb_undo_log_truncate=ON could have been fixed by MDEV-30671. I did not test that.

            marko Marko Mäkelä added a comment - - edited I believe that the above mentioned bugs when using innodb_undo_log_truncate=ON could have been fixed by MDEV-30671 . I did not test that.

            People

              vlad.lesin Vladislav Lesin
              svoj Sergey Vojtovich
              Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.