MariaDB Server
MDEV-33213

History list is not shrunk unless there is a pause in the workload

Details

    Description

      While testing MDEV-32050 and its follow-up changes MDEV-33009 and MDEV-33112, axel pointed out that the InnoDB history list is not being shrunk during a workload.

      Some initial debugging indicates that the purge_truncation_task (which was added in MDEV-32050) is being invoked but is not doing anything. Also, the purge_sys.view is advancing; there are no old read views that would block the purge of history. Throttling the Sysbench workload for a few seconds allows the history list to shrink immediately.

      Attachments

        1. 10.6-MDEV-33213.pdf
          107 kB
        2. 12x5.pdf
          58 kB
        3. 24x5_high_threads_pausing.pdf
          59 kB
        4. 24x5_high_threads.pdf
          56 kB
        5. 24x5.pdf
          59 kB
        6. no_pause.png
          36 kB
        7. with_pause.png
          47 kB

        Issue Links

          Activity

            axel Axel Schwenke added a comment -

            Attached:

            This is the same build from branch 10.6-MDEV-33112. The first run is sysbench-tpcc for 1x 60 min and shows the history list growing to 12 million transactions (with a correspondingly large UNDO log). The other run is sysbench-tpcc for 6x 10 min with only a 5-second pause in between. That 5-second pause is enough to clean up the history list and shrink the UNDO logs.


            marko Marko Mäkelä added a comment -

            I instrumented an 11.x server with the following patch:

            diff --git a/storage/innobase/srv/srv0srv.cc b/storage/innobase/srv/srv0srv.cc
            index d1ea843e521..1018d2f880d 100644
            --- a/storage/innobase/srv/srv0srv.cc
            +++ b/storage/innobase/srv/srv0srv.cc
            @@ -1098,7 +1098,8 @@ static void purge_truncation_callback(void*)
               purge_sys.latch.rd_lock(SRW_LOCK_CALL);
               const purge_sys_t::iterator head= purge_sys.head;
               purge_sys.latch.rd_unlock();
            -  head.free_history();
            +  dberr_t err= head.free_history();
            +  sql_print_information("InnoDB purge truncation: %s", ut_strerr(err));
             }
             
             static tpool::task_group purge_task_group;
            @@ -1415,6 +1416,7 @@ inline void purge_coordinator_state::do_purge()
                   break;
                 }
             
            +    sql_print_information("InnoDB purge: %zu", history_size);
                 ulint n_pages_handled= trx_purge(n_threads, history_size);
                 if (!trx_sys.history_exists())
                   goto no_history;
            

            I see output like this:

            2024-01-10 16:42:46 0 [Note] InnoDB purge: 34
            2024-01-10 16:42:46 0 [Note] InnoDB purge: 36
            2024-01-10 16:42:46 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:42:46 0 [Note] InnoDB purge: 3
            2024-01-10 16:42:46 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:42:46 0 [Note] InnoDB purge: 5
            2024-01-10 16:42:46 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:42:47 0 [Note] InnoDB purge: 114649
            2024-01-10 16:42:54 0 [Note] InnoDB purge: 1444953
            2024-01-10 16:42:54 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:42:58 0 [Note] InnoDB purge: 2199290
            2024-01-10 16:42:58 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:43:02 0 [Note] InnoDB purge: 2924550
            2024-01-10 16:43:02 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:43:24 0 [Note] InnoDB purge: 5928129
            2024-01-10 16:43:24 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:43:47 0 [Note] InnoDB purge: 7893229
            2024-01-10 16:43:47 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:44:23 0 [Note] InnoDB purge: 10563812
            2024-01-10 16:44:23 0 [Note] InnoDB purge truncation: Success
            …
            2024-01-10 16:44:49 0 [Note] InnoDB: Shutdown completed; log sequence number 2550911428; transaction id 23745949
            

            The workload script is based on something that I used for MDEV-32050. It consisted of two 60-second sysbench oltp_update_index runs, separated by a sleep 1. The purge message with a history_size of 7893229 was output around the switching point. During the second run, there was only one more purge attempt.

            I tried again with oltp_update_non_index, which should allow purge to work around MDEV-17598. Same thing there, despite a 3-second sleep between the two 60-second workloads:

            2024-01-10 16:47:48 0 [Note] InnoDB purge: 34
            2024-01-10 16:47:48 0 [Note] InnoDB purge: 36
            2024-01-10 16:47:48 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:48 0 [Note] InnoDB purge: 3
            2024-01-10 16:47:48 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:48 0 [Note] InnoDB purge: 5
            2024-01-10 16:47:48 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:49 0 [Note] InnoDB purge: 141988
            2024-01-10 16:47:50 0 [Note] InnoDB purge: 345948
            2024-01-10 16:47:50 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:50 0 [Note] InnoDB purge: 450133
            2024-01-10 16:47:50 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:51 0 [Note] InnoDB purge: 551898
            2024-01-10 16:47:51 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:51 0 [Note] InnoDB purge: 669964
            2024-01-10 16:47:51 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:52 0 [Note] InnoDB purge: 763708
            2024-01-10 16:47:52 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:53 0 [Note] InnoDB purge: 905287
            2024-01-10 16:47:53 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:53 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:47:53 0 [Note] InnoDB purge: 1026138
            …
            2024-01-10 16:48:47 0 [Note] InnoDB purge: 12705454
            2024-01-10 16:48:47 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:48:48 0 [Note] InnoDB purge: 12865213
            2024-01-10 16:48:48 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:48:48 0 [Note] InnoDB purge: 12897164
            2024-01-10 16:48:48 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:48:48 0 [Note] InnoDB purge: 12897164
            2024-01-10 16:48:50 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:48:50 0 [Note] InnoDB purge: 10519866
            …
            2024-01-10 16:48:51 0 [Note] InnoDB purge: 10181678
            2024-01-10 16:48:51 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:48:51 0 [Note] InnoDB purge: 10153547
            2024-01-10 16:48:51 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:48:52 0 [Note] InnoDB purge: 10262769
            2024-01-10 16:48:52 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:48:52 0 [Note] InnoDB purge: 10363969
            2024-01-10 16:48:52 0 [Note] InnoDB purge truncation: Success
            …
            2024-01-10 16:49:51 0 [Note] InnoDB purge: 23166576
            2024-01-10 16:49:51 0 [Note] InnoDB purge truncation: Success
            2024-01-10 16:49:51 0 [Note] InnoDB purge: 23166576
            

            Roughly coinciding with the 3-second sleep, we see some reduction of the history list, followed by steady growth until the end of the run.


            marko Marko Mäkelä added a comment -

            During the busy part of my test workload, purge_sys.sees(rseg.needs_purge) would never hold in purge_sys_t::iterator::free_history(). Occasionally, rseg.is_referenced() would hold. Only when trx_purge_truncate_rseg_history() is invoked with all=true would the history be shrunk. When all=false, I observed the difference trx_sys.get_max_trx_id()-rseg.needs_purge to be between 0 and 3379. The type of the field trx_rseg_t::needs_purge was changed from a Boolean to a transaction identifier in MDEV-30671 in order to prevent the premature freeing of undo log pages.
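
            To make the gating condition concrete, here is a minimal self-contained sketch (not the server code; free_history_one_rseg() and view_low_limit are made-up stand-ins, while sees(), needs_purge and is_referenced() mirror the members discussed above) of why a rollback segment's history cannot be freed while small transactions keep bumping needs_purge:

            // Minimal sketch, not the actual InnoDB code.
            #include <atomic>
            #include <cstdint>

            using trx_id_t = uint64_t;

            struct purge_sys_t {
              std::atomic<trx_id_t> view_low_limit{0};  // oldest change a reader may still need
              bool sees(trx_id_t id) const { return id < view_low_limit.load(); }
            };

            struct trx_rseg_t {
              trx_id_t needs_purge= 0;  // bumped towards the latest trx id whenever a
                                        // cached (TRX_UNDO_CACHED) undo page is reused
              uint32_t ref_count= 0;
              bool is_referenced() const { return ref_count != 0; }
            };

            purge_sys_t purge_sys;

            // Made-up stand-in for freeing one rollback segment's purged history.
            bool free_history_one_rseg(const trx_rseg_t &rseg, bool all)
            {
              // Under a steady stream of small transactions, needs_purge trails
              // trx_sys.get_max_trx_id() by only a few thousand (0..3379 observed
              // above), so this condition keeps failing and nothing is freed
              // unless all=true, which only happens when the workload pauses.
              if (!all && (!purge_sys.sees(rseg.needs_purge) || rseg.is_referenced()))
                return false;  // history list stays as it is
              /* ... detach and free the already purged undo log pages ... */
              return true;
            }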


            marko Marko Mäkelä added a comment -

            The following patch, which prevents the reuse of undo pages by multiple small transactions (TRX_UNDO_CACHED), allows the history list to shrink during my test workload:

            diff --git a/storage/innobase/trx/trx0purge.cc b/storage/innobase/trx/trx0purge.cc
            index add4db962d5..f593c49070b 100644
            --- a/storage/innobase/trx/trx0purge.cc
            +++ b/storage/innobase/trx/trx0purge.cc
            @@ -275,6 +275,7 @@ trx_purge_add_undo_to_history(const trx_t* trx, trx_undo_t*& undo, mtr_t* mtr)
             
               uint16_t undo_state;
             
            +#if 0
               if (undo->size == 1 &&
                   TRX_UNDO_PAGE_REUSE_LIMIT >
                   mach_read_from_2(TRX_UNDO_PAGE_HDR + TRX_UNDO_PAGE_FREE +
            @@ -284,6 +285,7 @@ trx_purge_add_undo_to_history(const trx_t* trx, trx_undo_t*& undo, mtr_t* mtr)
                 UT_LIST_ADD_FIRST(rseg->undo_cached, undo);
               }
               else
            +#endif
               {
                 ut_ad(undo->size == flst_get_len(TRX_UNDO_SEG_HDR + TRX_UNDO_PAGE_LIST +
                                                  undo_page->page.frame));
             
            

            If this change turns out to improve performance in a broad selection of benchmarks, we might want to introduce a configuration parameter for this. Back when I was working on MDEV-25062, I observed that a change like this would degrade throughput. My test workloads were short, and I ignored the growth of the history list.
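
            Purely as an illustration of what such a parameter could look like: srv_undo_page_reuse (and a corresponding innodb_undo_page_reuse setting) does not exist in the server, the reuse limit below is an assumption, and the condition merely mirrors the branch that the #if 0 above disables:

            #include <cstdint>

            // Hypothetical setting; no such variable exists in MariaDB today.
            static bool srv_undo_page_reuse= true;
            // Assumed stand-in for TRX_UNDO_PAGE_REUSE_LIMIT (roughly a quarter page).
            static constexpr uint16_t UNDO_PAGE_REUSE_LIMIT= 16384 / 4;

            enum undo_commit_state : uint16_t { UNDO_CACHED, UNDO_TO_PURGE };

            // undo_pages: pages in the committed transaction's undo log;
            // used_bytes: bytes used on its single undo page.
            undo_commit_state choose_undo_state(uint32_t undo_pages, uint16_t used_bytes)
            {
              if (srv_undo_page_reuse && undo_pages == 1 &&
                  used_bytes < UNDO_PAGE_REUSE_LIMIT)
                return UNDO_CACHED;    // keep the page for reuse by later transactions
              return UNDO_TO_PURGE;    // hand it to purge so the history can shrink
            }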

            Note: In MDEV-29593 the purge of committed history was improved so that it can free pages that were marked reusable (TRX_UNDO_CACHED) but have not yet been reused. The rseg.is_referenced() check is needed for the rare case that a starting transaction has chosen to use a rollback segment but has not modified any undo page yet.


            marko Marko Mäkelä added a comment -

            I think that the only practical way to prevent rseg.needs_purge from blocking the cleanup action is to temporarily disable one rollback segment at a time, by invoking trx_rseg_t::skip_allocation(). Adding more rollback segments is not practical, because the current file format (especially the format of DB_ROLL_PTR) limits us to only 128 rollback segments, or 127 when innodb_undo_tablespaces>0, because the first one is hard-coded in the system tablespace. When transactions are started, they can choose any of the other 127 or 126 rollback segments while the purge subsystem is preparing to free the undo log pages that stored the already purged history in one rollback segment.
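
            For reference, a sketch of where the limit of 128 comes from: the 7-byte (56-bit) DB_ROLL_PTR reserves only 7 bits for the rollback segment number. The field widths below reflect my understanding of the format, and build_roll_ptr() is a made-up helper, not an InnoDB function:

            #include <cstdint>

            using roll_ptr_t = uint64_t;  // only the low 56 bits (7 bytes) are stored

            // 1 bit is_insert | 7 bits rollback segment id | 32 bits page number |
            // 16 bits byte offset within the undo page
            roll_ptr_t build_roll_ptr(bool is_insert, uint8_t rseg_id,
                                      uint32_t page_no, uint16_t offset)
            {
              // rseg_id must fit in 7 bits, hence at most 2^7 = 128 rollback segments
              return (roll_ptr_t(is_insert) << 55) |
                     (roll_ptr_t(rseg_id & 0x7F) << 48) |
                     (roll_ptr_t(page_no) << 16) | offset;
            }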

            For innodb_undo_log_truncate=ON we already disable all rollback segments (trx_rseg_t::set_skip_allocation()) that belong to the purge_sys.truncate.current tablespace. The reason why this logic does not work during a heavy workload is that we are “too shy” to prepare for the truncation. If we allow purge_coordinator_state::do_purge() to invoke that part of trx_purge_truncate_history() directly, then undo tablespace truncation will work during the workload. I have implemented this, and it seems to work with my simple benchmark script, without too much disruption to throughput.

            If we also invoked trx_rseg_t::set_skip_allocation() for one rollback segment at a time in the innodb_undo_log_truncate=OFF case, the need for undo tablespace truncation should be reduced.
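
            A minimal sketch of that rotation idea, assuming it runs from the purge coordinator; everything except set_skip_allocation() is a simplified stand-in, not the actual trx_rseg_t or purge code:

            #include <array>
            #include <cstddef>

            struct rseg_t {                       // simplified stand-in for trx_rseg_t
              bool skip_allocation= false;
              void set_skip_allocation()   { skip_allocation= true; }
              void clear_skip_allocation() { skip_allocation= false; }
            };

            constexpr size_t N_RSEGS= 128;
            std::array<rseg_t, N_RSEGS> rsegs;

            // Stand-in for freeing the already purged history of one rollback segment.
            void free_history_for(rseg_t&) {}

            void purge_coordinator_round()
            {
              static size_t next= 0;              // the segment to quiesce in this round
              rseg_t &r= rsegs[next];
              r.set_skip_allocation();            // new transactions pick one of the others
              free_history_for(r);                // its needs_purge can no longer be bumped,
                                                  // so the purged history can be freed
              r.clear_skip_allocation();          // put it back into circulation
              next= (next + 1) % N_RSEGS;
            }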

            axel Axel Schwenke added a comment -

            The commit 4799e4efee2 in branch mariadb-10.6-MDEV-33213 indeed looks very good. I have run all 4 TPC-C benchmarks I had prepared on host g5. The performance is better (sometimes much better) and the problem with the growing InnoDB History Length is gone.

            Attachments: 12x5.pdf 24x5.pdf 24x5_high_threads.pdf 24x5_high_threads_pausing.pdf

            axel Axel Schwenke added a comment -

            I have now repeated the TPC-C regression tests. They run on cheetah01 and include tests with innodb_undo_log_truncate=OFF (default) as well as innodb_undo_log_truncate=ON. The results also look good.

            Attached: 10.6-MDEV-33213.pdf


            mleich Matthias Leich added a comment -

            10.6-MDEV-33213 028dd57a0c07e94a3c1f205c4915efc06bae47d0 2024-01-12T09:41:15+02:00
            performed well in RQG testing.


            mleich Matthias Leich added a comment -

            RQG testing etc.


            marko Marko Mäkelä added a comment -

            An enterprise customer experienced a hang that was introduced in 10.6 (but not 10.5) by MDEV-30671 and fixed here. The scenario of the hang is that trx_purge_truncate_rseg_history() (which we here replaced with purge_sys_t::iterator::free_history_rseg(trx_rseg_t &rseg)) could read an unbounded number of undo log pages into the buffer pool even though another thread meanwhile chose this rollback segment in trx_assign_rseg_low() and subsequently started waiting in trx_undo_assign_low<false>(), all while holding an index page latch that could block many other threads.
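
            To illustrate the wait chain (names simplified; this is not the server code, just the blocking pattern described above):

            #include <mutex>

            std::mutex rseg_latch;        // stands in for trx_rseg_t::latch
            std::mutex index_page_latch;  // stands in for a B-tree index page latch

            void purge_task()             // thread A
            {
              std::lock_guard<std::mutex> g(rseg_latch);
              // Before this fix, an unbounded number of undo log pages could be
              // read into the buffer pool here, all while rseg_latch is held.
              /* read_many_undo_pages(); */
            }

            void user_transaction()       // thread B
            {
              std::lock_guard<std::mutex> page(index_page_latch);  // taken first by the DML
              // trx_assign_rseg_low() happens to pick the same rollback segment,
              // and trx_undo_assign_low<false>() then blocks here ...
              std::lock_guard<std::mutex> rseg(rseg_latch);
              // ... so every other thread that needs index_page_latch queues behind
              // thread B for as long as thread A keeps reading undo pages.
            }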

            TheWitness Larry Adams added a comment -

            Marko,

            We are upgrading on 3/28. Keep you posted afterwards.

            Larry

            TheWitness Larry Adams added a comment -

            Marko,

            I've upgraded to 10.6.17-13 this morning. I'm not seeing any increased history flushing, but at least it's flushing now. I've opened a ticket with support to see if there is a way to speed it up though as the number keeps going up incrementally.

            Larry

            TheWitness Larry Adams added a comment -

            I tried

            SET GLOBAL innodb_max_purge_lag_wait=0;

            But that command just hung for almost 10 minutes before I killed it.


            marko Marko Mäkelä added a comment -

            Hi Larry, let’s follow up on this in the support ticket. One thing that we need to be aware of is that when SHOW ENGINE INNODB STATUS shows any ACTIVE or PREPARED transactions with a transaction identifier or with a read view, then purge will be unable to proceed beyond that point, and SET GLOBAL innodb_max_purge_lag_wait=0 will hang. In the worst case, someone starts a connection with START TRANSACTION WITH CONSISTENT SNAPSHOT and walks away.

            That said, it is possible that there are no active transactions or read views, and the purge is really taking extremely long to clean up the queue. Implementing MDEV-17598 should help in case the reason is secondary indexes.


            ch701 chad ambrosius added a comment -

            I have recently upgraded many servers that had been running on 10.3.12 to 10.6.15, at which time I noticed that the history list was not shrinking as it had always done on 10.3.12. I found this bug, which seems to describe the situation pretty well. I upgraded to 10.6.17 and tried many variations of the innodb_* parameters described above and on other tickets. None seemed to have any effect. I also tried versions 10.11.8 and 11.4.2. The InnoDB undo logs still do not seem to be getting purged in those versions. If I stop non-replication traffic, the history list length reduces very rapidly. Only the busiest hosts in my infrastructure can't keep up with the undo log purging. However, on 10.3.12 this was never an issue. It feels like the rows in the undo log must already be delete-marked, or query performance would be really bad at such large history lengths. So the only impact I'm feeling right now is unbounded disk space growth (or the hassle of juggling hosts in and out of rotation, including the primary).

            marko are you sure this issue is fully resolved? thank you!!


            marko Marko Mäkelä added a comment -

            ch701, unfortunately MariaDB 10.6.16 and 10.6.17 are affected by MDEV-33819, which was introduced in MDEV-32050. However, the fix of MDEV-33819 should be present in the 10.11.8 and 11.4.2 releases, which you also tested.

            MariaDB Server 10.3 is missing some correctness fixes, such as MDEV-31355. It is possible that history is purged prematurely, while some old transactions could still access the history. I believe that the bug could lead to a situation where the undo log page has been reused and trx_undo_rec_copy() would attempt to copy data at an incorrect offset, ultimately causing a crash. MDEV-11044 is an example of that, although that particular problematic undo page access was not in an MVCC read but in the purge of history itself. Perhaps that bug was actually fixed by MDEV-22388.

            I am currently working on a performance regression in this area, MDEV-34515. The test case is artificial: a frequently updated single-row table with a secondary index. But there clearly is some room for improvement.


            People

              Assignee: marko Marko Mäkelä
              Reporter: marko Marko Mäkelä