Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33099

Regression in select count query while large transaction keeps undo logs

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.4.31, 10.5.23, 10.11.6
    • 10.5, 10.6, 10.11, 11.4
    • Server
    • None

    Description

      There was apparently some regression introduced between versions 10.4.21 and 10.4.31, as the same test shows a dramatic decrease in SELECT COUNT(*) performance when a big enough transaction is active with uncommitted row changes.

      How to reproduce:
      – prepare 10M sysbench table
      – start trx and delete random 4k rows
      – run select count(*)

      Example runs for various versions:

      $ cat test_count_in_trx.sql 
       
      use db1
      select count(*) from sbtest1;
       
      start transaction;
      DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000;
      select count(*) from sbtest1;
      rollback;
      

      MariaDB 10.4.21

      $ msb_10_4_21/use -vvv < test_count_in_trx.sql
      --------------
      select count(*) from sbtest1
      --------------
       
      +----------+
      | count(*) |
      +----------+
      | 10000000 |
      +----------+
      1 row in set (2.070 sec)
       
      --------------
      start transaction
      --------------
       
      Query OK, 0 rows affected (0.000 sec)
       
      --------------
      DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000
      --------------
       
      Query OK, 4000 rows affected (12.723 sec)
       
      --------------
      select count(*) from sbtest1
      --------------
       
      +----------+
      | count(*) |
      +----------+
      |  9996000 |
      +----------+
      1 row in set (53.875 sec)
       
      --------------
      rollback
      --------------
       
      Query OK, 0 rows affected (0.451 sec)
      

      MariaDB 10.4.31

      $ msb_10_4_31/use db1 -vvv < test_count_in_trx.sql 
      --------------
      select count(*) from sbtest1
      --------------
       
      +----------+
      | count(*) |
      +----------+
      | 10000000 |
      +----------+
      1 row in set (1.959 sec)
       
      --------------
      start transaction
      --------------
       
      Query OK, 0 rows affected (0.000 sec)
       
      --------------
      DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000
      --------------
       
      Query OK, 4000 rows affected (11.911 sec)
       
      --------------
      select count(*) from sbtest1
      --------------
       
      +----------+
      | count(*) |
      +----------+
      |  9996000 |
      +----------+
      1 row in set (11 min 41.352 sec)
       
      --------------
      rollback
      --------------
       
      Query OK, 0 rows affected (1.270 sec)
      

      MariaDB 10.5.23

      $ msb_10_5_23/use db1 -vvv < test_count_in_trx.sql 
      --------------
      select count(*) from sbtest1
      --------------
       
      +----------+
      | count(*) |
      +----------+
      | 10000000 |
      +----------+
      1 row in set (1.940 sec)
       
      --------------
      start transaction
      --------------
       
      Query OK, 0 rows affected (0.000 sec)
       
      --------------
      DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000
      --------------
       
      Query OK, 4000 rows affected (13.320 sec)
       
      --------------
      select count(*) from sbtest1
      --------------
       
      1 row in set (13 min 14.588 sec)
       
      --------------
      rollback
      --------------
       
      Query OK, 0 rows affected (1.370 sec)
      

      MySQL 8.0.35

      $ msb_8_0_35/use db1 -vvv < test_count_in_trx.sql 
      --------------
      select count(*) from sbtest1
      --------------
       
      +----------+
      | count(*) |
      +----------+
      | 10000000 |
      +----------+
      1 row in set (7.17 sec)
       
      --------------
      start transaction
      --------------
       
      Query OK, 0 rows affected (0.00 sec)
       
      --------------
      DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000
      --------------
       
      Query OK, 4000 rows affected (14.78 sec)
       
      --------------
      select count(*) from sbtest1
      --------------
       
      +----------+
      | count(*) |
      +----------+
      |  9996000 |
      +----------+
      1 row in set (9.46 sec)
       
      --------------
      rollback
      --------------
       
      Query OK, 0 rows affected (2.50 sec)
      

      MySQL 5.7.44

      $ msb_5_7_44/use -vvv < test_count_in_trx.sql 
      --------------
      select count(*) from sbtest1
      --------------
       
      +----------+
      | count(*) |
      +----------+
      | 10000000 |
      +----------+
      1 row in set (2.17 sec)
       
      --------------
      start transaction
      --------------
       
      Query OK, 0 rows affected (0.00 sec)
       
      --------------
      DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000
      --------------
       
      Query OK, 4000 rows affected (15.24 sec)
       
      --------------
      select count(*) from sbtest1
      --------------
       
      +----------+
      | count(*) |
      +----------+
      |  9996000 |
      +----------+
      1 row in set (8.08 sec)
       
      --------------
      rollback
      --------------
       
      Query OK, 0 rows affected (1.27 sec)
      

      Ran all tests with default settings, i.e., buffer pool at 128MB.

      Attachments

        Activity

          przemek@mysqlmaniac.com, I’m terribly sorry for the delay. First, I checked the following:

          git log --oneline mariadb-10.4.21..mariadb-10.4.31 storage/innobase/trx/trx0purge.cc storage/innobase/row/row0purge.cc|grep -v Merge

          1562b2c20b5 MDEV-29666 InnoDB fails to purge secondary index records when indexed virtual columns exist
          3c92050d1c9 Fix build without either ENABLED_DEBUG_SYNC or DBUG_OFF
          a6f258e47f4 MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration
          4bfdba2e89c MDEV-26672 innodb_undo_log_truncate may reset transaction ID sequence
          1cb218c37cc MDEV-26450: Corruption due to innodb_undo_log_truncate
          c0a84fb9b03 MDEV-26465 Race condition in trx_purge_rseg_get_next_history_log()
          

          It does not seem to me that the fixes of MDEV-20605 or MDEV-29666 could have caused any regression. The fixes MDEV-26672 and MDEV-26450 are specific to innodb_undo_tablespaces>1 and innodb_undo_log_truncate=ON.

          The most likely explanation for the observed regression is that before MDEV-26465 was fixed, some purging of committed transaction history could have been missed if the undo log page had meanwhile been reused for something else and the TRX_UNDO_NEEDS_PURGE field was no longer being skipped. So, you would have been left with some extra garbage in any secondary indexes; https://mariadb.org/fest2022/how-innodb-multi-version-concurrency-control-mvcc-works/ can help understand this.

          Further correctness fixes MDEV-30671 and MDEV-31355 caused more lag to the purge of history. Even though in our internal testing we only reproduced failures when using innodb_undo_log_truncate=ON, I believe that these bugs affect also the "default" case where the undo tablespaces are not being truncated, or the undo logs are being stored in the InnoDB system tablespace. In retrospect, I think that MDEV-14407 was only "fixed", but its actual root cause could have been fixed by MDEV-30671 or MDEV-31355. Had we had https://rr-project.org in use back then, it should have been possible to properly analyze and fix the race condition.

          That said, these fixes made the existing problem of "purge lag" much worse. The InnoDB history list length would easily grow into hundreds of thousands or even millions in a rather short Sysbench workload. The latest significant performance improvement in this area are MDEV-32050 (which was fixed in 10.6.16 and 10.11.6) and MDEV-34515. I’d expect the recently released 10.6.21 and 10.11.11 to perform much better.

          marko Marko Mäkelä added a comment - przemek@mysqlmaniac.com , I’m terribly sorry for the delay. First, I checked the following: git log --oneline mariadb-10.4.21..mariadb-10.4.31 storage/innobase/trx/trx0purge.cc storage/innobase/row/row0purge.cc|grep -v Merge 1562b2c20b5 MDEV-29666 InnoDB fails to purge secondary index records when indexed virtual columns exist 3c92050d1c9 Fix build without either ENABLED_DEBUG_SYNC or DBUG_OFF a6f258e47f4 MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration 4bfdba2e89c MDEV-26672 innodb_undo_log_truncate may reset transaction ID sequence 1cb218c37cc MDEV-26450: Corruption due to innodb_undo_log_truncate c0a84fb9b03 MDEV-26465 Race condition in trx_purge_rseg_get_next_history_log() It does not seem to me that the fixes of MDEV-20605 or MDEV-29666 could have caused any regression. The fixes MDEV-26672 and MDEV-26450 are specific to innodb_undo_tablespaces>1 and innodb_undo_log_truncate=ON . The most likely explanation for the observed regression is that before MDEV-26465 was fixed , some purging of committed transaction history could have been missed if the undo log page had meanwhile been reused for something else and the TRX_UNDO_NEEDS_PURGE field was no longer being skipped. So, you would have been left with some extra garbage in any secondary indexes; https://mariadb.org/fest2022/how-innodb-multi-version-concurrency-control-mvcc-works/ can help understand this. Further correctness fixes MDEV-30671 and MDEV-31355 caused more lag to the purge of history. Even though in our internal testing we only reproduced failures when using innodb_undo_log_truncate=ON , I believe that these bugs affect also the "default" case where the undo tablespaces are not being truncated, or the undo logs are being stored in the InnoDB system tablespace. In retrospect, I think that MDEV-14407 was only "fixed", but its actual root cause could have been fixed by MDEV-30671 or MDEV-31355 . Had we had https://rr-project.org in use back then, it should have been possible to properly analyze and fix the race condition. That said, these fixes made the existing problem of "purge lag" much worse. The InnoDB history list length would easily grow into hundreds of thousands or even millions in a rather short Sysbench workload. The latest significant performance improvement in this area are MDEV-32050 (which was fixed in 10.6.16 and 10.11.6) and MDEV-34515 . I’d expect the recently released 10.6.21 and 10.11.11 to perform much better.
          przemek@mysqlmaniac.com Przemek added a comment -

          The same problem applies to version 10.11.6:

          $ msb_10_11_6/use -vvv < test_count_in_trx.sql 
          --------------
          select count(*) from sbtest1
          --------------
           
          +----------+
          | count(*) |
          +----------+
          | 10000000 |
          +----------+
          1 row in set (1.616 sec)
           
          --------------
          start transaction
          --------------
           
          Query OK, 0 rows affected (0.000 sec)
           
          --------------
          DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000
          --------------
           
          Query OK, 4000 rows affected (11.726 sec)
           
          --------------
          select count(*) from sbtest1
          --------------
           
          +----------+
          | count(*) |
          +----------+
          |  9996000 |
          +----------+
          1 row in set (11 min 19.114 sec)
           
          --------------
          rollback
          --------------
           
          Query OK, 0 rows affected (1.518 sec)
          

          przemek@mysqlmaniac.com Przemek added a comment - The same problem applies to version 10.11.6: $ msb_10_11_6/ use -vvv < test_count_in_trx.sql -------------- select count (*) from sbtest1 --------------   + ----------+ | count (*) | + ----------+ | 10000000 | + ----------+ 1 row in set (1.616 sec)   -------------- start transaction --------------   Query OK, 0 rows affected (0.000 sec)   -------------- DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000 --------------   Query OK, 4000 rows affected (11.726 sec)   -------------- select count (*) from sbtest1 --------------   + ----------+ | count (*) | + ----------+ | 9996000 | + ----------+ 1 row in set (11 min 19.114 sec)   -------------- rollback --------------   Query OK, 0 rows affected (1.518 sec)

          People

            axel Axel Schwenke
            przemek@mysqlmaniac.com Przemek
            Votes:
            6 Vote for this issue
            Watchers:
            12 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.