Details
Description
There was apparently some regression introduced between versions 10.4.21 and 10.4.31, as the same test shows a dramatic decrease in SELECT COUNT(*) performance when a big enough transaction is active with uncommitted row changes.
How to reproduce:
– prepare 10M sysbench table
– start trx and delete random 4k rows
– run select count(*)
Example runs for various versions:
$ cat test_count_in_trx.sql
|
|
use db1 |
select count(*) from sbtest1; |
|
start transaction; |
DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000; |
select count(*) from sbtest1; |
rollback; |
MariaDB 10.4.21
$ msb_10_4_21/use -vvv < test_count_in_trx.sql |
--------------
|
select count(*) from sbtest1 |
--------------
|
|
+----------+ |
| count(*) | |
+----------+ |
| 10000000 |
|
+----------+ |
1 row in set (2.070 sec) |
|
--------------
|
start transaction |
--------------
|
|
Query OK, 0 rows affected (0.000 sec) |
|
--------------
|
DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000 |
--------------
|
|
Query OK, 4000 rows affected (12.723 sec) |
|
--------------
|
select count(*) from sbtest1 |
--------------
|
|
+----------+ |
| count(*) | |
+----------+ |
| 9996000 |
|
+----------+ |
1 row in set (53.875 sec) |
|
--------------
|
rollback
|
--------------
|
|
Query OK, 0 rows affected (0.451 sec) |
MariaDB 10.4.31
$ msb_10_4_31/use db1 -vvv < test_count_in_trx.sql |
--------------
|
select count(*) from sbtest1 |
--------------
|
|
+----------+ |
| count(*) | |
+----------+ |
| 10000000 |
|
+----------+ |
1 row in set (1.959 sec) |
|
--------------
|
start transaction |
--------------
|
|
Query OK, 0 rows affected (0.000 sec) |
|
--------------
|
DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000 |
--------------
|
|
Query OK, 4000 rows affected (11.911 sec) |
|
--------------
|
select count(*) from sbtest1 |
--------------
|
|
+----------+ |
| count(*) | |
+----------+ |
| 9996000 |
|
+----------+ |
1 row in set (11 min 41.352 sec) |
|
--------------
|
rollback
|
--------------
|
|
Query OK, 0 rows affected (1.270 sec) |
MariaDB 10.5.23
$ msb_10_5_23/use db1 -vvv < test_count_in_trx.sql |
--------------
|
select count(*) from sbtest1 |
--------------
|
|
+----------+ |
| count(*) | |
+----------+ |
| 10000000 |
|
+----------+ |
1 row in set (1.940 sec) |
|
--------------
|
start transaction |
--------------
|
|
Query OK, 0 rows affected (0.000 sec) |
|
--------------
|
DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000 |
--------------
|
|
Query OK, 4000 rows affected (13.320 sec) |
|
--------------
|
select count(*) from sbtest1 |
--------------
|
|
1 row in set (13 min 14.588 sec) |
|
--------------
|
rollback
|
--------------
|
|
Query OK, 0 rows affected (1.370 sec) |
MySQL 8.0.35
$ msb_8_0_35/use db1 -vvv < test_count_in_trx.sql |
--------------
|
select count(*) from sbtest1 |
--------------
|
|
+----------+ |
| count(*) | |
+----------+ |
| 10000000 |
|
+----------+ |
1 row in set (7.17 sec) |
|
--------------
|
start transaction |
--------------
|
|
Query OK, 0 rows affected (0.00 sec) |
|
--------------
|
DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000 |
--------------
|
|
Query OK, 4000 rows affected (14.78 sec) |
|
--------------
|
select count(*) from sbtest1 |
--------------
|
|
+----------+ |
| count(*) | |
+----------+ |
| 9996000 |
|
+----------+ |
1 row in set (9.46 sec) |
|
--------------
|
rollback
|
--------------
|
|
Query OK, 0 rows affected (2.50 sec) |
MySQL 5.7.44
$ msb_5_7_44/use -vvv < test_count_in_trx.sql |
--------------
|
select count(*) from sbtest1 |
--------------
|
|
+----------+ |
| count(*) | |
+----------+ |
| 10000000 |
|
+----------+ |
1 row in set (2.17 sec) |
|
--------------
|
start transaction |
--------------
|
|
Query OK, 0 rows affected (0.00 sec) |
|
--------------
|
DELETE FROM sbtest1 ORDER BY RAND() LIMIT 4000 |
--------------
|
|
Query OK, 4000 rows affected (15.24 sec) |
|
--------------
|
select count(*) from sbtest1 |
--------------
|
|
+----------+ |
| count(*) | |
+----------+ |
| 9996000 |
|
+----------+ |
1 row in set (8.08 sec) |
|
--------------
|
rollback
|
--------------
|
|
Query OK, 0 rows affected (1.27 sec) |
Ran all tests with default settings, i.e., buffer pool at 128MB.
przemek@mysqlmaniac.com, I’m terribly sorry for the delay. First, I checked the following:
git log --oneline mariadb-10.4.21..mariadb-10.4.31 storage/innobase/trx/trx0purge.cc storage/innobase/row/row0purge.cc|grep -v Merge
1562b2c20b5 MDEV-29666 InnoDB fails to purge secondary index records when indexed virtual columns exist
3c92050d1c9 Fix build without either ENABLED_DEBUG_SYNC or DBUG_OFF
a6f258e47f4 MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration
4bfdba2e89c MDEV-26672 innodb_undo_log_truncate may reset transaction ID sequence
1cb218c37cc MDEV-26450: Corruption due to innodb_undo_log_truncate
c0a84fb9b03 MDEV-26465 Race condition in trx_purge_rseg_get_next_history_log()
It does not seem to me that the fixes of
MDEV-20605orMDEV-29666could have caused any regression. The fixesMDEV-26672andMDEV-26450are specific to innodb_undo_tablespaces>1 and innodb_undo_log_truncate=ON.The most likely explanation for the observed regression is that before
MDEV-26465was fixed, some purging of committed transaction history could have been missed if the undo log page had meanwhile been reused for something else and the TRX_UNDO_NEEDS_PURGE field was no longer being skipped. So, you would have been left with some extra garbage in any secondary indexes; https://mariadb.org/fest2022/how-innodb-multi-version-concurrency-control-mvcc-works/ can help understand this.Further correctness fixes
MDEV-30671andMDEV-31355caused more lag to the purge of history. Even though in our internal testing we only reproduced failures when using innodb_undo_log_truncate=ON, I believe that these bugs affect also the "default" case where the undo tablespaces are not being truncated, or the undo logs are being stored in the InnoDB system tablespace. In retrospect, I think thatMDEV-14407was only "fixed", but its actual root cause could have been fixed byMDEV-30671orMDEV-31355. Had we had https://rr-project.org in use back then, it should have been possible to properly analyze and fix the race condition.That said, these fixes made the existing problem of "purge lag" much worse. The InnoDB history list length would easily grow into hundreds of thousands or even millions in a rather short Sysbench workload. The latest significant performance improvement in this area are
MDEV-32050(which was fixed in 10.6.16 and 10.11.6) andMDEV-34515. I’d expect the recently released 10.6.21 and 10.11.11 to perform much better.