[MDEV-33315] InnoDB history length and undo tablespace files keep growing Created: 2024-01-25  Updated: 2024-02-05

Status: Needs Feedback
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.11.6
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Arnas Klova Assignee: Marko Mäkelä
Resolution: Unresolved Votes: 0
Labels: None
Environment:

CloudLinux 7h


Issue Links:
Relates
relates to MDEV-33213 History list is not shrunk unless the... Closed
relates to MDEV-33325 Crash in flst_read_addr on corrupted ... Confirmed

 Description   

Hello, after an upgrade from 10.5.19 to 10.11.6 on one of our servers we've started getting complaints on DDL not working on some databases.
As we checked, we have noticed affected databases had `MDL_SHARED` `Table metadata lock` with thread_id 0 which led us to check innodb status and recently enabled innodb_undo_tablespaces, which are now 16GB and appear to be growing in tandem with history list length.

Trx id counter 23795545285
Purge done for trx's n:o < 23673930766 undo n:o < 0 state: running
History list length 51823666
 
-rw-rw---- 1 mysql mysql 16928210944 Jan 25 20:35 /var/lib/mysql/undo001
-rw-rw---- 1 mysql mysql 15804137472 Jan 25 20:33 /var/lib/mysql/undo002
-rw-rw---- 1 mysql mysql 15988686848 Jan 25 20:34 /var/lib/mysql/undo003

It seems to be stuck at the `23673930766` trx. The only purge related configuration change was innodb_purge_threads to 32, rest are defaults:

MariaDB [(none)]> show global variables like '%undo%';
+--------------------------+----------+
| Variable_name            | Value    |
+--------------------------+----------+
| innodb_max_undo_log_size | 10485760 |
| innodb_undo_directory    | ./       |
| innodb_undo_log_truncate | OFF      |
| innodb_undo_tablespaces  | 3        |
+--------------------------+----------+
4 rows in set (0.041 sec)
 
MariaDB [(none)]> show global variables like '%purge%';
+--------------------------------------+------------+
| Variable_name                        | Value      |
+--------------------------------------+------------+
| aria_log_purge_type                  | immediate  |
| innodb_max_purge_lag                 | 0          |
| innodb_max_purge_lag_delay           | 0          |
| innodb_max_purge_lag_wait            | 4294967295 |
| innodb_purge_batch_size              | 1000       |
| innodb_purge_rseg_truncate_frequency | 128        |
| innodb_purge_threads                 | 32         |
| relay_log_purge                      | ON         |
+--------------------------------------+------------+
8 rows in set (0.001 sec)
 
$ mysql --version
mysql  Ver 15.1 Distrib 10.11.6-MariaDB, for Linux (x86_64) using readline 5.1

Does seem very similar to https://jira.mariadb.org/browse/MDEV-31676, but our innodb_max_undo_log_size is default and is undo logs are way over limit, so purges should be going.

We've already tried experimenting with max_purge_lag% variables, but nothing appeared to help. Also, not all servers from our fleet that got upgraded seem to be affected either, with same configuration.

If there's any other information needed, please let us know.



 Comments   
Comment by Marko Mäkelä [ 2024-01-26 ]

MDEV-33213 is the last fix in this area. However, I do not think that it would help in the described case.

We see that the last assigned transaction identifier (or next to-be-assigned identifier, I do not remember which) is 121,614,519 more than the last purged transaction identifier.

Transaction identifiers are assigned on both transaction start (DB_TRX_ID written to the records) and commit (the "transaction number" written to the undo log is what purge cares about). If we assume that every started transaction is also committed (and not rolled back, or stuck indefinitely in XA PREPARE state), this would correspond to 60,807,259½ transactions. But, the history list length is only 51,823,666 transactions.

In some other cases that I think would be fixed by MDEV-33213, the history list length would be orders of magnitude larger than the difference between the last assigned and last purged transaction identifier.

I would suspect that you have some old transaction lingering in the system, possibly in XA PREPARE state. Even if you never used such statements, the binlog and replication subsystem will internally use two-phase commit in an attempt to keep the binlog consistent with InnoDB. There could be a bug in this area.

Can you check the full SHOW ENGINE INNODB STATUS output and look for ACTIVE or ACTIVE (PREPARED) transactions whose identifier is significantly less than the last assigned transaction identifier?

Comment by Arnas Klova [ 2024-01-26 ]

Thank you for looking into this.

Here's output from `SHOW ENGINE INNODB STATUS` `TRANSACTIONS` block. Unfortunately, there we cannot find any active transactions with anomalous identifier that you've mentioned. Any new ACTIVE state transactions all appear to be close in value to `non started` transactions in the list.

Trx id counter 23803806364
Purge done for trx's n:o < 23673930766 undo n:o < 116 state: running
History list length 55407777
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION (0x7f200e804240), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e80f240), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e80dc40), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e80c640), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e80bb40), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e806e40), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e811e40), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e80fd40), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e80b040), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e808440), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e807940), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e803740), ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 0
0 lock struct(s), heap size 1128, 0 row lock(s)
MariaDB thread id 1778837, OS thread handle 139779804849920, query id 79217795 127.0.0.1 u10000462_wp Sending data
SELECT ID FROM wp_posts WHERE 1=1 AND post_title = ? AND post_type = ?
Trx read view will not see trx with id >= 23803806364, sees < 23803806364
---TRANSACTION (0x7f200e802140), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e801640), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e800b40), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e80a540), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e804d40), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e808f40), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f200e80d140), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
--------
FILE I/O
...

Comment by Marko Mäkelä [ 2024-01-26 ]

Thank you, I see that there currently only exist read-only transactions. No persistent identifier is assigned for them; that is why they are identified as memory addresses in the output.

Would you be willing to try a development snapshot where the MDEV-33213 fix is present? We are close to the scheduled releases. At https://ci.mariadb.org/42881/ you would find packages for a 10.11 snapshot that should be close to the 10.11.7 release. Beware that some release-blocking bugs have not been fixed yet. It would be best to test this on a copy of the data, not in a production environment.

Comment by Arnas Klova [ 2024-01-26 ]

Hello, thanks for the information and suggestions. During the day we've made a really dirty copy of production data (ib* files after shutdown, undo00* while running) to minimize unscheduled downtime and tried it out with the development snapshot you've mentioned. Results do seem to be promising, as history length began decreasing:

TRANSACTIONS
------------
Trx id counter 23809343100
Purge done for trx's n:o < 23673930766 undo n:o < 54 state: running
History list length 57729385
LIST OF TRANSACTIONS FOR EACH SESSION:
--------
 
------------
Trx id counter 23809343100
Purge done for trx's n:o < 23673959867 undo n:o < 0 state: running
History list length 57728442
LIST OF TRANSACTIONS FOR EACH SESSION:
--------

A short moment later, MariaDB server crashed, likely because of the aforementioned semi-broken copy:

systemd-coredump[4063990]: Process 4061235 (mariadbd) of user 991 dumped core.
 
                                                                          Stack trace of thread 4061808:
                                                                          #0  0x00007f3f81aa8acf n/a (n/a)
                                                                          #1  0x00005637de134da1 flst_read_addr (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #2  0x00005637de92c8de flst_read_addr (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #3  0x00005637de91dbe6 fseg_free_page_low (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #4  0x00005637de91dcf5 _Z14fseg_free_pagePhP11fil_space_tjP5mtr_tb (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #5  0x00005637de87f62a _Z13btr_page_freeP12dict_index_tP11buf_block_tP5mtr_tbb (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #6  0x00005637de8a0e90 _Z32btr_free_externally_stored_fieldP12dict_index_tPhPKhPKtP11buf_block_tmbP5mtr_t (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #7  0x00005637de816f2b row_purge_upd_exist_or_extern_func (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #8  0x00005637de8177fe row_purge (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #9  0x00005637de7daa0d que_thr_step (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #10 0x00005637de835b4f srv_task_execute (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #11 0x00005637de96b822 _ZN5tpool10task_group7executeEPNS_4taskE (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #12 0x00005637de9690a9 _ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE (/opt/mariadb1011/mariadb-10.11.7-linux-x86_64/bin/mariadbd)
                                                                          #13 0x00007f3f8247bec4 n/a (n/a)
 
...
mariadbd[4070987]: 2024-01-26 15:04:42 0x7f46025ff700  InnoDB: Assertion failure in file /home/buildbot/amd64-rhel-7/build/storage/innobase/include/fut0lst.h line 122
mariadbd[4070987]: InnoDB: Failing assertion: addr.page == FIL_NULL || addr.boffset >= FIL_PAGE_DATA

We could try out with proper copy later to see if crash goes away, if needed, it would just require for us to schedule maintenance, which would take a few days.

Comment by Marko Mäkelä [ 2024-01-29 ]

Thank you for the update. Yes, I think that such a crash could occur due to using an inconsistent copy of the database. I filed MDEV-33325 for that crash on the corrupted data.

Comment by Arnas Klova [ 2024-02-05 ]

Hello, thank you for filling out https://jira.mariadb.org/browse/MDEV-33325.

We've got clean copy on another server as affected one is getting low on space due to growing undo logs. Unfortunately, it seems that the history length is stuck even with development build, purge is at the same exact transaction.

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.11.7-MariaDB Source distribution
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
...
MariaDB [(none)]> show engine innodb status \G
...
------------
TRANSACTIONS
------------
Trx id counter 23965298920
Purge done for trx's n:o < 23673930766 undo n:o < 54 state: running
History list length 125422312
LIST OF TRANSACTIONS FOR EACH SESSION:
--------
FILE I/O

Generated at Thu Feb 08 10:38:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.