[MDEV-31382] SET GLOBAL innodb_undo_log_truncate=ON does not free space when no undo logs exist
| Created: | 2023-06-01 | Updated: | 2023-07-26 | Resolved: | 2023-06-08 |
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1 |
| Fix Version/s: | 10.5.22, 10.6.15, 10.9.8, 10.10.6, 10.11.5, 11.0.3, 11.1.2 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | purge |
| Description |
|
The following simple test demonstrates that innodb_undo_log_truncate=ON fails to truncate undo tablespaces:
Invocation:
The expected outcome would be that all undo tablespaces have been truncated to their default soft limit size (innodb_max_undo_log_size=10M). Instead, we observe that one of the undo tablespace files is larger.
I think that undo tablespace truncation also needs to work while InnoDB is running (mostly idle, with some writes every now and then), and that the parameter innodb_purge_rseg_truncate_frequency caused a call to trx_purge_truncate_history() to be skipped during the last purge batch. That batch made the undo logs logically empty but failed to reclaim the space.
I originally noticed this when testing an upgrade from a server that is affected by |
| Comments |
| Comment by Marko Mäkelä [ 2023-06-02 ] |

In 10.5, if I run the test with ./mtr --rr, the second slow shutdown is so slow that mtr kills the process. In 10.6, the shutdown completes. During the server run that ends in the second shutdown, purge_coordinator_callback() is not invoked at all. The function trx_sys.history_size() returns 0 both times it is called, both of them in innodb_preshutdown(). It looks like the condition in srv_wake_purge_thread_if_not_active() needs to be revised so that it triggers the purge even if no history exists, as long as undo tablespace truncation is enabled and useful. Similarly, the purge coordinator task needs to invoke trx_purge_truncate_history() once after the history list has become empty. |
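To make the proposed change to the wake-up condition concrete, here is a minimal Python model of it. This is only an illustration and not the InnoDB code: the PurgeState class, both function names, and the assumption that the old condition looks only at the history size are hypothetical, inferred from the comment above. "Useful" is interpreted here as "at least one undo tablespace exceeds the soft limit".

```python
from dataclasses import dataclass
from typing import Sequence

# Soft limit from the report (innodb_max_undo_log_size=10M), in bytes.
MAX_UNDO_LOG_SIZE = 10 * 1024 * 1024


@dataclass
class PurgeState:
    """Hypothetical snapshot of the inputs to the wake-up decision."""
    history_size: int           # stand-in for trx_sys.history_size()
    truncate_enabled: bool      # stand-in for innodb_undo_log_truncate
    undo_sizes: Sequence[int]   # assumed: current undo tablespace sizes in bytes


def should_wake_old(state: PurgeState) -> bool:
    # Assumed old behaviour: purge is only woken up when history exists.
    return state.history_size > 0


def should_wake_revised(state: PurgeState,
                        limit: int = MAX_UNDO_LOG_SIZE) -> bool:
    # History to purge always justifies a wake-up.
    if state.history_size > 0:
        return True
    # Otherwise wake up only if truncation is enabled and would be useful,
    # that is, some undo tablespace is larger than the soft limit.
    return state.truncate_enabled and any(
        size > limit for size in state.undo_sizes)
```

With an empty history list and one 64 MiB undo tablespace, should_wake_old() returns False while should_wake_revised() returns True, which is the situation described in this report.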
| Comment by Marko Mäkelä [ 2023-06-05 ] |

So far, I got the undo log truncation during slow shutdown to work for my test case. While working on it, I had to revise an unnecessarily strict condition that had originally been added in
This condition must be revised in |
| Comment by Marko Mäkelä [ 2023-06-05 ] |

A call to trx_purge_truncate_history() will attempt to truncate all undo tablespaces whose size exceeds the soft limit innodb_max_undo_log_size. I also tested my fix outside shutdown:
My fix will cause SET GLOBAL innodb_undo_log_truncate=ON to wake up the purge coordinator in case it is not running. |
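One way to check the result from outside the server is to compare the undo tablespace file sizes against the soft limit. The following Python sketch is not part of the original test. It assumes that the undo tablespace files are named undo001, undo002, and so on, and that they live in the directory passed in (the data directory or innodb_undo_directory); the function name and the directory in the usage note are hypothetical.

```python
import re
from pathlib import Path

# Soft limit from the report (innodb_max_undo_log_size=10M), in bytes.
SOFT_LIMIT = 10 * 1024 * 1024

# Assumed naming scheme of undo tablespace files: undo001, undo002, ...
_UNDO_NAME = re.compile(r"undo\d{3}")


def oversized_undo_files(undo_dir, limit=SOFT_LIMIT):
    """Return {file name: size in bytes} for undo files larger than limit."""
    result = {}
    for path in sorted(Path(undo_dir).iterdir()):
        if path.is_file() and _UNDO_NAME.fullmatch(path.name):
            size = path.stat().st_size
            if size > limit:
                result[path.name] = size
    return result
```

For example, oversized_undo_files("/var/lib/mysql") returning an empty dictionary some time after SET GLOBAL innodb_undo_log_truncate=ON would match the expected outcome in the description, provided no new undo log growth happened in between.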
| Comment by Vladislav Lesin [ 2023-06-08 ] |

LGTM |
| Comment by Marko Mäkelä [ 2023-06-27 ] |
|
Related to this, I was wondering if it would make sense to change the default value of the confusingly named parameter innodb_purge_rseg_truncate_frequency to 1 (for the maximum frequency), so that undo log pages would be freed more frequently even when using the default setting innodb_undo_log_truncate=OFF. axel tested that and found that it would slightly reduce throughput. |
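To illustrate why the parameter value matters for the problem in this report, here is a simplified Python model. It assumes that truncation is attempted only on every N-th purge batch, where N is innodb_purge_rseg_truncate_frequency; this is a reading of the description above, not the actual server logic, and both function names are hypothetical. The value 128 in the usage note is, as far as I know, the default in the affected versions.

```python
def batches_that_truncate(num_batches, frequency):
    """Return the 1-based numbers of the purge batches that would also
    attempt truncation, assuming one attempt every `frequency` batches."""
    if frequency < 1:
        raise ValueError("frequency must be at least 1")
    return [n for n in range(1, num_batches + 1) if n % frequency == 0]


def final_batch_truncates(num_batches, frequency):
    """Tell whether the last batch (the one that empties the history
    list in this model) would also attempt truncation."""
    if frequency < 1:
        raise ValueError("frequency must be at least 1")
    return num_batches > 0 and num_batches % frequency == 0
```

In this model, 300 batches with a frequency of 128 attempt truncation only at batches 128 and 256, so the final batch frees nothing and, if purge is never woken up again, the space stays allocated. With a frequency of 1, every batch, including the last one, attempts truncation.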