[MDEV-16952] Introduce SET GLOBAL innodb_max_purge_lag_wait Created: 2018-08-11 Updated: 2023-08-24 Resolved: 2020-10-27 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB, Tests |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5 |
| Fix Version/s: | 10.2.35, 10.3.26, 10.4.16, 10.5.7 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Description |
|
http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/755
|
| Comments |
| Comment by Marko Mäkelä [ 2018-08-23 ] |
|
Earlier in the test, we successfully got '0 transactions not purged'.
Unfortunately, I am unable to repeat this, and there were no additional messages in the server error log. |
| Comment by Marko Mäkelä [ 2020-10-27 ] |
|
I suspect that the test sporadically fails due to the CI system being overloaded. The client-side polling is much less efficient than server-side polling would be. The client would request a large amount of expensive-to-compute information and throw out most of the SHOW ENGINE INNODB STATUS output, 10 times per second, for 60 seconds. I think that it is more efficient and more useful to implement server-side waiting:
If purge is not allowed to proceed due to an active read view that is held by some connection, potentially including the current one, then the wait may be interrupted by the normal statement or test case timeout. But it would no longer be limited to the 60 seconds that wait_all_purged.inc implemented. This new dummy global variable would also assist in upgrades from 10.2 or earlier to 10.3 or later, avoiding […] |
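A minimal sketch of the server-side wait described above, as a test could use it in place of the old wait_all_purged.inc polling loop. This assumes only what the issue itself states: SET GLOBAL on the variable blocks until the history list length reaches the given value, and reading the variable back returns a constant.

```sql
-- Server-side wait: block this connection until the InnoDB history list
-- length (committed but not-yet-purged transactions) drops to 0.
SET GLOBAL innodb_max_purge_lag_wait = 0;

-- Reading the variable back is expected to return its constant default,
-- not the value passed to SET GLOBAL.
SELECT @@GLOBAL.innodb_max_purge_lag_wait;
```

Compared with polling SHOW ENGINE INNODB STATUS ten times per second from the client, the server only wakes the waiting connection when the condition is met.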
| Comment by Manuel Arostegui [ 2020-11-20 ] |
|
Marko, is there any chance this flag could cause some regression on the replication thread?
The variable was set to 4294967295 (the default) when we upgraded to 10.4.17. |
| Comment by Marko Mäkelä [ 2020-11-20 ] |
|
marostegui, this is a special variable whose value should always read as a constant, even after SET GLOBAL was executed. The SET GLOBAL statement is special: as long as the connection that executed it is active, that connection will wait until the history list length is at most the desired value. If that connection is holding an open read view, then it is possible that the history list length will keep growing. Say, if you executed […]
then that would be almost guaranteed to kill performance. Only in the special case that there was nothing to purge would we get through. In any other case, the read view that we opened before initiating the wait would prevent the purge of any transactions that were committed after the read view was created. The idea was that this SET GLOBAL statement would typically only be executed while the server is idle and no new write transactions are arriving. It should be possible to use KILL QUERY or KILL CONNECTION to abort the wait, or to just disconnect the client. If SELECT @@GLOBAL.innodb_max_purge_lag_wait; returns different values, that is a bug that could be filed and fixed. My intention was that it would always return a constant; I did not test that. Note: specifying an initial value of innodb_max_purge_lag_wait in the start-up configuration has no effect. |
| Comment by Manuel Arostegui [ 2020-11-20 ] |
|
Thank you for the detailed answer. We will keep looking for the culprit of these lags! |
| Comment by Larry Adams [ 2023-07-17 ] |
|
Is there any good documentation for this yet, guys? I've got a massive history list due to a very large number of updates and inserts over the day, and at the end of the day I rename that table. I wonder if that creates some sort of issue. The history list is in the billions and my ibdata1 is 3+ TB. I'm not sure how to let MariaDB catch up, if ever. |
| Comment by Larry Adams [ 2023-07-17 ] |
|
Make that hundreds of millions.
|