[MDEV-30357] Performance regression in locking reads from secondary indexes Created: 2023-01-06 Updated: 2023-03-16 Resolved: 2023-03-16
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.9.0, 10.6.8, 10.6.9, 10.7.4, 10.8.3, 10.10.0, 10.6.10, 10.6.11, 11.0.0, 10.6.12 |
| Fix Version/s: | 10.11.3, 11.0.2, 10.6.13, 10.8.8, 10.9.6, 10.10.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Thijn Geurts | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | performance, regression-10.6 |
| Environment: | Issue occurs on both Linux (CentOS 7) and Windows 10. |
| Attachments: |
| Issue Links: |
| Description |
Since updating from MariaDB 10.3.35 to 10.6.11, one of our queries, which used to take around 1 min 30 sec, is now still in the "Sending data" stage after more than an hour. I've kept it running for 3 or 4 hours, but it would not finish. I've tested every version from 10.3 through 10.6: all versions work fine (same timing as 10.3) up to and including 10.6.7; the issue starts with 10.6.8. I have also tried the latest 10.10.2, which also does not work, so this appears to be a bug that has yet to be fixed.

Row counts for the tables are:

The query itself isn't particularly difficult or complex. I've even put the two tables the WHERE acts on inside a temporary table, so the query becomes a single join, but that did not help. I did try limiting the row count for Werkzoeker to a much smaller subset (6k rows instead of 300k), and that did work. So it appears to be either a performance issue (a pretty bad one), or something that chokes on a larger row count.
SHOW CREATE TABLE for the 3 tables:
EXPLAIN EXTENDED:
I am able to provide database dumps for these tables if needed (around 7.5 GB). Please let me know if you need anything else from me; happy to help get this issue resolved. |
| Comments |
| Comment by Sergei Golubchik [ 2023-02-03 ] |
|
Between 10.6.7 and 10.6.8 there were many InnoDB-related changes but very few optimizer-related ones, and in cases like yours the optimizer is usually at fault. Do you by any chance have EXPLAIN EXTENDED output from 10.6.7 or any other "fast" version? Note that, generally, if you run EXPLAIN EXTENDED, don't forget to run SHOW WARNINGS afterwards, as one of the main features of EXTENDED is to show the transformed, optimized query as a warning. I personally prefer to have
in ~/.my.cnf |
| Comment by Thijn Geurts [ 2023-02-06 ] |
|
Thank you for your reply. Unfortunately, both versions produce the same query in EXPLAIN EXTENDED:
 |
| Comment by Alice Sherepa [ 2023-03-01 ] |
|
I repeated the performance issue on 10.6 with the test; on 10.5.16 the query takes 0.3-0.5 s, as it does if I do ANALYZE TABLE on 10.6:
 |
| Comment by Sergei Golubchik [ 2023-03-01 ] |
|
The EXPLAIN is exactly the same (the row estimates differ slightly). I ran both under perf; flame graphs are attached. |
| Comment by Sergei Golubchik [ 2023-03-01 ] |
|
caused by
 |
| Comment by Thijn Geurts [ 2023-03-08 ] |
|
I'm glad you were able to reproduce our issue and possibly find the commit that caused it. Looking forward to getting this fixed. |
| Comment by Marko Mäkelä [ 2023-03-13 ] |
|
I tweaked alice's MDEV-30357.test
This allowed me to run perf record on the interesting part of the test execution, using ./mtr --manual-gdb.

There is a prominent difference inside the function lock_sec_rec_read_check_and_lock(). In the fast version, this function invokes lf_hash_iterate(), presumably inside trx_sys.get_min_trx_id(). In the slow version, that time is instead spent in a call to lock_rec_convert_impl_to_expl(), which spends the bulk of its time in row_vers_impl_x_locked(), looking up a clustered index record version that matches the secondary index record. This could be avoided altogether if the PAGE_MAX_TRX_ID on the secondary index page is older than trx_sys.get_min_trx_id().

To reduce the number of calls to trx_sys.get_min_trx_id() (which was the motivation of the patch), I think that we can use another trick: cache the previous return value, because we know that the next call to that function cannot return a smaller value. |
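The caching trick described in the last sentence can be sketched as follows. This is a hypothetical, self-contained illustration, not the actual MariaDB code: `min_trx_id_cache`, `is_older_than_min`, and the `scan` callback are invented names, with `scan` standing in for the expensive `trx_sys.get_min_trx_id()` traversal.

```cpp
#include <atomic>
#include <cstdint>

using trx_id_t = uint64_t;

// Hypothetical sketch (invented names, not the server code): because the
// minimum active transaction id can only grow over time, the last value
// returned by the expensive scan remains a valid lower bound. If
// PAGE_MAX_TRX_ID is below that cached bound, the scan can be skipped.
struct min_trx_id_cache
{
  std::atomic<trx_id_t> cached{0};

  // `scan` stands in for trx_sys.get_min_trx_id()
  template<typename Scan>
  bool is_older_than_min(trx_id_t page_max_trx_id, Scan &&scan)
  {
    trx_id_t c = cached.load(std::memory_order_relaxed);
    if (page_max_trx_id < c)
      return true;               // fast path: cached lower bound suffices
    const trx_id_t m = scan();   // expensive traversal
    // publish the larger value; the minimum never decreases
    while (m > c &&
           !cached.compare_exchange_weak(c, m, std::memory_order_relaxed))
    {}
    return page_max_trx_id < m;
  }
};
```

A stale (too small) cached value is always safe here: it can only cause an unnecessary scan, never a wrong "older than the minimum" answer.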
| Comment by Marko Mäkelä [ 2023-03-13 ] |
|
The trx_sys.find_same_or_older() that replaced trx_sys.get_min_trx_id() in

In lock_sec_rec_read_check_and_lock(), the purpose of the trx_sys.get_min_trx_id() check was to determine whether any older transaction may be holding locks on records in the secondary index page. Only if so would it make sense to invoke lock_rec_convert_impl_to_expl() to convert any implicit lock that an older transaction may be holding on the record into an explicit exclusive lock, so that the current transaction can start waiting for that explicit lock to be released. Because this check had been removed, we would unnecessarily attempt to convert a non-existent implicit lock into an explicit one. This involves costly lookups of clustered index records.

If I add an equivalent call to trx_sys.find_same_or_older(), the CREATE TEMPORARY TABLE…SELECT statement consumes some 0.4 seconds of CPU time, instead of the 0.2 seconds it would consume if we skipped both the lookup and the call to lock_rec_convert_impl_to_expl().

I think it would be even better if trx_sys.find_same_or_older() returned the minimum transaction identifier that it encountered during the search. That way, we could cache the value and avoid repeated traversals in lock_sec_rec_read_check_and_lock(). |
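The guard, together with the suggested return-the-minimum refinement, could look roughly like this. Everything here is an illustrative stand-in (a `std::set` replaces the real lock-free transaction list, and `same_or_older_result` is an invented type); only the role of `find_same_or_older` and the PAGE_MAX_TRX_ID comparison correspond to the entities discussed above.

```cpp
#include <cstdint>
#include <set>

using trx_id_t = uint64_t;

// Illustrative stand-in for the set of active transaction ids
// (the real trx_sys uses a lock-free hash, not std::set).
using active_set = std::set<trx_id_t>;

// Sketch of the suggested refinement: report whether any active
// transaction is as old as `id`, and also return the minimum id seen,
// so the caller could cache it and skip future traversals.
struct same_or_older_result
{
  bool found;        // an active transaction with id <= `id` exists
  trx_id_t min_seen; // smallest active id, or 0 if none are active
};

same_or_older_result find_same_or_older(const active_set &active, trx_id_t id)
{
  if (active.empty())
    return {false, 0};
  const trx_id_t min = *active.begin(); // std::set iterates in order
  return {min <= id, min};
}

// Only when some active transaction may be as old as PAGE_MAX_TRX_ID can
// an implicit lock exist on the page; otherwise the costly
// lock_rec_convert_impl_to_expl() path can be skipped entirely.
bool may_need_impl_to_expl(const active_set &active, trx_id_t page_max_trx_id)
{
  return find_same_or_older(active, page_max_trx_id).found;
}
```

The design point is that the existence check and the minimum computation traverse the same data, so returning both from one pass costs nothing extra.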
| Comment by Marko Mäkelä [ 2023-03-13 ] |
|
My proposed fix includes some speedup of trx_sys_t::find_same_or_older(). |
| Comment by Marko Mäkelä [ 2023-03-14 ] |
|
I realized that there is a simpler fix. When I ported the Oracle Bug #33059387 fix, I had added a condition to lock_sec_rec_some_has_impl() that I thought would improve things:
I did not realize that it is possible to have trx->id=0 in a locking read if the transaction is not modifying any persistent tables. |
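The pitfall can be illustrated with a minimal sketch. These are invented helper names, not the actual condition in lock_sec_rec_some_has_impl(): the point is only that a transaction which has not modified any persistent table carries trx->id == 0, so a shortcut comparing trx->id against PAGE_MAX_TRX_ID can never fire for a read-only locking read.

```cpp
#include <cstdint>

using trx_id_t = uint64_t;

struct trx_t { trx_id_t id; }; // id == 0: no persistent table modified yet

// Hypothetical guard resembling the problematic condition: skip the costly
// implicit-lock check when the current transaction is at least as new as
// the newest modification (PAGE_MAX_TRX_ID) on the secondary index page.
bool can_skip_impl_check_buggy(const trx_t &trx, trx_id_t page_max_trx_id)
{
  // Never true when trx.id == 0, i.e. for every read-only locking read:
  // the shortcut silently stops applying, and the expensive clustered
  // index lookup runs for each secondary index record.
  return trx.id >= page_max_trx_id;
}

// One way such a shortcut stays valid (sketch, not the actual patch):
// compare the page against a lower bound on the ids of all active
// transactions instead of against trx.id.
bool can_skip_impl_check_fixed(trx_id_t min_active_trx_id,
                               trx_id_t page_max_trx_id)
{
  return page_max_trx_id < min_active_trx_id;
}
```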
| Comment by Vladislav Lesin [ 2023-03-14 ] |
|
Looks good to me. |