[MDEV-20301] InnoDB's MVCC has O(N^2) behaviors Created: 2019-08-08 Updated: 2023-06-30 Resolved: 2019-08-14 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 5.5, 10.2, 10.3, 10.4 |
| Fix Version/s: | 10.2.27, 10.3.19, 10.4.9 |
| Type: | Bug | Priority: | Major |
| Reporter: | Aleksey Midenkov | Assignee: | Aleksey Midenkov |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | performance | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||
| Description |
|
Originally described here: https://bugs.mysql.com/bug.php?id=84958 If there're multiple row versions in InnoDB, reading one row from PK may have O(N) complexity and reading from secondary keys may have O(N^2) complexity. How to repeat: T1: BEGIN; SELECT * FROM t1; T2: 10000 * INSERT INTO t1 VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1 T1: SELECT * FROM t1 FORCE INDEX (PRIMARY) T1: SELECT * FROM t1 FORCE INDEX (b) Operator: *weep* *sob* *weep* |
| Comments |
| Comment by Marko Mäkelä [ 2019-08-14 ] |
|
I see that the test innodb.innodb_bug14704286 is for the initial ‘fix’ of this bug in MySQL. Yes, it would still be O(n²), but at least you would be able to interrupt the slow query. |
| Comment by Aleksey Midenkov [ 2019-09-26 ] |
|
This is how MVCC for secondary index works (from MySQL docs):
If there are multiple versions of clustered index record it traversed all of them each time it needed to retrieve secondary index record. The task case retrieved multiple versions of secondary index records in process of searching the current one and searched corresponding current clustered index record by scanning undo log. For each version from secondary index it did full undo log scan. So the algorithm complexity was O(N ^ 2): traverse undo log versions for each secondary index version. The fix just optimized such process. So it remembers now once found clustered index record and when secondary index versions are traversed it retrieved this remembered clustered index record. And the complexity is now O(2 * N): traverse undo log (clustered index versions) and then traverse secondary index versions. So, the short version of the above: optimize scanning the undo log versions while scanning the secondary index versions. |
| Comment by Marko Mäkelä [ 2020-05-12 ] |
|
Quoting my comment from 2015-01-01 to a blog post:
|
| Comment by Marko Mäkelä [ 2020-05-12 ] |
|
I believe that MDEV-17598 would fix the O(n²) page accesses in secondary index operations. As far as I remember, this fix in MariaDB only changes the asymptotic complexity by caching some lookup results. |