[MDEV-18188] Maintain persistent COUNT(*) in InnoDB Created: 2019-01-09 Updated: 2023-11-30 |
|
| Status: | Stalled |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major |
| Reporter: | Marko Mäkelä | Assignee: | Thirunarayanan Balathandayuthapani |
| Resolution: | Unresolved | Votes: | 5 |
| Labels: | performance, statistics | ||
| Issue Links: |
|
||||||||||||
| Description |
|
Query planning needs to know the number of records in a table. Currently, InnoDB only provides an estimate of this. If InnoDB kept accurate track of the number of records in a table, then it would not only benefit statistics, but also limited cases like the following:
At the moment, InnoDB would scan the entire table in order to count the records. If we maintained a durable count of committed records, we could instantly return the results for the above cases. The locking variants would simply lock the table. For the default REPEATABLE READ we would still have to count the records. However, if we additionally maintained a count of uncommitted rows, then we could have instant COUNT
These counts could possibly be maintained in the clustered index root page. If this hurts concurrency too much (in MDEV-6076, storing the persistent AUTO_INCREMENT there did not seem to hurt), then we could partition the two counters into multiple pages. |
| Comments |
| Comment by Marko Mäkelä [ 2019-02-24 ] |
|
While doing this, we could also introduce a persistent checksum on the data, synchronized with every update of the clustered index. The checksum would necessarily be something else than what is currently returned by CHECKSUM TABLE, because it would depend on the page contents and would not be updated on instant ALTER TABLE operations that would affect the logical data format. |
| Comment by Marko Mäkelä [ 2020-12-03 ] |
|
Based on some technical difficulties that we have encountered during the development of Neither way of extending the file format would support persistent COUNT(*) for ROW_FORMAT=COMPRESSED tables, which we are hoping to phase out, as noted in We could protect the updates of the persistent count in a similar way as we protect the changes of the persistent AUTO_INCREMENT ( |