[MDEV-32286] ANALYZE displays a huge number of InnoDB secondary index pages_accessed Created: 2023-09-29 Updated: 2023-11-28 |
|
| Status: | Confirmed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.6, 10.10, 10.11, 11.0, 11.1, 11.2 |
| Fix Version/s: | 10.6, 10.11, 11.0, 11.1, 11.2 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Oleg Smirnov | Assignee: | Vladislav Lesin |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | performance | ||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Description |
|
When a table is accessed by an index, "pages_accessed" in the "r_engine_stats" of ANALYZE FORMAT=JSON output displays suspiciously large numbers. See the test case attached and compare pages_accessed=14 for full table scan against pages_accessed=2003 for an access by index. |
| Comments |
| Comment by Marko Mäkelä [ 2023-09-29 ] | |||||||||||||||||||||||||||
|
I think that we need mtr test coverage of pages_accessed, starting from something simple, such as ANALYZE FORMAT=JSON SELECT * FROM empty_table, which should access 1 page. When I was debugging the test case that I posted to | |||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2023-09-29 ] | |||||||||||||||||||||||||||
|
Is it ANALYZE that causes the problem? Or its "source" data that is printed into the slow query log? I take the attached testcase, add printing to slow query log:
run the test and see
It looks like the slow query log has wrong data, too. So the cause is | |||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2023-09-29 ] | |||||||||||||||||||||||||||
|
I debugged this a little deeper, this time, using ./mtr --rr and rr replay. It looks like for the last ANALYZE TABLE statement the counter is indeed being incremented that many times by buf_page_get_low(), and it is being reset at the start of the statement:
It turns out that this last statement (which will report pages_accessed as 2,003) will actually invoke row_search_mvcc() 1,001 times. If I set breakpoints on ha_handler_stats::reset() and row_search_mvcc(), I can see that the first statement (which reported 32 pages accessed) invoked row_search_mvcc() 10,001 times. The second statement with FORCE INDEX(active_epims) reports 2,005 pages accessed after invoking row_search_mvcc() 1,001 times. The third statement with FORCE INDEX(domain) reports 2,003 pages accessed after invoking row_search_mvcc() 1,001 times. I had already filed MDEV-31589 for the order-of-2 difference during secondary index lookup. I assume that handler::ha_index_next_same() is executing an index range scan and not repetitive secondary index lookups. It looks like InnoDB could be missing some cursor optimization here. In | |||||||||||||||||||||||||||
| Comment by Sergei Petrunia [ 2023-10-03 ] | |||||||||||||||||||||||||||
Correct. it's a single-table select. |