[MDEV-13534] InnoDB STATS_PERSISTENT fails to ignore garbage delete-mark flag on node pointer pages Created: 2017-08-15 Updated: 2019-06-17 Resolved: 2017-08-24 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.2.7 |
| Fix Version/s: | 10.2.9 |
| Type: | Bug | Priority: | Major |
| Reporter: | Tim Westervoorde | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | assertion, innodb | ||
| Environment: |
CentOS 7, Linux 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
||
| Description |
|
MariaDB crashed with an assertion failure. At that time, a large transaction was running with (according to the rollback log) approximately 4191650 row operations.
|
| Comments |
| Comment by Elena Stepanova [ 2017-08-15 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Please paste or attach your cnf file(s). Any information you retrieved from the log or from monitoring tools might also be helpful. | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Tim Westervoorde [ 2017-08-16 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
The config is as follows. There weren't any log messages before the crash. Unfortunately this server wasn't in our monitoring environment yet so I have no other info except this.
| |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Tim Westervoorde [ 2017-08-22 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi Elena, This happened again today. It happens in a large transaction.
Is there anything I can do to help pinpoint this issue? We have an idea of when this gets triggered, so we could trigger it by hand. | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-08-23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
dicode, this assertion failure occurs when the InnoDB persistent statistics are updated. On the surface, the assertion failure looks like a sign of corruption: an index page is empty even though the table is not empty. An empty index page is only allowed when it is the root page and the whole index (and table) is empty.
However, if we look deeper, a change to the logic of dict_stats_scan_page() seems to be behind this:
It is entirely possible and OK for a page to consist only of delete-marked records. We should tolerate a NULL return value if !page_is_empty(page). | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-08-24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
I debugged this a little with the following test:
During this test, persistent statistics were calculated twice: first during an INSERT…SELECT, then during the ANALYZE TABLE which was what I was interested in. In the ANALYZE TABLE, the execution got this far, on a leaf page that consisted entirely of delete-marked records:
Unfortunately, because this is a leaf page, we would not get to the dict_stats_scan_page() call. But, this is also good news: I think I have found the bug.

While I cannot repeat this yet, I do know that on non-leaf pages, the delete-mark flag in the node pointer records is basically garbage. (Delete-marking only makes sense at the leaf level anyway. The purpose of the delete-mark is to tell MVCC, locking and purge that a leaf-level record does not exist in the READ UNCOMMITTED view, but it used to exist.)

When a page is split, InnoDB creates a node pointer record out of the child page record that the cursor is positioned on. The node pointer record for the parent page will be a copy of the child page record, amended with the child page number. If the child page record happened to carry the delete-mark flag, then the node pointer record would also carry this flag (even though the flag makes no sense outside child pages).

(On a related note, for the first node pointer record in the first node pointer page of each tree level, if the MIN_REC_FLAG is set, the rest of the record contents (except the child page number) is basically garbage. From this garbage you could deduce at which point the child was originally split.)

I believe that to repeat this bug, you would need a node pointer page that is full of node pointer records that happen to have the delete-mark flag set. This is possible when a page split is triggered on a leaf page that is full of delete-marked records.

Next, I will try to revise my test case so that it will first construct a single leaf page full of delete-marked records, and then cause an INSERT that makes each node pointer record carry a delete-mark flag, and finally execute ANALYZE TABLE to hit the bug.

The fix is obvious: dict_stats_scan_page() should ignore the garbage delete-mark flags on non-leaf pages. But I want to repeat this first. | |||||||||||||||||||||||||||||||||||||||||||||||||
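The mechanism described in the comment above can be illustrated with a small toy model. This is only a sketch in Python, not InnoDB code: the `Record` and `Page` classes and the functions `make_node_pointer()` and `first_usable_record()` are hypothetical names invented for illustration, and the real dict_stats_scan_page() works on physical page records rather than Python objects. The model assumes only what the comment states: a node pointer record is a copy of a child record (flag included), and the proposed fix is to disregard the delete-mark flag on non-leaf pages.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Toy stand-in for an index record (hypothetical, not an InnoDB type)."""
    key: int
    delete_marked: bool = False
    child_page_no: Optional[int] = None  # set only on node pointer records


@dataclass
class Page:
    """Toy stand-in for an index page; level 0 is the leaf level."""
    level: int
    records: List[Record]


def make_node_pointer(child_rec: Record, child_page_no: int) -> Record:
    # The node pointer is a copy of the child record, amended with the
    # child page number. The delete-mark flag is copied along with it,
    # even though it is meaningless above the leaf level.
    return Record(child_rec.key, child_rec.delete_marked, child_page_no)


def first_usable_record(page: Page, honor_flag_on_non_leaf: bool) -> Optional[Record]:
    """Return the first record that a statistics scan would consider.

    honor_flag_on_non_leaf=True models the buggy behaviour (garbage flags
    on node pointer pages are respected); False models the proposed fix.
    """
    is_leaf = page.level == 0
    for rec in page.records:
        if rec.delete_marked and (is_leaf or honor_flag_on_non_leaf):
            continue  # skip records treated as delete-marked
        return rec
    return None  # every record was skipped


# A leaf page full of delete-marked records, and a node pointer page
# whose records were all copied from such records.
leaf = Page(level=0, records=[Record(k, delete_marked=True) for k in range(3)])
node = Page(level=1, records=[make_node_pointer(r, 100 + i)
                              for i, r in enumerate(leaf.records)])

buggy = first_usable_record(node, honor_flag_on_non_leaf=True)
fixed = first_usable_record(node, honor_flag_on_non_leaf=False)
```

In this model the buggy variant finds no usable record on the node pointer page, which corresponds to the situation that trips the assertion, while the fixed variant returns the first node pointer record. A leaf page consisting only of delete-marked records still yields no record in either variant, which the earlier comment notes is legitimate.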
| Comment by Tim Westervoorde [ 2017-08-24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
This seems valid for my case. A table with ~450k rows is completely rebuilt, along with a second table with 1.7m rows that references it. So there are a lot of INSERT / DELETE statements in a short period of time, within a transaction. If there is anything I can debug for you, let me know; I can trigger this by running the same update script over and over again. Kind regards, | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-08-24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
A prerequisite of this bug is that a non-root node pointer page is filled with records that carry the (garbage) delete-mark flag. There is no problem if all root page records are 'delete-marked' or if all leaf page records are delete-marked. So, the minimum required index tree height is 3, that is, 2 levels of node pointers above the leaf level. Because the maximum PRIMARY KEY length is so small (768 bytes on innodb_page_size=4k; 3072 bytes on innodb_page_size=16k), I thought that it would be infeasible to create an 'organic' test for this. Yes, I could pad the leaf page record sizes with non-key columns, but if each node pointer page will have at least 4 records, I'd still need quite a few rows before the node pointer page is split. I was able to reproduce this by using debug instrumentation, adapting the great innodb.innodb_bug14676111 test painstakingly created by Yasufumi Kinoshita:
| |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-08-24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
I introduced this bug in
This change wrongly causes the bogus (garbage) delete-mark flags to be considered on node pointer pages. The only MariaDB Server releases that are affected by this bug are 10.2.7 and 10.2.8. | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Andrii Nikitin (Inactive) [ 2017-09-08 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
To work around the problem (besides upgrading once 10.2.9 is released), one may try to disable persistent statistics: put the following line into the [mysqld] section of the .cnf file: To disable persistent stats for the current server without a restart, execute the SQL command:
| |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Guangpu Feng [ 2019-06-17 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
will not work around this bug; verified in 10.2.8. |