[MDEV-13512] buf_flush_update_zip_checksum() corrupts SPATIAL INDEX in ROW_FORMAT=COMPRESSED tables Created: 2017-08-13 Updated: 2018-01-04 Resolved: 2017-10-06 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.1, 10.2 |
| Fix Version/s: | 10.1.29, 10.2.10 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Sprint: | 10.2.10 | ||||||||
| Description |
|
The test case is attached. If it doesn't fail right away, try running with --repeat=N. The test is still rather long, but reducing it further seems to decrease the probability of failure. Also reproducible on 10.3. |
| Comments |
| Comment by Jan Lindström (Inactive) [ 2017-08-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Is that ENCRYPTED=YES necessary for crash ? | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2017-08-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
It appears that the table must be encrypted in order to hit the failure, at least I never got the failure otherwise. Whether it's encrypted because it has ENCRYPTED=YES or because the tables are set to be encrypted by default, wasn't important in the test initially, but on some reason the latest version of the test failed for me with the explicit clause and didn't fail otherwise, so I kept the clause. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2017-09-19 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
https://github.com/MariaDB/server/commit/26301165ffa30cee5d237560d961834736d9b9ed | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-09-19 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I think that this needs another round of review. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2017-09-20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Following existing test also test row_format=compressed:
Using some of innodb_zip tests is possible but they seem to create quite a few pages i.e. would need to wait until key rotation really has time to encrypt something before table is gone. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-09-20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
At the very least, I would like to see a test that reproduces the problem when the fix is not present. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2017-09-20 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
https://github.com/MariaDB/server/commit/9b998c18283c8acd4ed3d8e6df503fe861af04f0 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-10-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I do not understand what the fix is supposed to achieve, apart from skipping the comparison that is failing. A design constraint of ROW_FORMAT=COMPRESSED which I designed and implemented in the InnoDB Plugin for MySQL 5.1 is that the compressed and uncompressed index page headers must match at all times. That is, if one of the headers is updated, then the other one must be update as well, while holding X-latch or SX-latch on block->lock. Once the latch is released, the two copies must be equivalent. Something is apparently breaking this constraint, and we must find and fix the cause of this. I do not think that the check should be relaxed without fully understanding the problem. I cannot repeat any problem with the included test case (the commit features an slightly extended version of t3.test If I could repeat the problem, I could better suggest how to fix this. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-10-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
If the mismatch is caused by encryption updating only one of block->frame or block->page.zip.data for the buf_block_t* block of a ROW_FORMAT=COMPRESSED table, then the root cause should be fixed by updating both copies at the same time. Note that during buffer pool flushing, both the uncompressed and compressed page must exist in the buffer pool. This is because flushing is protected by buf_block_t::lock. At other times, the page might exist in compressed format only. If it contains changes that must be written to the *.ibd file, then the buf_page_t::state will be BUF_BLOCK_ZIP_DIRTY. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-10-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I found the problem by running the test on perro. On two subsequent runs, the problem occurred on page 4:10, giving some hope of determinism.
At this point, I noticed that both the uncompressed frame and the compressed page contain non-zero bytes. The mismatch that I had observed in the core dumps was that the block->page.zip.data contained zeroes while block->frame contained nonzero bytes. So, I started to suspect that the problem is that block->page.zip.data is zeroed out, and not that block->frame is being written with nonzero bytes without updating block->page.zip.data accordingly. I set a watchpoint:
This watchpoint was triggered in a page flush:
buf_flush_update_zip_checksum() zeroes out the FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION field, which it perhaps should not do?
Note that the page here is the compressed page frame (in this case, zip_size is half the srv_page_size aka innodb_page_size). Without encryption, this field would always be zero in both the compressed and uncompressed page header. It looks like the memset() must be breaking SPATIAL INDEX in 10.2 for ROW_FORMAT=COMPRESSED, by zeroing out the SSN (split sequence number) field. In MySQL 5.7.5, two memset() were removed from buf0flu.cc in WL#6968 InnoDB GIS: R-tree index support. The proper fix is to remove the 2 offending memset() from both InnoDB and XtraDB, rather than to shoot or cripple the messenger (the consistency check).
I believe that the reason why I was unable to repeat the bug on my laptop is that there was not enough concurrency. On perro, there are more processor cores and more true concurrency between the page flush and the SQL execution thread. I tried to prove my claim that SPATIAL INDEX is broken on ROW_FORMAT=COMPRESSED tables, but we do not have the larger --suite=innodb_gis tests from 5.7 (until | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2017-10-06 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||