[MDEV-19385] InnoDB: Failing assertion: !cursor->index->is_committed() with virtual columns / indexes, WITHOUT foreign keys and temporary tables Created: 2019-05-03 Updated: 2019-06-05 Resolved: 2019-05-03 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB, Virtual Columns |
| Affects Version/s: | 10.3 |
| Fix Version/s: | 10.2.24, 10.3.15, 10.4.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | not-10.4 | ||
| Issue Links: |
|
| Description |
|
Important note: Only reproducible on a non-debug build.
Also reproducible on 10.3.14. I did not try earlier 10.3 builds, because they had other bugs of this kind. |
| Comments |
| Comment by Marko Mäkelä [ 2019-05-03 ] | |||||||||||||||||||
|
This bug seems to depend on the ALTER TABLE executing an instant ADD COLUMN. With the following statement, the test will not crash:
I tried adding wait_all_purged.inc, but it did not help reproduce this in a debug build. Also, WITH_VALGRIND did not flag any use of uninitialized memory in either a debug or a RelWithDebInfo build. A non-optimized RelWithDebInfo build does reproduce this, so that is what I will be debugging. | |||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-05-03 ] | |||||||||||||||||||
|
I simplified the test case a little further, and added ROW_FORMAT=REDUNDANT to ease the analysis of page dumps:
Both secondary indexes must be UNIQUE. Column c must be DEFAULT NULL. Column b could also be declared NOT NULL. The column pk is necessary; with an internally created DB_ROW_ID the test would pass. I was not able to replace the INSERT IGNORE with something that would use explicit ROLLBACK.
For me, the crash occurs on the unique index (c,d) after trx_rollback_to_savepoint_low had been invoked for the second row that the INSERT IGNORE tried to insert. Upon entering trx_rollback_to_savepoint_low, the index (c,d) contains the following entries: (c,d,pk)=(NULL,0,2),(NULL,0,3),(1,1,1). At first it may seem strange that no duplicate was flagged for the two (NULL,0) entries, but because NULL is not equal to NULL, UNIQUE indexes allow 'duplicates' where at least one of the unique columns is NULL.
The unique index (e) contains (e,pk)=(0,2),(1,1). It caused the rollback to be initiated, due to the attempt to insert the duplicate value 0 in the tuple (e,pk)=(0,3).
The PRIMARY KEY index contains the following: (pk,DB_TRX_ID,DB_ROLL_PTR,b,c)=metadata(0,0,1<<55,NULL,NULL,f=NULL),(1,0,1<<55,1,1),(2,DB_TRX_ID,insert,0,NULL),(3,DB_TRX_ID,insert,0,NULL). The first record (pk=0) is the metadata record.
After rollback, at the time of the assertion failure, the index pages contain the following:
Bug: The record (c,d,pk)=(NULL,0,3) was not rolled back! This will be caught by the assertion failure later.
At the time of the assertion failure, only the record (3,DB_TRX_ID,insert,0,NULL) had been inserted again; the secondary indexes had not been changed. The assertion fails because the re-insert of the record (c,d,pk)=(NULL,0,3) hits the previously inserted record that we had failed to roll back. | |||||||||||||||||||
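The NULL handling described above can be modelled in a few lines of C++. This is only an illustrative sketch and not InnoDB code: the names Column, Key and unique_conflict are hypothetical, and a key column is represented as std::optional<int>, with an empty optional standing in for SQL NULL (C++17 assumed).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model: one column of a unique key, empty meaning SQL NULL.
using Column = std::optional<int>;
// Hypothetical model: the unique columns of one index entry, e.g. (c,d).
using Key = std::vector<Column>;

// Returns true when two keys would conflict in a UNIQUE index.
// A NULL in either key at any position means "not equal", so such
// keys never conflict, even if all other columns match.
inline bool unique_conflict(const Key& a, const Key& b) {
    assert(a.size() == b.size());  // both keys belong to the same index
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (!a[i] || !b[i]) return false;  // NULL never equals anything
        if (*a[i] != *b[i]) return false;  // differing non-NULL values
    }
    return true;  // every column is non-NULL and equal
}
```

Under this model the two (c,d)=(NULL,0) keys do not conflict, while the two (e)=(0) keys do, which matches why only the index on (e) triggered the rollback.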
| Comment by Marko Mäkelä [ 2019-05-03 ] | |||||||||||||||||||
|
I can reproduce the crash with lower-level operations (no AUTO_INCREMENT):
It will crash on the last INSERT. | |||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-05-03 ] | |||||||||||||||||||
|
This could be the minimal test case, with explicit ROLLBACK and with a non-unique secondary index:
| |||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-05-03 ] | |||||||||||||||||||
|
The root cause is that dtuple_get_nth_v_field() is implemented differently in debug and release builds, and that the debug build was missing
to document and enforce the assumption.
I think that the #define might have made some sense in the old days when InnoDB was C code. Motivated by this finding, I reimplemented the accessors as const-preserving inline functions and removed the duplication between the .h and .ic files. While doing that, I noticed many places where InnoDB was throwing away const-ness, mostly related to virtual columns and spatial indexes. | |||||||||||||||||||
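To make the idea of a const-preserving inline accessor concrete, here is a minimal sketch. It is not the actual InnoDB change: Field, Tuple, TUPLE_GET_NTH_V_FIELD and tuple_get_nth_v_field are hypothetical stand-ins, and the layout of the real dtuple_t is not reproduced. The sketch only contrasts a macro that silently discards const-ness with an overloaded pair of inline functions that keep it and assert their precondition.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; these are not the InnoDB definitions.
struct Field {
    int value;
};

struct Tuple {
    std::vector<Field> v_fields;  // fields of virtual columns
};

// Macro-style accessor (assumed shape, for contrast only): no bounds
// check, and the C-style cast throws away const-ness of the tuple.
#define TUPLE_GET_NTH_V_FIELD(t, n) ((Field*) &(t)->v_fields[(n)])

// Const-preserving replacement: a const tuple yields a const field.
inline const Field* tuple_get_nth_v_field(const Tuple* t, std::size_t n) {
    assert(n < t->v_fields.size());  // document and enforce the precondition
    return &t->v_fields[n];
}

// Non-const overload: a mutable tuple yields a mutable field.
inline Field* tuple_get_nth_v_field(Tuple* t, std::size_t n) {
    assert(n < t->v_fields.size());
    return &t->v_fields[n];
}
```

Because both overloads share one definition style for every build type, a debug build and a release build compile the same logic, and the compiler rejects an attempt to write through a field obtained from a const tuple.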
| Comment by Marko Mäkelä [ 2019-05-03 ] | |||||||||||||||||||
|
10.2 might be unaffected by this bug, but 10.3 definitely was affected if instant ADD COLUMN was being used on a table that already contained virtual columns. I have an explanation for why elenst failed to reproduce a crash in 10.4. In MDEV-17468 I wrote:
| |||||||||||||||||||
| Comment by Vincent Milum Jr [ 2019-06-05 ] | |||||||||||||||||||
|
This is still persisting in 10.3.15 with slightly different conditions. I've been tracking them in |