[MDEV-29666] InnoDB fails to purge secondary index records when indexed virtual columns exist Created: 2022-09-29 Updated: 2022-11-16 Resolved: 2022-10-05 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB, Virtual Columns |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10 |
| Fix Version/s: | 10.3.37, 10.4.27, 10.5.18, 10.6.11, 10.7.7, 10.8.6, 10.9.4, 10.10.2, 10.11.1 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | leak, purge | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||
| Description |
|
When I tested
According to the od output, all history for the clustered index record (data fields starting at 0xc07e = 0xc063+0x001b) was purged. The fields are (DB_ROW_ID,DB_TRX_ID,DB_ROLL_PTR,a,b)=(0x200,0,1<<55,2,3):
In the secondary indexes, the delete-marked records created by the UPDATE were not purged. Here is the entry for index(a1):
We see both the delete-marked (a,DB_ROW_ID)=(1,0x200) at 0x1007e as well as the correct record (a,DB_ROW_ID)=2,0x200) at 0x1008e. There is a garbage record in index(b) as well:
The delete-marked (b,DB_ROW_ID)=(1,0x200) at 0x1e07e should have been purged. Only the record (b,DB_ROW_ID)=(3,0x200) should be there. I tracked this down to the following code in the function row_purge_upd_exist_or_extern_func():
We end up with entry=nullptr for both indexes, because node->row is invalid or being accessed incorrectly. The pointer to data_error indicates a column that has not been fully initialized:
|
| Comments |
| Comment by Marko Mäkelä [ 2022-09-29 ] | |||||||||||||||||||||||||||||||||||||||||||||
|
As far as I can tell, the problem was caused by the unnecessary and problematic separation of virtual and regular columns in MySQL 5.7.8. The function trx_undo_rec_get_partial_row() will leave a gap in the table row tuple, and instead copy the value of the indexed virtual column a1=1 from the undo log record to a separate array:
After the MDEV-22363 cleanup, we would just allocate a table row tuple containing all the columns in the same order as TABLE_SHARE::field, including any virtual columns. Some insult to injury is being added in row_build_index_entry_low():
The assertion in the else branch had been relaxed in MySQL 5.7.9, with no analysis, explanation or test case, or identification of the failing assertion. The column that we fail to find in the clustered index row tuple is very much present in the undo log record, because it is the primary key GEN_CLUST_INDEX(DB_ROW_ID):
| |||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-09-29 ] | |||||||||||||||||||||||||||||||||||||||||||||
|
In the split encoding of the table row tuple, row_build_index_entry_low() should have the persistent columns of the table in row->fields, indexed by the column position (as if no virtual columns existed), and the virtual columns in row->v_fields. That is, row->fields should contain the value of (a,b,DB_ROW_ID,DB_TRX_ID,DB_ROLL_PTR) and row->v_fields the value of (a1). This error affects all secondary indexes because the value of the hidden primary key column DB_ROW_ID is missing. If the UPDATE statement is changed to UPDATE t1 SET b=3 (no change to the indexed virtual column), then trx_undo_rec_get_partial_row() will be invoked with the following:
The hidden primary key column DB_ROW_ID at field_no=0 is not affected by the update, but its value will be read by trx_undo_rec_get_col_val() from ptr+3 that was passed to trx_undo_rec_get_partial_row() and correctly stored to the row tuple. The primary key was also already parsed and stored in purge_node_t::ref by the caller. In the case of UPDATE t1 SET a=2, b=3, the ptr that is passed to trx_undo_rec_get_partial_row() will look different:
The record for the DB_ROW_ID=0x200 is completely missing. Even if I add an explicit primary key, the outcome is the same:
Here, we will have two records in the index: a delete-marked (1,1) that should have been purged, and the correct record (b,pk)=(3,1). This bug reminds me of |