Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL)
Description
When I tested MDEV-24402 CHECK TABLE...EXTENDED, the test gcol.innodb_virtual_purge would fail because secondary indexes contain orphan records. Here is a reduced test case:
--source include/have_innodb.inc
|
|
CREATE TABLE t1 (a INT, b INT, a1 INT AS(a) VIRTUAL, |
INDEX(a1),INDEX(b)) ENGINE=InnoDB; |
INSERT INTO t1 SET a=1, b=1; |
|
UPDATE t1 SET a=2, b=3; |
let $datadir=`select @@datadir`; |
let $sec=`select 3*@@innodb_page_size`; |
|
SET GLOBAL innodb_fast_shutdown=0; |
--source include/shutdown_mysqld.inc
|
--exec od -Ax -t x1 -j $sec $datadir/test/t1.ibd
|
--source include/start_mysqld.inc
|
DROP TABLE t1; |
According to the od output, all history for the clustered index record (data fields starting at 0xc07e = 0xc063+0x001b) was purged. The fields are (DB_ROW_ID,DB_TRX_ID,DB_ROLL_PTR,a,b)=(0x200,0,1<<55,2,3):
10.3 ce23802c0e4c85018f2141d2ebb0602dc0d3d6d8 |
00c060 02 00 1b 69 6e 66 69 6d 75 6d 00 02 00 0b 00 00
|
00c070 73 75 70 72 65 6d 75 6d 00 00 00 10 ff f2 00 00
|
00c080 00 00 02 00 00 00 00 00 00 00 80 00 00 00 00 00
|
00c090 00 80 00 00 02 80 00 00 03 00 00 00 00 00 00 00
|
In the secondary indexes, the delete-marked records created by the UPDATE were not purged. Here is the entry for index(a1):
10.3 ce23802c0e4c85018f2141d2ebb0602dc0d3d6d8 |
010060 02 00 1b 69 6e 66 69 6d 75 6d 00 03 00 0b 00 00
|
010070 73 75 70 72 65 6d 75 6d 00 20 00 10 00 10 80 00
|
010080 00 01 00 00 00 00 02 00 00 00 00 18 ff e2 80 00
|
010090 00 02 00 00 00 00 02 00 00 00 00 00 00 00 00 00
|
We see both the delete-marked (a,DB_ROW_ID)=(1,0x200) at 0x1007e as well as the correct record (a,DB_ROW_ID)=2,0x200) at 0x1008e. There is a garbage record in index(b) as well:
10.3 ce23802c0e4c85018f2141d2ebb0602dc0d3d6d8 |
014060 02 00 1b 69 6e 66 69 6d 75 6d 00 03 00 0b 00 00
|
014070 73 75 70 72 65 6d 75 6d 00 20 00 10 00 10 80 00
|
014080 00 01 00 00 00 00 02 00 00 00 00 18 ff e2 80 00
|
014090 00 03 00 00 00 00 02 00 00 00 00 00 00 00 00 00
|
The delete-marked (b,DB_ROW_ID)=(1,0x200) at 0x1e07e should have been purged. Only the record (b,DB_ROW_ID)=(3,0x200) should be there.
I tracked this down to the following code in the function row_purge_upd_exist_or_extern_func():
if (row_upd_changes_ord_field_binary(node->index, node->update, |
thr, NULL, NULL)) {
|
/* Build the older version of the index entry */ |
dtuple_t* entry = row_build_index_entry_low(
|
node->row, NULL, node->index,
|
heap, ROW_BUILD_FOR_PURGE);
|
row_purge_remove_sec_if_poss(node, node->index, entry);
|
We end up with entry=nullptr for both indexes, because node->row is invalid or being accessed incorrectly. The pointer to data_error indicates a column that has not been fully initialized:
10.3 ce23802c0e4c85018f2141d2ebb0602dc0d3d6d8 |
Thread 22 hit Breakpoint 2, row_purge_remove_sec_if_poss (node=node@entry=0x5613c42f37a0, index=0x7f8414167a50, entry=0x0) at /mariadb/10.3/storage/innobase/row/row0purge.cc:688
|
688 if (!entry) {
|
1: node.table.name = {m_name = 0x7f8414158690 "test/t1"}
|
(rr) p *node.row
|
$1 = {info_bits = 0, n_fields = 5, n_fields_cmp = 5, fields = 0x7f841c009270,
|
n_v_fields = 1, v_fields = 0x7f841c0092e8, magic_n = 65478679}
|
(rr) p *node.row.fields@5
|
$2 = {{data = 0x5613c42f3d5d, ext = 0, spatial_status = 0, len = 4, type = {
|
prtype = 1027, mtype = 6, len = 4, mbminlen = 0, mbmaxlen = 0}}, {
|
data = 0x5613c42f3d63, ext = 0, spatial_status = 0, len = 4, type = {
|
prtype = 1027, mtype = 6, len = 4, mbminlen = 0, mbmaxlen = 0}}, {
|
data = 0x5613c3518908 <data_error>, ext = 0, spatial_status = 0,
|
len = 4294967295, type = {prtype = 111, mtype = 0, len = 0, mbminlen = 0,
|
mbmaxlen = 0}}, {data = 0x7f841c009210, ext = 0, spatial_status = 0,
|
len = 6, type = {prtype = 257, mtype = 8, len = 6, mbminlen = 0,
|
mbmaxlen = 0}}, {data = 0x7f841c009218, ext = 0, spatial_status = 0,
|
len = 7, type = {prtype = 258, mtype = 8, len = 7, mbminlen = 0,
|
mbmaxlen = 0}}}
|
Attachments
Issue Links
- causes
-
MDEV-30024 InnoDB: tried to purge non-delete-marked record of an index on a virtual column prefix
-
- Closed
-
- is caused by
-
MDEV-5800 indexes on virtual (not materialized) columns
-
- Closed
-
- relates to
-
MDEV-22363 Reimplement the InnoDB virtual column code
-
- Open
-
-
MDEV-29823 Secondary index records may never be purged after rollback
-
- Confirmed
-
-
MDEV-14051 'Undo log record is too big.' error occurring in very narrow range of string lengths
-
- Closed
-
-
MDEV-14799 After UPDATE of indexed columns, old values will not be purged from secondary indexes
-
- Closed
-
-
MDEV-20640 [ERROR] InnoDB: tried to purge non-delete-marked record and Assertion `0' failed in row_purge_remove_sec_if_poss_leaf
-
- Closed
-
-
MDEV-24402 CHECK TABLE may miss some cases of index inconsistencies
-
- Closed
-
As far as I can tell, the problem was caused by the unnecessary and problematic separation of virtual and regular columns in MySQL 5.7.8. The function trx_undo_rec_get_partial_row() will leave a gap in the table row tuple, and instead copy the value of the indexed virtual column a1=1 from the undo log record to a separate array:
dict_v_col_t* vcol = dict_table_get_nth_v_col(
index->table, field_no);
col = &vcol->m_col;
col_no = dict_col_get_no(col);
dfield = dtuple_get_nth_v_field(*row, vcol->v_pos);
dict_col_copy_type(
&vcol->m_col,
dfield_get_type(dfield));
…
}
dfield_set_data(dfield, field, len);
After the MDEV-22363 cleanup, we would just allocate a table row tuple containing all the columns in the same order as TABLE_SHARE::field, including any virtual columns.
Some insult to injury is being added in row_build_index_entry_low():
ut_ad(v_col->v_pos < dtuple_get_n_v_fields(row));
dfield2 = dtuple_get_nth_v_field(row, v_col->v_pos);
ut_ad(dfield_is_null(dfield2) ||
dfield_get_len(dfield2) == 0 || dfield2->data);
dfield2 = dtuple_get_nth_field(row, col_no);
ut_ad(dfield_get_type(dfield2)->mtype == DATA_MISSING
|| (!(dfield_get_type(dfield2)->prtype
& DATA_VIRTUAL)));
}
== DATA_MISSING)) {
This should be from trx_undo_rec_get_partial_row(). */
}
The assertion in the else branch had been relaxed in MySQL 5.7.9, with no analysis, explanation or test case, or identification of the failing assertion.
The column that we fail to find in the clustered index row tuple is very much present in the undo log record, because it is the primary key GEN_CLUST_INDEX(DB_ROW_ID):
10.3 ce23802c0e4c85018f2141d2ebb0602dc0d3d6d8
Thread 3 hit Hardware access (read/write) watchpoint 7: *(dfield_t *) 0x7f841c0092a0
Value = {data = 0x5613c3518908 <data_error>, ext = 0, spatial_status = 0, len = 4294967295, type = {prtype = 111, mtype = 0, len = 0, mbminlen = 0, mbmaxlen = 0}}
0x00005613c30e496b in row_build_index_entry_low (row=0x7f841c009238, ext=ext@entry=0x0, index=0x7f8414168110, heap=heap@entry=0x7f841c00aff0, flag=flag@entry=1) at /mariadb/10.3/storage/innobase/row/row0row.cc:259
259 ut_ad(dfield_get_type(dfield2)->mtype == DATA_MISSING
(rr) p *ind_field
$25 = {col = 0x7f841400ca20, name = {m_name = 0x7f841400cb2c "DB_ROW_ID"}, prefix_len = 0, fixed_len = 6}