[MDEV-25459] MVCC read from index on CHAR or VARCHAR wrongly omits rows - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6
Fix Version/s: 10.2.38, 10.3.29, 10.4.19, 10.5.10, 10.6.1
Component/s: Storage Engine - InnoDB
Labels:
- rr-profile-analyzed
- upstream

Description

origin/10.6 af418bb9ef7e422282dc976640409a6af8fcd58c 2021-04-19T14:04:02+10:00

Scenario:

1. Start the server, create one table and fill it with 100 rows.

2. Session 1 runs in a loop some update like

     UPDATE table100_innodb_int_autoinc SET `col_varchar_255_ucs2_key` = CONVERT( 'degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui' USING ASCII )

    Session 2 runs in a loop

    CHECK TABLE table100_innodb_int_autoinc EXTENDED

After short time the CHECK TABLE harvests a

test.table100_innodb_int_autoinc check Warning InnoDB: Index 'col_varchar_255_ucs2_key' contains 98 entries, should be 100.

sdp:/data/Results/1618846463/TBR-36/dev/shm/vardir/1618846463/4/1/rr

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

simp_many_indexes.cfg
43 kB
2021-04-20 09:38
TBR-36_micro.yy
0.9 kB
2021-04-20 09:38
TBR-36.zz
1 kB
2021-04-20 09:37

Issue Links

causes

MDEV-25758 InnoDB spatial indexes miss large geometry fields after MDEV-25459

Closed

relates to

MDEV-24402 CHECK TABLE may miss some cases of index inconsistencies

Closed

Activity

Ascending order - Click to sort in descending order

Matthias Leich created issue - 2021-04-20 09:33

Matthias Leich added a comment - 2021-04-20 09:35 - edited

RQG error patterns

   [ 'TBR-36'     , 'InnoDB: Index .{1,30} contains .{1,10} entries, should be .{1,10}.' ],

   [ 'TBR-36'     , 'failed: 1712 Index .* is corrupted' ],

   [ 'TBR-263'    , 'Index for table .{1,150} is corrupt; try to repair it' ],

git clone https://github.com/mleich1/rqg --branch experimental RQG

./RR_REPLAY.sh simp_many_indexes.cfg <path to MariaDb binaries>  TBR-36_micro.yy

Matthias Leich added a comment - 2021-04-20 09:35 - edited RQG error patterns [ 'TBR-36' , 'InnoDB: Index .{1,30} contains .{1,10} entries, should be .{1,10}.' ], [ 'TBR-36' , 'failed: 1712 Index .* is corrupted' ], [ 'TBR-263' , 'Index for table .{1,150} is corrupt; try to repair it' ], git clone https://github.com/mleich1/rqg --branch experimental RQG ./RR_REPLAY.sh simp_many_indexes.cfg <path to MariaDb binaries> TBR-36_micro.yy

Matthias Leich made changes - 2021-04-20 09:37

Field	Original Value	New Value
Attachment		TBR-36.zz [ 57332 ]

Matthias Leich made changes - 2021-04-20 09:38

Attachment

TBR-36_micro.yy [ 57333 ]

Matthias Leich made changes - 2021-04-20 09:38

Attachment

simp_many_indexes.cfg [ 57334 ]

Matthias Leich made changes - 2021-04-20 09:41

Assignee	Matthias Leich [ mleich ]	Marko Mäkelä [ marko ]
Description	{noformat} origin/10.6 af418bb9ef7e422282dc976640409a6af8fcd58c 2021-04-19T14:04:02+10:00 Scenario: 1. Start the server, create one table and fill it with 100 rows. 2. Session 1 runs in a loop some update like UPDATE table100_innodb_int_autoinc SET `col_varchar_255_ucs2_key` = CONVERT( 'degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui' USING ASCII ) Session 2 runs in a loop CHECK TABLE table100_innodb_int_autoinc EXTENDED After short time the CHECK TABLE harvests a test.table100_innodb_int_autoinc check Warning InnoDB: Index 'col_varchar_255_ucs2_key' contains 98 entries, should be 100. sdp:/data/Results/1618846463/TBR-36/dev/shm/vardir/1618846463/4/1/rr {noformat}	{noformat} origin/10.6 af418bb9ef7e422282dc976640409a6af8fcd58c 2021-04-19T14:04:02+10:00 Scenario: 1. Start the server, create one table and fill it with 100 rows. 2. Session 1 runs in a loop some update like UPDATE table100_innodb_int_autoinc SET `col_varchar_255_ucs2_key` = CONVERT( 'degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui' USING ASCII ) Session 2 runs in a loop CHECK TABLE table100_innodb_int_autoinc EXTENDED After short time the CHECK TABLE harvests a test.table100_innodb_int_autoinc check Warning InnoDB: Index 'col_varchar_255_ucs2_key' contains 98 entries, should be 100. sdp:/data/Results/1618846463/TBR-36/dev/shm/vardir/1618846463/4/1/rr {noformat}
Summary	Draft: Corrupt index caused by some UPDATE+CHECK TABLE	Corrupt index caused by some concurrent UPDATE+CHECK TABLE

Marko Mäkelä made changes - 2021-04-20 10:55

Status

Open [ 1 ]

In Progress [ 3 ]

Marko Mäkelä added a comment - 2021-04-20 12:12

I accessed the index contents of the data directory of the rr replay trace using a patched 10.6 server:

diff --git a/storage/innobase/include/dict0mem.h b/storage/innobase/include/dict0mem.h

index a7f04c3015e..d5d9bfe3715 100644

--- a/storage/innobase/include/dict0mem.h

+++ b/storage/innobase/include/dict0mem.h

@@ -2473,7 +2473,6 @@ inline bool dict_index_t::is_instant() const

 inline bool dict_index_t::is_corrupted() const

 	return UNIV_UNLIKELY(online_status >= ONLINE_INDEX_ABORTED

-			     || (type & DICT_CORRUPT)

 			     || (table && table->corrupted));

To avoid a debug assertion failure in purge, I started the patched server with innodb_force_recovery=2. There are 100 records in both the secondary index and the primary key index, but the primary key index contains the string 'egsotrsfannidwyvkuvlkeslrryhpkeevqmb' in many columns, while the secondary index is missing this update. Due to the way how the secondary index MVCC works, some of the secondary index records will be ignored, and a count mismatch will be reported.

It would be nice to get a simpler rr replay trace of this. It is possible that the problem is related to page splits and merges in the clustered index. Apparently only 14 of the 100 rows would match the WHERE condition. Would this be repeatable with fewer than 86 'extra' rows? An mtr based test would help me a lot.

Marko Mäkelä added a comment - 2021-04-20 12:12 I accessed the index contents of the data directory of the rr replay trace using a patched 10.6 server: diff --git a/storage/innobase/include/dict0mem.h b/storage/innobase/include/dict0mem.h index a7f04c3015e..d5d9bfe3715 100644 --- a/storage/innobase/include/dict0mem.h +++ b/storage/innobase/include/dict0mem.h @@ -2473,7 +2473,6 @@ inline bool dict_index_t::is_instant() const inline bool dict_index_t::is_corrupted() const { return UNIV_UNLIKELY(online_status >= ONLINE_INDEX_ABORTED - || (type & DICT_CORRUPT) || (table && table->corrupted)); } To avoid a debug assertion failure in purge, I started the patched server with innodb_force_recovery=2 . There are 100 records in both the secondary index and the primary key index, but the primary key index contains the string 'egsotrsfannidwyvkuvlkeslrryhpkeevqmb' in many columns, while the secondary index is missing this update. Due to the way how the secondary index MVCC works, some of the secondary index records will be ignored, and a count mismatch will be reported. It would be nice to get a simpler rr replay trace of this. It is possible that the problem is related to page splits and merges in the clustered index. Apparently only 14 of the 100 rows would match the WHERE condition. Would this be repeatable with fewer than 86 'extra' rows? An mtr based test would help me a lot.

Matthias Leich added a comment - 2021-04-21 14:50

sdp:/data/Results/1619015791/TBR-36-MDEV-25459/dev/shm/vardir/1619015791/4/1/rr

CREATE DATABASE /*!32312 IF NOT EXISTS*/ `test` /*!40100 DEFAULT CHARACTER SET latin1 */;

USE `test`;

/*!40101 SET @saved_cs_client     = @@character_set_client */;

/*!40101 SET character_set_client = utf8 */;

CREATE TABLE `t1` (

  `col_varchar_255_ucs2_key` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,

  `col_int` INT,

  `col_char_255_ucs2` char(255) CHARACTER SET ucs2 DEFAULT NULL,

  `col_varchar_255_ucs2` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,

  `col_char_255_utf8_key` char(255) CHARACTER SET utf8 DEFAULT NULL,

  `col_char_255_utf8` char(255) CHARACTER SET utf8 DEFAULT NULL,

  `pk` int(11) NOT NULL AUTO_INCREMENT,

  PRIMARY KEY (`pk`),

  KEY `col_varchar_255_ucs2_key` (`col_varchar_255_ucs2_key`)

) ENGINE=InnoDB AUTO_INCREMENT=95 DEFAULT CHARSET=latin1;

/*!40101 SET character_set_client = @saved_cs_client */;

INSERT INTO `t1` VALUES ('ship',NULL,'unbrfeqlhdxamoxyjrdpanbbojmgfvaikcdulupfslgcghacscrvuvbrwnoqifmmmcbcqzzvplrdhaetgjpebshvxjtyjeytzfsvthwhvh','nunbrfeqlhdxamoxyjrdpanbbojmgfvaikcdulupfslgcghacscrvuvbrwnoqifmmmcbcqzzvplrdhaetgjpebshvxjtyjeytzfsvthwhvhfgzlecuozuucsaybsdbrakgyprqnygxwagryhzmfirpduujnymlfijzbdosrmgienkjovqktvkjxwxdvgcldiilzcaurcpeuturnvpnirubyhzpxa','judgment','active',63);

/*!40101 SET @saved_cs_client     = @@character_set_client */;

/*!40101 SET character_set_client = utf8 */;

RQG grammar:

thread1:

    UPDATE t1 SET `col_varchar_255_ucs2_key` = CONVERT( 'degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui' USING ASCII ) WHERE col_int IS NULL ;

thread2:

    UPDATE t1 SET `col_varchar_255_ucs2_key` = CONVERT( _varchar(255) USING ASCII ) |

    CHECK TABLE t1 EXTENDED ;

The RQG run needs only "--threads=2".

Matthias Leich added a comment - 2021-04-21 14:50 sdp:/data/Results/1619015791/TBR-36-MDEV-25459/dev/shm/vardir/1619015791/4/1/rr CREATE DATABASE /*!32312 IF NOT EXISTS*/ `test` /*!40100 DEFAULT CHARACTER SET latin1 */; USE `test`; /*!40101 SET @saved_cs_client = @@character_set_client */; /*!40101 SET character_set_client = utf8 */; CREATE TABLE `t1` ( `col_varchar_255_ucs2_key` varchar(255) CHARACTER SET ucs2 DEFAULT NULL, `col_int` INT, `col_char_255_ucs2` char(255) CHARACTER SET ucs2 DEFAULT NULL, `col_varchar_255_ucs2` varchar(255) CHARACTER SET ucs2 DEFAULT NULL, `col_char_255_utf8_key` char(255) CHARACTER SET utf8 DEFAULT NULL, `col_char_255_utf8` char(255) CHARACTER SET utf8 DEFAULT NULL, `pk` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`pk`), KEY `col_varchar_255_ucs2_key` (`col_varchar_255_ucs2_key`) ) ENGINE=InnoDB AUTO_INCREMENT=95 DEFAULT CHARSET=latin1; /*!40101 SET character_set_client = @saved_cs_client */; INSERT INTO `t1` VALUES ('ship',NULL,'unbrfeqlhdxamoxyjrdpanbbojmgfvaikcdulupfslgcghacscrvuvbrwnoqifmmmcbcqzzvplrdhaetgjpebshvxjtyjeytzfsvthwhvh','nunbrfeqlhdxamoxyjrdpanbbojmgfvaikcdulupfslgcghacscrvuvbrwnoqifmmmcbcqzzvplrdhaetgjpebshvxjtyjeytzfsvthwhvhfgzlecuozuucsaybsdbrakgyprqnygxwagryhzmfirpduujnymlfijzbdosrmgienkjovqktvkjxwxdvgcldiilzcaurcpeuturnvpnirubyhzpxa','judgment','active',63); /*!40101 SET @saved_cs_client = @@character_set_client */; /*!40101 SET character_set_client = utf8 */; RQG grammar: thread1: UPDATE t1 SET `col_varchar_255_ucs2_key` = CONVERT( 'degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui' USING ASCII ) WHERE col_int IS NULL ; thread2: UPDATE t1 SET `col_varchar_255_ucs2_key` = CONVERT( _varchar(255) USING ASCII ) | CHECK TABLE t1 EXTENDED ; The RQG run needs only "--threads=2".

Marko Mäkelä added a comment - 2021-04-21 15:35

Thank you, the new trace is perfect. Just one page in the clustered index (with 1 record) and one in the secondary index:

Clustered index:
(pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)=
(63,0x3e,(update),'degsotrsfann'…,…)
Secondary index (PAGE_MAX_TRX_ID=0x3e):
(col_varchar_255_ucs2_key,pk)=
('degsotrsfann'…,63),
('qemleuzymt'…,63)(delete-marked)

The full length of the secondary index column is 340 bytes (2 bytes per ASCII character, UTF-16BE format). Those byte strings are identical between the secondary and clustered index. Here is the string in ASCII (omitting every second byte):

perl -pe 's/.(.)/$1/ge' < /dev/shm/340.bin
degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui

On a quick look, the collation function seems to work correctly. I wonder if this could be another problem with this optimization that was introduced in ~~MDEV-20301~~:

10.6 af418bb9ef7e422282dc976640409a6af8fcd58c
#0 row_sel_sec_rec_is_for_clust_rec (sec_rec=0x7f24c5fb9093 "", sec_index=0x6160008abe08, clust_rec=0x61b000064b16 "\200", clust_index=0x6160008a4308, thr=0x621000261580)
at /data/Server/10.6A/storage/innobase/row/row0sel.cc:183
#1 0x000055a4ef3c0a8d in Row_sel_get_clust_rec_for_mysql::operator() (this=0x6e0225b6b1f0, prebuilt=0x621000260d88, sec_index=0x6160008abe08, rec=0x7f24c5fb9093 "", thr=0x621000261580, out_rec=0x6e0225b6b0d0,
offsets=0x6e0225b6b150, offset_heap=0x6e0225b6b130, vrow=0x0, mtr=0x6e0225b6bc10) at /data/Server/10.6A/storage/innobase/row/row0sel.cc:3496
#2 0x000055a4ef3ccb93 in row_search_mvcc (buf=0x6210002bc100 '\276' <repeats 200 times>..., mode=PAGE_CUR_G, prebuilt=0x621000260d88, match_mode=0, direction=0)
at /data/Server/10.6A/storage/innobase/row/row0sel.cc:5383
#3 0x000055a4ef336896 in row_search_for_mysql (buf=0x6210002bc100 '\276' <repeats 200 times>..., mode=PAGE_CUR_G, prebuilt=0x621000260d88, match_mode=0, direction=0)
at /data/Server/10.6A/storage/innobase/include/row0sel.ic:137
#4 0x000055a4ef351d0a in row_scan_index_for_mysql (prebuilt=0x621000260d88, index=0x6160008abe08, n_rows=0x6e0225b6ca30) at /data/Server/10.6A/storage/innobase/row/row0mysql.cc:4631
#5 0x000055a4eeff45b9 in ha_innobase::check (this=0x61d0008246b8, thd=0x62b00011f218, check_opt=0x62b0001244a8) at /data/Server/10.6A/storage/innobase/handler/ha_innodb.cc:14665

Marko Mäkelä added a comment - 2021-04-21 15:35 Thank you, the new trace is perfect. Just one page in the clustered index (with 1 record) and one in the secondary index: Clustered index: (pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)= (63,0x3e,(update),'degsotrsfann'…,…) Secondary index (PAGE_MAX_TRX_ID=0x3e): (col_varchar_255_ucs2_key,pk)= ('degsotrsfann'…,63), ('qemleuzymt'…,63)(delete-marked) The full length of the secondary index column is 340 bytes (2 bytes per ASCII character, UTF-16BE format). Those byte strings are identical between the secondary and clustered index. Here is the string in ASCII (omitting every second byte): perl -pe 's/.(.)/$1/ge' < /dev/shm/340.bin degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui On a quick look, the collation function seems to work correctly. I wonder if this could be another problem with this optimization that was introduced in MDEV-20301 : 10.6 af418bb9ef7e422282dc976640409a6af8fcd58c #0 row_sel_sec_rec_is_for_clust_rec (sec_rec=0x7f24c5fb9093 "", sec_index=0x6160008abe08, clust_rec=0x61b000064b16 "\200", clust_index=0x6160008a4308, thr=0x621000261580) at /data/Server/10.6A/storage/innobase/row/row0sel.cc:183 #1 0x000055a4ef3c0a8d in Row_sel_get_clust_rec_for_mysql::operator() (this=0x6e0225b6b1f0, prebuilt=0x621000260d88, sec_index=0x6160008abe08, rec=0x7f24c5fb9093 "", thr=0x621000261580, out_rec=0x6e0225b6b0d0, offsets=0x6e0225b6b150, offset_heap=0x6e0225b6b130, vrow=0x0, mtr=0x6e0225b6bc10) at /data/Server/10.6A/storage/innobase/row/row0sel.cc:3496 #2 0x000055a4ef3ccb93 in row_search_mvcc (buf=0x6210002bc100 '\276' <repeats 200 times>..., mode=PAGE_CUR_G, prebuilt=0x621000260d88, match_mode=0, direction=0) at /data/Server/10.6A/storage/innobase/row/row0sel.cc:5383 #3 0x000055a4ef336896 in row_search_for_mysql (buf=0x6210002bc100 '\276' <repeats 200 times>..., mode=PAGE_CUR_G, prebuilt=0x621000260d88, match_mode=0, direction=0) at /data/Server/10.6A/storage/innobase/include/row0sel.ic:137 #4 0x000055a4ef351d0a in row_scan_index_for_mysql (prebuilt=0x621000260d88, index=0x6160008abe08, n_rows=0x6e0225b6ca30) at /data/Server/10.6A/storage/innobase/row/row0mysql.cc:4631 #5 0x000055a4eeff45b9 in ha_innobase::check (this=0x61d0008246b8, thd=0x62b00011f218, check_opt=0x62b0001244a8) at /data/Server/10.6A/storage/innobase/handler/ha_innodb.cc:14665

Marko Mäkelä added a comment - 2021-04-21 15:51

I will have to debug the trace further (around when 61095) to check why exactly we are assigning clust_rec=NULL here:

		if (clust_rec

		    && (old_vers

			|| trx->isolation_level <= TRX_ISO_READ_UNCOMMITTED

			|| dict_index_is_spatial(sec_index)

			|| rec_get_deleted_flag(rec, dict_table_is_comp(

							sec_index->table)))) {

			err = row_sel_sec_rec_is_for_clust_rec(rec, sec_index,

						clust_rec, clust_index, thr);

			switch (err) {

			case DB_SUCCESS:

				clust_rec = NULL;

				break;

Currently it looks like we are constructing wrong old_vers with (pk=63,DB_TRX_ID=0x3c,(update)) and apparently empty string. It is as if CHECK TABLE is using a newer read view for accessing the clustered index.

Marko Mäkelä added a comment - 2021-04-21 15:51 I will have to debug the trace further (around when 61095) to check why exactly we are assigning clust_rec=NULL here: if (clust_rec && (old_vers || trx->isolation_level <= TRX_ISO_READ_UNCOMMITTED || dict_index_is_spatial(sec_index) || rec_get_deleted_flag(rec, dict_table_is_comp( sec_index->table)))) { err = row_sel_sec_rec_is_for_clust_rec(rec, sec_index, clust_rec, clust_index, thr); switch (err) { case DB_SUCCESS: clust_rec = NULL; break ; Currently it looks like we are constructing wrong old_vers with (pk=63,DB_TRX_ID=0x3c,(update)) and apparently empty string. It is as if CHECK TABLE is using a newer read view for accessing the clustered index.

Marko Mäkelä added a comment - 2021-04-23 09:18

For the clustered index scan in the failing CHECK TABLE, row_sel_clust_sees() will not hold for the single record
(pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)=
(63,0x3e,(update),'degsotrsfann'…,…)
so we constructed old_vers that contains a BLOB reference for that column:
(pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)=
(63,0x3c,(update),(space=8, page=5, offset=0x26, length=0x1f0),…)

The payload of the BLOB page is the following UTF-16BE encoded 248-char ASCII string:

qemleuzymtgopfikwzloyibtgehtjhlablwaewqzuglpeoqnnoxsqeyzbocilomtnpaxihztdrtqahffxekdhrwdvtdvmlshvdhrgfkmuwonkyqzwelujpeyggixymdugzhiqlmnrlpxzrytoatxolflxdzknhkgyttnjqwqcutogtwviuoqkbzxlbzkzdupmxeeifroaweulyyetwdnnrvxrtoksrgcplpyubzpyntedwlgnvzehakg

The prefix of this seems to match the delete-marked record in the secondary index page that I posted earlier.

So, nothing seems to be really corrupted, but the secondary index MVCC code probably fails to read the BLOB, and instead treats the 20-byte BLOB pointer as the actual string. I will post more, once I have debugged that part of the code.

Marko Mäkelä added a comment - 2021-04-23 09:18 For the clustered index scan in the failing CHECK TABLE , row_sel_clust_sees() will not hold for the single record (pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)= (63,0x3e,(update),'degsotrsfann'…,…) so we constructed old_vers that contains a BLOB reference for that column: (pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)= (63,0x3c,(update),(space=8, page=5, offset=0x26, length=0x1f0),…) The payload of the BLOB page is the following UTF-16BE encoded 248-char ASCII string: qemleuzymtgopfikwzloyibtgehtjhlablwaewqzuglpeoqnnoxsqeyzbocilomtnpaxihztdrtqahffxekdhrwdvtdvmlshvdhrgfkmuwonkyqzwelujpeyggixymdugzhiqlmnrlpxzrytoatxolflxdzknhkgyttnjqwqcutogtwviuoqkbzxlbzkzdupmxeeifroaweulyyetwdnnrvxrtoksrgcplpyubzpyntedwlgnvzehakg The prefix of this seems to match the delete-marked record in the secondary index page that I posted earlier. So, nothing seems to be really corrupted, but the secondary index MVCC code probably fails to read the BLOB, and instead treats the 20-byte BLOB pointer as the actual string. I will post more, once I have debugged that part of the code.

Marko Mäkelä added a comment - 2021-04-23 09:41

We have this code in row_sel_sec_rec_is_for_clust_rec():

		if (ifield->prefix_len > 0 && len != UNIV_SQL_NULL

		    && sec_len != UNIV_SQL_NULL && !is_virtual) {

			if (rec_offs_nth_extern(clust_offs, clust_pos)) {

				len -= BTR_EXTERN_FIELD_REF_SIZE;

			len = dtype_get_at_most_n_mbchars(

				col->prtype, col->mbminlen, col->mbmaxlen,

				ifield->prefix_len, len, (char*) clust_field);

			if (rec_offs_nth_extern(clust_offs, clust_pos)

			    && len < sec_len) {

				if (!row_sel_sec_rec_is_for_blob(

Because the secondary index does not comprise a column prefix, but in fact the full column col_varchar_255_ucs2_key, we fail to invoke the check for BLOB (really, off-page column, in this case VARCHAR).

This should mean that MVCC is broken for full-column secondary indexes in ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED InnoDB tables if the column happened to become chosen for external storage in the clustered index. In ROW_FORMAT=REDUNDANT (the default and only format until MySQL 5.0.3) and ROW_FORMAT=COMPACT (the default between MySQL 5.0.3 and 5.7, or MariaDB Server before 10.2.2), we would always store the entire column in the clustered index.

I will have to check if something similar is broken in the implicit locking of secondary indexes, in row_vers_impl_x_locked_low(). If the call to row_build() is not fetching the externally stored VARCHAR column, locking for secondary index records could be broken. I will try to verify that on a copy of the data directory and possibly a patched server.

Marko Mäkelä added a comment - 2021-04-23 09:41 We have this code in row_sel_sec_rec_is_for_clust_rec() : if (ifield->prefix_len > 0 && len != UNIV_SQL_NULL && sec_len != UNIV_SQL_NULL && !is_virtual) { if (rec_offs_nth_extern(clust_offs, clust_pos)) { len -= BTR_EXTERN_FIELD_REF_SIZE; } len = dtype_get_at_most_n_mbchars( col->prtype, col->mbminlen, col->mbmaxlen, ifield->prefix_len, len, ( char *) clust_field); if (rec_offs_nth_extern(clust_offs, clust_pos) && len < sec_len) { if (!row_sel_sec_rec_is_for_blob( Because the secondary index does not comprise a column prefix, but in fact the full column col_varchar_255_ucs2_key , we fail to invoke the check for BLOB (really, off-page column, in this case VARCHAR ). This should mean that MVCC is broken for full-column secondary indexes in ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED InnoDB tables if the column happened to become chosen for external storage in the clustered index. In ROW_FORMAT=REDUNDANT (the default and only format until MySQL 5.0.3) and ROW_FORMAT=COMPACT (the default between MySQL 5.0.3 and 5.7, or MariaDB Server before 10.2.2), we would always store the entire column in the clustered index. I will have to check if something similar is broken in the implicit locking of secondary indexes, in row_vers_impl_x_locked_low() . If the call to row_build() is not fetching the externally stored VARCHAR column, locking for secondary index records could be broken. I will try to verify that on a copy of the data directory and possibly a patched server.

Marko Mäkelä added a comment - 2021-04-23 11:09

I can repeat the problem with the following:

--source include/innodb_page_size_small.inc

CREATE TABLE t1 (

  pk int PRIMARY KEY, c varchar(255) UNIQUE,

  d char(255), e varchar(255), f char(255), g char(255)

) ENGINE=InnoDB ROW_FORMAT=DYNAMIC DEFAULT CHARACTER SET ucs2;

INSERT INTO t1 VALUES

(1,REPEAT('c',248),REPEAT('a',106),REPEAT('b',220),REPEAT('x',14),'');

BEGIN;

UPDATE t1 SET c=REPEAT('d',170);

connect (con1,localhost,root,,);

SELECT pk FROM t1 FORCE INDEX (c);

connection default;

COMMIT;

connection con1;

SELECT pk FROM t1 FORCE INDEX (c);

disconnect con1;

connection default;

DROP TABLE t1;

10.2 922e676b43c7b5cb0f20ca67c6d2222e2fc5ec03
innodb.mvcc_secondary '16k,innodb' w1 [ pass ] 6
innodb.mvcc_secondary '8k,innodb' w3 [ pass ] 5
innodb.mvcc_secondary '4k,innodb' w2 [ fail ]
Test ended at 2021-04-23 14:06:40

CURRENT_TEST: innodb.mvcc_secondary
--- /mariadb/10.2o/mysql-test/suite/innodb/r/mvcc_secondary.result 2021-04-23 14:06:25.841488727 +0300
+++ /mariadb/10.2o/mysql-test/suite/innodb/r/mvcc_secondary.reject 2021-04-23 14:06:40.565659501 +0300
@@ -9,7 +9,6 @@
connect con1,localhost,root,,;
SELECT pk FROM t1 FORCE INDEX (c);
pk
-1
connection default;
COMMIT;
connection con1;

mysqltest: Result length mismatch

I believe that the bug affects also earlier versions, but I did not test it, because MariaDB 5.5, 10.0, 10.1 have already reached their end of life.

Also, this test case only repeats the problem for innodb_page_size=4k, it should be repeatable with any page size when using appropriately sized records.

Marko Mäkelä added a comment - 2021-04-23 11:09 I can repeat the problem with the following: --source include/innodb_page_size_small.inc CREATE TABLE t1 ( pk int PRIMARY KEY , c varchar (255) UNIQUE , d char (255), e varchar (255), f char (255), g char (255) ) ENGINE=InnoDB ROW_FORMAT= DYNAMIC DEFAULT CHARACTER SET ucs2; INSERT INTO t1 VALUES (1,REPEAT( 'c' ,248),REPEAT( 'a' ,106),REPEAT( 'b' ,220),REPEAT( 'x' ,14), '' ); BEGIN ; UPDATE t1 SET c=REPEAT( 'd' ,170); connect (con1,localhost,root,,); SELECT pk FROM t1 FORCE INDEX (c); connection default ; COMMIT ; connection con1; SELECT pk FROM t1 FORCE INDEX (c); disconnect con1; connection default ; DROP TABLE t1; 10.2 922e676b43c7b5cb0f20ca67c6d2222e2fc5ec03 innodb.mvcc_secondary '16k,innodb' w1 [ pass ] 6 innodb.mvcc_secondary '8k,innodb' w3 [ pass ] 5 innodb.mvcc_secondary '4k,innodb' w2 [ fail ] Test ended at 2021-04-23 14:06:40 CURRENT_TEST: innodb.mvcc_secondary --- /mariadb/10.2o/mysql-test/suite/innodb/r/mvcc_secondary.result 2021-04-23 14:06:25.841488727 +0300 +++ /mariadb/10.2o/mysql-test/suite/innodb/r/mvcc_secondary.reject 2021-04-23 14:06:40.565659501 +0300 @@ -9,7 +9,6 @@ connect con1,localhost,root,,; SELECT pk FROM t1 FORCE INDEX (c); pk -1 connection default; COMMIT; connection con1; mysqltest: Result length mismatch I believe that the bug affects also earlier versions, but I did not test it, because MariaDB 5.5, 10.0, 10.1 have already reached their end of life. Also, this test case only repeats the problem for innodb_page_size=4k , it should be repeatable with any page size when using appropriately sized records.

Marko Mäkelä made changes - 2021-04-23 11:09

Fix Version/s		10.2 [ 14601 ]
Fix Version/s		10.3 [ 22126 ]
Fix Version/s		10.4 [ 22408 ]
Fix Version/s		10.5 [ 23123 ]
Affects Version/s		10.2 [ 14601 ]
Affects Version/s		10.3 [ 22126 ]
Affects Version/s		10.4 [ 22408 ]
Affects Version/s		10.5 [ 23123 ]
Affects Version/s		10.6 [ 24028 ]
Affects Version/s	10.6.0 [ 24431 ]
Labels	rr-profile	rr-profile-analyzed upstream
Summary	Corrupt index caused by some concurrent UPDATE+CHECK TABLE	MVCC read from index on CHAR or VARCHAR wrongly omits rows

Marko Mäkelä added a comment - 2021-04-23 11:23

If I append FOR UPDATE or LOCK IN SHARE MODE to the first SELECT, a locking conflict will occur:

10.2 922e676b43c7b5cb0f20ca67c6d2222e2fc5ec03
mysqltest: At line 15: query 'SELECT pk FROM t1 FORCE INDEX (c) LOCK IN SHARE MODE' failed: 1205: Lock wait timeout exceeded; try restarting transaction

This suggests that no change for row_vers_impl_x_locked_low() is needed, and only row_sel_sec_rec_is_for_clust_rec() is wrong.

Marko Mäkelä added a comment - 2021-04-23 11:23 If I append FOR UPDATE or LOCK IN SHARE MODE to the first SELECT , a locking conflict will occur: 10.2 922e676b43c7b5cb0f20ca67c6d2222e2fc5ec03 mysqltest: At line 15: query 'SELECT pk FROM t1 FORCE INDEX (c) LOCK IN SHARE MODE' failed: 1205: Lock wait timeout exceeded; try restarting transaction This suggests that no change for row_vers_impl_x_locked_low() is needed, and only row_sel_sec_rec_is_for_clust_rec() is wrong.

Marko Mäkelä made changes - 2021-04-24 08:03

Fix Version/s		10.2.38 [ 25207 ]
Fix Version/s		10.3.29 [ 25206 ]
Fix Version/s		10.4.19 [ 25205 ]
Fix Version/s		10.5.10 [ 25204 ]
Fix Version/s		10.6.1 [ 24437 ]
Fix Version/s	10.2 [ 14601 ]
Fix Version/s	10.3 [ 22126 ]
Fix Version/s	10.4 [ 22408 ]
Fix Version/s	10.5 [ 23123 ]
Fix Version/s	10.6 [ 24028 ]
Resolution		Fixed [ 1 ]
Status	In Progress [ 3 ]	Closed [ 6 ]

Marko Mäkelä made changes - 2021-04-26 06:56

Link

This issue relates to ~~MDEV-24402~~ [ ~~MDEV-24402~~ ]

Jukka Santala made changes - 2021-05-22 00:28

Link

This issue causes ~~MDEV-25758~~ [ ~~MDEV-25758~~ ]

Rob Schwyzer (Inactive) made changes - 2021-06-24 20:56

Labels

rr-profile-analyzed upstream

ServiceNow rr-profile-analyzed upstream

Rob Schwyzer (Inactive) made changes - 2021-07-02 19:45

Labels

ServiceNow rr-profile-analyzed upstream

76qDvLB8Gju6Hs7nk3VY3EX42G795W5z rr-profile-analyzed upstream

Sergei Golubchik made changes - 2021-08-13 22:33

Labels

76qDvLB8Gju6Hs7nk3VY3EX42G795W5z rr-profile-analyzed upstream

rr-profile-analyzed upstream

Sergei Golubchik made changes - 2021-12-06 21:53

Workflow

MariaDB v3 [ 121179 ]

MariaDB v4 [ 159166 ]

Jira Automation (IT) made changes - 2024-07-04 02:23

Zendesk Related Tickets		201658
Zendesk active tickets		201658

People

Assignee:: Marko Mäkelä

Reporter:: Matthias Leich

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2021-04-20 09:33

Updated:: 2024-09-06 10:00

Resolved:: 2021-04-24 08:03

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.