Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25459

MVCC read from index on CHAR or VARCHAR wrongly omits rows

Details

    Description

      origin/10.6 af418bb9ef7e422282dc976640409a6af8fcd58c 2021-04-19T14:04:02+10:00
       
      Scenario:
      1. Start the server, create one table and fill it with 100 rows.
      2. Session 1 runs in a loop some update like
           UPDATE table100_innodb_int_autoinc SET `col_varchar_255_ucs2_key` = CONVERT( 'degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui' USING ASCII )
          Session 2 runs in a loop
          CHECK TABLE table100_innodb_int_autoinc EXTENDED
      After short time the CHECK TABLE harvests a
      test.table100_innodb_int_autoinc check Warning InnoDB: Index 'col_varchar_255_ucs2_key' contains 98 entries, should be 100.
       
      sdp:/data/Results/1618846463/TBR-36/dev/shm/vardir/1618846463/4/1/rr
           
      

      Attachments

        1. simp_many_indexes.cfg
          43 kB
        2. TBR-36_micro.yy
          0.9 kB
        3. TBR-36.zz
          1 kB

        Issue Links

          Activity

            mleich Matthias Leich added a comment - - edited

            RQG error patterns
               [ 'TBR-36'     , 'InnoDB: Index .{1,30} contains .{1,10} entries, should be .{1,10}.' ],
               [ 'TBR-36'     , 'failed: 1712 Index .* is corrupted' ],
               [ 'TBR-263'    , 'Index for table .{1,150} is corrupt; try to repair it' ],
             
            git clone https://github.com/mleich1/rqg --branch experimental RQG
             
            ./RR_REPLAY.sh simp_many_indexes.cfg <path to MariaDb binaries>  TBR-36_micro.yy
            

            mleich Matthias Leich added a comment - - edited RQG error patterns [ 'TBR-36' , 'InnoDB: Index .{1,30} contains .{1,10} entries, should be .{1,10}.' ], [ 'TBR-36' , 'failed: 1712 Index .* is corrupted' ], [ 'TBR-263' , 'Index for table .{1,150} is corrupt; try to repair it' ],   git clone https://github.com/mleich1/rqg --branch experimental RQG   ./RR_REPLAY.sh simp_many_indexes.cfg <path to MariaDb binaries> TBR-36_micro.yy

            I accessed the index contents of the data directory of the rr replay trace using a patched 10.6 server:

            diff --git a/storage/innobase/include/dict0mem.h b/storage/innobase/include/dict0mem.h
            index a7f04c3015e..d5d9bfe3715 100644
            --- a/storage/innobase/include/dict0mem.h
            +++ b/storage/innobase/include/dict0mem.h
            @@ -2473,7 +2473,6 @@ inline bool dict_index_t::is_instant() const
             inline bool dict_index_t::is_corrupted() const
             {
             	return UNIV_UNLIKELY(online_status >= ONLINE_INDEX_ABORTED
            -			     || (type & DICT_CORRUPT)
             			     || (table && table->corrupted));
             }
             
            

            To avoid a debug assertion failure in purge, I started the patched server with innodb_force_recovery=2. There are 100 records in both the secondary index and the primary key index, but the primary key index contains the string 'egsotrsfannidwyvkuvlkeslrryhpkeevqmb' in many columns, while the secondary index is missing this update. Due to the way how the secondary index MVCC works, some of the secondary index records will be ignored, and a count mismatch will be reported.

            It would be nice to get a simpler rr replay trace of this. It is possible that the problem is related to page splits and merges in the clustered index. Apparently only 14 of the 100 rows would match the WHERE condition. Would this be repeatable with fewer than 86 'extra' rows? An mtr based test would help me a lot.

            marko Marko Mäkelä added a comment - I accessed the index contents of the data directory of the rr replay trace using a patched 10.6 server: diff --git a/storage/innobase/include/dict0mem.h b/storage/innobase/include/dict0mem.h index a7f04c3015e..d5d9bfe3715 100644 --- a/storage/innobase/include/dict0mem.h +++ b/storage/innobase/include/dict0mem.h @@ -2473,7 +2473,6 @@ inline bool dict_index_t::is_instant() const inline bool dict_index_t::is_corrupted() const { return UNIV_UNLIKELY(online_status >= ONLINE_INDEX_ABORTED - || (type & DICT_CORRUPT) || (table && table->corrupted)); } To avoid a debug assertion failure in purge, I started the patched server with innodb_force_recovery=2 . There are 100 records in both the secondary index and the primary key index, but the primary key index contains the string 'egsotrsfannidwyvkuvlkeslrryhpkeevqmb' in many columns, while the secondary index is missing this update. Due to the way how the secondary index MVCC works, some of the secondary index records will be ignored, and a count mismatch will be reported. It would be nice to get a simpler rr replay trace of this. It is possible that the problem is related to page splits and merges in the clustered index. Apparently only 14 of the 100 rows would match the WHERE condition. Would this be repeatable with fewer than 86 'extra' rows? An mtr based test would help me a lot.

            sdp:/data/Results/1619015791/TBR-36-MDEV-25459/dev/shm/vardir/1619015791/4/1/rr
             
            CREATE DATABASE /*!32312 IF NOT EXISTS*/ `test` /*!40100 DEFAULT CHARACTER SET latin1 */;
             
            USE `test`;
            /*!40101 SET @saved_cs_client     = @@character_set_client */;
            /*!40101 SET character_set_client = utf8 */;
            CREATE TABLE `t1` (
              `col_varchar_255_ucs2_key` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,
              `col_int` INT,
              `col_char_255_ucs2` char(255) CHARACTER SET ucs2 DEFAULT NULL,
              `col_varchar_255_ucs2` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,
              `col_char_255_utf8_key` char(255) CHARACTER SET utf8 DEFAULT NULL,
              `col_char_255_utf8` char(255) CHARACTER SET utf8 DEFAULT NULL,
              `pk` int(11) NOT NULL AUTO_INCREMENT,
              PRIMARY KEY (`pk`),
              KEY `col_varchar_255_ucs2_key` (`col_varchar_255_ucs2_key`)
            ) ENGINE=InnoDB AUTO_INCREMENT=95 DEFAULT CHARSET=latin1;
            /*!40101 SET character_set_client = @saved_cs_client */;
             
            INSERT INTO `t1` VALUES ('ship',NULL,'unbrfeqlhdxamoxyjrdpanbbojmgfvaikcdulupfslgcghacscrvuvbrwnoqifmmmcbcqzzvplrdhaetgjpebshvxjtyjeytzfsvthwhvh','nunbrfeqlhdxamoxyjrdpanbbojmgfvaikcdulupfslgcghacscrvuvbrwnoqifmmmcbcqzzvplrdhaetgjpebshvxjtyjeytzfsvthwhvhfgzlecuozuucsaybsdbrakgyprqnygxwagryhzmfirpduujnymlfijzbdosrmgienkjovqktvkjxwxdvgcldiilzcaurcpeuturnvpnirubyhzpxa','judgment','active',63);
             
            /*!40101 SET @saved_cs_client     = @@character_set_client */;
            /*!40101 SET character_set_client = utf8 */;
             
            RQG grammar:
            thread1:
                UPDATE t1 SET `col_varchar_255_ucs2_key` = CONVERT( 'degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui' USING ASCII ) WHERE col_int IS NULL ;
             
            thread2:
                UPDATE t1 SET `col_varchar_255_ucs2_key` = CONVERT( _varchar(255) USING ASCII ) |
                CHECK TABLE t1 EXTENDED ;
             
            The RQG run needs only "--threads=2".
            

            mleich Matthias Leich added a comment - sdp:/data/Results/1619015791/TBR-36-MDEV-25459/dev/shm/vardir/1619015791/4/1/rr   CREATE DATABASE /*!32312 IF NOT EXISTS*/ `test` /*!40100 DEFAULT CHARACTER SET latin1 */;   USE `test`; /*!40101 SET @saved_cs_client = @@character_set_client */; /*!40101 SET character_set_client = utf8 */; CREATE TABLE `t1` ( `col_varchar_255_ucs2_key` varchar(255) CHARACTER SET ucs2 DEFAULT NULL, `col_int` INT, `col_char_255_ucs2` char(255) CHARACTER SET ucs2 DEFAULT NULL, `col_varchar_255_ucs2` varchar(255) CHARACTER SET ucs2 DEFAULT NULL, `col_char_255_utf8_key` char(255) CHARACTER SET utf8 DEFAULT NULL, `col_char_255_utf8` char(255) CHARACTER SET utf8 DEFAULT NULL, `pk` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`pk`), KEY `col_varchar_255_ucs2_key` (`col_varchar_255_ucs2_key`) ) ENGINE=InnoDB AUTO_INCREMENT=95 DEFAULT CHARSET=latin1; /*!40101 SET character_set_client = @saved_cs_client */;   INSERT INTO `t1` VALUES ('ship',NULL,'unbrfeqlhdxamoxyjrdpanbbojmgfvaikcdulupfslgcghacscrvuvbrwnoqifmmmcbcqzzvplrdhaetgjpebshvxjtyjeytzfsvthwhvh','nunbrfeqlhdxamoxyjrdpanbbojmgfvaikcdulupfslgcghacscrvuvbrwnoqifmmmcbcqzzvplrdhaetgjpebshvxjtyjeytzfsvthwhvhfgzlecuozuucsaybsdbrakgyprqnygxwagryhzmfirpduujnymlfijzbdosrmgienkjovqktvkjxwxdvgcldiilzcaurcpeuturnvpnirubyhzpxa','judgment','active',63);   /*!40101 SET @saved_cs_client = @@character_set_client */; /*!40101 SET character_set_client = utf8 */;   RQG grammar: thread1: UPDATE t1 SET `col_varchar_255_ucs2_key` = CONVERT( 'degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui' USING ASCII ) WHERE col_int IS NULL ;   thread2: UPDATE t1 SET `col_varchar_255_ucs2_key` = CONVERT( _varchar(255) USING ASCII ) | CHECK TABLE t1 EXTENDED ;   The RQG run needs only "--threads=2".

            Thank you, the new trace is perfect. Just one page in the clustered index (with 1 record) and one in the secondary index:

            Clustered index:
            (pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)=
            (63,0x3e,(update),'degsotrsfann'…,…)
            Secondary index (PAGE_MAX_TRX_ID=0x3e):
            (col_varchar_255_ucs2_key,pk)=
            ('degsotrsfann'…,63),
            ('qemleuzymt'…,63)(delete-marked)

            The full length of the secondary index column is 340 bytes (2 bytes per ASCII character, UTF-16BE format). Those byte strings are identical between the secondary and clustered index. Here is the string in ASCII (omitting every second byte):

            perl -pe 's/.(.)/$1/ge' < /dev/shm/340.bin

            degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui
            

            On a quick look, the collation function seems to work correctly. I wonder if this could be another problem with this optimization that was introduced in MDEV-20301:

            10.6 af418bb9ef7e422282dc976640409a6af8fcd58c

            #0  row_sel_sec_rec_is_for_clust_rec (sec_rec=0x7f24c5fb9093 "", sec_index=0x6160008abe08, clust_rec=0x61b000064b16 "\200", clust_index=0x6160008a4308, thr=0x621000261580)
                at /data/Server/10.6A/storage/innobase/row/row0sel.cc:183
            #1  0x000055a4ef3c0a8d in Row_sel_get_clust_rec_for_mysql::operator() (this=0x6e0225b6b1f0, prebuilt=0x621000260d88, sec_index=0x6160008abe08, rec=0x7f24c5fb9093 "", thr=0x621000261580, out_rec=0x6e0225b6b0d0, 
                offsets=0x6e0225b6b150, offset_heap=0x6e0225b6b130, vrow=0x0, mtr=0x6e0225b6bc10) at /data/Server/10.6A/storage/innobase/row/row0sel.cc:3496
            #2  0x000055a4ef3ccb93 in row_search_mvcc (buf=0x6210002bc100 '\276' <repeats 200 times>..., mode=PAGE_CUR_G, prebuilt=0x621000260d88, match_mode=0, direction=0)
                at /data/Server/10.6A/storage/innobase/row/row0sel.cc:5383
            #3  0x000055a4ef336896 in row_search_for_mysql (buf=0x6210002bc100 '\276' <repeats 200 times>..., mode=PAGE_CUR_G, prebuilt=0x621000260d88, match_mode=0, direction=0)
                at /data/Server/10.6A/storage/innobase/include/row0sel.ic:137
            #4  0x000055a4ef351d0a in row_scan_index_for_mysql (prebuilt=0x621000260d88, index=0x6160008abe08, n_rows=0x6e0225b6ca30) at /data/Server/10.6A/storage/innobase/row/row0mysql.cc:4631
            #5  0x000055a4eeff45b9 in ha_innobase::check (this=0x61d0008246b8, thd=0x62b00011f218, check_opt=0x62b0001244a8) at /data/Server/10.6A/storage/innobase/handler/ha_innodb.cc:14665
            

            marko Marko Mäkelä added a comment - Thank you, the new trace is perfect. Just one page in the clustered index (with 1 record) and one in the secondary index: Clustered index: (pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)= (63,0x3e,(update),'degsotrsfann'…,…) Secondary index (PAGE_MAX_TRX_ID=0x3e): (col_varchar_255_ucs2_key,pk)= ('degsotrsfann'…,63), ('qemleuzymt'…,63)(delete-marked) The full length of the secondary index column is 340 bytes (2 bytes per ASCII character, UTF-16BE format). Those byte strings are identical between the secondary and clustered index. Here is the string in ASCII (omitting every second byte): perl -pe 's/.(.)/$1/ge' < /dev/shm/340.bin degsotrsfannidwyvkuvlkeslrryhpkeevqmbksdrzadzpyisznignsytihyjixyalxfxpnafjwzgnkbbayklurufrsajtzohanbuvcfyykvtmesobixwipkoihhqykvoejckythjnjshxgohmecmklxryubdexjgxehdiqqui On a quick look, the collation function seems to work correctly. I wonder if this could be another problem with this optimization that was introduced in MDEV-20301 : 10.6 af418bb9ef7e422282dc976640409a6af8fcd58c #0 row_sel_sec_rec_is_for_clust_rec (sec_rec=0x7f24c5fb9093 "", sec_index=0x6160008abe08, clust_rec=0x61b000064b16 "\200", clust_index=0x6160008a4308, thr=0x621000261580) at /data/Server/10.6A/storage/innobase/row/row0sel.cc:183 #1 0x000055a4ef3c0a8d in Row_sel_get_clust_rec_for_mysql::operator() (this=0x6e0225b6b1f0, prebuilt=0x621000260d88, sec_index=0x6160008abe08, rec=0x7f24c5fb9093 "", thr=0x621000261580, out_rec=0x6e0225b6b0d0, offsets=0x6e0225b6b150, offset_heap=0x6e0225b6b130, vrow=0x0, mtr=0x6e0225b6bc10) at /data/Server/10.6A/storage/innobase/row/row0sel.cc:3496 #2 0x000055a4ef3ccb93 in row_search_mvcc (buf=0x6210002bc100 '\276' <repeats 200 times>..., mode=PAGE_CUR_G, prebuilt=0x621000260d88, match_mode=0, direction=0) at /data/Server/10.6A/storage/innobase/row/row0sel.cc:5383 #3 0x000055a4ef336896 in row_search_for_mysql (buf=0x6210002bc100 '\276' <repeats 200 times>..., mode=PAGE_CUR_G, prebuilt=0x621000260d88, match_mode=0, direction=0) at /data/Server/10.6A/storage/innobase/include/row0sel.ic:137 #4 0x000055a4ef351d0a in row_scan_index_for_mysql (prebuilt=0x621000260d88, index=0x6160008abe08, n_rows=0x6e0225b6ca30) at /data/Server/10.6A/storage/innobase/row/row0mysql.cc:4631 #5 0x000055a4eeff45b9 in ha_innobase::check (this=0x61d0008246b8, thd=0x62b00011f218, check_opt=0x62b0001244a8) at /data/Server/10.6A/storage/innobase/handler/ha_innodb.cc:14665

            I will have to debug the trace further (around when 61095) to check why exactly we are assigning clust_rec=NULL here:

            		if (clust_rec
            		    && (old_vers
            			|| trx->isolation_level <= TRX_ISO_READ_UNCOMMITTED
            			|| dict_index_is_spatial(sec_index)
            			|| rec_get_deleted_flag(rec, dict_table_is_comp(
            							sec_index->table)))) {
            			err = row_sel_sec_rec_is_for_clust_rec(rec, sec_index,
            						clust_rec, clust_index, thr);
            			switch (err) {
            			case DB_SUCCESS:
            				clust_rec = NULL;
            				break;
            

            Currently it looks like we are constructing wrong old_vers with (pk=63,DB_TRX_ID=0x3c,(update)) and apparently empty string. It is as if CHECK TABLE is using a newer read view for accessing the clustered index.

            marko Marko Mäkelä added a comment - I will have to debug the trace further (around when 61095) to check why exactly we are assigning clust_rec=NULL here: if (clust_rec && (old_vers || trx->isolation_level <= TRX_ISO_READ_UNCOMMITTED || dict_index_is_spatial(sec_index) || rec_get_deleted_flag(rec, dict_table_is_comp( sec_index->table)))) { err = row_sel_sec_rec_is_for_clust_rec(rec, sec_index, clust_rec, clust_index, thr); switch (err) { case DB_SUCCESS: clust_rec = NULL; break ; Currently it looks like we are constructing wrong old_vers with (pk=63,DB_TRX_ID=0x3c,(update)) and apparently empty string. It is as if CHECK TABLE is using a newer read view for accessing the clustered index.

            For the clustered index scan in the failing CHECK TABLE, row_sel_clust_sees() will not hold for the single record
            (pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)=
            (63,0x3e,(update),'degsotrsfann'…,…)
            so we constructed old_vers that contains a BLOB reference for that column:
            (pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)=
            (63,0x3c,(update),(space=8, page=5, offset=0x26, length=0x1f0),…)

            The payload of the BLOB page is the following UTF-16BE encoded 248-char ASCII string:

            qemleuzymtgopfikwzloyibtgehtjhlablwaewqzuglpeoqnnoxsqeyzbocilomtnpaxihztdrtqahffxekdhrwdvtdvmlshvdhrgfkmuwonkyqzwelujpeyggixymdugzhiqlmnrlpxzrytoatxolflxdzknhkgyttnjqwqcutogtwviuoqkbzxlbzkzdupmxeeifroaweulyyetwdnnrvxrtoksrgcplpyubzpyntedwlgnvzehakg
            

            The prefix of this seems to match the delete-marked record in the secondary index page that I posted earlier.

            So, nothing seems to be really corrupted, but the secondary index MVCC code probably fails to read the BLOB, and instead treats the 20-byte BLOB pointer as the actual string. I will post more, once I have debugged that part of the code.

            marko Marko Mäkelä added a comment - For the clustered index scan in the failing CHECK TABLE , row_sel_clust_sees() will not hold for the single record (pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)= (63,0x3e,(update),'degsotrsfann'…,…) so we constructed old_vers that contains a BLOB reference for that column: (pk,DB_TRX_ID,DB_ROLL_PTR,col_varchar_255_ucs2_key,…)= (63,0x3c,(update),(space=8, page=5, offset=0x26, length=0x1f0),…) The payload of the BLOB page is the following UTF-16BE encoded 248-char ASCII string: qemleuzymtgopfikwzloyibtgehtjhlablwaewqzuglpeoqnnoxsqeyzbocilomtnpaxihztdrtqahffxekdhrwdvtdvmlshvdhrgfkmuwonkyqzwelujpeyggixymdugzhiqlmnrlpxzrytoatxolflxdzknhkgyttnjqwqcutogtwviuoqkbzxlbzkzdupmxeeifroaweulyyetwdnnrvxrtoksrgcplpyubzpyntedwlgnvzehakg The prefix of this seems to match the delete-marked record in the secondary index page that I posted earlier. So, nothing seems to be really corrupted, but the secondary index MVCC code probably fails to read the BLOB, and instead treats the 20-byte BLOB pointer as the actual string. I will post more, once I have debugged that part of the code.

            We have this code in row_sel_sec_rec_is_for_clust_rec():

            		if (ifield->prefix_len > 0 && len != UNIV_SQL_NULL
            		    && sec_len != UNIV_SQL_NULL && !is_virtual) {
             
            			if (rec_offs_nth_extern(clust_offs, clust_pos)) {
            				len -= BTR_EXTERN_FIELD_REF_SIZE;
            			}
             
            			len = dtype_get_at_most_n_mbchars(
            				col->prtype, col->mbminlen, col->mbmaxlen,
            				ifield->prefix_len, len, (char*) clust_field);
             
            			if (rec_offs_nth_extern(clust_offs, clust_pos)
            			    && len < sec_len) {
            				if (!row_sel_sec_rec_is_for_blob(
            

            Because the secondary index does not comprise a column prefix, but in fact the full column col_varchar_255_ucs2_key, we fail to invoke the check for BLOB (really, off-page column, in this case VARCHAR).

            This should mean that MVCC is broken for full-column secondary indexes in ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED InnoDB tables if the column happened to become chosen for external storage in the clustered index. In ROW_FORMAT=REDUNDANT (the default and only format until MySQL 5.0.3) and ROW_FORMAT=COMPACT (the default between MySQL 5.0.3 and 5.7, or MariaDB Server before 10.2.2), we would always store the entire column in the clustered index.

            I will have to check if something similar is broken in the implicit locking of secondary indexes, in row_vers_impl_x_locked_low(). If the call to row_build() is not fetching the externally stored VARCHAR column, locking for secondary index records could be broken. I will try to verify that on a copy of the data directory and possibly a patched server.

            marko Marko Mäkelä added a comment - We have this code in row_sel_sec_rec_is_for_clust_rec() : if (ifield->prefix_len > 0 && len != UNIV_SQL_NULL && sec_len != UNIV_SQL_NULL && !is_virtual) {   if (rec_offs_nth_extern(clust_offs, clust_pos)) { len -= BTR_EXTERN_FIELD_REF_SIZE; }   len = dtype_get_at_most_n_mbchars( col->prtype, col->mbminlen, col->mbmaxlen, ifield->prefix_len, len, ( char *) clust_field);   if (rec_offs_nth_extern(clust_offs, clust_pos) && len < sec_len) { if (!row_sel_sec_rec_is_for_blob( Because the secondary index does not comprise a column prefix, but in fact the full column col_varchar_255_ucs2_key , we fail to invoke the check for BLOB (really, off-page column, in this case VARCHAR ). This should mean that MVCC is broken for full-column secondary indexes in ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED InnoDB tables if the column happened to become chosen for external storage in the clustered index. In ROW_FORMAT=REDUNDANT (the default and only format until MySQL 5.0.3) and ROW_FORMAT=COMPACT (the default between MySQL 5.0.3 and 5.7, or MariaDB Server before 10.2.2), we would always store the entire column in the clustered index. I will have to check if something similar is broken in the implicit locking of secondary indexes, in row_vers_impl_x_locked_low() . If the call to row_build() is not fetching the externally stored VARCHAR column, locking for secondary index records could be broken. I will try to verify that on a copy of the data directory and possibly a patched server.

            I can repeat the problem with the following:

            --source include/innodb_page_size_small.inc
             
            CREATE TABLE t1 (
              pk int PRIMARY KEY, c varchar(255) UNIQUE,
              d char(255), e varchar(255), f char(255), g char(255)
            ) ENGINE=InnoDB ROW_FORMAT=DYNAMIC DEFAULT CHARACTER SET ucs2;
             
            INSERT INTO t1 VALUES
            (1,REPEAT('c',248),REPEAT('a',106),REPEAT('b',220),REPEAT('x',14),'');
             
            BEGIN;
            UPDATE t1 SET c=REPEAT('d',170);
             
            connect (con1,localhost,root,,);
            SELECT pk FROM t1 FORCE INDEX (c);
            connection default;
            COMMIT;
            connection con1;
            SELECT pk FROM t1 FORCE INDEX (c);
            disconnect con1;
            connection default;
            DROP TABLE t1;
            

            10.2 922e676b43c7b5cb0f20ca67c6d2222e2fc5ec03

            innodb.mvcc_secondary '16k,innodb'       w1 [ pass ]      6
            innodb.mvcc_secondary '8k,innodb'        w3 [ pass ]      5
            innodb.mvcc_secondary '4k,innodb'        w2 [ fail ]
                    Test ended at 2021-04-23 14:06:40
             
            CURRENT_TEST: innodb.mvcc_secondary
            --- /mariadb/10.2o/mysql-test/suite/innodb/r/mvcc_secondary.result	2021-04-23 14:06:25.841488727 +0300
            +++ /mariadb/10.2o/mysql-test/suite/innodb/r/mvcc_secondary.reject	2021-04-23 14:06:40.565659501 +0300
            @@ -9,7 +9,6 @@
             connect  con1,localhost,root,,;
             SELECT pk FROM t1 FORCE INDEX (c);
             pk
            -1
             connection default;
             COMMIT;
             connection con1;
             
            mysqltest: Result length mismatch
            

            I believe that the bug affects also earlier versions, but I did not test it, because MariaDB 5.5, 10.0, 10.1 have already reached their end of life.

            Also, this test case only repeats the problem for innodb_page_size=4k, it should be repeatable with any page size when using appropriately sized records.

            marko Marko Mäkelä added a comment - I can repeat the problem with the following: --source include/innodb_page_size_small.inc   CREATE TABLE t1 ( pk int PRIMARY KEY , c varchar (255) UNIQUE , d char (255), e varchar (255), f char (255), g char (255) ) ENGINE=InnoDB ROW_FORMAT= DYNAMIC DEFAULT CHARACTER SET ucs2;   INSERT INTO t1 VALUES (1,REPEAT( 'c' ,248),REPEAT( 'a' ,106),REPEAT( 'b' ,220),REPEAT( 'x' ,14), '' );   BEGIN ; UPDATE t1 SET c=REPEAT( 'd' ,170);   connect (con1,localhost,root,,); SELECT pk FROM t1 FORCE INDEX (c); connection default ; COMMIT ; connection con1; SELECT pk FROM t1 FORCE INDEX (c); disconnect con1; connection default ; DROP TABLE t1; 10.2 922e676b43c7b5cb0f20ca67c6d2222e2fc5ec03 innodb.mvcc_secondary '16k,innodb' w1 [ pass ] 6 innodb.mvcc_secondary '8k,innodb' w3 [ pass ] 5 innodb.mvcc_secondary '4k,innodb' w2 [ fail ] Test ended at 2021-04-23 14:06:40   CURRENT_TEST: innodb.mvcc_secondary --- /mariadb/10.2o/mysql-test/suite/innodb/r/mvcc_secondary.result 2021-04-23 14:06:25.841488727 +0300 +++ /mariadb/10.2o/mysql-test/suite/innodb/r/mvcc_secondary.reject 2021-04-23 14:06:40.565659501 +0300 @@ -9,7 +9,6 @@ connect con1,localhost,root,,; SELECT pk FROM t1 FORCE INDEX (c); pk -1 connection default; COMMIT; connection con1;   mysqltest: Result length mismatch I believe that the bug affects also earlier versions, but I did not test it, because MariaDB 5.5, 10.0, 10.1 have already reached their end of life. Also, this test case only repeats the problem for innodb_page_size=4k , it should be repeatable with any page size when using appropriately sized records.

            If I append FOR UPDATE or LOCK IN SHARE MODE to the first SELECT, a locking conflict will occur:

            10.2 922e676b43c7b5cb0f20ca67c6d2222e2fc5ec03

            mysqltest: At line 15: query 'SELECT pk FROM t1 FORCE INDEX (c) LOCK IN SHARE MODE' failed: 1205: Lock wait timeout exceeded; try restarting transaction
            

            This suggests that no change for row_vers_impl_x_locked_low() is needed, and only row_sel_sec_rec_is_for_clust_rec() is wrong.

            marko Marko Mäkelä added a comment - If I append FOR UPDATE or LOCK IN SHARE MODE to the first SELECT , a locking conflict will occur: 10.2 922e676b43c7b5cb0f20ca67c6d2222e2fc5ec03 mysqltest: At line 15: query 'SELECT pk FROM t1 FORCE INDEX (c) LOCK IN SHARE MODE' failed: 1205: Lock wait timeout exceeded; try restarting transaction This suggests that no change for row_vers_impl_x_locked_low() is needed, and only row_sel_sec_rec_is_for_clust_rec() is wrong.

            People

              marko Marko Mäkelä
              mleich Matthias Leich
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.