Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18910

Hash value unique long column is miscalculated from versioning timestamp

Details

    Description

      The test case for MDEV-18901 is revealing bugs in two components. While it is clear that MDEV-371 is the main cause of the regression, that task did not modify any InnoDB code.

      InnoDB code was modified for implementing system-versioned tables. Hence, the assertion failure in InnoDB (reporting that indexes inside InnoDB are inconsistent with each other) could occur because of the changes for system-versioning, not directly because of MDEV-371.

      Please try to find out if the system versioning code inside InnoDB is always handling errors correctly. As noted in MDEV-18272, on any error inside InnoDB, the minimum error handling should be to roll back the current row operation. Then it will be up to the SQL layer to roll back to the start of the statement or roll back the entire transaction.

      For the test case, please refer to MDEV-18901 and use an affected revision of 10.4. I tested with commit 3568427d11f7afcd111b4c28c14cc8aba2b10807.

      Attachments

        Issue Links

          Activity

            This is still repeatable after merging into 10.4 the fixes of MDEV-18879/MDEV-18972 and MDEV-18981.

            Even though the current test case for this bug depends on a different bug, I believe that any InnoDB corruption bug should have a major if not critical priority.

            marko Marko Mäkelä added a comment - This is still repeatable after merging into 10.4 the fixes of MDEV-18879 / MDEV-18972 and MDEV-18981 . Even though the current test case for this bug depends on a different bug, I believe that any InnoDB corruption bug should have a major if not critical priority.

            I analyzed this a little further. Here is the test case:

            --source include/have_innodb.inc
            CREATE TABLE t1 (a VARCHAR(7961)) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
            INSERT INTO t1 VALUES ('f'), ('o'), ('o');
            SELECT * FROM t1 INTO OUTFILE 'load.data';
            ALTER IGNORE TABLE t1 ADD UNIQUE INDEX (a);
            set timestamp=1;
            ALTER TABLE t1 ADD SYSTEM VERSIONING ;
            set timestamp=2;
            REPLACE INTO t1 VALUES ('f'), ('o'), ('o');
            set timestamp=3;
            LOAD DATA INFILE 'load.data' REPLACE INTO TABLE t1;
            DROP TABLE t1;
            --let $datadir= `select @@datadir`
            --remove_file $datadir/test/load.data
            

            Internally, the table consists of two indexes:

            INDEX a(DB_ROW_HASH_1,DB_ROW_ID)

            0c60c4f8bdf6af36,000000000209
            6bf6e28eb0ae21ee,000000000205
            8987aff239e970b6,00000000020a
            d6605c1b0dff45ee,000000000206
            d6605c1b0dff45ee,000000000208
            

            GEN_CLUST_INDEX(DB_ROW_ID,DB_TRX_ID,DB_ROLL_PTR,a,row_start,row_end)

            0x205,0,1<<55,'f',0x2000000,0x7fffffff0f423f
            0x206,0,1<<55,'o',0x2000000,0x7fffffff0f423f
            0x207,0,1<<55,'f',0x1000000,0x00000002000000
            0x208,0,1<<55,'o',0x1000000,0x00000002000000
            0x209,71,(insert),'f',0x3000000,0x7fffffff0f423f
            0x20a,71,(insert),'o',0x3000000,0x7fffffff0f423f
            

            The current transaction (id=71) has appended the 2 records into the clustered index. The history for all previous records has been removed.

            What catches my attention is that for the 2 distinct values 'f' and 'o', we see 3 distinct values in the secondary index. Is the hash being calculated on uninitialized data?

            Also, the secondary index contains only 5 records, while the insert into the clustered index is supposed to have been completed. The transaction is now in an ‘update’ phase.

            marko Marko Mäkelä added a comment - I analyzed this a little further. Here is the test case: --source include/have_innodb.inc CREATE TABLE t1 (a VARCHAR (7961)) ENGINE=InnoDB ROW_FORMAT=REDUNDANT; INSERT INTO t1 VALUES ( 'f' ), ( 'o' ), ( 'o' ); SELECT * FROM t1 INTO OUTFILE 'load.data' ; ALTER IGNORE TABLE t1 ADD UNIQUE INDEX (a); set timestamp =1; ALTER TABLE t1 ADD SYSTEM VERSIONING ; set timestamp =2; REPLACE INTO t1 VALUES ( 'f' ), ( 'o' ), ( 'o' ); set timestamp =3; LOAD DATA INFILE 'load.data' REPLACE INTO TABLE t1; DROP TABLE t1; --let $datadir= `select @@datadir` --remove_file $datadir/test/load.data Internally, the table consists of two indexes: INDEX a(DB_ROW_HASH_1,DB_ROW_ID) 0c60c4f8bdf6af36,000000000209 6bf6e28eb0ae21ee,000000000205 8987aff239e970b6,00000000020a d6605c1b0dff45ee,000000000206 d6605c1b0dff45ee,000000000208 GEN_CLUST_INDEX(DB_ROW_ID,DB_TRX_ID,DB_ROLL_PTR,a,row_start,row_end) 0x205,0,1<<55,'f',0x2000000,0x7fffffff0f423f 0x206,0,1<<55,'o',0x2000000,0x7fffffff0f423f 0x207,0,1<<55,'f',0x1000000,0x00000002000000 0x208,0,1<<55,'o',0x1000000,0x00000002000000 0x209,71,(insert),'f',0x3000000,0x7fffffff0f423f 0x20a,71,(insert),'o',0x3000000,0x7fffffff0f423f The current transaction (id=71) has appended the 2 records into the clustered index. The history for all previous records has been removed. What catches my attention is that for the 2 distinct values 'f' and 'o', we see 3 distinct values in the secondary index. Is the hash being calculated on uninitialized data? Also, the secondary index contains only 5 records, while the insert into the clustered index is supposed to have been completed. The transaction is now in an ‘update’ phase.

            I believe that the problem is that Item_func_hash::val_int() for computing the virtual column value of DB_ROW_HASH_1 is sometimes invoked on the correct column a, but sometimes on the incorrect column row_end.

            So, from the InnoDB point of view it is a garbage-in/garbage-out situation caused by the virtual column expression seemingly being nondeterministic.

            For the correct values 'f' and 'o', we are returning the expected hash values 0x10400000c7c and 0x10400000f28. These values never occur in the secondary index! Instead, something causes hash values to be computed on the system-versioning timestamps:

            10.4 1dffa9d9c18b3d9b3cf16f3aa63c4ec3d157326b

            Thread 28 "mysqld" hit Breakpoint 2, Item_func_hash::val_int (
                this=0x7fff8c08e5d0) at /mariadb/10.4/sql/item_func.cc:1737
            1: arg_count = 2
            2: *str = {<Charset> = {
                m_charset = 0x55555735a0e0 <my_charset_latin1>}, <Binary_string> = {<Static_binary_string> = {<Sql_alloc> = {<No data fields>}, 
                  Ptr = 0x7fff8c0239b8 "2038-01-19 05:14:07.999999", str_length = 26}, 
                Alloced_length = 32, extra_alloc = 0, alloced = true, 
                thread_specific = false}, <No data fields>}
            0x00005555560b0bd3 in Item::save_int_in_field (this=0x7fff8c08e5d0, 
                field=0x7fff8c085110, no_conversions=false)
                at /mariadb/10.4/sql/item.cc:6493
            Value returned is $8 = -2999296080560044562
            (gdb) bt
            #0  0x00005555560b0bd3 in Item::save_int_in_field (this=0x7fff8c08e5d0, 
                field=0x7fff8c085110, no_conversions=false)
                at /mariadb/10.4/sql/item.cc:6493
            #1  0x0000555555f45880 in Type_handler_int_result::Item_save_in_field (
                this=0x5555573175c8 <type_handler_long>, item=0x7fff8c08e5d0, 
                field=0x7fff8c085110, no_conversions=false)
                at /mariadb/10.4/sql/sql_type.cc:3587
            #2  0x00005555560b0c9e in Item::save_in_field (this=0x7fff8c08e5d0, 
                field=0x7fff8c085110, no_conversions=false)
                at /mariadb/10.4/sql/item.cc:6503
            #3  0x0000555555e568dc in TABLE::update_virtual_fields (this=0x7fff8c038dd8, 
                h=0x7fff8c08de80, update_mode=VCOL_UPDATE_FOR_WRITE)
                at /mariadb/10.4/sql/table.cc:8199
            #4  0x0000555555e05725 in copy_data_between_tables (thd=0x7fff8c000cf8, 
                from=0x7fff8c07fd08, to=0x7fff8c038dd8, create=..., ignore=false, 
                order_num=0, order=0x0, copied=0x7ffff007ffd8, deleted=0x7ffff007ffd0, 
                keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x7ffff0080da0)
                at /mariadb/10.4/sql/sql_table.cc:10639
            #5  0x0000555555dfc6c3 in mysql_alter_table (thd=0x7fff8c000cf8, 
                new_db=0x7fff8c005480, new_name=0x7fff8c005880, 
                create_info=0x7ffff00818e8, table_list=0x7fff8c016388, 
                alter_info=0x7ffff00821b0, order_num=0, order=0x0, ignore=false)
                at /mariadb/10.4/sql/sql_table.cc:10081
            #6  0x0000555555eb04e6 in Sql_cmd_alter_table::execute (this=0x7fff8c016a38, 
                thd=0x7fff8c000cf8) at /mariadb/10.4/sql/sql_alter.cc:490
            #7  0x0000555555cf6528 in mysql_execute_command (thd=0x7fff8c000cf8)
                at /mariadb/10.4/sql/sql_parse.cc:6343
            #8  0x0000555555ce5a80 in mysql_parse (thd=0x7fff8c000cf8, 
                rawbuf=0x7fff8c016290 "ALTER TABLE t1 ADD SYSTEM VERSIONING", length=36, 
                parser_state=0x7ffff0084da0, is_com_multi=false, is_next_command=false)
                at /mariadb/10.4/sql/sql_parse.cc:8154
            #9  0x0000555555ce2501 in dispatch_command (command=COM_QUERY, 
                thd=0x7fff8c000cf8, 
            (gdb) p *field
            $9 = {<Value_source> = {<No data fields>}, 
              _vptr$Field = 0x555557216b98 <vtable for Field_longlong+16>, 
              ptr = 0x7fff8c09462a "\356!\256\260\216\342\366k", '\245' <repeats 192 times>..., invisible = INVISIBLE_FULL, null_ptr = 0x7fff8c092700 "\374\001", 
              table = 0x7fff8c038dd8, orig_table = 0x7fff8c038dd8, 
              table_name = 0x7fff8c038f00, field_name = {
                str = 0x7fff8c0817ad "DB_ROW_HASH_1", length = 13}, comment = {
                str = 0x555556bbcb81 "", length = 0}, option_list = 0x0, 
              option_struct = 0x0, key_start = {map = 0}, part_of_key = {map = 0}, 
              part_of_key_not_clustered = {map = 0}, part_of_sortkey = {map = 0}, 
              unireg_check = Field::NONE, field_length = 8, flags = 1073758240, 
              field_index = 3, null_bit = 2 '\002', is_created_from_null_item = false, 
              is_stat_field = false, cond_selectivity = 1, next_equal_field = 0x0, 
              read_stats = 0x0, collected_stats = 0x0, vcol_info = 0x7fff8c08e690, 
              check_constraint = 0x0, default_value = 0x0}
            

            The value returned is 0xd6605c1b0dff45ee. It occurs twice in the secondary index.

            Observation: The "infinite future" value for row_end is not at the very end of the domain; it is 0x7fffffff0f423f.

            Another observation: For debugging purposes, it would be very beneficial to be able to add an invisible generated stored column for DB_ROW_HASH_1. Unfortunately, the following patch had no effect on that, possibly because cf->vcol_info->expr is not being initialized.

            diff --git a/sql/sql_table.cc b/sql/sql_table.cc
            index ad62ecc1103..c88e43fa428 100644
            --- a/sql/sql_table.cc
            +++ b/sql/sql_table.cc
            @@ -3322,7 +3322,7 @@ static Create_field * add_hash_field(THD * thd, List<Create_field> *create_list,
               cf->invisible= INVISIBLE_FULL;
               cf->pack_flag|= FIELDFLAG_MAYBE_NULL;
               cf->vcol_info= new (thd->mem_root) Virtual_column_info();
            -  cf->vcol_info->stored_in_db= false;
            +  cf->vcol_info->stored_in_db= true;
               uint num= 1;
               LEX_CSTRING field_name;
               field_name.str= (char *)thd->alloc(LONG_HASH_FIELD_NAME_LENGTH);
            

            marko Marko Mäkelä added a comment - I believe that the problem is that Item_func_hash::val_int() for computing the virtual column value of DB_ROW_HASH_1 is sometimes invoked on the correct column a , but sometimes on the incorrect column row_end . So, from the InnoDB point of view it is a garbage-in/garbage-out situation caused by the virtual column expression seemingly being nondeterministic. For the correct values 'f' and 'o', we are returning the expected hash values 0x10400000c7c and 0x10400000f28. These values never occur in the secondary index! Instead, something causes hash values to be computed on the system-versioning timestamps: 10.4 1dffa9d9c18b3d9b3cf16f3aa63c4ec3d157326b Thread 28 "mysqld" hit Breakpoint 2, Item_func_hash::val_int ( this=0x7fff8c08e5d0) at /mariadb/10.4/sql/item_func.cc:1737 1: arg_count = 2 2: *str = {<Charset> = { m_charset = 0x55555735a0e0 <my_charset_latin1>}, <Binary_string> = {<Static_binary_string> = {<Sql_alloc> = {<No data fields>}, Ptr = 0x7fff8c0239b8 "2038-01-19 05:14:07.999999", str_length = 26}, Alloced_length = 32, extra_alloc = 0, alloced = true, thread_specific = false}, <No data fields>} 0x00005555560b0bd3 in Item::save_int_in_field (this=0x7fff8c08e5d0, field=0x7fff8c085110, no_conversions=false) at /mariadb/10.4/sql/item.cc:6493 Value returned is $8 = -2999296080560044562 (gdb) bt #0 0x00005555560b0bd3 in Item::save_int_in_field (this=0x7fff8c08e5d0, field=0x7fff8c085110, no_conversions=false) at /mariadb/10.4/sql/item.cc:6493 #1 0x0000555555f45880 in Type_handler_int_result::Item_save_in_field ( this=0x5555573175c8 <type_handler_long>, item=0x7fff8c08e5d0, field=0x7fff8c085110, no_conversions=false) at /mariadb/10.4/sql/sql_type.cc:3587 #2 0x00005555560b0c9e in Item::save_in_field (this=0x7fff8c08e5d0, field=0x7fff8c085110, no_conversions=false) at /mariadb/10.4/sql/item.cc:6503 #3 0x0000555555e568dc in TABLE::update_virtual_fields (this=0x7fff8c038dd8, h=0x7fff8c08de80, update_mode=VCOL_UPDATE_FOR_WRITE) at /mariadb/10.4/sql/table.cc:8199 #4 0x0000555555e05725 in copy_data_between_tables (thd=0x7fff8c000cf8, from=0x7fff8c07fd08, to=0x7fff8c038dd8, create=..., ignore=false, order_num=0, order=0x0, copied=0x7ffff007ffd8, deleted=0x7ffff007ffd0, keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x7ffff0080da0) at /mariadb/10.4/sql/sql_table.cc:10639 #5 0x0000555555dfc6c3 in mysql_alter_table (thd=0x7fff8c000cf8, new_db=0x7fff8c005480, new_name=0x7fff8c005880, create_info=0x7ffff00818e8, table_list=0x7fff8c016388, alter_info=0x7ffff00821b0, order_num=0, order=0x0, ignore=false) at /mariadb/10.4/sql/sql_table.cc:10081 #6 0x0000555555eb04e6 in Sql_cmd_alter_table::execute (this=0x7fff8c016a38, thd=0x7fff8c000cf8) at /mariadb/10.4/sql/sql_alter.cc:490 #7 0x0000555555cf6528 in mysql_execute_command (thd=0x7fff8c000cf8) at /mariadb/10.4/sql/sql_parse.cc:6343 #8 0x0000555555ce5a80 in mysql_parse (thd=0x7fff8c000cf8, rawbuf=0x7fff8c016290 "ALTER TABLE t1 ADD SYSTEM VERSIONING", length=36, parser_state=0x7ffff0084da0, is_com_multi=false, is_next_command=false) at /mariadb/10.4/sql/sql_parse.cc:8154 #9 0x0000555555ce2501 in dispatch_command (command=COM_QUERY, thd=0x7fff8c000cf8, (gdb) p *field $9 = {<Value_source> = {<No data fields>}, _vptr$Field = 0x555557216b98 <vtable for Field_longlong+16>, ptr = 0x7fff8c09462a "\356!\256\260\216\342\366k", '\245' <repeats 192 times>..., invisible = INVISIBLE_FULL, null_ptr = 0x7fff8c092700 "\374\001", table = 0x7fff8c038dd8, orig_table = 0x7fff8c038dd8, table_name = 0x7fff8c038f00, field_name = { str = 0x7fff8c0817ad "DB_ROW_HASH_1", length = 13}, comment = { str = 0x555556bbcb81 "", length = 0}, option_list = 0x0, option_struct = 0x0, key_start = {map = 0}, part_of_key = {map = 0}, part_of_key_not_clustered = {map = 0}, part_of_sortkey = {map = 0}, unireg_check = Field::NONE, field_length = 8, flags = 1073758240, field_index = 3, null_bit = 2 '\002', is_created_from_null_item = false, is_stat_field = false, cond_selectivity = 1, next_equal_field = 0x0, read_stats = 0x0, collected_stats = 0x0, vcol_info = 0x7fff8c08e690, check_constraint = 0x0, default_value = 0x0} The value returned is 0xd6605c1b0dff45ee. It occurs twice in the secondary index. Observation: The "infinite future" value for row_end is not at the very end of the domain; it is 0x7fffffff0f423f . Another observation: For debugging purposes, it would be very beneficial to be able to add an invisible generated stored column for DB_ROW_HASH_1 . Unfortunately, the following patch had no effect on that, possibly because cf->vcol_info->expr is not being initialized. diff --git a/sql/sql_table.cc b/sql/sql_table.cc index ad62ecc1103..c88e43fa428 100644 --- a/sql/sql_table.cc +++ b/sql/sql_table.cc @@ -3322,7 +3322,7 @@ static Create_field * add_hash_field(THD * thd, List<Create_field> *create_list, cf->invisible= INVISIBLE_FULL; cf->pack_flag|= FIELDFLAG_MAYBE_NULL; cf->vcol_info= new (thd->mem_root) Virtual_column_info(); - cf->vcol_info->stored_in_db= false; + cf->vcol_info->stored_in_db= true; uint num= 1; LEX_CSTRING field_name; field_name.str= (char *)thd->alloc(LONG_HASH_FIELD_NAME_LENGTH);

            MDEV-18967 appears to fix this problem. The test no longer crashes after applying this fix.

            marko Marko Mäkelä added a comment - MDEV-18967 appears to fix this problem. The test no longer crashes after applying this fix .

            People

              sachin.setiya.007 Sachin Setiya (Inactive)
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.