Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27208

Implement 2-ary CRC32() and the CRC32C() function

Details

    Description

      The SQL parser defines the unary function crc32() that computes the CRC-32 of a string using the ISO 3309 polynomial that is being used by zlib and many others.

      Often, CRC is computed in pieces. To faciliate this, we introduce an optional second parameter: crc32('MariaDB') is equal to crc32(crc32('Maria','DB')).

      InnoDB files use a different polynomial, which is used by the special instructions that the Intel Nehalem microarchitecture introduced in SSE4.2. This is commonly called CRC-32C.

      It would be very convenient to introduce an SQL function crc32c() that would compute CRC-32C checksums. Then we could could define simple SQL function that would generate a logically empty InnoDB redo log corresponding to a particular checkpoint LSN. Starting with MDEV-14425 and MDEV-27199, InnoDB would refuse normal startup if the redo log file was deleted.

      Attachments

        Issue Links

          Activity

            Later, I will try to create an SQL version of the above Perl using appropriate constructs:

            SELECT …unhex(hex(crc32c(…)))… INTO DUMPFILE 'ib_logfile0';
            

            marko Marko Mäkelä added a comment - Later, I will try to create an SQL version of the above Perl using appropriate constructs: SELECT …unhex(hex(crc32c(…)))… INTO DUMPFILE 'ib_logfile0' ;

            Here is the SQL to create a logically empty log file in the MDEV-14425 format corresponding to the 64-bit log sequence number specified in the first line:

            set @lsn=x'000000000000c96b';
            set @header=concat('Phys',x'00000000',@lsn,repeat(x'00',492));
            set @header=concat(@header,unhex(hex(crc32c(@header))),repeat(x'00',3584));
             
            set @checkpoint=concat(@lsn,@lsn,repeat(x'00',44));
            set @checkpoint=concat(@checkpoint,unhex(hex(crc32c(@checkpoint))),
                                   repeat(x'00',8128));
            set @payload=concat(x'fa0000',@lsn);
            set @payload=concat(@payload,x'01',unhex(hex(crc32c(@payload))));
             
            select concat(@header,@checkpoint,@payload) into dumpfile 'ib_logfile0';
            

            marko Marko Mäkelä added a comment - Here is the SQL to create a logically empty log file in the MDEV-14425 format corresponding to the 64-bit log sequence number specified in the first line: set @lsn=x '000000000000c96b' ; set @header=concat( 'Phys' ,x '00000000' ,@lsn,repeat(x '00' ,492)); set @header=concat(@header,unhex(hex(crc32c(@header))),repeat(x '00' ,3584));   set @ checkpoint =concat(@lsn,@lsn,repeat(x '00' ,44)); set @ checkpoint =concat(@ checkpoint ,unhex(hex(crc32c(@ checkpoint ))), repeat(x '00' ,8128)); set @payload=concat(x 'fa0000' ,@lsn); set @payload=concat(@payload,x '01' ,unhex(hex(crc32c(@payload))));   select concat(@header,@ checkpoint ,@payload) into dumpfile 'ib_logfile0' ;

            serg, based on your review feedback I reversed the arguments of the 2-ary functions. When a previous checksum is specified, it must be the first and not the second argument:

            SELECT CRC32C('MariaDB'),CRC32C(CRC32C('Maria'),'DB'),CRC32C(0,'MariaDB');
            

            This would return the same value 809606978 three times.

            marko Marko Mäkelä added a comment - serg , based on your review feedback I reversed the arguments of the 2-ary functions. When a previous checksum is specified, it must be the first and not the second argument: SELECT CRC32C( 'MariaDB' ),CRC32C(CRC32C( 'Maria' ), 'DB' ),CRC32C(0, 'MariaDB' ); This would return the same value 809606978 three times.
            elenst Elena Stepanova added a comment - - edited

            According to GCOV, there are a few missing lines in the coverage:

            line numbers as of preview-10.8-MDEV-27265-misc 6e615c62b

            ===File sql/item_create.cc:
               3141 : +    my_error(ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT, MYF(0), name->str);
               3142 : +    return nullptr;
               3150 : +    my_error(ER_WRONG_PARAMETERS_TO_NATIVE_FCT, MYF(0), name->str);
               3151 : +    return nullptr;
               3170 : +    my_error(ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT, MYF(0), name->str);
               3171 : +    return nullptr;
               3179 : +    my_error(ER_WRONG_PARAMETERS_TO_NATIVE_FCT, MYF(0), name->str);
               3180 : +    return nullptr;
            

            These are minor omissions, but maybe it makes sense to add queries for them, for completeness.
            3141/3142 and 3170/3171 seem obvious, something like

            --error ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT
            select crc32(1,'foo','bar');
            --error ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT
            select crc32c(1,'foo','bar');
            

            should do it. For the other two errors, I can't see at the first glance how to get there, but I suppose whoever wrote it would know right away.

            I have no objections against merging the new variations of functions into 10.8 main and releasing with 10.8.1. The above note about MTR tests is not a mandatory requirement.
            It concerns the functions themselves, the way they appear to work as built-in functions. How InnoDB uses them to create its log is out of the scope of this task.

            The functions don't work with Columnstore (MCOL-4966 to track), but anyway Columnstore doesn't support everything that the server does, so I don't see it as a blocker. Also the functions lack proper parameter validation, but as a legacy issue concerning other existing functions (MDEV-27480), it cannot block the feature either.

            elenst Elena Stepanova added a comment - - edited According to GCOV, there are a few missing lines in the coverage: line numbers as of preview-10.8-MDEV-27265-misc 6e615c62b ===File sql/item_create.cc: 3141 : + my_error(ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT, MYF(0), name->str); 3142 : + return nullptr; 3150 : + my_error(ER_WRONG_PARAMETERS_TO_NATIVE_FCT, MYF(0), name->str); 3151 : + return nullptr; 3170 : + my_error(ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT, MYF(0), name->str); 3171 : + return nullptr; 3179 : + my_error(ER_WRONG_PARAMETERS_TO_NATIVE_FCT, MYF(0), name->str); 3180 : + return nullptr; These are minor omissions, but maybe it makes sense to add queries for them, for completeness. 3141/3142 and 3170/3171 seem obvious, something like --error ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT select crc32(1, 'foo' , 'bar' ); --error ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT select crc32c(1, 'foo' , 'bar' ); should do it. For the other two errors, I can't see at the first glance how to get there, but I suppose whoever wrote it would know right away. I have no objections against merging the new variations of functions into 10.8 main and releasing with 10.8.1. The above note about MTR tests is not a mandatory requirement. It concerns the functions themselves, the way they appear to work as built-in functions. How InnoDB uses them to create its log is out of the scope of this task. The functions don't work with Columnstore ( MCOL-4966 to track), but anyway Columnstore doesn't support everything that the server does, so I don't see it as a blocker. Also the functions lack proper parameter validation, but as a legacy issue concerning other existing functions ( MDEV-27480 ), it cannot block the feature either.

            The handling for the error ER_WRONG_PARAMETERS_TO_NATIVE_FCT is redundant and unreachable:

            --error ER_WRONG_PARAMETERS_TO_NATIVE_FCT
            select crc32c('' as empty);
            

            #0  my_error (nr=1583, MyFlags=0) at /mariadb/10.8/mysys/my_error.c:109
            #1  0x00005635f0b30506 in Create_native_func::create_func (
                this=0x5635f1ceb478 <Create_func_crc32c::s_singleton>, thd=0x7f8b34000d48, 
                name=0x7f8b4a75e6d0, item_list=0x7f8b34016f20)
                at /mariadb/10.8/sql/item_create.cc:2653
            #2  0x00005635f0a321e0 in MYSQLparse (thd=thd@entry=0x7f8b34000d48)
                at /mariadb/10.8/sql/sql_yacc.yy:10441
            

            Item*
            Create_native_func::create_func(THD *thd, LEX_CSTRING *name, List<Item> *item_list)
            {
              if (unlikely(has_named_parameters(item_list)))
              {
                my_error(ER_WRONG_PARAMETERS_TO_NATIVE_FCT, MYF(0), name->str);
                return NULL;
              }
             
              return create_native(thd, name, item_list);
            }
            

            I will replace those redundant error checks with assertions and include test cases to exercise this.

            marko Marko Mäkelä added a comment - The handling for the error ER_WRONG_PARAMETERS_TO_NATIVE_FCT is redundant and unreachable: --error ER_WRONG_PARAMETERS_TO_NATIVE_FCT select crc32c( '' as empty); #0 my_error (nr=1583, MyFlags=0) at /mariadb/10.8/mysys/my_error.c:109 #1 0x00005635f0b30506 in Create_native_func::create_func ( this=0x5635f1ceb478 <Create_func_crc32c::s_singleton>, thd=0x7f8b34000d48, name=0x7f8b4a75e6d0, item_list=0x7f8b34016f20) at /mariadb/10.8/sql/item_create.cc:2653 #2 0x00005635f0a321e0 in MYSQLparse (thd=thd@entry=0x7f8b34000d48) at /mariadb/10.8/sql/sql_yacc.yy:10441 Item* Create_native_func::create_func(THD *thd, LEX_CSTRING *name, List<Item> *item_list) { if (unlikely(has_named_parameters(item_list))) { my_error(ER_WRONG_PARAMETERS_TO_NATIVE_FCT, MYF(0), name->str); return NULL; }   return create_native(thd, name, item_list); } I will replace those redundant error checks with assertions and include test cases to exercise this.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.