  1. MariaDB Server
  2. MDEV-17958

Make bug-endian innodb_checksum_algorithm=crc32 optional

Details

    Description

      In MySQL 5.7, it was noticed that files are not portable between big-endian and little-endian systems (such as SPARC and x86), because the original implementation of innodb_checksum_algorithm=crc32 was not byte-order agnostic.

      A byte-order agnostic implementation of innodb_checksum_algorithm=crc32 was added only in MySQL 5.7 and was not backported to 5.6. Consequently, MariaDB Server versions 10.0 and 10.1 contain only the CRC-32C implementation that works incorrectly on big-endian architectures, and MariaDB Server 10.2.2 got the byte-order agnostic CRC-32C implementation from MySQL 5.7.

      MySQL 5.7 introduced a "legacy crc32" variant that is functionally equivalent to the big-endian version of the original crc32 implementation. Thanks to this variant, old data files can be transferred from big-endian systems to newer versions.
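
      The byte-order dependence can be illustrated with a small sketch. It is not the server's implementation: it assumes that the bug amounts to consuming each 8-byte word in reversed byte order, which is what a native 64-bit load would present on a big-endian CPU. The function names are hypothetical, and the sketch only handles inputs that are a whole number of 8-byte words.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # CRC-32C (Castagnoli), bit-reflected form


def crc32c(data: bytes) -> int:
    """Byte-order-agnostic CRC-32C: consumes the input strictly byte by byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial when that bit was set.
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def crc32c_bug_endian(data: bytes) -> int:
    """Hypothetical model of the legacy variant (an assumption, not server code):
    every 8-byte word is consumed in reversed byte order."""
    if len(data) % 8 != 0:
        raise ValueError("this sketch only handles whole 8-byte words")
    swapped = b"".join(data[i:i + 8][::-1] for i in range(0, len(data), 8))
    return crc32c(swapped)


page = bytes([0, 0, 0, 0, 1, 2, 0, 0])          # toy 8-byte "page"
print(hex(crc32c(b"123456789")))                 # 0xe3069283, the standard CRC-32C check value
print(crc32c(page) == crc32c_bug_endian(page))   # False: the two variants disagree
```

      Under this model, a page written with one variant fails verification with the other unless the reader also tries the second computation, which is what the "legacy crc32" fallback does.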

      Introducing new variants of checksum algorithms (without introducing new names for them) is generally a bad idea, because each checksum algorithm is like a lottery ticket. The more algorithms you try, the more likely it is that one of the checksums will match on a corrupted page.
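
      The lottery-ticket argument can be put into numbers with a rough sketch. It assumes, as a simplification, that each accepted variant behaves like an independent, uniformly distributed 32-bit value on a corrupted page; the function name is hypothetical.

```python
from fractions import Fraction


def false_accept_probability(variants: int, checksum_bits: int = 32) -> Fraction:
    """Chance that a corrupted page matches at least one of `variants` checksums.

    Assumption: each variant is an independent, uniform `checksum_bits`-bit value.
    """
    if variants < 0:
        raise ValueError("variants must be non-negative")
    miss_one = 1 - Fraction(1, 2 ** checksum_bits)  # one variant fails to match
    return 1 - miss_one ** variants                 # at least one variant matches


for n in (1, 2, 3):
    # Each additional accepted variant adds roughly another 1-in-2**32 chance.
    print(n, float(false_accept_probability(n)))
```

      Under these assumptions, accepting two variants roughly doubles the chance that corruption goes undetected compared with accepting one.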

      So, essentially MySQL 5.7 weakened innodb_checksum_algorithm=crc32, and MariaDB 10.2.2 inherited this weakening.

      We elect to remove the bug-compatible variant of innodb_checksum_algorithm=crc32 as follows:

      1. We assume that most users are on little-endian systems.
      2. Let us make the bug-compatible variant present only on big-endian systems (#ifdef WORDS_BIGENDIAN).
      3. If someone is upgrading from MariaDB 10.0 or 10.1 or MySQL 5.6 to MariaDB 10.2 or later, they will stay on the same architecture.
      4. Completely remove the bug-compatible variant from MariaDB 10.4.
      5. If someone is switching from big-endian to little-endian, they can do it with a logical dump, or they can use innochecksum to recompute the checksums.

          Activity

            thiru Thirunarayanan Balathandayuthapani created issue -
            thiru Thirunarayanan Balathandayuthapani made changes -
            Field Original Value New Value
            Description There is a variant of the crc32 algorithm for big-endian systems. It also weakens the crc32 checksum algorithm.
            In MySQL 5.7, it was noticed that files are not portable between big-endian and little-endian systems. So, the crc32 checksum was weakened by introducing a "legacy crc32" variant that allows files written on big-endian systems to be read on little-endian systems. This creates one more variant of the crc32 checksum algorithm. This crc32 weakening has been present since 10.2.2.

            Instead of removing it completely, move the big-endian variant of crc32 under #ifdef WORDS_BIGENDIAN in 10.2 and 10.3. It can be removed in 10.4.
            marko Marko Mäkelä made changes -
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Affects Version/s 10.4.0 [ 23115 ]
            Affects Version/s 10.3.0 [ 22127 ]
            Affects Version/s 10.2.2 [ 22013 ]
            Affects Version/s 10.2 [ 14601 ]
            Labels checksum compat56 compat57 corruption
            Summary Remove Big endian variant of crc32. Remove Big endian variant of innodb_checksum_algorithm=crc32
            thiru Thirunarayanan Balathandayuthapani made changes -
            Summary Remove Big endian variant of innodb_checksum_algorithm=crc32 Keep Big endian variant of innodb_checksum_algorithm=crc32 only in big-endian machine
            thiru Thirunarayanan Balathandayuthapani made changes -
            Summary Keep Big endian variant of innodb_checksum_algorithm=crc32 only in big-endian machine Keep big endian variant of innodb_checksum_algorithm=crc32 only in big-endian machine
            thiru Thirunarayanan Balathandayuthapani made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            thiru Thirunarayanan Balathandayuthapani made changes -
            Assignee Thirunarayanan Balathandayuthapani [ thiru ] Marko Mäkelä [ marko ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            marko Marko Mäkelä made changes -
            Summary Keep big endian variant of innodb_checksum_algorithm=crc32 only in big-endian machine Make bug-endian innodb_checksum_algorithm=crc32 optional

            marko Marko Mäkelä added a comment -

            I introduced the build option

            cmake -DWITH_INNODB_BUG_ENDIAN_CRC32=ON

            It is OFF by default on little-endian systems, and ON by default on big-endian systems.

            I plan to remove this option (and the code for supporting the buggy algorithm) in the upcoming MariaDB 10.4.1 beta release.
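
            The default described in this comment can be modelled in a few lines. This is only an illustration of the stated rule, not the actual CMake logic; the function names and the variant labels are hypothetical.

```python
import sys


def bug_endian_default(byteorder: str = sys.byteorder) -> bool:
    """Default of WITH_INNODB_BUG_ENDIAN_CRC32 as described in the comment:
    ON for big-endian builds, OFF for little-endian builds."""
    if byteorder not in ("little", "big"):
        raise ValueError("byteorder must be 'little' or 'big'")
    return byteorder == "big"


def accepted_crc32_variants(byteorder: str = sys.byteorder, with_bug_endian=None) -> list:
    """Checksum variants a build would try for innodb_checksum_algorithm=crc32.

    Passing with_bug_endian explicitly models overriding the default on the
    cmake command line; the labels are illustrative only.
    """
    if with_bug_endian is None:
        with_bug_endian = bug_endian_default(byteorder)
    variants = ["crc32"]
    if with_bug_endian:
        variants.append("legacy big-endian crc32")
    return variants


print(accepted_crc32_variants("little"))        # ['crc32']
print(accepted_crc32_variants("big"))           # ['crc32', 'legacy big-endian crc32']
print(accepted_crc32_variants("little", True))  # ['crc32', 'legacy big-endian crc32']
```

            With the option OFF, a little-endian build tries only one crc32 computation per page, which restores the original strength of the checksum.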
            marko Marko Mäkelä made changes -
            Resolution Date 2018-12-13 16:02:19.0 2018-12-13 16:02:19.011
            marko Marko Mäkelä made changes -
            Fix Version/s 10.4.1 [ 23228 ]
            Fix Version/s 10.2.20 [ 23212 ]
            Fix Version/s 10.3.12 [ 23214 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Resolution Fixed [ 1 ]
            Status In Review [ 10002 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 91139 ] MariaDB v4 [ 155338 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 118441 132996 109663 161967

            People

              Assignee: marko Marko Mäkelä
              Reporter: thiru Thirunarayanan Balathandayuthapani
