[MDEV-15032] Encrypted database created on big endian machine cannot be used on little endian machine Created: 2018-01-22  Updated: 2021-03-01  Resolved: 2021-03-01

Status: Closed
Project: MariaDB Server
Component/s: Encryption, Storage Engine - InnoDB, Storage Engine - XtraDB
Affects Version/s: 10.1
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Jan Lindström (Inactive) Assignee: Jan Lindström (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None

Attachments: File big_endian_notpatched.tgz     File big_endian_patched.tgz     File little_endian_notpatched.tgz     File little_endian_patched.tgz     File my.cnf    
Issue Links:
Relates
relates to MDEV-16416 Crash on IMPORT TABLESPACE of a ROW_F... Closed
relates to MDEV-12396 MariaDB Server SEGV crash when import... Closed
Sprint: 10.2.13

 Description   

See attached test case how to create tables on test and actual used database.



 Comments   
Comment by Jan Lindström (Inactive) [ 2018-01-22 ]

jan@jan-laptop-asus:~/jandb/jan3$ /usr/local/mysql/bin/mysql -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 10.1.31-MariaDB-debug Source distribution
 
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [(none)]> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Database changed
MariaDB [test]> select * from t0;
+----+------------------------------------------------------------------------------------------------------+
| c1 | b                                                                                                    |
+----+------------------------------------------------------------------------------------------------------+
|  1 | AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAa |
+----+------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select * from t1;
ERROR 1296 (HY000): Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
MariaDB [test]> select * from t2;
+----+------------------------------------------------------------------------------------------------------+
| c1 | b                                                                                                    |
+----+------------------------------------------------------------------------------------------------------+
|  1 | AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAa |
+----+------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select * from t3;
+----+------------------------------------------------------------------------------------------------------+
| c1 | b                                                                                                    |
+----+------------------------------------------------------------------------------------------------------+
|  1 | AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAa |
+----+------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select * from t4;
+----+------------------------------------------------------------------------------------------------------+
| c1 | b                                                                                                    |
+----+------------------------------------------------------------------------------------------------------+
|  1 | AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAa |
+----+------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
 
MariaDB [test]> select * from t5;
ERROR 1296 (HY000): Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB

Comment by Jan Lindström (Inactive) [ 2018-02-05 ]

Ported required change from MySQL 5.7 to both InnoDB and XtraDB to 10.1:

commit 13748be548815d68b4644c65ed7d2852b60b9b67
Author: Vasil Dimov <vasil.dimov@oracle.com>
Date: Wed Apr 22 15:10:23 2015 +0300

Fix Bug#20783098 INNODB_CHECKSUM_ALGORITHM=CRC32 IS NOT BYTE ORDER AGNOSTIC

The CRC32 checksum generation code interprets portions of the byte
string to checksum as a 8-byte integer so that it can process 8 bytes
at a time (rather than 1 byte at a time). For this, the code uses the
native byte order of the machine:

crc ^= (ib_uint64_t) buf;

and then does numerical calculations with the result (e.g. crc >> N).
Thus the resulting checksum depends on the byte order of the machine
and is different on big and little endian machines. This means that
files written to with --innodb-checksum-algorithm=crc32/strict_crc32 on
big (little) endian machines are not readable on little (big) endian
machines because the checksum, though valid, is not recognized.

The simplest solution would be to start writing only e.g. big endian
checksums and recognize only such ones, but this would introduce an
unacceptable backwards incompatibility.

The solution implemented is to recognize both big and little endian
CRC32 checksums during verification, while first calculating and
checking the little endian one.

Swapping the byteorder in order to calculate "the other" CRC32 checksum
slows down the checksum calculation by about 1-2% (e.g. recognize
big-endian-CRC32 on little endian machines or recognize
little-endian-CRC32 on big endian machines).

When generating the checksum (when writing to disk) we now always use
little endian byteorder (no change in little endian machines, and an
extra step of swapping the byteorder on big-endian machines).

Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
RB: 8781

Comment by Jan Lindström (Inactive) [ 2018-02-05 ]

Manually tested using database created on big endian machine on little endian machine.
Currently IBM VPN can't connect so can't yet test other way around. Also used test
case on second commit (not pushed as not all bb machines have unzip).

https://github.com/MariaDB/server/commit/1cd1db9602476ec6864f4a521de97d788fe93fac

Comment by Marko Mäkelä [ 2018-02-06 ]

thiru, please review this. Once you think that the change is in accordance with my comments below, and once you are happy with it, reassign to me for final check. This will be a good opportunity for you to learn about the MariaDB 10.1 encryption and page_compression features that are somewhat similar to the corresponding MySQL 5.7 features.

We have to carefully consider the implications of this. This would not be the first file format problem that was introduced in the MariaDB 10.1 series. We already had MDEV-11623 and various problems with innodb_encrypt_log=ON, the latest being MDEV-14874.

You should be aware that the ut_crc32() that was introduced to MariaDB 10.0 and 10.1 via MySQL 5.6 is not portable between big endian and little endian systems. MySQL 5.7 finally fixed this and introduced a "legacy CRC-32C" for being compatible with checksums that were generated by big-endian systems running the buggy 5.6 checksum implementation. For POWER8, MariaDB contains a SIMD accelerated CRC-32C implemented by the vpmsum instruction. But there is also a ‘generic’ CRC-32C implementation.

With the assumption that MariaDB is rarely deployed on big-endian machines, I think that it would be acceptable and preferable to declare that the little endian machines produced the correct CRC-32C checksum for encrypted or page_compressed pages, and the big endian machines used the wrong checksum. On big-endian systems, we could implement fallback logic that if the correctly computed CRC-32C checksum does not match, we will try the buggy CRC-32C as well. As part of this, it might help to backport the CRC-32C changes from 10.2 to 10.1.

jplindst, I hope that you ported the code changes from 10.3 and not 10.2 or 5.7. svoj recently cleaned up the CRC-32C functions in 10.3. Also, I hope that we are not weakening the CRC-32C checksum on little-endian systems by tolerating the wrong checksums from big-endian systems.

jplindst, were the changes also implemented in innochecksum? Where is the 10.2 version of this fix? In the 10.2 version, innochecksum must support rewriting the page checksums, using the bug-free algorithm only.

I think that we need portability tests that involve generating or editing empty compressed, encrypted, and compressed+encrypted data files (just the clustered index page 3) with a perl snippet, similar to the innodb.log_corruption test in 10.2. Maybe it would be easiest to let IMPORT TABLESPACE access and adjust the generated files.

When reviewing this, try to pay close attention to what happens when upgrading from affected 10.1 versions to the fixed 10.1 version, and when upgrading from the fixed 10.1 to an older 10.2 version that is missing the fix. Also consider whether a downgrade to an earlier 10.1 version will be possible at all. Ideally all this would have been already documented by jplindst in this ticket.

Last but not least: I would ignore any portability problems on the 10.1 innodb_encrypt_log=1 redo log files. Starting with 10.2 and MDEV-11782, the encrypted redo log format is portable across platforms, and encryption.innodb_encrypt_log_corruption is demonstrating it.

Comment by Jan Lindström (Inactive) [ 2018-02-07 ]

See my report on portability tests on https://jira.mariadb.org/browse/MDEV-13103

Downgrade from the fixed 10.1 to not fixed 10.1 (same applies to 10.2) naturally will not work for big endian databases.

Comment by Marko Mäkelä [ 2018-02-07 ]

One more note: There are strong reasons not to change the function of innodb_checksum_algorithm=crc32 in 10.1. The reason why the bug was not fixed in MySQL 5.6 (or MariaDB 10.0 and 10.1) is that it is not the default setting, that is, by default, unencrypted data files are not affected by the portability issue. I do not want to introduce any format change in 10.1 for unencrypted tables, in case someone would want to downgrade from a later 10.1 to an earlier version (due to MDEV-11623, not earlier than 10.1.21), or export data from a newer version to an earlier version.

With encrypted tables in MariaDB 10.1, it is a different situation, because CRC-32C is the only checksum algorithm. Luckily for us, encryption is not enabled by default. Still, we must make sure that we only change the format of the encrypted page checksum on big-endian systems, and only accept a malformed encrypted page checksum when the code is running on a big-endian machine (#ifdef WORDS_BIGENDIAN).

Comment by Jan Lindström (Inactive) [ 2018-02-07 ]

Actually, now that I read your comment, I do agree. So I should revert the code changes to other checksum verify functions except fil0crypt.cc and even that should be modified to use above #ifdef.

Comment by Jan Lindström (Inactive) [ 2018-02-07 ]

https://github.com/MariaDB/server/commits/bb-10.1-MDEV-15032

Comment by Jan Lindström (Inactive) [ 2018-02-08 ]

Compatibility report based on attached databases:

  • Big-endian machine (rhel7-ppc64)
    • 10.1-unpatched
      • little_endian_notpatched.tgz: all not encrypted tables ok
      • little_endian_patched.tgz:all not encrypted tables ok
      • big_endian_notpatched.tgz: all tables ok
      • big_endian_patched.tgz: all not encrypted tables ok
    • 10.1-patched
      • little_endian_notpatched.tgz: all tables ok
      • little_endian_patched.tgz: all tables ok
      • big_endian_notpatched.tgz: all tables ok
      • big_endian_patched.tgz: all tables ok
  • Little-endian machine (my laptop)
    • 10.1-unpatched
      • little_endian_notpatched.tgz: all tables ok
      • little_endian_patched.tgz: all tables ok
      • big_endian_notpatched.tgz: all not encrypted tables ok
      • big_endian_patched.tgz: all tables ok
    • 10.1-patched
      • little_endian_notpatched.tgz: all tables ok
      • little_endian_patched.tgz: all tables ok
      • big_endian_notpatched.tgz: all tables ok
      • big_endian_patched.tgz: all tables ok

Note that import does work on all combinations as already documented on MDEV-13103 as post encryption checksum is not checked. Similarly innochecksum is not affected.

Comment by Jan Lindström (Inactive) [ 2018-03-01 ]

thiru Could you please review.

Comment by Thirunarayanan Balathandayuthapani [ 2018-03-02 ]

Patch looks good to me.

Comment by Marko Mäkelä [ 2018-03-05 ]

jplindst, I did not find any test case.
For what is it worth, MySQL 5.7 contains the test innodb.crc32_endianness which attempts IMPORT TABLESPACE with various data files using. I find the idea good, but the implementation should be based on Perl code that creates the data files. We would also have to deal with more combinations in MariaDB.

Comment by Marko Mäkelä [ 2018-03-07 ]

Please add a test case that creates a few copies of a 4-page-payload .ibd file using Perl, with different checksums, and using different compression or encryption on page 3 (the clustered index root page). The test should attempt ALTER TABLE…IMPORT TABLESPACE of the files and subsequently access the table.

This should take care of the cross-platform testing.

Comment by Marko Mäkelä [ 2018-03-12 ]

If PageConverter::validate() fails to validate page checksums during the import of encrypted or page_compressed tables, that is a grave bug that must be fixed. For encrypted tables it would very well fall within the scope of this very bug.

Actually, last week when I modified one of the ALTER TABLE…IMPORT TABLESPACE tests of an encrypted or compressed table, I accidentally made it copy a .cfg file to an .ibd file. The server would crash because of that. I would have expected it to report an error for a corrupted data file.

Comment by Jan Lindström (Inactive) [ 2018-03-12 ]

marko PageConverter::validate() should detect corruption if decrypt is done with incorrect key or page was corrupted. But, I do agree your point so I will add post encryption checksum check before decryption. For compressed tables can't help if it crashed during decompression within this bug. It will be fixed on https://jira.mariadb.org/browse/MDEV-13103

Comment by Jan Lindström (Inactive) [ 2018-03-13 ]

https://github.com/MariaDB/server/tree/bb-10.1-MDEV-15032

Comment by Marko Mäkelä [ 2018-06-06 ]

The fil_iterate() code had been refactored in MDEV-12396. I ported the patch and did some other cleanup.
This change alone (verify the encrypted page checksum before decryption in IMPORT) does not appear to fix anything. We additionally have to implement the correct CRC-32C checksum on big endian machines, so that after the fix, big endian machines will create portable data files.

I would not touch the checksums of little-endian machines, so that the checksums will not become weaker. On big-endian machines, they will necessarily be weakened, because in case the proper CRC-32C checksum produces a mismatch, we would fall back to computing the buggy "legacy CRC-32C" checksum. In MariaDB 10.4, we can remove support for the "legacy CRC-32C". This would affect big-endian machines only.

Comment by Marko Mäkelä [ 2018-06-07 ]

The submitted MDEV-15032 fix consisted of two commits. The first one backports some changes to ut_crc32(), and the second one modifies IMPORT TABLESPACE so that the encrypted page checksum will be validated before decryption of the page is attempted. I pushed the latter patch as part of the MDEV-16416 bug fix.

What remains to be done in this ticket is a fix of the CRC-32C checksum in a way that does not weaken the checksum on little-endian systems.

Comment by Marko Mäkelä [ 2018-08-29 ]

I don’t think that we should spend effort on this unless someone complains. The CRC-32C implementation that is used by InnoDB encryption is byte order agnostic starting with MariaDB 10.2.2.

Comment by Marko Mäkelä [ 2021-03-01 ]

This bug affected the 10.1 series only, and 10.1 has reached its end of life.

Generated at Thu Feb 08 08:18:10 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.