[MDEV-11759] Encryption code in MariaDB 10.1/10.2 causes compatibility problems Created: 2017-01-10  Updated: 2019-02-09  Resolved: 2017-03-13

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.1, 10.2
Fix Version/s: 10.1.22, 10.2.5

Type: Bug Priority: Critical
Reporter: Michael Graf Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: 10.2-rc
Environment:

Windows Server 2012 R2 Standard
128 GB RAM


Attachments: Zip Archive MG-GUI-SSD2.errorlog.zip     File my.ini     Zip Archive mysqld_minidumps.zip     Text File show full processlist.txt    
Issue Links:
Duplicate
is duplicated by MDEV-11929 During delete: InnoDB: Assertion fail... Closed
PartOf
is part of MDEV-12253 Buffer pool blocks are accessed after... Closed
Problem/Incident
causes MDEV-12419 IMPORT should not look up tablespace ... Closed
causes MDEV-18529 InnoDB wrongly skips decryption of en... Closed
Relates
relates to MDEV-11974 Disable encryption of spatial indexes... Closed
relates to MDEV-11939 innochecksum mistakes a file for an e... Closed
relates to MDEV-12396 MariaDB Server SEGV crash when import... Closed
Sprint: 10.2.4-1, 10.2.4-2, 10.1.22

 Description   

MariaDB server 10.2.3 stuck after our application did some inserts/updates/deletes/selects.

To avoid the crash from MDEV-11756 I dropped the index from table MsgID2Message.



 Comments   
Comment by Marko Mäkelä [ 2017-01-23 ]

In the error log, I spotted the following:

2017-01-10 17:07:26 5292 [ERROR] InnoDB: Block in space_id 0 in file innodb_system encrypted.
2017-01-10 17:07:26 5292 [ERROR] InnoDB: However key management plugin or used key_id 30 is not found or used encryption algorithm or method does not match.
2017-01-10 17:07:26 5292 [ERROR] InnoDB: Marking tablespace as missing. You may drop this table or install correct key management plugin and key file.

This is the wrong course of action. The InnoDB system tablespace contains essential data (enumerated in MDEV-11633) and we cannot disable access to it.
I do not see any hints of encryption in the my.ini. I guess the page was merely corrupted, but we are mistaking it for an encrypted one.

In the error log, I also recovered a page dump. It looks like an unencrypted B-tree page. If I am guessing correctly, it is the clustered index leaf page of some table whose PRIMARY KEY consists of three 32-bit integers. But the DB_TRX_ID of most (if not all) records would look suspiciously small, only 0x546. Either these records were inserted soon after the database was originally initialized, or we wrongly reset the TRX_ID sequence when access to the system tablespace was disabled. The records look a little strange for me, because some records are longer, some shorter, and some contain text strings of timestamps delimited by # characters. The linked list of records on the page looks correct to me.

Finally, I got curious and wanted to run innochecksum on the 16k page dump that I extracted from the hexdump in the error log. For some reason, innochecksum is not reporting the page as corrupted, but single-stepping in gdb revealed this problem:

591		if (mach_read_from_4(buf+FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION) != 0) {
592			is_corrupted = fil_space_verify_crypt_checksum(buf, page_size,

That is, innochecksum (and I guess also MariaDB Server) wrongly identifies the page as an encrypted one. Luckily for us, the first check fails (in a worse case, we could mistake a corrupted unencrypted page for a valid encrypted one), and then we fall back to the unencrypted check. But, that check is bogus for this page:

859		if (mach_read_from_4(read_buf+FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION) != 0
860		    || mach_read_from_2(read_buf+FIL_PAGE_TYPE) == FIL_PAGE_PAGE_COMPRESSED_ENCRYPTED) {
861			page_encrypted = true;
862		}
936		if (page_encrypted) {
937			return (FALSE);
938		}

Because the mach_read_from_4() happened to return 30 for this page, buf_page_is_corrupted() will wrongly assume that the page is encrypted, and ultimately claim that the page is valid, even though we just determined that it cannot be a valid encrypted page.

How did we get the value 30 into the field FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION?
And how can the MariaDB page encryption possibly work with SPATIAL INDEX in MariaDB Server 10.2? The same field was reused in MySQL 5.7 for FIL_RTREE_SPLIT_SEQ_NUM on R-tree index pages.

Finally, the hang in the error log is triggered by buf_wait_for_read() for this code:

			if (io_fix == BUF_IO_READ) {
				/* Wait by temporaly s-latch */
				rw_lock_s_lock(&block->lock);

Apparently the hang occurs either at the start of query execution when ha_innobase::records_in_range() is being invoked, or during CHECK TABLE.

The function buf_wait_for_read() is identical with MySQL 5.7 (including the typo in the comment). This area of the code was last changed several years ago (Oracle Bug#16249481), so we might want to look for the culprit elsewhere in the MariaDB Server 10.2 code base.

One simple explanation could be that when the modified InnoDB in MariaDB 10.2 noticed page corruption, it forgot to release the block->lock, setting a ‘trap’ for any subsequent thread that might want to access the block.

Comment by Marko Mäkelä [ 2017-01-23 ]

I think that we must fix several independent issues:

  1. if FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION happens to be nonzero (it could be nonzero on system tablespace pages if the system tablespace used to consist of multiple files at some point), we must not simply assume that the page is encrypted (more consistency checks are needed)
  2. the local variable page_encrypted must be removed from buf_page_is_corrupted(), and possibly replaced with a parameter
  3. Encryption must be properly tested with SPATIAL INDEX before 10.2 goes GA. I do not think that it can work in its current form.

Note that MySQL 5.7 contains this change (to prepare for the repurposing of the field as FIL_RTREE_SPLIT_SEQ_NUM), but it does not retroactively clear the field to 0 on old files:

commit 8406d6aac3be6353bf693159ea0a5163edae0179
Author: Marko Mäkelä <marko.makela@oracle.com>
Date:   Wed Jul 2 10:26:42 2014 +0300
 
    WL#7990 Repurpose FIL_PAGE_FLUSH_LSN
    
    The field FIL_PAGE_FLUSH_LSN is consulted during InnoDB startup.
    
    At startup, InnoDB reads the FIL_PAGE_FLUSH_LSN from the first page of
    each file in the InnoDB system tablespace. If there are multiple
    files, the minimum and maximum LSN can differ. These numbers are
    passed to InnoDB startup.
    
    Having the number in other files than the first file of the InnoDB
    system tablespace is not providing much additional value. It is
    conflicting with other use of the field, such as on InnoDB R-tree
    index pages.
    
    The FIL_PAGE_FLUSH_LSN was also being written to InnoDB undo
    tablespace files, even though the fields are not going to be consulted
    on crash recovery.
    
    This worklog will stop writing FIL_PAGE_FLUSH_LSN to other files
    than the first file of the InnoDB system tablespace (page number 0:0).

Comment by Marko Mäkelä [ 2017-01-23 ]

Note: The value 30 would be an unlikely small value of FIL_PAGE_FLUSH_LSN. A more likely explanation could be that the system tablespace file was originally created with an old version of MySQL where InnoDB did not bother to initialize unused data fields, writing garbage contents.

Comment by Jan Lindström (Inactive) [ 2017-02-02 ]

bb-10.1-MDEV-11759

Comment by Marko Mäkelä [ 2017-02-02 ]

I think that more work is needed to ensure that the encryption code is as compatible with old files as possible. (Old files may contain uninitialized garbage in data fields that have been repurposed for encryption.)

Instead of trusting the page contents before checksum validation, we should check fil_space_t if the pages are supposed to be encrypted. For encrypted pages, we can assume that FIL_PAGE_TYPE is valid. For unencrypted pages, we cannot assume so, because MySQL 5.1 and earlier did not always initialize FIL_PAGE_TYPE.

Comment by Marko Mäkelä [ 2017-02-06 ]

Another data set (page 0 of the system tablespace, and page 8192 of the system tablespace, in the first page of a second file of the system tablespace) is attached in the duplicate issue MDEV-11939.

Comment by Jan Lindström (Inactive) [ 2017-02-06 ]

commit ddf2fac73381b84114d31c178d9207afc27bfa4d
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Mon Feb 6 10:47:55 2017 +0200

MDEV-11759: Encryption code in MariaDB 10.1/10.2 causes

compatibility problems

Pages that are encrypted contain post encryption checksum on
different location that normal checksum fields. Therefore,
we should before decryption check this checksum to avoid
unencrypting corrupted pages. After decryption we can use
traditional checksum check to detect if page is corrupted
or unencryption was done using incorrect key.

Pages that are page compressed do not contain any checksum,
here we need to fist unencrypt, decompress and finally
use tradional checksum check to detect page corruption
or that we used incorrect key in unencryption.

buf0buf.cc: buf_page_is_corrupted() mofified so that
compressed pages are skipped.

buf0buf.h, buf_block_init(), buf_page_init_low():
removed unnecessary page_encrypted, page_compressed,
stored_checksum, valculated_checksum fields from
buf_page_t

buf_page_get_gen(): use new buf_page_check_corrupt() function
to detect corrupted pages.

buf_page_check_corrupt(): If page was not yet decrypted
check if post encryption checksum still matches.
If page is not anymore encrypted, use buf_page_is_corrupted()
traditional checksum method.

If page is detected as corrupted and it is not encrypted
we print corruption message to error log.
If page is still encrypted or it was encrypted and now
corrupted, we will print message that page is
encrypted to error log.

buf_page_io_complete(): use new buf_page_check_corrupt()
function to detect corrupted pages.

buf_page_decrypt_after_read(): Verify post encryption
checksum before tring to decrypt.

fil0crypt.cc: fil_encrypt_buf() verify post encryption
checksum and ind fil_space_decrypt() return true
if we really decrypted the page.

fil_space_verify_crypt_checksum(): rewrite to use
the method used when calculating post encryption
checksum. We also check if post encryption checksum
matches that traditional checksum check does not
match.

fil0fil.ic: Add missed page type encrypted and page
compressed to fil_get_page_type_name()

Note that this change does not yet fix innochecksum tool,
that will be done in separate MDEV.

Fix test failures caused by buf page corruption injection.

Comment by Jan Lindström (Inactive) [ 2017-02-07 ]

Addressing review comments:
http://lists.askmonty.org/pipermail/commits/2017-February/010637.html

Comment by Marko Mäkelä [ 2017-02-20 ]

I noticed some problems when running the test innodb.innodb_bug14147491, which is intentionally corrupting a data page.

  1. The function buf_page_check_corrupt() must be declared static, because it is only used in one module.
  2. In the test innodb.innodb_bug14147491, after getting a page dump for a corrupted page, and after buf_page_io_complete() called buf_mark_space_corrupt() to release the page to the buffer pool, buf_page_get_gen() is invoking buf_page_check_corrupt() on the already invalidated buffer block object, looking up bpage->space=ULINT32_UNDEFINED which obviously does not exist, and then dereferencing the null pointer when trying to access space->crypt_data.

Please do the error handling at the lowest possible level, and avoid accessing the already freed buffer block.

Comment by Marko Mäkelä [ 2017-02-23 ]

In this encryption.innodb_onlinealter_encryption failure we are attempting to re-read a page of a non-existing tablespace:

2017-02-23  3:19:41 2812263232 [ERROR] InnoDB: Trying to do i/o to a tablespace which does not exist. i/o type 10, space id 7, page no. 585, i/o length 16384 bytes
2017-02-23 03:19:41 a79fbb40  InnoDB: Error: trying to access tablespace 7 page no. 585,
InnoDB: but the tablespace does not exist or is just being dropped.
2017-02-23 03:19:41 a79fbb40  InnoDB: Error: trying to access tablespace 7 page no. 585,
InnoDB: but the tablespace does not exist or is just being dropped.
170223  3:19:41 [ERROR] mysqld got signal 11 ;
fil/fil0crypt.cc:1905(fil_crypt_get_page_throttle_func)[0x8a15a6e]

It seems that fil_crypt_get_page_throttle_func() should use fil_space_acquire_silent() and fil_space_release(). But maybe we should also revise the reread logic to prevent such a futile operation. A missing tablespace cannot reappear, so there is no point to retry.

Comment by Jan Lindström (Inactive) [ 2017-02-27 ]

https://github.com/MariaDB/server/commit/3412e6a73b45a83ba45368be22bc162bbf39ff2b

Comment by Marko Mäkelä [ 2017-02-27 ]

I think that the patch needs some more work, and it also seems to depend on some changes that are being done in MDEV-11738.

Comment by Jan Lindström (Inactive) [ 2017-03-01 ]

https://github.com/MariaDB/server/commit/dc166632ca30071d83c4a1c14de9b322c9510f6e

Comment by Marko Mäkelä [ 2017-03-02 ]

I think that we need some more testing to cover the error cases.

Comment by Marko Mäkelä [ 2017-03-14 ]

The comments after 2017-02-07 will be addressed as part of MDEV-12253, which will be fixed after the 10.1.22 release.

Generated at Thu Feb 08 07:52:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.