[MDEV-12419] IMPORT should not look up tablespace in PageConverter::validate()
Created: 2017-03-31
Updated: 2017-05-09
Resolved: 2017-05-03

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.1, 10.2
Fix Version/s: 10.1.23, 10.2.6

Type: Bug
Priority: Major
Reporter: Marko Mäkelä
Assignee: Jan Lindström (Inactive)
Resolution: Fixed
Votes: 0
Labels: 10.2-ga, regression

Issue Links:
Duplicate
is duplicated by MDEV-12396 MariaDB Server SEGV crash when import... Closed
PartOf
is part of MDEV-12253 Buffer pool blocks are accessed after... Closed
Problem/Incident
is caused by MDEV-11759 Encryption code in MariaDB 10.1/10.2 ... Closed

Description

MDEV-11759 added a tablespace lookup to PageConverter::validate(), based on the contents of the page. This does not seem to make any sense. When InnoDB is importing a tablespace, the tablespace ID that is being used in the file may or may not be in use by existing tables in the server instance. If the ID is not used, the lookup would always return NULL. If a tablespace by that ID happens to exist, then we could be in deep trouble, because we are invoking buf_page_is_corrupted() with a completely bogus tablespace.
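
The hazard described above can be illustrated with a minimal sketch. Every name here (Tablespace, Registry, resolve_for_imported_page) is hypothetical and is not InnoDB code; the sketch only assumes that tablespaces are found by a numeric ID, and that the ID stored in an imported file has no relation to the IDs registered in the importing server.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a tablespace known to the running server.
struct Tablespace {
    std::string name;
    bool encrypted;
};

// Hypothetical registry keyed by tablespace ID (not InnoDB's real data structure).
using Registry = std::map<std::uint32_t, Tablespace>;

// Returns the registered tablespace for `id`, or nullptr if none exists.
const Tablespace* lookup_by_id(const Registry& registry, std::uint32_t id) {
    auto it = registry.find(id);
    return it == registry.end() ? nullptr : &it->second;
}

// Mimics the questioned pattern: resolve a tablespace from the ID that is
// stored inside a page of the file being imported.
const Tablespace* resolve_for_imported_page(const Registry& registry,
                                            std::uint32_t id_stored_in_page) {
    return lookup_by_id(registry, id_stored_in_page);
}

// Small demonstration of both outcomes described in the report.
void demo_import_lookup() {
    Registry registry;
    registry.emplace(7, Tablespace{"test/existing", true});

    // Case 1: the ID in the imported file is unused here, so the lookup is NULL.
    assert(resolve_for_imported_page(registry, 42) == nullptr);

    // Case 2: the ID collides with an unrelated table, so the "match" is bogus.
    const Tablespace* bogus = resolve_for_imported_page(registry, 7);
    assert(bogus != nullptr && bogus->name == "test/existing");
}
```

In both cases the result says nothing about the file being imported, which is why validation during import would have to rely on the file's own metadata rather than on such a lookup.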

Please add a proper IMPORT test for page_compression and encryption.

In 10.2, please remove the function fil_space_found_by_id() altogether. Like fil_space_get(), it is unsafe to use, because the pointer can be stale if the tablespace was dropped just after the call.
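
A rough sketch of why an unpinned lookup is unsafe, and how a pin count with a scoped guard avoids the stale pointer. All names (Space, SpaceRegistry, SpaceGuard) are hypothetical; this is a single-threaded simplification and not the actual fil_space_acquire() or FilSpace implementation. In particular, the sketch makes drop() refuse while a pin is held, which is an assumption chosen for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

// Hypothetical tablespace object with a pin count (illustration only).
struct Space {
    std::uint32_t id = 0;
    int pins = 0;  // number of active users currently holding the space
};

// Hypothetical registry of tablespaces.
class SpaceRegistry {
public:
    void create(std::uint32_t id) {
        auto space = std::make_unique<Space>();
        space->id = id;
        spaces_[id] = std::move(space);
    }

    // Unsafe pattern: returns a raw pointer without pinning. A later drop()
    // frees the object and leaves the caller with a stale pointer.
    Space* get_unpinned(std::uint32_t id) {
        auto it = spaces_.find(id);
        return it == spaces_.end() ? nullptr : it->second.get();
    }

    // Safer pattern: pin the space so that it cannot be dropped while in use.
    Space* acquire(std::uint32_t id) {
        Space* space = get_unpinned(id);
        if (space != nullptr) {
            ++space->pins;
        }
        return space;
    }

    void release(Space* space) { --space->pins; }

    // Drops the space only if nobody holds a pin; returns true if dropped.
    bool drop(std::uint32_t id) {
        auto it = spaces_.find(id);
        if (it == spaces_.end() || it->second->pins > 0) {
            return false;
        }
        spaces_.erase(it);
        return true;
    }

private:
    std::map<std::uint32_t, std::unique_ptr<Space>> spaces_;
};

// RAII guard: acquires in the constructor, releases in the destructor.
class SpaceGuard {
public:
    SpaceGuard(SpaceRegistry& registry, std::uint32_t id)
        : registry_(registry), space_(registry.acquire(id)) {}
    ~SpaceGuard() {
        if (space_ != nullptr) {
            registry_.release(space_);
        }
    }
    SpaceGuard(const SpaceGuard&) = delete;
    SpaceGuard& operator=(const SpaceGuard&) = delete;

    Space* get() const { return space_; }

private:
    SpaceRegistry& registry_;
    Space* space_;
};
```

While a SpaceGuard is alive, drop() on the same ID fails in this sketch, so the pointer from get() stays valid; a pointer from get_unpinned() carries no such guarantee.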



Comments
Comment by Jan Lindström (Inactive) [ 2017-04-04 ]

Removed.

Comment by Marko Mäkelä [ 2017-04-17 ]

In which MariaDB versions will this issue be fixed? In the same commit with MDEV-12253? If the commit is not yet pushed, please include this MDEV reference in the commit message.

Comment by Jan Lindström (Inactive) [ 2017-04-28 ]

10.1:

commit 765a43605a42c069ede604826ede2d93d72c4fdd
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Wed Apr 26 15:19:16 2017 +0300

MDEV-12253: Buffer pool blocks are accessed after they have been freed

Problem was that bpage was referenced after it was already freed
from LRU. Fixed by adding a new variable encrypted that is
passed down to buf_page_check_corrupt() and used in
buf_page_get_gen() to stop processing page read.

This patch should also address following test failures and
bugs:

MDEV-12419: IMPORT should not look up tablespace in
PageConverter::validate(). This is now removed.

MDEV-10099: encryption.innodb_onlinealter_encryption fails
sporadically in buildbot

MDEV-11420: encryption.innodb_encryption-page-compression
failed in buildbot

MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8

Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing
and replaced these with dict_table_t::file_unreadable. Table
ibd file is missing if fil_get_space(space_id) returns NULL
and encrypted if not. Removed dict_table_t::is_corrupted field.

Ported FilSpace class from 10.2 and using that on buf_page_check_corrupt(),
buf_page_decrypt_after_read(), buf_page_encrypt_before_write(),
buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().

Added test cases for when an encrypted page could be read while doing
redo log crash recovery. Also added a test case for row compressed
blobs.

btr_cur_open_at_index_side_func(),
btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is
NULL.

buf_page_get_zip(): Issue error if page read fails.

buf_page_get_gen(): Use dberr_t for error detection and
do not reference bpage after we have freed it.

buf_mark_space_corrupt(): remove bpage from LRU also when
it is encrypted.

buf_page_check_corrupt(): @return DB_SUCCESS if page has
been read and is not corrupted,
DB_PAGE_CORRUPTED if page based on checksum check is corrupted,
DB_DECRYPTION_FAILED if page post encryption checksum matches but
after decryption normal page checksum does not match. In read
case only DB_SUCCESS is possible.

buf_page_io_complete(): use dberr_t for error handling.

buf_flush_write_block_low(),
buf_read_ahead_random(),
buf_read_page_async(),
buf_read_ahead_linear(),
buf_read_ibuf_merge_pages(),
buf_read_recv_pages(),
fil_aio_wait():
Issue error if page read fails.

btr_pcur_move_to_next_page(): Do not reference page if it is
NULL.

Introduced dict_table_t::is_readable() and dict_index_t::is_readable()
that will return true if the tablespace exists and pages read from
the tablespace are neither corrupted nor failed decryption.
Removed buf_page_t::key_version. After page decryption the
key version is not removed from page frame. For unencrypted
pages, old key_version is removed at buf_page_encrypt_before_write()

dict_stats_update_transient_for_index(),
dict_stats_update_transient()
Do not continue if table decryption failed or table
is corrupted.

dict0stats.cc: Introduced a dict_stats_report_error function
to avoid code duplication.

fil_parse_write_crypt_data():
Check that key read from redo log entry is found from
encryption plugin and if it is not, refuse to start.

PageConverter::validate(): Removed access to fil_space_t as
tablespace is not available during import.

Fixed error code on innodb.innodb test.

Merged test cases innodb-bad-key-change5 and innodb-bad-key-shutdown
to innodb-bad-key-change2. Removed innodb-bad-key-change5 test.
Decreased unnecessary complexity on some long lasting tests.

Removed fil_inc_pending_ops(), fil_decr_pending_ops(),
fil_get_first_space(), fil_get_next_space(),
fil_get_first_space_safe(), fil_get_next_space_safe()
functions.

fil_space_verify_crypt_checksum(): Fixed bug found using ASAN
where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly
accessed from row compressed tables. Fixed out of page frame
bug for row compressed tables in
fil_space_verify_crypt_checksum() found using ASAN. Incorrect
function was called for compressed table.

Added new tests for discard, rename table and drop (we should allow them
even when page decryption fails). Alter table rename is not allowed.
Added test for restart with innodb-force-recovery=1 when page read on
redo-recovery cannot be decrypted. Added test for corrupted table where
both page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION are corrupted.

Adjusted the test case innodb_bug14147491 so that it no longer
expects a crash. Instead, the table is simply mostly unusable.

fil0fil.h: fil_space_acquire_low is not visible function
and fil_space_acquire and fil_space_acquire_silent are
inline functions. FilSpace class uses fil_space_acquire_low
directly.

recv_apply_hashed_log_recs() does not return anything.
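
The three outcomes listed for buf_page_check_corrupt() in the commit message above can be read as a small decision procedure. The sketch below is only an interpretation of that text: the enum, the input struct and classify_page() are hypothetical names, and the real function works on page frames and tablespace metadata rather than on precomputed booleans.

```cpp
#include <cassert>

// Hypothetical mirror of the three outcomes named in the commit message
// (DB_SUCCESS, DB_PAGE_CORRUPTED, DB_DECRYPTION_FAILED).
enum class CheckResult { Success, PageCorrupted, DecryptionFailed };

// Hypothetical, precomputed facts about a page that has just been read.
struct PageCheckInput {
    bool encrypted;                    // page was stored encrypted
    bool post_encryption_checksum_ok;  // checksum over the encrypted image
    bool plain_checksum_ok;            // normal checksum (after decryption, if any)
};

// Assumed reading of the contract:
//  - encrypted page whose post-encryption checksum fails: corrupted
//  - encrypted page whose post-encryption checksum matches but whose normal
//    checksum fails after decryption: decryption failed (e.g. wrong key)
//  - unencrypted page: decided by the normal checksum alone
CheckResult classify_page(const PageCheckInput& in) {
    if (in.encrypted) {
        if (!in.post_encryption_checksum_ok) {
            return CheckResult::PageCorrupted;
        }
        if (!in.plain_checksum_ok) {
            return CheckResult::DecryptionFailed;
        }
        return CheckResult::Success;
    }
    return in.plain_checksum_ok ? CheckResult::Success
                                : CheckResult::PageCorrupted;
}

// Small demonstration of the decryption-failure case.
void demo_classify_page() {
    PageCheckInput wrong_key{true, true, false};
    assert(classify_page(wrong_key) == CheckResult::DecryptionFailed);
}
```

Separating DecryptionFailed from PageCorrupted lets a caller tell "the data on disk is intact but cannot be decrypted" apart from genuine corruption, which matches the commit's distinction between unreadable and corrupted tables.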

Comment by Marko Mäkelä [ 2017-05-03 ]

This was fixed in the MDEV-12253 patch.
