[MDEV-8588] Assertion failure in file ha_innodb.cc line 21140 if at least one encrypted table exists and encryption service is not available - Jira

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 10.1(EOL)
Fix Version/s: 10.1.7
Component/s: Encryption, Storage Engine - InnoDB, Storage Engine - XtraDB
Labels: None

Description

Start the server with the file_key_management plugin and a valid key_management_filename, otherwise default options;
create an encrypted InnoDB table;
shut down the server;
start the server without the file_key_management plugin (or with an invalid key_management_filename, which is basically the same).
=>

2015-08-09 18:01:14 140240402089824 [ERROR] InnoDB: Tablespace id 7 encrypted but encryption service not available. Can't continue opening tablespace.

2015-08-09 18:01:14 7f8c43598760  InnoDB: Assertion failure in thread 140240402089824 in file ha_innodb.cc line 21140

InnoDB: We intentionally generate a memory trap.

Stack trace from 10.1 afd59b575a75ebbc57f71ce2865fdff85e3e233b
#3 <signal handler called>
#4 0x00007f7ad3829165 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#5 0x00007f7ad382c3e0 in *__GI_abort () at abort.c:92
#6 0x00007f7ad655baf0 in ib_logf (level=IB_LOG_LEVEL_FATAL, format=0x7f7ad6c80900 "Tablespace id %ld encrypted but encryption service not available. Can't continue opening tablespace.\n") at 10.1/storage/xtradb/handler/ha_innodb.cc:21140
#7 0x00007f7ad67a8801 in fil_read_first_page (data_file=13, one_read_already=0, flags=0x7ffeb27fd0a8, space_id=0x7ffeb27fd0a0, min_flushed_lsn=0x7ffeb27fd098, max_flushed_lsn=0x7ffeb27fd098, crypt_data=0x7ffeb27fd0b8) at 10.1/storage/xtradb/fil/fil0fil.cc:2066
#8 0x00007f7ad67abae2 in fil_open_single_table_tablespace (validate=true, fix_dict=true, id=4, flags=0, tablename=0x7f7ab122c478 "test/t1", path_in=0x0) at 10.1/storage/xtradb/fil/fil0fil.cc:3765
#9 0x00007f7ad6785da5 in dict_check_tablespaces_and_store_max_id (dict_check=DICT_CHECK_NONE_LOADED) at 10.1/storage/xtradb/dict/dict0load.cc:1171
#10 0x00007f7ad66a8656 in innobase_start_or_create_for_mysql () at 10.1/storage/xtradb/srv/srv0start.cc:2592
#11 0x00007f7ad653e9d4 in innobase_init (p=0x7f7ad2c2e270) at 10.1/storage/xtradb/handler/ha_innodb.cc:4101
#12 0x00007f7ad63cdd56 in ha_initialize_handlerton (plugin=0x7f7ad2d756e8) at 10.1/sql/handler.cc:513
#13 0x00007f7ad61b44cc in plugin_initialize (tmp_root=0x7ffeb2801750, plugin=0x7f7ad2d756e8, argc=0x7f7ad7476430, argv=0x7f7ad2c1e6a0, options_only=false) at 10.1/sql/sql_plugin.cc:1407
#14 0x00007f7ad61b509e in plugin_init (argc=0x7f7ad7476430, argv=0x7f7ad2c1e6a0, flags=2) at 10.1/sql/sql_plugin.cc:1680
#15 0x00007f7ad6097fe4 in init_server_components () at 10.1/sql/mysqld.cc:5160
#16 0x00007f7ad6098f7c in mysqld_main (argc=11, argv=0x7f7ad2c1e6a0) at 10.1/sql/mysqld.cc:5717
#17 0x00007f7ad608e800 in main (argc=11, argv=0x7ffeb28024a8) at 10.1/sql/main.cc:25

I would expect that the table could not be read, but crashing on startup because there is an unreadable user table (not a system table!) seems like overkill.
Also, the documentation (https://mariadb.com/kb/en/mariadb/data-at-rest-encryption/#file_key_management_filename) suggests otherwise:

If the key file can not be read at server startup, for example if the file key is not present, the encryption will not work and encrypted tables will be unreadable.

Attachments

Issue Links

causes: MDEV-12349 Remove useless (encryption) fields from dict_table_t (Closed)
duplicates: MDEV-8591 Database page corruption on disk or a failed space, Assertion failure in file buf0buf.cc line 2856 on querying a table using wrong default encryption key (Closed)
relates to: MDEV-20562 btr_cur_open_at_rnd_pos() fails to return error for corrupted page (Closed)

Activity

Jan Lindström (Inactive) added a comment - 2015-08-14 08:00

I find your suggestion problematic, because:

If a user tries to read from an encrypted table, what should be returned to the user? ER_CRASHED_ON_USAGE?
How are constraints enforced? Consider a case where we have two tables, one encrypted and one not, with a foreign key constraint between them. You can then read the unencrypted table but not modify it (because we can't enforce constraints against the encrypted table). Again, what should be returned to the user? ER_READ_ONLY_MODE?
If the buffer pool code finds that a page is corrupted, how do we tell whether it is encrypted or really corrupted? From the original page read from disk we could find out (page type, key_version) whether it is encrypted, but if the page is really corrupted on disk, the corruption could hit exactly these bytes too.
Ignoring the page corruption in the buffer pool code is easy, but I need to research how "easy" it is to skip a lot of page content validations after that.

Before I close the issue as Won't Fix, I will also do some code reading.
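
To make the "encrypted or really corrupted?" question above concrete, here is a minimal C++ sketch of one possible approach: store a checksum that also covers the key_version field, so that damage to exactly those header bytes shows up as corruption rather than as a bogus encryption state. Everything in it (the Page struct, its fields, the checksum helper, classify) is hypothetical and invented for illustration; it is not InnoDB's on-disk page format and not the change that was eventually committed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical page layout, for illustration only (not InnoDB's format).
struct Page {
    uint32_t key_version;          // 0 means "not encrypted" in this sketch
    uint32_t stored_checksum;      // checksum written together with the page
    std::vector<uint8_t> payload;  // page body (ciphertext if encrypted)
};

enum class PageState { Plain, EncryptedReadable, EncryptedNoKey, Corrupted };

// FNV-1a over key_version and payload, standing in for a real page checksum.
uint32_t checksum(const Page& p) {
    uint32_t h = 2166136261u;
    auto mix = [&h](uint8_t b) { h ^= b; h *= 16777619u; };
    for (int shift = 0; shift < 32; shift += 8) {
        mix(static_cast<uint8_t>(p.key_version >> shift));
    }
    for (uint8_t b : p.payload) mix(b);
    return h;
}

// Verify the checksum first, so a damaged key_version is never trusted.
PageState classify(const Page& p, bool key_available) {
    if (checksum(p) != p.stored_checksum) return PageState::Corrupted;
    if (p.key_version == 0) return PageState::Plain;
    return key_available ? PageState::EncryptedReadable
                         : PageState::EncryptedNoKey;
}

// Usage: a flipped header bit is reported as corruption, not as encryption.
void demo_page_classification() {
    Page p{1, 0, {0xde, 0xad}};
    p.stored_checksum = checksum(p);
    assert(classify(p, false) == PageState::EncryptedNoKey);
    p.key_version ^= 0x100;  // simulate on-disk damage to the header
    assert(classify(p, false) == PageState::Corrupted);
}
```

The design choice in this sketch is only the ordering: integrity is checked before the encryption metadata is interpreted. Whether the real page format allows that is exactly what the code reading would have to establish.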

Elena Stepanova added a comment - 2015-08-14 12:11

> If a user tries to read from an encrypted table, what should be returned to the user? ER_CRASHED_ON_USAGE?

> How are constraints enforced? Consider a case where we have two tables, one encrypted and one not, with a foreign key constraint between them. You can then read the unencrypted table but not modify it (because we can't enforce constraints against the encrypted table). Again, what should be returned to the user? ER_READ_ONLY_MODE?

From a user's point of view, I don't see how any of this is a problem.
InnoDB already detects when a table cannot be read because of encryption: it does so on startup and returns the proper error message, as quoted in the description. That's exactly what should be returned when the table cannot be used at runtime, either directly or via constraints (IIRC there is some generic message that an error comes from a storage engine, and the specifics get listed in warnings). It is still better than making the whole instance unavailable. Imagine that somebody was careless enough to experiment with some bad encryption plugin, created an encrypted table, and then the plugin broke, could not work, could not decrypt, whatever. Does the user have to throw away the whole instance because of that?

> If the buffer pool code finds that a page is corrupted, how do we tell whether it is encrypted or really corrupted? From the original page read from disk we could find out (page type, key_version) whether it is encrypted, but if the page is really corrupted on disk, the corruption could hit exactly these bytes too.

> Ignoring the page corruption in the buffer pool code is easy, but I need to research how "easy" it is to skip a lot of page content validations after that.

I obviously cannot answer that. In fact, I know so little about the internals of InnoDB that I can't even tell why such a page (which was encrypted earlier, but cannot be decrypted on server startup) has to be in the buffer pool in the first place. I'd have thought it couldn't read the tablespace, complained, and forgot about it until the next time. But apparently that is not so; anyway, I surely hope that InnoDB can tell an encrypted page from a corrupted page.

> Before I close the issue as Won't Fix, I will also do some code reading.

It cannot be "Won't fix". At the very least, it should not be an assertion failure but an error message, followed by "InnoDB init returned error", and then either a shutdown if InnoDB was the default engine or continuing without it if it was not.
But I still hope you'll find a better way than just failing the whole engine because of an unreadable user table.
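
The minimum behaviour asked for above can be sketched in C++ roughly as follows: the engine reports an error from its init routine instead of aborting, and the server decides what to do based on whether that engine is the default one. All names here (EngineInitResult, init_engine, on_engine_init, StartupAction) are hypothetical and chosen for illustration; this is not the MariaDB plugin API and not the patch that fixed this issue.

```cpp
#include <cassert>
#include <string>

// Hypothetical outcome of a storage engine's init routine.
struct EngineInitResult {
    bool ok;
    std::string error;  // empty when ok is true
};

enum class StartupAction { Continue, ContinueWithoutEngine, Shutdown };

// Report the unreadable tablespace as an error instead of asserting.
EngineInitResult init_engine(bool has_encrypted_tablespace,
                             bool encryption_available) {
    if (has_encrypted_tablespace && !encryption_available) {
        return {false,
                "Tablespace encrypted but encryption service not available."};
    }
    return {true, ""};
}

// Server-side policy: only a failed default engine stops the server.
StartupAction on_engine_init(const EngineInitResult& r,
                             bool is_default_engine) {
    if (r.ok) return StartupAction::Continue;
    return is_default_engine ? StartupAction::Shutdown
                             : StartupAction::ContinueWithoutEngine;
}

// Usage: the same init failure leads to different actions per configuration.
void demo_startup_policy() {
    EngineInitResult r = init_engine(true, false);
    assert(!r.ok && !r.error.empty());
    assert(on_engine_init(r, true) == StartupAction::Shutdown);
    assert(on_engine_init(r, false) == StartupAction::ContinueWithoutEngine);
}
```

This only covers the "at the very least" case; keeping the engine up and marking just the affected table as unreadable would need a per-table error path instead.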

Jan Lindström (Inactive) added a comment - 2015-08-16 10:03

http://lists.askmonty.org/pipermail/commits/2015-August/008263.html

Contains a change to the handler API.

Jan Lindström (Inactive) added a comment - 2015-09-07 09:23

Actual review later in ~~MDEV-8764~~.
