[MDEV-15569] Index page corruption with data-at-rest encryption Created: 2018-03-14  Updated: 2020-08-25  Resolved: 2018-04-04

Status: Closed
Project: MariaDB Server
Component/s: Encryption, Storage Engine - InnoDB
Affects Version/s: 10.1.31
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Geoff Montee (Inactive) Assignee: Jan Lindström (Inactive)
Resolution: Not a Bug Votes: 0
Labels: corruption, encryption, innodb


 Description   

A user seemed to experience random page corruption while using data-at-rest encryption:

2018-03-08 15:37:22 140038675761920 [ERROR] Master 'dc1master': InnoDB:  Page 3759119 in space N/A (148162) maybe corrupted. Post encryption checksum 1117819446 stored [0:2893804962] key_version 1
InnoDB: Page directory corruption: infimum not pointed to
2018-03-08 15:37:22 7f5d4b860b00 InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex 
...
InnoDB: End of page dump
2018-03-08 15:37:22 7f5d4b860b00 InnoDB: uncompressed page, stored checksum in field1 0, calculated checksums for field1: crc32 1117819446, innodb 996923242, none 3735928559, stored checksum in field2 2893804962, calculated checksums for field2: crc32 1117819446, innodb 2893804962, none 3735928559, page LSN 4285 488303025, low 4 bytes of LSN at page end 488303025, page number (if stored to page already) 3759119, space id (if created with >= MySQL-4.1.1 and stored already) 148162
InnoDB: page type 17855 meaning INDEX
InnoDB: Page may be an index page where index id is 7476710334411351822
InnoDB: Page directory corruption: supremum not pointed to
2018-03-08 15:37:22 7f5d4b860b00 InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex 
...
InnoDB: End of page dump

This caused the following crash:

2018-03-08 15:37:22 7f5d4b860b00 InnoDB: uncompressed page, stored checksum in field1 0, calculated checksums for field1: crc32 1117819446, innodb 996923242, none 3735928559, stored checksum in field2 2893804962, calculated checksums for field2: crc32 1117819446, innodb 2893804962, none 3735928559, page LSN 4285 488303025, low 4 bytes of LSN at page end 488303025, page number (if stored to page already) 3759119, space id (if created with >= MySQL-4.1.1 and stored already) 148162
InnoDB: page type 17855 meaning INDEX
InnoDB: Page may be an index page where index id is 7476710334411351822
2018-03-08 15:37:22 7f5d4b860b00  InnoDB: Assertion failure in thread 140038675761920 in file rem0rec.cc line 581
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
180308 15:37:22 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.1.31-MariaDB-1~trusty
key_buffer_size=134217728
read_buffer_size=2097152
max_used_connections=5
max_threads=2502
thread_count=2
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 15555412 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f26e282b008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f5d4b85f990 thread_stack 0x48400
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x56472af6122e]
/usr/sbin/mysqld(handle_fatal_signal+0x305)[0x56472aa833e5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f5d4c867330]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f5d4bcb6c37]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f5d4bcba028]
/usr/sbin/mysqld(+0x89f7bb)[0x56472ad607bb]
/usr/sbin/mysqld(+0x87fddc)[0x56472ad40ddc]
/usr/sbin/mysqld(+0x93d8f2)[0x56472adfe8f2]
/usr/sbin/mysqld(+0x8b316e)[0x56472ad7416e]
/usr/sbin/mysqld(+0x8b5bbb)[0x56472ad76bbb]
/usr/sbin/mysqld(+0x8b5f7b)[0x56472ad76f7b]
/usr/sbin/mysqld(+0x8c2597)[0x56472ad83597]
/usr/sbin/mysqld(+0x812bb3)[0x56472acd3bb3]
/usr/sbin/mysqld(_ZN7handler12ha_write_rowEPh+0x483)[0x56472aa8e153]
/usr/sbin/mysqld(_ZN14Rows_log_event9write_rowEP14rpl_group_infob+0x16d)[0x56472ab62cbd]
/usr/sbin/mysqld(_ZN20Write_rows_log_event11do_exec_rowEP14rpl_group_info+0x7d)[0x56472ab630cd]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEP14rpl_group_info+0x31c)[0x56472ab55c0c]
/usr/sbin/mysqld(+0x3b528b)[0x56472a87628b]
/usr/sbin/mysqld(handle_slave_sql+0x27b9)[0x56472a8824d9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7f5d4c85f184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f5d4bd7e03d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): is an invalid pointer
Connection ID (thread ID): 4
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
 
We think the query pointer is invalid, but we will try to print it anyway. 
Query: 

To get the server started, innodb_force_recovery had to be set to 3. At that point, they were able to figure out which tablespace had space id = 148162:

MariaDB [(none)]> SELECT * FROM information_schema.INNODB_SYS_TABLESPACES WHERE SPACE=148162;
+--------+-----------------------------------+------+-------------+----------------------+-----------+---------------+
| SPACE | NAME | FLAG | FILE_FORMAT | ROW_FORMAT | PAGE_SIZE | ZIP_PAGE_SIZE |
+--------+-----------------------------------+------+-------------+----------------------+-----------+---------------+
| 148162 | db1/tab | 0 | Antelope | Compact or Redundant | 16384 | 0 |
+--------+-----------------------------------+------+-------------+----------------------+-----------+---------------+

They ran innochecksum on the table, but no checksum failures were reported.

However, when they ran CHECK TABLE on the table, they saw the following error:

2018-03-13 21:34:34 7f47dd5d5b00  InnoDB: Warning: CHECK TABLE on index `PRIMARY` of table `db1`.`tab` returned 12
2018-03-14 12:29:15 139946633288448 [ERROR] InnoDB:  Page 3759119 in space N/A (148162) maybe corrupted. Post encryption checksum 1117819446 stored [0:2893804962] key_version 1
InnoDB: Error in pages 20929391 and 3759119 of index "idx1" of table "db1"."tab"
InnoDB: 'compact' flag mismatch
2018-03-14 12:29:15 7f47dd5d5b00 InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex 
...
InnoDB: End of page dump
2018-03-14 12:29:15 7f47dd5d5b00 InnoDB: uncompressed page, stored checksum in field1 1851218656, calculated checksums for field1: crc32 3816730793, innodb 1851218656, none 3735928559, stored checksum in field2 2568069479, calculated checksums for field2: crc32 3816730793, innodb 2568069479, none 3735928559, page LSN 4340 1363275092, low 4 bytes of LSN at page end 1363275092, page number (if stored to page already) 20929391, space id (if created with >= MySQL-4.1.1 and stored already) 148162
InnoDB: page type 17855 meaning INDEX
InnoDB: Page may be an index page where index id is 89357
InnoDB: (index "idx1" of table "db1"."tab")
2018-03-14 12:29:15 7f47dd5d5b00 InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex 
...
InnoDB: End of page dump

Followed by the following crash:

2018-03-14 12:29:15 7f47dd5d5b00 InnoDB: uncompressed page, stored checksum in field1 0, calculated checksums for field1: crc32 1117819446, innodb 996923242, none 3735928559, stored checksum in field2 2893804962, calculated checksums for field2: crc32 1117819446, innodb 2893804962, none 3735928559, page LSN 4285 488303025, low 4 bytes of LSN at page end 488303025, page number (if stored to page already) 3759119, space id (if created with >= MySQL-4.1.1 and stored already) 148162
InnoDB: page type 17855 meaning INDEX
InnoDB: Page may be an index page where index id is 7476710334411351822
2018-03-14 12:29:15 139946633288448 [ERROR] InnoDB: Page index id 7476710334411351822 != data dictionary index id 89357
2018-03-14 12:29:15 7f47dd5d5b00  InnoDB: Assertion failure in thread 139946633288448 in file btr0btr.cc line 4904
InnoDB: Failing assertion: btr_page_get_level(page, &mtr) == level
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
180314 12:29:15 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.1.31-MariaDB-1~trusty
key_buffer_size=67108864
read_buffer_size=2097152
max_used_connections=3
max_threads=502
thread_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3160251 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f3e533f9008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f47dd5d51f0 thread_stack 0x48400
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55620797122e]
/usr/sbin/mysqld(handle_fatal_signal+0x305)[0x5562074933e5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f47dc24f330]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f47db69fc37]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f47db6a3028]
/usr/sbin/mysqld(+0x939e34)[0x55620780ae34]
/usr/sbin/mysqld(+0x811d2a)[0x5562076e2d2a]
/usr/sbin/mysqld(_ZN7handler8ha_checkEP3THDP15st_ha_check_opt+0x7a)[0x55620749ac6a]
/usr/sbin/mysqld(+0x51afc8)[0x5562073ebfc8]
/usr/sbin/mysqld(_ZN19Sql_cmd_check_table7executeEP3THD+0xb0)[0x5562073ecd60]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x14ac)[0x5562073072ac]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x321)[0x55620730f611]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x2693)[0x556207312e03]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x136)[0x556207313686]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x5562073df75a]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x5562073df900]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7f47dc247184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f47db766bed]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f3a2f034020): is an invalid pointer
Connection ID (thread ID): 43
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
 
We think the query pointer is invalid, but we will try to print it anyway. 
Query: CHECK TABLE db1.tab

The corruption appears to be in the following index:

INDEX_ID: 89357
NAME: idx1
TABLE_ID: 148176
TYPE: 2
N_FIELDS: 3
PAGE_NO: 4
SPACE: 148162

The full table definition is:

CREATE TABLE `tab` (
`col1` bigint(20) NOT NULL AUTO_INCREMENT,
`col2` varchar(64) COLLATE utf8_unicode_ci NOT NULL,
`col3` varchar(64) COLLATE utf8_unicode_ci NOT NULL,
`col4` varchar(64) COLLATE utf8_unicode_ci NOT NULL,
`col5` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`col6` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`col7` int(11) DEFAULT '0',
`col8` bit(1) DEFAULT b'0',
`col9` smallint(6) DEFAULT NULL,
`col10` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`col11` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`col12` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`col13` int(11) DEFAULT NULL,
PRIMARY KEY (`col1`),
UNIQUE KEY `idx1` (`col2`,`col3`,`col4`),
UNIQUE KEY `idx2` (`col2`,`col6`) USING BTREE,
KEY `idx3` (`col2`,`col4`) USING BTREE,
KEY `idx4` (`col4`) USING BTREE,
KEY `idx5` (`col10`)
) ENGINE=InnoDB AUTO_INCREMENT=5032253394 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci DATA DIRECTORY='./'

Is this corruption most likely caused by a bug, or some kind of storage failure?



 Comments   
Comment by Jan Lindström (Inactive) [ 2018-04-04 ]

To me this looks like a real data corruption not a bug.

Generated at Thu Feb 08 08:22:23 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.