Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Incomplete
-
10.5.13, 10.5.16
-
None
-
Debian 4.9.0-14-amd64 #1 SMP Debian 4.9.246-2 (2020-12-17) x86_64 GNU/Linux, ZFS
Description
Our MariaDB server crashed with "[ERROR] [FATAL] InnoDB: Page old data size 7898 new data size 7922, page old max ins size 8288 new max ins size 8264" and subsequent automatic restarts also crashed soon after with the same error, presumably as soon as they saw the corrupt data. This seems very similar to MDEV-26022 as far as I can tell.
Aug 09 04:53:52 mariadbd[35210]: 2022-08-09 4:53:52 1189 [ERROR] [FATAL] InnoDB: Page old data size 7898 new data size 7922, page old max ins size 8288 new max ins size 8264
|
Aug 09 04:53:52 mariadbd[35210]: 220809 4:53:52 [ERROR] mysqld got signal 6 ;
|
Aug 09 04:53:52 mariadbd[35210]: This could be because you hit a bug. It is also possible that this binary
|
Aug 09 04:53:52 mariadbd[35210]: or one of the libraries it was linked against is corrupt, improperly built,
|
Aug 09 04:53:52 mariadbd[35210]: or misconfigured. This error can also be caused by malfunctioning hardware.
|
Aug 09 04:53:52 mariadbd[35210]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
|
Aug 09 04:53:52 mariadbd[35210]: We will try our best to scrape up some info that will hopefully help
|
Aug 09 04:53:52 mariadbd[35210]: diagnose the problem, but since we have already crashed,
|
Aug 09 04:53:52 mariadbd[35210]: something is definitely wrong and this may fail.
|
Aug 09 04:53:52 mariadbd[35210]: Server version: 10.5.13-MariaDB-1:10.5.13+maria~stretch-log
|
Aug 09 04:53:52 mariadbd[35210]: key_buffer_size=4294967296
|
Aug 09 04:53:52 mariadbd[35210]: read_buffer_size=131072
|
Aug 09 04:53:52 mariadbd[35210]: max_used_connections=141
|
Aug 09 04:53:52 mariadbd[35210]: max_threads=642
|
Aug 09 04:53:52 mariadbd[35210]: thread_count=141
|
Aug 09 04:53:52 mariadbd[35210]: It is possible that mysqld could use up to
|
Aug 09 04:53:52 mariadbd[35210]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 5607626 K bytes of memory
|
Aug 09 04:53:52 mariadbd[35210]: Hope that's ok; if not, decrease some variables in the equation.
|
Aug 09 04:53:52 mariadbd[35210]: Thread pointer: 0x7fb0bc0009b8
|
Aug 09 04:53:52 mariadbd[35210]: Attempting backtrace. You can use the following information to find out
|
Aug 09 04:53:52 mariadbd[35210]: where mysqld died. If you see no messages after this, something went
|
Aug 09 04:53:52 mariadbd[35210]: terribly wrong...
|
Aug 09 04:53:52 mariadbd[35210]: stack_bottom = 0x7fb25880fc78 thread_stack 0x49000
|
Aug 09 04:53:52 mariadbd[35210]: ??:0(my_print_stacktrace)[0x55fc961a4a0e]
|
Aug 09 04:53:52 mariadbd[35210]: ??:0(handle_fatal_signal)[0x55fc95bbe31f]
|
Aug 09 04:53:53 mariadbd[35210]: ??:0(__restore_rt)[0x7fd46173f0e0]
|
Aug 09 04:53:53 mariadbd[35210]: linux/raise.c:51(__GI_raise)[0x7fd460b37fff]
|
Aug 09 04:53:53 mariadbd[35210]: stdlib/abort.c:91(__GI_abort)[0x7fd460b3942a]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(std::_Rb_tree<unsigned int, unsigned int, std::_Identity<unsigned int>, std::less<unsigned int>, std::allocator<unsigned int> >::_M_erase(std::_Rb_tree_node<unsigned int>*))[0x55fc9600bf91]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(std::_Rb_tree<unsigned int, unsigned int, std::_Identity<unsigned int>, std::less<unsigned int>, std::allocator<unsigned int> >::_M_erase(std::_Rb_tree_node<unsigned int>*))[0x55fc96011750]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(std::_Rb_tree<unsigned int, unsigned int, std::_Identity<unsigned int>, std::less<unsigned int>, std::allocator<unsigned int> >::_M_erase(std::_Rb_tree_node<unsigned int>*))[0x55fc96012c3e]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(std::_Rb_tree<unsigned int, unsigned int, std::_Identity<unsigned int>, std::less<unsigned int>, std::allocator<unsigned int> >::_M_erase(std::_Rb_tree_node<unsigned int>*))[0x55fc9602893f]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(std::unique_lock<std::mutex>::unlock())[0x55fc95f66176]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(std::unique_lock<std::mutex>::unlock())[0x55fc95f6c06c]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(std::unique_lock<std::mutex>::unlock())[0x55fc95f6c4ed]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(std::unique_lock<std::mutex>::unlock())[0x55fc95f7ff15]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0x55fc95eb8e35]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(handler::ha_write_row(unsigned char const*))[0x55fc95bcaa87]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(write_record(THD*, TABLE*, st_copy_info*, select_result*))[0x55fc95981e5d]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(mysql_insert(THD*, TABLE_LIST*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool, select_result*))[0x55fc95989cb5]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(mysql_execute_command(THD*))[0x55fc959b942c]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55fc959be25c]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55fc959b3fb9]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(do_command(THD*))[0x55fc959b238b]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(do_handle_one_connection(CONNECT*, bool))[0x55fc95aaf100]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(handle_one_connection)[0x55fc95aaf7c4]
|
Aug 09 04:53:54 mariadbd[35210]: ??:0(MyCTX_nopad::finish(unsigned char*, unsigned int*))[0x55fc95dfc591]
|
Aug 09 04:53:54 mariadbd[35210]: nptl/pthread_create.c:456(start_thread)[0x7fd4617354a4]
|
Aug 09 04:53:55 mariadbd[35210]: x86_64/clone.S:99(clone)[0x7fd460bedd0f]
|
Aug 09 04:53:55 mariadbd[35210]: Trying to get some variables.
|
Aug 09 04:53:55 mariadbd[35210]: Some pointers may be invalid and cause the dump to abort.
|
Aug 09 04:53:55 mariadbd[35210]: Query (0x7fb0bc010480): INSERT /* ManualLogEntry::insert */ INTO `wikivalorant_logging` (log_type,log_action,log_timestamp,log_namespace,log_title,log_page,log_params,log_comment_id,log_actor) VALUES ('review','approve-a','20220808195352',0,'Project_V/Clutch_Kings/1/Qualifier_3',31169,'a:3:{i:0;i:539550;i:1;i:489266;i:2;s:14:\"20220808195349\";}','24585',58921)
|
Most crashes mentioned this "wikivalorant_logging" table, which is a Mediawiki table:
+----------------+---------------------+------+-----+----------------+----------------+
|
| Field | Type | Null | Key | Default | Extra |
|
+----------------+---------------------+------+-----+----------------+----------------+
|
| log_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
|
| log_type | varbinary(32) | NO | MUL | | |
|
| log_action | varbinary(32) | NO | | | |
|
| log_timestamp | binary(14) | NO | MUL | 19700101000000 | |
|
| log_actor | bigint(20) unsigned | NO | MUL | NULL | |
|
| log_namespace | int(11) | NO | MUL | 0 | |
|
| log_title | varbinary(255) | NO | | | |
|
| log_page | int(10) unsigned | YES | MUL | NULL | |
|
| log_comment_id | bigint(20) unsigned | NO | | NULL | |
|
| log_params | blob | NO | | NULL | |
|
| log_deleted | tinyint(3) unsigned | NO | | 0 | |
|
+----------------+---------------------+------+-----+----------------+----------------+
|
11 rows in set (0.000 sec)
|
Other potentially relavent system configuration:
table_open_cache = 8192
|
table_open_cache_instances = 16
|
table_definition_cache = 4496
|
|
open_files_limit = 262144
|
|
innodb_log_file_size = 256M
|
innodb_buffer_pool_size = 128G
|
innodb_log_buffer_size = 16M
|
innodb_file_per_table = 1
|
innodb_open_files = 12000
|
innodb_io_capacity = 1000
|
innodb_io_capacity_max = 2500
|
innodb_flush_method = O_DIRECT
|
innodb_file_per_table = on
|
innodb_flush_log_at_trx_commit = 2
|
innodb_doublewrite = 0
|
innodb_large_prefix = 1
|
innodb_use_native_aio = 0
|
innodb_use_atomic_writes = 0
|
|
skip-innodb_doublewrite
|
|
tmp_table_size = 134217728
|
max_heap_table_size = 134217728
|
We have upgraded through many older MariaDB versions, so whether this table was affected by the 10.4 corruption bug I cannot answer for certain. I restored the table from a backup before the first crash, and things seemed stable for a little while only to crash after 10 minutes. I then did an ANAYZLE TABLE FORCE on it and so far (~30 minutes) things have been stable. I have a copy of the wikivalorant_logging.ibd before I restored it that I am able to privately share if it would help.
If there's any other information I can provide please let me know.
Attachments
Issue Links
- relates to
-
MDEV-13542 Crashing on a corrupted page is unhelpful
- Closed