[MDEV-26604] InnoDB tables corruption and crashes Created: 2021-09-14  Updated: 2021-10-18  Resolved: 2021-10-18

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.3.31
Fix Version/s: 10.2.41, 10.3.32, 10.4.22, 10.5.13, 10.6.5

Type: Bug Priority: Critical
Reporter: Miroslav Lachman Assignee: Marko Mäkelä
Resolution: Duplicate Votes: 0
Labels: corruption
Environment:

FreeBSD 12.2-p10 amd64 GENERIC kernel on top of ZFS file system on NVME disks, PowerEdge R6515
CPU: AMD EPYC 7302P 16-Core Processor (2994.44-MHz K8-class CPU)
real memory = 137438953472 (131072 MB)
avail memory = 133428338688 (127247 MB)


Issue Links:
Duplicate
is duplicated by MDEV-26537 InnoDB corrupts files due to incorrec... Closed

 Description   

There was no power loss or unexpected reboot or shutdown. We did a normal upgrade from 10.3.30 to 10.3.31, and the next day (the morning of 2021-08-24) the mysqld process started crashing on access to certain tables.

The first crash was this:

2021-08-23 22:27:39 0 [Note] Added new Master_info '' to hash table
2021-08-23 22:27:39 0 [Note] /usr/local/libexec/mysqld: ready for connections.
Version: '10.3.31-MariaDB-log'  socket: '/tmp/mysql.sock'  port: 3306  FreeBSD Ports
2021-08-23 22:28:02 0 [Note] InnoDB: Buffer pool(s) load completed at 210823 22:28:02
2021-08-24  4:52:29 726 [Warning] IP address '10.xx.xx.xx' could not be resolved: Name does not resolve
2021-08-24  6:23:24 963 [Warning] IP address '10.xx.yy.yy' could not be resolved: Name does not resolve
2021-08-24  9:57:36 8217 [Warning] Aborted connection 8217 to db: 'confidential001' user: 'confidential001' host: 'localhost' (Got an error reading communication packets)
2021-08-24 10:11:03 0x1118be8b00  InnoDB: Assertion failure in file /wrkdirs/overlays/mfh_overlay2/databases/mariadb103-server/work/mariadb-10.3.31/storage/innobase/rem/rem0rec.cc line 824
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
210824 10:11:03 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.3.31-MariaDB-log
key_buffer_size=1073741824
read_buffer_size=2097152
max_used_connections=9
max_threads=202
thread_count=12
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 14704992 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x1165b3e2c8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fffdc0f8f38 thread_stack 0x49000
0x1176c8c <my_print_stacktrace+0x3c> at /usr/local/libexec/mysqld
0xb53de5 <handle_fatal_signal+0x295> at /usr/local/libexec/mysqld
0x8017fdb70 <_pthread_sigmask+0x530> at /lib/libthr.so.3
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x1165b6bbe0): DELETE FROM `users` WHERE (`id` = '326')
 
Connection ID (thread ID): 8537
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
 
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Core pattern: %N.core
2021-08-24 10:11:08 0 [Note] Using unique option prefix 'myisam_recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead.
2021-08-24 10:11:08 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins

2021-08-24 10:34:21 13 [ERROR] InnoDB: Table `xxxsosrepor000`.`logs_login` is corrupted. Please drop the table and recreate.

The next errors were like this:

2021-08-24 10:42:15 55 [ERROR] InnoDB: In pages [page id: space=10837, page number=33] and [page id: space=10837, page number=40] of index `feedback_logs_users_fk` of table `multichanneli000`.`feedback_logs`
InnoDB: broken FIL_PAGE_NEXT or FIL_PAGE_PREV links
2021-08-24 10:42:15 55 [ERROR] InnoDB: In pages [page id: space=10837, page number=33] and [page id: space=10837, page number=40] of index `feedback_logs_users_fk` of table `multichanneli000`.`feedback_logs`
InnoDB: 'compact' flag mismatch
2021-08-24 10:42:15 55 [ERROR] InnoDB: Page index id 0 != data dictionary index id 29281
2021-08-24 10:42:15 0x1118bae600  InnoDB: Assertion failure in file /wrkdirs/overlays/mfh_overlay2/databases/mariadb103-server/work/mariadb-10.3.31/storage/innobase/btr/btr0btr.cc line 4898
InnoDB: Failing assertion: !page_is_empty(page) || (level == 0 && page_get_page_no(page) == dict_index_get_page(index))
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
210824 10:42:15 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.3.31-MariaDB-log
key_buffer_size=1073741824
read_buffer_size=2097152
max_used_connections=2
max_threads=202
thread_count=7
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 14704992 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x115668f7c8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fffdbf42f38 thread_stack 0x49000
0x1176c8c <my_print_stacktrace+0x3c> at /usr/local/libexec/mysqld
0xb53de5 <handle_fatal_signal+0x295> at /usr/local/libexec/mysqld
0x8017fdb70 <_pthread_sigmask+0x530> at /lib/libthr.so.3
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x11566d41a0): CHECK TABLE `feedback_logs`
 
Connection ID (thread ID): 55
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
 
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Core pattern: %N.core
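
The three InnoDB errors at the top of this log (broken FIL_PAGE_NEXT or FIL_PAGE_PREV links, 'compact' flag mismatch, page index id 0) all refer to fields stored in the first bytes of each index page. The following is a minimal sketch for reading those fields directly from a copy of the .ibd file. It assumes the default 16 KiB innodb_page_size and an uncompressed table, and the function names are made up for illustration. An all-zero page would decode as index id 0 and "not compact", which would be consistent with the messages above, although this report does not include the actual page contents.

```python
import struct

PAGE_SIZE = 16384    # assumption: default innodb_page_size, uncompressed table
FIL_PAGE_DATA = 38   # end of the 38-byte file header, start of the page header
PAGE_N_HEAP = 4      # 2-byte field in the page header; top bit is the compact flag
PAGE_INDEX_ID = 28   # 8-byte index id in the page header


def parse_page_header(page):
    """Decode a few big-endian header fields from one raw InnoDB page."""
    if len(page) < FIL_PAGE_DATA + PAGE_INDEX_ID + 8:
        raise ValueError("short page: %d bytes" % len(page))
    # File header: checksum, page number, FIL_PAGE_PREV, FIL_PAGE_NEXT.
    _checksum, page_no, prev_page, next_page = struct.unpack_from(">IIII", page, 0)
    (n_heap,) = struct.unpack_from(">H", page, FIL_PAGE_DATA + PAGE_N_HEAP)
    (index_id,) = struct.unpack_from(">Q", page, FIL_PAGE_DATA + PAGE_INDEX_ID)
    return {
        "page_no": page_no,
        "prev": prev_page,
        "next": next_page,
        "compact": bool(n_heap & 0x8000),
        "index_id": index_id,
    }


def read_page_header(ibd_path, page_no, page_size=PAGE_SIZE):
    """Read page `page_no` from an .ibd file and decode its header fields."""
    with open(ibd_path, "rb") as f:
        f.seek(page_no * page_size)
        return parse_page_header(f.read(page_size))
```

For example, read_page_header(path, 33) and read_page_header(path, 40) on a copy of the feedback_logs tablespace file would show whether the two pages point at each other and which index id they carry. The page header fields are only meaningful for index pages, and the file should be read from a copy or with the server stopped.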

Aug 24 10:11:07 hostnm kernel: pid 87319 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:11:46 hostnm kernel: pid 13319 (mysqld), jid 0, uid 88: exited on signal 10 (core dumped)
Aug 24 10:12:54 hostnm kernel: pid 13432 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:15:26 hostnm kernel: pid 13666 (mysqld), jid 0, uid 88: exited on signal 10 (core dumped)
Aug 24 10:21:20 hostnm kernel: pid 14701 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:21:55 hostnm kernel: pid 16387 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:22:56 hostnm kernel: pid 16492 (mysqld), jid 0, uid 88: exited on signal 10 (core dumped)
Aug 24 10:24:51 hostnm kernel: pid 16716 (mysqld), jid 0, uid 88: exited on signal 10 (core dumped)
Aug 24 10:25:54 hostnm kernel: pid 17165 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:27:04 hostnm kernel: pid 17798 (mysqld), jid 0, uid 88: exited on signal 10 (core dumped)
Aug 24 10:27:14 hostnm kernel: pid 18079 (mysqld), jid 0, uid 88: exited on signal 10 (core dumped)
Aug 24 10:28:40 hostnm kernel: pid 18112 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:30:50 hostnm kernel: pid 18429 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:34:03 hostnm kernel: pid 20057 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:39:07 hostnm kernel: pid 20402 (mysqld), jid 0, uid 88: exited on signal 10 (core dumped)
Aug 24 10:39:39 hostnm kernel: pid 21908 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:42:18 hostnm kernel: pid 22056 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:44:45 hostnm kernel: pid 23074 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:51:03 hostnm kernel: pid 24933 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)
Aug 24 10:57:00 hostnm kernel: pid 26154 (mysqld), jid 0, uid 88: exited on signal 11 (core dumped)
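
To get an overview of how the process died each time, the kernel messages above can be tallied by signal number. This is only a small sketch; the regular expression is written against the exact line format shown here, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# "... kernel: pid 87319 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)"
LINE_RE = re.compile(r"pid (\d+) \((\w+)\).*exited on signal (\d+)")


def tally_signals(lines, process="mysqld"):
    """Count how often `process` exited on each signal number."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group(2) == process:
            counts[int(match.group(3))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Aug 24 10:11:07 hostnm kernel: pid 87319 (mysqld), jid 0, uid 88: exited on signal 6 (core dumped)",
        "Aug 24 10:11:46 hostnm kernel: pid 13319 (mysqld), jid 0, uid 88: exited on signal 10 (core dumped)",
        "Aug 24 10:57:00 hostnm kernel: pid 26154 (mysqld), jid 0, uid 88: exited on signal 11 (core dumped)",
    ]
    for signo, n in sorted(tally_signals(sample).items()):
        print("signal %d: %d" % (signo, n))
```

Applied to the 20 lines above, this gives 12 exits on signal 6 (the InnoDB assertion aborts), 7 on signal 10 (SIGBUS on FreeBSD, as in the lldb backtrace further down) and 1 on signal 11.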

It crashed every few minutes until I identified a bunch of affected tables in different databases, deleted their files on disk, and restored them from backup.

Then everything worked for 2 weeks. *Exactly 2 weeks.*
It started crashing again on 2021-09-07, again at 10 AM.
There was no update, power loss, etc. prior to these crashes.

mysqlcheck and mysqldump could not be used; every access to some of the data caused another crash.

This is the first crash from 2021-09-07:

2021-09-06 14:09:01 166494 [Warning] Aborted connection 166494 to db: 'invoiceapprov000' user: 'invoiceapprov000' host: 'localhost' (Got an error reading communication packets)
210907 10:00:02 [ERROR] mysqld got signal 10 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.3.31-MariaDB-log
key_buffer_size=1073741824
read_buffer_size=2097152
max_used_connections=13
max_threads=202
thread_count=12
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 14704992 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x111acca288
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fffdbe55f38 thread_stack 0x49000
0x1176c8c <my_print_stacktrace+0x3c> at /usr/local/libexec/mysqld
0xb53de5 <handle_fatal_signal+0x295> at /usr/local/libexec/mysqld
0x801809b70 <_pthread_sigmask+0x530> at /lib/libthr.so.3
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x111f1781e0): INSERT INTO `feedback_logs` (`feedbacks`, `users`, `user_name`, `changes`, `system_created`) VALUES (8141, NULL, '', '{\"states\":4}', NOW())
 
Connection ID (thread ID): 190863
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
 
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Core pattern: %N.core
2021-09-07 10:00:06 0 [Note] Using unique option prefix 'myisam_recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead.
2021-09-07 10:00:06 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2021-09-07 10:00:06 0 [Note] InnoDB: Uses event mutexes
2021-09-07 10:00:06 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-09-07 10:00:06 0 [Note] InnoDB: Number of pools: 1
2021-09-07 10:00:06 0 [Note] InnoDB: Using SSE2 crc32 instructions
2021-09-07 10:00:06 0 [Note] InnoDB: Initializing buffer pool, total size = 32G, instances = 8, chunk size = 128M
2021-09-07 10:00:08 0 [Note] InnoDB: Completed initialization of buffer pool
2021-09-07 10:00:09 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=169732524444
2021-09-07 10:00:09 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 294 row operations to undo
2021-09-07 10:00:09 0 [Note] InnoDB: Trx id counter is 116137602
2021-09-07 10:00:09 0 [Note] InnoDB: Starting final batch to recover 51 pages from redo log.
2021-09-07 10:00:09 0x85a8c9200  InnoDB: Assertion failure in file /wrkdirs/overlays/mfh_overlay2/databases/mariadb103-server/work/mariadb-10.3.31/storage/innobase/log/log0recv.cc line 1585
InnoDB: Failing assertion: !page || (ibool)!!page_is_comp(page) == dict_table_is_comp(index->table)
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
210907 10:00:09 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.3.31-MariaDB-log
key_buffer_size=1073741824
read_buffer_size=2097152
max_used_connections=0
max_threads=202
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 14704992 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
0x1176c8c <my_print_stacktrace+0x3c> at /usr/local/libexec/mysqld
0xb53de5 <handle_fatal_signal+0x295> at /usr/local/libexec/mysqld
0x801809b70 <_pthread_sigmask+0x530> at /lib/libthr.so.3
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Core pattern: %N.core

After that it could no longer start, crashing immediately:

2021-09-07 10:05:28 0 [Note] Using unique option prefix 'myisam_recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead.
2021-09-07 10:05:28 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2021-09-07 10:05:28 0 [Note] InnoDB: Uses event mutexes
2021-09-07 10:05:28 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2021-09-07 10:05:28 0 [Note] InnoDB: Number of pools: 1
2021-09-07 10:05:28 0 [Note] InnoDB: Using SSE2 crc32 instructions
2021-09-07 10:05:28 0 [Note] InnoDB: Initializing buffer pool, total size = 32G, instances = 8, chunk size = 128M
2021-09-07 10:05:30 0 [Note] InnoDB: Completed initialization of buffer pool
2021-09-07 10:05:30 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=169732366483
2021-09-07 10:05:30 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 294 row operations to undo
2021-09-07 10:05:30 0 [Note] InnoDB: Trx id counter is 116137602
2021-09-07 10:05:30 0 [Note] InnoDB: Starting final batch to recover 51 pages from redo log.
2021-09-07 10:05:30 0x85a8c9200  InnoDB: Assertion failure in file /wrkdirs/overlays/mfh_overlay2/databases/mariadb103-server/work/mariadb-10.3.31/storage/innobase/log/log0recv.cc line 1585
InnoDB: Failing assertion: !page || (ibool)!!page_is_comp(page) == dict_table_is_comp(index->table)
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
210907 10:05:30 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.3.31-MariaDB-log
key_buffer_size=1073741824
read_buffer_size=2097152
max_used_connections=0
max_threads=202
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 14704992 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
0x1176c8c <my_print_stacktrace+0x3c> at /usr/local/libexec/mysqld
0xb53de5 <handle_fatal_signal+0x295> at /usr/local/libexec/mysqld
0x801809b70 <_pthread_sigmask+0x530> at /lib/libthr.so.3
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Core pattern: %N.core

We started with innodb_force_recovery = 1, but it did not help; the server only started once innodb_force_recovery = 6 was set.

After starting with the setting at 6 there were many error messages of the form "Failed to find tablespace for table xxxx".

We dumped as much data as possible, deleted all MariaDB data (/var/db/mysql), initialized a new empty instance, and restored the data dumped in the previous step. Some tables had to be restored from the nightly backup.
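
For anyone in the same situation, dumping "as much as possible" is easier when each table is dumped separately, so that one unreadable table does not abort the whole run. The following is a rough sketch of that idea, not the exact procedure we used. It assumes mysqldump is on the PATH and picks up its credentials from an option file; the function name and the output file naming are made up for illustration.

```python
import subprocess


def dump_tables(tables, out_dir, run=subprocess.run):
    """Dump each (database, table) pair to its own file.

    Returns two lists: the pairs that were dumped and the pairs that failed.
    `run` is injectable so the logic can be exercised without a server.
    """
    ok, failed = [], []
    for db, table in tables:
        target = "%s/%s.%s.sql" % (out_dir, db, table)
        cmd = ["mysqldump", "--result-file=" + target, db, table]
        try:
            success = run(cmd).returncode == 0
        except OSError:
            # mysqldump itself could not be started
            success = False
        (ok if success else failed).append((db, table))
    return ok, failed
```

The failed list is then the set of tables that need to come from the nightly backup. If a dump crashes the server, the following dumps will also fail until mysqld is back up, so in practice a wait or a retry between tables may be needed.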

MariaDB started and everything seemed normal, but right after production traffic was directed to it, it started crashing again.

2021-09-07 15:24:26 0xeecf6ab00  InnoDB: Assertion failure in file /wrkdirs/overlays/mfh_overlay2/databases/mariadb103-server/work/mariadb-10.3.31/storage/innobase/rem/rem0rec.cc line 824
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
InnoDB: about forcing recovery.
210907 15:24:26 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.3.31-MariaDB-log
key_buffer_size=1073741824
read_buffer_size=2097152
max_used_connections=2
max_threads=202
thread_count=7
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 14704992 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0xf1f9257c8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fffdbef7f38 thread_stack 0x49000
0x1176c8c <my_print_stacktrace+0x3c> at /usr/local/libexec/mysqld
0xb53de5 <handle_fatal_signal+0x295> at /usr/local/libexec/mysqld
0x801809b70 <_pthread_sigmask+0x530> at /lib/libthr.so.3
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0xf1f99f760): SELECT `locale`.`short` AS `locale`, `key`, `message` FROM `translations` LEFT JOIN `languages` `locale` ON `translations`.`locale` = `locale`.`id`
 
Connection ID (thread ID): 21
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
 
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Core pattern: %N.core

So MariaDB on this machine is now unusable. We tried installing version 10.5 instead of 10.3 (without recreating the databases and tables, just on the same data where 10.3.31 crashes), but that did not change anything.

I tried lldb on one of the core files, but we have no debug symbols:

# lldb /usr/local/libexec/mysqld -c /var/db/mysql/mysqld.core
(lldb) target create "/usr/local/libexec/mysqld" --core "/var/db/mysql/mysqld.core"
Core file '/var/db/mysql/mysqld.core' (x86_64) was loaded.
(lldb) bt
* thread #1, name = 'mysqld', stop reason = signal SIGBUS
  * frame #0: 0x0000000801968bda libc.so.7`__sys_kill at kill.S:3
    frame #1: 0x0000000000b4fe8b mysqld`handle_fatal_signal + 1211
    frame #2: 0x0000000801805b70 libthr.so.3`handle_signal(actp=0x00007fffdc13db80, sig=10, info=0x00007fffdc13df70, ucp=0x00007fffdc13dc00) at thr_sig.c:248:3
    frame #3: 0x000000080180513f libthr.so.3`thr_sighandler(sig=10, info=0x00007fffdc13df70, _ucp=0x00007fffdc13dc00) at thr_sig.c:191:2
    frame #4: 0x00007fffffffe003
    frame #5: 0x0000000000f0a583 mysqld`___lldb_unnamed_symbol2476$$mysqld + 4483
    frame #6: 0x000000000100679f mysqld`___lldb_unnamed_symbol3817$$mysqld + 527
    frame #7: 0x0000000000ffd93b mysqld`___lldb_unnamed_symbol3799$$mysqld + 1067
    frame #8: 0x0000000000fc0229 mysqld`___lldb_unnamed_symbol3531$$mysqld + 617
    frame #9: 0x0000000000fc0a07 mysqld`___lldb_unnamed_symbol3532$$mysqld + 343
    frame #10: 0x0000000000f35a5a mysqld`___lldb_unnamed_symbol2652$$mysqld + 4922
    frame #11: 0x0000000000e6cf54 mysqld`___lldb_unnamed_symbol1442$$mysqld + 708
    frame #12: 0x0000000000e61aec mysqld`___lldb_unnamed_symbol1356$$mysqld + 2572
    frame #13: 0x0000000000a48c87 mysqld`handler::ha_open(TABLE*, char const*, int, unsigned int, st_mem_root*, List<String>*) + 71
    frame #14: 0x0000000000d0bc0c mysqld`open_table_from_share(THD*, TABLE_SHARE*, st_mysql_const_lex_string const*, unsigned int, unsigned int, unsigned int, TABLE*, bool, List<String>*) + 2844
    frame #15: 0x0000000000bd6265 mysqld`open_table(THD*, TABLE_LIST*, Open_table_context*) + 1749
    frame #16: 0x0000000000bd8383 mysqld`open_tables(THD*, DDL_options_st const&, TABLE_LIST**, unsigned int*, unsigned int, Prelocking_strategy*) + 1187
    frame #17: 0x0000000000bda47f mysqld`open_and_lock_tables(THD*, DDL_options_st const&, TABLE_LIST*, bool, unsigned int, Prelocking_strategy*) + 63
    frame #18: 0x0000000000c49876 mysqld`___lldb_unnamed_symbol558$$mysqld + 278
    frame #19: 0x0000000000c430ca mysqld`mysql_execute_command(THD*) + 1450
    frame #20: 0x0000000000c4086d mysqld`mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) + 573
    frame #21: 0x0000000000c3d199 mysqld`dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) + 2761
    frame #22: 0x0000000000c3f1f2 mysqld`do_command(THD*) + 386
    frame #23: 0x0000000000d48d59 mysqld`do_handle_one_connection(CONNECT*) + 505
    frame #24: 0x0000000000d48b56 mysqld`handle_one_connection + 54
    frame #25: 0x00000008017fffac libthr.so.3`thread_start(curthread=0x0000000860cbf600) at thr_create.c:292:1

We are using innodb_file_per_table

innodb_file_per_table
innodb_buffer_pool_size     = 32G
innodb_log_file_size        = 256M
innodb_log_buffer_size      = 16M
innodb_log_files_in_group   = 3
innodb_flush_log_at_trx_commit = 2

I have been working with MySQL / MariaDB for 20 years but have never seen anything like this, even after hard OS crashes, power outages, etc.

There is no evidence of HW failures logged anywhere in the OS. I checked the file system with zpool scrub; no errors at all.

  scan: scrub in progress since Tue Sep  7 15:18:54 2021
        133G scanned at 1.24G/s, 61.5G issued at 583M/s, 133G total
        0 repaired, 46.10% done, 0 days 00:02:06 to go
config:
 
        NAME                STATE     READ WRITE CKSUM
        tank0               ONLINE       0     0     0
          mirror-0          ONLINE       0     0     0
            gpt/disk0tank0  ONLINE       0     0     0
            gpt/disk1tank0  ONLINE       0     0     0



 Comments   
Comment by Marko Mäkelä [ 2021-09-14 ]

This likely is a duplicate of MDEV-26537. Can you confirm that?

Comment by Miroslav Lachman [ 2021-09-14 ]

Does the bug in MDEV-26537 affect only version 10.3.31, or is it also present in previous versions? We have been running this machine for 2 years. If the bug is only in 10.3.31, then my problem is probably a duplicate of MDEV-26537.

Comment by Marko Mäkelä [ 2021-09-14 ]

Yes, MDEV-26537 only affected the latest point releases as indicated in the affected versions list. Unfortunately, any data files that were rebuilt, created, or increased in size after the upgrade may be corrupted. Unfortunately, we did not catch that before release, because we are missing FreeBSD in our continuous integration platforms. We did repeat it on our somewhat experimental AIX builder later, but this scenario would not be triggered by the basic set of regression tests.

I will leave this ticket open until you can provide final confirmation.
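
One rough way to shortlist data files that were touched after the upgrade is to look at modification times. This is only a sketch under stated assumptions: the function name is made up, the cutoff has to be supplied by the administrator, and a recent mtime only means the file was written to, not that it was extended or that it is corrupted. It will therefore over-report on a busy server.

```python
import os
from datetime import datetime


def ibd_files_modified_since(datadir, since):
    """Return sorted (path, size) pairs of .ibd files with mtime >= `since`."""
    cutoff = since.timestamp()
    hits = []
    for root, _dirs, files in os.walk(datadir):
        for name in files:
            if not name.endswith(".ibd"):
                continue
            path = os.path.join(root, name)
            st = os.stat(path)
            if st.st_mtime >= cutoff:
                hits.append((path, st.st_size))
    return sorted(hits)


if __name__ == "__main__":
    # Assumed values: datadir from the report, cutoff taken from the
    # first 10.3.31 startup in the error log (2021-08-23 22:27).
    for path, size in ibd_files_modified_since("/var/db/mysql",
                                               datetime(2021, 8, 23, 22, 27)):
        print(size, path)
```

Tables on the resulting list are candidates for a rebuild or a restore from a backup taken before the upgrade, once a fixed server version is running.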

Comment by Miroslav Lachman [ 2021-09-14 ]

Thank you for the clarification. Then this is a duplicate. We will try to roll back to 10.3.30 and wait for 10.3.32.
