[MDEV-28695] InnoDB: Database page corruption on disk or a failed read => mysqld got signal 11 - std::unique_lock<std::mutex>::unlock() - Jira

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Incomplete
Affects Version/s: 10.5.15
Fix Version/s: N/A
Component/s: Locking, Platform Debian, Storage Engine - InnoDB
Labels: None
Environment:

# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
...

# uname -a
Linux ... 5.10.0-14-amd64 #1 SMP Debian 5.10.113-1 (2022-04-29) x86_64 GNU/Linux

# mysqld --version
mysqld Ver 10.5.15-MariaDB-0+deb11u1-log for debian-linux-gnu on x86_64 (Debian 11)


Description

I'm essentially facing two problems.

1) Every few days I'm getting the following (db/table/column names manually obfuscated below):

2022-05-29  5:54:28 226 [ERROR] InnoDB: Database page corruption on disk or a failed read of file './xxxxdb/bad_table.ibd' page [page id: space=386040, page number=1446430]. You may have to recover from a backup.

2022-05-29  5:54:28 226 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):

 len 16384; hex ...

InnoDB: End of page dump

2022-05-29  5:54:28 226 [Note] InnoDB: Uncompressed page, stored checksum in field1 642196308, calculated checksums for field1: crc32 3586801619, innodb 1305855102,  page type 10 == BLOB.none 3735928559, stored checksum in field2 642196308, calculated checksums for field2: crc32 3586801619, innodb 1806045908, none 3735928559,  page LSN 3289 3616289816, low 4 bytes of LSN at page end 3616289816, page number (if stored to page already) 1446430, space id (if create with >= MySQL-4.1.1 and stored already) 386040

InnoDB: Page may be a BLOB page

2022-05-29  5:54:28 226 [Note] InnoDB:  You can use CHECK TABLE to scan your table for corruption. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
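The numbers in the diagnostic note above can be cross-checked mechanically. The following is a minimal sketch, not an InnoDB tool: it parses a shortened copy of that note, checks whether the stored checksum equals any of the calculated ones, and combines the two "page LSN" values into one 64-bit number. Treating those two values as the high and low 32-bit halves of the LSN is an assumption on my part; the matching "low 4 bytes of LSN at page end" value is at least consistent with it.

```python
import re

# Diagnostic line copied (shortened) from the log above.
note = (
    "Uncompressed page, stored checksum in field1 642196308, "
    "calculated checksums for field1: crc32 3586801619, innodb 1305855102, "
    "page LSN 3289 3616289816, "
    "low 4 bytes of LSN at page end 3616289816"
)

stored = int(re.search(r"stored checksum in field1 (\d+)", note).group(1))
calculated = {
    name: int(value)
    for name, value in re.findall(r"(crc32|innodb) (\d+)", note)
}
high, low = map(int, re.search(r"page LSN (\d+) (\d+)", note).groups())
trailer_low = int(re.search(r"LSN at page end (\d+)", note).group(1))

# Assumption: the two "page LSN" numbers are the high and low 32-bit halves.
page_lsn = (high << 32) | low

# True only if the stored checksum equals one of the recomputed ones.
checksum_matches = stored in calculated.values()
# True if the LSN copy in the page trailer agrees with the header copy.
trailer_matches = low == trailer_low

print(f"stored={stored} calculated={calculated} match={checksum_matches}")
print(f"page_lsn={page_lsn} trailer_matches={trailer_matches}")
```

Under that assumption the page LSN works out to 14129763726360, the stored checksum matches neither calculated value, and the header and trailer LSN copies agree.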

I don't know whether this is i) a bug, ii) failing memory or iii) a failing disk. It happens with a variety of tables and lasts about 5 seconds, logging the above about 20 times per second, before...

Sometimes (I think the pattern probably has more to do with my application querying MariaDB than with MariaDB itself?) it then goes...

2022-05-27  0:03:55 353 [ERROR] InnoDB: We detected index corruption in an InnoDB type table. You have to dump + drop + reimport the table or, in a case of widespread corruption, dump all InnoDB tables and recreate the whole tablespace. If the mysqld server crashes after the startup or when you dump the tables. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.

2022-05-27  0:03:55 353 [ERROR] mariadbd: Index for table 'bad_table' is corrupt; try to repair it

but other times it then hits the second, more serious, problem...

220529  5:54:33 [ERROR] mysqld got signal 11 ;

This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.

Server version: 10.5.15-MariaDB-0+deb11u1-log

key_buffer_size=402653184

read_buffer_size=2097152

max_used_connections=2

max_threads=153

thread_count=2

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1963808 K  bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f9a5c000c58

Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong...

stack_bottom = 0x7f9bdc054d78 thread_stack 0x30000

??:0(my_print_stacktrace)[0x55d6fdc5154e]

??:0(handle_fatal_signal)[0x55d6fd750f65]

sigaction.c:0(__restore_rt)[0x7f9be0732140]

??:0(std::unique_lock<std::mutex>::unlock())[0x55d6fdb455b4]

??:0(std::unique_lock<std::mutex>::unlock())[0x55d6fdb51a96]

??:0(std::unique_lock<std::mutex>::unlock())[0x55d6fdaed4ea]

??:0(std::unique_lock<std::mutex>::unlock())[0x55d6fdaed90c]

??:0(std::unique_lock<std::mutex>::unlock())[0x55d6fdaf469d]

??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0x55d6fda37b42]

??:0(handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function))[0x55d6fd756bd8]

??:0(cp_buffer_from_ref(THD*, TABLE*, st_table_ref*))[0x55d6fd5a8254]

??:0(sub_select(JOIN*, st_join_table*, bool))[0x55d6fd59480e]

??:0(Item_bool_func2::remove_eq_conds(THD*, Item::cond_result*, bool))[0x55d6fd580fec]

??:0(sub_select(JOIN*, st_join_table*, bool))[0x55d6fd5948a3]

??:0(JOIN::exec_inner())[0x55d6fd5bee38]

??:0(JOIN::exec())[0x55d6fd5bf295]

??:0(mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x55d6fd5bd116]

??:0(mysql_multi_update(THD*, TABLE_LIST*, List<Item>*, List<Item>*, Item*, unsigned long long, enum_duplicates, bool, st_select_lex_unit*, st_select_lex*, multi_update**))[0x55d6fd6109ed]

??:0(mysql_execute_command(THD*))[0x55d6fd55b0b4]

??:0(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d6fd55c5db]

??:0(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55d6fd55ea5d]

??:0(do_command(THD*))[0x55d6fd5602de]

??:0(do_handle_one_connection(CONNECT*, bool))[0x55d6fd651fb2]

??:0(handle_one_connection)[0x55d6fd65222d]

??:0(MyCTX_nopad::finish(unsigned char*, unsigned int*))[0x55d6fd98e11b]

nptl/pthread_create.c:478(start_thread)[0x7f9be0726ea7]

x86_64/clone.S:97(__GI___clone)[0x7f9be033ddef]

Trying to get some variables.

Some pointers may be invalid and cause the dump to abort.

Query (0x7f9a5c010470): update bad_table p, table2 n set p.n_col=n.n_col where p.colID=n.colID and n.col1=p.col1 and p.col='xx'

Connection ID (thread ID): 226

Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off

The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains information that should help you find out what is causing the crash.

Writing a core file...

Working directory at /mnt/disk3/mysqldata

Resource Limits:

Limit                     Soft Limit           Hard Limit           Units

Max cpu time              unlimited            unlimited            seconds

Max file size             unlimited            unlimited            bytes

Max data size             unlimited            unlimited            bytes

Max stack size            8388608              unlimited            bytes

Max core file size        0                    unlimited            bytes

Max resident set          unlimited            unlimited            bytes

Max processes             62978                62978                processes

Max open files            32768                32768                files

Max locked memory         65536                65536                bytes

Max address space         unlimited            unlimited            bytes

Max file locks            unlimited            unlimited            locks

Max pending signals       62978                62978                signals

Max msgqueue size         819200               819200               bytes

Max nice priority         0                    0

Max realtime priority     0                    0

Max realtime timeout      unlimited            unlimited            us

Core pattern: core

2022-05-29  5:54:38 0 [Note] Using unique option prefix 'myisam-recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead.

2022-05-29  5:54:38 0 [Note] CONNECT: Version 1.07.0002 March 22, 2021

2022-05-29  5:54:38 0 [Warning] The parameter innodb_file_format is deprecated and has no effect. It may be removed in future releases. See https://mariadb.com/kb/en/library/xtradbinnodb-file-format/

2022-05-29  5:54:38 0 [Note] InnoDB: !!! innodb_force_recovery is set to 1 !!!

2022-05-29  5:54:38 0 [Note] InnoDB: Uses event mutexes

2022-05-29  5:54:38 0 [Note] InnoDB: Compressed tables use zlib 1.2.11

2022-05-29  5:54:38 0 [Note] InnoDB: Number of pools: 1

2022-05-29  5:54:38 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions

2022-05-29  5:54:39 0 [Note] InnoDB: Using Linux native AIO

2022-05-29  5:54:39 0 [Note] InnoDB: Initializing buffer pool, total size = 4294967296, chunk size = 134217728

2022-05-29  5:54:39 0 [Note] InnoDB: Completed initialization of buffer pool

2022-05-29  5:54:39 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=14858023269921,14858023269921

2022-05-29  5:54:40 0 [Note] InnoDB: Starting final batch to recover 17480 pages from redo log.

2022-05-29  5:54:41 0 [Note] InnoDB: 128 rollback segments are active.

2022-05-29  5:54:41 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"

2022-05-29  5:54:41 0 [Note] InnoDB: Creating shared tablespace for temporary tables

2022-05-29  5:54:41 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...

2022-05-29  5:54:41 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.

2022-05-29  5:54:41 0 [Note] InnoDB: 10.5.15 started; log sequence number 14858195549968; transaction id 5082853300

2022-05-29  5:54:41 0 [Note] InnoDB: Loading buffer pool(s) from /mnt/disk3/mysqldata/ib_buffer_pool

2022-05-29  5:54:41 0 [Note] Plugin 'FEEDBACK' is disabled.

2022-05-29  5:54:41 0 [ERROR] mariadbd: Plugin 'CONNECT' already installed

2022-05-29  5:54:41 0 [Note] Server socket created on IP: '0.0.0.0'.

2022-05-29  5:54:41 0 [Note] Reading of all Master_info entries succeeded

2022-05-29  5:54:41 0 [Note] Added new Master_info '' to hash table

2022-05-29  5:54:41 0 [Note] /usr/sbin/mariadbd: ready for connections.

Version: '10.5.15-MariaDB-0+deb11u1-log'  socket: '/run/mysqld/mysqld.sock'  port: 3306  Debian 11

2022-05-29  5:54:41 5 [Warning] ./sqlite3/xxxx.frm is inconsistent: engine typecode 44, engine name CONNECT (46)

2022-05-29  5:54:41 0 [Note] InnoDB: Buffer pool(s) load completed at 220529  5:54:41

Once MariaDB is running again, my application starts querying again and the process repeats. I usually stop my application and mysqlcheck/repair/restore the table before restarting it.
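Since the corruption message is logged many times per episode and for different tables, it may help to tally the reports per file and page before deciding which tables to check or restore. This is a minimal sketch, assuming the error log keeps the exact wording of the `[ERROR] InnoDB: Database page corruption` line quoted earlier; the function name `tally_corruption` is hypothetical, and the sample lines are taken from this report.

```python
import re
from collections import Counter

# Pattern modelled on the "[ERROR] InnoDB: Database page corruption" line
# quoted in this report; other MariaDB versions may word it differently.
CORRUPTION = re.compile(
    r"failed read of file '(?P<file>[^']+)' page "
    r"\[page id: space=(?P<space>\d+), page number=(?P<page>\d+)\]"
)


def tally_corruption(lines):
    """Count corruption reports per (file, space id, page number)."""
    counts = Counter()
    for line in lines:
        match = CORRUPTION.search(line)
        if match:
            key = (match["file"], int(match["space"]), int(match["page"]))
            counts[key] += 1
    return counts


sample = [
    "2022-05-29  5:54:28 226 [ERROR] InnoDB: Database page corruption on disk "
    "or a failed read of file './xxxxdb/bad_table.ibd' page "
    "[page id: space=386040, page number=1446430]. "
    "You may have to recover from a backup.",
    "2022-05-29  5:54:28 226 [Note] InnoDB: End of page dump",
]

counts = tally_corruption(sample)
for (path, space, page), n in counts.most_common():
    print(f"{path} space={space} page={page}: {n} report(s)")
```

On a real server the same function could be fed an open error-log file instead of `sample`; the location of that file is not stated in this report.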

The segfault is caused by a variety of different queries accessing the corrupt table(s) in different ways, but each time it's the std::unique_lock<std::mutex>::unlock() that seems to be responsible for the segfault. It's particularly sad when the resulting CHECK TABLE/OPTIMIZE TABLE cause the segfault too :-/ e.g.

Thread pointer: 0x7fa500003648

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 0x7fa678090d78 thread_stack 0x30000

??:0(my_print_stacktrace)[0x55d8e417954e]

??:0(handle_fatal_signal)[0x55d8e3c78f65]

sigaction.c:0(__restore_rt)[0x7fa67c481140]

??:0(std::unique_lock<std::mutex>::unlock())[0x55d8e3ffba71]

??:0(std::unique_lock<std::mutex>::unlock())[0x55d8e3ffd13a]

??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0x55d8e3f77f15]

??:0(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool))[0x55d8e3b247cb]

??:0(mysql_recreate_table(THD*, TABLE_LIST*, bool))[0x55d8e3b25017]

??:0(MDL_ticket::~MDL_ticket())[0x55d8e3b84f3d]

??:0(MDL_ticket::~MDL_ticket())[0x55d8e3b86dcc]

??:0(Sql_cmd_optimize_table::execute(THD*))[0x55d8e3b87f0d]

??:0(mysql_execute_command(THD*))[0x55d8e3a80356]

??:0(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d8e3a845db]

??:0(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55d8e3a86a5d]

??:0(do_command(THD*))[0x55d8e3a882de]

??:0(do_handle_one_connection(CONNECT*, bool))[0x55d8e3b79fb2]

??:0(handle_one_connection)[0x55d8e3b7a22d]

??:0(MyCTX_nopad::finish(unsigned char*, unsigned int*))[0x55d8e3eb611b]

nptl/pthread_create.c:478(start_thread)[0x7fa67c475ea7]

x86_64/clone.S:97(__GI___clone)[0x7fa67c08cdef]

Trying to get some variables.

Some pointers may be invalid and cause the dump to abort.

Query (0x7fa500011cd0): OPTIMIZE TABLE `bad_table`

Connection ID (thread ID): 212

Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off
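To compare the traces from different crashes, the frames can be split into location, symbol and address. The sketch below is only an illustration with hypothetical names (`Frame`, `parse_frames`), using a few frames copied from the first backtrace. Most frames carry a `??:0` location, which I read as the resolver lacking line information, so the printed symbol names may not be exact; that is an assumption, not something the log states.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    location: str  # e.g. "??:0" or "sigaction.c:0"
    symbol: str    # symbol name as printed in the log
    address: int   # instruction address from the square brackets


# One frame per line: <location>(<symbol>)[0x<address>]
FRAME = re.compile(
    r"^(?P<loc>\S+?:\d+)\((?P<symbol>.*)\)\[(?P<addr>0x[0-9a-f]+)\]$"
)


def parse_frames(text: str) -> List[Frame]:
    """Return every line of `text` that looks like a backtrace frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append(
                Frame(match["loc"], match["symbol"], int(match["addr"], 16))
            )
    return frames


# A few frames copied from the first backtrace in this report.
trace = """\
??:0(handle_fatal_signal)[0x55d6fd750f65]
sigaction.c:0(__restore_rt)[0x7f9be0732140]
??:0(std::unique_lock<std::mutex>::unlock())[0x55d6fdb455b4]
??:0(std::unique_lock<std::mutex>::unlock())[0x55d6fdb51a96]
??:0(handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function))[0x55d6fd756bd8]
"""

frames = parse_frames(trace)
unresolved = [f for f in frames if f.location.startswith("??")]
for frame in unresolved:
    print(f"{frame.address:#x} {frame.symbol}")
```

The addresses listed this way are the ones that would need resolving against debug symbols, as described on the full-stack-trace manual page linked in the crash output.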

I don't suppose there's much that can easily be done to track down the first problem, but it would be great if the segfault could be fixed, so at least MariaDB stays up and carries on handling queries for the non-corrupt tables.

People

Assignee: Marko Mäkelä

Reporter: A D

Votes: 0

Watchers: 3

Dates

Created: 2022-05-29 22:17

Updated: 2022-07-03 14:12

Resolved: 2022-07-03 14:12
