Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Incomplete
-
10.2.27
-
None
Description
customer reported the following crash during mariabackup --prepare stage.
Actual command the used is as following:
/glide/mariadb/10.2.27snc2/bin/mariabackup --prepare --use-memory=32G --target-dir /glide/mysqld/acnnacioj_3402_peta/temp/restore_2021-03-16_0054271/accentureinternal_3400_db170016_s_2021-03-14_2210539 2> /glide/backup/log/restore.2021-03-16_0054271.accentureinternal_3400_db170016_s_2021-03-14_2210539.log |
- some of config
slave_parallel_threads=16
#binlog_format=MIXED
binlog_format=ROW
slave_parallel_mode=conservative
innodb_doublewrite=ON
#innodb_max_dirty_pages_pct=5
innodb_log_file_size=32768M
innodb_change_buffer_max_size=5
innodb_buffer_pool_instances=32
innodb_purge_batch_size=5000
table_definition_cache=524288
table_open_cache=524288
table_open_cache_instances=16
back_log=1500
innodb_adaptive_hash_index_partitions=16
slave_compressed_protocol=OFF
slave_parallel_max_queued=500000
expire_logs_days=3
innodb_max_dirty_pages_pct_lwm=50
- error log
2021-03-16 16:45:25 140315704616704 [Note] InnoDB: To recover: 532973 pages from log
2021-03-16 16:45:40 140315713009408 [Note] InnoDB: To recover: 401586 pages from log
2021-03-16 16:45:42 140315696224000 [ERROR] [FATAL] InnoDB: is_short 0, info_and_status_bits 0, offset 12896, o_offset 8, mismatch index 18446744073709551595, end_seg_len 33 parsed len 3
210316 16:45:42 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Server version: 10.2.27-MariaDB
key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_threads=1
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 5419 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
/glide/mariadb/10.2.27snc2/bin/mariabackup(my_print_stacktrace+0x2e)[0x5578aff47c7e]
mysys/stacktrace.c:268(my_print_stacktrace)[0x5578af9ec54d]
sigaction.c:0(__restore_rt)[0x7fa658b2e630]
:0(__GI_raise)[0x7fa657938387]
:0(__GI_abort)[0x7fa657939a78]
ut/ut0ut.cc:645(ib::fatal::~fatal())[0x5578afd45823]
page/page0cur.cc:1200(page_cur_parse_insert_rec(unsigned long, unsigned char const*, unsigned char const*, buf_block_t*, dict_index_t*, mtr_t*))[0x5578afc64f9b]
log/log0recv.cc:1655(recv_parse_or_apply_log_rec_body(mlog_id_t, unsigned char*, unsigned char*, unsigned long, unsigned long, bool, buf_block_t*, mtr_t*))[0x5578afc431d1]
log/log0recv.cc:2162(recv_recover_page(buf_block_t*, mtr_t&, recv_addr_t*, unsigned long))[0x5578afc43e19]
log/log0recv.cc:2272(recv_recover_page(buf_page_t*))[0x5578af70278d]
buf/buf0buf.cc:6164(buf_page_io_complete(buf_page_t*, bool, bool))[0x5578afb35b9c]
fil/fil0fil.cc:5169(fil_aio_wait(unsigned long))[0x5578afb83461]
srv/srv0start.cc:331(io_handler_thread)[0x5578afcf7da8]
pthread_create.c:0(start_thread)[0x7fa658b26ea5]
/lib64/libc.so.6(clone+0x6d)[0x7fa657a0096d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /glide/mysqld/acnnacioj_3402_peta/temp/restore_2021-03-16_0054271/accentureinternal_3400_db170016_s_2021-03-14_2210539
Resource Limits:
Fatal signal 11 while backtracing
Attachments
Issue Links
- is caused by
-
MDEV-14229 Stack trace is not resolved for shared objects
-
- Closed
-
- relates to
-
MDEV-24449 Corruption of system tablespace or last recovered page
-
- Closed
-
The crash within the stack trace reporter was probably caused by
MDEV-14229. I suspect that the signal hander is invoking library functions that are not listed in man 7 signal-safety on GNU/Linux.To analyze the original crash, we would need more information. Failures in recovery or mariabackup --prepare are usually deterministic, and thus there should be a good chance that the problem can be repeated with the original data files. It should be almost guaranteed if we had a copy of the files as they were before any recovery steps were applied.
I cannot claim it for sure, but I suspect that the bug that was fixed in
MDEV-24449can cause various sorts of corruption in both the InnoDB system tablespace (any page in it) and secondary index leaf pages of any tablespace. The output is from page_cur_parse_insert_rec(). When the mismatch_index is interpreted as a signed number, it is -21. The only reason for that failure should be that the index page contents is inconsistent with the redo log record.MDEV-24449could explain it.Does this occur with any server that was never corrupted by the bug that was fixed in
MDEV-24449?