  MariaDB Server / MDEV-27326

Mariabackup being overwhelmed during the prepare phase while using 32GB of memory


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.2.27
    • Fix Version/s: N/A
    • Component/s: Backup, Documentation
    • Labels: None
    • Environment: RHEL

    Description

      During the restore step, the customer was getting the error "Restore failed on 10.2.27 with signal 6 and 11".

      Oct 9 08:19:43 db166020 kernel: [5710622.181729] mariabackup[143845]: segfault at 0 ip 0000561e04a2dc88 sp 00007f2b853d97f0 error 6 in mariabackup[561e041da000+913000]

      Below is the stack trace from the mariabackup output:

      2021-10-09  8:19:42 139824895743744 [ERROR] [FATAL] InnoDB: is_short 0, info_and_status_bits 0, offset 10140, o_offset 9, mismatch index 18446744073709551594, end_seg_len 31 parsed len 3
      211009  8:19:42 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.
       
      Server version: 10.2.27-MariaDB
      key_buffer_size=0
      read_buffer_size=131072
      max_used_connections=0
      max_threads=1
      thread_count=0
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 5419 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      /glide/mariadb/10.2.27snc2/bin/mariabackup(my_print_stacktrace+0x2e)[0x561e04a30c7e]
      mysys/stacktrace.c:268(my_print_stacktrace)[0x561e044d554d]
      sigaction.c:0(__restore_rt)[0x7f3412b5e630]
      :0(__GI_raise)[0x7f3411968387]
      :0(__GI_abort)[0x7f3411969a78]
      ut/ut0ut.cc:645(ib::fatal::~fatal())[0x561e0482e823]
      page/page0cur.cc:1200(page_cur_parse_insert_rec(unsigned long, unsigned char const*, unsigned char const*, buf_block_t*, dict_index_t*, mtr_t*))[0x561e0474df9b]
      log/log0recv.cc:1655(recv_parse_or_apply_log_rec_body(mlog_id_t, unsigned char*, unsigned char*, unsigned long, unsigned long, bool, buf_block_t*, mtr_t*))[0x561e0472c1d1]
      log/log0recv.cc:2162(recv_recover_page(buf_block_t*, mtr_t&, recv_addr_t*, unsigned long))[0x561e0472ce19]
      log/log0recv.cc:2272(recv_recover_page(buf_page_t*))[0x561e041eb78d]
      buf/buf0buf.cc:6164(buf_page_io_complete(buf_page_t*, bool, bool))[0x561e0461eb9c]
      fil/fil0fil.cc:5169(fil_aio_wait(unsigned long))[0x561e0466c461]
      srv/srv0start.cc:331(io_handler_thread)[0x561e047e0da8]
      pthread_create.c:0(start_thread)[0x7f3412b56ea5]
      /lib64/libc.so.6(clone+0x6d)[0x7f3411a309fd]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      Writing a core file...
      Working directory at /glide/mysqld/customecert_3400_peta/temp/restore_2021-10-08_1820332/customer_3400_db170011_s_2021-10-08_0612271
      Resource Limits:
      Fatal signal 11 while backtracing
      

      The customer managed to capture thread dumps and the mariabackup log, which I have attached to this ticket.

      Unfortunately, there was no core dump captured during said event.

      Furthermore, the customer was able to avoid the mariabackup crash during the restore by increasing the already sizeable "--use-memory" value from 32GB to 256GB (the invocation is sketched below). However, they did not see the following log message:

      [Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing innodb_buffer_pool_size. Pending flushes (fsync) log: 0; buffer pool: 0. 5129 OS file reads, 0 OS file writes, 0 OS fsyncs. 
      

      Therefore, it does not seem related to https://jira.mariadb.org/browse/MDEV-20679 as I had initially hoped.
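
      For reference, this is roughly what the prepare invocations would have looked like with the two settings. This is a sketch only: the target directory path is illustrative rather than the customer's actual restore directory, and only the "--use-memory" value is assumed to differ between the failing and the successful run.

      # Run that crashed during the prepare/apply phase:
      mariabackup --prepare --use-memory=32G --target-dir=/path/to/backup

      # Workaround run that completed:
      mariabackup --prepare --use-memory=256G --target-dir=/path/to/backup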

      If you require any other information, please let us know.

    People

      Assignee: Geoff Montee (Inactive)
      Reporter: Scott Sommerville (Inactive)
      Votes: 1
      Watchers: 9
