[MDEV-32650] mariabackup fails with: Error on mysqld got signal 11 Created: 2023-11-01  Updated: 2023-12-11

Status: Open
Project: MariaDB Server
Component/s: Backup, mariabackup
Affects Version/s: 10.4.31, 10.6.15
Fix Version/s: 10.4, 10.6

Type: Bug Priority: Major
Reporter: Daniel Kern Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: Zip Archive core-mariabackup-11-0-0-3227534-1696468987.zip    

 Description   

I can't create a full backup of MariaDB 10.6.
I get this error:

[00] 2023-11-01 10:55:34 Finished backing up non-InnoDB tables and files
231101 10:55:34 [ERROR] mysqld got signal 11 ;

Here is the command I run:

/usr/bin/mariabackup   --verbose --backup  --extra-lsndir=$TARGETDIR  --user=mariabackup --password=xxx--stream=xbstream | gzip > $TARGETDIR/stream.xb.gz

mariabackup version:

# /usr/bin/mariabackup -v
/usr/bin/mariabackup based on MariaDB server 10.6.15-MariaDB Linux (x86_64)

Here is the SHOW ENGINE INNODB STATUS output:

| InnoDB |      |
=====================================
2023-11-01 11:27:15 0x7f4271b5f700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 34 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 1 srv_active, 0 srv_shutdown, 10554279 srv_idle
srv_master_thread log flush and writes: 10553777
----------
SEMAPHORES
----------
------------
TRANSACTIONS
------------
Trx id counter 61759031
Purge done for trx's n:o < 61759031 undo n:o < 0 state: running but idle
History list length 123
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION (0x7f428111a680), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f4281119b80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f428111c780), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7f428111d280), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
--------
FILE I/O
--------
Pending flushes (fsync) log: 0; buffer pool: 0
45243989 OS file reads, 15247627 OS file writes, 12850070 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.18 writes/s, 0.18 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 67880283194
Log flushed up to   67880283194
Pages flushed up to 67873972268
Last checkpoint at  67873972268
0 pending log flushes, 0 pending chkp writes
12803704 log i/o's done, 0.18 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 167772160
Dictionary memory allocated 1765416
Buffer pool size   8112
Free buffers       793
Database pages     7311
Old database pages 2678
Modified db pages  1333
Percent of dirty pages(LRU & free pages): 16.447
Max dirty pages percent: 90.000
Pending reads 0
Pending writes: LRU 0, flush list 0
Pages made young 160711, not young 774607565
0.00 youngs/s, 0.00 non-youngs/s
Pages read 45243499, created 2131739, written 2437613
0.00 reads/s, 0.03 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 7311, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 read views open inside InnoDB
Process ID=0, Main thread ID=0, state: sleeping
Number of rows inserted 10573210, updated 3121656, deleted 1081, read 1646766094
0.09 inserts/s, 0.00 updates/s, 0.00 deletes/s, 278.26 reads/s
Number of system rows inserted 0, updated 0, deleted 0, read 0
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s

Backtrace:

stack_bottom = 0x0 thread_stack 0x49000
/usr/bin/mariabackup(my_print_stacktrace+0x2e)[0x556df5d7887e]
/usr/bin/mariabackup(handle_fatal_signal+0x485)[0x556df58c3f85]
/lib64/libpthread.so.0(+0x12ce0)[0x7f87afe77ce0]
/lib64/libc.so.6(opendir+0x4)[0x7f87af27af94]
/usr/bin/mariabackup(+0x703588)[0x556df554b588]
/usr/bin/mariabackup(_Z12backup_startP7ds_ctxtS0_R14CorruptedPages+0x10d)[0x556df554eedd]
/usr/bin/mariabackup(+0x6ef20e)[0x556df553720e]
/usr/bin/mariabackup(main+0x17c)[0x556df54d837c]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7f87af1bfcf3]
/usr/bin/mariabackup(_start+0x2e)[0x556df55264ae]
 
Resource limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             1543820              1543820              processes
Max open files            1024                 262144               files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       1543820              1543820              signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us



 Comments   
Comment by Marko Mäkelä [ 2023-11-01 ]

It would seem to me that opendir() is being invoked on an invalid pointer to a string. In the source code, the call is os_file_opendir() via a simple macro:

# define os_file_opendir(dirname) opendir(dirname)

I would guess that this call is in backup_files(), which is called by backup_start(), which is visible in the stack trace.
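
To illustrate the failure mode described above: opendir() reads the string it is given, so a NULL pointer crashes inside libc rather than returning an error. The following is only a minimal sketch of a defensive wrapper. The name checked_opendir is hypothetical and does not exist in the MariaDB source tree, and a check like this can catch only NULL or empty names, not a dangling pointer.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper, not part of the MariaDB source tree.
   Rejects NULL and empty names before handing them to opendir(),
   so the caller gets an error return instead of a SIGSEGV in libc. */
static DIR *checked_opendir(const char *dirname)
{
    if (dirname == NULL || dirname[0] == '\0') {
        errno = ENOENT;   /* same errno opendir() uses for an empty name */
        return NULL;
    }
    return opendir(dirname);
}
```

A caller would test the result for NULL and report errno, which would turn this kind of crash into a diagnosable error message.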

Comment by Sergei Golubchik [ 2023-11-02 ]

I'm sorry, I wasn't able to repeat this failure. I've run your command line and didn't get a crash. I must be missing some important detail. Could you please help to create a repeatable test case?

Comment by Daniel Kern [ 2023-11-05 ]

Sorry, it is not clear to me what else needs to be added. Every time I run the backup I get the same error. I am attaching a core file.

Comment by Daniel Black [ 2023-11-05 ]

Thanks for the core file. Which distro/version is the mariadb-backup from? Which version of mariadb-backup was used to generate this core?

Comment by Daniel Kern [ 2023-11-06 ]

Server version: 10.6.15-MariaDB

[root@~]# mariabackup --version
mariabackup based on MariaDB server 10.6.15-MariaDB Linux (x86_64)

 
# cat /etc/system-release
AlmaLinux release 8.6 (Sky Tiger)

[root@~]# rpm -qa|grep -i maria
MariaDB-server-10.6.10-1.el8.x86_64
MariaDB-shared-10.6.10-1.el8.x86_64
MariaDB-backup-10.6.15-1.el8.x86_64
MariaDB-common-10.6.10-1.el8.x86_64
MariaDB-client-10.6.10-1.el8.x86_64

Could the issue be that mariabackup is a slightly different version than the MariaDB server?

Comment by Marko Mäkelä [ 2023-11-06 ]

dankern, the mariadb-backup version should be the same major version as the server being backed up. Are you using the backup from 10.6.15 against a 10.4.31 server? That is not expected to work, because the redo log record format was changed in 10.5 and the logic around DDL operations was changed in 10.6. If that is the case, the bug would be that the backup crashes in an obscure way instead of reporting that it is not compatible with the server.

Comment by Daniel Kern [ 2023-11-06 ]

No, I am not doing that. I am using MariaDB-backup-10.6.15 against MariaDB-server-10.6.10.

Is that a problem?

Comment by Marko Mäkelä [ 2023-11-06 ]

Minor version differences within a major version (such as 10.6) are supposed to be fine.
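
The compatibility rule discussed in this thread (same release series such as 10.6, any patch level) can be sketched as a small check. This is an illustration only: same_release_series is a hypothetical helper, not something mariabackup provides, and it assumes version strings that begin with two dot-separated numbers, as in the output quoted in this ticket.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns true when both version strings share the
   same release series, i.e. the first two numeric components (e.g. 10.6).
   Anything after those components (patch level, "-MariaDB") is ignored. */
static bool same_release_series(const char *a, const char *b)
{
    unsigned a_major, a_minor, b_major, b_minor;

    if (a == NULL || b == NULL)
        return false;
    if (sscanf(a, "%u.%u", &a_major, &a_minor) != 2)
        return false;   /* not in the expected N.N... form */
    if (sscanf(b, "%u.%u", &b_major, &b_minor) != 2)
        return false;
    return a_major == b_major && a_minor == b_minor;
}
```

With the versions reported here, "10.6.15-MariaDB" and "10.6.10" compare as the same series, while "10.6.15" and "10.4.31" do not.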

Comment by Daniel Kern [ 2023-11-07 ]

Ok, so any idea what could be the issue?
I consistently get this error during the backup:
[00] 2023-11-01 10:55:34 Finished backing up non-InnoDB tables and files
231101 10:55:34 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

Comment by Sergei Golubchik [ 2023-12-11 ]

Unfortunately, I couldn't get a properly resolved stack trace from this core dump either.
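
For reference, the unresolved frames in the reported backtrace have the form module(+0xOFFSET)[address]. A minimal sketch of extracting the module path and offset from such a line follows. The function name parse_unresolved_frame is hypothetical, and the sketch assumes the frame format shown in this ticket.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses a frame such as
   "/usr/bin/mariabackup(+0x703588)[0x556df554b588]".
   On success, copies the module path into `module`, stores the offset
   in `*offset`, and returns 1. Returns 0 for frames that already carry
   a symbol name (e.g. "(opendir+0x4)") or that do not match the format. */
static int parse_unresolved_frame(const char *line, char *module,
                                  size_t module_len, unsigned long *offset)
{
    const char *open = strchr(line, '(');
    if (open == NULL || open[1] != '+')
        return 0;

    size_t len = (size_t)(open - line);
    if (len == 0 || len >= module_len)
        return 0;   /* empty module name, or buffer too small */

    char *end;
    unsigned long value = strtoul(open + 2, &end, 16);
    if (end == open + 2 || *end != ')')
        return 0;   /* no hex digits, or missing closing parenthesis */

    memcpy(module, line, len);
    module[len] = '\0';
    *offset = value;
    return 1;
}
```

Given debug symbols that match the exact build, an offset obtained this way could be passed to a tool such as addr2line (for example, addr2line -f -C -e /usr/bin/mariabackup 0x703588) to name the function.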

Generated at Thu Feb 08 10:32:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.