[MDEV-21293] mariabackup crashes when not using full path
Created: 2019-12-11
Updated: 2023-04-27

Status: Open
Project: MariaDB Server
Component/s: Backup
Affects Version/s: 10.2.9, 10.3.2, 10.4.0, 10.4.10
Fix Version/s: 10.4

Type: Bug
Priority: Major
Reporter: László Károlyi
Assignee: Marko Mäkelä
Resolution: Unresolved
Votes: 1
Labels: backup, crash
Environment: OSX Catalina, Homebrew binary & source compiled

Issue Links:
Problem/Incident
is caused by MDEV-13466 Implement --export option for MariaDB... Closed
Relates
relates to MDEV-14453 mariabackup coredump for 10.2.10 Closed

 Description   

Hi,

this is the same bug that occurred in https://jira.mariadb.org/browse/MDEV-14453. Here's the transcript from IRC #maria:

18:22 < karolyi> hey guys, mariabackup is coredumping for me on Catalina: 'mariabackup -prepare --target...' terminated by signal SIGABRT (Abort)
18:22 < karolyi> does anyone have an idea as to what this could be?
18:24 < karolyi> I'm using 10.4.10 from homebrew, manually compiled, but the binary version segfaults too
18:27 < karolyi> what da...
18:27 < karolyi> mariabackup --help
18:27 < karolyi> fish: 'mariabackup --help' terminated by signal SIGABRT (Abort)
18:27 < karolyi> but when I call it with its full path, /usr/local/bin/mariabackup --help, works
18:28 < karolyi> what sorcery is this
18:32 < Xgc> karolyi: which mariabackup will show the implicit path.
18:32 < karolyi> with the binary package I get: '/usr/local/bin/mariabackup --he...' terminated by signal SIGSEGV (Address boundary error)
18:32 < karolyi> Xgc: it was that path all along
18:33 < karolyi> I bet this is something Catalina related
18:33 < karolyi> okay, reinstall from source...
18:33 < Xgc> karolyi: Hmmm... That suggests a logic problem in the application. Sometimes the application will take the 0th argument (the name) for some processing. If name contains the full path, maybe some error is avoided.
18:34 < Xgc> At first it seemed like a bad installation (wrong package installed).
18:35 < karolyi> Xgc: the one in /usr/local/bin is a symlink: lrwxr-xr-x 1 laszlokarolyi admin 43 Dec 11 18:07 /usr/local/bin/mariabackup -> ../Cellar/mariadb/10.4.10_1/bin/mariabackup
18:35 < Xgc> or missing prerequisites.
18:35 < karolyi> but interestingly, the binary version fails even when called with full path
18:36 < karolyi> let me try and call it with the path in Cellar
18:36 < karolyi> fish: '/usr/local/Cellar/mariadb/10.4....' terminated by signal SIGSEGV (Address boundary error)
18:37 < karolyi> bash-3.2$ /usr/local/Cellar/mariadb/10.4.10_1/bin/mariabackup --help
18:37 < karolyi> Segmentation fault: 11
18:37 < karolyi> building from source again
18:44 < karolyi> btw I use mbstream and mariabackup extensively to restore previously loaded DBs for development
18:44 < karolyi> so this is why it's a PITA
18:44 < karolyi> https://gist.github.com/karolyi/48e0e7909ae67e9d8f2503bfc3c79225
18:58 < karolyi> okay, so it only works when called with a full path AND compiled from source (binary package segfaults as shown above)

Seemingly the bug still persists.



 Comments   
Comment by Marko Mäkelä [ 2019-12-12 ]

karolyi, can you produce a stack trace for the crash? I realize that invoking the program in gdb might use a full pathname to the executable, but perhaps you can try that nevertheless?

Another option would be to invoke as

mariabackup --core-file --help

and hope that the core dump will be enabled before the unknown option --help is being processed. Then, you would have to do

gdb -ex 'set height 0' -ex 'thread apply all backtrace' -ex quit /usr/local/bin/mariabackup core

If all this fails, then as a last resort I would ask you to attach the output of

strace mariabackup --help

so that we would get some idea where it crashes.

Finally, from which binary package was /usr/local/bin/mariabackup installed? We might need a build log of that.

Comment by László Károlyi [ 2019-12-12 ]

Hey,

I'm not sure I have `strace` on a MacBook, much less GDB, but I'll try to get some debug output when I get back there.

As for the package question, I believe homebrew uses the following script to build the package:
https://github.com/Homebrew/homebrew-core/blob/master/Formula/mariadb.rb

Comment by László Károlyi [ 2019-12-12 ]

Alright, I'm back on my mac again.

GDB will run the binary with the full path and it will work.

mariabackup --core-file --help doesn't create a core file; it crashes before it gets to that point.

The strace equivalent on OSX is dtruss, which gives the following output:

{{

Laszlos-MBP:~ root# dtruss mariabackup --help
dtrace: system integrity protection is on, some features will not be available

SYSCALL(args) = return
open("/dev/dtracehelper\0", 0x2, 0xC91C000) = 3 0
ioctl(0x3, 0x80086804, 0x7FFEE32E3050) = 0 0
close(0x3) = 0 0
mprotect(0x10E45A000, 0x29000, 0x1) = 0 0
mprotect(0x10E291000, 0x8000, 0x1) = 0 0
mprotect(0x10D47E000, 0x115000, 0x1) = 0 0
access("/AppleInternal/XBS/.isChrooted\0", 0x0, 0x0) = -1 2
bsdthread_register(0x7FFF6FFB382C, 0x7FFF6FFB3818, 0x2000) = 1073742047 0
sysctlbyname(kern.bootargs, 0xD, 0x7FFEE32E2040, 0x7FFEE32E2030, 0x0) = 0 0
issetugid(0x0, 0x0, 0x0) = 0 0
ioctl(0x2, 0x4004667A, 0x7FFEE32E2384) = 0 0
mprotect(0x10E4EB000, 0x1000, 0x0) = 0 0
mprotect(0x10E4F2000, 0x1000, 0x0) = 0 0
mprotect(0x10E4F3000, 0x1000, 0x0) = 0 0
mprotect(0x10E4FA000, 0x1000, 0x0) = 0 0
mprotect(0x10E1F2000, 0x90, 0x1) = 0 0
mprotect(0x10E1CD000, 0x1000, 0x1) = 0 0
mprotect(0x10E1F2000, 0x90, 0x3) = 0 0
mprotect(0x10E1F2000, 0x90, 0x1) = 0 0
getentropy(0x7FFEE32E1740, 0x20, 0x0) = 0 0
getentropy(0x7FFEE32E1790, 0x40, 0x0) = 0 0
getpid(0x0, 0x0, 0x0) = 11765 0
stat64("/AppleInternal\0", 0x7FFEE32E24B0, 0x0) = -1 2
csops_audittoken(0x2DF5, 0x7, 0x7FFEE32E2000) = -1 22
proc_info(0x2, 0x2DF5, 0xD) = 64 0
csops_audittoken(0x2DF5, 0x7, 0x7FFEE32E1880) = -1 22
dtrace: error on enabled probe ID 2210 (ID 572: syscall::sysctl:return): invalid kernel access in action #10 at DIF offset 28
stat64("/\0", 0x7FFEE32E0810, 0x0) = 0 0
open_nocancel(".\0", 0x0, 0x1) = 3 0
fstat64(0x3, 0x7FFEE32E0610, 0x0) = 0 0
fcntl_nocancel(0x3, 0x32, 0x7FFEE32E25C0) = 0 0
close_nocancel(0x3) = 0 0
stat64("/private/var/root\0", 0x7FFEE32E0580, 0x0) = 0 0
stat64("/private/var/root\0", 0x7FFEE32E0810, 0x0) = 0 0
getattrlist("/private/var/root/mariabackup\0", 0x7FFF6FE7F944, 0x7FFEE32E2160) = -1 2
open_nocancel(".\0", 0x0, 0x1) = 3 0
fstat64(0x3, 0x7FFEE32E2140, 0x0) = 0 0
fcntl_nocancel(0x3, 0x32, 0x7FFEE32E1CB0) = 0 0
close_nocancel(0x3) = 0 0
stat64("/private/var/root\0", 0x7FFEE32E20B0, 0x0) = 0 0
sigaction(0xE, 0x7FFEE32E2988, 0x0) = 0 0
sigaction(0xD, 0x7FFEE32E2988, 0x0) = 0 0
sigaction(0xF, 0x7FFEE32E2988, 0x0) = 0 0
sigaction(0x1, 0x7FFEE32E2988, 0x0) = 0 0
sigprocmask(0x3, 0x7FFEE32E29D4, 0x0) = 0x0 0
__pthread_sigmask(0x3, 0x7FFEE32E29D4, 0x0) = 0 0
sigprocmask(0x2, 0x7FFEE32E28FC, 0x0) = 0x0 0
abort_with_payload(0x12, 0x4, 0x0) = 0 0

}}

Hope this helps.

Comment by Marko Mäkelä [ 2019-12-12 ]

karolyi, can you try to set a breakpoint on sigaction and then remove those calls? I would expect SIGABRT to create a core dump by default. Then we should hopefully get to the right point.

For the record, on my Debian GNU/Linux system, I just tried invoking mariabackup that was built using cmake -DWITH_ASAN=ON, and I see no crash for

ASAN_OPTIONS=abort_on_error=1 PATH=/dev/shm/10.5/extra/mariabackup:"$PATH" mariabackup --help
echo $?

The exit code is 1, and I get a list of parameters and their values as output. With a signal, the exit code ought to be something like 128+SIGABRT = 134.

Comment by László Károlyi [ 2019-12-12 ]

I'm not experienced enough with debugging and setting breakpoints on already compiled stuff on OSX. If you can tell me how, I can try.

Comment by Marko Mäkelä [ 2019-12-12 ]

An alternative solution would be to add a sleep to the start of the main() function and then

mariabackup --help &
gdb -p $(pgrep mariabackup)

to attach the debugger to the process before it has a chance of crashing. Then we should get a proper stack trace of the crash.

Comment by László Károlyi [ 2019-12-12 ]

Unfortunately, attaching GDB right after starting the process the way you described doesn't help. The process quits before GDB can attach to it.

Comment by Marko Mäkelä [ 2019-12-12 ]

Did you try to add a call to sleep at the start of the main() function? man 3 sleep tells me:

#include <unistd.h>
unsigned int sleep(unsigned int seconds);

Comment by László Károlyi [ 2019-12-12 ]

I managed to put a my_sleep(1000000) into main() and copied the compiled file in place.

GDB can't attach, here's the message:

Attaching to process 30804
Unable to find Mach task port for process-id 30804: (os/kern) failure (0x5).
(please check gdb is codesigned - see taskgated(8))

The newly compiled file still crashes.

Comment by László Károlyi [ 2019-12-12 ]

Sorry, my bad, GDB has to be started as root. I see the following:

Attaching to process 31669
[New Thread 0x1003 of process 31669]
Error calling thread_get_state for GP registers for thread 0x1003

warning: Mach error at "i386-darwin-nat.c:132" in function "virtual void i386_darwin_nat_target::fetch_registers(struct regcache *, int)": (os/kern) invalid argument (0x4)
Reading symbols from /usr/local/Cellar/mariadb/10.4.10_1/bin/mariabackup...

warning: unhandled dyld version (16)
0x00007fff6fef75be in ?? ()
(gdb)

Comment by Marcin Gryszkalis [ 2020-01-14 ]

I debugged this problem on FreeBSD. It doesn't happen on Linux, because there /proc/self/exe is used, which always works.
It crashes in my_realpath:

#0  0x000000000161a637 in my_realpath (to=0x1a2e180 <mariabackup_exe> "", filename=0x7fffffffedb0 "mxb", MyFlags=0) at my_symlink.c:158
#1  0x00000000009a14d7 in get_exepath (size=<optimized out>, argv0=0x50ce9f "error", buf=<optimized out>) at xtrabackup.cc:6391
#2  main (argc=1, argv=0x7fffffffeb48) at xtrabackup.cc:6091
 
158         my_errno=errno;

The call to realpath(3) fails, and the assignment above is invalid because my_errno is not a plain int; it is a macro defined as

include/my_pthread.h:#define my_errno my_thread_var->thr_errno
include/my_pthread.h:#define my_thread_var (_my_thread_var())

However, the code above runs before threads are initialized, so my_thread_var is unusable at that point.

Comment by Marko Mäkelä [ 2020-01-16 ]

marcin.gryszkalis, thank you for your analysis! Could you please submit a pull request for MariaDB Server 10.2 to fix it? That seems to be the earliest affected version.

Comment by Dmitry Petrov [ 2020-10-28 ]

So, will this annoying bug ever be fixed?

Generated at Thu Feb 08 09:06:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.