[MDEV-30613] output_core_info crashes in my_read() Created: 2023-02-08  Updated: 2023-10-19  Resolved: 2023-03-08

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.4
Fix Version/s: 10.11.3, 11.0.2, 10.4.29, 10.5.20, 10.6.13, 10.8.8, 10.9.6, 10.10.4

Type: Bug Priority: Major
Reporter: Vladislav Vaintroub Assignee: Daniel Black
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by MDEV-21582 mysqld crash with coredump on start up Closed

 Description   

Seen in 10.4; possibly present in other versions as well (not checked).

From
https://buildbot.mariadb.org/#/builders/168/builds/18664/steps/9/logs/stdio

gcol.innodb_virtual_fk_restart 'innodb'  w13 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2023-02-08 11:04:48
line
==213377==ERROR: LeakSanitizer: detected memory leaks
SUMMARY: AddressSanitizer: 608 byte(s) leaked in 6 allocation(s).
Attempting backtrace. You can use the following information to find out
^ Found warnings in /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/13/log/mysqld.1.err
ok
 - found 'core' (0/1)
Core generated by '/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mysqld'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
[New LWP 213377]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mysqld --defaults-group-suff'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fd47f54c2a7 in kill () from /lib/x86_64-linux-gnu/libc.so.6
#0  0x00007fd47f54c2a7 in kill () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00000000011473b0 in handle_fatal_signal (sig=<optimized out>) at signal_handler.cc:380
#2  <signal handler called>
#3  0x000000000261acf7 in my_read (Filedes=4, Buffer=0x7fff855daaa0 "Limit", ' ' <repeats 21 times>, "Soft Limit", ' ' <repeats 11 times>, "Hard Limit", ' ' <repeats 11 times>, "Units     \nMax cpu time", ' ' <repeats 14 times>, "unlimited", ' ' <repeats 12 times>, "unlimited", ' ' <repeats 12 times>, "seconds   \nMax file size", ' ' <repeats 13 times>, "unlimited       "..., Count=4096, MyFlags=0) at my_read.c:63
#4  0x00000000011472b2 in output_core_info () at signal_handler.cc:73
#5  handle_fatal_signal (sig=<optimized out>) at signal_handler.cc:364
#6  <signal handler called>
#7  0x00007fd47f54bfb7 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x00007fd47f54d921 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#9  0x00000000007307f7 in __sanitizer::Abort() () at /home/brian/src/final/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:155
#10 0x000000000072f221 in __sanitizer::Die() () at /home/brian/src/final/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:58
#11 0x000000000073c228 in __lsan::HandleLeaks() () at /home/brian/src/final/llvm-project/compiler-rt/lib/lsan/lsan_common_linux.cpp:115
#12 0x0000000000739991 in DoLeakCheck () at /home/brian/src/final/llvm-project/compiler-rt/lib/lsan/lsan_common.cpp:614
#13 0x00007fd47f550161 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#14 0x00007fd47f55025a in exit () from /lib/x86_64-linux-gnu/libc.so.6
#15 0x0000000000748c6c in mysqld_exit (exit_code=0) at mysqld.cc:1964
#16 0x0000000000750841 in mysqld_main (argc=<optimized out>, argv=0x646c6975622f0001) at mysqld.cc:5996
#17 0x00007fd47f52ebf7 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#18 0x000000000069a47a in _start ()

The problem here is that output_core_info() uses my_read(), which may access thread-local storage variables and can dereference a null pointer if those variables are not initialized.

For example, the innocuous-looking line

      int got_errno= my_errno= errno;

in my_read() can dereference a null pointer, because my_errno expands to my_thread_var->thr_errno, which in turn calls my_pthread_getspecific() to obtain my_thread_var.
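To illustrate the failure mode, here is a minimal standalone model of that lookup. It is not the mysys code: struct thread_var_model and the model_* names are hypothetical, and a C11 _Thread_local pointer stands in for the my_pthread_getspecific() call. The point is only that the per-thread pointer is NULL until thread initialization has run (and again after thread teardown), so an unguarded ->thr_errno store crashes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the mysys per-thread state. */
struct thread_var_model
{
  int thr_errno;
};

/* Stand-in for the thread-specific slot; NULL until initialized. */
static _Thread_local struct thread_var_model *model_tls= NULL;

/* Models the lookup behind my_thread_var: may return NULL. */
static struct thread_var_model *model_thread_var(void)
{
  return model_tls;
}

/* Models per-thread init and teardown of the slot. */
static void model_thread_init(struct thread_var_model *storage)
{
  model_tls= storage;
}

static void model_thread_end(void)
{
  model_tls= NULL;
}

/*
  The unguarded form, model_thread_var()->thr_errno= e, is what
  "my_errno= errno" amounts to, and it dereferences NULL when the
  slot is empty. This guarded variant returns -1 instead.
*/
static int model_set_errno(int e)
{
  struct thread_var_model *var= model_thread_var();
  if (var == NULL)
    return -1;
  var->thr_errno= e;
  return 0;
}
```

In this model, calling model_set_errno() before model_thread_init() or after model_thread_end() returns -1, which corresponds to the state the signal handler can find itself in during exit().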

I think the solution might be to abandon the "my_" functions in the error handler and replace them with their POSIX equivalents, unless one can rewrite the "my_" functions in a safe manner rather than assuming that the mysys thread-local storage variable exists.
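A rough sketch of what the POSIX-only approach could look like, assuming the handler only needs to copy a file such as /proc/self/limits to an already open descriptor. The helper name copy_file_to_fd is hypothetical and this is not the actual patch; it uses only open(), read(), write() and close(), which POSIX lists as async-signal-safe, and it never touches mysys thread-local state.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
  Hypothetical helper: copy the file at 'path' to 'out_fd' using only
  async-signal-safe POSIX calls. Returns the number of bytes copied,
  or -1 on error. errno is restored so the interrupted code sees the
  value it had before the handler ran.
*/
static long copy_file_to_fd(const char *path, int out_fd)
{
  char buf[4096];
  long total= 0;
  int saved_errno= errno;
  int in_fd= open(path, O_RDONLY);

  if (in_fd < 0)
  {
    errno= saved_errno;
    return -1;
  }

  for (;;)
  {
    ssize_t off= 0;
    ssize_t len= read(in_fd, buf, sizeof(buf));
    if (len < 0)
    {
      if (errno == EINTR)
        continue;                       /* interrupted, retry the read */
      total= -1;
      break;
    }
    if (len == 0)
      break;                            /* end of file */

    /* write() may be short, so loop until the chunk is flushed. */
    while (off < len)
    {
      ssize_t written= write(out_fd, buf + off, (size_t) (len - off));
      if (written < 0)
      {
        if (errno == EINTR)
          continue;
        total= -1;
        goto done;
      }
      off+= written;
    }
    total+= len;
  }

done:
  close(in_fd);
  errno= saved_errno;
  return total;
}
```

A handler could then call something like copy_file_to_fd("/proc/self/limits", STDERR_FILENO). Plain errno is itself thread-local, but it is provided by libc and is always valid, unlike the mysys variable.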



 Comments   
Comment by Daniel Black [ 2023-02-08 ]

Thanks. I'd looked over this fault numerous times over the years without spotting it, though I was recently bitten by it.

Comment by Otto Kekäläinen [ 2023-02-12 ]

This was referenced in the discussion at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1030510 about s390x failing to run on Debian buildd hosts (i.e. zero tests passed). However, it is likely that the actual root cause in that case is a kernel bug Daniel reported in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1020831.

Generated at Thu Feb 08 10:17:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.