[MDEV-20624] MariaDB crash if invoked by root Created: 2019-09-18  Updated: 2019-12-12  Resolved: 2019-12-12

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4.8
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Marcelo Altmann Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

Linux marcelo-altmann--standalone-1 4.20.13-1.el7.elrepo.x86_64 #1 SMP Wed Feb 27 10:02:05 EST 2019 x86_64 x86_64 x86_64 GNU/Linux

cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)



 Description   

Starting mysqld as the root user causes the server to segfault:

[root@marcelo-altmann--standalone-1 mysql]# mysqld
2019-09-18 18:11:37 0 [Note] mysqld (mysqld 10.4.8-MariaDB) starting as process 5446 ...
mysqld: Please consult the Knowledge Base to find out how to run mysqld as root!
2019-09-18 18:11:37 0 [ERROR] Aborting
190918 18:11:37 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.4.8-MariaDB
key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_threads=153
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 336663 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
addr2line: 'mysqld': No such file
mysqld(my_print_stacktrace+0x2e)[0x5562330cd83e]
mysqld(handle_fatal_signal+0x30f)[0x556232b63e0f]
sigaction.c:0(__restore_rt)[0x7f767e1f15d0]
addr2line: 'mysqld': No such file
mysqld(unireg_abort+0x373)[0x5562328932b3]
mysqld(_Z11mysqld_mainiPPc+0x963)[0x55623289a323]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f767c4af3d5]
addr2line: 'mysqld': No such file
mysqld(+0x5cebd4)[0x55623288dbd4]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             unlimited            unlimited            processes
Max open files            1048576              1048576              files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       1542427              1542427              signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: core
 
Segmentation fault



 Comments   
Comment by Elena Stepanova [ 2019-09-18 ]

It says "mysqld: Please consult the Knowledge Base to find out how to run mysqld as root".
Have you consulted the Knowledge Base to find out how to run mysqld as root?

Comment by Marcelo Altmann [ 2019-09-18 ]

Hi Elena,

Thanks for getting back to me. The reason for reporting this is not that the server doesn't come up. The reason is that it segfaults (and generates a core dump) instead of shutting down cleanly.

Comment by Elena Stepanova [ 2019-09-18 ]

Yes, it is a strange aftermath. Usually it ends with Aborting.
Does it also happen if you run with --no-defaults? If not, could you please provide your config files which mysqld picks up from default locations?

Comment by Marcelo Altmann [ 2019-09-18 ]

I've been able to reduce it to 3 configuration options. It seems to be related to Galera:

mysqld --no-defaults --wsrep_on=ON --wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so --binlog_format=ROW

Comment by Elena Stepanova [ 2019-09-18 ]

Thanks for the info.

Comment by Isaac Gremmer [ 2019-10-28 ]

Just chiming in that we are seeing the same issue. Galera works great until shutting down a node, and then it segfaults.

Monday, October 28th 2019 @ 12:16:41 pm 2019/10/28 17:16:41 shutdown requested
Monday, October 28th 2019 @ 12:16:41 pm 2019/10/28 17:16:41 initiating shutdown (SIGTERM) for mariadb
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] mysqld (initiated by: unknown): Normal shutdown
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: Shutdown replication
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: Server status change synced -> disconnecting
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: Closing send monitor...
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: Closed send monitor.
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: gcomm: terminating thread
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: gcomm: joining thread
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: gcomm: closing backend
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: view(view_id(NON_PRIM,06f74a82,33) memb {
Monday, October 28th 2019 @ 12:16:41 pm 06f74a82,101
Monday, October 28th 2019 @ 12:16:41 pm } joined {
Monday, October 28th 2019 @ 12:16:41 pm } left {
Monday, October 28th 2019 @ 12:16:41 pm } partitioned {
Monday, October 28th 2019 @ 12:16:41 pm 08482b83,100
Monday, October 28th 2019 @ 12:16:41 pm 27efe653,100
Monday, October 28th 2019 @ 12:16:41 pm })
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: PC protocol downgrade 1 -> 0
Monday, October 28th 2019 @ 12:16:41 pm 2019-10-28 17:16:41 0 [Note] WSREP: view((empty))
Monday, October 28th 2019 @ 12:16:41 pm 191028 17:16:41 [ERROR] mysqld got signal 11 ;
Monday, October 28th 2019 @ 12:16:41 pm This could be because you hit a bug. It is also possible that this binary
Monday, October 28th 2019 @ 12:16:41 pm or one of the libraries it was linked against is corrupt, improperly built,
Monday, October 28th 2019 @ 12:16:41 pm or misconfigured. This error can also be caused by malfunctioning hardware.
Monday, October 28th 2019 @ 12:16:41 pm To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Monday, October 28th 2019 @ 12:16:41 pm We will try our best to scrape up some info that will hopefully help
Monday, October 28th 2019 @ 12:16:41 pm diagnose the problem, but since we have already crashed,
Monday, October 28th 2019 @ 12:16:41 pm something is definitely wrong and this may fail.
Monday, October 28th 2019 @ 12:16:41 pm Server version: 10.4.8-MariaDB-1:10.4.8+maria~bionic-log
Monday, October 28th 2019 @ 12:16:41 pm key_buffer_size=134217728
Monday, October 28th 2019 @ 12:16:41 pm read_buffer_size=2097152
Monday, October 28th 2019 @ 12:16:41 pm max_used_connections=2
Monday, October 28th 2019 @ 12:16:41 pm max_threads=1002
Monday, October 28th 2019 @ 12:16:41 pm thread_count=11
Monday, October 28th 2019 @ 12:16:41 pm It is possible that mysqld could use up to
Monday, October 28th 2019 @ 12:16:41 pm key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6311962 K bytes of memory
Monday, October 28th 2019 @ 12:16:41 pm Hope that's ok; if not, decrease some variables in the equation.
Monday, October 28th 2019 @ 12:16:41 pm Thread pointer: 0x0
Monday, October 28th 2019 @ 12:16:41 pm Attempting backtrace. You can use the following information to find out
Monday, October 28th 2019 @ 12:16:41 pm where mysqld died. If you see no messages after this, something went
Monday, October 28th 2019 @ 12:16:41 pm terribly wrong...
Monday, October 28th 2019 @ 12:16:41 pm stack_bottom = 0x0 thread_stack 0x49000
Monday, October 28th 2019 @ 12:16:41 pm mysqld(my_print_stacktrace+0x2e)[0x55c08df9dfae]
Monday, October 28th 2019 @ 12:16:41 pm mysqld(handle_fatal_signal+0x515)[0x55c08da13185]
Monday, October 28th 2019 @ 12:16:41 pm /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f520bc4b890]
Monday, October 28th 2019 @ 12:16:41 pm /lib/x86_64-linux-gnu/libc.so.6(cfree+0x2b1)[0x7f520a5d8c01]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(_ZN5gcomm3evs8InputMapD1Ev+0x3c)[0x7f51dd8fadac]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(_ZN5gcomm3evs5ProtoD2Ev+0x133)[0x7f51dd90be33]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(_ZN5gcomm3evs5ProtoD0Ev+0x9)[0x7f51dd90c819]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(_ZN5gcomm2PCD2Ev+0x52)[0x7f51dd9585b2]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(_ZN5gcomm2PCD0Ev+0x9)[0x7f51dd958c89]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(_ZN9GCommConn5closeEb+0x255)[0x7f51dd9dc565]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(+0x1d3b65)[0x7f51dd9d6b65]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(_Z14gcs_core_closeP8gcs_core+0x36)[0x7f51dd9c3616]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(+0x1c3b6e)[0x7f51dd9c6b6e]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(_Z9gcs_closeP8gcs_conn+0x50)[0x7f51dd9c7430]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM5closeEv+0x4a)[0x7f51dda3945a]
Monday, October 28th 2019 @ 12:16:41 pm /usr/lib/galera/libgalera_smm.so(galera_disconnect+0x2b)[0x7f51dda5d33b]
Monday, October 28th 2019 @ 12:16:42 pm mysqld(_ZN5wsrep18wsrep_provider_v2610disconnectEv+0x11)[0x55c08e024571]
Monday, October 28th 2019 @ 12:16:42 pm mysqld(_ZN5wsrep12server_state10disconnectEv+0x92)[0x55c08e013532]
Monday, October 28th 2019 @ 12:16:42 pm mysqld(_Z26wsrep_shutdown_replicationv+0xa5)[0x55c08d97a315]
Monday, October 28th 2019 @ 12:16:42 pm mysqld(_Z11mysqld_mainiPPc+0x1ca5)[0x55c08d747ab5]
Monday, October 28th 2019 @ 12:16:42 pm /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f520a562b97]
Monday, October 28th 2019 @ 12:16:42 pm mysqld(_start+0x2a)[0x55c08d739f6a]
Monday, October 28th 2019 @ 12:16:42 pm The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Monday, October 28th 2019 @ 12:16:42 pm information that should help you find out what is causing the crash.
Monday, October 28th 2019 @ 12:16:42 pm Writing a core file...
Monday, October 28th 2019 @ 12:16:42 pm Working directory at /var/lib/mysql
Monday, October 28th 2019 @ 12:16:42 pm Resource Limits:
Monday, October 28th 2019 @ 12:16:42 pm Limit Soft Limit Hard Limit Units
Monday, October 28th 2019 @ 12:16:42 pm Max cpu time unlimited unlimited seconds
Monday, October 28th 2019 @ 12:16:42 pm Max file size unlimited unlimited bytes
Monday, October 28th 2019 @ 12:16:42 pm Max data size unlimited unlimited bytes
Monday, October 28th 2019 @ 12:16:42 pm Max stack size 8388608 unlimited bytes
Monday, October 28th 2019 @ 12:16:42 pm Max core file size unlimited unlimited bytes
Monday, October 28th 2019 @ 12:16:42 pm Max resident set unlimited unlimited bytes
Monday, October 28th 2019 @ 12:16:42 pm Max processes unlimited unlimited processes
Monday, October 28th 2019 @ 12:16:42 pm Max open files 1048576 1048576 files
Monday, October 28th 2019 @ 12:16:42 pm Max locked memory 65536 65536 bytes
Monday, October 28th 2019 @ 12:16:42 pm Max address space unlimited unlimited bytes
Monday, October 28th 2019 @ 12:16:42 pm Max file locks unlimited unlimited locks
Monday, October 28th 2019 @ 12:16:42 pm Max pending signals 256878 256878 signals
Monday, October 28th 2019 @ 12:16:42 pm Max msgqueue size 819200 819200 bytes
Monday, October 28th 2019 @ 12:16:42 pm Max nice priority 0 0
Monday, October 28th 2019 @ 12:16:42 pm Max realtime priority 0 0
Monday, October 28th 2019 @ 12:16:42 pm Max realtime timeout unlimited unlimited us
Monday, October 28th 2019 @ 12:16:42 pm Core pattern: |/usr/share/apport/apport %p %s %c %d %P
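
The backtrace above lists its frames by mangled C++ names. As a reading aid, here is a minimal sketch that decodes such names with abi::__cxa_demangle from <cxxabi.h>, assuming a GCC or Clang toolchain. The helper name demangle is illustrative and is not part of MariaDB or Galera.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Illustrative helper (not from MariaDB or Galera): decode an Itanium C++ ABI
// mangled symbol name. Returns the input unchanged if it cannot be decoded.
std::string demangle(const std::string& mangled) {
  int status = 0;
  // With a null output buffer, __cxa_demangle allocates the result with malloc.
  char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) {
    return mangled;
  }
  std::string result(out);
  std::free(out);  // the caller owns the malloc'ed buffer
  return result;
}
```

With this helper, _ZN5gcomm3evs8InputMapD1Ev decodes to gcomm::evs::InputMap::~InputMap() and _ZN9GCommConn5closeEb to GCommConn::close(bool). Read that way, the trace suggests the fault happens in free() while Galera destroys its gcomm objects during disconnect.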

Comment by Isaac Gremmer [ 2019-10-29 ]

I switched to using evs_version = 0, and it seems to be better. It's not segfaulting on shutdown of the nodes. It may be an issue with evs_version = 1 or the attempt at downgrading from 1 to 0 ("WSREP: PC protocol downgrade 1 -> 0" in the log trace above).

Comment by Teemu Ollakka [ 2019-10-30 ]

I was able to reproduce the crash when running mysqld as root:

(gdb) bt
#0  unireg_abort (exit_code=exit_code@entry=1)
    at /usr/src/debug/MariaDB-10.4.8/src_0/sql/mysqld.cc:1882
#1  0x0000555555b2f323 in set_effective_user (user_info_arg=<optimized out>, 
    user_info_arg=<optimized out>)
    at /usr/src/debug/MariaDB-10.4.8/src_0/sql/mysqld.cc:2256
#2  mysqld_main (argc=8, argv=0x555557639808)
    at /usr/src/debug/MariaDB-10.4.8/src_0/sql/mysqld.cc:5710
#3  0x00007ffff5e8d3d5 in __libc_start_main (main=
    0x555555afa510 <main(int, char**)>, argc=1, argv=0x7fffffffe5a8, 
    init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, 
    stack_end=0x7fffffffe598) at ../csu/libc-start.c:266
#4  0x0000555555b22bd4 in _start ()
(gdb) li
1877	  /* Don't write more notes to the log to not hide error message */
1878	  disable_log_notes= 1;
1879	
1880	#ifdef WITH_WSREP
1881	  if (WSREP_ON &&
1882	      Wsrep_server_state::instance().state() != wsrep::server_state::s_disconnected)
1883	  {
1884	    /*
1885	      This is an abort situation, we cannot expect to gracefully close all
1886	      wsrep threads here, we can only diconnect from service

The crash is caused by accessing an uninitialized Wsrep_server_state in unireg_abort(). This has been fixed in https://github.com/MariaDB/server/commit/9bacc9d0c1957650374951637dcfd42cd09c5f5f
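
To illustrate the failure mode described above, here is a minimal sketch of an abort path that checks whether a lazily created singleton exists before querying its state. It is not the actual patch from the linked commit. ServerState, is_inited(), init_once(), destroy() and needs_disconnect_on_abort() are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a server-state singleton that is created late in
// startup. None of these names are taken from the MariaDB source.
enum class State { disconnected, connected };

class ServerState {
public:
  // Create the singleton. Before this call no instance exists.
  static void init_once() {
    if (!instance_) instance_.reset(new ServerState());
  }
  static void destroy() { instance_.reset(); }
  static bool is_inited() { return instance_ != nullptr; }

  // Precondition: is_inited(). The assert documents it for debug builds.
  static ServerState& instance() {
    assert(is_inited());
    return *instance_;
  }

  State state() const { return state_; }
  void set_state(State s) { state_ = s; }

private:
  ServerState() : state_(State::disconnected) {}
  State state_;
  static std::unique_ptr<ServerState> instance_;
};

std::unique_ptr<ServerState> ServerState::instance_;

// Abort-path check: true only if replication is enabled, the singleton
// exists, and it is not already disconnected. The is_inited() test runs
// before instance() so an early abort never touches a missing object.
bool needs_disconnect_on_abort(bool wsrep_on) {
  return wsrep_on &&
         ServerState::is_inited() &&
         ServerState::instance().state() != State::disconnected;
}
```

In this sketch, an abort that happens before init_once() has run (for example, the refusal to start as root) makes needs_disconnect_on_abort(true) return false instead of dereferencing a null pointer.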

The crash inside the Galera library seems to be a different one, and I could not reproduce it with MariaDB 10.4.8.
