[MDEV-17704] mysqld terminated unexpectedly Created: 2018-11-14  Updated: 2021-12-23  Resolved: 2021-12-23

Status: Closed
Project: MariaDB Server
Component/s: Galera SST
Affects Version/s: 10.2.18
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: TAO ZHOU Assignee: Jan Lindström (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Environment:

FreeBSD11.2



 Description   

Running galera on 3 nodes(two nodes + garb).
SST method: xtrabackup-v2
Garb and the master node was already running well.
When I was trying to add the third node, every time it fails with the following errors:

WSREP_SST: [ERROR] Parent mysqld process (PID:41586) terminated unexpectedly. (20181114 10:12:36.N)
/usr/local//bin/wsrep_sst_xtrabackup-v2: line 813: kill: (-41586) - No such process

I can see that xtrabackup finished correctly and db files were copied over. And there's no error in innobackup.backup.log on the donor.

I also monitored the processes while the SST was in progress and I noticed that mysqld already exited while SST is running.



 Comments   
Comment by Elena Stepanova [ 2018-11-14 ]

Please provide the server error log.

Comment by TAO ZHOU [ 2018-11-14 ]

 
2018-11-14 11:31:30 35624391936 [Note] WSREP: Requesting state transfer: success, donor: 1
2018-11-14 11:31:30 35624391936 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 3bddc1d5-d334-11e8-8557-b313dc94ce65:14255958
++ RC=(${PIPESTATUS[@]})
+ extcode=0
+ return 0
+ set -e
+ popd
+ [[ 0 -eq 124 ]]
+ for ecode in "${RC[@]}"
+ [[ 0 -ne 0 ]]
+ for ecode in "${RC[@]}"
+ [[ 0 -ne 0 ]]
+ [[ 1 -eq 1 ]]
+ [[ ! -r /tmp/tmp.wse8Kauv/xtrabackup_galera_info ]]
+ ps -p 43794
+ '[' '!' -r /tmp/tmp.wse8Kauv/xtrabackup_ist ']'
+ [[ -d /var/mysql-data//.sst ]]
+ wsrep_log_info 'WARNING: Stale temporary SST directory: /var/mysql-data//.sst from previous state transfer. Removing'
+ wsrep_log '[INFO] WARNING: Stale temporary SST directory: /var/mysql-data//.sst from previous state transfer. Removing'
++ date '+%Y%m%d %H:%M:%S.%N'
++ cut -b -21
+ local readonly 'tst=20181114 11:31:31.N'
+ wsrep_log_info 'WARNING: Stale temporary SST directory: /var/mysql-data//.sst from previous state transfer. Removing'
+ wsrep_log '[INFO] WARNING: Stale temporary SST directory: /var/mysql-data//.sst from previous state transfer. Removing'
++ date '+%Y%m%d %H:%M:%S.%N'
++ cut -b -21
+ local readonly 'tst=20181114 11:31:31.N'
+ echo 'WSREP_SST: [INFO] WARNING: Stale temporary SST directory: /var/mysql-data//.sst from previous state transfer. Removing (20181114 11:31:31.N)'
WSREP_SST: [INFO] WARNING: Stale temporary SST directory: /var/mysql-data//.sst from previous state transfer. Removing (20181114 11:31:31.N)
+ rm -rf /var/mysql-data//.sst
+ mkdir -p /var/mysql-data//.sst
+ jpid=44007
+ wsrep_log_info 'Proceeding with SST'
+ wsrep_log '[INFO] Proceeding with SST'
+ recv_joiner /var/mysql-data//.sst Joiner-Recv-SST 0 0
+ local dir=/var/mysql-data//.sst
+ local msg=Joiner-Recv-SST
+ local tmt=0
+ local checkf=0
+ local ltcmd
+ [[ ! -d /var/mysql-data//.sst ]]
+ pushd /var/mysql-data//.sst
+ set +e
+ [[ 0 -gt 0 ]]
+ timeit Joiner-Recv-SST 'socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} )'
+ local stage=Joiner-Recv-SST
+ shift
+ local 'cmd=socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} )'
+ local x1 x2 took extcode
+ [[ 0 -eq 1 ]]
+ wsrep_log_info 'Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} )'
++ date '+%Y%m%d %H:%M:%S.%N'
+ wsrep_log '[INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} )'
++ cut -b -21
++ date '+%Y%m%d %H:%M:%S.%N'
++ cut -b -21
+ local readonly 'tst=20181114 11:31:31.N'
+ echo 'WSREP_SST: [INFO] Proceeding with SST (20181114 11:31:31.N)'
WSREP_SST: [INFO] Proceeding with SST (20181114 11:31:31.N)
+ wsrep_log_info 'Cleaning the existing datadir and innodb-data/log directories'
+ wsrep_log '[INFO] Cleaning the existing datadir and innodb-data/log directories'
+ local readonly 'tst=20181114 11:31:31.N'
+ echo 'WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20181114 11:31:31.N)'
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20181114 11:31:31.N)
+ eval 'socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} )'
++ socat -u TCP-LISTEN:4444,reuseaddr stdio
++ xbstream -x
++ date '+%Y%m%d %H:%M:%S.%N'
++ cut -b -21
+ local readonly 'tst=20181114 11:31:31.N'
+ echo 'WSREP_SST: [INFO] Cleaning the existing datadir and innodb-data/log directories (20181114 11:31:31.N)'
WSREP_SST: [INFO] Cleaning the existing datadir and innodb-data/log directories (20181114 11:31:31.N)
+ '[' FreeBSD = FreeBSD ']'
+ find -E /var/mysql-data /var/mysql-logs /var/mysql-data/ -mindepth 1 -prune -regex '.*\.pem$|.*init\.ok$|.*galera\.cache$|.*sst_in_progress$|.*\.sst$|.*gvwstate\.dat$|.*grastate\.dat$|.*\.err$|.*\.log$|.*RPM_UPGRADE_MARKER$|.*RPM_UPGRADE_HISTORY$' -o -exec rm -rfv '{}' +
encryption: using gcrypt 1.8.4
/var/mysql-data/mysql/event.MYI
/var/mysql-data/mysql/innodb_table_stats.frm
/var/mysql-data/mysql/help_relation.MYI
/var/mysql-data/mysql/gtid_slave_pos.frm
/var/mysql-data/mysql/help_relation.frm
/var/mysql-data/mysql/help_relation.MYD
/var/mysql-data/mysql/event.frm
/var/mysql-data/mysql/event.MYD
/var/mysql-logs/ib_logfile2
/var/mysql-logs/ib_logfile5
/var/mysql-logs/ib_logfile1
/var/mysql-logs/ib_logfile6
++ parse_cnf --mysqld log-bin ''
++ local group=--mysqld
++ local var=log-bin
++ local reval=
+++ /usr/local/bin/my_print_defaults --defaults-extra-file=/usr/local/etc/mysql/my.cnf --mysqld
+++ awk -v var=log-bin 'BEGIN { OFS=FS="=" } { gsub(/_/,"-",$1); if ( $1=="--"var) lastval=substr($0,length($1)+2) } END { print lastval}'
++ reval=
++ '[' -z '' ']'
++ '[' -n '' ']'
++ echo
+ tempdir=
+ [[ -n '' ]]
+ TDATA=/var/mysql-data/
+ DATA=/var/mysql-data//.sst
+ MAGIC_FILE=/var/mysql-data//.sst/xtrabackup_galera_info
+ wsrep_log_info 'Waiting for SST streaming to complete!'
+ wsrep_log '[INFO] Waiting for SST streaming to complete!'
++ date '+%Y%m%d %H:%M:%S.%N'
++ cut -b -21
+ local readonly 'tst=20181114 11:31:31.N'
+ echo 'WSREP_SST: [INFO] Waiting for SST streaming to complete! (20181114 11:31:31.N)'
WSREP_SST: [INFO] Waiting for SST streaming to complete! (20181114 11:31:31.N)
+ monitor_process 44007
+ local sst_stream_pid=44007
+ true
+ ps --pid 43794
+ wsrep_log_error 'Parent mysqld process (PID:43794) terminated unexpectedly.'
+ wsrep_log '[ERROR] Parent mysqld process (PID:43794) terminated unexpectedly.'
++ date '+%Y%m%d %H:%M:%S.%N'
++ cut -b -21
+ local readonly 'tst=20181114 11:31:31.N'
+ echo 'WSREP_SST: [ERROR] Parent mysqld process (PID:43794) terminated unexpectedly. (20181114 11:31:31.N)'
WSREP_SST: [ERROR] Parent mysqld process (PID:43794) terminated unexpectedly. (20181114 11:31:31.N)
+ kill -- -43794
/usr/local//bin/wsrep_sst_xtrabackup-v2: line 813: kill: (-43794) - No such process
+ cleanup_joiner
+ local estatus=1
+ [[ 1 -ne 0 ]]
+ wsrep_log_error 'Cleanup after exit with status:1'
+ wsrep_log '[ERROR] Cleanup after exit with status:1'
++ date '+%Y%m%d %H:%M:%S.%N'
++ cut -b -21
+ local readonly 'tst=20181114 11:31:31.N'
+ echo 'WSREP_SST: [ERROR] Cleanup after exit with status:1 (20181114 11:31:31.N)'
WSREP_SST: [ERROR] Cleanup after exit with status:1 (20181114 11:31:31.N)
+ [[ -n '' ]]
+ [[ -n /tmp/tmp.wse8Kauv ]]
+ [[ -d /tmp/tmp.wse8Kauv ]]
+ rm -rf /tmp/tmp.wse8Kauv
++ ps -o pgid= 43797
++ grep -o '[0-9]*'
+ pgid=43253
+ [[ 43797 -eq 43253 ]]
+ exit 1

Comment by Elena Stepanova [ 2018-11-14 ]

It is not the server log, it is output of some script.
Please provide the server error log.

Comment by TAO ZHOU [ 2018-11-14 ]

In my case, all the logs are in the same file /var/log/mysql/error.log.
There's no server logs.

Comment by Elena Stepanova [ 2018-11-14 ]

Server error log might be written in syslog or alike, if that's how your instance is configured.
But if your server writes into /var/log/mysql/error.log and also some of your scripts write into the same location, then of course the server log might get re-written, which will make problem investigation impossible.

Comment by TAO ZHOU [ 2018-11-15 ]

First of all, I can confirm that galera wsre_sst scripts wouldn't overwrite mysqld logs.
I doubt that there was no logs generated from mysqld at all.

I tried to start mysqld in the foreground to see if I can get some log from the console and I see "abort".

/usr/local/libexec/mysqld '--defaults-extra-file=/usr/local/etc/mysql/my.cnf' '--basedir=/usr/local' '--datadir=/var/mysql-data' '--plugin-dir=/usr/local/lib/mysql/plugin' '--user=mysql' '--wsrep_provider=/usr/local/lib/libgalera_smm.so' '--wsrep_provider=/usr/local/lib/libgalera_smm.so' '--log-error=/var/log/mysql/error.log' '--pid-file=/var/mysql-data/mariadb01.pid' '--socket=/tmp/mysql.sock' '--port=3306'
2018-11-15  9:53:31 34426937344 [Note] /usr/local/libexec/mysqld (mysqld 10.2.18-MariaDB-log) starting as process 77315 ...
Abort

In the system log /var/log/messages:

kernel: pid 77315 (mysqld), uid 88: exited on signal 6

mysqld process id is 77315, which is '--parent' argument passed to sst script wsrep_sst_xtrabackup v2

ps auxwww|grep wsrep_sst
mysql  77525   0.0  0.0    7792   4052  0  I+   09:53       0:00.00 /usr/local/bin/bash -ue /usr/local//bin/wsrep_sst_xtrabackup-v2 --role joiner --address x.x.x.x  --datadir /var/mysql-data/ --defaults-extra-file /usr/local/etc/mysql/my.cnf --parent 77315

Is there a timeout in mysql configuration for galera SST?

Comment by Elena Stepanova [ 2018-11-15 ]

Regarding Xtrabackup-based SST, please note that Xtrabackup is known not to work properly with MariaDB 10.2 and higher:
https://mariadb.com/kb/en/meta/xtrabackup_warning/

Comment by Jan Lindström (Inactive) [ 2021-12-23 ]

xtrabackup is not recommended tool for backups in 10.2+

Generated at Thu Feb 08 08:38:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.