[MDEV-14063] mariadb102-server 10.2.9 cannot startup with galera
Created: 2017-10-13  Updated: 2017-11-03  Resolved: 2017-11-03

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.2.9
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: TAO ZHOU
Assignee: Andrii Nikitin (Inactive)
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

FreeBSD 11.1


Issue Links:
Duplicate
duplicates MDEV-13950 mysqld_safe could not start Galera no... Closed
Problem/Incident
is caused by MDEV-10767 /tmp/wsrep_recovery.${RANDOM} file cr... Closed

 Description   

I am running a Galera multi-master cluster. After upgrading one node from 10.2.7_1 to 10.2.9, I could not start it. When I ran bash -x /usr/local/etc/rc.d/mysql-server start, I found that mysqld_safe looks for "WSREP: Recovered position" in wsrep_recovery.XXXXXX. I am not sure whether this is related to MDEV-13950.

I compared mysqld_safe with the copy on another machine that is running 10.2.7_1:

--- /usr/local/bin/mysqld_safe	2017-09-12 23:14:57.000000000 +1000
+++ mysqld_safe.10.2.9	2017-10-13 15:17:02.066860000 +1100
@@ -245,7 +245,7 @@
   local euid=$(id -u)
   local ret=0
 
-  local wr_logfile=$(mktemp $DATADIR/wsrep_recovery.XXXXXX)
+  local wr_logfile=$(mktemp wsrep_recovery.XXXXXX)
 
   # safety checks
   if [ -z $wr_logfile ]; then
@@ -263,11 +263,11 @@
 
   local wr_pidfile="$DATADIR/"`hostname`"-recover.pid"
 
-  local wr_options="--log_error='$wr_logfile' --pid-file='$wr_pidfile'"
+  local wr_options="--disable-log-error  --pid-file='$wr_pidfile'"
 
   log_notice "WSREP: Running position recovery with $wr_options"
 
-  eval_log_error "$mysqld_cmd --wsrep_recover $wr_options"
+  eval_log_error "$mysqld_cmd --wsrep_recover $wr_options 2> $wr_logfile"
 
   local rp="$(grep 'WSREP: Recovered position:' $wr_logfile)"
   if [ -z "$rp" ]; then

So I replaced mysqld_safe with the version from 10.2.7_1, and now I can start the node.
I don't see any problems with mysqld_safe from 10.2.7_1. Why was it changed in 10.2.9?
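
One visible effect of the diff above is where the recovery log file is created. The following is a minimal sketch, not the mysqld_safe code itself: DATADIR and WORKDIR are stand-in temporary directories chosen for illustration. It only shows that a bare mktemp template is resolved against the current working directory, while the older form anchors the file in the data directory. Whether this is what breaks startup here is not established by this report.

```shell
#!/bin/sh
# Sketch: show where each mktemp template places the recovery log.
# DATADIR and WORKDIR are hypothetical stand-ins, not real MariaDB paths.
DATADIR=$(mktemp -d)
WORKDIR=$(mktemp -d)
cd "$WORKDIR" || exit 1

# 10.2.7_1 style: template anchored in the data directory.
old_log=$(mktemp "$DATADIR/wsrep_recovery.XXXXXX")

# 10.2.9 style: bare template, resolved against the current directory.
new_log=$(mktemp wsrep_recovery.XXXXXX)

echo "old style: $old_log"
echo "new style: $WORKDIR/$new_log"
# Cleanup of the temporary directories is left out to keep the sketch short.
```

With the bare template, the caller's working directory decides where the file lands, so the result can differ between an interactive shell and a service script.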



 Comments   
Comment by TAO ZHOU [ 2017-10-13 ]

I can only start it by running /usr/local/bin/mysqld_safe from the command line. It still does not start with 'service mysql-server start'.

Comment by Andrii Nikitin (Inactive) [ 2017-10-13 ]

In addition to the mysqld_safe patch mentioned above, 10.2.9 may have problems with the SST scripts. Could you check on the joiner node whether SST was used, and whether it was rsync or some other method? In any case, we need to see the error logs from both the joiner and the donor to assist.

Comment by TAO ZHOU [ 2017-10-16 ]

There are no logs on the donor side because the joiner exited before it started to join.
On the joiner side:

2017-10-13 12:52:34 34426937344 [Note] /usr/local/libexec/mysqld (mysqld 10.2.9-MariaDB-log) starting as process 7117 ...
2017-10-13 12:52:34 34426937344 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-10-13 12:52:34 34426937344 [Note] InnoDB: Uses event mutexes
2017-10-13 12:52:34 34426937344 [Note] InnoDB: Compressed tables use zlib 1.2.11
2017-10-13 12:52:34 34426937344 [Note] InnoDB: Number of pools: 1
2017-10-13 12:52:34 34426937344 [Note] InnoDB: Using SSE2 crc32 instructions
2017-10-13 12:52:34 34426937344 [Note] InnoDB: Initializing buffer pool, total size = 12G, instances = 8, chunk size = 128M
2017-10-13 12:52:35 34426937344 [Note] InnoDB: Completed initialization of buffer pool
2017-10-13 12:52:35 34426937344 [Note] InnoDB: Highest supported file format is Barracuda.
2017-10-13 12:52:35 34426937344 [Note] InnoDB: 128 out of 128 rollback segments are active.
2017-10-13 12:52:35 34426937344 [Note] InnoDB: Creating shared tablespace for temporary tables
2017-10-13 12:52:35 34426937344 [Note] InnoDB: Setting file '/var/mysql-data/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2017-10-13 12:52:36 34426937344 [Note] InnoDB: File '/var/mysql-data/ibtmp1' size is now 12 MB.
2017-10-13 12:52:36 34426937344 [Note] InnoDB: Waiting for purge to start
2017-10-13 12:52:36 34426937344 [Note] InnoDB: 5.7.19 started; log sequence number 714948452513
2017-10-13 12:52:36 34426937344 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
2017-10-13 12:52:36 34426937344 [Note] Plugin 'FEEDBACK' is disabled.
2017-10-13 12:52:36 34426937344 [Note] Server socket created on IP: '192.168.11.34'.
2017-10-13 12:52:36 34426937344 [Note] WSREP: Recovered position: 7a79d7e6-7ca9-11e7-b980-ef84f0f16abf:52913163
171013 12:56:32 mysqld_safe Starting mysqld daemon with databases from /var/mysql-data
171013 12:56:32 mysqld_safe WSREP: Running position recovery with --disable-log-error --pid-file='/var/mysql-data/db2.ish.com.au-recover.pid'

I am using mariabackup for sst.
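
For reference, the "WSREP: Recovered position" line in the log above carries the cluster state as uuid:seqno. The sketch below shows one way to pull those two fields out of such a line. The sample line is copied from the log; the variable names and the sed expression are illustrative and are not taken from mysqld_safe.

```shell
#!/bin/sh
# Sample line copied from the joiner log above.
line="2017-10-13 12:52:36 34426937344 [Note] WSREP: Recovered position: 7a79d7e6-7ca9-11e7-b980-ef84f0f16abf:52913163"

# Keep only what follows the marker; the remainder is uuid:seqno.
pos=$(printf '%s\n' "$line" | sed -n 's/.*WSREP: Recovered position: *//p')

uuid=${pos%:*}    # drop the trailing ":seqno"
seqno=${pos##*:}  # drop everything up to the last colon

echo "uuid=$uuid seqno=$seqno"
```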

Comment by Andrii Nikitin (Inactive) [ 2017-10-31 ]

My understanding is that if you see the message "WSREP: Recovered position" in the MariaDB Server error log, then the problem is related to MDEV-13950 and mysqld_safe wasn't patched properly.
This issue will therefore be closed as a 'Duplicate' of MDEV-13950.
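
The failure mode described in this comment can be reproduced in isolation. The sketch below is a simulation under stated assumptions: both files are temporary stand-ins, and only the grep pattern and the empty-result check mirror the mysqld_safe excerpt in the description. It shows that when the message lands in the error log and not in the file mysqld_safe inspects, the lookup comes back empty.

```shell
#!/bin/sh
# Simulated files; both paths are temporary stand-ins, not real server paths.
error_log=$(mktemp)
wr_logfile=$(mktemp)

# The server wrote the message to its error log, as in the joiner log above...
echo "[Note] WSREP: Recovered position: 7a79d7e6-7ca9-11e7-b980-ef84f0f16abf:52913163" > "$error_log"
# ...while the file that mysqld_safe inspects stayed empty.
: > "$wr_logfile"

# Same pattern and emptiness check as in the mysqld_safe excerpt.
rp=$(grep 'WSREP: Recovered position:' "$wr_logfile" || true)
if [ -z "$rp" ]; then
  result="recovery failed: position not found in recovery log"
else
  result="recovered"
fi
echo "$result"

rm -f "$error_log" "$wr_logfile"
```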
