[MDEV-13329] node cannot join cluster if being monitored by maxscale Created: 2017-07-15  Updated: 2018-04-27  Resolved: 2018-04-27

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.1, 10.2, 10.3
Fix Version/s: 10.1.33, 10.2.15, 10.3.7

Type: Bug Priority: Major
Reporter: Andrii Nikitin (Inactive) Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
blocks MDEV-14069 galera_sst_mysqldump.test fails with:... Closed
Sprint: 10.2.11

 Description   

When MaxScale is monitoring all 4 nodes:

$ cluster1/galera_cluster_size.sh 
m0 :wsrep_cluster_size 4
m1 :wsrep_cluster_size 2
m2 :wsrep_cluster_size 3
m3 :wsrep_cluster_size 4
$ cluster1/sql.sh 'select count(*) from mysql.user'
m0 :8
m1 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m2 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m3 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use

When MaxScale is monitoring only one node (m2):

$ cluster1/galera_cluster_size.sh 
m0 :wsrep_cluster_size 4
m1 :wsrep_cluster_size 4
m2 :wsrep_cluster_size 3
m3 :wsrep_cluster_size 4
$ cluster1/sql.sh 'select count(*) from mysql.user'
m0 :8
m1 :8
m2 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m3 :8



 Comments   
Comment by Andrii Nikitin (Inactive) [ 2017-07-15 ]

EDIT: see also the alternative instructions in later comments.
Use the following script to set up the environment:

set -e
MDBVER=${ENVIRON:-10.2.6}
 
# just use current directory if called from framework
if [ ! -f common.sh ] ; then
  [ -d mariadb-environs ] || git clone http://github.com/AndriiNikitin/mariadb-environs
  cd mariadb-environs
fi
 
./get_plugin.sh galera 
# set up cluster
_template/plant_cluster.sh cluster1
echo m0 > cluster1/nodes.lst
echo m1 >> cluster1/nodes.lst
echo m2 >> cluster1/nodes.lst
echo m3 >> cluster1/nodes.lst
cluster1/replant.sh ${MDBVER}
 
# download tar if needed
./build_or_download.sh m0
# workaround MDEV-13283
[ ! -d _depot/m-tar/${MDBVER} ] || \
  sed -i "s/Distrib 10.1/Distrib 10/g" _depot/m-tar/${MDBVER}/bin/wsrep_sst_mysqldump

The commands below initialize and verify a four-node cluster on the local machine, on ports 3306-3309:

cluster1/cleanup.sh
cluster1/gen_cnf.sh general_log=1
cluster1/install_db.sh
cluster1/galera_setup_acl.sh
cluster1/galera_start_new.sh
# sleep a while to let sst finish or let nodes shut down if initialization failed
sleep 45
cluster1/sql.sh select 1 from mysql.user limit 1
cluster1/galera_cluster_size.sh

The last command shows wsrep_cluster_size on each node. When it is 4 on every node, the cluster has most likely been initialized properly.
Optionally, select from any table on each node and confirm that no error is returned:

mysql --defaults-file=m0-${MDBVER}/my.cnf -e 'select 1 from mysql.user limit 1'
mysql --defaults-file=m1-${MDBVER}/my.cnf -e 'select 1 from mysql.user limit 1'
mysql --defaults-file=m2-${MDBVER}/my.cnf -e 'select 1 from mysql.user limit 1'
mysql --defaults-file=m3-${MDBVER}/my.cnf -e 'select 1 from mysql.user limit 1'
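The size check described above can also be automated. The following is only a sketch: nodes_with_wrong_size is a hypothetical helper (not part of mariadb-environs) that assumes the "<node> :wsrep_cluster_size <n>" line format shown in this report and prints the nodes that do not report the expected size.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of mariadb-environs.
# Reads lines such as "m1 :wsrep_cluster_size 2" on stdin and prints the
# name of every node whose line does not report the expected size ($1).
# Lines carrying an error instead of a size are reported as well.
nodes_with_wrong_size() {
  local expected="$1"
  awk -v want="$expected" '
    NF > 0 && ($2 != ":wsrep_cluster_size" || $3 != want) { print $1 }
  '
}
```

Possible usage: cluster1/galera_cluster_size.sh | nodes_with_wrong_size 4 . Empty output means that every node reported the expected size.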

Optionally, run the whole sequence several times to confirm that it works reliably.
Now configure MaxScale to monitor one or more of the ports 3307, 3308, 3309 on 127.0.0.1 (user/password: galera/galera), e.g. with the following maxscale.cnf:

[server1]
type=server
address=127.0.0.1
port=3307
protocol=MySQLBackend
 
[MySQL Monitor]
type=monitor
module=galeramon
servers=server1
user=galera
passwd=galera
monitor_interval=500
 
[Read-Write Service]
type=service
router=readwritesplit
servers=server1
user=galera
passwd=galera
 
[Read-Write Listener]
type=listener
service=Read-Write Service
protocol=MySQLClient
port=4006

Now the same sequence of cluster commands leaves the monitored nodes in the state 'WSREP has not yet prepared node for application use' forever, while the other nodes work properly. In particular, with the example maxscale.cnf above, the output will be:

$ cluster1/sql.sh select 1 from mysql.user limit 1
m0 :1
m1 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m2 :1
m3 :1
$ cluster1/galera_cluster_size.sh
m0 :wsrep_cluster_size 4
m1 :wsrep_cluster_size 2
m2 :wsrep_cluster_size 4
m3 :wsrep_cluster_size 4

The general log contains only commands like 'SHOW STATUS', and I wasn't able to reproduce the problem when executing such commands against the nodes directly (i.e. without MaxScale).

Comment by Andrii Nikitin (Inactive) [ 2017-11-01 ]

I am able to reliably reproduce the problem on a system with 10.0 Galera packages installed and suitable MaxScale packages from the list at https://downloads.mariadb.com/MaxScale/2.1.10/ (e.g. xenial).
Using the script https://github.com/AndriiNikitin/bugs/blob/master/MDEV-13329.sh I confirm that the following patch solves the problem:

--- a/sql/wsrep_mysqld.cc
+++ b/sql/wsrep_mysqld.cc
@@ -285,7 +285,7 @@ wsrep_view_handler_cb (void*                    app_ctx,
     if (!wsrep_before_SE())
     {
         WSREP_DEBUG("[debug]: closing client connections for PRIM");
-        wsrep_close_client_connections(TRUE);
+        wsrep_close_client_connections(FALSE);
     }
 
     ssize_t const req_len= wsrep_sst_prepare (sst_req);

Output before the patch always looks like this:

$ bash ~/bugs/MDEV-13329.sh
Already up-to-date.
Already up-to-date.
GENERATE TEMPLATES
Process 17861 still exists, sleeping 1 sec
Process 18115 still exists, sleeping 1 sec
 
 
Shutting down MaxScale
 
DOWNLOAD PACKAGES (if needed)
GENERATE CONFIG FILES
m7 :
m8 :
INITIALIZE DATADIRs
m7 :171101 11:26:00 [Note] /usr/local/mysql/bin/mysqld (mysqld 10.0.33-MariaDB-wsrep) starting as process 20866 ...
171101 11:26:07 [Note] /usr/local/mysql/bin/mysqld (mysqld 10.0.33-MariaDB-wsrep) starting as process 20893 ...
calling /usr/local/mysql/scripts/mysql_install_db Installing MariaDB/MySQL system tables in '/home/a/env1/m7-system2/dt' ... OK Filling help tables... OK To start mysqld at boot time you have to copy support-files/mysql.server to the right place for your system PLEASE REMEMBER TO SET A PASSWORD FOR THE MariaDB root USER ! To do so, start the server, then issue the following commands: '/usr/local/mysql/bin/mysqladmin' -u root password 'new-password' '/usr/local/mysql/bin/mysqladmin' -u root -h UBINTI password 'new-password' Alternatively you can run: '/usr/local/mysql/bin/mysql_secure_installation' which will also give you the option of removing the test databases and anonymous user created by default. This is strongly recommended for production servers. See the MariaDB Knowledgebase at http://mariadb.com/kb or the MySQL manual for more instructions. You can start the MariaDB daemon with: cd '/usr/local/mysql' ; /usr/local/mysql/bin/mysqld_safe --datadir='/home/a/env1/m7-system2/dt' You can test the MariaDB daemon with mysql-test-run.pl cd '/usr/local/mysql/mysql-test' ; perl mysql-test-run.pl Please report any problems at http://mariadb.org/jira The latest information about MariaDB is available at http://mariadb.org/. You can find additional information about the MySQL part at: http://dev.mysql.com Consider joining MariaDB's strong and vibrant community: https://mariadb.org/get-involved/
m8 :171101 11:26:12 [Note] /usr/local/mysql/bin/mysqld (mysqld 10.0.33-MariaDB-wsrep) starting as process 20959 ...
171101 11:26:20 [Note] /usr/local/mysql/bin/mysqld (mysqld 10.0.33-MariaDB-wsrep) starting as process 20988 ...
calling /usr/local/mysql/scripts/mysql_install_db Installing MariaDB/MySQL system tables in '/home/a/env1/m8-system2/dt' ... OK Filling help tables... OK To start mysqld at boot time you have to copy support-files/mysql.server to the right place for your system PLEASE REMEMBER TO SET A PASSWORD FOR THE MariaDB root USER ! To do so, start the server, then issue the following commands: '/usr/local/mysql/bin/mysqladmin' -u root password 'new-password' '/usr/local/mysql/bin/mysqladmin' -u root -h UBINTI password 'new-password' Alternatively you can run: '/usr/local/mysql/bin/mysql_secure_installation' which will also give you the option of removing the test databases and anonymous user created by default. This is strongly recommended for production servers. See the MariaDB Knowledgebase at http://mariadb.com/kb or the MySQL manual for more instructions. You can start the MariaDB daemon with: cd '/usr/local/mysql' ; /usr/local/mysql/bin/mysqld_safe --datadir='/home/a/env1/m8-system2/dt' You can test the MariaDB daemon with mysql-test-run.pl cd '/usr/local/mysql/mysql-test' ; perl mysql-test-run.pl Please report any problems at http://mariadb.org/jira The latest information about MariaDB is available at http://mariadb.org/. You can find additional information about the MySQL part at: http://dev.mysql.com Consider joining MariaDB's strong and vibrant community: https://mariadb.org/get-involved/
STARTUP SERVERS TO SETUP ACL
m7 :calling /usr/local/mysql/bin/mysqld_safe 171101 11:26:24 mysqld_safe Logging to '/home/a/env1/m7-system2/dt/error.log'. 171101 11:26:24 mysqld_safe Starting mysqld daemon with databases from /home/a/env1/m7-system2/dt mysqld is alive
m8 :calling /usr/local/mysql/bin/mysqld_safe 171101 11:26:28 mysqld_safe Logging to '/home/a/env1/m8-system2/dt/error.log'. 171101 11:26:28 mysqld_safe Starting mysqld daemon with databases from /home/a/env1/m8-system2/dt mysqld is alive
m7 :
m8 :
STARTUP Galera and MaxScale
m7 :calling /usr/local/mysql/bin/mysqld_safe
171101 11:26:47 mysqld_safe Logging to '/home/a/env1/m7-system2/dt/error.log'.
171101 11:26:47 mysqld_safe Starting mysqld daemon with databases from /home/a/env1/m7-system2/dt
mysqld is alive
m8 :calling /usr/local/mysql/bin/mysqld_safe
171101 11:26:51 mysqld_safe Logging to '/home/a/env1/m8-system2/dt/error.log'.
171101 11:26:51 mysqld_safe Starting mysqld daemon with databases from /home/a/env1/m8-system2/dt
mysqld is alive
CREATING TABLE in MAXSCALE
MONITOR Nodes' OUTPUT
m7 :5
m8 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2
m7 :5
m8 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2
m7 :5
m8 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2
m7 :5
m8 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2
m7 :5
m8 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2
m7 :5
m8 :ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
 
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2

Output after the patch:

$ bash ~/bugs/MDEV-13329.sh
Already up-to-date.
Already up-to-date.
GENERATE TEMPLATES
Process 21818 still exists, sleeping 1 sec
Process 22073 still exists, sleeping 1 sec
 
 
Shutting down MaxScale
 
DOWNLOAD PACKAGES (if needed)
GENERATE CONFIG FILES
m7 :
m8 :
INITIALIZE DATADIRs
m7 :171101 11:30:48 [Note] /usr/local/mysql/bin/mysqld (mysqld 10.0.33-MariaDB-wsrep) starting as process 24883 ...
171101 11:31:01 [Note] /usr/local/mysql/bin/mysqld (mysqld 10.0.33-MariaDB-wsrep) starting as process 24910 ...
calling /usr/local/mysql/scripts/mysql_install_db Installing MariaDB/MySQL system tables in '/home/a/env1/m7-system2/dt' ... OK Filling help tables... OK To start mysqld at boot time you have to copy support-files/mysql.server to the right place for your system PLEASE REMEMBER TO SET A PASSWORD FOR THE MariaDB root USER ! To do so, start the server, then issue the following commands: '/usr/local/mysql/bin/mysqladmin' -u root password 'new-password' '/usr/local/mysql/bin/mysqladmin' -u root -h UBINTI password 'new-password' Alternatively you can run: '/usr/local/mysql/bin/mysql_secure_installation' which will also give you the option of removing the test databases and anonymous user created by default. This is strongly recommended for production servers. See the MariaDB Knowledgebase at http://mariadb.com/kb or the MySQL manual for more instructions. You can start the MariaDB daemon with: cd '/usr/local/mysql' ; /usr/local/mysql/bin/mysqld_safe --datadir='/home/a/env1/m7-system2/dt' You can test the MariaDB daemon with mysql-test-run.pl cd '/usr/local/mysql/mysql-test' ; perl mysql-test-run.pl Please report any problems at http://mariadb.org/jira The latest information about MariaDB is available at http://mariadb.org/. You can find additional information about the MySQL part at: http://dev.mysql.com Consider joining MariaDB's strong and vibrant community: https://mariadb.org/get-involved/
m8 :171101 11:31:05 [Note] /usr/local/mysql/bin/mysqld (mysqld 10.0.33-MariaDB-wsrep) starting as process 24976 ...
171101 11:31:15 [Note] /usr/local/mysql/bin/mysqld (mysqld 10.0.33-MariaDB-wsrep) starting as process 25003 ...
calling /usr/local/mysql/scripts/mysql_install_db Installing MariaDB/MySQL system tables in '/home/a/env1/m8-system2/dt' ... OK Filling help tables... OK To start mysqld at boot time you have to copy support-files/mysql.server to the right place for your system PLEASE REMEMBER TO SET A PASSWORD FOR THE MariaDB root USER ! To do so, start the server, then issue the following commands: '/usr/local/mysql/bin/mysqladmin' -u root password 'new-password' '/usr/local/mysql/bin/mysqladmin' -u root -h UBINTI password 'new-password' Alternatively you can run: '/usr/local/mysql/bin/mysql_secure_installation' which will also give you the option of removing the test databases and anonymous user created by default. This is strongly recommended for production servers. See the MariaDB Knowledgebase at http://mariadb.com/kb or the MySQL manual for more instructions. You can start the MariaDB daemon with: cd '/usr/local/mysql' ; /usr/local/mysql/bin/mysqld_safe --datadir='/home/a/env1/m8-system2/dt' You can test the MariaDB daemon with mysql-test-run.pl cd '/usr/local/mysql/mysql-test' ; perl mysql-test-run.pl Please report any problems at http://mariadb.org/jira The latest information about MariaDB is available at http://mariadb.org/. You can find additional information about the MySQL part at: http://dev.mysql.com Consider joining MariaDB's strong and vibrant community: https://mariadb.org/get-involved/
STARTUP SERVERS TO SETUP ACL
m7 :calling /usr/local/mysql/bin/mysqld_safe 171101 11:31:19 mysqld_safe Logging to '/home/a/env1/m7-system2/dt/error.log'. 171101 11:31:19 mysqld_safe Starting mysqld daemon with databases from /home/a/env1/m7-system2/dt mysqld is alive
m8 :calling /usr/local/mysql/bin/mysqld_safe 171101 11:31:23 mysqld_safe Logging to '/home/a/env1/m8-system2/dt/error.log'. 171101 11:31:23 mysqld_safe Starting mysqld daemon with databases from /home/a/env1/m8-system2/dt mysqld is alive
m7 :
m8 :
STARTUP Galera and MaxScale
m7 :calling /usr/local/mysql/bin/mysqld_safe
171101 11:31:43 mysqld_safe Logging to '/home/a/env1/m7-system2/dt/error.log'.
171101 11:31:43 mysqld_safe Starting mysqld daemon with databases from /home/a/env1/m7-system2/dt
mysqld is alive
m8 :calling /usr/local/mysql/bin/mysqld_safe
171101 11:31:47 mysqld_safe Logging to '/home/a/env1/m8-system2/dt/error.log'.
171101 11:31:47 mysqld_safe Starting mysqld daemon with databases from /home/a/env1/m8-system2/dt
mysqld is alive
CREATING TABLE in MAXSCALE
MONITOR Nodes' OUTPUT
m7 :5
m8 :5
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2
m7 :5
m8 :5
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2
m7 :5
m8 :5
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2
m7 :5
m8 :5
m7 :wsrep_cluster_size 2
m8 :wsrep_cluster_size 2

sachin.setiya.007 seppo could you confirm that the patch above is reasonable? The node is undergoing SST, so there is no need to wait forever until all clients have disconnected gracefully.
I believe the current galera_sst_mysqldump test suffers from the same problem, where the node waits hopelessly for mtr to disconnect; see the stack in MDEV-14069.

Comment by Andrii Nikitin (Inactive) [ 2017-11-01 ]

sachin.setiya.007 please review the one-line patch in the previous comment.

Comment by markus makela [ 2017-11-01 ]

A workaround would be to stop the monitor in MaxScale, which forces its connections to close.
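A rough sketch of how this workaround might be scripted. It assumes MaxScale 2.1 with the maxadmin client, its "shutdown monitor" and "restart monitor" commands, and the monitor name "MySQL Monitor" from the example maxscale.cnf above; with_monitor_stopped is a hypothetical wrapper, not a MaxScale tool.

```shell
#!/usr/bin/env bash
# Assumptions: maxadmin is on PATH and the monitor is named as in the
# example maxscale.cnf. Both can be overridden through the environment.
MONITOR_NAME="${MONITOR_NAME:-MySQL Monitor}"
MAXADMIN="${MAXADMIN:-maxadmin}"   # e.g. MAXADMIN=echo for a dry run

# Hypothetical wrapper: stop the monitor, run the given command (for
# example the script that starts the joining node), then restart the
# monitor and return the exit status of the wrapped command.
with_monitor_stopped() {
  "$MAXADMIN" shutdown monitor "$MONITOR_NAME" || return 1
  "$@"
  local rc=$?
  "$MAXADMIN" restart monitor "$MONITOR_NAME"
  return "$rc"
}
```

Possible usage: with_monitor_stopped cluster1/galera_start_new.sh . Note that the monitor is restarted as soon as the wrapped command returns, so the command should not return before the node has finished joining.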

Comment by Andrii Nikitin (Inactive) [ 2017-11-14 ]

The problem happens because the node waits for all connections to disconnect gracefully before starting SST.
So if some existing connection is simply idle, SST will never start.

E.g. the following command connects and then just remains idle. If a command like this runs on the joining node (e.g. issued by some monitoring software, or a broken connection that is waiting for a timeout to be detected), then the node remains in the 'Initialized' state.

( while : ; do mysql -e 'show variables like "wsrep_on"; system sleep 10000' 2>>log.log || : ; done ) &
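To see which connections might be holding up a joining node, idle sessions can be filtered out of the process list. This is only a diagnostic sketch: idle_connections is a hypothetical helper, it assumes the tab-separated column order of SHOW PROCESSLIST as printed by mysql -N -B (Id, User, Host, db, Command, Time, ...), and it assumes that SHOW PROCESSLIST is still accepted by a node that is not yet prepared.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads tab-separated SHOW PROCESSLIST rows on stdin
# and prints Id, User, Host and Time for every connection that has been
# in the "Sleep" state for at least $1 seconds (default 0).
idle_connections() {
  local min_idle="${1:-0}"
  awk -F '\t' -v min="$min_idle" '
    $5 == "Sleep" && $6 + 0 >= min { print $1 "\t" $2 "\t" $3 "\t" $6 }
  '
}
```

Possible usage on the joining node: mysql -N -B -e 'SHOW PROCESSLIST' | idle_connections 60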

See also https://github.com/AndriiNikitin/bugs/blob/master/MDEV-13329-simple.sh, which produces the following result:

m1 :wsrep_cluster_size 3
m2 :wsrep_cluster_size 2
m3 :wsrep_cluster_size 3
m1 :wsrep_local_state_comment Synced
m2 :wsrep_local_state_comment Initialized
m3 :wsrep_local_state_comment Synced
m1 :wsrep_local_state_comment Synced
m2 :wsrep_local_state_comment Initialized
m3 :wsrep_local_state_comment Synced
m1 :wsrep_local_state_comment Synced
m2 :wsrep_local_state_comment Initialized
m3 :wsrep_local_state_comment Synced
m1 :wsrep_local_state_comment Synced
m2 :wsrep_local_state_comment Initialized
m3 :wsrep_local_state_comment Synced
...
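A stuck node like m2 above can be detected without a fixed sleep by polling wsrep_local_state_comment with a timeout. A minimal sketch; wait_for_synced and node_state are hypothetical helpers, and the my.cnf path passed to node_state is whatever defaults file the node uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper. $1 is the timeout in seconds; the remaining
# arguments form a command that prints the current state of one node.
# Returns 0 once the state is "Synced", 1 if the timeout expires first.
wait_for_synced() {
  local timeout="$1"; shift
  local waited=0 state
  while :; do
    state="$("$@" 2>/dev/null)"
    if [ "$state" = "Synced" ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "node is still '$state' after ${waited}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Prints the first word of wsrep_local_state_comment for the node whose
# defaults file is given as $1.
node_state() {
  mysql --defaults-file="$1" -N -B \
    -e "SHOW STATUS LIKE 'wsrep_local_state_comment'" | awk '{ print $2 }'
}
```

Possible usage: wait_for_synced 60 node_state m2-10.2.6/my.cnf || echo "m2 did not join"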
