Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Version: 10.4.3
Environment: CentOS Linux release 7.6.1810 (Core)
Description
Galera: Rolling upgrade: a node upgraded to 10.4 is stopped with signal 6 on commit after being joined to a cluster with not-yet-upgraded nodes if wsrep_trx_fragment_size > 0.
This issue was discovered while testing a rolling upgrade according to "MariaDB 10.4 Cluster Rolling Upgrade - Naive Approach" by Seppo Jaakola: https://docs.google.com/document/d/1z4XTpLpzStWMFaNnrSmiESaIVeCoKhu9Hbb1SrDPf0w
10.4.3-MariaDB-debug built from sources: commit f0b65102b23f006f596eef35e6e5f4f8b6d8146d
galera4 lib: Galera 26.4.0, commit 9cdbeb86c330b808571b14270e6428accb899c58
Steps:
1. Start 3 MariaDB 10.3 nodes with mtr:
1.0. export WSREP_PROVIDER=/usr/lib/libgalera_smm_3.so
1.1. cd mysql-test
1.2. "./mtr --suite=galera_3nodes --start-and-exit"
2. Copy the [mysqld.3] group from var/my.cnf (attached my.cnf) into a separate configuration file, mysqld.3.cnf (attached mysqld.3.cnf), and make the following edits:
2.1. Edit:
wsrep_cluster_address='gcomm://127.0.0.1:16003,127.0.0.1:16006,127.0.0.1:16009'
wsrep_provider=<path to galera 4 library>
basedir=<10.4 source tree>
character-sets-dir=<10.4 source tree>/sql/share/charsets
lc-messages-dir=<10.4 source tree>/sql/share/
2.2. Also add:
binlog-format=row
wsrep_sst_method=rsync
innodb-autoinc-lock-mode=2
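The copy in step 2 can also be scripted. The following is only a minimal sketch, assuming awk is available and that each group in var/my.cnf starts with a `[name]` header at the beginning of a line. `extract_group` is a hypothetical helper name, not part of mtr. The edits from 2.1 and 2.2 still have to be applied to the result by hand.

```shell
#!/bin/sh
# Print one option group (for example "mysqld.3") from an ini-style file:
# the [group] header itself and every line up to the next [header].
# extract_group is a hypothetical helper, not an mtr or MariaDB tool.
extract_group() {
    group="$1"
    file="$2"
    awk -v want="[$group]" '
        /^\[/ { in_group = ($0 == want) }   # each header line switches printing on or off
        in_group { print }
    ' "$file"
}
```

Possible usage from the mysql-test directory: `extract_group mysqld.3 var/my.cnf > var/mysqld.3.cnf`.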
3.1. Load some data.
3.2. Stop data loading.
4. Upgrade node 3.
4.1. Stop the Server:
/home/stepan/mariadb/10.3/client/mysqladmin -u root shutdown -S /home/stepan/mariadb/10.3/mysql-test/var/tmp/mysqld.3.sock
4.2. Make sure that wsrep-on is off:
sudo vi /home/stepan/mariadb/10.3/mysql-test/var/mysqld.3.cnf
#wsrep-on=1
4.3. Run 10.4 binaries with 10.3 data:
/home/stepan/mariadb/10.4/sql/mysqld --defaults-file=/home/stepan/mariadb/10.3/mysql-test/var/mysqld.3.cnf --wsrep_provider=none
4.4. Run mysql_upgrade:
/home/stepan/mariadb/10.4/client/mysql_upgrade --defaults-file=/home/stepan/mariadb/10.3/mysql-test/var/mysqld.3.cnf -uroot -h0 -P16002
4.5. Stop the Server:
/home/stepan/mariadb/10.3/client/mysqladmin -u root shutdown -S /home/stepan/mariadb/10.3/mysql-test/var/tmp/mysqld.3.sock
4.6. export PATH=$PATH:/home/stepan/mariadb/10.4/scripts
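For reference, the commands of steps 4.1, 4.3, 4.4 and 4.5 can be collected into a dry run that only prints them in order. This is a sketch, not a tested upgrade script: `upgrade_plan` and the variable names are hypothetical, while the paths, port and options are copied from the steps above. Step 4.2 (commenting out wsrep-on) remains a manual edit, and the mysqld command of step 4.3 presumably has to keep running in a separate terminal while mysql_upgrade is executed.

```shell
#!/bin/sh
# Paths as used in this report; adjust for another setup.
OLD=/home/stepan/mariadb/10.3
NEW=/home/stepan/mariadb/10.4
CNF="$OLD/mysql-test/var/mysqld.3.cnf"
SOCK="$OLD/mysql-test/var/tmp/mysqld.3.sock"

# Print (do not execute) the upgrade commands for node 3 in order.
# upgrade_plan is a hypothetical helper name.
upgrade_plan() {
    echo "$OLD/client/mysqladmin -u root shutdown -S $SOCK"                   # 4.1
    echo "$NEW/sql/mysqld --defaults-file=$CNF --wsrep_provider=none"         # 4.3
    echo "$NEW/client/mysql_upgrade --defaults-file=$CNF -uroot -h0 -P16002"  # 4.4
    echo "$OLD/client/mysqladmin -u root shutdown -S $SOCK"                   # 4.5
}
```

Calling `upgrade_plan` prints the four commands, which can then be reviewed and run one at a time.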
5. Check upgraded node 3 without the cluster.
5.1. Start the server:
/home/stepan/mariadb/10.4/sql/mysqld --defaults-file=/home/stepan/mariadb/10.3/mysql-test/var/mysqld.3.cnf
5.2. Start the client:
/home/stepan/mariadb/10.3/client/mysql -u root -S /home/stepan/mariadb/10.3/mysql-test/var/tmp/mysqld.3.sock
Actual result:
Server version: 10.4.3-MariaDB-debug-log Source distribution
5.3. Stop the Server:
/home/stepan/mariadb/10.3/client/mysqladmin -u root shutdown -S /home/stepan/mariadb/10.3/mysql-test/var/tmp/mysqld.3.sock
6. Join node 3 back to the cluster.
6.1. Add to /home/stepan/mariadb/10.3/mysql-test/var/mysqld.3.cnf:
wsrep-on=1
6.2. Start the server:
/home/stepan/mariadb/10.4/sql/mysqld --defaults-file=/home/stepan/mariadb/10.3/mysql-test/var/mysqld.3.cnf
7. Check how streaming replication behaves on a partially upgraded cluster.
7.1. Run clients for all three nodes:
/home/stepan/mariadb/10.3/client/mysql -u root -S /home/stepan/mariadb/10.3/mysql-test/var/tmp/mysqld.1.sock
/home/stepan/mariadb/10.3/client/mysql -u root -S /home/stepan/mariadb/10.3/mysql-test/var/tmp/mysqld.2.sock
/home/stepan/mariadb/10.3/client/mysql -u root -S /home/stepan/mariadb/10.3/mysql-test/var/tmp/mysqld.3.sock
7.2. Check with the default value of wsrep_trx_fragment_size.
7.2.1. On the Node 3:
START TRANSACTION;
update t set j = 28700 where i = 287;
update t set j = 28900 where i = 289;
Actual result:
The rows updated on node 3 have not yet been updated on nodes 1 and 2.
7.2.2. On the Node 3:
commit;
Actual result:
The rows updated on node 3 are updated on nodes 1 and 2 only after the commit!
7.3. Check with wsrep_trx_fragment_size > 0.
7.3.1. Set wsrep_trx_fragment_size > 0 on the Node 3:
SET SESSION wsrep_trx_fragment_size = 1;
Query OK, 0 rows affected (0.000 sec)
MariaDB [test]> SHOW VARIABLES LIKE 'wsrep_trx%';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| wsrep_trx_fragment_size | 1     |
| wsrep_trx_fragment_unit | bytes |
+-------------------------+-------+
7.3.2. On the Node 3:
START TRANSACTION;
update t set j = 28300 where i = 283;
Actual result:
The row updated on node 3 is already updated on nodes 1 and 2 before the commit!
7.3.3. On the Node 3:
commit;
Actual result:
Node 3 has stopped:
Client:
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysqld.3.err:
190222 20:57:14 [ERROR] mysqld got signal 6 ;
Expected result:
The upgraded node 3 is NOT stopped on commit after being joined to a cluster with not-yet-upgraded nodes when wsrep_trx_fragment_size > 0.
Other log and config files are also attached.
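To make steps 7.3.1 to 7.3.3 easier to repeat, the statements can be generated by a small shell function and piped into the client. This is a sketch under assumptions: `repro_sql` is a hypothetical helper name, and the table `t` with columns `i` and `j` is the one loaded in step 3.1.

```shell
#!/bin/sh
# Emit the SQL of steps 7.3.1-7.3.3.
# The fragment size is the first argument (1 if omitted), so the same
# function covers both the failing case (> 0) and a run with 0.
# repro_sql is a hypothetical helper name.
repro_sql() {
    fragment_size="${1:-1}"
    cat <<EOF
SET SESSION wsrep_trx_fragment_size = ${fragment_size};
START TRANSACTION;
update t set j = 28300 where i = 283;
commit;
EOF
}
```

Possible usage against node 3, with the client and socket paths from step 7.1 and the `test` database shown in the client prompt: `repro_sql 1 | /home/stepan/mariadb/10.3/client/mysql -u root -S /home/stepan/mariadb/10.3/mysql-test/var/tmp/mysqld.3.sock test`.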
Issue Links
- relates to
- MDEV-18271 Galera 4: test manually rolling upgrade to Server 10.4 + Galera 4 (Closed)
- MDEV-18552 Galera: Rolling upgrade: Assertion `version() >= WriteSetNG::VER5' after setting wsrep_trx_fragment_size (Closed)