[MDEV-26019] Upgrading MariaDB from 10.5.10 to 10.5.11 breaks TLS mariabackup SST Created: 2021-06-24  Updated: 2021-12-06  Resolved: 2021-06-26

Status: Closed
Project: MariaDB Server
Component/s: Galera SST, mariabackup, wsrep
Affects Version/s: 10.6.2, 10.2.39, 10.3.30, 10.4.20, 10.5.11
Fix Version/s: 10.2.40, 10.3.31, 10.4.21, 10.5.12, 10.6.3

Type: Bug Priority: Critical
Reporter: Matthew Latin Assignee: Julius Goryavsky
Resolution: Fixed Votes: 0
Labels: galera, replication
Environment:

Linux vc-galera01 5.4.114-1-pve #1 SMP PVE 5.4.114-1 (Sun, 09 May 2021 17:13:05 +0200) x86_64 x86_64 x86_64 GNU/Linux

mysql Ver 15.1 Distrib 10.5.11-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2


Issue Links:
Duplicate
is duplicated by MDEV-26117 Typo in wsrep_sst_mariabackup script ... Closed
Relates
relates to MDEV-26360 Using hostnames for MariaBackup SSTs ... Closed

 Description   

Upgrading MariaDB from 10.5.10 to 10.5.11 breaks the wsrep_sst_mariabackup script. The cause is the newly added commonname option passed to socat, which, at least on my end, defaults to "localhost". There is also a typo on line 389 (the E and S in UNESCAPED are swapped) that is most likely breaking something as well.

elif is_local_ip "$WSREP_SST_OPT_HOST_UNESCAPED"; then
    CN_option=',commonname=localhost'
else
    CN_option=",commonname='$WSREP_SST_OPT_HOST_UNSECAPED'" <- Right here
fi
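
For illustration, a minimal sketch of what the misspelled variable name does in a POSIX shell. The host value is a stand-in, and whether the real script runs with nounset (`set -u`) is an assumption not confirmed in this report; both cases are shown.

```shell
#!/bin/sh
# Stand-in value for illustration; the real script derives it from --address.
WSREP_SST_OPT_HOST_UNESCAPED='node1.mycompany.com'
# Make sure the misspelled name is really unset in this demo.
unset WSREP_SST_OPT_HOST_UNSECAPED

# Without nounset, the misspelled name silently expands to an empty string.
cn_typo=$(set +u; printf '%s' ",commonname='$WSREP_SST_OPT_HOST_UNSECAPED'")
# The correctly spelled name yields the intended option.
cn_ok=",commonname='$WSREP_SST_OPT_HOST_UNESCAPED'"

echo "typo:    $cn_typo"
echo "correct: $cn_ok"

# With nounset, the same expansion aborts the subshell instead.
if ( set -u; : "$WSREP_SST_OPT_HOST_UNSECAPED" ) 2>/dev/null; then
    nounset_result='expanded'
else
    nounset_result='aborted'
fi
echo "with set -u: $nounset_result"
```

Either way the option handed to socat is not the intended host name: it is empty in the first case, and the script stops before socat is started in the second.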

To just get the node up I had to make the following change on line 391 of wsrep_sst_mariabackup, which also triggered a very inconvenient SST.

#tcmd="$tcmd,cert='$tpem',key='$tkey',cafile='$tcert'$CN_option$sockopt"
tcmd="$tcmd,cert='$tpem',key='$tkey',cafile='$tcert'$sockopt"

Below is my log that led me to looking at the differences between the two versions:

Jun 24 15:05:03 node1 -wsrep-sst-joiner: Decrypting with cert=/etc/mysql/certs/server-cert.pem, key=/etc/mysql/certs/server-key.pem, cafile=/etc/mysql/certs/ca.pem
Jun 24 15:05:03 node1 -wsrep-sst-joiner: Evaluating timeout -k 310 300 socat -u openssl-listen:4444,reuseaddr,cert='/etc/mysql/certs/server-cert.pem',key='/etc/mysql/certs/server-key.pem',cafile='/etc/mysql/certs/ca.pem',commonname=localhost stdio | '/usr//bin/mbstream' -x; RC=( ${PIPESTATUS[@]} )
Jun 24 15:05:03 node1 mariadbd[9378]: 2021-06-24 15:05:03 1 [Note] WSREP: ####### IST uuid:9795eb17-c967-11eb-896e-32dd10aa7427 f: 628742, l: 628743, STRv: 3
Jun 24 15:05:03 node1 mariadbd[9378]: 2021-06-24 15:05:03 1 [Note] WSREP: IST receiver addr using ssl://node1.mycompany.com:4568
Jun 24 15:05:03 node1 mariadbd[9378]: 2021-06-24 15:05:03 1 [Note] WSREP: IST receiver using ssl
Jun 24 15:05:03 node1 mariadbd[9378]: 2021-06-24 15:05:03 1 [Note] WSREP: Prepared IST receiver for 628742-628743, listening at: ssl://node1-ip:4568
Jun 24 15:05:03 node1 mariadbd[9378]: 2021-06-24 15:05:03 0 [Note] WSREP: Member 2.0 (node1) requested state transfer from '*any*'. Selected 1.0 (node3)(SYNCED) as donor.
Jun 24 15:05:03 node1 mariadbd[9378]: 2021-06-24 15:05:03 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 628743)
Jun 24 15:05:03 node1 mariadbd[9378]: 2021-06-24 15:05:03 1 [Note] WSREP: Requesting state transfer: success, donor: 1
Jun 24 15:05:05 node1 mariadbd[9378]: 2021-06-24 15:05:05 0 [Note] WSREP: (75d699bd-98d2, 'ssl://0.0.0.0:4567') turning message relay requesting off
Jun 24 15:05:06 node1 -wsrep-sst-joiner: 2021/06/24 15:05:06 socat[9577] E certificate is valid but its commonName does not match hostname
Jun 24 15:05:06 node1 -wsrep-sst-joiner: Error while getting data from donor node:  exit codes: 1 0
Jun 24 15:05:06 node1 -wsrep-sst-joiner: Cleanup after exit with status:32
Jun 24 15:05:06 node1 -wsrep-sst-joiner: Removing the sst_in_progress file
Jun 24 15:05:06 node1 -wsrep-sst-joiner: Cleaning up temporary directories
Jun 24 15:05:06 node1 mariadbd[9378]: 2021-06-24 15:05:06 0 [ERROR] WSREP: Process completed with error: wsrep_sst_mariabackup --role 'joiner' --address 'node1.mycompany.com' --datadir '/var/lib/mysql/' --parent '9378' --mysqld-args --wsrep_start_position=9795eb17-c967-11eb-896e-32dd10aa7427:628741: 32 (Broken pipe)
Jun 24 15:05:06 node1 mariadbd[9378]: 2021-06-24 15:05:06 0 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.



 Comments   
Comment by Julius Goryavsky [ 2021-06-25 ]

https://github.com/MariaDB/server/commit/4ad148b148cfbb6f78b33ad9a7662f47c24cb759 and https://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.2-MDEV-26019-galera

Comment by Jan Lindström (Inactive) [ 2021-06-25 ]

Ok to push if buildbot is happy

Comment by Julius Goryavsky [ 2021-06-26 ]

Fixed:
10.2: https://github.com/MariaDB/server/commit/4ad148b148cfbb6f78b33ad9a7662f47c24cb759
10.3: https://github.com/MariaDB/server/commit/29098083f7ac3b445ee59c3e765eb634ec70b947

Comment by Felix Huettner [ 2021-08-19 ]

Hello everyone,

I think the fix does not cover the full issue. While the typo is gone now (which allows the donor to actually start up), the joiner is still affected: socat is started with `commonname=localhost`. This causes the SST to fail, because the joiner then cannot validate the donor's certificate correctly (the certificate definitely does not match localhost).

This seems to have been introduced here: https://github.com/MariaDB/server/commit/fe7e44d8ad5d7fe9c91f476353a3e1749f18afc6?branch=fe7e44d8ad5d7fe9c91f476353a3e1749f18afc6&diff=split#diff-1f9bb0e7c32584ac58bd554eeb3bb5f5f69b9310e7566d7566e71725926503dbR353 (in the diff of scripts/wsrep_sst_mariabackup.sh). That change removes the previous distinction between donor and joiner (where only the donor actually got `commonname` set) and requires the common name for both the donor and the joiner.

It uses the variable `WSREP_SST_OPT_HOST_UNESCAPED` for that, which is always the hostname/IP of the joining node. Therefore the check here (https://github.com/MariaDB/server/blob/d1a948cfaaab67e699674af4c11efad3868a629d/scripts/wsrep_sst_mariabackup.sh#L387) finds, when run on the joiner, that the address is in fact the local node and thereby sets `commonname=localhost`.

To fix this I would propose not appending `$CN_option` at https://github.com/MariaDB/server/blob/d1a948cfaaab67e699674af4c11efad3868a629d/scripts/wsrep_sst_mariabackup.sh#L392 if `$WSREP_SST_OPT_ROLE = 'joiner'`.
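
A rough sketch of that proposal, only to make the idea concrete. The helper name `append_cn_option` and its arguments are hypothetical and not taken from the script; this is not the change that was eventually merged.

```shell
#!/bin/sh
# Hypothetical helper illustrating the proposal; not the upstream code.
#   $1: role ('joiner' or 'donor')
#   $2: socat option string built so far
#   $3: the already computed CN option (may be empty)
append_cn_option() {
    if [ "$1" = 'joiner' ]; then
        # On the joiner the host check resolves to the local node, so the
        # option would be commonname=localhost; leave it out entirely.
        printf '%s\n' "$2"
    else
        # On the donor, keep the common name check as before.
        printf '%s\n' "$2$3"
    fi
}

# Example calls with placeholder option strings.
append_cn_option joiner "cert='server-cert.pem'" ",commonname=localhost"
append_cn_option donor "cert='server-cert.pem'" ",commonname='node1.mycompany.com'"
```

With this shape the joiner's socat options stay the same as before the `commonname` change, while the donor still checks the peer's common name.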

Thank you

Comment by Julius Goryavsky [ 2021-12-06 ]

felix.huettner@mail.schwarz Thanks for the comment; this change was added to MDEV-26360.
