[MDEV-27459] SST works as expected on joiner node but donor node never leaves donor state. Created: 2022-01-10  Updated: 2022-06-27  Resolved: 2022-02-08

Status: Closed
Project: MariaDB Server
Component/s: Galera, Galera SST
Affects Version/s: 10.4.21, 10.5.12
Fix Version/s: 10.4.23, 10.5.14, 10.6.6, 10.7.2

Type: Bug Priority: Critical
Reporter: Pon Suresh Pandian (Inactive) Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 5
Labels: None
Environment:

Ubuntu Linux


Attachments: Text File db-prod02.donor.txt     Text File db-prod04.joiner.txt     File node1.err     File node2.err    
Issue Links:
Blocks
blocks MDEV-27789 mysql_upgrade / mariadb-upgrade in 10... Closed
Relates
relates to MDEV-26918 mariabackup SST donor stuck in DONOR/... Closed
relates to MDEV-26969 mariabackup sst donor stuck in Donor/... Closed

 Description   

Hi Team,

The Galera donor node is left in "Donor/Desynced" after an SST with mariabackup. I have attached the logs; please check them.

After SST the donor node is left in:

MariaDB [(none)]> show status like 'wsrep_local_state%';
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | abacdfcc-5c70-11ea-b34c-c2bcad908195 |
| wsrep_local_state         | 2                                    |
| wsrep_local_state_comment | Donor/Desynced                       |
+---------------------------+--------------------------------------+
3 rows in set (0.001 sec)
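
For anyone who wants to detect this condition automatically, the following is a minimal sketch rather than an official tool. It parses the client output shown above and flags a node that still reports `wsrep_local_state` 2 (Donor/Desynced) once the SST is known to have finished. The function names and the `sst_finished` flag are assumptions made for illustration; how you learn that the SST has finished (for example from the joiner reaching Synced) is left to the caller.

```python
# Numeric wsrep_local_state values and their usual comments in Galera.
WSREP_STATES = {1: "Joining", 2: "Donor/Desynced", 3: "Joined", 4: "Synced"}

SAMPLE_OUTPUT = """
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | abacdfcc-5c70-11ea-b34c-c2bcad908195 |
| wsrep_local_state         | 2                                    |
| wsrep_local_state_comment | Donor/Desynced                       |
+---------------------------+--------------------------------------+
3 rows in set (0.001 sec)
"""


def parse_status_table(text):
    """Parse the ASCII table printed by the command-line client into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, the prompt, and the row-count footer
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            status[cells[0]] = cells[1]
    return status


def donor_is_stuck(status, sst_finished):
    """Hypothetical check: the node still reports state 2 after the SST ended."""
    return sst_finished and int(status["wsrep_local_state"]) == 2


if __name__ == "__main__":
    status = parse_status_table(SAMPLE_OUTPUT)
    state = int(status["wsrep_local_state"])
    print(WSREP_STATES[state], donor_is_stuck(status, sst_finished=True))
```

A donor legitimately sits in state 2 while the transfer is running, so the check is only meaningful once the transfer is over.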



 Comments   
Comment by Ralf Becker [ 2022-01-11 ]

This happens for me too (5-node geo-distributed Galera cluster running 10.5.13) in roughly 50% of my SSTs.
Killing or restarting the stuck donor brings it back to normal with an IST.

Ralf

Comment by Jean-Louis Dupond [ 2022-01-11 ]

The same happens here on 10.5.12, also using mariabackup.

Could this get some priority? The only fix is restarting the node, which is annoying.

Comment by Jan Lindström (Inactive) [ 2022-01-14 ]

julien.fritsch Based on Seppo's tests, this does not look like a regression, so it is not a blocker for the next release.

Comment by Seppo Jaakola [ 2022-01-17 ]

This was analyzed to be a problem on the Galera side. A fix has been prepared for both Galera 3 and Galera 4, and I can no longer reproduce the issue with this test scenario.

Comment by Seppo Jaakola [ 2022-01-17 ]

The fixes are merged in galera-bugs 3.x, 4.x and 4.ee HEAD. Please confirm whether the customer's use case is fixed by this.

Comment by Jan Lindström (Inactive) [ 2022-01-20 ]

This should be fixed in Galera library 26.4.11 by commit 67341d07.

Comment by Ramesh Sivaraman [ 2022-01-20 ]

jplindst The bug fix looks good. The donor wsrep state changes correctly in the given test case and in a normal SST.

Comment by Jan Lindström (Inactive) [ 2022-01-20 ]

ponsuresh.pandians Can the customer test with the 26.4.11 Galera library?

Comment by Lars Mikkelsen [ 2022-01-20 ]

Please provide a download link for galera-26.4.11 so I can test this.

The newest version I can find on archive.mariadb.org is 26.4.6, since that is what is packaged even in the newer tar files.

➜  Downloads md5 galera-26.4.9-systemd-x86_64/usr/lib/galera/libgalera_smm.so mariadb-10.4.21-linux-systemd-x86_64/lib/galera/libgalera_smm.so galera-26.4.6-systemd-x86_64/usr/lib/galera/libgalera_smm.so
MD5 (galera-26.4.9-systemd-x86_64/usr/lib/galera/libgalera_smm.so) = a04f49ca276e50bb0a48e7decfeebcff
MD5 (mariadb-10.4.21-linux-systemd-x86_64/lib/galera/libgalera_smm.so) = a04f49ca276e50bb0a48e7decfeebcff
MD5 (galera-26.4.6-systemd-x86_64/usr/lib/galera/libgalera_smm.so) = a04f49ca276e50bb0a48e7decfeebcff
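
The same comparison can be scripted when many packages need checking. The sketch below is an illustration only: it hashes each file with Python's standard `hashlib` and reports groups of paths whose contents are byte-identical, which is what the three matching MD5 values above indicate. The helper names are made up for this example.

```python
import hashlib
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identical_sets(paths):
    """Group paths by digest and keep only groups with more than one member."""
    groups = defaultdict(list)
    for path in paths:
        groups[md5_of(path)].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    import sys

    # Usage: python compare_libs.py <path> <path> ...
    for group in identical_sets(sys.argv[1:]):
        print("identical content:", ", ".join(group))
```

If libraries unpacked from differently versioned packages land in the same group, the package name and the packaged binary disagree.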

Some of your build routines may need a checkup.

Comment by Lars Mikkelsen [ 2022-01-20 ]

And in the status output on a 10.4.21 server, it's also the reported version:

MariaDB [(none)]> show status like 'wsrep_provider_ver%';
+------------------------+------------------+
| Variable_name          | Value            |
+------------------------+------------------+
| wsrep_provider_version | 26.4.6(r1d8d67c) |
+------------------------+------------------+
1 row in set (0.000 sec)
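
To check a fleet of nodes against the fixed release, the reported string can be compared numerically. This is a sketch under stated assumptions: the 26.4.11 threshold is the Galera 4 version named in this ticket, the function names are illustrative, and Galera 3 libraries (for which the ticket names no fixed version) are not handled and would simply compare as older.

```python
import re


def parse_provider_version(value):
    """Split e.g. '26.4.6(r1d8d67c)' into ((26, 4, 6), 'r1d8d67c')."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:\((\w+)\))?", value.strip())
    if match is None:
        raise ValueError(f"unrecognized provider version: {value!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    return numbers, match.group(2)


def has_fix(value, minimum=(26, 4, 11)):
    """True if the provider version is at least the assumed fixed release."""
    numbers, _revision = parse_provider_version(value)
    return numbers >= minimum


if __name__ == "__main__":
    print(parse_provider_version("26.4.6(r1d8d67c)"), has_fix("26.4.6(r1d8d67c)"))
```

Comparing integer tuples rather than strings matters here, because as plain text "26.4.9" sorts after "26.4.11".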

Comment by Jan Lindström (Inactive) [ 2022-01-26 ]

ponsuresh.pandians You can compile the newest Galera library from source if you need it now. If not, we will soon release MariaDB Community Server with the new Galera library. This link https://dlm.mariadb.com/browse/mariadb_server/76/1194/ seems to contain 26.4.9. I do not know why your link shows such an old library.

Comment by Lars Mikkelsen [ 2022-01-26 ]

It actually doesn't. The package names are correct and say 26.4.9, but once you unpack them you will see that the version is only 26.4.6, from October 2020.

This has already been reported to, and confirmed by, your support.

Comment by Jan Lindström (Inactive) [ 2022-01-26 ]

lmk@netic.dk Thank you for your report. I opened https://jira.mariadb.org/browse/TODO-3320 to fix this.

Comment by Jan Lindström (Inactive) [ 2022-01-26 ]

Please try the following: https://archive.mariadb.org/mariadb-10.4.22/galera-26.4.9/

Comment by Lars Mikkelsen [ 2022-01-26 ]

That's also an old version, from October 2020:

➜  galera wget https://archive.mariadb.org/mariadb-10.4.22/galera-26.4.9/bintar/galera-26.4.9-systemd-x86_64.tar.gz
--2022-01-26 08:16:35--  https://archive.mariadb.org/mariadb-10.4.22/galera-26.4.9/bintar/galera-26.4.9-systemd-x86_64.tar.gz
Resolving archive.mariadb.org (archive.mariadb.org)... 2a01:4f8:c17:cad6::1, 138.201.152.105
Connecting to archive.mariadb.org (archive.mariadb.org)|2a01:4f8:c17:cad6::1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20394498 (19M) [application/octet-stream]
Saving to: ‘galera-26.4.9-systemd-x86_64.tar.gz’
 
galera-26.4.9-systemd-x86_64.tar.gz                        100%[=======================================================================================================================================>]  19,45M  4,71MB/s    in 5,5s
 
2022-01-26 08:16:41 (3,52 MB/s) - ‘galera-26.4.9-systemd-x86_64.tar.gz’ saved [20394498/20394498]
 
➜  galera tar zxvf galera-26.4.9-systemd-x86_64.tar.gz
x galera-26.4.9-systemd-x86_64/
x galera-26.4.9-systemd-x86_64/usr/
x galera-26.4.9-systemd-x86_64/usr/lib/
x galera-26.4.9-systemd-x86_64/usr/lib/galera/
x galera-26.4.9-systemd-x86_64/usr/lib/galera/libgalera_smm.so
x galera-26.4.9-systemd-x86_64/usr/lib/libgalera_smm.so
...   
x galera-26.4.9-systemd-x86_64/etc/
x galera-26.4.9-systemd-x86_64/etc/default/
x galera-26.4.9-systemd-x86_64/etc/default/garb
x galera-26.4.9-systemd-x86_64/etc/init.d/
x galera-26.4.9-systemd-x86_64/etc/init.d/garb
➜  galera ls -l galera-26.4.9-systemd-x86_64/usr/lib/galera/libgalera_smm.so
-rw-r--r--  1 lmk  staff  40221093 Oct 22  2020 galera-26.4.9-systemd-x86_64/usr/lib/galera/libgalera_smm.so

Comment by Jan Lindström (Inactive) [ 2022-01-26 ]

The Debian packages seem to be fine, but the bintar packages are not.

Comment by Jan Lindström (Inactive) [ 2022-02-08 ]

Fixed in Galera library 26.4.11, commit 9561a159c.

Generated at Thu Feb 08 09:53:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.