Details
- Bug
- Status: Closed
- Blocker
- Resolution: Fixed
- 10.5.9, 10.5.10, 10.5
- None
Description
After upgrading to 10.5.10, and also on 10.5.9, we started seeing the following issue: a node never syncs back to the cluster and ends up in a failed state.
jaro@cv-sqa-us-east4-k8s-lmgmt-a:~$ kubectl logs mysql-2 -n sde mysql
2021/05/20 07:04:07 Peer list updated
was []
now [mysql-0.mysql.sde.svc.cluster.local mysql-1.mysql.sde.svc.cluster.local mysql-2.mysql.sde.svc.cluster.local]
2021/05/20 07:04:07 execing: /opt/galera/on-start.sh with stdin: mysql-0.mysql.sde.svc.cluster.local
mysql-1.mysql.sde.svc.cluster.local
mysql-2.mysql.sde.svc.cluster.local
2021/05/20 07:04:07 *** [Galera] Joining cluster: mysql-0.mysql.sde.svc.cluster.local,mysql-1.mysql.sde.svc.cluster.local
2021/05/20 07:04:08 Peer finder exiting
Galera - Determining recovery position...
galera-recovery.sh: Attempting to recover GTID positon...
2021-05-20 7:04:08 0 [Note] mysqld (mysqld 10.5.10-MariaDB-1:10.5.10+maria~focal) starting as process 48 ...
galera-recovery.sh: Found WSREP position: 6c9afcd7-96b7-11ea-96a5-76e81fcbb085:19951024
Galera recovery position: --wsrep_start_position=6c9afcd7-96b7-11ea-96a5-76e81fcbb085:19951024
2021-05-20 7:04:09 0 [Note] mysqld (mysqld 10.5.10-MariaDB-1:10.5.10+maria~focal) starting as process 1 ...
2021-05-20 7:04:09 0 [Note] WSREP: Loading provider /usr/lib/galera/libgalera_smm.so initial position: 6c9afcd7-96b7-11ea-96a5-76e81fcbb085:19951024
2021-05-20 7:04:09 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
2021-05-20 7:04:09 0 [Note] WSREP: wsrep_load(): Galera 26.4.8(r902dd268) by Codership Oy <info@codership.com> loaded successfully.
2021-05-20 7:04:09 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
2021-05-20 7:04:09 0 [Note] WSREP: Found saved state: 6c9afcd7-96b7-11ea-96a5-76e81fcbb085:-1, safe_to_bootstrap: 1
2021-05-20 7:04:09 0 [Note] WSREP: GCache DEBUG: opened preamble:
Version: 2
UUID: 6c9afcd7-96b7-11ea-96a5-76e81fcbb085
Seqno: -1 - -1
Offset: -1
Synced: 0
2021-05-20 7:04:09 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 6c9afcd7-96b7-11ea-96a5-76e81fcbb085, offset: -1
2021-05-20 7:04:09 0 [Note] WSREP: GCache::RingBuffer initial scan... 0.0% ( 0/134217752 bytes) complete.
2021-05-20 7:04:09 0 [ERROR] WSREP: deque::_M_new_elements_at_back
2021-05-20 7:04:09 0 [ERROR] WSREP: Failed to create a new provider '/usr/lib/galera/libgalera_smm.so' with options '': Failed to initialize wsrep provider
2021-05-20 7:04:09 0 [ERROR] WSREP: Failed to load provider
2021-05-20 7:04:09 0 [ERROR] Aborting
On the other nodes we could also see:
2021/05/20 07:37:08 Peer list updated
was []
now [mysql-0.mysql.default.svc.cluster.local mysql-1.mysql.default.svc.cluster.local mysql-2.mysql.default.svc.cluster.local]
2021/05/20 07:37:08 execing: /opt/galera/on-start.sh with stdin: mysql-0.mysql.default.svc.cluster.local
mysql-1.mysql.default.svc.cluster.local
mysql-2.mysql.default.svc.cluster.local
2021/05/20 07:37:08 *** [Galera] Joining cluster: mysql-1.mysql.default.svc.cluster.local,mysql-2.mysql.default.svc.cluster.local
2021/05/20 07:37:09 Peer finder exiting
Galera - Determining recovery position...
galera-recovery.sh: Attempting to recover GTID positon...
2021-05-20 7:37:09 0 [Note] mysqld (mysqld 10.5.9-MariaDB-1:10.5.9+maria~focal) starting as process 49 ...
galera-recovery.sh: Found WSREP position: 8338b624-66eb-11eb-93e0-323a1dc8d4de:8714889
Galera recovery position: --wsrep_start_position=8338b624-66eb-11eb-93e0-323a1dc8d4de:8714889
2021-05-20 7:37:10 0 [Note] mysqld (mysqld 10.5.9-MariaDB-1:10.5.9+maria~focal) starting as process 1 ...
2021-05-20 7:37:10 0 [Note] WSREP: Loading provider /usr/lib/galera/libgalera_smm.so initial position: 8338b624-66eb-11eb-93e0-323a1dc8d4de:8714889
2021-05-20 7:37:10 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
2021-05-20 7:37:10 0 [Note] WSREP: wsrep_load(): Galera 4.7(ree4f10fc) by Codership Oy <info@codership.com> loaded successfully.
2021-05-20 7:37:10 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
2021-05-20 7:37:10 0 [Note] WSREP: Found saved state: 8338b624-66eb-11eb-93e0-323a1dc8d4de:-1, safe_to_bootstrap: 1
2021-05-20 7:37:10 0 [Note] WSREP: GCache DEBUG: opened preamble:
Version: 2
UUID: 8338b624-66eb-11eb-93e0-323a1dc8d4de
Seqno: -1 - -1
Offset: -1
Synced: 0
2021-05-20 7:37:10 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 8338b624-66eb-11eb-93e0-323a1dc8d4de, offset: -1
2021-05-20 7:37:10 0 [Note] WSREP: GCache::RingBuffer initial scan... 0.0% ( 0/134217752 bytes) complete.
2021-05-20 7:37:10 0 [ERROR] WSREP: std::bad_alloc
2021-05-20 7:37:10 0 [ERROR] WSREP: Failed to create a new provider '/usr/lib/galera/libgalera_smm.so' with options '': Failed to initialize wsrep provider
2021-05-20 7:37:10 0 [ERROR] WSREP: Failed to load provider
2021-05-20 7:37:10 0 [ERROR] Aborting
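Both logs share the same signature: the provider begins the GCache ring buffer scan, throws an exception (deque::_M_new_elements_at_back in one case, std::bad_alloc in the other), and then fails to load. A minimal sketch of checking a saved log for that signature is below. The function name has_gcache_recovery_failure and the idea of saving the log to a file first are assumptions made for illustration, not part of the original report.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): returns 0 when the given log
# file contains the GCache recovery failure shown above, 1 otherwise.
has_gcache_recovery_failure() {
    log_file="$1"
    # The provider must have started scanning the GCache ring buffer...
    grep -q 'GCache::RingBuffer initial scan' "$log_file" || return 1
    # ...then raised one of the two exceptions seen in this report...
    grep -Eq '\[ERROR\] WSREP: (std::bad_alloc|deque::_M_new_elements_at_back)' "$log_file" || return 1
    # ...and finally given up loading the provider.
    grep -q 'WSREP: Failed to load provider' "$log_file"
}
```

For example, after saving the output of the kubectl logs command shown earlier to mysql-2.log, running has_gcache_recovery_failure mysql-2.log succeeds only if all three markers are present. The check ignores line order, so it is a coarse filter rather than a precise diagnosis.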
The only thing we can do is delete the node's disk and let it fully resync. Judging from the linked tickets, /var/lib/mysql/galera.cache seems to have been corrupted somehow, and deleting it "solves" the issue.
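The narrower workaround mentioned in the linked tickets, removing only galera.cache rather than the whole disk, could be sketched roughly as follows. This is an illustration under assumptions, not a verified fix: the function name quarantine_gcache is hypothetical, and moving the file aside instead of deleting it is a choice made here so the suspect cache stays available for analysis. It must run while mysqld is stopped; where to hook it into a deployment (for instance, before the server starts in a start-up script) is not covered by the report.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): move galera.cache out of the
# way so the provider no longer tries to recover it on the next start.
# $1: data directory; defaults to /var/lib/mysql, the path named above.
quarantine_gcache() {
    datadir="${1:-/var/lib/mysql}"
    cache="$datadir/galera.cache"
    if [ ! -f "$cache" ]; then
        echo "no galera.cache in $datadir, nothing to do"
        return 0
    fi
    # Keep a timestamped copy rather than deleting the file outright.
    backup="$cache.corrupt.$(date +%Y%m%d%H%M%S)"
    mv "$cache" "$backup" && echo "moved $cache to $backup"
}
```

Note that the ring buffer in the logs above is about 128 MiB (134217752 bytes), so a kept copy costs that much disk space; once the copy is no longer needed it can simply be removed.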
Issue Links
- relates to
  - MDEV-24615 MariaDB 10.5.8 Galera node fails to start with WSREP: std::bad_alloc (Closed)
  - MDEV-25605 Failed to initialize wsrep provider - Galera (Closed)