Details
Bug
Status: Open
Major
Resolution: Unresolved
11.2.2
Description
The Kubernetes operator for MariaDB provisions MariaDB Galera clusters by creating containers one at a time, waiting until the `wsrep_ready` status variable reports `ON` on each node before creating the next one. This ensures that only one node attempts to join the cluster at any given time.
When a new node joins the cluster, Galera performs an SST: it chooses an existing node as donor and transfers its state to the new node in order to initialize it. We have seen this process fail intermittently, both when bootstrapping the cluster and when a node goes down. In that case the unhealthy container is restarted by Kubernetes and a new one is created, which means the SST is retried. This repeats until the container reaches a healthy state and the node is part of the cluster.
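The one-at-a-time provisioning described above can be sketched as a polling loop. This is only an illustration, not the operator's actual code: `wait_until_ready`, `provision_sequentially`, and the injected `create` and `is_ready` callables are hypothetical names. In a real setup, `is_ready` would presumably run `SHOW STATUS LIKE 'wsrep_ready'` against the node and compare the value with `ON`.

```python
import time
from typing import Callable, Iterable, List


def wait_until_ready(is_ready: Callable[[], bool],
                     timeout: float = 300.0,
                     interval: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll is_ready() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


def provision_sequentially(nodes: Iterable[str],
                           create: Callable[[str], None],
                           is_ready: Callable[[str], bool],
                           **wait_kwargs) -> List[str]:
    """Create nodes one by one, so only one node is joining at a time.

    Stops at the first node that does not become ready in time and
    returns the nodes that joined successfully.
    """
    joined: List[str] = []
    for node in nodes:
        create(node)  # hypothetical hook: start the container for this node
        if not wait_until_ready(lambda: is_ready(node), **wait_kwargs):
            break
        joined.append(node)
    return joined
```

Injecting `create`, `is_ready`, and `sleep` keeps the loop testable without a running cluster; the timeout and interval defaults are arbitrary.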
Here are the configuration files for each node:
mariadb-galera-0
[mariadb]
bind-address=0.0.0.0
default_storage_engine=InnoDB
binlog_format=row
innodb_autoinc_lock_mode=2

# Cluster configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://mariadb-galera-0.mariadb-galera-internal.default.svc.cluster.local,mariadb-galera-1.mariadb-galera-internal.default.svc.cluster.local,mariadb-galera-2.mariadb-galera-internal.default.svc.cluster.local"
wsrep_cluster_name=mariadb-operator
wsrep_slave_threads=1

# Node configuration
wsrep_node_address="mariadb-galera-0.mariadb-galera-internal.default.svc.cluster.local"
wsrep_node_name="mariadb-galera-0"
wsrep_sst_method="mariabackup"
wsrep_sst_auth="<user>:<password>"
mariadb-galera-1
[mariadb]
bind-address=0.0.0.0
default_storage_engine=InnoDB
binlog_format=row
innodb_autoinc_lock_mode=2

# Cluster configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://mariadb-galera-0.mariadb-galera-internal.default.svc.cluster.local,mariadb-galera-1.mariadb-galera-internal.default.svc.cluster.local,mariadb-galera-2.mariadb-galera-internal.default.svc.cluster.local"
wsrep_cluster_name=mariadb-operator
wsrep_slave_threads=1

# Node configuration
wsrep_node_address="mariadb-galera-1.mariadb-galera-internal.default.svc.cluster.local"
wsrep_node_name="mariadb-galera-1"
wsrep_sst_method="mariabackup"
wsrep_sst_auth="<user>:<password>"
mariadb-galera-2
[mariadb]
bind-address=0.0.0.0
default_storage_engine=InnoDB
binlog_format=row
innodb_autoinc_lock_mode=2

# Cluster configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://mariadb-galera-0.mariadb-galera-internal.default.svc.cluster.local,mariadb-galera-1.mariadb-galera-internal.default.svc.cluster.local,mariadb-galera-2.mariadb-galera-internal.default.svc.cluster.local"
wsrep_cluster_name=mariadb-operator
wsrep_slave_threads=1

# Node configuration
wsrep_node_address="mariadb-galera-2.mariadb-galera-internal.default.svc.cluster.local"
wsrep_node_name="mariadb-galera-2"
wsrep_sst_method="mariabackup"
wsrep_sst_auth="<user>:<password>"
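The three files above differ only in `wsrep_node_address` and `wsrep_node_name`. As a rough sketch of how such per-node files could be rendered from the pod ordinal (the function names `node_fqdn` and `render_config` are hypothetical and not taken from the operator; the values are copied from the files above):

```python
SERVICE = "mariadb-galera-internal.default.svc.cluster.local"


def node_fqdn(index: int) -> str:
    """DNS name of the node with the given ordinal."""
    return f"mariadb-galera-{index}.{SERVICE}"


def render_config(index: int, replicas: int = 3) -> str:
    """Render the [mariadb] section for one node of the cluster."""
    peers = ",".join(node_fqdn(i) for i in range(replicas))
    lines = [
        "[mariadb]",
        "bind-address=0.0.0.0",
        "default_storage_engine=InnoDB",
        "binlog_format=row",
        "innodb_autoinc_lock_mode=2",
        "",
        "# Cluster configuration",
        "wsrep_on=ON",
        "wsrep_provider=/usr/lib/galera/libgalera_smm.so",
        f'wsrep_cluster_address="gcomm://{peers}"',
        "wsrep_cluster_name=mariadb-operator",
        "wsrep_slave_threads=1",
        "",
        "# Node configuration",
        f'wsrep_node_address="{node_fqdn(index)}"',
        f'wsrep_node_name="mariadb-galera-{index}"',
        'wsrep_sst_method="mariabackup"',
        # Credentials are left as a placeholder, as in the report.
        'wsrep_sst_auth="<user>:<password>"',
    ]
    return "\n".join(lines) + "\n"
```

For example, `render_config(1)` yields the file shown for mariadb-galera-1.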
It is important to note that we are using DNS names in the cluster address. These resolve to the IPs of the containers, but every time a container is restarted by Kubernetes it is assigned a new IP.
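To check what the peer names currently resolve to after a restart, something like the following could be used. It is a minimal sketch: `parse_gcomm` and `resolve_peers` are hypothetical helpers, and the parser assumes a plain comma-separated host list as in the configuration above (no ports or `?option=value` suffixes).

```python
import socket
from typing import Callable, Dict, List, Optional


def parse_gcomm(address: str) -> List[str]:
    """Extract the host names from a gcomm:// cluster address."""
    prefix = "gcomm://"
    if not address.startswith(prefix):
        raise ValueError(f"not a gcomm address: {address!r}")
    return [h.strip() for h in address[len(prefix):].split(",") if h.strip()]


def resolve_peers(address: str,
                  resolver: Callable[[str], str] = socket.gethostbyname
                  ) -> Dict[str, Optional[str]]:
    """Map each peer host name to its current IP, or None if unresolvable."""
    result: Dict[str, Optional[str]] = {}
    for host in parse_gcomm(address):
        try:
            result[host] = resolver(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            result[host] = None
    return result
```

Running `resolve_peers` before and after a container restart would show whether the IP behind a given name has changed.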
The log files attached to this issue show the crash happening after a node went down and then kept attempting to rejoin the cluster for a while. After roughly 30 minutes, it finally managed to join.
I have also tried to use rsync instead of mariabackup, but it didn't help.
I was a bit curious about the InnoDB part of this and initially thought that there was something strange. But that part looks fine after all.
In mariadb-galera-2-recovered.log we can see that wsrep_sst_method=mariabackup did invoke mariadb-backup --prepare, because InnoDB starts up with a dummy ib_logfile0:
mariadb-11.2.2
2024-01-31 10:29:35 0 [Note] InnoDB: End of log at LSN=1623012
2024-01-31 10:29:35 0 [Note] InnoDB: Resizing redo log from 12.016KiB to 96.000MiB; LSN=1623012
Starting with MDEV-14425, the normal log file size should be a multiple of 4096 bytes. Here, the file size seems to be 12288+16=12304 bytes. Starting with MDEV-27199, a log file is mandatory. Such dummy 12304-byte files are created at the end of mariadb-backup --prepare in order to pass a start LSN to the server.
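The size arithmetic can be checked quickly. The snippet below is only an illustration: `describe_redo_log` and `kib` are hypothetical helpers, and classifying a file purely by its size is an assumption made for this sketch, not how the server decides.

```python
BLOCK_SIZE = 4096          # normal log file sizes are multiples of this
DUMMY_SIZE = 12288 + 16    # size of the dummy ib_logfile0 discussed above


def kib(size_bytes: int) -> str:
    """Format a byte count in KiB with three decimals."""
    return f"{size_bytes / 1024:.3f}KiB"


def describe_redo_log(size_bytes: int) -> str:
    """Classify an ib_logfile0 by size alone (illustrative assumption)."""
    if size_bytes == DUMMY_SIZE:
        return "dummy file (as written by mariadb-backup --prepare)"
    if size_bytes % BLOCK_SIZE == 0:
        return "regular size (multiple of 4096 bytes)"
    return "unexpected size"


print(DUMMY_SIZE, kib(DUMMY_SIZE), describe_redo_log(DUMMY_SIZE))
print(describe_redo_log(96 * 1024 * 1024))
```

12304 bytes is 12.016KiB when rounded to three decimals, which matches the "Resizing redo log from 12.016KiB" message, and it is not a multiple of 4096, unlike the 96.000MiB target size.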