[MDEV-10004] Galera's pc.recovery process fails in 10.1 with systemd Created: 2016-04-27 Updated: 2020-08-25 Resolved: 2016-05-27 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, wsrep |
| Affects Version/s: | 10.1.13 |
| Fix Version/s: | 10.1.15 |
| Type: | Bug | Priority: | Major |
| Reporter: | Geoff Montee (Inactive) | Assignee: | Nirbhay Choubey (Inactive) |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | galera, systemd, wsrep |
| Sprint: | 10.2.1-3, 10.2.1-4 |
| Description |
|
Galera's pc.recovery process allows a cluster to automatically recover after a crash without bootstrapping. When I try to test this recovery process in MariaDB 10.1 on CentOS/RHEL 7, automatic recovery always fails with a vague "Operation not permitted" error.
To reproduce, assume a two-node cluster. First, bootstrap the first node:
Then start mysqld on the second node:
Now to simulate a crash, let's kill mysqld on both nodes:
Now let's verify that both grastate.dat and gvwstate.dat have meaningful information:
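One way to check the saved state programmatically is sketched below. It is only an illustration: it assumes grastate.dat consists of `key: value` lines plus `#` comment lines, the sample content and its UUID are made up, and the function names are hypothetical. gvwstate.dat is not handled here.

```python
NIL_UUID = "00000000-0000-0000-0000-000000000000"


def parse_grastate(text):
    """Parse the 'key: value' lines of a grastate.dat file into a dict."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment lines such as the file header.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def has_usable_state(state):
    """Return True if the saved state carries a non-nil cluster UUID."""
    return state.get("uuid", NIL_UUID) != NIL_UUID


# Illustrative file content (assumption); the UUID is invented.
SAMPLE = """\
# GALERA saved state
version: 2.1
uuid:    6f2a1c3e-0c7b-11e6-9d1a-5b8f0e2d4c11
seqno:   -1
cert_index:
"""

state = parse_grastate(SAMPLE)
print(state["uuid"], state["seqno"], has_usable_state(state))
```

A seqno of -1 alongside a valid UUID would be consistent with an unclean shutdown, whereas an all-zero UUID would mean the node has no usable saved state.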
Now, start mysqld on both nodes normally. We are not bootstrapping the first node here because we would like automatic recovery to take place:
When this happens, you will likely see that the saved state is initially restored:
But eventually, mysqld will abort when it supposedly attempts to SST, but fails:
I suspect the failure occurs because the Group UUID is 00000000-0000-0000-0000-000000000000, even though grastate.dat and gvwstate.dat both seem to have valid values. It seems as though the server is ignoring the valid Group UUID, or it is not being transmitted or received properly. When the server performs an SST during a normal startup in which automatic recovery is not attempted, everything works fine. This only seems to happen during recovery.
The configuration files for these nodes look like this:
The galera provider being used is 25.3.15. |
| Comments |
| Comment by Geoff Montee (Inactive) [ 2016-04-27 ] |
|
Note that automatic recovery seems to work fine on MariaDB Galera Cluster 10.0 on CentOS/RHEL 6, so this problem might be specific to MariaDB 10.1 or systemd. |
| Comment by Geoff Montee (Inactive) [ 2016-04-28 ] |
|
Automatic recovery also seems to work fine on MariaDB 10.1 on CentOS/RHEL 6, so this problem might be specific to systemd or something else on CentOS/RHEL 7.
To see it working on CentOS 6, I did the following. Bootstrap the first node:
Then start the second node:
Then to simulate a crash, kill mysqld on both nodes:
Then verify that both grastate.dat and gvwstate.dat have meaningful information (I'll attach the full logs as node1_centos6_success.err and node2_centos6_success.err):
Then start mysqld on both nodes normally. Again, we are not bootstrapping the first node here because we would like automatic recovery to take place:
Here's an example of the relevant log section:
|
| Comment by Geoff Montee (Inactive) [ 2016-04-28 ] |
|
Automatic recovery also seems to work fine in MariaDB Galera Cluster 10.0 on CentOS/RHEL 7. Since MariaDB Galera Cluster 10.0 doesn't use systemd and MariaDB 10.1 does, maybe systemd is somehow causing the failure. |
| Comment by Nirbhay Choubey (Inactive) [ 2016-05-04 ] |
|
GeoffMontee Right, systemd is the culprit here. The init scripts use mysqld_safe to start mysqld. mysqld_safe |
| Comment by Sergey Vojtovich [ 2016-05-05 ] |
|
One option would be to use ExecStartPre to generate an additional config file with wsrep-start-position, and ExecStartPost to remove it. |
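The ExecStartPre idea could be sketched roughly as follows. This is an illustration in Python, not the patch that was eventually committed. It assumes the recovery run logs a line of the form `WSREP: Recovered position: <uuid>:<seqno>`; the function names, the sample log line, and its UUID are hypothetical.

```python
import re

# Assumed log line shape: "... WSREP: Recovered position: <uuid>:<seqno>"
_RECOVERED = re.compile(
    r"WSREP: Recovered position:\s*([0-9a-fA-F-]{36}:-?\d+)"
)


def extract_recovered_position(log_text):
    """Return the last '<uuid>:<seqno>' recovered position in the log, or None."""
    matches = _RECOVERED.findall(log_text)
    return matches[-1] if matches else None


def render_start_position_cnf(position):
    """Render a config fragment that sets wsrep-start-position."""
    return "[mysqld]\nwsrep-start-position={}\n".format(position)


# Illustrative log excerpt (assumption); timestamp and UUID are invented.
SAMPLE_LOG = (
    "2016-05-05 10:00:00 140000000000000 [Note] WSREP: Recovered position: "
    "6f2a1c3e-0c7b-11e6-9d1a-5b8f0e2d4c11:42\n"
)

position = extract_recovered_position(SAMPLE_LOG)
if position is not None:
    # A pre-start hook would write this text to a temporary config file.
    print(render_start_position_cnf(position), end="")
```

A pre-start step would write the rendered fragment to a file the server reads at startup, and a post-start step would delete it so that a stale position is never reused.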
| Comment by Daniel Black [ 2016-05-05 ] |
|
Can the --wsrep-recover logic be moved inside the server to make the second startup with --wsrep-start-position redundant? Was there a reason for keeping these separate? If mysqld_safe adds the wsrep-recover logic whenever wsrep=on, is there any reason for this not to be part of the server logic? systemctl set-environment .. may also be of assistance if I've missed some important logic here. |
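As a rough sketch of the `systemctl set-environment` suggestion, a pre-start helper could pass the recovered position to the service through an environment variable. The variable name `WSREP_START_POSITION_OPT`, the helper function, and the position value below are hypothetical; the sketch only builds and prints the command rather than running it.

```python
import shlex


def set_environment_argv(name, value):
    """Build the argv for 'systemctl set-environment NAME=VALUE'."""
    if not name.isidentifier():
        raise ValueError("invalid variable name: {!r}".format(name))
    return ["systemctl", "set-environment", "{}={}".format(name, value)]


# Hypothetical variable name and made-up position, for illustration only.
argv = set_environment_argv(
    "WSREP_START_POSITION_OPT",
    "--wsrep-start-position=6f2a1c3e-0c7b-11e6-9d1a-5b8f0e2d4c11:42",
)

# Print the command instead of executing it, so the sketch has no side effects.
print(" ".join(shlex.quote(arg) for arg in argv))
```

Under this assumption, the unit's ExecStart line would reference the variable so that the option is appended to the mysqld command line only when a position was recovered.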
| Comment by Geoff Montee (Inactive) [ 2016-05-11 ] |
|
It does seem that automatic recovery works on MariaDB 10.1 on CentOS/RHEL 7 if I use the old SysV init scripts instead of the systemd service. To reproduce: First, move the systemd service file and reload on both nodes:
Then bootstrap the first node. Note: we cannot use galera_new_cluster here, since that relies on mariadb.service (which we just moved from its expected location). We also can't pass --wsrep_new_cluster directly to the init script, since systemd ignores extra options.
Then start the second node:
Then to simulate a crash, kill mysqld on both nodes:
After that, we have to stop the service on both nodes for some reason, even though mysqld is already dead. systemd might keep extra state somewhere that needs to be cleared.
Then start mysqld on both nodes normally:
We can see that recovery is working:
|
| Comment by Nirbhay Choubey (Inactive) [ 2016-05-21 ] |
|
svoj danblack Would you be interested in reviewing the patch? |
| Comment by Nirbhay Choubey (Inactive) [ 2016-05-21 ] |
The rsync/xtrabackup based SST requires file transfer to happen before the SE initialization. |
| Comment by Nirbhay Choubey (Inactive) [ 2016-05-27 ] |
|
http://lists.askmonty.org/pipermail/commits/2016-May/009384.html |