[MDEV-15265] mysql failed to log into server on a member of the Galera cluster when it reboots after restore Created: 2018-02-09 Updated: 2020-11-20 Resolved: 2020-11-20 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera |
| Affects Version/s: | 10.2.8 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Minor |
| Reporter: | Samantha Chu | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Environment: |
OS = RHEL7.4 |
||
| Description |
|
Hi:

There is a Galera issue when a member of the Galera cluster reboots after the restore operation. In this case, mysql fails to connect to the server, specifically:

[root@mdbaas-demo-app-1 ~]# mysql -p
Enter password:
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)

There is a manual workaround for this situation. On the impacted member, run the following commands:

1. systemctl stop mariadb
2. DATADIR=/mariadb/data # Note: This is the default, set to match your configuration
3. mv $DATADIR/grastate.dat $DATADIR/grastate.dat.OLD
4. systemctl start mariadb

The workaround has to be applied when you reboot AFTER executing the restore operation for the Galera configuration. Once applied, subsequent reboots continue to work until the operator performs another restore operation. At that point, the workaround needs to be applied again.

We only see this issue in the Galera configuration, not in Master/2-slave, Master/Master, or standalone configurations. Also, I think the node that performed the restore does not have this problem; only the nodes that receive the replicated data have this issue. The backup/restore method was mysqldump.

Do you have any suggestions on why this is happening?

P.S. I do not have a lab setup for this right now. If more info is needed, let me know and I'll find some time to set one up.

Thanks, |
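The four workaround steps in the description can be wrapped in a small script. This is only a sketch under stated assumptions: the function names `set_aside_grastate` and `apply_workaround` and the `SERVICE_CMD` override are hypothetical and not part of the ticket; the service name `mariadb`, the default data directory `/mariadb/data`, and the `grastate.dat.OLD` rename are taken from the description.

```shell
#!/bin/sh
# Sketch of the manual workaround from the description (not an official fix).
# Hypothetical helpers: set_aside_grastate, apply_workaround, SERVICE_CMD.

# Rename grastate.dat to grastate.dat.OLD inside the given data directory.
set_aside_grastate() {
    datadir="$1"
    state_file="$datadir/grastate.dat"
    if [ -f "$state_file" ]; then
        mv "$state_file" "$state_file.OLD"
    else
        echo "no grastate.dat in $datadir, nothing to move" >&2
        return 1
    fi
}

# Stop the service, set the state file aside, start the service again.
# $1: data directory (defaults to /mariadb/data, the default named in the ticket).
# SERVICE_CMD: assumed override so the steps can be dry-run without systemctl.
apply_workaround() {
    datadir="${1:-/mariadb/data}"
    svc="${SERVICE_CMD:-systemctl}"
    "$svc" stop mariadb || return 1
    # Start the service again even if there was no state file to move.
    set_aside_grastate "$datadir"
    "$svc" start mariadb
}
```

After sourcing the script as root on the impacted member, `apply_workaround /mariadb/data` runs the same sequence as steps 1 to 4; pass a different path if your data directory is not the default.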
| Comments |
| Comment by Zdravelina Sokolovska (Inactive) [ 2018-02-15 ] |
|
sschu Hello, |
| Comment by Samantha Chu [ 2018-02-16 ] |
|
Thank you for this info. We are in the process of developing a feature to use mariabackup as the wsrep_sst_method for Galera. It may take a couple of weeks before the feature is ready for me to test and see if it makes any difference. I will provide an update as soon as I learn something. Thanks, |
| Comment by Samantha Chu [ 2018-02-20 ] |
|
Hi: We tested wsrep_sst_method=mariabackup with mysqldump used for backup/restore, and we still see the same symptoms (i.e. mysql cannot connect to the server). We also tested wsrep_sst_method=mariabackup with mariabackup used for backup/restore; in that case mysql can connect to the server just fine when a reboot is done after a restore. Do you have any other suggestions, and can you confirm whether this is a bug? Thanks, |
| Comment by Zdravelina Sokolovska (Inactive) [ 2018-02-22 ] |
|
Hello sschu, you said that with wsrep_sst_method=mariabackup and mariabackup used for backup/restore, mysql can connect to the server just fine when a reboot is done after a restore. That's the correct flow. |
| Comment by Samantha Chu [ 2018-02-22 ] |
|
Hi: Yes, I understand the flow is correct for that case. My problem is the other case: when wsrep_sst_method is mariabackup (as you suggested we try) or rsync (our original setup for this ticket) and mysqldump is used for backup/restore, mysql cannot connect to the server when a reboot is done after a restore. Can you confirm whether this is a bug? Thanks, |
| Comment by Samantha Chu [ 2018-03-02 ] |
|
Hi: Wondering if you have any updates? Thanks, |
| Comment by Zdravelina Sokolovska (Inactive) [ 2018-03-07 ] |
|
hello sschu, |
| Comment by Zdravelina Sokolovska (Inactive) [ 2018-03-07 ] |
|
sschu, you may also create feature requests for further improvements or for features you would find useful. |
| Comment by Samantha Chu [ 2018-03-08 ] |
|
Hi: Yes, our restore script did remove the grastate.dat file before rebooting the node, but we still have the same issue. This looks like a bug, since things worked fine after the restore and the problem appears only after rebooting the node. It sounds like you would rather treat this as a feature request for further improvement instead of a bug fix? If so, would you let me know how to open a feature request for this? Thanks, |
| Comment by Jan Lindström (Inactive) [ 2020-11-20 ] |
|
I suggest upgrading to a more recent version of MariaDB Server and the Galera library. If you can then repeat the issue, please provide detailed steps to reproduce it and full error logs from all nodes. |