[MDEV-10414] After updating to MariaDB-server-10.1.16-1.el7.centos.x86_64 cannot start galera cluster Created: 2016-07-21 Updated: 2016-08-05 Resolved: 2016-07-22 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera |
| Affects Version/s: | 10.1.16 |
| Fix Version/s: | 10.1.17 |
| Type: | Bug | Priority: | Major |
| Reporter: | sam stein | Assignee: | Nirbhay Choubey (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | galera | ||
| Environment: |
OpenVZ VPS. Fully updated Centos v7 with latest yum updates as of today |
||
| Description |
|
I have been running MariaDB-server-10.1.14-1.el7.centos.x86_64.rpm for a while with no problems. As soon as I updated to MariaDB-server-10.1.16-1.el7.centos.x86_64.rpm, the Galera cluster stopped working. I can start MariaDB by setting wsrep_on=OFF, but it will not start with wsrep_on=ON. I tried restarting the cluster and setting the node to primary with wsrep_cluster_address=gcomm://, but that did not work. I checked the logs, but there is nothing in them. I tried disabling all the settings in the config except the cluster config, but that didn't work either. Downgrading back to MariaDB-server-10.1.14-1.el7.centos.x86_64.rpm was the only way to get the cluster working again. I did not change anything else. I tried upgrading again and hit the same problem. Something in MariaDB-server-10.1.16-1.el7.centos.x86_64.rpm causes Galera to fail. Also, you guys removed MariaDB-server-10.1.14-1.el7.centos.x86_64.rpm from the repository. Luckily, I was able to find a mirror that had it archived. |
| Comments |
| Comment by Elena Stepanova [ 2016-07-21 ] | |||||||||||||||||
|
As for the repo question, these packages are not MariaDB's; ours are named differently: http://yum.mariadb.org/10.1.16/centos7-amd64/rpms/ The actual Galera question goes to nirbhay_c, to see whether something is wrong with MGC. It is also possible that the packages themselves are broken. | |||||||||||||||||
| Comment by Igor Gueths [ 2016-07-21 ] | |||||||||||||||||
|
I can confirm that this bug has existed since the 10.1.15 package release. As to what causes it, I traced it to some sort of file parsing issue in the wsrep_recover_position function in /usr/bin/galera_recovery. Specifically somewhere within this block:
Unfortunately I have not had further time to work on a fix for this; however, hoping Assignee et al can pick this up soon, as I am stuck on 10.1.14 for my clusters now as a result. Thanks. | |||||||||||||||||
| Comment by Richard Lane [ 2016-07-22 ] | |||||||||||||||||
|
I ran into the same issue and traced it to the same wsrep_recover_position() code in the galera_recovery script. I think the script assumes that when it starts mysqld to perform the position recovery, the line it is looking for, "WSREP: Recovered position: ", will be written to stdout. In my case, that output went to where I had configured mysqld's error log to go: # tail -1 /var/log/mariadb/mysqld.log That is not where galera_recovery was expecting it. I got galera_new_cluster to work by changing the line at the beginning of wsrep_recover_position() to: # Redirect server's error log to the log file. | |||||||||||||||||
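The parsing step described in the comment above can be sketched roughly as follows. This is a minimal illustration, not the code from galera_recovery: the function name extract_recovered_position is hypothetical, and it assumes the position is the first whitespace-delimited token after the "WSREP: Recovered position:" marker in whichever log file the server actually wrote to.

```shell
#!/bin/sh
# Hypothetical helper (not part of galera_recovery): print the last
# position that follows "WSREP: Recovered position:" in a given log file.
# Prints nothing if the marker never reached that file, which is the
# failure mode discussed in this thread.
extract_recovered_position() {
  log_file="$1"
  # Keep only the token after the marker, then take the most recent match.
  sed -n 's/.*WSREP: Recovered position:[[:space:]]*\([^[:space:]]*\).*/\1/p' \
    "$log_file" | tail -n 1
}
```

For example, with the error log at /var/log/mariadb/mysqld.log as in the comment above, calling extract_recovered_position /var/log/mariadb/mysqld.log after a recovery run would print the last position recorded in that file, or nothing at all if the server logged the line somewhere else.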
| Comment by sam stein [ 2016-07-22 ] | |||||||||||||||||
|
To clarify, I am using the official MariaDB10 repository. I copied the name wrong. | |||||||||||||||||
| Comment by Nirbhay Choubey (Inactive) [ 2016-07-22 ] | |||||||||||||||||
|
rvlane, you got it right. | |||||||||||||||||
| Comment by Nirbhay Choubey (Inactive) [ 2016-07-22 ] | |||||||||||||||||