[MDEV-31155] How to troubleshoot "operation not permitted" error in Galera cluster? Created: 2023-04-30 Updated: 2023-05-02 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Galera SST |
| Affects Version/s: | 10.6.10 |
| Fix Version/s: | 10.6 |
| Type: | Bug | Priority: | Major |
| Reporter: | William Wong | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
RHEL7 on VMware |
||
| Attachments: |
|
| Description |
|
Hi, our databases are mostly Galera clusters: 2 DB nodes + 1 arbitrator.
This problem has already occurred 4 to 5 times in roughly 20 to 100 DB node restarts. Is there any way to troubleshoot it at the next occurrence? We suspect the problem is on the donor node side, but because we needed to bring the cluster back up, we restarted the donor and can no longer troubleshoot it there. We can only investigate at the next occurrence. The DB log from one case is uploaded: The DB parameter file is uploaded: Regards, |
| Comments |
| Comment by Daniel Black [ 2023-05-01 ] | |||
|
The main error is in the node1 logs:
The underlying cause is that fgets failed to read. MariaDB could be more descriptive about the cause of this error by using ferror when emitting the first error message in sql/wsrep_sst.cc:sst_donor_thread. The "Operation not permitted" text is an annoying Galera habit of making up an errno from other information and presenting it as if it were the real one. It seems likely that the SST script started, and mariabackup too, and then terminated itself immediately. The joiner message of:
The joiner is just unpacking the donor's message in group_unserialize_code_msg. Are the following log files in the datadir?
Can you include these? | |||
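To illustrate the point about fgets and ferror, here is a minimal sketch of reading the first line from a child process over a pipe and reporting why the read failed. It is not the code in sql/wsrep_sst.cc; `read_first_line` and the `read_result` codes are hypothetical names made up for this example, and it assumes a POSIX system with `popen`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical result codes for this sketch; not MariaDB identifiers. */
enum read_result { READ_OK, READ_EOF, READ_ERROR, SPAWN_FAILED };

/*
 * Run `cmd` through the shell, read its first line of output into `buf`,
 * and say why the read failed if it did.
 * `exit_code` receives the child's exit code, or -1 if it is not available.
 */
static enum read_result read_first_line(const char *cmd, char *buf, int len,
                                        int *exit_code)
{
    enum read_result res = READ_OK;

    assert(buf != NULL && len > 0 && exit_code != NULL);
    buf[0] = '\0';
    *exit_code = -1;

    FILE *child = popen(cmd, "r");
    if (child == NULL) {
        fprintf(stderr, "popen failed: %s\n", strerror(errno));
        return SPAWN_FAILED;
    }

    errno = 0;
    if (fgets(buf, len, child) == NULL) {
        if (ferror(child)) {
            /* A real I/O error: errno describes what went wrong. */
            res = READ_ERROR;
            fprintf(stderr, "read error: %s\n", strerror(errno));
        } else {
            /* End of file: the child closed its output without a line. */
            res = READ_EOF;
            fprintf(stderr, "child produced no output before closing\n");
        }
    }

    /* pclose waits for the child and returns its wait status. */
    int status = pclose(child);
    if (status != -1 && WIFEXITED(status))
        *exit_code = WEXITSTATUS(status);

    return res;
}
```

Calling `read_first_line("exit 3", buf, sizeof buf, &code)` returns `READ_EOF` with `code` set to 3, which separates "the script exited early without printing anything" from a genuine read error, without guessing an errno.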
| Comment by William Wong [ 2023-05-01 ] | |||
|
Thanks @daniel. Agreed, the root cause should be on the donor side. In the past several incidents, we could not find a mariabackup log file in the donor node's datadir at the time of the issue. | |||
| Comment by Daniel Black [ 2023-05-01 ] | |||
|
Alternatively, try running mariabackup outside of SST with similar options and see whether it succeeds. | |||
| Comment by William Wong [ 2023-05-02 ] | |||
|
The following was found in the node 1 DB log. Should we run wsrep_sst_mariabackup at the next occurrence? We cannot find a standalone mariabackup command in the DB log.
|