[MDEV-22554] galera.galera_sst_mariabackup fails with "Failed to start mysqld.2" Created: 2020-05-14 Updated: 2020-06-03 Resolved: 2020-05-18 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Galera SST, mariabackup, Tests |
| Affects Version/s: | 10.5.3 |
| Fix Version/s: | 10.5.4, 10.2.33, 10.3.24, 10.4.14 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Michael Widenius | Assignee: | Julius Goryavsky |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: | BUILD/compile-pentium64-valgrind-max |
| Attachments: |
|
| Description |
|
It always fails with: |
| Comments |
| Comment by Stepan Patryshev (Inactive) [ 2020-05-14 ] |
|
It also failed on Jenkins 10.5 ES: |
| Comment by Julius Goryavsky [ 2020-05-15 ] |
|
The problem is related to the netcat streamer and does not appear on systems where socat is installed. We probably need to add the -N option for netcat. As a local fix, find the second "# Debian netcat" comment in the /scripts/wsrep_scripts_mariabackup file (in the scripts directory) and change tcmd="nc ${REMOTEIP} ${TSST_PORT}" to tcmd="nc -N ${REMOTEIP} ${TSST_PORT}". I am now checking whether adding this option is enough, or whether a further refinement is needed for a complete solution. |
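The local fix described above can be sketched as a before/after of the assignment. This is only an illustration, not the committed patch: the values of REMOTEIP and TSST_PORT are placeholders chosen here, and the surrounding logic of the SST script is omitted.

```shell
#!/bin/sh
# Placeholder values, assumed for illustration only; in the SST script
# these variables are set elsewhere.
REMOTEIP="192.0.2.10"
TSST_PORT="4444"

# Before: netcat may keep the connection open after EOF on stdin.
# tcmd="nc ${REMOTEIP} ${TSST_PORT}"

# After: -N asks netcat to shut down the socket once stdin reaches EOF.
tcmd="nc -N ${REMOTEIP} ${TSST_PORT}"

echo "$tcmd"
```

Note that -N is an option of the OpenBSD-derived netcat (the variant the "# Debian netcat" comment refers to); other netcat implementations may not accept it.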
| Comment by Julius Goryavsky [ 2020-05-15 ] |
|
Fixed, https://github.com/MariaDB/server/commit/08f3ca8020af50fad80783b87bc70733036e5269 |
| Comment by Julius Goryavsky [ 2020-05-15 ] |
|
The problem turned out to be a freeze of the netcat streamer after the SST completes successfully. Several experiments showed that the data transmitted during the SST is correct, but netcat does not perform a graceful TCP disconnect when it receives EOF on STDIN. To solve this, netcat must be called with the -N option on the donor side. The fix is here: https://github.com/MariaDB/server/commit/08f3ca8020af50fad80783b87bc70733036e5269 |
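Because not every netcat variant accepts -N, one way to apply the option safely on the donor side is to probe the help text first. The sketch below is an assumption about how such a check could look, not a copy of the linked commit; build_donor_tcmd and its arguments are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of the MariaDB scripts): choose the
# donor-side netcat command.
#   $1 = help text of the local nc, e.g. the output of `nc -h 2>&1`
#   $2 = joiner address
#   $3 = SST port
build_donor_tcmd() {
    help_text=$1
    remote_ip=$2
    sst_port=$3
    # Look for -N as a standalone option in the help text.
    if printf '%s\n' "$help_text" |
        grep -Eq -- '(^|[[:space:]])-N([[:space:]]|$)'; then
        # -N: shut down the socket after EOF on stdin.
        printf 'nc -N %s %s\n' "$remote_ip" "$sst_port"
    else
        # Fall back to the plain invocation when -N is not advertised.
        printf 'nc %s %s\n' "$remote_ip" "$sst_port"
    fi
}
```

A caller could use it as `tcmd=$(build_donor_tcmd "$(nc -h 2>&1)" "$REMOTEIP" "$TSST_PORT")`. Passing the help text as an argument keeps the helper testable on machines where netcat is not installed.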
| Comment by Jan Lindström (Inactive) [ 2020-05-15 ] |
|
OK to push, but please push the change to 10.2 as well. |
| Comment by Julius Goryavsky [ 2020-05-18 ] |
|
Fixed & closed after verification |