Details
Bug
Status: Open
Critical
Resolution: Unresolved
10.11.9
None
Ubuntu 22.04.5
Description
Hi MariaDB team,
We have a five-node MariaDB Galera Cluster.
After taking one node out of service, clearing its data directories, and starting MariaDB, the state transfer (SST) starts, but at about 90% it fails with:
...
Sep 27 09:20:58 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:20:58 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 19935177778, "done": 19935177778, "indefinite": -1 }'
Sep 27 09:21:04 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:04 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 19947753124, "done": 19947753124, "indefinite": -1 }'
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Warning] WSREP: 0.0 (maria-muc-1): State transfer to 3.0 (maria-ham-3) failed: -22 (Invalid argument)
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [ERROR] WSREP: ./gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():1217: Will never receive state. Need to abort.
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Note] WSREP: gcomm: terminating thread
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Note] WSREP: gcomm: joining thread
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Note] WSREP: gcomm: closing backend
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Note] WSREP: view(view_id(NON_PRIM,0abc8cb4-9ae0,609) memb {
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: a591b16e-9113,0
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: } joined {
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: } left {
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: } partitioned {
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 0abc8cb4-9ae0,0
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 29958514-86e6,0
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 7150c82e-8f59,0
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: dfc13b8b-8838,0
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: })
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Note] WSREP: PC protocol downgrade 1 -> 0
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Note] WSREP: view((empty))
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Note] WSREP: gcomm: closed
Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Note] WSREP: /usr/sbin/mariadbd: Terminated.
Sep 27 09:21:08 maria-ham-3 systemd[1]: mariadb.service: Main process exited, code=killed, status=6/ABRT
Sep 27 09:21:08 maria-ham-3 systemd[1]: mariadb.service: Failed with result 'signal'.
Sep 27 09:21:08 maria-ham-3 systemd[1]: Failed to start MariaDB 10.11.9 database server.
The same error recurs on every attempt to restart the transfer.
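For anyone triaging a longer journal than the excerpt above, the progress records and the non-Note WSREP lines can be pulled out with a short script. This is only a minimal Python sketch and not part of the original report: it assumes the log lines have exactly the shape shown above, and the function name `summarize_sst_log` is hypothetical.

```python
import json
import re

# JSON payload of a "REPORTING SST PROGRESS" line, as seen in the log above.
PROGRESS_RE = re.compile(r"REPORTING SST PROGRESS: '(\{.*\})'")
# Severity tag that mariadbd prints in square brackets.
LEVEL_RE = re.compile(r"\[(Note|Warning|ERROR)\]")


def summarize_sst_log(lines):
    """Return (progress_records, problem_lines) for an iterable of log lines.

    progress_records: list of dicts decoded from the SST progress JSON.
    problem_lines:    lines tagged [Warning] or [ERROR], in original order.
    """
    progress = []
    problems = []
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match:
            progress.append(json.loads(match.group(1)))
            continue
        level = LEVEL_RE.search(line)
        if level and level.group(1) in ("Warning", "ERROR"):
            problems.append(line.strip())
    return progress, problems


if __name__ == "__main__":
    # Three lines copied from the log excerpt above.
    sample = [
        "Sep 27 09:21:04 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:04 0 [Note] WSREP: "
        "REPORTING SST PROGRESS: '{ \"from\": 1, \"to\": 3, \"total\": 19947753124, "
        "\"done\": 19947753124, \"indefinite\": -1 }'",
        "Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Warning] WSREP: "
        "0.0 (maria-muc-1): State transfer to 3.0 (maria-ham-3) failed: -22 (Invalid argument)",
        "Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27 9:21:08 0 [Note] WSREP: "
        "gcomm: closed",
    ]
    progress, problems = summarize_sst_log(sample)
    last = progress[-1]
    print(f"last progress record: {last['done']} of {last['total']} bytes")
    for line in problems:
        print(line)
```

In the excerpt above, both progress records report "done" equal to "total", and the first non-Note line is the "State transfer to 3.0 (maria-ham-3) failed: -22 (Invalid argument)" warning.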
Versions:
# dpkg -l galera-4 mariadb-server
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version                 Architecture Description
+++-==============-=======================-============-====================================================
ii  galera-4       26.4.19-ubu2204         amd64        Replication framework for transactional applications
ii  mariadb-server 1:10.11.9+maria~ubu2204 amd64        MariaDB database server binaries
I set the wsrep_sst_donor on node maria-ham-3 to maria-muc-1 to make debugging easier.
The filesystems have enough space and I can't find any reason why the transfer stops after it started successfully.
Can you give me any hint on where to look or what to do for further debugging?
Greetings
Lars