Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
It turned out that the Galera node status details stored in the JSON file created as part of MDEV-21901 do NOT include the node eviction status, which is represented by some messages in the error log and by status variables such as wsrep_evs_state, wsrep_evs_delayed and wsrep_evs_evict_list.
Please add this information.
Issue Links
- relates to MDEV-26971 JSON file interface to wsrep node state / SST progress logging (Closed)
-
Activity
- branch: preview-10.11-MDEV-29281-galea-node-eviction-status
- Galera library version: 26.4.14 from branch mariadb-4.x-test
Node eviction status is now written to the JSON file.
{
  "date": "2022-12-09 08:13:27.573",
  "timestamp": 1670573607.57340074,
  "errors": [
    {
      "timestamp": 1670573607.00000000,
      "msg": "WSREP: exception from gcomm, backend must be restarted: this node has been evicted out of the cluster, gcomm backend restart is required (FATAL)\n\t at \/test\/mtest\/10.11_galera\/gcomm\/src\/gmcast_proto.cpp:handle_failed():295"
    }
  ],
  "warnings": [
    {
      "timestamp": 1670573188.00000000,
      "msg": "'user' entry 'root@node1' ignored in --skip-name-resolve mode."
    },
    {
      "timestamp": 1670573188.00000000,
      "msg": "'user' entry '@node1' ignored in --skip-name-resolve mode."
    },
    {
      "timestamp": 1670573188.00000000,
      "msg": "'proxies_priv' entry '@% root@node1' ignored in --skip-name-resolve mode."
    },
    {
      "timestamp": 1670573607.00000000,
      "msg": "WSREP: handshake with 514a168d-a05d tcp:\/\/192.168.100.10:4567 failed: 'evicted'"
    }
  ],
  "events": [
    {
      "timestamp": 1670573607.57092166,
      "event": {"status": "evicted", "message": "This node was evicted permanently from cluster, restart is required"}
    }
  ],
  "status": {
    "state": "DISCONNECTED",
    "comment": "Disconnected",
    "progress": { "from": -1, "to": -1, "total": -1, "done": -1, "indefinite": -1 }
  }
}
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep%stat%';
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | d42e04ae-7796-11ed-9641-164ea6a4b8d0 |
| wsrep_local_state         | 0                                    |
| wsrep_local_state_comment | Initialized                          |
| wsrep_cluster_state_uuid  | 00000000-0000-0000-0000-000000000000 |
| wsrep_cluster_status      | Disconnected                         |
+---------------------------+--------------------------------------+
5 rows in set (0.002 sec)
MariaDB [(none)]>
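A monitoring script could pick the eviction out of the status file shown above by looking at the "events" array. The following is only a minimal sketch, assuming the file keeps the layout from this comment; the helper names `load_status` and `eviction_events` and the file path passed to `load_status` are hypothetical and not part of the server.

```python
import json


def load_status(path):
    """Read and parse a wsrep status JSON file (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def eviction_events(status):
    """Return the entries of the "events" array whose event status is "evicted"."""
    found = []
    for entry in status.get("events", []):
        event = entry.get("event", {})
        if event.get("status") == "evicted":
            found.append({
                "timestamp": entry.get("timestamp"),
                "message": event.get("message"),
            })
    return found


# Trimmed copy of the status file from this comment (errors and warnings omitted).
sample = json.loads("""
{
  "date": "2022-12-09 08:13:27.573",
  "timestamp": 1670573607.57340074,
  "events": [
    {
      "timestamp": 1670573607.57092166,
      "event": {"status": "evicted", "message": "This node was evicted permanently from cluster, restart is required"}
    }
  ],
  "status": {"state": "DISCONNECTED", "comment": "Disconnected"}
}
""")

for item in eviction_events(sample):
    print(item["timestamp"], item["message"])
```

The status variables alone (local state 0, cluster status Disconnected) do not say why the node is disconnected, which is why this sketch reads the event entry rather than the "status" object.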
denis.protivensky 11.0 does not print the node eviction status in the events section.
11.0.0 1cb0835be98985f20cccd1724ac78de3649eb2e6
Test case:
node3:root@localhost> show global status like 'wsrep_gcomm_uuid';
+------------------+--------------------------------------+
| Variable_name    | Value                                |
+------------------+--------------------------------------+
| wsrep_gcomm_uuid | 74ae01e0-9316-11ed-a9d5-7208a7fc2a19 |
+------------------+--------------------------------------+
1 row in set (0.002 sec)
node3:root@localhost> set global wsrep_provider_evs_evict='74ae01e0-9316-11ed-a9d5-7208a7fc2a19';
Query OK, 0 rows affected (0.002 sec)
node3:root@localhost> show global status like 'wsrep%stat%';
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | 1e476211-930e-11ed-89f9-da74090fa0cb |
| wsrep_local_state         | 0                                    |
| wsrep_local_state_comment | Initialized                          |
| wsrep_cluster_state_uuid  | 00000000-0000-0000-0000-000000000000 |
| wsrep_cluster_status      | Disconnected                         |
+---------------------------+--------------------------------------+
5 rows in set (0.001 sec)
node3:root@localhost>
Status file:
Every 3.0s: cat node3/wsrep_status.json galapq: Fri Jan 13 09:51:53 2023
{
  "date": "2023-01-13 09:48:00.000",
  "timestamp": 1673596080.00000000,
  "errors": [
    {
      "timestamp": 1673596080.00000000,
      "msg": "WSREP: exception from gcomm, backend must be restarted: this node has been evicted out of the cluster, gcomm backend restart is required (FATAL)\n\t at \/test\/mtest\/galera\/gcomm\/src\/gmcast_proto.cpp:handle_failed():283"
    }
  ],
  "warnings": [
    {
      "timestamp": 1673596080.00000000,
      "msg": "WSREP: handshake with e2482adf-ac83 tcp:\/\/127.0.0.1:11391 failed: 'evicted'"
    },
    {
      "timestamp": 1673596080.00000000,
      "msg": "Aborted connection 2 to db: 'unconnected' user: 'unauthenticated' host: '' (This connection closed normally without authentication)"
    },
    {
      "timestamp": 1673596080.00000000,
      "msg": "Aborted connection 6 to db: 'unconnected' user: 'unauthenticated' host: '' (This connection closed normally without authentication)"
    }
  ],
  "events": [
  ],
  "status": {
    "state": "DISCONNECTED",
    "comment": "Disconnected",
    "progress": { "from": -1, "to": -1, "total": -1, "done": -1, "indefinite": -1 }
  }
}
Ramesh Sivaraman I checked out the commit SHA and performed the steps you described to evict the node, and the event is generated for me. Can you check that you're using the appropriate Galera library that contains the fix to emit node eviction events?
denis.protivensky Sorry, you are right, I was using the Galera 4.x base branch with the 11.0 version. It works fine with the Galera branch mariadb-4.x-test.
  ],
  "events": [
    {
      "timestamp": 1673624273.64578247,
      "event": {"status": "evicted", "message": "This node was evicted permanently from cluster, restart is required"}
    }
  ],
  "status": {
    "state": "DISCONNECTED",
    "comment": "Disconnected",
    "progress": { "from": -1, "to": -1, "total": -1, "done": -1, "indefinite": -1 }
  }
}
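The two status files in this thread differ only in whether the eviction shows up in "events" or just in the logged messages. A check along the following lines could tell the two cases apart. This is a sketch under the assumption that the word "evicted" appears in the logged messages as it does above; the function name `classify_eviction` and its return labels are made up for illustration.

```python
def classify_eviction(status):
    """Classify a parsed status document by where the eviction is visible.

    Returns "event" if the "events" array carries an "evicted" entry,
    "log-only" if only "errors"/"warnings" messages mention the eviction,
    and "none" otherwise.
    """
    has_event = any(
        entry.get("event", {}).get("status") == "evicted"
        for entry in status.get("events", [])
    )
    if has_event:
        return "event"

    in_log = any(
        "evicted" in item.get("msg", "")
        for key in ("errors", "warnings")
        for item in status.get(key, [])
    )
    return "log-only" if in_log else "none"


# Shapes taken from the two status files in this thread, trimmed to the relevant keys.
without_event = {
    "errors": [{"msg": "WSREP: exception from gcomm, backend must be restarted: "
                       "this node has been evicted out of the cluster"}],
    "warnings": [{"msg": "WSREP: handshake with e2482adf-ac83 failed: 'evicted'"}],
    "events": [],
}
with_event = {
    "events": [{"timestamp": 1673624273.64578247,
                "event": {"status": "evicted",
                          "message": "This node was evicted permanently from cluster, "
                                     "restart is required"}}],
}

print(classify_eviction(without_event))  # log-only
print(classify_eviction(with_event))     # event
```

Going by the exchange above, a "log-only" result would most likely point to a Galera library build that does not yet emit the node eviction event, rather than to a server-side problem.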
My original task for this, MDEV-21901, was closed as "Won't Do" with the idea of using this JSON status file instead. Unfortunately, that did not happen.