Details
- Type: Task
- Status: Closed
- Priority: Critical
- Resolution: Fixed
Description
https://github.com/MariaDB/server/pull/1982
Codership is planning to add a new feature to cluster nodes: reporting some wsrep status variables in a dedicated JSON file that can then be read by an external monitoring tool, or by a human for that matter.
Rationale: until the server is fully initialized it is inaccessible to clients, and the only source of information is the error log, which is not machine-friendly. Since a wsrep node can spend a very long time in the initialization phase (state transfer), automatic tools may be unable to easily monitor its liveness and progress for a very long time.
Rationale for using a file as opposed to some sort of socket: it is simpler and safer, and the file remains if the process aborts, so it is easy to get the last error that caused the abort.
For now the file contents will look as follows:
$ cat /tmp/galera/0/mysql/var/wsrep_status.json
{
  "date": "2021-09-04 15:35:02.000",
  "timestamp": 1630758902.00000000,
  "errors": [
    {
      "timestamp": 1630758901.00000000,
      "msg": "mysqld: Can't open shared library '/tmp/galera/0/mysql/lib64/mysql/plugin/audit_log.so' (errno: 0, cannot open shared object file: No such file or directory)"
    },
    {
      "timestamp": 1630758901.00000000,
      "msg": "Couldn't load plugins from 'audit_log.so'."
    }
  ],
  "warnings": [
    {
      "timestamp": 1630758902.00000000,
      "msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown option '--loose-skip_mysqlx'"
    },
    {
      "timestamp": 1630758902.00000000,
      "msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-log_error_verbosity=3'"
    },
    {
      "timestamp": 1630758902.00000000,
      "msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-audit_log_file=/tmp/galera/0/mysql/var/audit.log'"
    },
    {
      "timestamp": 1630758902.00000000,
      "msg": "'proxies_priv' entry '@% root@void' ignored in --skip-name-resolve mode."
    }
  ],
  "status": {
    "state": "DISCONNECTED",
    "comment": "Disconnected",
    "progress": -1.00000
  }
}
So the file contains a few of the most recent errors and warnings from the error log, the wsrep state, and a progress indicator (in case of SST/IST).
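As a rough illustration of how a monitoring tool might consume such a file, here is a minimal Python sketch. It relies only on the field names visible in the sample (`errors`, `timestamp`, `msg`, `status`, `state`, `progress`). The function name `summarize_status`, the abbreviated sample document, and the reading of a negative `progress` as "no transfer progress to report" are assumptions made for this example, not part of the patch.

```python
import json


def summarize_status(text):
    """Return (state, progress, latest_error_msg) from a status document.

    Assumption: a negative "progress" value means there is no transfer
    progress to report, so it is mapped to None here.
    """
    doc = json.loads(text)
    status = doc.get("status", {})

    # Pick the error with the highest timestamp; on ties the later entry wins.
    latest = None
    for err in doc.get("errors", []):
        if latest is None or err["timestamp"] >= latest["timestamp"]:
            latest = err

    progress = status.get("progress")
    if progress is not None and progress < 0:
        progress = None

    return status.get("state"), progress, latest["msg"] if latest else None


# Abbreviated copy of the sample file (first error message shortened).
SAMPLE = """
{
  "date": "2021-09-04 15:35:02.000",
  "timestamp": 1630758902.0,
  "errors": [
    {"timestamp": 1630758901.0, "msg": "mysqld: Can't open shared library"},
    {"timestamp": 1630758901.0, "msg": "Couldn't load plugins from 'audit_log.so'."}
  ],
  "warnings": [],
  "status": {"state": "DISCONNECTED", "comment": "Disconnected", "progress": -1.0}
}
"""

if __name__ == "__main__":
    print(summarize_status(SAMPLE))
```

Because the file remains on disk after an abort, the same function could be pointed at the last written copy to recover the final error message.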
I have a ready patch for MariaDB 10.4. It introduces a new variable: `wsrep_status_file`. If that variable is unset, no file is created and no reporting is done. The patch does not support SST/IST progress reporting yet, only discrete state changes. We plan to add progress reporting in follow-up patches.
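Since the patch reports discrete state changes, a consumer mostly needs to notice when `status.state` differs between two successive reads of the file. The following Python sketch shows that comparison on already-read JSON documents. The function name `state_changes` is hypothetical, and the state names other than `DISCONNECTED` are illustrative placeholders rather than confirmed values.

```python
import json


def state_changes(snapshots):
    """Yield (old_state, new_state) whenever the reported state changes.

    `snapshots` is an iterable of JSON documents (as strings), for example
    successive reads of the file named by wsrep_status_file.
    """
    previous = None
    for text in snapshots:
        state = json.loads(text)["status"]["state"]
        if previous is not None and state != previous:
            yield previous, state
        previous = state


if __name__ == "__main__":
    # "JOINING" and "SYNCED" are placeholder names for this example only.
    reads = [
        json.dumps({"status": {"state": name}})
        for name in ["DISCONNECTED", "DISCONNECTED", "JOINING", "SYNCED"]
    ]
    for old, new in state_changes(reads):
        print(f"{old} -> {new}")
```

Keeping the comparison separate from file access makes it easy to test and leaves the polling interval to the caller.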
This task also includes progress reporting for mariabackup SST:
- Progress reporting requires the `pv` tool
- Progress/rate-limiting can be disabled by configuration (progress = NONE)
- Progress is now reported in the server error log, for example:
2022-03-16 14:39:27 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 392923645, "indefinite": -1 }'
2022-03-16 14:39:28 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 896353227, "indefinite": -1 }'
2022-03-16 14:39:29 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1386740995, "indefinite": -1 }'
2022-03-16 14:39:30 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1914292021, "indefinite": -1 }'
2022-03-16 14:39:31 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2429366550, "indefinite": -1 }'
2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2731243266, "indefinite": -1 }'
2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2734803150, "done": 2734803150, "indefinite": -1 }'
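The progress lines above embed a small JSON object, so a log watcher can extract it and derive a completion percentage from `done` and `total`. The Python sketch below does that. The function name `parse_progress` and the regular expression are assumptions based solely on the log format shown here, and the `from`, `to` and `indefinite` fields are ignored because their meaning is not described in this task.

```python
import json
import re

# Matches the quoted JSON payload of a progress line as shown in the log excerpt.
PROGRESS_RE = re.compile(r"WSREP: REPORTING SST PROGRESS: '(\{.*\})'")


def parse_progress(line):
    """Return (done, total, percent) for a progress line, or None otherwise."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    payload = json.loads(match.group(1))
    done, total = payload["done"], payload["total"]
    # Guard against a zero or negative total to avoid a division error.
    percent = 100.0 * done / total if total > 0 else None
    return done, total, percent


FIRST = ("2022-03-16 14:39:27 0 [Note] WSREP: REPORTING SST PROGRESS: "
         "'{ \"from\": 1, \"to\": 3, \"total\": 2731303065, "
         "\"done\": 392923645, \"indefinite\": -1 }'")
LAST = ("2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: "
        "'{ \"from\": 1, \"to\": 3, \"total\": 2734803150, "
        "\"done\": 2734803150, \"indefinite\": -1 }'")

if __name__ == "__main__":
    for line in (FIRST, LAST):
        done, total, percent = parse_progress(line)
        print(f"{done}/{total} bytes ({percent:.1f}%)")
```

Note that `total` itself changes in the final line of the excerpt, so a consumer should recompute the percentage from each line rather than caching the first total.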
Issue Links
- causes
  - MDEV-28423 IST is failing on Joiner node when active data load on donor node (Closed)
  - MDEV-28656 Inability to roll upgrade without stopping the Galera cluster (Closed)
  - MDEV-31738 Unable to install MariaDB Community version 10.9+ on RHEL based OS due to unresolved dependency pv (Stalled)
- includes
  - MDEV-21901 Write details into a separate .dat file in case of Galera node auto-eviction (Closed)
- is part of
  - MDEV-28112 prepare 10.9.0 preview releases (Closed)
- relates to
  - MDEV-29281 Add details about node eviction status to the JSON file with Galera node status (Closed)
Activity
Field | Original Value | New Value |
---|---|---|
Assignee | Jan Lindström [ jplindst ] |
Component/s | Galera [ 10124 ] | |
Component/s | wsrep [ 11500 ] | |
Component/s | wsrep [ 15006 ] | |
Key | | |
Issue Type | New Feature [ 2 ] | Task [ 3 ] |
Project | MariaDB Enterprise [ 11500 ] | MariaDB Server [ 10000 ] |
Fix Version/s | 10.8 [ 26121 ] |
Workflow | MariaDB v3 [ 124861 ] | MariaDB v4 [ 131551 ] |
Priority | Major [ 3 ] | Critical [ 2 ] |
Priority | Critical [ 2 ] | Major [ 3 ] |
Fix Version/s | 10.9 [ 26905 ] | |
Fix Version/s | 10.8 [ 26121 ] |
Priority | Major [ 3 ] | Critical [ 2 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Status | In Progress [ 3 ] | In Testing [ 10301 ] |
Assignee | Jan Lindström [ jplindst ] | Ramesh Sivaraman [ JIRAUSER48189 ] |
Assignee | Ramesh Sivaraman [ JIRAUSER48189 ] | Jan Lindström [ jplindst ] |
Status | In Testing [ 10301 ] | Stalled [ 10000 ] |
Fix Version/s | 10.10 [ 27530 ] | |
Fix Version/s | 10.9 [ 26905 ] |
Fix Version/s | 10.9.0 [ 27113 ] | |
Fix Version/s | 10.10 [ 27530 ] |
Fix Version/s | 10.9 [ 26905 ] | |
Fix Version/s | 10.9.0 [ 27113 ] |
Link | This issue is part of |
Description | Original description (the Codership proposal text shown under Description above, without the pull request link or the mariabackup SST section), ending with: "This may be a feature that you'd want to have in the Enterprise alone. Hence we'd like to have a discussion on that and what Maria decides before going on with our own patch." | The same text with that closing passage removed |
Summary | JSON file interface to wsrep node state. | JSON file interface to wsrep node state / SST progress logging |
Description | Proposal text as in the previous revision | Previous revision plus the start of the mariabackup SST progress reporting section (the `pv` requirement, the `progress = NONE` option, and reporting in the server error log) |
Description | As in the previous revision | Previous revision plus the example "WSREP: REPORTING SST PROGRESS" error log lines |
Description | As in the previous revision | Previous revision with the link https://github.com/MariaDB/server/pull/1982 added at the top |
Fix Version/s | 10.9.0 [ 27113 ] | |
Fix Version/s | 10.9 [ 26905 ] | |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Link | This issue relates to TODO-3382 [ TODO-3382 ] |
Link | This issue is part of |
Link | This issue includes |
Link | This issue is part of |
Link | This issue causes |
Link | This issue causes |
Link | This issue relates to |
Labels | Preview_10.9 |
Link | This issue causes MDEV-31738 [ MDEV-31738 ] |
Zendesk Related Tickets | 144345 |