  1. MariaDB Server
  2. MDEV-26971

JSON file interface to wsrep node state / SST progress logging

Details

    Description

      https://github.com/MariaDB/server/pull/1982

      Codership is planning to add a new feature to cluster nodes: reporting some wsrep status variables in a dedicated JSON file that can then be read by an external monitoring tool, or by a human for that matter.

      Rationale: until the server is fully initialized, clients cannot connect to it, and the only source of information is the error log, which is not machine-friendly. Since a wsrep node can spend a very long time in the initialization phase (state transfer), automatic tools may be unable to easily monitor its liveness and progress for a long time.

      Rationale for using a file rather than some sort of socket: it is simpler and safer, and the file remains if the process aborts, so it is easy to get the last error that caused the abort.

      For now the file contents will look as follows:

      $ cat /tmp/galera/0/mysql/var/wsrep_status.json 
      {
      	"date": "2021-09-04 15:35:02.000",
      	"timestamp": 1630758902.00000000,
      	"errors": [
      		{
      			"timestamp": 1630758901.00000000,
      			"msg": "mysqld: Can't open shared library '/tmp/galera/0/mysql/lib64/mysql/plugin/audit_log.so' (errno: 0, cannot open shared object file: No such file or directory)"
      		},
      		{
      			"timestamp": 1630758901.00000000,
      			"msg": "Couldn't load plugins from 'audit_log.so'."
      		}
      	],
      	"warnings": [
      		{
      			"timestamp": 1630758902.00000000,
      			"msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown option '--loose-skip_mysqlx'"
      		},
      		{
      			"timestamp": 1630758902.00000000,
      			"msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-log_error_verbosity=3'"
      		},
      		{
      			"timestamp": 1630758902.00000000,
      			"msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-audit_log_file=/tmp/galera/0/mysql/var/audit.log'"
      		},
      		{
      			"timestamp": 1630758902.00000000,
      			"msg": "'proxies_priv' entry '@% root@void' ignored in --skip-name-resolve mode."
      		}
      	],
      	"status": {
      		"state": "DISCONNECTED",
      		"comment": "Disconnected",
      		"progress": -1.00000
      	}
      }
      

      So the file contains a few of the most recent errors and warnings from the error log, the wsrep state, and a progress indicator (in case of SST/IST).
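
      As an illustration of how a monitoring tool might consume this file, here is a minimal Python sketch. It assumes the field names shown in the sample above (`status.state`, `status.progress`, `errors[].msg`, `warnings[].msg`) and that the file may be missing or incomplete at the moment it is read. `read_wsrep_status` is a hypothetical helper name, not part of the patch, and the final file format may differ.

```python
import json
from pathlib import Path


def read_wsrep_status(path):
    """Summarize a wsrep status file, or return None if it cannot be read.

    Assumption: the layout matches the sample in this issue. A missing or
    unparsable file is treated as "no information yet", not as an error.
    """
    try:
        doc = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None

    status = doc.get("status", {})
    return {
        "state": status.get("state"),
        "progress": status.get("progress"),
        # Keep only the message text of each reported error and warning.
        "errors": [entry.get("msg") for entry in doc.get("errors", [])],
        "warnings": [entry.get("msg") for entry in doc.get("warnings", [])],
    }
```

      A monitoring loop could call this periodically with the path configured for the status file and report the returned `state`, `progress` and `errors`.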

      I have a patch ready for MariaDB 10.4. It introduces a new variable, `wsrep_status_file`. If that variable is unset, no file is created and no reporting is done. The patch does not support SST/IST progress reporting yet, only discrete state changes. We plan to add progress reporting in follow-up patches.

      This task also includes progress reporting for mariabackup SST:

      • Progress reporting requires the pv tool
      • Progress/rate-limiting can be disabled by configuration (progress = NONE)
      • Progress is now reported in the server error log, for example:

        2022-03-16 14:39:27 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 392923645, "indefinite": -1 }'
        2022-03-16 14:39:28 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 896353227, "indefinite": -1 }'
        2022-03-16 14:39:29 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1386740995, "indefinite": -1 }'
        2022-03-16 14:39:30 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1914292021, "indefinite": -1 }'
        2022-03-16 14:39:31 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2429366550, "indefinite": -1 }'
        2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2731243266, "indefinite": -1 }'
        2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2734803150, "done": 2734803150, "indefinite": -1 }'
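
      For tools that follow the error log, the progress lines above can be picked apart roughly as follows. This is a sketch only: it assumes the line layout shown in the example (a JSON object in single quotes after `REPORTING SST PROGRESS:`) and that `done` and `total` are in the same unit. The function names are hypothetical. Because `total` changes on the last sample line, the percentage is computed per line rather than against a fixed total.

```python
import json
import re

# Matches the quoted JSON payload of an SST progress line in the error log.
PROGRESS_RE = re.compile(r"REPORTING SST PROGRESS: '(\{.*\})'")


def parse_sst_progress(line):
    """Return the progress payload as a dict, or None for any other line."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    return json.loads(match.group(1))


def percent_done(payload):
    """Return done/total as a percentage, or None if total is not positive."""
    total = payload["total"]
    if total <= 0:
        return None
    return 100.0 * payload["done"] / total


# First progress line from the example above.
sample = (
    "2022-03-16 14:39:27 0 [Note] WSREP: REPORTING SST PROGRESS: "
    "'{ \"from\": 1, \"to\": 3, \"total\": 2731303065, "
    "\"done\": 392923645, \"indefinite\": -1 }'"
)
payload = parse_sst_progress(sample)
print(f"{percent_done(payload):.1f}% done")
```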
        

          Activity

            Yurchenko Alexey created issue -
            jplindst Jan Lindström (Inactive) made changes -
            Field Original Value New Value
            Assignee Jan Lindström [ jplindst ]
            jplindst Jan Lindström (Inactive) made changes -
            Component/s Galera [ 10124 ]
            Component/s wsrep [ 11500 ]
            Component/s wsrep [ 15006 ]
            Key MENT-1326 MDEV-26971
            Issue Type New Feature [ 2 ] Task [ 3 ]
            Project MariaDB Enterprise [ 11500 ] MariaDB Server [ 10000 ]
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s 10.8 [ 26121 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 124861 ] MariaDB v4 [ 131551 ]
            julien.fritsch Julien Fritsch made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            jplindst Jan Lindström (Inactive) made changes -
            Priority Critical [ 2 ] Major [ 3 ]
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s 10.9 [ 26905 ]
            Fix Version/s 10.8 [ 26121 ]
            jplindst Jan Lindström (Inactive) made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            jplindst Jan Lindström (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            jplindst Jan Lindström (Inactive) made changes -
            Status In Progress [ 3 ] In Testing [ 10301 ]
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Jan Lindström [ jplindst ] Ramesh Sivaraman [ JIRAUSER48189 ]
            ramesh Ramesh Sivaraman made changes -
            Assignee Ramesh Sivaraman [ JIRAUSER48189 ] Jan Lindström [ jplindst ]
            Status In Testing [ 10301 ] Stalled [ 10000 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.10 [ 27530 ]
            Fix Version/s 10.9 [ 26905 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.9.0 [ 27113 ]
            Fix Version/s 10.10 [ 27530 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.9 [ 26905 ]
            Fix Version/s 10.9.0 [ 27113 ]
            serg Sergei Golubchik made changes -
            ralf.gebhardt Ralf Gebhardt made changes -
            Description: removed the closing paragraph, which read: "This may be a feature that you'd want to have in the Enterprise alone. Hence we'd like to have a discussion on that and what Maria decides before going on with our own patch." The rest of the text was unchanged.
            ralf.gebhardt Ralf Gebhardt made changes -
            Summary JSON file interface to wsrep node state. JSON file interface to wsrep node state / SST progress logging
            jplindst Jan Lindström (Inactive) made changes -
            Description: added a note that this task also contains progress reporting for mariabackup SST, with three points: progress reporting requires the pv tool; progress/rate-limiting can be disabled by configuration (progress = NONE); progress is now reported in the server error log.
            jplindst Jan Lindström (Inactive) made changes -
            Description: extended the last mariabackup SST point with example "WSREP: REPORTING SST PROGRESS" lines from the server error log (the same seven lines shown in the description above).
            jplindst Jan Lindström (Inactive) made changes -
            Description Codership is planning to add a new feature to cluster nodes: reporting some wsrep status variables in a dedicated JSON file, that then can be read by an external monitoring tool. Or a human for that matter.

            Rationale: until the server is fully initialized it is inaccessible by client and the only source of information is an error log which is not machine-friendly. Since wsrep node can spend a very long time in initialization phase (state transfer), it may be a very long time that automatic tools can't easily monitor its liveness and progression.

            Rationale behind using a file as opposed to some sort of a socket: it is simpler and safer and the file stays in case of the process abort, so it is easy to get the last error that caused the abort.

            For now the file contents will look as follows:
            {code:json}
            $ cat /tmp/galera/0/mysql/var/wsrep_status.json
            {
            "date": "2021-09-04 15:35:02.000",
            "timestamp": 1630758902.00000000,
            "errors": [
            {
            "timestamp": 1630758901.00000000,
            "msg": "mysqld: Can't open shared library '/tmp/galera/0/mysql/lib64/mysql/plugin/audit_log.so' (errno: 0, cannot open shared object file: No such file or directory)"
            },
            {
            "timestamp": 1630758901.00000000,
            "msg": "Couldn't load plugins from 'audit_log.so'."
            }
            ],
            "warnings": [
            {
            "timestamp": 1630758902.00000000,
            "msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown option '--loose-skip_mysqlx'"
            },
            {
            "timestamp": 1630758902.00000000,
            "msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-log_error_verbosity=3'"
            },
            {
            "timestamp": 1630758902.00000000,
            "msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-audit_log_file=/tmp/galera/0/mysql/var/audit.log'"
            },
            {
            "timestamp": 1630758902.00000000,
            "msg": "'proxies_priv' entry '@% root@void' ignored in --skip-name-resolve mode."
            }
            ],
            "status": {
            "state": "DISCONNECTED",
            "comment": "Disconnected",
            "progress": -1.00000
            }
            }
            {code}

            So there are a few most recent errors and warnings form the error log, wsrep state and a progress indicator (in case of SST/IST).

            I have an ready patch for MariaDB 10.4. It introduces a new variable: `wsrep_status_file`. If that variable is unset, no file is created and no reporting is done. The patch does not support SST/IST progress reporting yet, only discrete state changes. We plan to add progress reporting in the followup patches.

            This task contains also progress reporting for mariabackup SST
            * Progress reporting requires tool pv
            * Progress/rate-limiting can be disabled by configuration (progress = NONE)
            * Progress is reported now in server error log for example :
            {noformat}
            2022-03-16 14:39:27 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 392923645, "indefinite": -1 }'
            2022-03-16 14:39:28 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 896353227, "indefinite": -1 }'
            2022-03-16 14:39:29 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1386740995, "indefinite": -1 }'
            2022-03-16 14:39:30 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1914292021, "indefinite": -1 }'
            2022-03-16 14:39:31 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2429366550, "indefinite": -1 }'
            2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2731243266, "indefinite": -1 }'
            2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2734803150, "done": 2734803150, "indefinite": -1 }'
            {noformat}
            https://github.com/MariaDB/server/pull/1982

            Codership is planning to add a new feature to cluster nodes: reporting some wsrep status variables in a dedicated JSON file, that then can be read by an external monitoring tool. Or a human for that matter.

            Rationale: until the server is fully initialized it is inaccessible by client and the only source of information is an error log which is not machine-friendly. Since wsrep node can spend a very long time in initialization phase (state transfer), it may be a very long time that automatic tools can't easily monitor its liveness and progression.

            Rationale behind using a file as opposed to some sort of a socket: it is simpler and safer and the file stays in case of the process abort, so it is easy to get the last error that caused the abort.

            For now the file contents will look as follows:
            {code:json}
            $ cat /tmp/galera/0/mysql/var/wsrep_status.json
            {
              "date": "2021-09-04 15:35:02.000",
              "timestamp": 1630758902.00000000,
              "errors": [
                {
                  "timestamp": 1630758901.00000000,
                  "msg": "mysqld: Can't open shared library '/tmp/galera/0/mysql/lib64/mysql/plugin/audit_log.so' (errno: 0, cannot open shared object file: No such file or directory)"
                },
                {
                  "timestamp": 1630758901.00000000,
                  "msg": "Couldn't load plugins from 'audit_log.so'."
                }
              ],
              "warnings": [
                {
                  "timestamp": 1630758902.00000000,
                  "msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown option '--loose-skip_mysqlx'"
                },
                {
                  "timestamp": 1630758902.00000000,
                  "msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-log_error_verbosity=3'"
                },
                {
                  "timestamp": 1630758902.00000000,
                  "msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-audit_log_file=/tmp/galera/0/mysql/var/audit.log'"
                },
                {
                  "timestamp": 1630758902.00000000,
                  "msg": "'proxies_priv' entry '@% root@void' ignored in --skip-name-resolve mode."
                }
              ],
              "status": {
                "state": "DISCONNECTED",
                "comment": "Disconnected",
                "progress": -1.00000
              }
            }
            {code}

            So the file contains the few most recent errors and warnings from the error log, the wsrep state, and a progress indicator (in case of SST/IST).
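
As an illustration of how an external monitoring tool might consume this file, here is a minimal Python sketch. It relies only on the field names visible in the sample above (`errors`, `warnings`, `status.state`, `status.comment`, `status.progress`). Two points are assumptions rather than documented behaviour: that a negative `progress` means no progress figure is available, and that the last entry of `errors` is the most recent one. The function name `summarize_status` is hypothetical.

```python
import json


def summarize_status(text):
    """Summarize the contents of a wsrep status file given as a JSON string."""
    doc = json.loads(text)
    status = doc.get("status", {})
    errors = doc.get("errors", [])
    progress = status.get("progress", -1.0)
    return {
        "state": status.get("state"),
        "comment": status.get("comment"),
        # Assumption: a negative value means "no progress figure available".
        "progress": progress if progress >= 0 else None,
        # Assumption: the last list entry is the most recent error.
        "last_error": errors[-1]["msg"] if errors else None,
        "warning_count": len(doc.get("warnings", [])),
    }


# Abbreviated copy of the sample file shown above.
sample = """
{
  "date": "2021-09-04 15:35:02.000",
  "timestamp": 1630758902.00000000,
  "errors": [
    {
      "timestamp": 1630758901.00000000,
      "msg": "Couldn't load plugins from 'audit_log.so'."
    }
  ],
  "warnings": [
    {
      "timestamp": 1630758902.00000000,
      "msg": "'proxies_priv' entry '@% root@void' ignored in --skip-name-resolve mode."
    }
  ],
  "status": {
    "state": "DISCONNECTED",
    "comment": "Disconnected",
    "progress": -1.00000
  }
}
"""

print(summarize_status(sample))
```

In practice the text would be read from whatever path `wsrep_status_file` is set to. Because the file persists after a process abort, the same function can be run against it after a crash to retrieve the last recorded error.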

            I have a patch ready for MariaDB 10.4. It introduces a new variable: `wsrep_status_file`. If that variable is unset, no file is created and no reporting is done. The patch does not support SST/IST progress reporting yet, only discrete state changes. We plan to add progress reporting in follow-up patches.

            This task also includes progress reporting for mariabackup SST:
            * Progress reporting requires the pv tool
            * Progress reporting/rate limiting can be disabled by configuration (progress = NONE)
            * Progress is now reported in the server error log, for example:
            {noformat}
            2022-03-16 14:39:27 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 392923645, "indefinite": -1 }'
            2022-03-16 14:39:28 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 896353227, "indefinite": -1 }'
            2022-03-16 14:39:29 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1386740995, "indefinite": -1 }'
            2022-03-16 14:39:30 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1914292021, "indefinite": -1 }'
            2022-03-16 14:39:31 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2429366550, "indefinite": -1 }'
            2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2731243266, "indefinite": -1 }'
            2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2734803150, "done": 2734803150, "indefinite": -1 }'
            {noformat}
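
Since the progress payload in these log lines is itself JSON, a tool could extract it with a small amount of parsing. The following Python sketch shows one way to do that, assuming the log line format stays exactly as in the excerpt above and that `done` and `total` are expressed in the same unit. The meaning of `from`, `to` and `indefinite` is not described here, so the sketch passes them through without interpreting them. The names `PROGRESS_RE` and `parse_progress` are hypothetical.

```python
import json
import re

# Matches the quoted JSON payload of an SST progress line in the error log.
PROGRESS_RE = re.compile(r"WSREP: REPORTING SST PROGRESS: '(\{.*\})'")


def parse_progress(line):
    """Return (payload, percent) for a progress line, or None if it is not one."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    payload = json.loads(match.group(1))
    total = payload["total"]
    done = payload["done"]
    # Avoid dividing by zero if no total is known yet.
    percent = 100.0 * done / total if total > 0 else None
    return payload, percent


line = (
    "2022-03-16 14:39:29 0 [Note] WSREP: REPORTING SST PROGRESS: "
    "'{ \"from\": 1, \"to\": 3, \"total\": 2731303065, "
    "\"done\": 1386740995, \"indefinite\": -1 }'"
)
payload, percent = parse_progress(line)
print(payload["done"], "of", payload["total"], f"({percent:.1f}%)")
```

Note that `total` changes between the last two lines of the excerpt (2731303065, then 2734803150), so a consumer should recompute the percentage from each line rather than cache the first total it sees.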
            serg Sergei Golubchik made changes -
            Fix Version/s 10.9.0 [ 27113 ]
            Fix Version/s 10.9 [ 26905 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Labels Preview_10.9
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 144345

            People

              jplindst Jan Lindström (Inactive)
              Yurchenko Alexey
              Votes: 0
              Watchers: 9
