  1. MariaDB Server
  2. MDEV-26971

JSON file interface to wsrep node state / SST progress logging

Details

    Description

      https://github.com/MariaDB/server/pull/1982

      Codership is planning to add a new feature to cluster nodes: reporting some wsrep status variables in a dedicated JSON file that can then be read by an external monitoring tool, or by a human for that matter.

      Rationale: until the server is fully initialized, it is inaccessible to clients, and the only source of information is the error log, which is not machine-friendly. Since a wsrep node can spend a very long time in the initialization phase (state transfer), automated tools may be unable to easily monitor its liveness and progress for a long time.

      Rationale for using a file rather than some sort of socket: it is simpler and safer, and the file remains if the process aborts, so it is easy to retrieve the last error that caused the abort.

      For now the file contents will look as follows:

      $ cat /tmp/galera/0/mysql/var/wsrep_status.json 
      {
      	"date": "2021-09-04 15:35:02.000",
      	"timestamp": 1630758902.00000000,
      	"errors": [
      		{
      			"timestamp": 1630758901.00000000,
      			"msg": "mysqld: Can't open shared library '/tmp/galera/0/mysql/lib64/mysql/plugin/audit_log.so' (errno: 0, cannot open shared object file: No such file or directory)"
      		},
      		{
      			"timestamp": 1630758901.00000000,
      			"msg": "Couldn't load plugins from 'audit_log.so'."
      		}
      	],
      	"warnings": [
      		{
      			"timestamp": 1630758902.00000000,
      			"msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown option '--loose-skip_mysqlx'"
      		},
      		{
      			"timestamp": 1630758902.00000000,
      			"msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-log_error_verbosity=3'"
      		},
      		{
      			"timestamp": 1630758902.00000000,
      			"msg": "/tmp/galera/0/mysql/sbin/mysqld: unknown variable 'loose-audit_log_file=/tmp/galera/0/mysql/var/audit.log'"
      		},
      		{
      			"timestamp": 1630758902.00000000,
      			"msg": "'proxies_priv' entry '@% root@void' ignored in --skip-name-resolve mode."
      		}
      	],
      	"status": {
      		"state": "DISCONNECTED",
      		"comment": "Disconnected",
      		"progress": -1.00000
      	}
      }
      

      So the file contains the few most recent errors and warnings from the error log, the wsrep state, and a progress indicator (for SST/IST).
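
A minimal sketch of how an external monitoring tool might consume such a file, assuming the layout shown in the sample above. The function names `summarize_status` and `read_status_file` are hypothetical, and two interpretations are assumptions not stated in this ticket: that a negative `progress` means there is no progress to report, and that the last entry in `errors` is the most recent one.

```python
import json
from pathlib import Path


def summarize_status(text: str) -> dict:
    """Pull the state, progress and last error out of the status JSON text."""
    doc = json.loads(text)
    status = doc.get("status", {})
    errors = doc.get("errors", [])
    progress = status.get("progress", -1.0)
    return {
        "state": status.get("state"),
        "comment": status.get("comment"),
        # Assumption: a negative value means "no progress to report".
        "progress": progress if progress >= 0 else None,
        # Assumption: entries are appended, so the last one is the newest.
        "last_error": errors[-1]["msg"] if errors else None,
        "warning_count": len(doc.get("warnings", [])),
    }


def read_status_file(path: str):
    """Return a summary, or None if the file is missing or not valid JSON."""
    try:
        return summarize_status(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # The node may not have written the file yet, or a read may race
        # with a write; a poller can simply retry on the next cycle.
        return None
```

Because the file survives a process abort, the same reader can be pointed at it after a crash to recover the last recorded error.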

      I have a ready patch for MariaDB 10.4. It introduces a new variable, `wsrep_status_file`. If that variable is unset, no file is created and no reporting is done. The patch does not yet support SST/IST progress reporting, only discrete state changes. We plan to add progress reporting in follow-up patches.

      This task also includes progress reporting for mariabackup SST:

      • Progress reporting requires the pv tool.
      • Progress reporting and rate limiting can be disabled in the configuration (progress = NONE).
      • Progress is currently reported in the server error log, for example:

        2022-03-16 14:39:27 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 392923645, "indefinite": -1 }'
        2022-03-16 14:39:28 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 896353227, "indefinite": -1 }'
        2022-03-16 14:39:29 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1386740995, "indefinite": -1 }'
        2022-03-16 14:39:30 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 1914292021, "indefinite": -1 }'
        2022-03-16 14:39:31 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2429366550, "indefinite": -1 }'
        2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2731303065, "done": 2731243266, "indefinite": -1 }'
        2022-03-16 14:39:32 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 2734803150, "done": 2734803150, "indefinite": -1 }'
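
The log lines above carry a small JSON payload, so a completion percentage can be derived from `done` and `total`. The following is a rough sketch under that reading; the helper name `parse_progress` and the regular expression are illustrative, and the `from`, `to` and `indefinite` fields are left uninterpreted because this ticket does not define them.

```python
import json
import re

# Matches the quoted JSON payload in the log line format shown above.
PROGRESS_RE = re.compile(r"WSREP: REPORTING SST PROGRESS: '(\{.*\})'")


def parse_progress(line: str):
    """Return (done, total, percent) for a progress line, else None."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    payload = json.loads(match.group(1))
    total, done = payload["total"], payload["done"]
    # Guard against a zero or negative total before dividing.
    percent = 100.0 * done / total if total > 0 else None
    return done, total, percent
```

Note that `total` changes between the last two sample lines, so a consumer should recompute the percentage from each line rather than cache the first `total` it sees.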
        

            People

              jplindst Jan Lindström (Inactive)
              Yurchenko Alexey