Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
None
-
None
Description
To determine a Galera Cluster node's position, users can execute the following:
mysqld --wsrep_recover
|
Or if the node is using systemd, then users can execute the following wrapper script:
galera_recovery
|
See here:
MariaDB's systemd unit file also automatically calls this wrapper script to recover the Galera position before starting the server:
https://github.com/MariaDB/server/blob/mariadb-10.4.8/support-files/mariadb.service.in#L70
When mysqld is started in this recovery mode, it returns a successful exit code, even if the datadir isn't initialized. For example:
[ec2-user@ip-172-30-0-96 ~]$ sudo rm -fr /var/lib/mysql/*
|
[ec2-user@ip-172-30-0-96 ~]$ sudo galera_recovery
|
WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
|
--wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
|
[ec2-user@ip-172-30-0-96 ~]$ echo $?
|
0
|
This process needs to work with a non-initialized datadir. This is because with Galera Cluster, deleting the contents of the datadir is one of the most common ways to trigger SSTs.
However, two things can currently cause this to fail:
- Plugin system variables in MariaDB configuration files.
- Plugin activation options in MariaDB configuration files.
This is a very big problem on systems that use systemd, since this recovery process happens automatically.
More details below.
Plugin System Variables
If the MariaDB configuration file references plugin system variables without the loose prefix, then that will lead to failures.
For example, let's say that we have the following:
[ec2-user@ip-172-30-0-96 ~]$ cat /etc/my.cnf.d/audit.cnf
|
[mariadb]
|
server_audit_logging = 1
|
server_audit_events = 'CONNECT,QUERY,TABLE'
|
This will cause the recovery process to fail when the datadir is empty. For example:
[ec2-user@ip-172-30-0-96 ~]$ sudo rm -fr /var/lib/mysql/*
|
[ec2-user@ip-172-30-0-96 ~]$ sudo galera_recovery
|
WSREP: Failed to start mysqld for wsrep recovery: '2019-10-01 21:13:48 0 [Note] /usr/sbin/mysqld (mysqld 10.3.17-MariaDB-log) starting as process 17991 ...
|
2019-10-01 21:13:48 0 [ERROR] Plugin 'aws_key_management' registration as a ENCRYPTION failed.
|
2019-10-01 21:13:48 0 [Warning] The parameter innodb_file_format is deprecated and has no effect. It may be removed in future releases. See https://mariadb.com/kb/en/library/xtradbinnodb-file-format/
|
2019-10-01 21:13:48 0 [Note] InnoDB: Using Linux native AIO
|
2019-10-01 21:13:48 0 [Note] InnoDB: The first innodb_system data file 'ibdata1' did not exist. A new tablespace will be created!
|
2019-10-01 21:13:48 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
|
2019-10-01 21:13:48 0 [Note] InnoDB: Uses event mutexes
|
2019-10-01 21:13:48 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
|
2019-10-01 21:13:48 0 [Note] InnoDB: Number of pools: 1
|
2019-10-01 21:13:48 0 [Note] InnoDB: Using SSE2 crc32 instructions
|
2019-10-01 21:13:48 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
|
2019-10-01 21:13:48 0 [Note] InnoDB: Completed initialization of buffer pool
|
2019-10-01 21:13:48 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
|
2019-10-01 21:13:48 0 [Note] InnoDB: Setting file './ibdata1' size to 12 MB. Physically writing the file full; Please wait ...
|
2019-10-01 21:13:48 0 [Note] InnoDB: File './ibdata1' size is now 12 MB.
|
2019-10-01 21:13:48 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes
|
2019-10-01 21:13:48 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 50331648 bytes
|
2019-10-01 21:13:48 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
|
2019-10-01 21:13:48 0 [Note] InnoDB: New log files created, LSN=45786
|
2019-10-01 21:13:48 0 [Note] InnoDB: Doublewrite buffer not found: creating new
|
2019-10-01 21:13:48 0 [Note] InnoDB: Doublewrite buffer created
|
2019-10-01 21:13:48 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
|
2019-10-01 21:13:48 0 [Note] InnoDB: Creating foreign key constraint system tables.
|
2019-10-01 21:13:48 0 [Note] InnoDB: Creating tablespace and datafile system tables.
|
2019-10-01 21:13:48 0 [Note] InnoDB: Creating sys_virtual system tables.
|
2019-10-01 21:13:48 0 [Note] InnoDB: Creating shared tablespace for temporary tables
|
2019-10-01 21:13:48 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
|
2019-10-01 21:13:48 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
|
2019-10-01 21:13:48 0 [Note] InnoDB: Waiting for purge to start
|
2019-10-01 21:13:48 0 [Note] InnoDB: 10.3.17 started; log sequence number 0; transaction id 7
|
2019-10-01 21:13:48 0 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
|
2019-10-01 21:13:48 0 [Note] InnoDB: Creating #1 encryption thread id 140312278198016 total threads 2.
|
2019-10-01 21:13:48 0 [Note] InnoDB: Creating #2 encryption thread id 140312269805312 total threads 2.
|
2019-10-01 21:13:48 0 [Note] Plugin 'FEEDBACK' is disabled.
|
2019-10-01 21:13:48 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded
|
2019-10-01 21:13:48 0 [ERROR] /usr/sbin/mysqld: unknown variable 'server_audit_logging=1'
|
2019-10-01 21:13:48 0 [ERROR] Aborting'
|
[ec2-user@ip-172-30-0-96 ~]$ echo $?
|
1
|
The workaround for this is to add the loose prefix.
https://mariadb.com/kb/en/library/mysqld-options/#-loose-
For example, let's say that we have the following:
[ec2-user@ip-172-30-0-96 ~]$ sudo cat /etc/my.cnf.d/audit.cnf
|
[mariadb]
|
loose_server_audit_logging = 1
|
loose_server_audit_events = 'CONNECT,QUERY,TABLE'
|
This will allow the recovery process to succeed when the datadir is empty. For example:
[ec2-user@ip-172-30-0-96 ~]$ sudo rm -fr /var/lib/mysql/*
|
[ec2-user@ip-172-30-0-96 ~]$ sudo galera_recovery
|
WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
|
--wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
|
[ec2-user@ip-172-30-0-96 ~]$ echo $?
|
0
|
Plugin Activation Options
If the MariaDB configuration file sets plugin activation options to FORCE or FORCE_PLUS_PERMANENT without an associated plugin_load_add option, then that will lead to failures.
https://mariadb.com/kb/en/library/plugin-overview/#configuring-plugin-activation-at-server-startup
For example, let's say that we have the following:
[ec2-user@ip-172-30-0-96 ~]$ sudo cat /etc/my.cnf.d/audit.cnf
|
[mariadb]
|
server_audit = FORCE_PLUS_PERMANENT
|
loose_server_audit_logging = 1
|
loose_server_audit_events = 'CONNECT,QUERY,TABLE'
|
This will cause the recovery process to fail when the datadir is empty. For example:
[ec2-user@ip-172-30-0-96 ~]$ sudo rm -fr /var/lib/mysql/*
|
[ec2-user@ip-172-30-0-96 ~]$ sudo galera_recovery
|
WSREP: Failed to start mysqld for wsrep recovery: '2019-10-01 21:19:53 0 [Note] /usr/sbin/mysqld (mysqld 10.3.17-MariaDB-log) starting as process 18334 ...
|
2019-10-01 21:19:53 0 [ERROR] Plugin 'aws_key_management' registration as a ENCRYPTION failed.
|
2019-10-01 21:19:53 0 [Warning] The parameter innodb_file_format is deprecated and has no effect. It may be removed in future releases. See https://mariadb.com/kb/en/library/xtradbinnodb-file-format/
|
2019-10-01 21:19:53 0 [Note] InnoDB: Using Linux native AIO
|
2019-10-01 21:19:53 0 [Note] InnoDB: The first innodb_system data file 'ibdata1' did not exist. A new tablespace will be created!
|
2019-10-01 21:19:53 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
|
2019-10-01 21:19:53 0 [Note] InnoDB: Uses event mutexes
|
2019-10-01 21:19:53 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
|
2019-10-01 21:19:53 0 [Note] InnoDB: Number of pools: 1
|
2019-10-01 21:19:53 0 [Note] InnoDB: Using SSE2 crc32 instructions
|
2019-10-01 21:19:53 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
|
2019-10-01 21:19:53 0 [Note] InnoDB: Completed initialization of buffer pool
|
2019-10-01 21:19:53 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
|
2019-10-01 21:19:53 0 [Note] InnoDB: Setting file './ibdata1' size to 12 MB. Physically writing the file full; Please wait ...
|
2019-10-01 21:19:53 0 [Note] InnoDB: File './ibdata1' size is now 12 MB.
|
2019-10-01 21:19:53 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes
|
2019-10-01 21:19:53 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 50331648 bytes
|
2019-10-01 21:19:53 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
|
2019-10-01 21:19:53 0 [Note] InnoDB: New log files created, LSN=45786
|
2019-10-01 21:19:53 0 [Note] InnoDB: Doublewrite buffer not found: creating new
|
2019-10-01 21:19:53 0 [Note] InnoDB: Doublewrite buffer created
|
2019-10-01 21:19:53 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
|
2019-10-01 21:19:53 0 [Note] InnoDB: Creating foreign key constraint system tables.
|
2019-10-01 21:19:53 0 [Note] InnoDB: Creating tablespace and datafile system tables.
|
2019-10-01 21:19:53 0 [Note] InnoDB: Creating sys_virtual system tables.
|
2019-10-01 21:19:53 0 [Note] InnoDB: Creating shared tablespace for temporary tables
|
2019-10-01 21:19:53 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
|
2019-10-01 21:19:53 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
|
2019-10-01 21:19:53 0 [Note] InnoDB: Waiting for purge to start
|
2019-10-01 21:19:53 0 [Note] InnoDB: 10.3.17 started; log sequence number 0; transaction id 7
|
2019-10-01 21:19:53 0 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
|
2019-10-01 21:19:53 0 [Note] InnoDB: Creating #1 encryption thread id 140458902673152 total threads 2.
|
2019-10-01 21:19:53 0 [Note] InnoDB: Creating #2 encryption thread id 140458894280448 total threads 2.
|
2019-10-01 21:19:53 0 [Note] Plugin 'FEEDBACK' is disabled.
|
2019-10-01 21:19:53 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded
|
2019-10-01 21:19:53 0 [ERROR] /usr/sbin/mysqld: unknown variable 'server_audit=FORCE_PLUS_PERMANENT'
|
2019-10-01 21:19:53 0 [ERROR] Aborting'
|
[ec2-user@ip-172-30-0-96 ~]$ echo $?
|
1
|
This situation is most likely to occur when the user has previously installed the plugin dynamically by executing INSTALL SONAME.
https://mariadb.com/kb/en/library/plugin-overview/#installing-a-plugin-dynamically
The workaround for this is to add the plugin_load_add to the MariaDB configuration file.
https://mariadb.com/kb/en/library/plugin-overview/#installing-a-plugin-with-plugin-load-options
https://mariadb.com/kb/en/library/mysqld-options/#-plugin-load-add
For example, let's say that we have the following:
[ec2-user@ip-172-30-0-96 ~]$ cat /etc/my.cnf.d/audit.cnf
|
[mariadb]
|
plugin_load_add = server_audit
|
server_audit = FORCE_PLUS_PERMANENT
|
loose_server_audit_logging = 1
|
loose_server_audit_events = 'CONNECT,QUERY,TABLE'
|
This will allow the recovery process to succeed when the datadir is empty. For example:
[ec2-user@ip-172-30-0-96 ~]$ sudo rm -fr /var/lib/mysql/*
|
[ec2-user@ip-172-30-0-96 ~]$ sudo galera_recovery
|
WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
|
--wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
|
[ec2-user@ip-172-30-0-96 ~]$ echo $?
|
0
|
Suggested Fix
I think the suggested fix should probably be something like this:
- If mysqld is started in --wsrep_recover mode, and if any plugin system variables are specified, then they should be treated as loose automatically.
- If mysqld is started in --wsrep_recover mode, and if any plugin activation options are set to FORCE or FORCE_PLUS_PERMANENT, then they should be treated the same as ON, so that any errors are non-fatal. Should there be an exception for encryption plugins, since an encryption plugin may be needed to recover the position from an encrypted database server?