Details
Type: Bug
Status: Stalled
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.3.31, 10.4.21, 10.5.12, 10.6.4
None
Description
After a MariaDB/Galera node starts, it performs IST/SST, but if an invalid variable is found later, the server crashes, as can be seen from the log:
2021-10-27 18:56:33 0 [Note] WSREP: Loading provider /usr/lib64/galera-4/libgalera_smm.so initial position: 08260eb4-3755-11ec-9a25-86269a5f3dc7:151
2021-10-27 18:56:33 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera-4/libgalera_smm.so'
2021-10-27 18:56:33 0 [Note] WSREP: wsrep_load(): Galera 26.4.9(r819f29c) by Codership Oy <info@codership.com> loaded successfully.
...
2021-10-27 18:56:34 0 [Note] WSREP: Joiner monitor thread started to monitor
2021-10-27 18:56:34 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '10.202.236.218' --datadir '/var/lib/mysql/' --parent '23191' --mysqld-args --wsrep_start_position=08260eb4-3755-11ec-9a25-86269a5f3dc7:151'
2021-10-27 18:56:34 1 [Note] WSREP: ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 153, STRv: 3
2021-10-27 18:56:34 1 [Note] WSREP: IST receiver addr using tcp://10.202.236.218:4568
2021-10-27 18:56:34 1 [Note] WSREP: Prepared IST receiver for 0-153, listening at: tcp://10.202.236.218:4568
2021-10-27 18:56:34 0 [Note] WSREP: Member 0.0 (node1) requested state transfer from '*any*'. Selected 1.0 (node2)(SYNCED) as donor.
...
2021-10-27 18:56:37 0 [ERROR] /usr/sbin/mariadbd: unknown option '--enforce_gtid_consistency'
2021-10-27 18:56:37 0 [ERROR] Aborting
terminate called after throwing an instance of 'wsrep::runtime_error'
what(): State wait was interrupted
211027 18:56:37 [ERROR] mysqld got signal 6 ;
The log shows that an invalid variable was found in the my.cnf file, but the server aborted only AFTER completing the entire SST process, which can take a long time.
The abort can easily go unnoticed if the SST is left running overnight. In addition, once the invalid variable is removed and MariaDB is restarted, SST might need to run again, which again takes a long time.
To reproduce, create a 2- or 3-node Galera cluster and put an invalid variable name in the config file under [mysqld].
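For illustration, a joiner config along the following lines should trigger the problem. This is only a sketch: the provider path, the rsync SST method, and the offending option name are taken from the log above, while the cluster address is a placeholder that has to be adapted.

```ini
# my.cnf on the joining node (sketch, not a complete Galera config)
[mysqld]
wsrep_on=ON
# provider path as shown in the log above
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
# placeholder addresses (assumption), replace with the real cluster nodes
wsrep_cluster_address=gcomm://192.0.2.1,192.0.2.2
# the log shows wsrep_sst_rsync being used
wsrep_sst_method=rsync
# not a valid MariaDB option; this is the one rejected in the log
enforce_gtid_consistency=ON
```

With this in place, the "unknown option" error is expected to show up only after the state transfer has finished.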
Generally, it's not possible to detect invalid parameters before SST. The thing is, invalid parameters can only be detected after all plugins (storage engines are plugins too) are loaded, that is, after the server knows which parameters are valid plugin parameters. And SST can copy InnoDB files, so it needs to be done before InnoDB is loaded.
If you use the "mysqldump" SST method, which doesn't copy files, you'll likely see the "invalid option" error first (but I didn't try it).
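For anyone who wants to check the mysqldump suggestion, the change would look roughly like the fragment below. This is an untested sketch: the user name and password are placeholders, and whether the error really appears before the transfer is exactly what remains to be verified.

```ini
# my.cnf fragment (sketch) switching SST from rsync to mysqldump
[mysqld]
wsrep_sst_method=mysqldump
# mysqldump SST needs credentials; placeholder values (assumption)
wsrep_sst_auth=sst_user:sst_password
```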