[MDEV-26336] wsrep_sst_rsync crashes when log-bin is set but log-bin-index is not. Created: 2021-08-10  Updated: 2023-04-27

Status: Open
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.3.30
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug Priority: Major
Reporter: Rolf Fokkens Assignee: Julius Goryavsky
Resolution: Unresolved Votes: 0
Labels: None


 Description   

After upgrading several systems from MariaDB 10.2.x to MariaDB 10.3.30, we found that a full SST fails. The donor logs show:

2021-08-10 10:23:53 0 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 5740707)
2021-08-10 10:23:53 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'donor' --address '10.220.200.61:4444/rsync_sst' --local-port '3306' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --gtid 'fcd85f39-ec71-11eb-8b1b-3fa7ede6267b:5740707' --gtid-domain-id '0' --binlog 'galera' --mysqld-args --wsrep_start_position=fcd85f39-ec71-11eb-8b1b-3fa7ede6267b:5740707'
2021-08-10 10:23:53 2 [Note] WSREP: sst_donor_thread signaled with 0
2021-08-10 10:23:54 0 [Note] WSREP: Flushing tables for SST...
2021-08-10 10:23:54 0 [Note] WSREP: Provider paused at fcd85f39-ec71-11eb-8b1b-3fa7ede6267b:5740707 (83)
2021-08-10 10:23:54 0 [Note] WSREP: Tables flushed.
tail: cannot open 'mysql-bin.index' for reading: No such file or directory
2021-08-10 10:23:54 0 [ERROR] WSREP: Failed to read from: wsrep_sst_rsync --role 'donor' --address '10.220.200.61:4444/rsync_sst' --local-port '3306' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --gtid 'fcd85f39-ec71-11eb-8b1b-3fa7ede6267b:5740707' --gtid-domain-id '0' --binlog 'galera' --mysqld-args --wsrep_start_position=fcd85f39-ec71-11eb-8b1b-3fa7ede6267b:5740707
2021-08-10 10:23:54 0 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'donor' --address '10.220.200.61:4444/rsync_sst' --local-port '3306' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --gtid 'fcd85f39-ec71-11eb-8b1b-3fa7ede6267b:5740707' --gtid-domain-id '0' --binlog 'galera' --mysqld-args --wsrep_start_position=fcd85f39-ec71-11eb-8b1b-3fa7ede6267b:5740707: 1 (Operation not permitted)
2021-08-10 10:23:54 0 [Note] WSREP: resuming provider at 83
2021-08-10 10:23:54 0 [Note] WSREP: Provider resumed.
2021-08-10 10:23:54 0 [ERROR] WSREP: Command did not run: wsrep_sst_rsync --role 'donor' --address '10.220.200.61:4444/rsync_sst' --local-port '3306' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --gtid 'fcd85f39-ec71-11eb-8b1b-3fa7ede6267b:5740707' --gtid-domain-id '0' --binlog 'galera' --mysqld-args --wsrep_start_position=fcd85f39-ec71-11eb-8b1b-3fa7ede6267b:5740707
2021-08-10 10:23:54 0 [Warning] WSREP: 0.0 (tb-test-zabbix-db-el): State transfer to 1.0 (tb-test-zabbix-db-wp) failed: -1 (Operation not permitted)



 Comments   
Comment by Rolf Fokkens [ 2021-08-10 ]

The issue seems to be caused by the fact that MariaDB implicitly derives log-bin-index from log-bin when only log-bin is set, whereas the wsrep_sst_rsync script falls back to a different default (mysql-bin.index) that does not match the MariaDB default.

The workaround is simple: explicitly set log-bin-index in the galera config. This is what we did.
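
For illustration, the workaround might look like the fragment below. This is a sketch, not our actual configuration: the value "galera" is taken from the --binlog argument in the donor log above, while the file location and the index name "galera.index" are assumptions (the latter assumes the server default is the log-bin base name plus ".index").

```ini
# Assumed location: a file under /etc/my.cnf.d/ that the server reads
[mysqld]
log-bin       = galera
# Set explicitly so that the server and wsrep_sst_rsync agree on the name
log-bin-index = galera.index
```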

The solution might be to have the wsrep_sst_rsync script adopt a default that is consistent with MariaDB's default.
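
A minimal sketch of what such a consistent default could look like in shell. The function name binlog_index_name and its arguments are hypothetical and not taken from wsrep_sst_rsync; the sketch also assumes the server default is the log-bin base name plus ".index" and that the base name carries no extension.

```shell
#!/bin/sh
# Hypothetical helper, not part of wsrep_sst_rsync.
# $1: value of log-bin (for example "galera")
# $2: value of log-bin-index, empty when it is not configured
binlog_index_name() {
    binlog_base=$1
    explicit_index=$2
    if [ -n "$explicit_index" ]; then
        # An explicitly configured index name always wins.
        printf '%s\n' "$explicit_index"
    else
        # Assumed server default: "<log-bin base name>.index",
        # instead of a fixed "mysql-bin.index".
        printf '%s\n' "${binlog_base}.index"
    fi
}
```

With the values from the donor log, binlog_index_name galera "" would print galera.index rather than mysql-bin.index.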

Comment by Rolf Fokkens [ 2021-08-10 ]

Not sure why we suddenly ran into this. Two changes may have caused it:

  • Moved from CentOS7 to CentOS8
  • Moved from MariaDB 10.2.x to MariaDB 10.3.30