[MDEV-10754] wsrep_sst_rsync does not support innodb_data_home_dir Created: 2016-09-06  Updated: 2020-08-25  Resolved: 2018-08-21

Status: Closed
Project: MariaDB Server
Component/s: Galera, Galera SST
Affects Version/s: 10.1.17
Fix Version/s: 10.1.36, 10.2.18, 10.3.11

Type: Bug Priority: Major
Reporter: Geoff Montee (Inactive) Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 1
Labels: galera, sst

Issue Links:
Problem/Incident
causes MDEV-21397 rsync SST fails if donor is running >... Closed
Relates
relates to MDEV-15436 If log_bin and log_bin_index is diffe... Closed

 Description   

If you have an entry in your configuration file like this:

innodb_data_home_dir=/innodbdata

Then rsync SSTs will fail.

The first time a joiner attempts to SST, the node will see log entries like this, showing that a new InnoDB data file is being created after the SST:

2016-09-06 15:47:41 140177158698752 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '172.31.22.174' --datadir '/var/lib/mysql/'   --parent '1957' --binlog '/var/mysql/binlogs/mariadb-bin' '
2016-09-06 15:47:41 140177453222656 [Note] WSREP: Prepared SST request: rsync|172.31.22.174:4444/rsync_sst
2016-09-06 15:47:41 140177453222656 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-09-06 15:47:41 140177453222656 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-09-06 15:47:41 140177231623936 [Note] WSREP: Service thread queue flushed.
2016-09-06 15:47:41 140177453222656 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
2016-09-06 15:47:41 140177231623936 [Note] WSREP: Service thread queue flushed.
2016-09-06 15:47:41 140177453222656 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (71de705d-6faa-11e6-9a78-8295746971ef) does not match group state UUID (bae8d70c-746a-11e6-869f-632ad20bfb5c): 1 (Operation not permitted)
         at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
2016-09-06 15:47:41 140177175471872 [Note] WSREP: Member 1.0 () requested state transfer from '*any*'. Selected 0.0 (ip-172-31-19-192.us-west-2.compute.internal)(SYNCED) as donor.
2016-09-06 15:47:41 140177175471872 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
2016-09-06 15:47:41 140177453222656 [Note] WSREP: Requesting state transfer: success, donor: 0
2016-09-06 15:47:43 140177183864576 [Note] WSREP: (c52d1933, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-09-06 15:47:44 140177175471872 [Note] WSREP: 0.0 (ip-172-31-19-192.us-west-2.compute.internal): State transfer to 1.0 () complete.
2016-09-06 15:47:44 140177175471872 [Note] WSREP: Member 0.0 (ip-172-31-19-192.us-west-2.compute.internal) synced with group.
WSREP_SST: [INFO] Extracting binlog files: (20160906 15:47:44.804)
mariadb-bin.000006
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 1997 (20160906 15:47:44.815)
WSREP_SST: [INFO] Joiner cleanup done. (20160906 15:47:45.320)
2016-09-06 15:47:45 140177453537408 [Note] WSREP: SST complete, seqno: 0
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: The InnoDB memory heap is disabled
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Memory barrier is not used
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Compressed tables use zlib 1.2.7
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Using Linux native AIO
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Using SSE crc32 instructions
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Completed initialization of buffer pool
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: The first specified data file /innodbdata/ibdata1 did not exist: a new database to be created!
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Setting file /innodbdata/ibdata1 size to 12 MB
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Database physically writes the file full: wait...
2016-09-06 15:47:45 140177453537408 [Note] InnoDB: Setting log file ./ib_logfile101 size to 48 MB
2016-09-06 15:47:46 140177453537408 [ERROR] InnoDB: Cannot create ./ib_logfile1
2016-09-06 15:47:46 140177453537408 [ERROR] Plugin 'InnoDB' init function returned error.
2016-09-06 15:47:46 140177453537408 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2016-09-06 15:47:46 140177453537408 [Note] Plugin 'FEEDBACK' is disabled.
2016-09-06 15:47:46 140177453537408 [ERROR] Unknown/unsupported storage engine: InnoDB
2016-09-06 15:47:46 140177453537408 [ERROR] Aborting

And then every time the joiner performs an SST after the first, the error log will show issues with the InnoDB tablespace:

2016-09-06 15:51:15 139831426414336 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '172.31.22.174' --datadir '/var/lib/mysql/'   --parent '2092' --binlog '/var/mysql/binlogs/mariadb-bin' '
2016-09-06 15:51:15 139831721270016 [Note] WSREP: Prepared SST request: rsync|172.31.22.174:4444/rsync_sst
2016-09-06 15:51:15 139831721270016 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-09-06 15:51:15 139831721270016 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-09-06 15:51:15 139831499339520 [Note] WSREP: Service thread queue flushed.
2016-09-06 15:51:15 139831721270016 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
2016-09-06 15:51:15 139831499339520 [Note] WSREP: Service thread queue flushed.
2016-09-06 15:51:15 139831721270016 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (bae8d70c-746a-11e6-869f-632ad20bfb5c): 1 (Operation not permitted)
         at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
2016-09-06 15:51:15 139831443187456 [Note] WSREP: Member 0.0 () requested state transfer from '*any*'. Selected 1.0 (ip-172-31-19-192.us-west-2.compute.internal)(SYNCED) as donor.
2016-09-06 15:51:15 139831443187456 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
2016-09-06 15:51:15 139831721270016 [Note] WSREP: Requesting state transfer: success, donor: 1
2016-09-06 15:51:17 139831443187456 [Note] WSREP: 1.0 (ip-172-31-19-192.us-west-2.compute.internal): State transfer to 0.0 () complete.
2016-09-06 15:51:17 139831443187456 [Note] WSREP: Member 1.0 (ip-172-31-19-192.us-west-2.compute.internal) synced with group.
2016-09-06 15:51:17 139831451580160 [Note] WSREP: (44e8acf0, 'tcp://0.0.0.0:4567') turning message relay requesting off
WSREP_SST: [INFO] Extracting binlog files: (20160906 15:51:18.363)
mariadb-bin.000007
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 2132 (20160906 15:51:18.369)
WSREP_SST: [INFO] Joiner cleanup done. (20160906 15:51:18.874)
2016-09-06 15:51:18 139831721584768 [Note] WSREP: SST complete, seqno: 0
2016-09-06 15:51:18 139831721584768 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2016-09-06 15:51:18 139831721584768 [Note] InnoDB: The InnoDB memory heap is disabled
2016-09-06 15:51:18 139831721584768 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-09-06 15:51:18 139831721584768 [Note] InnoDB: Memory barrier is not used
2016-09-06 15:51:18 139831721584768 [Note] InnoDB: Compressed tables use zlib 1.2.7
2016-09-06 15:51:18 139831721584768 [Note] InnoDB: Using Linux native AIO
2016-09-06 15:51:18 139831721584768 [Note] InnoDB: Using SSE crc32 instructions
2016-09-06 15:51:18 139831721584768 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2016-09-06 15:51:19 139831721584768 [Note] InnoDB: Completed initialization of buffer pool
2016-09-06 15:51:19 139831721584768 [Note] InnoDB: Restoring page 0 of tablespace 0
2016-09-06 15:51:19 139831721584768 [Warning] InnoDB: Doublewrite does not have page_no=0 of space: 0
2016-09-06 15:51:19 139831721584768 [ERROR] InnoDB: space header page consists of zero bytes in data file /innodbdata/ibdata1
2016-09-06 15:51:19 139831721584768 [ERROR] InnoDB: Could not open or create the system tablespace. If you tried to add new data files to the system tablespace, and it failed here, you should now edit innodb_data_file_path in my.cnf back to what it was, and remove the new ibdata files InnoDB created in this failed attempt. InnoDB only wrote those files full of zeros, but did not yet use them in any way. But be careful: do not remove old data files which contain your precious data!
2016-09-06 15:51:19 139831721584768 [ERROR] Plugin 'InnoDB' init function returned error.



 Comments   
Comment by Julius Goryavsky [ 2018-07-24 ]

https://github.com/MariaDB/server/pull/824

If innodb_data_home_dir path specified in the configuration file
then rsync IST/SST will fail because the wsrep_sst_rsync.sh script
does not read this parameter from the configuration file and then
tries to find the data files in the default directory.

To fix this error, we need to add reading of the innodb_data_home_dir
configuration parameter to the rsync-related SST script.

Comment by Julius Goryavsky [ 2018-07-27 ]

Patch for 10.1 branch: https://github.com/MariaDB/server/pull/828

Comment by Jan Lindström (Inactive) [ 2018-07-30 ]

Patch needs adjustment and please run regression testing! See current regressions on git.

Comment by Julius Goryavsky [ 2018-08-09 ]

Added new fixes after review and correction of the problems caused by the fact that during the SST innodb plugin is not yet initialized, as well as problems with running tests from the root user (not directly related to the MDEV-10754).

Generated at Thu Feb 08 07:44:39 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.