[MDEV-27444] Perform backup prepare using mariabackup 10.4 version when performing rolling upgrade on joiner node with 10.5 Created: 2022-01-07  Updated: 2023-06-16

Status: Stalled
Project: MariaDB Server
Component/s: Galera SST
Affects Version/s: 10.5
Fix Version/s: 10.5

Type: Bug Priority: Critical
Reporter: Ramesh Sivaraman Assignee: Julius Goryavsky
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Problem/Incident
is caused by MDEV-12353 Efficient InnoDB redo log record format Closed

 Description   

SST will not work when upgrading a server with an active workload on the donor node. This problem exists in rsync and mariabackup SST.

mariabackup SST fails in the prepare stage:

ramesh@galapq:~/qa$ cat /home/ramesh/qa/node3/mariabackup.prepare.log
/home/ramesh/qa/GAL_MD070122-mariadb-10.5.14-linux-x86_64-opt//bin/mariabackup based on MariaDB server 10.5.14-MariaDB Linux (x86_64)
[00] 2022-01-07 13:27:08 cd to /home/ramesh/qa/node3/.sst/
[00] 2022-01-07 13:27:08 open files limit requested 0, set to 1048576
[00] 2022-01-07 13:27:08 This target seems to be not prepared yet.
[00] 2022-01-07 13:27:08 mariabackup: using the following InnoDB configuration for recovery:
[00] 2022-01-07 13:27:08 innodb_data_home_dir = .
[00] 2022-01-07 13:27:08 innodb_data_file_path = ibdata1:12M:autoextend
[00] 2022-01-07 13:27:08 innodb_log_group_home_dir = .
[00] 2022-01-07 13:27:08 InnoDB: Using Linux native AIO
[00] 2022-01-07 13:27:08 Starting InnoDB instance for recovery.
[00] 2022-01-07 13:27:08 mariabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
2022-01-07 13:27:08 0 [Note] InnoDB: Uses event mutexes
2022-01-07 13:27:08 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2022-01-07 13:27:08 0 [Note] InnoDB: Number of pools: 1
2022-01-07 13:27:08 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2022-01-07 13:27:08 0 [Note] InnoDB: Using Linux native AIO
2022-01-07 13:27:08 0 [Note] InnoDB: Initializing buffer pool, total size = 104857600, chunk size = 104857600
2022-01-07 13:27:08 0 [Note] InnoDB: Completed initialization of buffer pool
2022-01-07 13:27:08 0 [ERROR] InnoDB: Upgrade after a crash is not supported. The redo log was created with Backup 10.4.23-MariaDB.
2022-01-07 13:27:08 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
[00] FATAL ERROR: 2022-01-07 13:27:08 mariabackup: innodb_init() returned 11 (Generic error).
ramesh@galapq:~/qa$ 

The issue also exists in regular (non-SST) mariabackup: restoring a backup fails when the backup is prepared with a higher mariabackup version than the one that took it.

rsync SST fails while initializing the InnoDB storage engine:

2022-01-07 15:53:05 2 [Note] WSREP: Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 10), state transfer needed: yes
2022-01-07 15:53:05 0 [Note] WSREP: Service thread queue flushed.
2022-01-07 15:53:05 2 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1
2022-01-07 15:53:05 2 [Note] WSREP: State transfer required: 
	Group state: 0115e87b-6fc1-11ec-8678-ae594c5a826c:23678
	Local state: 0115e87b-6fc1-11ec-8678-ae594c5a826c:10678
2022-01-07 15:53:05 2 [Note] WSREP: Server status change connected -> joiner
2022-01-07 15:53:05 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2022-01-07 15:53:05 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '127.0.0.1' --datadir '/home/ramesh/qa/node3/' --defaults-file '/home/ramesh/qa/conf/node3.cnf' --parent '3426500' --mysqld-args --defaults-file=/home/ramesh/qa/conf/node3.cnf --wsrep-provider=/home/ramesh/qa/GAL_MD070122-mariadb-10.5.14-linux-x86_64-opt/lib/libgalera_smm.so --datadir=/home/ramesh/qa/node3 --basedir=/home/ramesh/qa/GAL_MD070122-mariadb-10.5.14-linux-x86_64-opt'
2022-01-07 15:53:05 0 [Note] WSREP: Joiner monitor thread started to monitor
WSREP_SST: [INFO] new ssl configuration options (ssl-ca[path], ssl-cert and ssl-key) are ignored by SST due to presence of the tca[path], tcert and/or tkey in the [sst] section (20220107 15:53:05.322)
WSREP_SST: [INFO] Using stunnel for SSL encryption: CA: '', CAPATH='', ssl-mode='REQUIRED' (20220107 15:53:05.346)
2022-01-07 15:53:05 2 [Note] WSREP: ####### IST uuid:0115e87b-6fc1-11ec-8678-ae594c5a826c f: 10679, l: 23678, STRv: 3
2022-01-07 15:53:05 2 [Note] WSREP: IST receiver addr using ssl://127.0.0.1:5109
2022-01-07 15:53:05 2 [Note] WSREP: IST receiver using ssl
2022-01-07 15:53:05 2 [Note] WSREP: Prepared IST receiver for 10679-23678, listening at: ssl://127.0.0.1:5109
2022-01-07 15:53:05 0 [Note] WSREP: Member 2.0 (galapq) requested state transfer from '*any*'. Selected 0.0 (galapq)(SYNCED) as donor.
2022-01-07 15:53:05 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 25246)
2022-01-07 15:53:05 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2022-01-07 15:53:07 0 [Note] WSREP: (2256b564-b519, 'ssl://127.0.0.1:5108') turning message relay requesting off
2022-01-07 15:53:09 0 [Note] WSREP: 0.0 (galapq): State transfer to 2.0 (galapq) complete.
WSREP_SST: [INFO] Joiner cleanup: rsync PID=0, stunnel PID=3426629 (20220107 15:53:10.116)
WSREP_SST: [INFO] Joiner cleanup done. (20220107 15:53:10.658)
2022-01-07 15:53:10 3 [Note] WSREP: SST received
2022-01-07 15:53:10 3 [Note] WSREP: Server status change joiner -> initializing
2022-01-07 15:53:10 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2022-01-07 15:53:10 0 [Note] WSREP: Member 0.0 (galapq) synced with group.
2022-01-07 15:53:10 0 [Note] mysqld: Aria engine: starting recovery
recovered pages: 0% 10% 28% 57% 71% 86% 100% (0.0 seconds); tables to flush: 1 0
 (0.0 seconds); 
2022-01-07 15:53:10 0 [Note] mysqld: Aria engine: recovery done
2022-01-07 15:53:10 0 [Note] InnoDB: Uses event mutexes
2022-01-07 15:53:10 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2022-01-07 15:53:10 0 [Note] InnoDB: Number of pools: 1
2022-01-07 15:53:10 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2022-01-07 15:53:10 0 [Note] InnoDB: Using Linux native AIO
2022-01-07 15:53:10 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728
2022-01-07 15:53:10 0 [Note] InnoDB: Completed initialization of buffer pool
2022-01-07 15:53:10 0 [ERROR] InnoDB: Upgrade after a crash is not supported. The redo log was created with MariaDB 10.4.23.
2022-01-07 15:53:10 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2022-01-07 15:53:10 0 [Note] InnoDB: Starting shutdown...
2022-01-07 15:53:11 0 [ERROR] Plugin 'InnoDB' init function returned error.
2022-01-07 15:53:11 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2022-01-07 15:53:11 0 [Note] Plugin 'FEEDBACK' is disabled.
2022-01-07 15:53:11 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2022-01-07 15:53:11 0 [ERROR] Aborting
terminate called after throwing an instance of 'wsrep::runtime_error'
  what():  State wait was interrupted
220107 15:53:11 [ERROR] mysqld got signal 6 ;



 Comments   
Comment by Jan Lindström (Inactive) [ 2022-01-13 ]

Plan:

  • Add sending the donor's server version to the joiner
  • Add finding out the joiner's server version
  • Add both to the error log
  • Check the donor's and joiner's versions
    • If donor < 10.2, reject SST with a clear error
    • For the rsync method, do not allow SST between major releases (e.g. 10.4 -> 10.5)
    • Not clear what to do with 10.3. Let x be the major release number on the donor and y be the major release number on the joiner. What to do if
      • y - x > 1 (e.g. donor 10.4, joiner 10.6)
      • y - x < 0 (e.g. donor 10.6, joiner 10.4)
    • If the donor is e.g. 10.4.x and the joiner 10.5.x, then set up INNOPREPARE on the joiner so that it uses the 10.4.x mariabackup binary (make this generic, e.g. it will do the same for 10.5.x vs 10.6.x)
    • Otherwise use the current mariabackup binary
    • After SST has completed and MariaDB has started, execute mysql_upgrade if the major version has changed
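
The version checks in the plan above could be sketched roughly as follows. This is only an illustration in Python, not the actual SST script logic: the names SstDecision, parse_version, decide_sst and needs_mysql_upgrade are hypothetical, "major release" is taken to mean the first two version components (10.4, 10.5, ...) as in the plan, and the cases the plan leaves open (y - x > 1, y - x < 0) are returned as UNDECIDED rather than guessed.

```python
import re
from enum import Enum


class SstDecision(Enum):
    """Possible outcomes of the (hypothetical) donor/joiner version check."""
    REJECT = "reject SST with a clear error"
    USE_DONOR_MARIABACKUP = "prepare with the donor-version mariabackup binary"
    USE_CURRENT_MARIABACKUP = "prepare with the current mariabackup binary"
    UNDECIDED = "not settled in the plan"


def parse_version(text):
    """Extract (major, minor, patch) from strings such as '10.4.23-MariaDB'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized server version: {text!r}")
    return tuple(int(part) for part in match.groups())


def decide_sst(donor_version, joiner_version, method):
    """Apply the checks from the plan; method is 'rsync' or 'mariabackup'."""
    donor = parse_version(donor_version)
    joiner = parse_version(joiner_version)

    # Donors older than 10.2 are rejected outright.
    if donor[:2] < (10, 2):
        return SstDecision.REJECT

    # Assumption: the plan only discusses 10.x releases, so anything else
    # is treated as an open question.
    if donor[0] != joiner[0]:
        return SstDecision.UNDECIDED

    gap = joiner[1] - donor[1]  # y - x in the plan's notation
    if gap == 0:
        return SstDecision.USE_CURRENT_MARIABACKUP

    # rsync copies the redo log as-is, so no SST across major releases.
    if method == "rsync":
        return SstDecision.REJECT

    # Exactly one release apart (e.g. 10.4 -> 10.5): prepare with the
    # donor's mariabackup so the old redo log format can be applied.
    if gap == 1:
        return SstDecision.USE_DONOR_MARIABACKUP

    # y - x > 1 or y - x < 0: left open in the plan.
    return SstDecision.UNDECIDED


def needs_mysql_upgrade(donor_version, joiner_version):
    """mysql_upgrade is run after SST only if the major release changed."""
    return parse_version(donor_version)[:2] != parse_version(joiner_version)[:2]
```

With the versions from this report, decide_sst("10.4.23", "10.5.14", "mariabackup") selects the donor-version mariabackup binary for the prepare step, while the same pair with method "rsync" is rejected.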