[MDEV-5336] Implement BACKUP STAGE for safe external backups Created: 2013-11-26 Updated: 2024-01-19 Resolved: 2019-02-04 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Backup, Server |
| Fix Version/s: | 10.4.2 |
| Type: | Task | Priority: | Critical |
| Reporter: | erkan yanar | Assignee: | Michael Widenius |
| Resolution: | Fixed | Votes: | 19 |
| Labels: | Backup | ||
| Issue Links: |
|
| Description |
|
The purpose of this task is to ensure that mariabackup will be able to copy
Instead of using FLUSH TABLES WITH READ LOCK in mariabackup
With the above in mind, here is a detailed description of how the BACKUP STAGE's
BACKUP STAGE START
mariabackup can now copy all transactional tables and aria_log_control, aria_log.# and
BACKUP STAGE FLUSH
Next lock can be taken directly after this lock. While waiting for the
BACKUP STAGE BLOCK_DDL
Next lock can be taken directly after this lock. While waiting for the
BACKUP STAGE BLOCK_COMMIT
When stage BLOCK_COMMIT returns, this is the 'backup time'. mariabackup can now copy the last changes to the redo files for InnoDB
BACKUP STAGE END
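As a rough sketch, assuming a backup tool issues all stages from one connection, the sequence described above could look like the following. The copy steps are given only as comments, and only where this description states them. What is copied between the middle stages is not spelled out here and is left open.

```sql
-- Sketch only: all statements are issued from the same connection by the backup tool.
BACKUP STAGE START;
-- Per the description: copy transactional tables and aria_log_control, aria_log.#

BACKUP STAGE FLUSH;
-- (copy step for this stage is not stated in this description)

BACKUP STAGE BLOCK_DDL;
-- (copy step for this stage is not stated in this description)

BACKUP STAGE BLOCK_COMMIT;
-- This is the 'backup time': copy the last changes to the InnoDB redo files

BACKUP STAGE END;
-- Backup locks are released
```

Each stage takes a stronger lock than the previous one, so the tool does its bulk copying early and keeps the most restrictive stage as short as possible.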
Other things:
|
| Comments |
| Comment by erkan yanar [ 2015-01-28 ] |
|
In the end we need something like LOCK TABLES FOR BACKUP like with Percona: |
| Comment by Peter McLarty [ 2015-03-16 ] |
|
Oracle stores tables in tablespaces, and a tablespace can be placed anywhere on disk. In Oracle you can place a tablespace in backup mode, and all data within that tablespace is then kept consistent. Oracle writes logs of the activity to be applied to that tablespace until the lock is released. This might be a step too far, but it may help formulate a way forward to get this to work. Tablespaces would probably need to be implemented within the engine, but could still be kept in the MariaDB variables. |
| Comment by guo feng [ 2015-11-17 ] |
|
Hope backup locks come to MariaDB soon~ |
| Comment by Vladislav Vaintroub [ 2018-03-27 ] |
|
Perhaps we should already adopt 8.0 syntax, not sure about it https://dev.mysql.com/doc/refman/8.0/en/lock-instance-for-backup.html |
| Comment by Sergei Golubchik [ 2018-09-17 ] |
|
Notes:
|
| Comment by Sergei Golubchik [ 2018-09-17 ] |
|
UI Notes. Better to avoid explicit stage numbers. Ideas, in order from small to big:
|
| Comment by Marko Mäkelä [ 2018-09-17 ] |
|
Stage 1 is using confusing terminology:
In InnoDB, redo log files can only be removed at startup, when creating new redo log files due to size mismatch, upgrade, or a change of the parameter innodb_encrypt_log. The InnoDB purge is associated with undo log records, which reside in data pages in the system tablespace or undo tablespaces. These data pages are redo-logged just like any other data pages.
For stage 1, InnoDB could issue a redo log checkpoint. This would require new handler API.
Stage 3 would block TRUNCATE TABLE among other things. I believe that this is unnecessary after the
As a prerequisite of
Note: For ALTER TABLE…ALGORITHM=INPLACE, the .frm file is replaced after the operation has been committed in the storage engine. The commit inside InnoDB cannot be blocked, because data dictionary transactions are special. So, the block would have to be somewhere before the call to handler::commit_inplace_alter_table(commit=true).
Note: If ALTER TABLE is blocked before the final rename operations, then the backup can end up containing tables whose name starts with #sql. Inside InnoDB, such tables would be dropped at startup by
Stage 4 would again be ignored by InnoDB. InnoDB can basically write redo and data files at any time, so the ib_logfile* would have to be copied until the very end, which marks the logical time of the backup.
I strongly believe that the TRUNCATE fix ( |
| Comment by Sergey Vojtovich [ 2018-09-19 ] |
|
From what I understand, this framework aims to be engine independent. If so, it should be reflected in the description, so that we can think about the impact on other storage engines in advance.
Stage 1
Stage 2
At a low level it means: close unused non-transactional TABLE instances and mark their TABLE_SHAREs as flushed? Anything else?
Are we going to block all new write locks for all unused non-transactional tables, or only for the tables that were just flushed? Are we going to block write locks, but not DDL? Is it going to be implemented via MDL, or does every non-transactional storage engine have to implement this functionality? Or some other way? How do we pass the list of used non-transactional tables to the client (that is, the tables that it should skip at this point)?
Stage 3
backup thread: writes to non-transactional tables are blocked |
| Comment by Sergei Golubchik [ 2018-09-19 ] |
|
Another possible deadlock:
With MyISAM tables one can solve a deadlock with a fallback-and-retry approach, as MyISAM tables should not be locked in the middle of the statement. But non-InnoDB transactional tables are locked till the end of the transaction, so fallback-and-retry doesn't work for them. And RocksDB is one of these "non-InnoDB transactional tables"; currently it's backed up in a one-shot action under FTWRL (see |
| Comment by Michael Widenius [ 2018-09-19 ] |
|
Comment to Sergei's deadlock:
This can be solved as we do with FTWRL, where we abort any transaction that is waiting for a lock on a non-transactional table. As transactional tables will not be locked, the user thread will either be able to continue (not blocked in thr_lock) or, if it is blocked in thr_lock, FTWRL will abort the thr_lock and stage 3 can continue. |
| Comment by Vladislav Vaintroub [ 2018-09-19 ] |
|
serg monty : Why is it called "BACKUP STAGE X" if the server only takes locks? The SQL command does not do a backup. |
| Comment by Vladislav Vaintroub [ 2018-09-20 ] |
|
we do not always need BACKUP stage 4. |
| Comment by Sergey Vojtovich [ 2018-09-21 ] |
|
Why do we need to start the "service to log changed tables" at STAGE 1? Why does this service have to log changes for all storage engines, given that "mariabackup can detect DDL's for InnoDB that are stored in the redo log"? Blocking purge of redo files (Aria specific) also doesn't seem to be required at STAGE 1. However, if it is instant and doesn't produce measurable overhead, it is probably fine. Edge cases: What if a storage engine is loaded using INSTALL PLUGIN between prepare_for_backup() and end_backup()? |
| Comment by Sergey Vojtovich [ 2018-11-15 ] |
|
Just for the record: previous review base before force push https://github.com/MariaDB/server/commits/6bb480cc58ab137d675497205e0d3566cbb39dd0 |
| Comment by Michael Widenius [ 2018-11-23 ] |
|
matthias, the above assert is fixed in the current code. I just added a test case to cover this case. |
| Comment by Michael Widenius [ 2018-11-27 ] |
|
Will fix the privileges to only allow root to execute backup stages; |
| Comment by Vladislav Vaintroub [ 2018-11-27 ] |
|
If you only allow "root" to run the BACKUP STAGE statements, then users who were able to run a backup in 10.3 would not be able to do so in 10.4. If this is by design, it should be documented. |
| Comment by Michael Widenius [ 2018-11-27 ] |
|
I will ensure it uses the same privileges as FTWRL. |
| Comment by VAROQUI Stephane [ 2019-01-25 ] |
|
Quick question on myisam: BLOCK all new write row locks for all non transactional tables |
| Comment by Michael Widenius [ 2019-02-04 ] |
|
Discussions to implement usage of backup stages in mariabackup |