Details
- Type: Epic
- Status: Stalled
- Priority: Critical
- Resolution: Unresolved
Description
Enhance mariadb-dump features similar to MyDumper
Backup
- Increase the parallelism (MDEV-32216).
- With the --tab option, only one database can be backed up at a time.
- With mariadb-dump --tab, the binlog position is not stored in a separate file; it is only printed to stdout.
- Split big tables into multiple data files, which would help make restore faster. This should be done only once bulk load over multiple connections becomes faster.
Other features that could be added to mariadb-dump that would make it more useful:
- Compression of the dump (especially with --tab). This also implies that we would implement decompression of the files on the server (MDEV-28395).
- Dump tables not alphabetically but in descending size order (biggest tables first). In most cases this ensures that the user does not have to wait for the big tables to finish at the end of the backup. See benchmarks in MDEV-32216.
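The effect of dumping the biggest tables first can be illustrated with a small scheduling simulation. This is only a sketch under stated assumptions: the table names and sizes are hypothetical, dump time is modelled as proportional to table size, and each table is handed to whichever worker becomes free first. It does not reflect how mariadb-dump actually schedules work.

```python
import heapq

def makespan(sizes, workers):
    """Finish time when each table, in the given order, goes to the earliest-free worker."""
    # Min-heap of per-worker "busy until" times; cost is modelled as table size.
    free_at = [0] * workers
    heapq.heapify(free_at)
    for size in sizes:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + size)
    return max(free_at)

# Hypothetical table sizes (arbitrary units), keyed by table name.
tables = {"audit_log": 2, "customers": 3, "orders": 4, "sessions": 1, "zz_events": 10}

alphabetical = [tables[name] for name in sorted(tables)]
biggest_first = sorted(tables.values(), reverse=True)

print(makespan(alphabetical, workers=2))   # 14: the big table starts last
print(makespan(biggest_first, workers=2))  # 10: the big table runs alongside the rest
```

With two workers, the alphabetical order leaves the largest table to start last, so one worker sits idle while the other finishes it. Starting with the largest table lets the smaller ones fill the second worker in the meantime.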
Restore:
- mariadb-import can restore only one database at a time, and the user needs to restore the table structure and the data files separately. We should enhance mariadb-import with an option to re-create the table structure first and then start the data import.
- For customers with 100+ databases, mariadb-dump (with --tab) and mariadb-import are not user-friendly. (Not hard to fix.)
- Have mariadb-import, in parallel mode, start importing the tables in descending size order (biggest tables first). In most cases this ensures that the user does not have to wait for the big tables to finish at the end of the restore.
We need to improve the functionality of mariadb-dump and mariadb-import to allow for the backup and restoration of multiple databases using a single command.
Description of the intended design
- Allow multiple databases with a new switch --dir (suggestions for a better name are welcome, as I have no good idea at the moment).
With this set, mariadb-dump creates a directory tree under the given path: <path>/dbname/ for each database, containing <tablename>.txt (tab-separated data, created using SELECT INTO OUTFILE) and <tablename>.sql (DDL) for every table in that database.
The files are almost exactly the same as before with --tab, except for the directory tree.
- mariadb-import also takes a new --dir parameter that points to the directory tree created by the dump. It executes all *.sql files to create the tables (and the databases, if they do not exist), then loads the data from the .txt files using LOAD DATA INFILE.
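The directory layout and the import order described above can be sketched as follows. This is a hedged illustration, not the actual mariadb-import implementation: the function name `import_plan`, the sample database and table names, and the placeholder file contents are all assumptions. The sketch only walks a `<path>/dbname/<tablename>.{sql,txt}` tree and returns the DDL files to execute first, followed by the data files to load.

```python
import tempfile
from pathlib import Path

def import_plan(dump_dir):
    """Return (ddl_files, data_loads) for a <dump_dir>/<dbname>/<table>.{sql,txt} tree."""
    root = Path(dump_dir)
    ddl_files, data_loads = [], []
    for db_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # DDL files: one per table, to be executed before any data is loaded.
        ddl_files.extend(sorted(db_dir.glob("*.sql")))
        for txt in sorted(db_dir.glob("*.txt")):
            # (database, table, data file): the inputs a LOAD DATA INFILE would need.
            data_loads.append((db_dir.name, txt.stem, txt))
    return ddl_files, data_loads

# Build a small hypothetical dump tree and plan its import.
with tempfile.TemporaryDirectory() as tmp:
    for db, table in [("shop", "orders"), ("shop", "customers"), ("crm", "leads")]:
        db_dir = Path(tmp, db)
        db_dir.mkdir(exist_ok=True)
        (db_dir / f"{table}.sql").write_text("-- DDL placeholder\n")
        (db_dir / f"{table}.txt").write_text("")
    ddl, loads = import_plan(tmp)
    ddl_names = [f"{p.parent.name}/{p.name}" for p in ddl]
    load_targets = [(db, table) for db, table, _ in loads]

print(ddl_names)     # ['crm/leads.sql', 'shop/customers.sql', 'shop/orders.sql']
print(load_targets)  # [('crm', 'leads'), ('shop', 'customers'), ('shop', 'orders')]
```

Keeping the DDL list separate from the data list mirrors the two phases in the design: create all tables first, then load the data.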
Big tables won't be split for now, and I do not think they should be in the future. If required, threading should be handled transparently by LOAD DATA INFILE, i.e. by the server, rather than by loading a single table over two different connections. LOAD DATA is our official bulk-loading interface, and we had better improve it in the server rather than build workarounds.
Issue Links
- is blocked by
  - MDEV-34719 Disable purge for LOAD DATA INFILE into empty table (In Review)
  - MDEV-34740 mariadb-import: delay creation of secondary indexes until after data load (In Testing)
- relates to
  - MDEV-34703 Innodb bulk load : high IO, crashes on Linux, OOM on Linux, as tested with LOAD DATA INFILE (Open)
  - MDEV-34739 Implement DISABLE KEYS/ENABLE KEYS in Innodb (Open)
  - MDEV-34832 Support adding AUTO_INCREMENT flag to existing numeric using INPLACE (Open)
  - MDEV-28395 LOAD DATA transfer plugins (Open)
  - MDEV-34890 SELECT INTO OUTFILE/LOAD DATA INFILE - option to encode binary data as hex of base64 (Open)
- split to
  - MDEV-33625 Add option --dir to mariadb-dump (Closed)
  - MDEV-33627 Add option --dir to mariadb-import (Closed)