[MDEV-32010] Deduplicated dump via zpaq technology Created: 2023-08-25 Updated: 2023-08-25 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Backup |
| Fix Version/s: | None |
| Type: | Task | Priority: | Minor |
| Reporter: | Franco Corbelli | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Description |
|
The functionality of mysqldump can be greatly enhanced by using a program with block deduplication (actually, it could be done even better, since the dump is ASCII text, but I won't burden the discussion).

One of these, in my opinion the best, is zpaq. This is a program that has been in major Linux distributions for years (it runs on Windows too), but is very little known: http://mattmahoney.net/dc/zpaq.html

However, it has a major "flaw" for use with mysqldump: it does not work with stdin (aka: no pipe). With zpaqfranz, which does read from stdin, it is possible to do "things" like

    mysqldump -uroot -ppassword franco | zpaqfranz a archivio_franco.zpaq backup.sql -stdin

My suggestion is: why not extend mysqldump with a versioned/snapshot binary archive too? Basically, instead of using more or less complex scripts to "split" the various days of dumps (Monday, Tuesday, Wednesday...) to maintain a backup history, you could store them all together as if they were snapshots.

I am not proposing the use of "my" program, but of "a" program integrated with mysqldump, in the main MariaDB codebase, to compress and store deduplicated dumps. Something like

This would make life much, much easier for any DBA, and is maybe not too hard to implement (I think that mysqldump somewhere "streams out" 1 byte at a time to stdout, so a compressor that "streams in" 1 byte at a time would be OK, just like zpaq).

Just a suggestion!

I would have other suggestions for more complex situations (databases too large to use mysqldump), but they are less straightforward.

I apologize if I opened a task (I can only create a task, bug or epic) on Jira, but I am a beginner here, so I hope to be forgiven. |
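
To make the "snapshots in one archive" idea concrete, here is a minimal Python sketch of a deduplicated, versioned store fed from a stream. It is an illustration only and not how zpaq or zpaqfranz work internally: the class name `DedupArchive`, its methods, and the choice of one line per block hashed with SHA-256 are all assumptions made for this example. A real tool would use larger blocks, compression, and an on-disk format.

```python
import hashlib
import io


class DedupArchive:
    """Toy versioned archive (hypothetical): each distinct line is stored once."""

    def __init__(self):
        self.blocks = {}    # digest -> line bytes, shared by all versions
        self.versions = {}  # version name -> ordered list of digests

    def add_version(self, name, stream):
        """Read a dump from a binary stream; return the bytes newly stored."""
        digests = []
        new_bytes = 0
        for line in stream:  # consumes the stream incrementally, like a pipe
            digest = hashlib.sha256(line).digest()
            if digest not in self.blocks:
                self.blocks[digest] = line
                new_bytes += len(line)
            digests.append(digest)
        self.versions[name] = digests
        return new_bytes

    def restore(self, name):
        """Rebuild the exact bytes of one stored version."""
        return b"".join(self.blocks[d] for d in self.versions[name])


# Two made-up daily dumps; Tuesday only appends one row to Monday's data.
monday = b"INSERT INTO t VALUES (1,'a');\nINSERT INTO t VALUES (2,'b');\n"
new_row = b"INSERT INTO t VALUES (3,'c');\n"
tuesday = monday + new_row

archive = DedupArchive()
added_monday = archive.add_version("monday", io.BytesIO(monday))
added_tuesday = archive.add_version("tuesday", io.BytesIO(tuesday))
```

In this sketch the second version costs only the bytes of the new row, yet both days can be restored in full. With a real dump, the `stream` argument could be `sys.stdin.buffer` at the receiving end of a `mysqldump ... |` pipe.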