Details
-
New Feature
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
Description
Problem:
When using mysqldump with --tab or --dir to export tables to tab-separated text files (.txt), BLOB and BINARY columns containing binary data (especially null bytes 0x00) get corrupted or truncated. The tab-separated text format cannot properly represent binary data, making it impossible to reliably backup and restore tables with binary content using this method.
Current workarounds are inadequate:
- Using standard mysqldump (without --tab) creates large SQL files with INSERT statements
- Manual hex encoding/decoding is error-prone and not integrated
- Binary data is lost or corrupted during the export/import cycle
Proposed Solution:
Extend the existing --hex-blob option to work with --tab/--dir mode for both mysqldump and mariadb-import:
1. mysqldump --hex-blob --tab/--dir:
- Automatically detect BLOB and BINARY columns
- Wrap them with HEX() in the SELECT statement
- Export binary data as hexadecimal strings in .txt files
2. mariadb-import --hex-blob:
- Automatically detect BLOB and BINARY columns from table structure
- Apply UNHEX() transformation during LOAD DATA
- Convert hex strings back to binary data
Together, these changes enable a complete round-trip export/import workflow for tables containing binary data.
Implementation Details:
mysqldump changes (client/mysqldump.cc):
- Query INFORMATION_SCHEMA.COLUMNS for column metadata:
- data_type (binary, varbinary, tinyblob, blob, mediumblob, longblob)
- character_set_name (NULL or 'binary' for binary data)
- When --hex-blob and --tab/--dir are both specified:
- Wrap BLOB/BINARY columns with HEX(column_name) AS column_name
- Generate SELECT statement with hex-encoded binary columns
- Non-binary columns remain unchanged
mariadb-import changes (client/mysqlimport.cc):
- Add --hex-blob command-line option
- Query INFORMATION_SCHEMA.COLUMNS to determine table structure:
- column_name, data_type, character_set_name
- Identify BLOB/BINARY columns based on data type and charset
- Modify LOAD DATA INFILE statement:
- Load hex data into user variables: (col1, @blob_col_hex, col3)
- Add SET clause: SET blob_col=UNHEX(@blob_col_hex)
- Works with:
- --columns parameter (explicit column list)
- All columns (default)
- --dir mode and standalone .txt files
Benefits:
1. Data Integrity:
- Preserves binary data exactly, including null bytes
- Verifiable round-trip export/import
- No data loss or corruption
2. Performance:
- Tab-separated files are faster than SQL INSERT statements
- Efficient for large tables
- Suitable for parallel processing
3. Compatibility:
- Works with existing --tab/--dir infrastructure
- Compatible with other text processing tools
- Human-readable hex format for debugging
4. Flexibility:
- Works with mixed column types (BLOB + VARCHAR + INT)
- Handles NULL and empty values correctly
- Supports partial column imports with --columns
Backward Compatibility:
- Fully backward compatible
- --hex-blob is optional; existing behavior unchanged without it
- No changes to default mysqldump or mariadb-import behavior
- Only affects --tab and --dir mode when explicitly requested