Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.11.10
-
None
-
Ubuntu 22.04, Lenovo Thinksystem SR630, 512GB RAM, 128 cores, 15TB SSD.
-
Can result in unexpected behaviour
-
Description
Over the last year or so I've been migrating our systems away from MySQL 8.0 to MariaDB 10.11.
A few of my databases contain extremely large multi-billion row tables. The database is stored on SSD, and the tables are in MyISAM format. Migrating to InnoDB is NOT an option, since the InnoDB footprint is at least triple the size of MyISAM and we simply don't have the SSD to spare. Partitioning the tables is also not currently an option. The databases have performed fine since they were originally created in MySQL 5 (CentOS 7), then upgraded to MySQL 8 (Ubuntu 22.04). Most of my databases have now been migrated to MariaDB 10.11 (Ubuntu 22.04). During the migration most of the MyISAM tables, which were moved as binary file copies using XtraBackup, MariaDB did complain that the tables needed to be "repaired". In the vast majority of cases the tables repaired successfully (specifically the indexes) and all are functioning fine, accepting tens of millions of new daily inserts, updates and read-only queries.
However, I have one database (which I've been running successfully for 15 years) that has two MyISAM tables (different structures) that contain 9.7 billion rows (tcs_transient_objects, 1 pk and 34 secondary indexes, 2.6TB MYD, 3.8TB MYI) and 13.4 billion rows (tcs_transient_reobservations, 1 pk and 14 secondary indexes, 3.6TB MYD, 2.0TB MYI) respectively. Both work fine in MySQL 8.0.
Both tables needed to run a REPAIR operation before I could use them in MariaDB (I'm not entirely sure why), the main result of which was a rebuild of the secondary indexes. After the rebuild (no errors reported) both primary keys performed as expected, returning a non-zero number of rows when used in a where clause.
The 9.7 billion row table's secondary indexes (34 of them!) all functioned as expected and returned non-zero rows when used in a where clause.
However, the 13.4 billion row table (tcs_transient_reobservations, and much narrower than the above table) returned no rows when attempting to use the secondary indexes. I have attempted to rebuild the indexes in the following way:
1. repair table: REPAIR TABLE tcs_transient_reobservations
2. myisamchk, using the following parameters (the server has 512GB memory):
myisamchk --force --update-state --information --verbose --key_buffer_size=64G --sort_buffer_size=64G --read_buffer_size=1G --write_buffer_size=1G tcs_transient_reobservations.MYI
|
3. Complete outgest and dump of the entire table into a CSV file and re-ingest into a recreated table structure.
Each stage took at least 4 days to complete. At no stage were any errors reported, and when complete, the primary key functioned correctly, the secondary indexes appear to have built, but do NOT function correctly.
(Note that mysqldump takes several days and simple mysql ingest of the dump file takes at least 1 month, so I haven't tried re-ingest this way for the moment.)
In my searching, googling, even using AI, I have not come across a satisfactory solution.
Initial reports appeared to indicate implicit creation of 32-bit index pointers, which of course would break with > 4B rows. Indeed, there even seems to be a MAX_ROWS hard limit specified when attempting to use it when creating the table. (It is NOT used in my case in the table creation script.)
The default pointer size in my database is 6, which is enough to address all the required rows.
MariaDB [(none)]> SHOW VARIABLES LIKE 'myisam_data_pointer_size';
|
+--------------------------+-------+
|
| Variable_name | Value |
|
+--------------------------+-------+
|
| myisam_data_pointer_size | 6 |
|
+--------------------------+-------+
|
|
From my various experiments, the conclusion seems to be that this has nothing to do with pointer size allocation. (My 9.6 billion row table works fine.) It appears there is a silent bug in the implementation of MyISAM for MariaDB. Furthermore, although I haven't yet tried an Aria rebuild of the table, it would also appear that the bug exists in the Aria engine.
This is a real show stopper for us. I would like to harmonise our systems to one server, one OS. Most of our databases do use InnoDB (and the smaller ones have been migrated to use that engine), but our very large catalogue tables need to run in MyISAM or Aria format, for space purposes. It's looking increasingly clear that our only current option is to migrate back to MySQL (8.4 now).
I can provide copies of the table creation script and the data if necessary - though note the sizes above. Likewise I can provide server config if required.