[MDEV-9763] Replication breaks MyISAM table indexes Created: 2016-03-18 Updated: 2017-09-18 Resolved: 2017-09-18 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Affects Version/s: | 10.0.22 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Vitaliy Margolen | Assignee: | Sergei Golubchik |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Environment: |
OpenSUSE 13.2 x64 |
||
| Description |
|
Have Master with multiple slaves replication setup. Have numerous MyISAM tables that are being imported from flat files in about this way:
On master indexes appears to be correct:
However on slave indexes for the same table look this way:
In fact, I have 3 slaves replicating from same master. And on all of the salves indexes look exactly the same. The table create SQL:
This is just one out of 22 tables that have exactly the same problem. Running analyze table <table>; fixes it temporarily until next data import. |
| Comments |
| Comment by Elena Stepanova [ 2016-03-18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I assume that you are using RBR, and that by "broken indexes" you mean that statistics is not collected (I don't see anything else wrong in the output that you provided). But what remains unclear is what you mean by "just one out of 22 tables have exactly the same problem". Do other tables not have any problem, and their statistics on slaves is identical to the master, or is it just that the stat values are anything but NULL? From what I see, statistics is indeed not collected upon executing row events. In some cases values remain NULL, in other they are populated with some rubbish. It's the same in MySQL, only values are different. I couldn't find in the manual anything about whether statistics in this case is supposed to be collected. serg, what do you say? Test case to reproduce is pretty simple:
Output on MySQL 5.7
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vitaliy Margolen [ 2016-03-18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I didn't say "just one table". I said "22 tables have this exact problem" - broken indexes. And yes, index without statistics (cardinality) will not be used. So in the sense it is broken. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2016-03-18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Vitaliy, sorry for the mix-up about 1 vs 22 tables, I misunderstood you. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2016-03-18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I don't see any bug here. The statistics is created when you do ANALYZE TABLE. Sometimes it might be created in other cases, as a side effect. You should not rely on that, just as you should not rely on SELECT * FROM table returning rows in any specific order. If you need rows to be ordered — you should use ORDER BY. If you need to update your table's statistics — you should use ANALYZE TABLE. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vitaliy Margolen [ 2016-03-18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Please reopen. You did not read correctly. This table is created from scratch. And has all data inserted into it in one operation. This should always update table statistics. Especially that this operation is performed in exactly the same way on slave, replicating all master operations. The load data infile in mixed mode sends SDL with file contents. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2016-03-18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Sorry for closing the issue prematurely. I've reopened it. Still, I did understand the situation correctly. This table is created from scratch. And has all data inserted into it in one operation. This is true. But this does not mean that the statistics will be automatically updated, not at all. Usually when you bulk insert into an empty MyISAM table the statistics is updated as a convenient side-effect of the "repair-by-sort" optimization, but not always. You don't even need to use replication — create a table with only UNIQUE indexes (over nullable columns) and LOAD DATA into it, you will see that the cardinality will be NULL. The only reliable way of updating table statistics is ANALYZE TABLE. Everything else is not guaranteed to work. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vitaliy Margolen [ 2016-03-18 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
But why does it behave differently on Master and Slave servers? Almost reminds me of an old old bug that was breaking cardinality after table rename. But that was around 5.0.x. As to claim that only "analyze table" shell update cardinality. This is not true. I could not find anything stating this in the MySQL manual. On the contrary. Everyone saying that all update operations will update it. But it might not be accurate. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2016-03-19 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
When a statement inserts many rows (e.g. LOAD DATA or INSERT ... SELECT) into an empty MyISAM table, MyISAM uses bulk-insert optimization — it disables all non-UNIQUE indexes and creates them at the end of the statement using "repair-by-sort". This gives a significant speedup, as repair-by-sort can be orders of magnitude faster than inserting keys into the index one by one. Cardinality data are a byproduct of the repair-by-sort. Note that the table must be originally empty, and this optimization applies only to UNIQUE indexes. Granted, NOT NULL UNIQUE do not need much statistics, the cardinality is always one there, but nullable UNIQUE indexes will have no statistics after a LOAD DATA. Also this optimization never updates engine independent statistics. In your particular case, a bulk insert operation (LOAD DATA) on the master is converted into a series of single row inserts on the slave (RBR Write_row events). That's why the slave gets no index statistics. Run ANALYZE TABLE after the LOAD DATA and you'll have it. Now, I'm starting to think that it might be possible to turn a LOAD DATA on the slave into a bulk insert operation. But it will be a new optimization, by no means a bug fix. This optimization, if at all possible, cannot be done in 10.0 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vitaliy Margolen [ 2016-03-19 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I see what you saying. Yes, we are using LOAD DATA to speed up date import. And yes I know that it rebuilds all indexes after finishing load. So in RBR the data load will be significantly slower, but also indexes will not be optimal and will be missing statistics. Think it worked for me before, several years ago, because replication was statement based then. After switching it over to mixed mode it seems most statements are replicated as RBR. Just looked at the relay log and indeed all rows for LOAD DATA are sent as RBR not original file:
So it's not a bug indeed. Just an implication of RBR. Thanks for clarification. |