[MDEV-18443] not using key in galera nodes Created: 2019-02-01 Updated: 2019-03-13 Resolved: 2019-03-13 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera |
| Affects Version/s: | 10.3.12 |
| Fix Version/s: | 10.2.23, 10.3.14, 10.4.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Christian Hesse | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Duplicate | Votes: | 2 |
| Labels: | galera, innodb | ||
| Environment: |
Arch Linux with packages: |
||
| Description |
|
I upgraded a MariaDB Galera cluster from version 10.1.37 to version 10.3.12. The Galera version did not change (though the package was rebuilt). Since the upgrade, the cluster nodes behave differently with respect to key usage.
That looks OK. However, another node does not use the key:
The host that receives the structural change from the client is fine; the others fail. This affects tables with millions of rows: joins that previously finished in milliseconds now need several seconds or minutes to complete. The massively increased load takes down the whole cluster. |
| Comments |
| Comment by Christian Hesse [ 2019-02-01 ] |
|
Looks like running...
... works around the issue. That helps for now, but a real fix would be highly appreciated. |
| Comment by Elena Stepanova [ 2019-02-03 ] |
|
Could you please provide an example of the differing plans from real, large tables? Do you have innodb_stats_auto_recalc enabled? |
| Comment by Christian Hesse [ 2019-02-03 ] |
|
All cluster nodes have:
I just dumped two large tables from the live system and imported them into my test cluster. First node (which did the import):
Second node (which shows results identical to the third node):
| Comment by Elena Stepanova [ 2019-02-04 ] |
|
Thanks. |
| Comment by Robert Kirscht [ 2019-03-05 ] |
|
Hi, we are also seeing this behaviour on our 3-node Galera cluster running MariaDB v10.2.17 in a read/write-split configuration (no real master/slave, but only one machine writes at any one time to maintain data consistency). This was, by the way, a fresh install of 10.2.17, not an upgrade from an earlier version.

The index stats are correctly calculated and updated on the writer machine, but not on the two nodes receiving the changes via replication. There, the cardinality values seem to correctly change to 0 when the tables are emptied, but then don't get recalculated once data is re-inserted. This is causing severe problems on our infrastructure because the query optimizer is planning queries based on non-existent index data, causing severe CPU load on the DB servers (blocking them from efficient usage) and frustratingly long delays on our application servers.

We have implemented a cronjob which regularly runs "ANALYZE TABLE" on certain candidates that regularly get rebuilt from scratch (search index tables etc.), but we would also much appreciate a proper fix for this. It sounds to me like the background process that is supposed to recalculate the stats after certain replicated changes is failing to do its job properly. I agree that https://jira.mariadb.org/browse/MDEV-18226 was closed prematurely, as it describes exactly this problem as well. |
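The cronjob workaround described above could be sketched roughly as follows. This is only an illustration, not the script actually used: the helper names `quote_identifier` and `build_analyze_statements` and the schema/table names are hypothetical, and the script only prints `ANALYZE TABLE` statements rather than connecting to a server.

```python
def quote_identifier(name: str) -> str:
    """Quote a MariaDB identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_analyze_statements(tables):
    """Return one ANALYZE TABLE statement per (schema, table) pair."""
    return [
        f"ANALYZE TABLE {quote_identifier(schema)}.{quote_identifier(table)};"
        for schema, table in tables
    ]


if __name__ == "__main__":
    # Hypothetical tables that get rebuilt from scratch; replace with real names.
    candidates = [("shop", "search_index"), ("shop", "product_cache")]
    for statement in build_analyze_statements(candidates):
        print(statement)
```

The printed statements could then be piped into the `mysql` command-line client from a cron entry on the nodes whose statistics go stale. Which nodes need it, and how often, depends on the setup.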
| Comment by Jan Lindström (Inactive) [ 2019-03-13 ] |