[MDEV-18443] not using key in galera nodes Created: 2019-02-01  Updated: 2019-03-13  Resolved: 2019-03-13

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.3.12
Fix Version/s: 10.2.23, 10.3.14, 10.4.4

Type: Bug Priority: Major
Reporter: Christian Hesse Assignee: Jan Lindström (Inactive)
Resolution: Duplicate Votes: 2
Labels: galera, innodb
Environment:

Arch Linux with packages:
mariadb 10.3.12-5
galera 25.3.25-3



 Description   

I upgraded a mariadb galera cluster from version 10.1.37 to version 10.3.12. The galera version did not change (though the package was rebuilt). Since the update the cluster nodes show different behavior on key usage.

MariaDB [(none)]> SELECT @@hostname;
+------------+
| @@hostname |
+------------+
| mariadb01  |
+------------+
1 row in set (0.000 sec)
 
MariaDB [(none)]> CREATE TABLE `test`.`reproduce` (
    ->   `id` INT NOT NULL AUTO_INCREMENT,
    ->   `value1` INT NOT NULL,
    ->   `value2` VARCHAR(255) NOT NULL,
    ->   PRIMARY KEY (`id`),
    ->   KEY `value1` (`value1`),
    ->   KEY `value2` (`value2`));
Query OK, 0 rows affected (0.023 sec)
 
MariaDB [(none)]> INSERT INTO `test`.`reproduce` VALUES (NULL, 1, "1"),
    -> (NULL, 1, "2"),
    -> (NULL, 2, "3"),
    -> (NULL, 3, "4");
Query OK, 4 rows affected (0.008 sec)
Records: 4  Duplicates: 0  Warnings: 0
 
MariaDB [(none)]> EXPLAIN SELECT * FROM `test`.`reproduce` WHERE `value1` = 1;
+------+-------------+-----------+------+---------------+--------+---------+-------+------+-------+
| id   | select_type | table     | type | possible_keys | key    | key_len | ref   | rows | Extra |
+------+-------------+-----------+------+---------------+--------+---------+-------+------+-------+
|    1 | SIMPLE      | reproduce | ref  | value1        | value1 | 4       | const |    2 |       |
+------+-------------+-----------+------+---------------+--------+---------+-------+------+-------+
1 row in set (0.000 sec)

That looks ok. However another node does not use the key:

MariaDB [(none)]> SELECT @@hostname;
+------------+
| @@hostname |
+------------+
| mariadb02  |
+------------+
1 row in set (0.000 sec)
 
MariaDB [(none)]> EXPLAIN SELECT * FROM `test`.`reproduce` WHERE `value1` = 1;
+------+-------------+-----------+------+---------------+------+---------+------+------+-------------+
| id   | select_type | table     | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-----------+------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | reproduce | ALL  | value1        | NULL | NULL    | NULL |    1 | Using where |
+------+-------------+-----------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.000 sec)

The host receiving the structural change from client is ok, others fail.

This effects tables with millions of rows, breaking joins that need several seconds or minutes to complete where it finished in some milliseconds before.

This takes down the whole cluster due to massively increased load.



 Comments   
Comment by Christian Hesse [ 2019-02-01 ]

Looks like running...

ANALYZE TABLE `test`.`reproduce`;

... works around the issue. That helps for now, but a real fix is highly appreciated.

Comment by Elena Stepanova [ 2019-02-03 ]

Could you please provide the example of different plans from real big tables?
Unless you run ANALYZE, the lack of accurate statistics is not necessarily a bug, it depends on how inaccurate it is, and how persistent the problem is.

Do you have innodb_stats_auto_recalc enabled?
Did you try to tune innodb_stats_persistent_sample_pages?

Comment by Christian Hesse [ 2019-02-03 ]

All cluster nodes have:

MariaDB [(none)]> SHOW VARIABLES LIKE 'innodb_stats_auto_recalc';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_stats_auto_recalc | ON    |
+--------------------------+-------+
1 row in set (0.012 sec)

Just dumped two large tables from live system and imported them on my testing cluster. First node (which did the import):

MariaDB [(none)]> SELECT COUNT(*) FROM `test`.`t_personengruppen`;
+----------+
| COUNT(*) |
+----------+
|  8289996 |
+----------+
1 row in set (1.386 sec)
 
MariaDB [(none)]> SELECT COUNT(*) FROM `test`.`t_gruppen`;
+----------+
| COUNT(*) |
+----------+
|  3643386 |
+----------+
1 row in set (0.693 sec)
 
MariaDB [(none)]> EXPLAIN SELECT * FROM `test`.`t_personengruppen` JOIN `test`.`t_gruppen` ON `GruppenID` = `GruppenFID`;
+------+-------------+-------------------+------+---------------+------------+---------+--------------------------+---------+-------+
| id   | select_type | table             | type | possible_keys | key        | key_len | ref                      | rows    | Extra |
+------+-------------+-------------------+------+---------------+------------+---------+--------------------------+---------+-------+
|    1 | SIMPLE      | t_gruppen         | ALL  | GruppenID     | NULL       | NULL    | NULL                     | 3557634 |       |
|    1 | SIMPLE      | t_personengruppen | ref  | GruppenFID    | GruppenFID | 4       | test.t_gruppen.GruppenID |       4 |       |
+------+-------------+-------------------+------+---------------+------------+---------+--------------------------+---------+-------+
2 rows in set (0.000 sec)

Second node (which shows identical results with third node):

MariaDB [(none)]> SELECT COUNT(*) FROM `test`.`t_personengruppen`;
+----------+
| COUNT(*) |
+----------+
|  8289996 |
+----------+
1 row in set (1.295 sec)
 
MariaDB [(none)]> SELECT COUNT(*) FROM `test`.`t_gruppen`;
+----------+
| COUNT(*) |
+----------+
|  3643386 |
+----------+
1 row in set (0.573 sec)
 
MariaDB [(none)]> EXPLAIN SELECT * FROM `test`.`t_personengruppen` JOIN `test`.`t_gruppen` ON `GruppenID` = `GruppenFID`;
+------+-------------+-------------------+------+---------------+------+---------+------+------+-------------------------------------------------+
| id   | select_type | table             | type | possible_keys | key  | key_len | ref  | rows | Extra                                           |
+------+-------------+-------------------+------+---------------+------+---------+------+------+-------------------------------------------------+
|    1 | SIMPLE      | t_personengruppen | ALL  | GruppenFID    | NULL | NULL    | NULL |    1 |                                                 |
|    1 | SIMPLE      | t_gruppen         | ALL  | GruppenID     | NULL | NULL    | NULL |    1 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------------------+------+---------------+------+---------+------+------+-------------------------------------------------+
2 rows in set (0.000 sec)

Comment by Elena Stepanova [ 2019-02-04 ]

Thanks.
It resembles MDEV-18226 (which I think was closed prematurely).

Comment by Robert Kirscht [ 2019-03-05 ]

Hi, we are also seeing this behaviour on our 3-node Galera cluster running MariaDB v10.2.17 in a Read/Write-split configuration (no real master/slave, but only one machine is writing at one time to maintain data consistency). This was btw a fresh install of 10.2.17, not an upgrade from an earlier version. The index stats are correctly calculated and updated on the writer machine, but not on the two nodes receiving the changes via replication, where the cardinality values seem to correctly change to 0 when the tables are emptied, but then don't get recalculated once data is re-inserted. This is causing severe problems on our infrastructure because the Query Analyzer is optimizing queries based non-existent index data, causing severe CPU load on the DB servers (blocking them from efficient usage) and frustratingly long delays on our application servers. We have implemented a cronjob which regularly does an "ANALYZE TABLE" on certain candidates that regularly get rebuilt from scratch (search index tables etc), but would also much appreciate a proper fix for this. Sounds to me like the background process that is supposed to do the stats recalculations necessary from certain replication changes is failing to do its job properly. Agree that https://jira.mariadb.org/browse/MDEV-18226 was closed prematurely as it describes exactly this problem as well.

Comment by Jan Lindström (Inactive) [ 2019-03-13 ]

Fixed on https://jira.mariadb.org/browse/MDEV-18577

Generated at Thu Feb 08 08:44:08 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.