[MDEV-7118] Anemometer stop working after upgrade to from 5.6 to 10.0 Created: 2014-11-14  Updated: 2014-11-19  Resolved: 2014-11-19

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.0.14
Fix Version/s: 10.0.15

Type: Bug Priority: Critical
Reporter: VAROQUI Stephane Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-4120 UNIQUE indexes should not be consider... Closed

 Description   

analyze table hostname_max
 
explain SELECT DISTINCT `hostname_max` FROM `global_query_review_history`;
+------+-------------+-----------------------------+-------+---------------+--------------+---------+------+---------+-------------+
| id   | select_type | table                       | type  | possible_keys | key          | key_len | ref  | rows    | Extra       |
+------+-------------+-----------------------------+-------+---------------+--------------+---------+------+---------+-------------+
|    1 | SIMPLE      | global_query_review_history | index | NULL          | hostname_max | 218     | NULL | 3167247 | Using index |
+------+-------------+-----------------------------+-------+---------------+--------------+---------+------+---------+-------------+
1 row in set (0.01 sec)

work around is :

create index hostname_max1 on global_query_review_history (hostname_max);
[11/14/14 4:19:25 PM] Nicolas Payart: slow_query_log=# explain SELECT DISTINCT `hostname_max` FROM `global_query_review_history`;
+------+-------------+-----------------------------+-------+---------------+---------------+---------+------+------+--------------------------+
| id   | select_type | table                       | type  | possible_keys | key           | key_len | ref  | rows | Extra                    |
+------+-------------+-----------------------------+-------+---------------+---------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | global_query_review_history | range | NULL          | hostname_max1 | 194     | NULL |   27 | Using index for group-by |
+------+-------------+-----------------------------+-------+---------------+---------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

was ok on MySQL 5.6 (without index hostname_max1)

slow_query_log=# show table status like 'global_query_review_history'\G
*************************** 1. row ***************************
           Name: global_query_review_history
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 3167374
 Avg_row_length: 1091
    Data_length: 3458203648
Max_data_length: 0
   Index_length: 335953920
      Data_free: 17825792
 Auto_increment: NULL
    Create_time: 2014-11-14 16:18:52
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)
CREATE TABLE `global_query_review_history` (
  `hostname_max` varchar(64) NOT NULL,
  `db_max` varchar(64) DEFAULT NULL,
  `checksum` bigint(20) unsigned NOT NULL,
  `sample` longtext NOT NULL,
  `ts_min` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `ts_max` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `ts_cnt` float DEFAULT NULL,
  `Query_time_sum` float DEFAULT NULL,
  `Query_time_min` float DEFAULT NULL,
  `Query_time_max` float DEFAULT NULL,
  `Query_time_pct_95` float DEFAULT NULL,
  `Query_time_stddev` float DEFAULT NULL,
  `Query_time_median` float DEFAULT NULL,
  `Lock_time_sum` float DEFAULT NULL,
  `Lock_time_min` float DEFAULT NULL,
  `Lock_time_max` float DEFAULT NULL,
  `Lock_time_pct_95` float DEFAULT NULL,
  `Lock_time_stddev` float DEFAULT NULL,
  `Lock_time_median` float DEFAULT NULL,
  `Rows_sent_sum` float DEFAULT NULL,
  `Rows_sent_min` float DEFAULT NULL,
  `Rows_sent_max` float DEFAULT NULL,
  `Rows_sent_pct_95` float DEFAULT NULL,
  `Rows_sent_stddev` float DEFAULT NULL,
  `Rows_sent_median` float DEFAULT NULL,
  `Rows_examined_sum` float DEFAULT NULL,
  `Rows_examined_min` float DEFAULT NULL,
  `Rows_examined_max` float DEFAULT NULL,
  `Rows_examined_pct_95` float DEFAULT NULL,
  `Rows_examined_stddev` float DEFAULT NULL,
  `Rows_examined_median` float DEFAULT NULL,
  `Rows_affected_sum` float DEFAULT NULL,
  `Rows_affected_min` float DEFAULT NULL,
  `Rows_affected_max` float DEFAULT NULL,
  `Rows_affected_pct_95` float DEFAULT NULL,
  `Rows_affected_stddev` float DEFAULT NULL,
  `Rows_affected_median` float DEFAULT NULL,
  `Rows_read_sum` float DEFAULT NULL,
  `Rows_read_min` float DEFAULT NULL,
  `Rows_read_max` float DEFAULT NULL,
  `Rows_read_pct_95` float DEFAULT NULL,
  `Rows_read_stddev` float DEFAULT NULL,
  `Rows_read_median` float DEFAULT NULL,
  `Merge_passes_sum` float DEFAULT NULL,
  `Merge_passes_min` float DEFAULT NULL,
  `Merge_passes_max` float DEFAULT NULL,
  `Merge_passes_pct_95` float DEFAULT NULL,
  `Merge_passes_stddev` float DEFAULT NULL,
  `Merge_passes_median` float DEFAULT NULL,
  `InnoDB_IO_r_ops_min` float DEFAULT NULL,
  `InnoDB_IO_r_ops_max` float DEFAULT NULL,
  `InnoDB_IO_r_ops_pct_95` float DEFAULT NULL,
  `InnoDB_IO_r_bytes_pct_95` float DEFAULT NULL,
  `InnoDB_IO_r_bytes_stddev` float DEFAULT NULL,
  `InnoDB_IO_r_bytes_median` float DEFAULT NULL,
  `InnoDB_IO_r_wait_min` float DEFAULT NULL,
  `InnoDB_IO_r_wait_max` float DEFAULT NULL,
  `InnoDB_IO_r_wait_pct_95` float DEFAULT NULL,
  `InnoDB_IO_r_ops_stddev` float DEFAULT NULL,
  `InnoDB_IO_r_ops_median` float DEFAULT NULL,
  `InnoDB_IO_r_bytes_min` float DEFAULT NULL,
  `InnoDB_IO_r_bytes_max` float DEFAULT NULL,
  `InnoDB_IO_r_wait_stddev` float DEFAULT NULL,
  `InnoDB_IO_r_wait_median` float DEFAULT NULL,
  `InnoDB_rec_lock_wait_min` float DEFAULT NULL,
  `InnoDB_rec_lock_wait_max` float DEFAULT NULL,
  `InnoDB_rec_lock_wait_pct_95` float DEFAULT NULL,
  `InnoDB_rec_lock_wait_stddev` float DEFAULT NULL,
  `InnoDB_rec_lock_wait_median` float DEFAULT NULL,
  `InnoDB_queue_wait_min` float DEFAULT NULL,
  `InnoDB_queue_wait_max` float DEFAULT NULL,
  `InnoDB_queue_wait_pct_95` float DEFAULT NULL,
  `InnoDB_queue_wait_stddev` float DEFAULT NULL,
  `InnoDB_queue_wait_median` float DEFAULT NULL,
  `InnoDB_pages_distinct_min` float DEFAULT NULL,
  `InnoDB_pages_distinct_max` float DEFAULT NULL,
  `InnoDB_pages_distinct_pct_95` float DEFAULT NULL,
  `InnoDB_pages_distinct_stddev` float DEFAULT NULL,
  `InnoDB_pages_distinct_median` float DEFAULT NULL,
  `QC_Hit_cnt` float DEFAULT NULL,
  `QC_Hit_sum` float DEFAULT NULL,
  `Full_scan_cnt` float DEFAULT NULL,
  `Full_scan_sum` float DEFAULT NULL,
  `Full_join_cnt` float DEFAULT NULL,
  `Full_join_sum` float DEFAULT NULL,
  `Tmp_table_cnt` float DEFAULT NULL,
  `Tmp_table_sum` float DEFAULT NULL,
  `Filesort_cnt` float DEFAULT NULL,
  `Filesort_sum` float DEFAULT NULL,
  `Tmp_table_on_disk_cnt` float DEFAULT NULL,
  `Tmp_table_on_disk_sum` float DEFAULT NULL,
  `Filesort_on_disk_cnt` float DEFAULT NULL,
  `Filesort_on_disk_sum` float DEFAULT NULL,
  `Bytes_sum` float DEFAULT NULL,
  `Bytes_min` float DEFAULT NULL,
  `Bytes_max` float DEFAULT NULL,
  `Bytes_pct_95` float DEFAULT NULL,
  `Bytes_stddev` float DEFAULT NULL,
  `Bytes_median` float DEFAULT NULL,
  UNIQUE KEY `hostname_max` (`hostname_max`,`checksum`,`ts_min`,`ts_max`),
  KEY `ts_min` (`ts_min`),
  KEY `checksum` (`checksum`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
 

Please backport MDEV-6657 as soon as possible in 10.0



 Comments   
Comment by VAROQUI Stephane [ 2014-11-14 ]

show profiles;
+----------+------------+-------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                     |
+----------+------------+-------------------------------------------------------------------------------------------+
|        1 | 0.00027697 | SELECT DISTINCT `hostname_max` FROM `global_query_review_test`                            |
|        2 | 0.02410047 | SELECT DISTINCT `hostname_max` FROM `global_query_review_test` FORCE INDEX (hostname_max) |
+----------+------------+-------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

Comment by Elena Stepanova [ 2014-11-14 ]

I suppose if psergey pushed the change for MDEV-6657 into 10.1 only, it was too intrusive for a post-GA version; but I'll leave it to him to decide whether it's possible to backport it.

Comment by VAROQUI Stephane [ 2014-11-14 ]

After discussion on IRC with Sergei , it looks like it may be not related to 6657 but ignoring Using index for group-by on multi part keys , i sent sergei by email a dump to reproduce, do you need it for test case as well ?

Comment by Elena Stepanova [ 2014-11-14 ]

It's enough if Sergei already has it, thanks.

Comment by Sergei Petrunia [ 2014-11-15 ]

MariaDB 5.5.41:

MariaDB [j4]> explain SELECT DISTINCT `hostname_max` FROM `global_query_review_test` ;
+------+-------------+--------------------------+-------+---------------+---------------+---------+------+------+--------------------------+
| id   | select_type | table                    | type  | possible_keys | key           | key_len | ref  | rows | Extra                    |
+------+-------------+--------------------------+-------+---------------+---------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | global_query_review_test | range | NULL          | hostname_max1 | 194     | NULL |    3 | Using index for group-by |
+------+-------------+--------------------------+-------+---------------+---------------+---------+------+------+--------------------------+

MariaDB [j4]> explain SELECT DISTINCT `hostname_max` FROM `global_query_review_test` ignore index(hostname_max1);
+------+-------------+--------------------------+-------+---------------+--------------+---------+------+------+--------------------------+
| id   | select_type | table                    | type  | possible_keys | key          | key_len | ref  | rows | Extra                    |
+------+-------------+--------------------------+-------+---------------+--------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | global_query_review_test | range | NULL          | hostname_max | 194     | NULL |   19 | Using index for group-by |
+------+-------------+--------------------------+-------+---------------+--------------+---------+------+------+--------------------------+

Comment by Sergei Petrunia [ 2014-11-15 ]

MariaDB 10.0.15:

MariaDB [j4]> explain SELECT DISTINCT `hostname_max` FROM `global_query_review_test`;
+------+-------------+--------------------------+-------+---------------+---------------+---------+------+------+--------------------------+
| id   | select_type | table                    | type  | possible_keys | key           | key_len | ref  | rows | Extra                    |
+------+-------------+--------------------------+-------+---------------+---------------+---------+------+------+--------------------------+
|    1 | SIMPLE      | global_query_review_test | range | NULL          | hostname_max1 | 194     | NULL |    5 | Using index for group-by |
+------+-------------+--------------------------+-------+---------------+---------------+---------+------+------+--------------------------+

MariaDB [j4]> explain SELECT DISTINCT `hostname_max` FROM `global_query_review_test` ignore index(hostname_max1);
+------+-------------+--------------------------+-------+---------------+--------------+---------+------+-------+-------------+
| id   | select_type | table                    | type  | possible_keys | key          | key_len | ref  | rows  | Extra       |
+------+-------------+--------------------------+-------+---------------+--------------+---------+------+-------+-------------+
|    1 | SIMPLE      | global_query_review_test | index | NULL          | hostname_max | 218     | NULL | 23478 | Using index |
+------+-------------+--------------------------+-------+---------------+--------------+---------+------+-------+-------------+

Comment by Sergei Petrunia [ 2014-11-15 ]

I'll need to investigate why 5.5 and 10.0 plans are different.

Comment by Sergei Petrunia [ 2014-11-15 ]

The change is caused by: MDEV-4120: UNIQUE indexes should not be considered for loose index scan

Comment by Sergei Petrunia [ 2014-11-15 ]

Looking at MDEV-4120 and not fully understanding it.

It is apparent that one shouldn't attempt to use Loose Index Scan over a full unique index.

If index tuples are unique, jumping forward brings no benefits.

However, prefix of unique index is not unique. It may have a very low cardinality, and it will make perfrect sense to use Loose Index Scan.

Comment by Sergei Petrunia [ 2014-11-18 ]

.. on the other hand, if the index tuples are unique, why are we considering to use or not use loose index scan? We should be able to figure out that grouping is not needed at all.

I mean, Loose Scan is applied in two cases:

1. DISTINCT which was converted into GROUP BY

SELECT DISTINCT t.keypart1, t.keypart2, .... FROM t1 ...

2. GROUP BY with MIN/MAX

SELECT MIN(t.keypart2) FROM t1 ... GROUP BY t.keypart1

The testcase for MDEV-4120 is like #2:

create table t4 as select distinct a1, a2, b, c from t1;
alter table t4 add unique index idxt4 (a1, a2, b, c);
 
explain
select a1, a2, b, min(c) from t4 group by a1, a2, b;

Here, one could argue that Loose Scan can still be a good strategy when (a1,a2,b) have low cardinality, while `c` has high cardinality.

(we can't check whether the testcase for MDEV-4120 demonstrates a speedup, because table t1 has just 128 rows).

Comment by Sergei Petrunia [ 2014-11-18 ]

If I have a

  UNIQUE KEY `col1` (`col1`,`col2`,`col3`,`col4`)

and run a query:

explain select distinct col1,col2,col3,col4 from tpk1;

I see that Loose Scan is considered, despite that the index is unique. The cost
numbers prevent it from being chosen. The optimizer choses:

+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | tpk1  | index | NULL          | col1 | 20      | NULL | 9990 | Using index |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------+

If I adjust the cost numbers in so that loose index scan is chosen, it is used:

+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|    1 | SIMPLE      | tpk1  | range | NULL          | col1 | 20      | NULL | 9991 | Using index for group-by |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+

The same goes for "SELECT DISTINCT pk_part1, ... pk_partN". Perhaps, this is what MDEV-4120 was trying to prevent?

Comment by VAROQUI Stephane [ 2014-11-18 ]

The cost of lose scan should be evaluated cost * (cardinality at the key_len /records) * cost_loose_scan_complexity /cost_index_scan_complexity

Comment by Sergei Petrunia [ 2014-11-18 ]

[stephane], your formula is not clear.

"cost of lose scan"= ... * cost_lose_scan_complexity ...

Is this formula recursive? if not, what is cost_lose_scan_complexity and how is it different from cost of loose scan?

Comment by Sergei Petrunia [ 2014-11-18 ]

Viable options so far

  • Undo the fix for MDEV-4120 completely
  • Change the fix for MDEV-4120 to disable Loose Index Scan only when doing "SELECT DISTINCT unique_key_col1 ,... unique_key_colN".

I am not sure if the second one is meaningful. I was not able to come up with a SELECT DISTINCT statement where cost of loose scan is smaller than the cost of index scan. Too bad we don't know the original testcase for MDEV-4120.

Comment by Sergei Petrunia [ 2014-11-19 ]

Implemented the second option. Fix pushed into 10.0 tree.

While debugging, also saw that Loose Index Scan cost formulas are poor - it can chose to use Loose Index Scan with group sizes as low as 4 rows, which seems to be wrong (at least, in some cases).

Generated at Thu Feb 08 07:17:08 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.