[MDEV-21354] Optimizer prefers rowid filter instead of primary key Created: 2019-12-18  Updated: 2020-03-20  Resolved: 2020-03-20

Status: Closed
Project: MariaDB Server
Component/s: Optimizer, Storage Engine - InnoDB
Affects Version/s: 10.4.10
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Michael Widenius Assignee: Igor Babaev
Resolution: Cannot Reproduce Votes: 1
Labels: None


 Description   

The file test.sql is mysqldump of tables subscription,label,track,license_track

Load in the tables query from the mysql command line

The problem query:
MariaDB [test]> explain SELECT SQL_NO_CACHE COUNT(label.id) FROM subscription,label,track,license_track WHERE track.enabled=1 AND label.enabled=0 AND subscription.account_id=7 and license_track.license_id=subscription.license_id and track.id=license_track.track_id and label.id=track.label_id;
-------------------------------------------------------------------------------------------------------------------------------------------------+

id select_type table type possible_keys key key_len ref rows Extra

-------------------------------------------------------------------------------------------------------------------------------------------------+

1 SIMPLE subscription ref account_id,license_id,account_id_2 account_id 4 const 1 Using index
1 SIMPLE track ref PRIMARY,label_id,enabled_2,enabled enabled_2 1 const 16754  
1 SIMPLE label eq_ref PRIMARY,enabled PRIMARY 4 test.track.label_id 1 Using where
1 SIMPLE license_track eq_ref license_id,track_id,license_id_2 license_id 8 test.subscription.license_id,test.track.id 1 Using index

-------------------------------------------------------------------------------------------------------------------------------------------------+

Execute:
optimize table track;
optimize table license_track;

After this the explain shows:
MariaDB [test]> explain SELECT SQL_NO_CACHE COUNT(label.id) FROM subscription,label,track,license_track WHERE track.enabled=1 AND label.enabled=0 AND subscription.account_id=7 and license_track.license_id=subscription.license_id and track.id=license_track.track_id and label.id=track.label_id;
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

id select_type table type possible_keys key key_len ref rows Extra

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

1 SIMPLE subscription ref account_id,license_id,account_id_2 account_id 4 const 1 Using index
1 SIMPLE license_track ref license_id,track_id,license_id_2 license_id 4 test.subscription.license_id 20055 Using index
1 SIMPLE track ALL filter PRIMARY,label_id,enabled_2,enabled   enabled_2   1 NULL 39744 (50%) Using where; Using join buffer (flat, BNL join); Using rowid filter
1 SIMPLE label eq_ref PRIMARY,enabled PRIMARY 4 test.track.label_id 1 Using where

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

The bug is that track is using rowid filter instead of lookup of primary key, like in the first explain

In 10.5 we get the following "correct" explain for the exact same 10.4 tables:
-----------------------------------------------------------------------------------------------------------------------------------+

id select_type table type possible_keys key key_len ref rows Extra

-----------------------------------------------------------------------------------------------------------------------------------+

1 SIMPLE subscription ref account_id,license_id,account_id_2 account_id 4 const 1 Using index
1 SIMPLE license_track ref license_id,track_id,license_id_2 license_id 4 test.subscription.license_id 20055 Using index
1 SIMPLE track eq_ref PRIMARY,label_id,enabled_2,enabled PRIMARY 4 test.license_track.track_id 1 Using where
1 SIMPLE label eq_ref PRIMARY,enabled PRIMARY 4 test.track.label_id 1 Using where

-----------------------------------------------------------------------------------------------------------------------------------+

I noticed that when one gets the 'wrong' explain in 10.4, one will get a crash when executing the following query:

SELECT SQL_NO_CACHE COUNT(label.id) FROM subscription LEFT JOIN license_track ON license_track.license_id=subscription.license_id LEFT JOIN track ON track.id=license_track.track_id LEFT JOIN label ON label.id=track.label_id WHERE track.enabled=1 AND label.enabled=0 AND subscription.account_id=7;

#4 0x00000000010ed6d5 in row_search_idx_cond_check (mysql_rec=0x7fff840bcd50 "ÿ\001", prebuilt=0x7fff840c0820, rec=0x7fffe13a007d "\200", offsets=0x7ffff4253e60) at /my/maria-10.4/storage/innobase/row/row0sel.cc:3942
#5 0x00000000010f13da in row_search_mvcc (buf=0x7fff840bcd50 "ÿ\001", mode=PAGE_CUR_G, prebuilt=0x7fff840c0820, match_mode=0, direction=0) at /my/maria-10.4/storage/innobase/row/row0sel.cc:5257
#6 0x0000000000f05342 in ha_innobase::index_read (this=0x7fff840bf410, buf=0x7fff840bcd50 "ÿ\001", key_ptr=0x0, key_len=0, find_flag=HA_READ_AFTER_KEY) at /my/maria-10.4/storage/innobase/handler/ha_innodb.cc:9305
#7 0x0000000000f062f4 in ha_innobase::index_first (this=0x7fff840bf410, buf=0x7fff840bcd50 "ÿ\001") at /my/maria-10.4/storage/innobase/handler/ha_innodb.cc:9679
#8 0x0000000000f064cc in ha_innobase::rnd_next (this=0x7fff840bf410, buf=0x7fff840bcd50 "ÿ\001") at /my/maria-10.4/storage/innobase/handler/ha_innodb.cc:9772
#9 0x0000000000bd5256 in handler::ha_rnd_next (this=0x7fff840bf410, buf=0x7fff840bcd50 "ÿ\001") at /my/maria-10.4/sql/handler.cc:2826
#10 0x0000000000d76cb1 in rr_sequential (info=0x7fff84147f98) at /my/maria-10.4/sql/records.cc:477
#11 0x00000000007cc56d in READ_RECORD::read_record (this=0x7fff84147f98) at /my/maria-10.4/sql/records.h:69
#12 0x00000000008fa3ed in join_init_read_record (tab=0x7fff84147ed0) at /my/maria-10.4/sql/sql_select.cc:21129
#13 0x0000000000a4701e in JOIN_TAB_SCAN::open (this=0x7fff84022e60) at /my/maria-10.4/sql/sql_join_cache.cc:3349
#14 0x0000000000a4543e in JOIN_CACHE::join_matching_records (this=0x7fff84167fd0, skip_last=false) at /my/maria-10.4/sql/sql_join_cache.cc:2252
#15 0x0000000000a44f38 in JOIN_CACHE::join_records (this=0x7fff84167fd0, skip_last=false) at /my/maria-10.4/sql/sql_join_cache.cc:2088
#16 0x00000000008f7b53 in sub_select_cache (join=0x7fff840226a0, join_tab=0x7fff84147ed0, end_of_records=false) at /my/maria-10.4/sql/sql_select.cc:19986
#17 0x00000000008f870c in evaluate_join_record (join=0x7fff840226a0, join_tab=0x7fff84147b28, error=0) at /my/maria-10.4/sql/sql_select.cc:20408
#18 0x00000000008f8134 in sub_select (join=0x7fff840226a0, join_tab=0x7fff84147b28, end_of_records=false) at /my/maria-10.4/sql/sql_select.cc:20227
#19 0x00000000008f870c in evaluate_join_record (join=0x7fff840226a0, join_tab=0x7fff84147780, error=0) at /my/maria-10.4/sql/sql_select.cc:20408
#20 0x00000000008f7f3f in sub_select (join=0x7fff840226a0, join_tab=0x7fff84147780, end_of_records=false) at /my/maria-10.4/sql/sql_select.cc:20188

This happens for table 'track'.



 Comments   
Comment by Igor Babaev [ 2019-12-19 ]

Here's my observations with 10.4.11 (Release version).

1. Running EEXPLAIN just after population of the tables does not make sense. Apparently InnoDB does not have proper statistical data at this moment. The output of the EXPLAIN shows this:

MariaDB [test]> explain SELECT SQL_NO_CACHE COUNT(label.id) FROM subscription,label,track,license_track WHERE track.enabled=1 AND label.enabled=0 AND subscription.account_id=7 and license_track.license_id=subscription.license_id and track.id=license_track.track_id and label.id=track.label_id;
+------+-------------+---------------+--------+------------------------------------+------------+---------+--------------------------------------------+------+-------------+
| id   | select_type | table         | type   | possible_keys                      | key        | key_len | ref                                        | rows | Extra       |
+------+-------------+---------------+--------+------------------------------------+------------+---------+--------------------------------------------+------+-------------+
|    1 | SIMPLE      | subscription  | ref    | account_id,license_id,account_id_2 | account_id | 4       | const                                      | 1    | Using index |
|    1 | SIMPLE      | track         | ref    | PRIMARY,label_id,enabled_2,enabled | enabled_2  | 1       | const                                      | 1    |             |
|    1 | SIMPLE      | label         | eq_ref | PRIMARY,enabled                    | PRIMARY    | 4       | test.track.label_id                        | 1    | Using where |
|    1 | SIMPLE      | license_track | eq_ref | license_id,track_id,license_id_2   | license_id | 8       | test.subscription.license_id,test.track.id | 1    | Using index |
+------+-------------+---------------+--------+------------------------------------+------------+---------+--------------------------------------------+------+-------------+

and the output of the following queries

MariaDB [test]> select count(*) from track;
+----------+
| count(*) |
+----------+
|    39349 |
+----------+
1 row in set (0.021 sec)
 
MariaDB [test]> select count(distinct enabled) from track;
+-------------------------+
| count(distinct enabled) |
+-------------------------+
|                       2 |
+-------------------------+

2. After we having run the select once we have a different execution plan:

MariaDB [test]> SELECT SQL_NO_CACHE COUNT(label.id) FROM subscription,label,track,license_track WHERE track.enabled=1 AND label.enabled=0 AND subscription.account_id=7 and license_track.license_id=subscription.license_id and track.id=license_track.track_id and label.id=track.label_id;
+-----------------+
| COUNT(label.id) |
+-----------------+
|               0 |
+-----------------+
1 row in set (0.001 sec)
 
MariaDB [test]> explain SELECT SQL_NO_CACHE COUNT(label.id) FROM subscription,label,track,license_track WHERE track.enabled=1 AND label.enabled=0 AND subscription.account_id=7 and license_track.license_id=subscription.license_id and track.id=license_track.track_id and label.id=track.label_id;
+------+-------------+---------------+--------+------------------------------------+------------+---------+--------------------------------------------+------+-------------+
| id   | select_type | table         | type   | possible_keys                      | key        | key_len | ref                                        | rows | Extra       |
+------+-------------+---------------+--------+------------------------------------+------------+---------+--------------------------------------------+------+-------------+
|    1 | SIMPLE      | subscription  | ref    | account_id,license_id,account_id_2 | account_id | 4       | const                                      | 1    | Using index |
|    1 | SIMPLE      | label         | ref    | PRIMARY,enabled                    | enabled    | 1       | const                                      | 66   | Using index |
|    1 | SIMPLE      | track         | ref    | PRIMARY,label_id,enabled_2,enabled | label_id   | 4       | test.label.id                              | 1    | Using where |
|    1 | SIMPLE      | license_track | eq_ref | license_id,track_id,license_id_2   | license_id | 8       | test.subscription.license_id,test.track.id | 1    | Using index |
+------+-------------+---------------+--------+------------------------------------+------------+---------+--------------------------------------------+------+-------------+
4 rows in set (0.001 sec)

ANALYZE FORMAT=JSON shows us that this is a pretty good execution plan:

MariaDB [test]> ANALYZE FORMAT=JSON SELECT SQL_NO_CACHE COUNT(label.id) FROM subscription,label,track,license_track WHERE track.enabled=1 AND label.enabled=0 AND subscription.account_id=7 and license_track.license_id=subscription.license_id and track.id=license_track.track_id and label.id=track.label_id \G
*************************** 1. row ***************************
ANALYZE: {
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 0.4006,
    "table": {
      "table_name": "subscription",
      "access_type": "ref",
      "possible_keys": ["account_id", "license_id", "account_id_2"],
      "key": "account_id",
      "key_length": "4",
      "used_key_parts": ["account_id"],
      "ref": ["const"],
      "r_loops": 1,
      "rows": 1,
      "r_rows": 1,
      "r_total_time_ms": 0.0238,
      "filtered": 100,
      "r_filtered": 100,
      "using_index": true
    },
    "table": {
      "table_name": "label",
      "access_type": "ref",
      "possible_keys": ["PRIMARY", "enabled"],
      "key": "enabled",
      "key_length": "1",
      "used_key_parts": ["enabled"],
      "ref": ["const"],
      "r_loops": 1,
      "rows": 66,
      "r_rows": 66,
      "r_total_time_ms": 0.0509,
      "filtered": 100,
      "r_filtered": 100,
      "using_index": true
    },
    "table": {
      "table_name": "track",
      "access_type": "ref",
      "possible_keys": ["PRIMARY", "label_id", "enabled_2", "enabled"],
      "key": "label_id",
      "key_length": "4",
      "used_key_parts": ["label_id"],
      "ref": ["test.label.id"],
      "r_loops": 66,
      "rows": 1,
      "r_rows": 0,
      "r_total_time_ms": 0.2507,
      "filtered": 50,
      "r_filtered": 100,
      "attached_condition": "track.enabled = 1"
    },
    "table": {
      "table_name": "license_track",
      "access_type": "eq_ref",
      "possible_keys": ["license_id", "track_id", "license_id_2"],
      "key": "license_id",
      "key_length": "8",
      "used_key_parts": ["license_id", "track_id"],
      "ref": ["test.subscription.license_id", "test.track.id"],
      "r_loops": 0,
      "rows": 1,
      "r_rows": null,
      "filtered": 100,
      "r_filtered": null,
      "using_index": true
    }
  }
}
1 row in set (0.001 sec)

Yet again we see that this plan used wrong statistical data:

MariaDB [test]> select count(distinct label_id) from track;
+--------------------------+
| count(distinct label_id) |
+--------------------------+
|                        3 |
+--------------------------+
1 row in set (0.011 sec)

3. Let's run ANALYZE for all 4 tables:

MariaDB [test]> ANALYZE TABLE label, license_track, subscription, track;
+--------------------+---------+----------+----------+
| Table              | Op      | Msg_type | Msg_text |
+--------------------+---------+----------+----------+
| test.label         | analyze | status   | OK       |
| test.license_track | analyze | status   | OK       |
| test.subscription  | analyze | status   | OK       |
| test.track         | analyze | status   | OK       |
+--------------------+---------+----------+----------+

and run EXPLAIN again:

MariaDB [test]> explain SELECT SQL_NO_CACHE COUNT(label.id) FROM subscription,label,track,license_track WHERE track.enabled=1 AND label.enabled=0 AND subscription.account_id=7 and license_track.license_id=subscription.license_id and track.id=license_track.track_id and label.id=track.label_id;
+------+-------------+---------------+--------+------------------------------------+------------+---------+------------------------------+-------+-------------+
| id   | select_type | table         | type   | possible_keys                      | key        | key_len | ref                          | rows  | Extra       |
+------+-------------+---------------+--------+------------------------------------+------------+---------+------------------------------+-------+-------------+
|    1 | SIMPLE      | subscription  | ref    | account_id,license_id,account_id_2 | account_id | 4       | const                        | 1     | Using index |
|    1 | SIMPLE      | license_track | ref    | license_id,track_id,license_id_2   | license_id | 4       | test.subscription.license_id | 20055 | Using index |
|    1 | SIMPLE      | track         | eq_ref | PRIMARY,label_id,enabled_2,enabled | PRIMARY    | 4       | test.license_track.track_id  | 1     | Using where |
|    1 | SIMPLE      | label         | eq_ref | PRIMARY,enabled                    | PRIMARY    | 4       | test.track.label_id          | 1     | Using where |
+------+-------------+---------------+--------+------------------------------------+------------+---------+------------------------------+-------+-------------+
4 rows in set (0.001 sec)

This plan exactly the one called "correct" by Monty though it's by far not as fast as the previous plan:

MariaDB [test]> ANALYZE FORMAT=JSON SELECT SQL_NO_CACHE COUNT(label.id) FROM subscription,label,track,license_track WHERE track.enabled=1 AND label.enabled=0 AND subscription.account_id=7 and license_track.license_id=subscription.license_id and track.id=license_track.track_id and label.id=track.label_id \G
*************************** 1. row ***************************
ANALYZE: {
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 57.609,
    "table": {
      "table_name": "subscription",
      "access_type": "ref",
      "possible_keys": ["account_id", "license_id", "account_id_2"],
      "key": "account_id",
      "key_length": "4",
      "used_key_parts": ["account_id"],
      "ref": ["const"],
      "r_loops": 1,
      "rows": 1,
      "r_rows": 1,
      "r_total_time_ms": 0.0281,
      "filtered": 100,
      "r_filtered": 100,
      "using_index": true
    },
    "table": {
      "table_name": "license_track",
      "access_type": "ref",
      "possible_keys": ["license_id", "track_id", "license_id_2"],
      "key": "license_id",
      "key_length": "4",
      "used_key_parts": ["license_id"],
      "ref": ["test.subscription.license_id"],
      "r_loops": 1,
      "rows": 20055,
      "r_rows": 39208,
      "r_total_time_ms": 12.641,
      "filtered": 100,
      "r_filtered": 100,
      "using_index": true
    },
    "table": {
      "table_name": "track",
      "access_type": "eq_ref",
      "possible_keys": ["PRIMARY", "label_id", "enabled_2", "enabled"],
      "key": "PRIMARY",
      "key_length": "4",
      "used_key_parts": ["id"],
      "ref": ["test.license_track.track_id"],
      "r_loops": 39208,
      "rows": 1,
      "r_rows": 1,
      "r_total_time_ms": 32.887,
      "filtered": 50,
      "r_filtered": 99.992,
      "attached_condition": "track.enabled = 1"
    },
    "table": {
      "table_name": "label",
      "access_type": "eq_ref",
      "possible_keys": ["PRIMARY", "enabled"],
      "key": "PRIMARY",
      "key_length": "4",
      "used_key_parts": ["id"],
      "ref": ["test.track.label_id"],
      "r_loops": 39205,
      "rows": 1,
      "r_rows": 1,
      "r_total_time_ms": 0.0584,
      "filtered": 95.652,
      "r_filtered": 0,
      "attached_condition": "label.enabled = 0"
    }
  }
}
1 row in set (0.059 sec)

Comment by Igor Babaev [ 2019-12-19 ]

Let's figure out why the optimizer hasn't t chosen the fast plan.
Optimizer trace show us when evaluating plans with the prefix ["subscription", "label", "track"]

                    "rows_for_plan": 66,
                    "cost_for_plan": 15.404,
                    "rest_of_plan": [
                      {
                        "plan_prefix": ["subscription", "label"],
                        "table": "track",
                        "best_access_path": {
                          "considered_access_paths": [
                            {
                              "access_type": "ref",
                              "index": "label_id",
                              "rows": 6624,
                              "cost": 55978,
                              "chosen": true
                            },
                            {
                              "access_type": "ref",
                              "index": "enabled_2",
                              "used_range_estimates": true,
                              "rows": 19872,
                              "cost": 19272,
                              "chosen": true
                            },
                            {
                              "access_type": "ref",
                              "index": "enabled",
                              "used_range_estimates": true,
                              "rows": 19872,
                              "cost": 19272,
                              "chosen": false,
                              "cause": "cost"
                            },
                            {
                              "access_type": "scan",
                              "resulting_rows": 19872,
                              "cost": 4071.4,
                              "chosen": true
                            }
                          ],
                          "chosen_access_method": {
                            "type": "scan",
                            "records": 19872,
                            "cost": 4071.4,
                            "uses_join_buffering": true,
                            "filter_used": false
                          }
                        },
                        "rows_for_plan": 1.31e6,
                        "cost_for_plan": 266397,
                        "rest_of_plan": [
                          {
                            "plan_prefix": ["subscription", "label", "track"],
                            "table": "license_track",
                        ...

It means the optimizer estimates the number of rows the partial join ["subscription", "label", "track"] as ~1300000 (=66*19872). In fact the expected number of rows in this partial join cannot be greater than the number of rows in the table track because this table is joined by the condition label.id=track.label_id and label.id is the primary key of the table label while we expect only 1 row from subscription. So we expect 66 rows from partial join ["subscription", "label"] and each of them has unique label_id.

Comment by Igor Babaev [ 2019-12-19 ]

Conclusions so far:
1. I can not reproduce the problem reported by Monty with 10.4.11 (as it was released).
2. I found a serious legacy defect of the optimizer when it estimates the expected number of rows after joining a table by a pair primary key - foreign key.

Generated at Thu Feb 08 09:06:30 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.