[MDEV-20364] Optimizer chooses wrong access access method with LooseScan semi-join strategy Created: 2019-08-16  Updated: 2020-08-25  Resolved: 2019-09-16

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.4
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Not a Bug Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-20371 Invalid reads at plan refinement stag... Closed

 Description   

The optimizer may choose poor (expensive) access methods to be used together with LooseScan semi-join strategy.

The original example was observed at a customer, and we don't have access to the dataset.

I've also found this example in subselect3.result. I'm not sure if it is the same issue, but this gives an example to study:

create table t0(a int);
insert into t0 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table t1 (kp1 int, kp2 int, c int, filler char(100), key(kp1, kp2));
insert into t1 select A.a+10*(B.a+10*C.a), 0, 0, 'filler' from t0 A, t0 B, t0 C;
insert into t1 select * from t1 where kp1 < 20;
create table t3 (a int);
insert into t3 select A.a + 10*B.a from t0 A, t0 B;

explain select * from t3 where a in (select kp1 from t1 where kp1<20);
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------------------------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                           |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------------------------------------------+
|    1 | PRIMARY     | t1    | index | kp1           | kp1  | 10      | NULL | 1020 | Using where; Using index; LooseScan             |
|    1 | PRIMARY     | t3    | ALL   | NULL          | NULL | NULL    | NULL | 100  | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-------+-------+---------------+------+---------+------+------+-------------------------------------------------+

Why does LooseScan use type=index? There is a possible range access on index kp1. It is certainly better than the full table scan.

Running the subquery standalone uses the select:

mysql> explain select kp1 from t1 where kp1<20;
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|    1 | SIMPLE      | t1    | range | kp1           | kp1  | 5       | NULL | 40   | Using where; Using index |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+



 Comments   
Comment by Sergei Petrunia [ 2019-09-12 ]

For the testcase in the report, the optimizer no longer chooses LooseScan after fix for MDEV-20109.

Comment by Sergei Petrunia [ 2019-09-13 ]

... But when I debug the code, I can still see suspicious places. I need a bit more time.

Comment by Sergei Petrunia [ 2019-09-15 ]

Testcase for debugging:

https://gist.github.com/spetrunia/32b45da17c5da3a71fa89e44276e69c0

The suspicious place was this code in fix_semijoin_strategies_for_picked_join_order:

        if (join->best_positions[idx].use_join_buffer || (idx == first))
        {
           best_access_path(join, join->best_positions[idx].table,
                            rem_tables, join->best_positions, idx,
                            TRUE /* no jbuf */,
                            record_count, join->best_positions + idx,
                            &loose_scan_pos);
           if (idx==first)
           {
             join->best_positions[idx]= loose_scan_pos;

loose_scan_pos is filled by Loose_scan_opt::save_to_position() which sets the member variables describing the table access, as well as some of the members of loosescan_picker :

  void save_to_position(JOIN_TAB *tab, POSITION *pos)
  {
    pos->read_time=       best_loose_scan_cost;
    if (best_loose_scan_cost != DBL_MAX)
    {
      pos->records_read=    best_loose_scan_records;
      pos->key=             best_loose_scan_start_key;
      pos->cond_selectivity= 1.0;
      pos->loosescan_picker.loosescan_key=   best_loose_scan_key;
      pos->loosescan_picker.loosescan_parts= best_max_loose_keypart + 1;
      pos->use_join_buffer= FALSE;
      pos->table=           tab;
      pos->range_rowid_filter_info= tab->range_rowid_filter_info;
      pos->ref_depend_map= best_ref_depend_map;
    }

Comment by Sergei Petrunia [ 2019-09-15 ]

... and this way, join->best_positions[i].key will point to the index required by the Loose Scan strategy. No issue here.

loosescan_picker.loosescan_key is still used by the code in sub_select(), because full index scan and ref access do not have a single place where one could get index-nr from.

Comment by Sergei Petrunia [ 2019-09-15 ]

The member variables prefix_record_count and sj_strategy are not set in save_to_position().

The sj_strategy is set in fix_semijoin_strategies_for_picked_join_order(), below the best_access_path() loop:

      first_pos->sj_strategy= SJ_OPT_LOOSE_SCAN;

prefix_record_count is never set, so we end up with it being unset in join->best_positions[ first_loose_scan_table]. It looks like the member is never used after the fix_semijoin_strategies_for_picked_join_order call, though.

Generated at Thu Feb 08 08:58:54 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.