Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26156

Empty table with EITS statistics causes multiply-by-zero wipe-out in the join optimizer

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Open (View Workflow)
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.5
    • Fix Version/s: 10.5
    • Component/s: Optimizer
    • Labels:
      None

      Description

      create table t_empty (
        pk int primary key,
        col1 int
      );
       
      create table t1 (a int, b int);
      insert into t1 select seq, seq from seq_1_to_1000;
       
      create table t2 (
        a int,
        b int,
        key(a)
      );
      insert into t2 select a, a from t1;
      

      explain
      select * 
      from 
        t1 left join t_empty T on(T.col1 = t1.a)
        join t2 on t2.a=t1.b;
      

      +------+-------------+-------+------+---------------+------+---------+---------+------+-------------------------------------------------+               
      | id   | select_type | table | type | possible_keys | key  | key_len | ref     | rows | Extra                                           |               
      +------+-------------+-------+------+---------------+------+---------+---------+------+-------------------------------------------------+               
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL    | 1000 | Using where                                     |               
      |    1 | SIMPLE      | T     | ALL  | NULL          | NULL | NULL    | NULL    | 1    | Using where; Using join buffer (flat, BNL join) |               
      |    1 | SIMPLE      | t2    | ref  | a             | a    | 5       | j4.t1.b | 1    |                                                 |               
      +------+-------------+-------+------+---------------+------+---------+---------+------+-------------------------------------------------+
      

      Good.

      analyze table t_empty persistent for all;
      

      select * from mysql.table_stats
      where db_name=database() and table_name='t_empty';
      +---------+------------+-------------+
      | db_name | table_name | cardinality |
      +---------+------------+-------------+
      | j4      | t_empty    |           0 |
      +---------+------------+-------------+
      

      re-run the same query:

      explain
      select * 
      from 
        t1 left join t_empty T on(T.col1 = t1.a)
        join t2 on t2.a=t1.b;
      +------+-------------+-------+------+---------------+------+---------+---------+------+-------------------------------------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref     | rows | Extra                                           |
      +------+-------------+-------+------+---------------+------+---------+---------+------+-------------------------------------------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL    | 1000 | Using where                                     |
      |    1 | SIMPLE      | T     | ALL  | NULL          | NULL | NULL    | NULL    | 0    | Using where; Using join buffer (flat, BNL join) |
      |    1 | SIMPLE      | t2    | ref  | a             | a    | 5       | j4.t1.b | 1    |                                                 |
      +------+-------------+-------+------+---------------+------+---------+---------+------+-------------------------------------------------+
      

      note that T.rows=0 now.
      Let's check what effect this has on the join optimization:

              "plan_prefix": ["t1"],
              "table": "T",
              "best_access_path": {
                "considered_access_paths": [
                  {
                    "access_type": "scan",
                    "resulting_rows": 0,
                    "cost": 1000,
                    "chosen": true
                  }
                ],
                "chosen_access_method": {
                  "type": "scan",
                  "records": 0,
                  "cost": 1000,
                  "uses_join_buffering": false
                }
              },
              "rows_for_plan": 0,
              "cost_for_plan": 1204,
              "rest_of_plan": [
                {
                  "plan_prefix": ["t1", "T"],
                  "table": "t2",
                  "best_access_path": {
                    "considered_access_paths": [
                      {
                        "access_type": "ref",
                        "index": "a",
                        "used_range_estimates": false,
                        "cause": "not available",
                        "rows": 1,
                        "cost": 0,
                        "chosen": true
                      },
      

      look for best_access_path at table t2. it has cost:0. This is because incoming record_count=0. This multiply-by-zero doesn't let the optimizer make a meaningful query plan choice for tables after the empty table.

        Attachments

          Activity

            People

            Assignee:
            psergei Sergei Petrunia
            Reporter:
            psergei Sergei Petrunia
            Votes:
            1 Vote for this issue
            Watchers:
            4 Start watching this issue

              Dates

              Created:
              Updated:

                Git Integration