Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19474

Histogram statistics are used even with optimizer_use_condition_selectivity=3

    XMLWordPrintable

Details

    Description

      Dataset

      create table t1(a int);
      insert into t1 values (1),(2),(2),(3),(4);
      set use_stat_tables='preferably';
      set optimizer_use_condition_selectivity=4;
      set histogram_size= 255;
      analyze table t1;
      

      MariaDB [test]> analyze select * from t1 where a=2;
      +------+-------------+-------+------+---------------+------+---------+------+------+--------+----------+------------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+--------+----------+------------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 5    | 5.00   |    39.84 |      40.00 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+--------+----------+------------+-------------+
      1 row in set (0.002 sec)
      

      So with optimizer_use_condition_selectivity=4 , we get filtered as 39.84

      MariaDB [test]> set optimizer_use_condition_selectivity=3;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> analyze select * from t1 where a=2;
      +------+-------------+-------+------+---------------+------+---------+------+------+--------+----------+------------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+--------+----------+------------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 5    | 5.00   |    39.84 |      40.00 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+--------+----------+------------+-------------+
      1 row in set (0.002 sec)
      

      So with optimizer_use_condition_selectivity=3 , we get * filtered as 39.84*

      MariaDB [test]> flush tables;  // Statistics are flushed, need to read again to use them
      Query OK, 0 rows affected (0.002 sec)
       
      MariaDB [test]> analyze select * from t1 where a=2;
      +------+-------------+-------+------+---------------+------+---------+------+------+--------+----------+------------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | r_rows | filtered | r_filtered | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+--------+----------+------------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 5    | 5.00   |    25.00 |      40.00 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+--------+----------+------------+-------------+
      1 row in set (0.008 sec)
      

      So the problem here is we don't stop using histogram statistics with optimizer_use_condition_selectivity=3.
      When the statistics are flushed, we don't re-read the histogram statistics and that is why we get the different value for filtered after flushing the EITS

      Attachments

        Activity

          People

            varun Varun Gupta (Inactive)
            varun Varun Gupta (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.