Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9130

InnoDB uses non-covering index on bool column when estimate is 50% of the table

    XMLWordPrintable

Details

    Description

      When there is a tinyint column with 0/1 values and only handful of rows contain one of them, optimizer computes the rows extimate for using index on that column as 50% of the rows instead of 95%+ and moreover it actually uses the index instead of a table scan even when there are additional conditions on a non-indexed column (so PK ref access will be needed anyway).

      Selecting the "rare" value gives much better estimate (and using index is the right choice in that case).

      I tested it between 15k - 1M rows and did observe it in production on 3.5M table. It seemed to behave correctly (optimizer prefering tablescan) somewhere under 15k rows.

      I tested it on 10.0.22 an 10.1.6 (debug build I had at hand). Not using histogram-based stats. set global innodb_stats_traditional=off; did not make a difference.

      I reported this to MySQL too as I found 5.6 doing the same (did not test 5.7) - http://bugs.mysql.com/bug.php?id=79271

      Testcase attached, my results here:

      MariaDB [test]> EXPLAIN SELECT COUNT(1) FROM skewed_bool WHERE val > 1 AND flag = 1;
      +------+-------------+-------------+------+---------------+------+---------+-------+------+-------------+
      | id   | select_type | TABLE       | TYPE | possible_keys | KEY  | key_len | REF   | ROWS | Extra       |
      +------+-------------+-------------+------+---------------+------+---------+-------+------+-------------+
      |    1 | SIMPLE      | skewed_bool | REF  | flag          | flag | 1       | const |   12 | USING WHERE |
      +------+-------------+-------------+------+---------------+------+---------+-------+------+-------------+
      1 ROW IN SET (0.00 sec)
       
      MariaDB [test]> EXPLAIN SELECT COUNT(1) FROM skewed_bool WHERE val > 1 AND flag = 0;
      +------+-------------+-------------+------+---------------+------+---------+-------+-------+-------------+
      | id   | select_type | TABLE       | TYPE | possible_keys | KEY  | key_len | REF   | ROWS  | Extra       |
      +------+-------------+-------------+------+---------------+------+---------+-------+-------+-------------+
      |    1 | SIMPLE      | skewed_bool | REF  | flag          | flag | 1       | const | 18768 | USING WHERE |
      +------+-------------+-------------+------+---------------+------+---------+-------+-------+-------------+
      1 ROW IN SET (0.00 sec)
       
      MariaDB [test]> SELECT COUNT(1), flag FROM skewed_bool GROUP BY flag;
      +----------+------+
      | COUNT(1) | flag |
      +----------+------+
      |    36850 |    0 |
      |       13 |    1 |
      +----------+------+
      2 ROWS IN SET (0.01 sec)

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            jkavalik Jiri Kavalik
            Votes:
            2 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.