Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19424

InnoDB's records_in_range estimates are capped at about 50%

Details

    Description

      InnoDB's records_in_range estimates seem to be capped at ~50% of the table. (We used to observe this on various occasions before but I haven't been able to find an MDEV for this).

      If I pass a range that contains a bigger fraction of the table, the estimated number of rows is still around 50% of the total rows in the table.

      Test dataset:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
       
      create table one_k(a int);
      insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
       
      create table t1 (
        a int,
        b varchar(32),
        c int,
        key(a),
        key(b)
      ) engine=innodb;
      

      # 100K NULLs
      insert into t1 select
        null,
        null,
        A.a + 1000*B.a 
      from
        one_k A,
        one_k B
      where 
        B.a<100;
       
      # 900 K non-NULLs
      insert into t1 select
        A.a + 1000*B.a,
        A.a + 1000*B.a,
        A.a + 1000*B.a 
      from
        one_k A,
        one_k B
      where 
        B.a>=100;
      

      Now, both a IS NOT NULL or b IS NOT NULL match 900K rows (90% of the table).
      But EXPLAIN will show the estimates of about 500K rows, which is 50% of the table:

      explain select * from t1 force index (a) where a is not null ;
      +------+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | Extra                 |
      +------+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
      |    1 | SIMPLE      | t1    | range | a             | a    | 5       | NULL | 494308 | Using index condition |
      +------+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
      

      explain select * from t1 force index (b) where b is not null ;
      +------+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | Extra                 |
      +------+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
      |    1 | SIMPLE      | t1    | range | b             | b    | 35      | NULL | 494308 | Using index condition |
      +------+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
      

      When the optimizer was simple, this property was not a problem.

      A simple optimizer would only use range estimates to construct range access. Range access is cheaper than full table if it covers about 30% of the table. Returning 50% of the table instead of 90% was not an issue.

      A more advanced optimizer also attempts to use range estimates for condition selectivity, etc. Here, returning 50% selectivity instead of 90% is a problem. (One must take into account that selectivity is computed for multiple indexes. For example, for 5 indexes 0.5^2= 1/32 . 32x under-estimation of selectivity)

      Attachments

        Issue Links

          Activity

            According to marko, this part of InnoDB code has not been touched by anyone for a very long time.

            psergei Sergei Petrunia added a comment - According to marko , this part of InnoDB code has not been touched by anyone for a very long time.

            In addition to the systematic error of 50%, ha_innobase::records_in_range() can also return completely made-up numbers due to race condition scenarios caused by too minimal locking. Refer to inexact: in btr_estimate_n_rows_in_range_on_level() and diverged in btr_estimate_n_rows_in_range_low().

            marko Marko Mäkelä added a comment - In addition to the systematic error of 50%, ha_innobase::records_in_range() can also return completely made-up numbers due to race condition scenarios caused by too minimal locking. Refer to inexact: in btr_estimate_n_rows_in_range_on_level() and diverged in btr_estimate_n_rows_in_range_low() .
            Elkin Andrei Elkin added a comment -

            MDEV-7140 represents another type of failures in the test some of which are still
            valid for 5.5. The current ticket failure case is decided to report separately.

            Elkin Andrei Elkin added a comment - MDEV-7140 represents another type of failures in the test some of which are still valid for 5.5. The current ticket failure case is decided to report separately.
            valerii Valerii Kravchuk added a comment - See also upstream https://bugs.mysql.com/bug.php?id=73386 .

            Not reproducible as of MariaDB 11.0.1 ? Checking which revision has fixed it..

            psergei Sergei Petrunia added a comment - Not reproducible as of MariaDB 11.0.1 ? Checking which revision has fixed it..

            This one:

            commit b66cdbd1eaeed7e96317a03a190c496fd062ec71
            Author: Monty <monty@mariadb.org>
            Date:   Thu Aug 11 13:05:23 2022 +0300
             
                Changing all cost calculation to be given in milliseconds
                
            

            https://github.com/mariadb/server/commit/b66cdbd1eaeed7e96317a03a190c496fd062ec71
            Specifically, the part in btr0cur.cc where the piece of logic with

              /* Do not estimate the number of rows in the range to over 1 / 2 of the
              estimated rows in the whole table */
            

            is ifdef-ed away.

            psergei Sergei Petrunia added a comment - This one: commit b66cdbd1eaeed7e96317a03a190c496fd062ec71 Author: Monty <monty@mariadb.org> Date: Thu Aug 11 13:05:23 2022 +0300   Changing all cost calculation to be given in milliseconds https://github.com/mariadb/server/commit/b66cdbd1eaeed7e96317a03a190c496fd062ec71 Specifically, the part in btr0cur.cc where the piece of logic with /* Do not estimate the number of rows in the range to over 1 / 2 of the estimated rows in the whole table */ is ifdef-ed away.

            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.