Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34666

Join that hits same rows many times is slower due to disabled AHI

    XMLWordPrintable

Details

    Description

      (Filing this based on exchange with Monty. The original testcase is private data, we'll have to come up with one of our own)

      The symptom: take a join query that repeatedly reads the same set of rows from certain tables (call them t10x).

      ANALYZE FORMAT=JSON will show that the query spends most of its time accessing tables t10x, with one table taking the biggest share.

      MySQL 5.7 is faster for the query (about 2x).
      MariaDB catches up with MySQL if one enables AHI.

      My testcase:

      -- The size of this is from original testcase
      create table t1 (
        a int,
        b int
      );
      insert into t1 select seq, floor(seq/100) from seq_1_to_500;
       
      -- Here, the fanout is from the original testcase, the size of the table is arbitrary
      create table t10 (
        pk int primary key auto_increment,
        filler1 varchar(100),
        key1 int,
        filler2 varchar(100),
        col1 int,
        key(key1)
      );
      insert into t10 (key1, filler1, filler2, col1)
      select 
        A.seq, uuid(), uuid(),123
      from
        seq_1_to_1000 A,
        seq_1_to_20 B;
       
      -- Same as above
      create table t11 (
        pk int primary key auto_increment,
        filler1 varchar(100),
        key1 int,
        filler2 varchar(100),
        col1 int,
        key(key1)
      );
       
      insert into t11 (key1, filler1, filler2, col1)
      select 
        A.seq, uuid(), uuid(),345
      from
        seq_1_to_1000 A,
        seq_1_to_100 B;
      

      The query:

      explain 
      select straight_join max(t10.col1), max(t11.col1)
      from 
       t1
       join t10 on t10.key1=t1.a
       join t11 on t11.key1=t1.a;
      

      +------+-------------+-------+------+---------------+------+---------+-----------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref       | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+-----------+------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL      | 500  | Using where |
      |    1 | SIMPLE      | t10   | ref  | key1          | key1 | 5       | test.t1.a | 10   |             |
      |    1 | SIMPLE      | t11   | ref  | key1          | key1 | 5       | test.t1.a | 50   |             |
      +------+-------------+-------+------+---------------+------+---------+-----------+------+-------------+
      

      ANALYZE FORMAT=JSON shows most of the time is spent reading t11, it has r_engine_stats.pages_accessed: 2 021 080

      Release build of 10.6.19 on my machine:

      innodb_adaptive_hash_index=OFF (default) 0.79 sec
      innodb_adaptive_hash_index=ON 0.42 sec

      Possible solutions

      • Something on InnoDB layer. Enable use of AHI for some tables or indexes or ... Note that AHI is shared across all threads while the need to use it may be local - we need it for some instances of tableX.indexY in some query.
      • Something on SQL layer, like join buffering?

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            psergei Sergei Petrunia
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.