Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-36244

Left Join Performance v10.1.48 vs. v11.7.2

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 11.7.2
    • None
    • Optimizer

    Description

      Using 11.7.2-MariaDB-ubu2404-log (docker image mariadb:11.7),

      I have a left join with Group-By, no matching index on the second table and limit 1:

      explain select * from table1 left join table2 on table1.id = table2.table1Id group by table1.id limit 1;
      +------+-------------+-----------------+------+---------------+------+---------+------+-------+-------------------------------------------------+
      | id   | select_type | table           | type | possible_keys | key  | key_len | ref  | rows  | Extra                                           |
      +------+-------------+-----------------+------+---------------+------+---------+------+-------+-------------------------------------------------+
      |    1 | SIMPLE      | table1          | ALL  | NULL          | NULL | NULL    | NULL | 13121 | Using temporary; Using filesort                 |
      |    1 | SIMPLE      | table2          | ALL  | NULL          | NULL | NULL    | NULL | 31740 | Using where; Using join buffer (flat, BNL join) |
      +------+-------------+-----------------+------+---------------+------+---------+------+-------+-------------------------------------------------+
       
      set join_buffer_size = 16*1024*1024;
      explain select * from table1 left join table2 on table1.id = table2.table1Id group by table1.id limit 1;
      +------+-------------+-----------------+------+---------------+------+---------+------+-------+-------------------------------------------------+
      | id   | select_type | table           | type | possible_keys | key  | key_len | ref  | rows  | Extra                                           |
      +------+-------------+-----------------+------+---------------+------+---------+------+-------+-------------------------------------------------+
      |    1 | SIMPLE      | table1          | ALL  | NULL          | NULL | NULL    | NULL | 13121 | Using temporary; Using filesort                 |
      |    1 | SIMPLE      | table2          | ALL  | NULL          | NULL | NULL    | NULL | 31740 | Using where; Using join buffer (flat, BNL join) |
      +------+-------------+-----------------+------+---------------+------+---------+------+-------+-------------------------------------------------+
      

      Execution takes 1 min 51.763 sec. (v10.11.11 gave me 1 min 53.750 sec)

      set join_cache_level = 3;
      explain select * from table1 left join table2 on table1.id = table2.table1Id group by table1.id limit 1;
      +------+-------------+-----------------+----------+---------------+-----------+---------+-----------------------+-------+--------------------------------------------------+
      | id   | select_type | table           | type     | possible_keys | key       | key_len | ref                   | rows  | Extra                                            |
      +------+-------------+-----------------+----------+---------------+-----------+---------+-----------------------+-------+--------------------------------------------------+
      |    1 | SIMPLE      | table1          | ALL      | NULL          | NULL      | NULL    | NULL                  | 13121 | Using temporary; Using filesort                  |
      |    1 | SIMPLE      | table2          | hash_ALL | NULL          | #hash#$hj | 4       | db.table1.id          | 31740 | Using where; Using join buffer (flat, BNLH join) |
      +------+-------------+-----------------+----------+---------------+-----------+---------+-----------------------+-------+--------------------------------------------------+
      

      Execution takes 0.101 sec.

      set join_cache_level = 0;
      explain select * from table1 left join table2 on table1.id = table2.table1Id group by table1.id limit 1;
      +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+
      | id   | select_type | table           | type  | possible_keys | key     | key_len | ref  | rows  | Extra       |
      +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+
      |    1 | SIMPLE      | table1          | index | NULL          | PRIMARY | 4       | NULL | 1     |             |
      |    1 | SIMPLE      | table2          | ALL   | NULL          | NULL    | NULL    | NULL | 31740 | Using where |
      +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+
      

      Execution takes 0.007 sec.

      Using 10.1.48-MariaDB-1~bionic (docker image library/mariadb:10.1.48) I get:

      explain select * from table1 left join table2 on table1.id = table2.table1Id group by table1.id limit 1;
      +------+-------------+-----------------+------+---------------+------+---------+------+-------+-------------------------------------------------+
      | id   | select_type | table           | type | possible_keys | key  | key_len | ref  | rows  | Extra                                           |
      +------+-------------+-----------------+------+---------------+------+---------+------+-------+-------------------------------------------------+
      |    1 | SIMPLE      | table1          | ALL  | NULL          | NULL | NULL    | NULL | 13549 | Using temporary; Using filesort                 |
      |    1 | SIMPLE      | table2          | ALL  | NULL          | NULL | NULL    | NULL | 31210 | Using where; Using join buffer (flat, BNL join) |
      +------+-------------+-----------------+------+---------------+------+---------+------+-------+-------------------------------------------------+
       
      set join_buffer_size = 16*1024*1024;
      explain select * from table1 left join table2 on table1.id = table2.table1Id group by table1.id limit 1;
      +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+
      | id   | select_type | table           | type  | possible_keys | key     | key_len | ref  | rows  | Extra       |
      +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+
      |    1 | SIMPLE      | table1          | index | NULL          | PRIMARY | 4       | NULL |     1 |             |
      |    1 | SIMPLE      | table2          | ALL   | NULL          | NULL    | NULL    | NULL | 31210 | Using where |
      +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+
      

      Execution takes 0.03 sec.

      My questions are:

      • why join_cache_level defaults to 2 and not 0, 3 or 4?
      • why is increasing join_buffer_size causing to avoid BNL in 10.1.48, but not in 11.7 ?

      Thanks for your help!

      Attachments

        Activity

          answers to some questions:

          • Join cache level=0 means "do not use join cache". Typically it is advantageous to use it.
          • join cache level 3 and higher cannot be set by default because they may cause a very inefficient plan to be picked in some cases: MDEV-16337. There is MDEV-35855 to fix it, but it is not done yet.
          psergei Sergei Petrunia added a comment - answers to some questions: Join cache level=0 means "do not use join cache". Typically it is advantageous to use it. join cache level 3 and higher cannot be set by default because they may cause a very inefficient plan to be picked in some cases: MDEV-16337 . There is MDEV-35855 to fix it, but it is not done yet.

          The report has:
          set join_cache_level = 0;

          explain select * from table1 left join table2 on table1.id = table2.table1Id group by table1.id limit 1;
          +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+
          | id   | select_type | table           | type  | possible_keys | key     | key_len | ref  | rows  | Extra       |
          +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+
          |    1 | SIMPLE      | table1          | index | NULL          | PRIMARY | 4       | NULL | 1     |             |
          |    1 | SIMPLE      | table2          | ALL   | NULL          | NULL    | NULL    | NULL | 31740 | Using where |
          +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+
          

          Note that table1 has type=index, rows=1 (as opposed to rows=13121 normally).

          I suppose this is because the query has group by table1.id limit 1. Does table1 have PRIMARY KEY (id) (and just that column?). Looks so, and the optimizer apparently has estimated that it would only need to read one row from table1 to get one group? (TODO: does GROUP BY ... LIMIT actually short-cut execution like that?)

          psergei Sergei Petrunia added a comment - The report has: set join_cache_level = 0; explain select * from table1 left join table2 on table1.id = table2.table1Id group by table1.id limit 1; +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+ | 1 | SIMPLE | table1 | index | NULL | PRIMARY | 4 | NULL | 1 | | | 1 | SIMPLE | table2 | ALL | NULL | NULL | NULL | NULL | 31740 | Using where | +------+-------------+-----------------+-------+---------------+---------+---------+------+-------+-------------+ Note that table1 has type=index, rows=1 (as opposed to rows=13121 normally). I suppose this is because the query has group by table1.id limit 1 . Does table1 have PRIMARY KEY (id) (and just that column?). Looks so, and the optimizer apparently has estimated that it would only need to read one row from table1 to get one group? (TODO: does GROUP BY ... LIMIT actually short-cut execution like that?)

          People

            psergei Sergei Petrunia
            thbley Thomas Bley
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.