Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34232

Check performance of BNL-H vs MySQL's hash join

    XMLWordPrintable

Details

    Description

      MariaDB (and historically, MySQL) had a BNL-H join (in MariaDB, join_cache_level>=3).

      Then MySQL has added "proper" hash join.

      This MDEV is about checking the performance of the two.

      Query pattern

      SELECT 
        MAX(column1), ... 
      FROM
        t1, t2 
      WHERE
        t1.col1=t2.col2
      

      Let the join columns (col1 and col2 above) be integers for now.

      Tables

      Use 500bytes / row in both tables.

      No indexes.

      20M rows in one table, 20M (or 200M?) in another.

      Run 1:

      • table 1: 20M rows, unique values.
      • table 2: 20M rows, 10 rows with same value on average.

      Run 2:

      • table 1: 20M rows, unique values.
      • table 2: 200M rows, 10 rows with same value on average.

      (Please don't "co-locate" rows with duplicates next to one another).

      Table sizes

      The first table: 20M * 0.5 KB/record = 10G

      The second table: also 10G in try1, 100G in try2.

      InnoDB Buffer pool

      Start from "hot" buffer pool for both tables.
      (Set innodb_buffer_pool_size to fit both tables, do multiple query runs).

      Memory used for joining

      MariaDB

      Default is join_buffer_size=256K

      MySQL

      Default is join_buffer_size=256K

      MySQL: "Memory usage by hash joins can be controlled using the join_buffer_size
      system variable"

      To use in benchmark

      Set the common default: let'start with join_buffer_size=1G.

      Links

      https://dev.mysql.com/doc/refman/8.0/en/hash-joins.html
      https://dev.mysql.com/blog-archive/hash-join-in-mysql-8/

      https://mariadb.com/kb/en/server-system-variables/#join_cache_level

      Attachments

        1. t2_small.txt
          1.04 MB
        2. t1_small.txt
          1.23 MB
        3. temp_files_mysql_n=200000.txt
          7 kB

        Issue Links

          Activity

            People

              lstartseva Lena Startseva
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.