Details
-
Task
-
Status: Open (View Workflow)
-
Major
-
Resolution: Unresolved
-
None
Description
MariaDB (and historically, MySQL) had a BNL-H join (in MariaDB, join_cache_level>=3).
Then MySQL has added "proper" hash join.
This MDEV is about checking the performance of the two.
Query pattern
SELECT |
MAX(column1), ... |
FROM
|
t1, t2
|
WHERE
|
t1.col1=t2.col2
|
Let the join columns (col1 and col2 above) be integers for now.
Tables
Use 500bytes / row in both tables.
No indexes.
20M rows in one table, 20M (or 200M?) in another.
Run 1:
- table 1: 20M rows, unique values.
- table 2: 20M rows, 10 rows with same value on average.
Run 2:
- table 1: 20M rows, unique values.
- table 2: 200M rows, 10 rows with same value on average.
(Please don't "co-locate" rows with duplicates next to one another).
Table sizes
The first table: 20M * 0.5 KB/record = 10G
The second table: also 10G in try1, 100G in try2.
InnoDB Buffer pool
Start from "hot" buffer pool for both tables.
(Set innodb_buffer_pool_size to fit both tables, do multiple query runs).
Memory used for joining
MariaDB
Default is join_buffer_size=256K
MySQL
Default is join_buffer_size=256K
MySQL: "Memory usage by hash joins can be controlled using the join_buffer_size
system variable"
To use in benchmark
Set the common default: let'start with join_buffer_size=1G.
Links
https://dev.mysql.com/doc/refman/8.0/en/hash-joins.html
https://dev.mysql.com/blog-archive/hash-join-in-mysql-8/
https://mariadb.com/kb/en/server-system-variables/#join_cache_level
Attachments
Issue Links
- relates to
-
MDEV-34352 Execution time for Block Nested Loop Hash (BNLH) join with charset utf8mb4 is 7-20 times slower than with latin1
- Closed
-
MDEV-34427 BNL-H has not optimal implementation for varchar type
- Open
-
MDEV-34417 Wrong result set with utf8mb4_danish_ci and BNLH join
- Closed