MariaDB Server / MDEV-34352

Execution time for Block Nested Loop Hash (BNLH) join with charset utf8mb4 is 7-20 times slower than with latin1

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects versions: 10.5, 10.6, 10.11, 11.1 (EOL), 11.2 (EOL), 11.4, 11.5 (EOL), 11.6 (EOL)
    • Fix version: N/A
    • Component: Character Sets

    Description

      2024-06-21 Update

      This issue turned out to be a combination of two problems:

      • MDEV-34417 Wrong result set with utf8mb4_danish_ci and BNLH join
      • MDEV-34427 BNL-H has not optimal implementation for varchar type

      All further work will be done under the two tickets above.

      The original report

      Testcase:
      1. Enable Block Nested Loop Hash (BNLH):

      set join_cache_level=3;
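      -- join_cache_level=3 selects the flat Block Nested Loop Hash (BNLH)
      -- variant; the default level 2 is incremental BNL without hashing.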
      

      2. The variables join_buffer_size and join_buffer_space_limit were left at their default values (they can be changed, but the difference in execution time remains):

      +-------------------------+----------------------+
      | Variable_name           | Value                |
      +-------------------------+----------------------+
      | join_buffer_size        | 262144               |
      | join_buffer_space_limit | 2097152              |
      +-------------------------+----------------------+
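
      For reference, a minimal way to check these values:

      show variables like 'join_buffer%';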
      

      3. Create the tables with charset=latin1 or charset=utf8mb4:

      CREATE TABLE t1(col1 VARCHAR(500)) engine=INNODB character set latin1 collate latin1_swedish_ci;
      CREATE TABLE t2(col2 VARCHAR(500)) engine=INNODB character set latin1 collate latin1_swedish_ci;
      

      or

      CREATE TABLE t1(col1 VARCHAR(500)) engine=INNODB character set utf8mb4 collate utf8mb4_general_ci;
      CREATE TABLE t2(col2 VARCHAR(500)) engine=INNODB character set utf8mb4 collate utf8mb4_general_ci;
      

      4. Fill the tables:

      insert into t1 select  FLOOR(1 + (RAND() * 200000)) from seq_1_to_200000;
      insert into t2 select  FLOOR(1 + (RAND() * 20000)) from seq_1_to_200000;
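      -- The generated values are decimal integers of at most 6 digits, so the
      -- actual strings are far shorter than the declared VARCHAR(500); the
      -- comments below show this gap is central to the slowdown.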
      

      5. Execute the query:

      select max(A.col1) from t1 A, t2 B where A.col1=B.col2;
      

      Results for v. 11.6:

      +---------------+-----------------+------------------+
      | Build         | latin1          | utf8mb4          |
      +---------------+-----------------+------------------+
      | Debug build   | 7 min 6.426 sec | 54 min 2.303 sec |
      | Release build | 17.266 sec      | 8 min 21.989 sec |
      +---------------+-----------------+------------------+

      Statistics collected with perf are also attached to the task.

      Attachments

      Issue Links

      Activity

            bar Alexander Barkov added a comment:

            Changing to Critical, as utf8mb4 is going to be the default character set starting from 11.6.

            psergei Sergei Petrunia added a comment:

            Looking at ANALYZE FORMAT=JSON output, I see a difference in block-nl-join.r_loops: in the "short" case it is 41, in the "long" case it is 158. block-nl-join.buffer_size is nearly the same. This means that for some reason the "long" case is using much more buffer space.
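
            For reference, the counters mentioned above come from running the test query under ANALYZE FORMAT=JSON; a minimal sketch (only the fields discussed here are named; real output contains more):

            ANALYZE FORMAT=JSON
            select max(A.col1) from t1 A, t2 B where A.col1=B.col2;
            -- In the JSON output, find the "block-nl-join" node:
            -- "r_loops" reflects how many times the join buffer was filled
            -- and scanned (per the reading above), and "buffer_size" shows
            -- the join buffer size that was used.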

            psergei Sergei Petrunia added a comment:

            Indeed, the BNL-H join has "non-embedded keys" (check JOIN_CACHE::check_emb_key_usage()). If we are joining on a varchar(100) column, the join buffer record for it WILL include space for 100 characters (the lookup tuple in KeyTupleFormat), even if the joined values are much shorter.
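
            Back-of-the-envelope arithmetic for the original testcase, assuming the lookup tuple reserves the maximum octet length: a VARCHAR(500) key reserves about 500 bytes per record under latin1 but 500 * 4 = 2000 bytes under utf8mb4. With join_buffer_size=262144 this fits roughly a quarter as many records per buffer fill, consistent with r_loops growing from 41 to 158.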

            psergei Sergei Petrunia added a comment:

            This seems to be "by design" of the Join Buffer code.
            bar Alexander Barkov added a comment (edited):

            Even if the underlying low-level algorithm heavily relies on the fixed-length nature of the records, the performance degradation on multi-byte character sets can still be avoided:

            It could be refactored to have buckets, at least two:

            • one bucket for lengths 0..MAX_CHAR_LENGTH octets
            • another bucket for lengths MAX_CHAR_LENGTH+1 .. MAX_OCTET_LENGTH

            E.g. in the case of VARCHAR(100) CHARACTER SET utf8mb4:

            • one bucket for 0..100 bytes
            • another bucket for 101..400 bytes

            Or even have more buckets:

            • 0..100 bytes
            • 101..200 bytes
            • 201..300 bytes
            • 301..400 bytes

            Or even more buckets with finer length granularity (see the illustration below).
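
            To illustrate with hypothetical numbers: under the two-bucket scheme, a 6-byte value in a VARCHAR(100) CHARACTER SET utf8mb4 column would occupy a 100-byte slot in the first bucket instead of a full 400-byte tuple, so roughly four times as many short keys would fit per buffer fill.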

            bar Alexander Barkov added a comment (edited):

            Timings after the fix for MDEV-34417:

            CREATE OR REPLACE TABLE t1(col1 VARCHAR(500)) engine=INNODB character set latin1 collate latin1_swedish_ci;
            CREATE OR REPLACE TABLE t2(col2 VARCHAR(500)) engine=INNODB character set latin1 collate latin1_swedish_ci;
            insert into t1 select  FLOOR(1 + (RAND() * 200000)) from seq_1_to_20000;
            insert into t2 select  FLOOR(1 + (RAND() * 20000))  from seq_1_to_20000;
            set join_cache_level=3;
            select max(A.col1) from t1 A, t2 B where A.col1=B.col2;
            1 row in set (0.222 sec)
            

            CREATE OR REPLACE TABLE t1(col1 VARCHAR(500)) engine=INNODB character set utf8mb4 collate utf8mb4_general_ci;
            CREATE OR REPLACE TABLE t2(col2 VARCHAR(500)) engine=INNODB character set utf8mb4 collate utf8mb4_general_ci;
            insert into t1 select  FLOOR(1 + (RAND() * 200000)) from seq_1_to_20000;
            insert into t2 select  FLOOR(1 + (RAND() * 20000))  from seq_1_to_20000;
            set join_cache_level=3;
            select max(A.col1) from t1 A, t2 B where A.col1=B.col2;
            1 row in set (0.995 sec)
            

            CREATE OR REPLACE TABLE t1(col1 VARCHAR(2000)) engine=INNODB character set latin1 collate latin1_swedish_ci;
            CREATE OR REPLACE TABLE t2(col2 VARCHAR(2000)) engine=INNODB character set latin1 collate latin1_swedish_ci;
            insert into t1 select  FLOOR(1 + (RAND() * 200000)) from seq_1_to_20000;
            insert into t2 select  FLOOR(1 + (RAND() * 20000))  from seq_1_to_20000;
            set join_cache_level=3;
            select max(A.col1) from t1 A, t2 B where A.col1=B.col2;
            1 row in set (0.823 sec)
            

            Observation:

            • VARCHAR(500) CHARACTER SET utf8mb4 (2000 octets max) is now only 18% slower than the same binary size VARCHAR(2000) CHARACTER SET latin1 (2000 octets max)

            bar Alexander Barkov added a comment:

            As the character-set-specific problem has been solved, all further work should be done under MDEV-34427.

            People

              Assignee: bar Alexander Barkov
              Reporter: lstartseva Lena Startseva
              Votes: 0
              Watchers: 3

