|
Reproduced. Selecting distinct rows from short (10 rows) ranges is about the same speed in 10.2 and 10.3 while MySQL got much faster in 8.0 vs. 5.7. The effect is more visible in utf8mb4 and in the *_unicode_ci collations.
MariaDB spends a lot of cpu time in my_uca_scanner_next_any
|
|
Creating regression tests for this kind of workload now.
|
|
related files are on github: mariadb-benchmarks/Sysbench-0.5/MDEV-16413
Requires a Lua-enabled sysbench, i.e. sysbench-mariadb
In order to profile a certain workload, fire up a MariaDB server, edit config.sh to point to the socket for it, edit THREADS and RUNTIME_RO and then execute ./run.sysbench 07. The number doesn't really matter, it just defines which outdir is used for results.
|
|
There are two points about collating performance:
1. scalability with thread number. As this machine can do 32 threads, performance would peak for 32 benchmark threads. Performance should stay stable for higher thread counts, but often doesn't. Example: utf8_general_ci
# data set 01 -> mariadb-10.1.36
|
# data set 02 -> mariadb-10.2.18
|
# data set 03 -> mariadb-10.3.9
|
# data set 04 -> mysql-5.6.36
|
# data set 05 -> mysql-5.7.21
|
# data set 06 -> mysql-8.0.11
|
|
# read-only (short ranges)
|
#thd 01 02 03 04 05 06
|
1 7308.8 8184.7 8020.1 6151.6 7234.8 7262.0
|
2 14090 15229 15308 12181 13546 14339
|
4 25085 28512 29831 21528 26559 27801
|
8 43274 51316 51550 39246 43164 51494
|
16 76782 97134 92415 71065 75132 98776
|
32 104524 143690 137077 100247 117379 144925
|
64 105551 121672 123965 101364 119981 144997
|
128 101807 123353 121965 101324 114877 144768
|
256 109214 122711 121956 100715 113369 144280
|
A clear peak at 32 threads for 10.3, but then it goes down.
2. the performance related to MySQL 8.0. In some cases also related to earlier MariaDB releases. Example above: 10.2 is faster than 10.3. 8.0 is a lot faster.
Both issues seem to be heavier for the unicode collations (compared to general) and for utf8mb compared to utf8.
|
|
BENCHMARK() results (using RelWithDebInfo builds) for comparison of ~40 byte strings, for utf8_unicode_ci, ascii repertoir:
SELECT BENCHMARK(5000000,_utf8 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' COLLATE utf8_unicode_ci < _utf8'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbb') AS a;
|
- MySQL-5.7 (3.44 sec)
- MySQL-8.0 (2.33 sec)
- MariaDB-10.4 (2.90 sec)
|
|
BENCHMARK() results (using RelWithDebInfo builds) for comparison of ~40 byte strings, for utf8mb4_unicode_ci, ascii repertoir.
Note, MySQL-8.0 significantly improved comparing to MySQL-5.7.
SELECT BENCHMARK(5000000,_utf8mb4 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' COLLATE utf8mb4_unicode_ci < _utf8'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbb') AS a;
|
- MySQL-5.7 (3.43 sec)
- MySQL-8.0 (1.67 sec)
- MariaDB-10.4 (2.91 sec)
|
|
BENCHMARK() results (using RelWithDebInfo builds) for comparison of ~40 byte strings, for utf8mb4_spanish2_ci, ascii repertoir.
Note, MySQL-8.0 significantly improved comparing to MySQL-5.7.
SELECT BENCHMARK(5000000,_utf8mb4 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' COLLATE utf8mb4_spanish2_ci < _utf8'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbb') AS a;
|
- MySQL-5.7 (8.82 sec)
- MySQL-8.0 (1.78 sec)
- MariaDB-10.4 (3.55 sec)
|
|
axel, MDEV-17474 is now pushed. Please run benchmarks again. Thanks.
|
utf8_unicode_ci
# data set 03 -> mariadb-10.3.9
|
# data set 06 -> mysql-8.0.11
|
# data set 07 -> mariadb-10.4 baseline
|
# data set 08 -> mariadb-10.4 with inline collate
|
|
# read-only (short ranges)
|
#thd 03 06 07 08
|
1 6212.1 6744.6 5911.8 7197.8
|
2 11072 13233 10696 13491
|
4 21491 25630 21763 25607
|
8 38310 45858 39494 45109
|
16 72045 91982 69576 84109
|
32 94114 129141 91978 109944
|
64 89896 129005 89959 104324
|
128 90026 129264 89056 108373
|
256 92578 128756 91488 107215
|
most busy function in mysqld:
12.56% [.] my_uca_strnxfrm_onelevel_no_contractions_utf8mb3
|
call graph:
- 12.03% 11.86% [.] my_uca_strnxfrm_onelevel_no_contractions_utf8mb3
|
- my_uca_strnxfrm_onelevel_no_contractions_utf8mb3
|
- 99.81% my_uca_strnxfrm_no_contractions_utf8mb3
|
_ZN12Field_string11sort_stringEPhj
|
_ZL12make_sortkeyP10Sort_paramPhS1_
|
_Z8filesortP3THDP5TABLEP8FilesortP16Filesort_trackerP4JOINy
|
_Z17create_sort_indexP3THDP4JOINP13st_join_tableP8Filesort
|
_ZN13st_join_table10sort_tableEv
|
_Z21join_init_read_recordP13st_join_table
|
_ZN7AGGR_OP8end_sendEv
|
_Z24sub_select_postjoin_aggrP4JOINP13st_join_tableb
|
_ZN4JOIN10exec_innerEv
|
_ZN4JOIN4execEv
|
|
utf8mb4_unicode_ci
# data set 03 -> mariadb-10.3.9
|
# data set 06 -> mysql-8.0.11
|
# data set 07 -> mariadb-10.4 baseline
|
# data set 08 -> mariadb-10.4 with inline collate
|
|
# read-only (short ranges)
|
#thd 03 06 07 08
|
1 5622.1 7309.9 5608.0 6822.8
|
2 10188 14859 10637 12787
|
4 20166 27410 19464 24159
|
8 34266 50028 34302 41361
|
16 64133 97444 62909 75192
|
32 89883 136156 85741 103783
|
64 83062 137810 83564 100442
|
128 82443 138099 82641 101763
|
256 85764 137945 85181 104029
|
most busy function in mysqld:
16.91% [.] my_uca_strnxfrm_onelevel_no_contractions_utf8mb4
|
call graph:
- 17.62% 17.11% [.] my_uca_strnxfrm_onelevel_no_contractions_utf8mb4
|
- my_uca_strnxfrm_onelevel_no_contractions_utf8mb4
|
- 99.60% my_uca_strnxfrm_no_contractions_utf8mb4
|
_ZN12Field_string11sort_stringEPhj
|
_ZL12make_sortkeyP10Sort_paramPhS1_
|
_Z8filesortP3THDP5TABLEP8FilesortP16Filesort_trackerP4JOINy
|
_Z17create_sort_indexP3THDP4JOINP13st_join_tableP8Filesort
|
_ZN13st_join_table10sort_tableEv
|
_Z21join_init_read_recordP13st_join_table
|
_ZN7AGGR_OP8end_sendEv
|
_Z24sub_select_postjoin_aggrP4JOINP13st_join_tableb
|
_ZN4JOIN10exec_innerEv
|
_ZN4JOIN4execEv
|
...
|
|
|
utf8mb4_unicode_ci, removed the ORDER BY clause from the query to eliminate the filesort.
# data set 03 -> mariadb-10.3.9
|
# data set 06 -> mysql-8.0.11
|
# data set 07 -> mariadb-10.4 baseline
|
# data set 08 -> mariadb-10.4 with inline collate
|
# data set 09 -> mariadb-10.4 bar
|
|
# read-only (short ranges)
|
#thd 03 06 07 08 09
|
1 8250.3 8152.8 8003.8 8403.1 8469.8
|
2 16582 16159 15965 15926 16721
|
4 30780 31393 28467 30442 31813
|
8 52279 56888 51344 52947 54583
|
16 96950 109789 95596 97491 95948
|
32 138664 160343 143985 149344 149047
|
64 128624 160171 127250 133061 132593
|
128 127300 159873 122495 129590 129175
|
256 127458 159356 126651 132277 131171
|
|
|
Repeatable with this script:
DROP TABLE IF EXISTS t1;
|
CREATE TABLE t1 (pk SERIAL, field CHAR(120) CHARACTER SET utf8 COLLATE utf8_unicode_ci);
|
INSERT INTO t1 (field) VALUES ('a'),('b'),('c'),('d');
|
INSERT t1 (field)
|
WITH RECURSIVE int_seq AS (
|
SELECT 1 AS val
|
UNION ALL
|
SELECT val + 1
|
FROM int_seq
|
WHERE val < 1000
|
) SELECT 'a' FROM int_seq;
|
|
DROP PROCEDURE IF EXISTS p1;
|
DELIMITER $$
|
CREATE PROCEDURE p1()
|
BEGIN
|
DECLARE a INT DEFAULT 100000;
|
WHILE (a > 0)
|
DO
|
SELECT DISTINCT field INTO @a FROM t1 WHERE pk BETWEEN 1 AND 11 ORDER BY field LIMIT 1;
|
SET a=a-1;
|
END WHILE;
|
END;
|
$$
|
DELIMITER ;
|
CALL p1;
|
- (11.62 sec) – MariaDB-10.4 before
MDEV-17474
- (9.25 sec) – MariaDB-10.4 after
MDEV-17474
- (8.30 sec) – MySQL-8.0
Note, if I remove the ORDER BY, the statistics looks as follows:
- (5.62 sec) – MariaDB-10.4 before
MDEV-17474
- (5.54 sec) – MariaDB-10.4 after
MDEV-17474
- (6.40 sec) – MySQL-8.0
|
|
With VARCHAR(120), instead of CHAR(120), MariaDB is faster on ORDER-BY queries (even before MDEV-17474 changes)
DROP TABLE IF EXISTS t1;
|
CREATE TABLE t1 (pk SERIAL, field VARCHAR(120) CHARACTER SET utf8 COLLATE utf8_unicode_ci);
|
INSERT INTO t1 (field) VALUES ('a'),('b'),('c'),('d');
|
INSERT t1 (field)
|
WITH RECURSIVE int_seq AS (
|
SELECT 1 AS val
|
UNION ALL
|
SELECT val + 1
|
FROM int_seq
|
WHERE val < 1000
|
) SELECT 'a' FROM int_seq;
|
|
DROP PROCEDURE IF EXISTS p1;
|
DELIMITER $$
|
CREATE PROCEDURE p1()
|
BEGIN
|
DECLARE a INT DEFAULT 100000;
|
WHILE (a > 0)
|
DO
|
SELECT DISTINCT field INTO @a FROM t1 WHERE pk BETWEEN 1 AND 11 ORDER BY field LIMIT 1;
|
SET a=a-1;
|
END WHILE;
|
END;
|
$$
|
DELIMITER ;
|
CALL p1;
|
- (6.82 sec) – MariaDB-10.4 before
MDEV-17474
- (6.65 sec) – MariaDB-10.4 after
MDEV-17474
- (8.22 sec) – MySQL-8.0
|