[MDEV-16413] test performance of distinct range queries Created: 2018-06-06  Updated: 2021-07-13  Resolved: 2019-06-04

Status: Closed
Project: MariaDB Server
Component/s: Character Sets
Fix Version/s: 10.4.3, 10.2.22, 10.3.13

Type: Task Priority: Major
Reporter: Axel Schwenke Assignee: Axel Schwenke
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
is blocked by MDEV-17474 Change Unicode collation implementati... Closed
is blocked by MDEV-17502 Change Unicode xxx_general_ci and xxx... Closed
is blocked by MDEV-17511 Improve performance for ORDER BY with... Closed
is blocked by MDEV-17534 Implement fast path for ASCII range i... Closed
Relates
relates to MDEV-8872 Performance regressions with utf8mb4 ... Closed

 Description   

Check the claims in this blog



 Comments   
Comment by Axel Schwenke [ 2018-10-02 ]

Reproduced. Selecting distinct rows from short (10 rows) ranges is about the same speed in 10.2 and 10.3 while MySQL got much faster in 8.0 vs. 5.7. The effect is more visible in utf8mb4 and in the *_unicode_ci collations.

MariaDB spends a lot of cpu time in my_uca_scanner_next_any

Comment by Axel Schwenke [ 2018-10-02 ]

Creating regression tests for this kind of workload now.

Comment by Axel Schwenke [ 2018-10-09 ]

related files are on github: mariadb-benchmarks/Sysbench-0.5/MDEV-16413

Requires a Lua-enabled sysbench, i.e. sysbench-mariadb

In order to profile a certain workload, fire up a MariaDB server, edit config.sh to point to the socket for it, edit THREADS and RUNTIME_RO and then execute ./run.sysbench 07. The number doesn't really matter, it just defines which outdir is used for results.

Comment by Axel Schwenke [ 2018-10-09 ]

There are two points about collating performance:

1. scalability with thread number. As this machine can do 32 threads, performance would peak for 32 benchmark threads. Performance should stay stable for higher thread counts, but often doesn't. Example: utf8_general_ci

# data set 01 -> mariadb-10.1.36
# data set 02 -> mariadb-10.2.18
# data set 03 -> mariadb-10.3.9
# data set 04 -> mysql-5.6.36
# data set 05 -> mysql-5.7.21
# data set 06 -> mysql-8.0.11
 
# read-only (short ranges)
#thd    01      02      03      04      05      06
1       7308.8  8184.7  8020.1  6151.6  7234.8  7262.0
2       14090   15229   15308   12181   13546   14339
4       25085   28512   29831   21528   26559   27801
8       43274   51316   51550   39246   43164   51494
16      76782   97134   92415   71065   75132   98776
32      104524  143690  137077  100247  117379  144925
64      105551  121672  123965  101364  119981  144997
128     101807  123353  121965  101324  114877  144768
256     109214  122711  121956  100715  113369  144280

A clear peak at 32 threads for 10.3, but then it goes down.

2. the performance related to MySQL 8.0. In some cases also related to earlier MariaDB releases. Example above: 10.2 is faster than 10.3. 8.0 is a lot faster.

Both issues seem to be heavier for the unicode collations (compared to general) and for utf8mb compared to utf8.

Comment by Alexander Barkov [ 2018-10-15 ]

BENCHMARK() results (using RelWithDebInfo builds) for comparison of ~40 byte strings, for utf8_unicode_ci, ascii repertoir:

SELECT BENCHMARK(5000000,_utf8 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' COLLATE utf8_unicode_ci < _utf8'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbb') AS a;

  • MySQL-5.7 (3.44 sec)
  • MySQL-8.0 (2.33 sec)
  • MariaDB-10.4 (2.90 sec)
Comment by Alexander Barkov [ 2018-10-16 ]

BENCHMARK() results (using RelWithDebInfo builds) for comparison of ~40 byte strings, for utf8mb4_unicode_ci, ascii repertoir.
Note, MySQL-8.0 significantly improved comparing to MySQL-5.7.

SELECT BENCHMARK(5000000,_utf8mb4 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' COLLATE utf8mb4_unicode_ci < _utf8'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbb') AS a;

  • MySQL-5.7 (3.43 sec)
  • MySQL-8.0 (1.67 sec)
  • MariaDB-10.4 (2.91 sec)
Comment by Alexander Barkov [ 2018-10-16 ]

BENCHMARK() results (using RelWithDebInfo builds) for comparison of ~40 byte strings, for utf8mb4_spanish2_ci, ascii repertoir.
Note, MySQL-8.0 significantly improved comparing to MySQL-5.7.

SELECT BENCHMARK(5000000,_utf8mb4 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' COLLATE utf8mb4_spanish2_ci < _utf8'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbb') AS a;

  • MySQL-5.7 (8.82 sec)
  • MySQL-8.0 (1.78 sec)
  • MariaDB-10.4 (3.55 sec)
Comment by Alexander Barkov [ 2018-10-18 ]

axel, MDEV-17474 is now pushed. Please run benchmarks again. Thanks.

Comment by Axel Schwenke [ 2018-10-18 ]

utf8_unicode_ci

# data set 03 -> mariadb-10.3.9
# data set 06 -> mysql-8.0.11
# data set 07 -> mariadb-10.4 baseline
# data set 08 -> mariadb-10.4 with inline collate
 
# read-only (short ranges)
#thd    03      06      07      08
1       6212.1  6744.6  5911.8  7197.8
2       11072   13233   10696   13491
4       21491   25630   21763   25607
8       38310   45858   39494   45109
16      72045   91982   69576   84109
32      94114   129141  91978   109944
64      89896   129005  89959   104324
128     90026   129264  89056   108373
256     92578   128756  91488   107215

most busy function in mysqld:

12.56%  [.] my_uca_strnxfrm_onelevel_no_contractions_utf8mb3

call graph:

-   12.03%    11.86%  [.] my_uca_strnxfrm_onelevel_no_contractions_utf8mb3
   - my_uca_strnxfrm_onelevel_no_contractions_utf8mb3
      - 99.81% my_uca_strnxfrm_no_contractions_utf8mb3
           _ZN12Field_string11sort_stringEPhj
           _ZL12make_sortkeyP10Sort_paramPhS1_
           _Z8filesortP3THDP5TABLEP8FilesortP16Filesort_trackerP4JOINy
           _Z17create_sort_indexP3THDP4JOINP13st_join_tableP8Filesort
           _ZN13st_join_table10sort_tableEv
           _Z21join_init_read_recordP13st_join_table
           _ZN7AGGR_OP8end_sendEv
           _Z24sub_select_postjoin_aggrP4JOINP13st_join_tableb
           _ZN4JOIN10exec_innerEv
           _ZN4JOIN4execEv

Comment by Axel Schwenke [ 2018-10-18 ]

utf8mb4_unicode_ci

# data set 03 -> mariadb-10.3.9
# data set 06 -> mysql-8.0.11
# data set 07 -> mariadb-10.4 baseline
# data set 08 -> mariadb-10.4 with inline collate
 
# read-only (short ranges)
#thd    03      06      07      08
1       5622.1  7309.9  5608.0  6822.8
2       10188   14859   10637   12787
4       20166   27410   19464   24159
8       34266   50028   34302   41361
16      64133   97444   62909   75192
32      89883   136156  85741   103783
64      83062   137810  83564   100442
128     82443   138099  82641   101763
256     85764   137945  85181   104029

most busy function in mysqld:

16.91%  [.] my_uca_strnxfrm_onelevel_no_contractions_utf8mb4

call graph:

-   17.62%    17.11%  [.] my_uca_strnxfrm_onelevel_no_contractions_utf8mb4
   - my_uca_strnxfrm_onelevel_no_contractions_utf8mb4
      - 99.60% my_uca_strnxfrm_no_contractions_utf8mb4
           _ZN12Field_string11sort_stringEPhj
           _ZL12make_sortkeyP10Sort_paramPhS1_
           _Z8filesortP3THDP5TABLEP8FilesortP16Filesort_trackerP4JOINy
           _Z17create_sort_indexP3THDP4JOINP13st_join_tableP8Filesort
           _ZN13st_join_table10sort_tableEv
           _Z21join_init_read_recordP13st_join_table
           _ZN7AGGR_OP8end_sendEv
           _Z24sub_select_postjoin_aggrP4JOINP13st_join_tableb
           _ZN4JOIN10exec_innerEv
           _ZN4JOIN4execEv
...

Comment by Axel Schwenke [ 2018-10-18 ]

utf8mb4_unicode_ci, removed the ORDER BY clause from the query to eliminate the filesort.

# data set 03 -> mariadb-10.3.9
# data set 06 -> mysql-8.0.11
# data set 07 -> mariadb-10.4 baseline
# data set 08 -> mariadb-10.4 with inline collate
# data set 09 -> mariadb-10.4 bar
 
# read-only (short ranges)
#thd    03      06      07      08      09
1       8250.3  8152.8  8003.8  8403.1  8469.8
2       16582   16159   15965   15926   16721
4       30780   31393   28467   30442   31813
8       52279   56888   51344   52947   54583
16      96950   109789  95596   97491   95948
32      138664  160343  143985  149344  149047
64      128624  160171  127250  133061  132593
128     127300  159873  122495  129590  129175
256     127458  159356  126651  132277  131171

Comment by Alexander Barkov [ 2018-10-18 ]

Repeatable with this script:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (pk SERIAL, field CHAR(120) CHARACTER SET utf8 COLLATE utf8_unicode_ci);
INSERT INTO t1 (field) VALUES ('a'),('b'),('c'),('d');
INSERT t1 (field)
WITH  RECURSIVE int_seq AS (
  SELECT 1 AS val
  UNION ALL
  SELECT val + 1
  FROM int_seq
  WHERE val < 1000
) SELECT 'a' FROM int_seq;
 
DROP PROCEDURE IF EXISTS p1;
DELIMITER $$
CREATE PROCEDURE p1()
BEGIN
  DECLARE a INT DEFAULT 100000;
  WHILE (a > 0)
  DO
    SELECT DISTINCT field INTO @a FROM t1 WHERE pk BETWEEN 1 AND 11 ORDER BY field LIMIT 1;
    SET a=a-1;
  END WHILE;
END;
$$
DELIMITER ;
CALL p1;

  • (11.62 sec) – MariaDB-10.4 before MDEV-17474
  • (9.25 sec) – MariaDB-10.4 after MDEV-17474
  • (8.30 sec) – MySQL-8.0

Note, if I remove the ORDER BY, the statistics looks as follows:

  • (5.62 sec) – MariaDB-10.4 before MDEV-17474
  • (5.54 sec) – MariaDB-10.4 after MDEV-17474
  • (6.40 sec) – MySQL-8.0
Comment by Alexander Barkov [ 2018-10-19 ]

With VARCHAR(120), instead of CHAR(120), MariaDB is faster on ORDER-BY queries (even before MDEV-17474 changes)

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (pk SERIAL, field VARCHAR(120) CHARACTER SET utf8 COLLATE utf8_unicode_ci);
INSERT INTO t1 (field) VALUES ('a'),('b'),('c'),('d');
INSERT t1 (field)
WITH  RECURSIVE int_seq AS (
  SELECT 1 AS val
  UNION ALL
  SELECT val + 1
  FROM int_seq
  WHERE val < 1000
) SELECT 'a' FROM int_seq;
 
DROP PROCEDURE IF EXISTS p1;
DELIMITER $$
CREATE PROCEDURE p1()
BEGIN
  DECLARE a INT DEFAULT 100000;
  WHILE (a > 0)
  DO
    SELECT DISTINCT field INTO @a FROM t1 WHERE pk BETWEEN 1 AND 11 ORDER BY field LIMIT 1;
    SET a=a-1;
  END WHILE;
END;
$$
DELIMITER ;
CALL p1;

  • (6.82 sec) – MariaDB-10.4 before MDEV-17474
  • (6.65 sec) – MariaDB-10.4 after MDEV-17474
  • (8.22 sec) – MySQL-8.0
Generated at Thu Feb 08 08:28:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.