[MDEV-17698] MEMORY engine performance regression Created: 2018-11-13  Updated: 2019-03-25  Resolved: 2019-03-25

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Memory
Affects Version/s: 5.5.50, 10.0.25, 10.1.14, 10.2, 10.3, 10.4
Fix Version/s: 10.3.11, 10.4.1, 10.1.38, 5.5.63, 10.0.38, 10.2.20

Type: Bug Priority: Major
Reporter: Axel Schwenke Assignee: Alexander Barkov
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by MDEV-17724 Wrong result for BETWEEN 0 AND 184467... Closed
Relates
relates to MDEV-9372 select 100 between 1 and 922337203685... Closed
relates to MDEV-18319 BIGINT UNSIGNED Performance issue Stalled

 Description   

The regression benchmarks found a regression for mixed read/write workload and MEMORY tables. The following shows the last two good and the first two bad releases for each major version of the MariaDB server. From 10.2 onward there are no good releases any more.

Test 't_oltp-memory-multi' - sysbench OLTP read/write
32 Memory tables with 100K rows each
numbers are queries per second
 
#thread count           1       8       16      32      64      128     256
mariadb-5.5.48          985.74  4912.4  7701.0  8719.2  8484.9  8441.0  8425.6
mariadb-5.5.49          962.26  4532.4  7638.5  8332.6  8167.6  8143.0  8135.8
mariadb-5.5.50          559.35  3036.3  5002.7  4960.6  5011.1  5019.3  5012.1
mariadb-5.5.51          571.61  3576.5  5357.0  4830.9  4913.1  4912.8  4914.3
...
mariadb-10.0.23         948.63  5292.5  7040.5  7106.2  7047.1  7037.4  7044.7
mariadb-10.0.24         938.52  5259.6  6961.9  7021.9  7032.9  7034.6  7024.4
mariadb-10.0.25         563.74  3536.8  5893.9  5786.6  5816.9  5797.2  5799.1
mariadb-10.0.26         548.38  3516.5  6002.7  5807.9  5788.9  5780.5  5786.1
...
mariadb-10.1.12         958.87  5353.0  6957.1  8403.9  7739.5  7749.0  7766.4
mariadb-10.1.13         949.01  5510.0  7114.3  8458.5  8295.2  8216.3  8291.6
mariadb-10.1.14         567.69  3743.1  6147.7  5653.2  5400.9  5439.1  5415.8
mariadb-10.1.15         549.83  3580.7  5666.3  5267.9  5286.0  5340.9  5293.5
...
mariadb-10.2.1          487.03  3116.2  4802.8  4393.1  4452.0  4446.7  4457.6
mariadb-10.2.2          495.54  3065.3  4600.5  4423.6  4487.9  4488.6  4496.6
...
mariadb-10.3.0          486.31  3070.2  5629.3  5160.6  5325.6  5308.0  5305.6
mariadb-10.3.1          500.91  3278.9  5541.2  5335.6  5433.4  5430.5  5439.4
...
mariadb-10.4.0          462.51  2641.8  5303.6  4985.9  5048.4  5045.9  5045.3



 Comments   
Comment by Axel Schwenke [ 2018-11-13 ]

Narrowed it down to being range queries:

# data set 01 -> mariadb-5.5.49
# data set 02 -> mariadb-5.5.50
 
# workload #1 (point selects)
#thd    01      02
1       29724   28616
8       201414  200596
 
# workload #2 (simple ranges)
#thd    01      02
1       206.82  123.23
8       1264.4  743.30
 
# workload #3 (distinct ranges)
#thd    01      02
1       212.71  125.87
8       1166.0  763.33
 
# workload #4 (sum ranges)
#thd    01      02
1       214.86  125.14
8       1246.8  873.41
 
# workload #5 (order ranges)
#thd    01      02
1       231.91  132.73
8       1268.0  904.04
 
# workload #6 (non-index updates)
#thd    01      02
1       30755   29736
8       177581  181791
 
# workload #7 (index updates)
#thd    01      02
1       31660   31231
8       181892  186362

Now while it is clear that a HASH index is no much help for a range query, it still shouldn't be that much slower for no apparent reason.

Comment by Axel Schwenke [ 2018-11-13 ]

perf shows decimal_cmp() as the most busy function in 5.5.50. In 5.5.49 it is completely off the radar (< 0.01% hits vs. 19.22% in 5.5.50)

call graph from perf report:

# Children      Self  Command        Shared Object                                                                                
# ........  ........  .......  ...................  ..............................................................................
#
    19.22%    19.20%   mysqld  mysqld               [.] decimal_cmp                                                               
             |
             --- decimal_cmp
                |          
                |--93.14%-- _ZN17Item_func_between7val_intEv
                |          _ZL20evaluate_join_recordP4JOINP13st_join_tablei
                |          _Z10sub_selectP4JOINP13st_join_tableb
                |          _ZL9do_selectP4JOINP4ListI4ItemEP5TABLEP9Procedure
                |          _ZN4JOIN4execEv
                |          _Z13handle_selectP3THDP3LEXP13select_resultm
                |          _ZL21execute_sqlcom_selectP3THDP10TABLE_LIST
                |          _Z21mysql_execute_commandP3THD
                |          _Z11mysql_parseP3THDPcjP12Parser_state
                |          _Z16dispatch_command19enum_server_commandP3THDPcj
                |          _Z24do_handle_one_connectionP3THD
                |          handle_one_connection
                |          start_thread
                |          
                 --6.86%-- _ZL20evaluate_join_recordP4JOINP13st_join_tablei
                           _Z10sub_selectP4JOINP13st_join_tableb
                           _ZL9do_selectP4JOINP4ListI4ItemEP5TABLEP9Procedure
                           _ZN4JOIN4execEv
                           _Z13handle_selectP3THDP3LEXP13select_resultm
                           _ZL21execute_sqlcom_selectP3THDP10TABLE_LIST
                           _Z21mysql_execute_commandP3THD
                           _Z11mysql_parseP3THDPcjP12Parser_state
                           _Z16dispatch_command19enum_server_commandP3THDPcj
                           _Z24do_handle_one_connectionP3THD
                           handle_one_connection
                           start_thread

Comment by Axel Schwenke [ 2018-11-13 ]

bisecting commits in the 5.5 branch finds this:

3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830 is the first bad commit
commit 3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830
Author: Alexander Barkov <bar@mariadb.org>
Date:   Fri Apr 22 16:04:20 2016 +0400
 
    MDEV-9372 select 100 between 1 and 9223372036854775808 returns false
    
    Integer comparison of INT expressions with different signess in BETWEEN
    is not safe. Switching to DECIMAL comparison in case if INT arguments
    have different signess.

Comment by Axel Schwenke [ 2018-11-13 ]

test case in pure SQL:

drop procedure if exists 17698_create;
drop procedure if exists 17698_run;
 
delimiter !!
 
create procedure 17698_create()
begin
  drop table if exists t1;
  create table t1 (c1 int unsigned primary key, c2 int unsigned) engine memory;
  set @i=0;
  start transaction;
  while (@i<100000) do
    insert into t1 values (@i, 1000*rand());
    set @i=@i+1;
  end while;
  commit;
end !!
 
create procedure 17698_run()
begin
  declare j,k int;
  set @i=1000;
  while (@i>0) do
    set j=99900*rand();
    set k=j+99;
    select sum(c2) into @dummy from t1 where c1 between j and k;
    set @i=@i-1;
  end while;
end !!
 
delimiter ;
 
call 17698_create();
call 17698_run();
call 17698_run();
call 17698_run();

On a good release the 17698_run() procedure needs ~4 seconds, on a bad one ~7 seconds. If the engine is changed to InnoDB and number of selects increased to 100000, it's ~6 seconds each. There seems to be a very small regression of ~0.1 seconds, but in any case it's much smaller.

Note: firstly I used variables @j and @k in the procedure but I had to switch to declared INT variables j and k to make the effect visible. In the original sysbench range query the bounds for between are both literals and strictly non-negative. I wonder how the signedness of such constants is established?

Comment by Alexander Barkov [ 2018-11-14 ]

axel do you know why _oltp-memory-multi uses a mixture of signed int and unsigned int in BETWEEN?

This change in procedure 17698_run:

-  declare j,k int;
+  declare j,k int unsigned;

fixes the problem.

The performance degradation caused by MDEV-9372 can be fixed quite easy in 10.3, so a mixture of signed and unsigned integers will compare both:

Doing this optimization in versions 10.2 or earlier is not desirable.

Comment by Alexander Barkov [ 2018-11-14 ]

If I change the table t1 to use BTREE index (instead of the default HASH index):

drop procedure if exists 17698_create;
delimiter !!
create procedure 17698_create()
begin
  drop table if exists t1;
  create table t1 (
    c1 int,
    c2 int unsigned,
    unique key (c1) type BTREE
  ) engine memory;
  set @i=0;
  start transaction;
  while (@i<100000) do
    insert into t1 values (@i, 1000*rand());
    set @i=@i+1;
  end while;
  commit;
end !!
delimiter ;
call 17698_create();

"call 17698_run()" becomes around 100 times faster.

Comment by Axel Schwenke [ 2018-11-15 ]

commit 4576942 fixes the regression in MariaDB 10.3:

# data set 05 -> 10.3.10
# data set 06 -> 10.3 + MDEV-17698
 
# workload #2 (simple ranges)
#thd    05      06
1       107.91  201.24
8       753.60  1227.2
 
# workload #3 (distinct ranges)
#thd    05      06
1       106.71  195.30
8       750.05  1138.2
 
# workload #4 (sum ranges)
#thd    05      06
1       108.39  201.84
8       691.30  1153.5
 
# workload #5 (order ranges)
#thd    05      06
1       106.18  197.18
8       749.11  1121.6

Comment by Axel Schwenke [ 2018-11-15 ]

commit 7f17559 fixes the regression in MariaDB 5.5:

# data set 01 -> mariadb-5.5.49
# data set 02 -> mariadb-5.5.50
# data set 07 -> 5.5 + MDEV-17698
 
# workload #2 (simple ranges)
#thd    01      02      07
1       206.82  123.23  197.89
8       1264.4  743.30  1126.9
 
# workload #3 (distinct ranges)
#thd    01      02      07
1       212.71  125.87  197.55
8       1166.0  763.33  1186.8
 
# workload #4 (sum ranges)
#thd    01      02      07
1       214.86  125.14  204.37
8       1246.8  873.41  1134.1
 
# workload #5 (order ranges)
#thd    01      02      07
1       231.91  132.73  214.34
8       1268.0  904.04  1229.9

Generated at Thu Feb 08 08:38:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.