Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17698

MEMORY engine performance regression

Details

    Description

      The regression benchmarks found a regression for mixed read/write workload and MEMORY tables. The following shows the last two good and the first two bad releases for each major version of the MariaDB server. From 10.2 onward there are no good releases any more.

      Test 't_oltp-memory-multi' - sysbench OLTP read/write
      32 Memory tables with 100K rows each
      numbers are queries per second
       
      #thread count           1       8       16      32      64      128     256
      mariadb-5.5.48          985.74  4912.4  7701.0  8719.2  8484.9  8441.0  8425.6
      mariadb-5.5.49          962.26  4532.4  7638.5  8332.6  8167.6  8143.0  8135.8
      mariadb-5.5.50          559.35  3036.3  5002.7  4960.6  5011.1  5019.3  5012.1
      mariadb-5.5.51          571.61  3576.5  5357.0  4830.9  4913.1  4912.8  4914.3
      ...
      mariadb-10.0.23         948.63  5292.5  7040.5  7106.2  7047.1  7037.4  7044.7
      mariadb-10.0.24         938.52  5259.6  6961.9  7021.9  7032.9  7034.6  7024.4
      mariadb-10.0.25         563.74  3536.8  5893.9  5786.6  5816.9  5797.2  5799.1
      mariadb-10.0.26         548.38  3516.5  6002.7  5807.9  5788.9  5780.5  5786.1
      ...
      mariadb-10.1.12         958.87  5353.0  6957.1  8403.9  7739.5  7749.0  7766.4
      mariadb-10.1.13         949.01  5510.0  7114.3  8458.5  8295.2  8216.3  8291.6
      mariadb-10.1.14         567.69  3743.1  6147.7  5653.2  5400.9  5439.1  5415.8
      mariadb-10.1.15         549.83  3580.7  5666.3  5267.9  5286.0  5340.9  5293.5
      ...
      mariadb-10.2.1          487.03  3116.2  4802.8  4393.1  4452.0  4446.7  4457.6
      mariadb-10.2.2          495.54  3065.3  4600.5  4423.6  4487.9  4488.6  4496.6
      ...
      mariadb-10.3.0          486.31  3070.2  5629.3  5160.6  5325.6  5308.0  5305.6
      mariadb-10.3.1          500.91  3278.9  5541.2  5335.6  5433.4  5430.5  5439.4
      ...
      mariadb-10.4.0          462.51  2641.8  5303.6  4985.9  5048.4  5045.9  5045.3
      

      Attachments

        Issue Links

          Activity

            axel Axel Schwenke added a comment -

            Narrowed it down to being range queries:

            # data set 01 -> mariadb-5.5.49
            # data set 02 -> mariadb-5.5.50
             
            # workload #1 (point selects)
            #thd    01      02
            1       29724   28616
            8       201414  200596
             
            # workload #2 (simple ranges)
            #thd    01      02
            1       206.82  123.23
            8       1264.4  743.30
             
            # workload #3 (distinct ranges)
            #thd    01      02
            1       212.71  125.87
            8       1166.0  763.33
             
            # workload #4 (sum ranges)
            #thd    01      02
            1       214.86  125.14
            8       1246.8  873.41
             
            # workload #5 (order ranges)
            #thd    01      02
            1       231.91  132.73
            8       1268.0  904.04
             
            # workload #6 (non-index updates)
            #thd    01      02
            1       30755   29736
            8       177581  181791
             
            # workload #7 (index updates)
            #thd    01      02
            1       31660   31231
            8       181892  186362
            

            Now while it is clear that a HASH index is no much help for a range query, it still shouldn't be that much slower for no apparent reason.

            axel Axel Schwenke added a comment - Narrowed it down to being range queries: # data set 01 -> mariadb-5.5.49 # data set 02 -> mariadb-5.5.50   # workload #1 (point selects) #thd 01 02 1 29724 28616 8 201414 200596   # workload #2 (simple ranges) #thd 01 02 1 206.82 123.23 8 1264.4 743.30   # workload #3 (distinct ranges) #thd 01 02 1 212.71 125.87 8 1166.0 763.33   # workload #4 (sum ranges) #thd 01 02 1 214.86 125.14 8 1246.8 873.41   # workload #5 (order ranges) #thd 01 02 1 231.91 132.73 8 1268.0 904.04   # workload #6 (non-index updates) #thd 01 02 1 30755 29736 8 177581 181791   # workload #7 (index updates) #thd 01 02 1 31660 31231 8 181892 186362 Now while it is clear that a HASH index is no much help for a range query, it still shouldn't be that much slower for no apparent reason.
            axel Axel Schwenke added a comment -

            perf shows decimal_cmp() as the most busy function in 5.5.50. In 5.5.49 it is completely off the radar (< 0.01% hits vs. 19.22% in 5.5.50)

            call graph from perf report:

            # Children      Self  Command        Shared Object                                                                                
            # ........  ........  .......  ...................  ..............................................................................
            #
                19.22%    19.20%   mysqld  mysqld               [.] decimal_cmp                                                               
                         |
                         --- decimal_cmp
                            |          
                            |--93.14%-- _ZN17Item_func_between7val_intEv
                            |          _ZL20evaluate_join_recordP4JOINP13st_join_tablei
                            |          _Z10sub_selectP4JOINP13st_join_tableb
                            |          _ZL9do_selectP4JOINP4ListI4ItemEP5TABLEP9Procedure
                            |          _ZN4JOIN4execEv
                            |          _Z13handle_selectP3THDP3LEXP13select_resultm
                            |          _ZL21execute_sqlcom_selectP3THDP10TABLE_LIST
                            |          _Z21mysql_execute_commandP3THD
                            |          _Z11mysql_parseP3THDPcjP12Parser_state
                            |          _Z16dispatch_command19enum_server_commandP3THDPcj
                            |          _Z24do_handle_one_connectionP3THD
                            |          handle_one_connection
                            |          start_thread
                            |          
                             --6.86%-- _ZL20evaluate_join_recordP4JOINP13st_join_tablei
                                       _Z10sub_selectP4JOINP13st_join_tableb
                                       _ZL9do_selectP4JOINP4ListI4ItemEP5TABLEP9Procedure
                                       _ZN4JOIN4execEv
                                       _Z13handle_selectP3THDP3LEXP13select_resultm
                                       _ZL21execute_sqlcom_selectP3THDP10TABLE_LIST
                                       _Z21mysql_execute_commandP3THD
                                       _Z11mysql_parseP3THDPcjP12Parser_state
                                       _Z16dispatch_command19enum_server_commandP3THDPcj
                                       _Z24do_handle_one_connectionP3THD
                                       handle_one_connection
                                       start_thread
            

            axel Axel Schwenke added a comment - perf shows decimal_cmp() as the most busy function in 5.5.50. In 5.5.49 it is completely off the radar (< 0.01% hits vs. 19.22% in 5.5.50) call graph from perf report: # Children Self Command Shared Object # ........ ........ ....... ................... .............................................................................. # 19.22% 19.20% mysqld mysqld [.] decimal_cmp | --- decimal_cmp | |--93.14%-- _ZN17Item_func_between7val_intEv | _ZL20evaluate_join_recordP4JOINP13st_join_tablei | _Z10sub_selectP4JOINP13st_join_tableb | _ZL9do_selectP4JOINP4ListI4ItemEP5TABLEP9Procedure | _ZN4JOIN4execEv | _Z13handle_selectP3THDP3LEXP13select_resultm | _ZL21execute_sqlcom_selectP3THDP10TABLE_LIST | _Z21mysql_execute_commandP3THD | _Z11mysql_parseP3THDPcjP12Parser_state | _Z16dispatch_command19enum_server_commandP3THDPcj | _Z24do_handle_one_connectionP3THD | handle_one_connection | start_thread | --6.86%-- _ZL20evaluate_join_recordP4JOINP13st_join_tablei _Z10sub_selectP4JOINP13st_join_tableb _ZL9do_selectP4JOINP4ListI4ItemEP5TABLEP9Procedure _ZN4JOIN4execEv _Z13handle_selectP3THDP3LEXP13select_resultm _ZL21execute_sqlcom_selectP3THDP10TABLE_LIST _Z21mysql_execute_commandP3THD _Z11mysql_parseP3THDPcjP12Parser_state _Z16dispatch_command19enum_server_commandP3THDPcj _Z24do_handle_one_connectionP3THD handle_one_connection start_thread
            axel Axel Schwenke added a comment -

            bisecting commits in the 5.5 branch finds this:

            3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830 is the first bad commit
            commit 3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830
            Author: Alexander Barkov <bar@mariadb.org>
            Date:   Fri Apr 22 16:04:20 2016 +0400
             
                MDEV-9372 select 100 between 1 and 9223372036854775808 returns false
                
                Integer comparison of INT expressions with different signess in BETWEEN
                is not safe. Switching to DECIMAL comparison in case if INT arguments
                have different signess.
            

            axel Axel Schwenke added a comment - bisecting commits in the 5.5 branch finds this: 3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830 is the first bad commit commit 3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830 Author: Alexander Barkov <bar@mariadb.org> Date: Fri Apr 22 16:04:20 2016 +0400   MDEV-9372 select 100 between 1 and 9223372036854775808 returns false Integer comparison of INT expressions with different signess in BETWEEN is not safe. Switching to DECIMAL comparison in case if INT arguments have different signess.
            axel Axel Schwenke added a comment -

            test case in pure SQL:

            drop procedure if exists 17698_create;
            drop procedure if exists 17698_run;
             
            delimiter !!
             
            create procedure 17698_create()
            begin
              drop table if exists t1;
              create table t1 (c1 int unsigned primary key, c2 int unsigned) engine memory;
              set @i=0;
              start transaction;
              while (@i<100000) do
                insert into t1 values (@i, 1000*rand());
                set @i=@i+1;
              end while;
              commit;
            end !!
             
            create procedure 17698_run()
            begin
              declare j,k int;
              set @i=1000;
              while (@i>0) do
                set j=99900*rand();
                set k=j+99;
                select sum(c2) into @dummy from t1 where c1 between j and k;
                set @i=@i-1;
              end while;
            end !!
             
            delimiter ;
             
            call 17698_create();
            call 17698_run();
            call 17698_run();
            call 17698_run();
            

            On a good release the 17698_run() procedure needs ~4 seconds, on a bad one ~7 seconds. If the engine is changed to InnoDB and number of selects increased to 100000, it's ~6 seconds each. There seems to be a very small regression of ~0.1 seconds, but in any case it's much smaller.

            Note: firstly I used variables @j and @k in the procedure but I had to switch to declared INT variables j and k to make the effect visible. In the original sysbench range query the bounds for between are both literals and strictly non-negative. I wonder how the signedness of such constants is established?

            axel Axel Schwenke added a comment - test case in pure SQL: drop procedure if exists 17698_create; drop procedure if exists 17698_run;   delimiter !!   create procedure 17698_create() begin drop table if exists t1; create table t1 (c1 int unsigned primary key, c2 int unsigned) engine memory; set @i=0; start transaction; while (@i<100000) do insert into t1 values (@i, 1000*rand()); set @i=@i+1; end while; commit; end !!   create procedure 17698_run() begin declare j,k int; set @i=1000; while (@i>0) do set j=99900*rand(); set k=j+99; select sum(c2) into @dummy from t1 where c1 between j and k; set @i=@i-1; end while; end !!   delimiter ;   call 17698_create(); call 17698_run(); call 17698_run(); call 17698_run(); On a good release the 17698_run() procedure needs ~4 seconds, on a bad one ~7 seconds. If the engine is changed to InnoDB and number of selects increased to 100000, it's ~6 seconds each. There seems to be a very small regression of ~0.1 seconds, but in any case it's much smaller. Note: firstly I used variables @j and @k in the procedure but I had to switch to declared INT variables j and k to make the effect visible. In the original sysbench range query the bounds for between are both literals and strictly non-negative. I wonder how the signedness of such constants is established?
            bar Alexander Barkov added a comment - - edited

            axel do you know why _oltp-memory-multi uses a mixture of signed int and unsigned int in BETWEEN?

            This change in procedure 17698_run:

            -  declare j,k int;
            +  declare j,k int unsigned;
            

            fixes the problem.

            The performance degradation caused by MDEV-9372 can be fixed quite easy in 10.3, so a mixture of signed and unsigned integers will compare both:

            Doing this optimization in versions 10.2 or earlier is not desirable.

            bar Alexander Barkov added a comment - - edited axel do you know why _oltp-memory-multi uses a mixture of signed int and unsigned int in BETWEEN? This change in procedure 17698_run: - declare j,k int; + declare j,k int unsigned; fixes the problem. The performance degradation caused by MDEV-9372 can be fixed quite easy in 10.3, so a mixture of signed and unsigned integers will compare both: correctly (like in after- MDEV-9372 releases) fast (like in pre- MDEV-9372 releases) Doing this optimization in versions 10.2 or earlier is not desirable.
            bar Alexander Barkov added a comment - - edited

            If I change the table t1 to use BTREE index (instead of the default HASH index):

            drop procedure if exists 17698_create;
            delimiter !!
            create procedure 17698_create()
            begin
              drop table if exists t1;
              create table t1 (
                c1 int,
                c2 int unsigned,
                unique key (c1) type BTREE
              ) engine memory;
              set @i=0;
              start transaction;
              while (@i<100000) do
                insert into t1 values (@i, 1000*rand());
                set @i=@i+1;
              end while;
              commit;
            end !!
            delimiter ;
            call 17698_create();
            

            "call 17698_run()" becomes around 100 times faster.

            bar Alexander Barkov added a comment - - edited If I change the table t1 to use BTREE index (instead of the default HASH index): drop procedure if exists 17698_create; delimiter !! create procedure 17698_create() begin drop table if exists t1; create table t1 ( c1 int , c2 int unsigned, unique key (c1) type BTREE ) engine memory; set @i=0; start transaction ; while (@i<100000) do insert into t1 values (@i, 1000*rand()); set @i=@i+1; end while; commit ; end !! delimiter ; call 17698_create(); "call 17698_run()" becomes around 100 times faster.
            axel Axel Schwenke added a comment -

            commit 4576942 fixes the regression in MariaDB 10.3:

            # data set 05 -> 10.3.10
            # data set 06 -> 10.3 + MDEV-17698
             
            # workload #2 (simple ranges)
            #thd    05      06
            1       107.91  201.24
            8       753.60  1227.2
             
            # workload #3 (distinct ranges)
            #thd    05      06
            1       106.71  195.30
            8       750.05  1138.2
             
            # workload #4 (sum ranges)
            #thd    05      06
            1       108.39  201.84
            8       691.30  1153.5
             
            # workload #5 (order ranges)
            #thd    05      06
            1       106.18  197.18
            8       749.11  1121.6
            

            axel Axel Schwenke added a comment - commit 4576942 fixes the regression in MariaDB 10.3: # data set 05 -> 10.3.10 # data set 06 -> 10.3 + MDEV-17698   # workload #2 (simple ranges) #thd 05 06 1 107.91 201.24 8 753.60 1227.2   # workload #3 (distinct ranges) #thd 05 06 1 106.71 195.30 8 750.05 1138.2   # workload #4 (sum ranges) #thd 05 06 1 108.39 201.84 8 691.30 1153.5   # workload #5 (order ranges) #thd 05 06 1 106.18 197.18 8 749.11 1121.6
            axel Axel Schwenke added a comment -

            commit 7f17559 fixes the regression in MariaDB 5.5:

            # data set 01 -> mariadb-5.5.49
            # data set 02 -> mariadb-5.5.50
            # data set 07 -> 5.5 + MDEV-17698
             
            # workload #2 (simple ranges)
            #thd    01      02      07
            1       206.82  123.23  197.89
            8       1264.4  743.30  1126.9
             
            # workload #3 (distinct ranges)
            #thd    01      02      07
            1       212.71  125.87  197.55
            8       1166.0  763.33  1186.8
             
            # workload #4 (sum ranges)
            #thd    01      02      07
            1       214.86  125.14  204.37
            8       1246.8  873.41  1134.1
             
            # workload #5 (order ranges)
            #thd    01      02      07
            1       231.91  132.73  214.34
            8       1268.0  904.04  1229.9
            

            axel Axel Schwenke added a comment - commit 7f17559 fixes the regression in MariaDB 5.5: # data set 01 -> mariadb-5.5.49 # data set 02 -> mariadb-5.5.50 # data set 07 -> 5.5 + MDEV-17698   # workload #2 (simple ranges) #thd 01 02 07 1 206.82 123.23 197.89 8 1264.4 743.30 1126.9   # workload #3 (distinct ranges) #thd 01 02 07 1 212.71 125.87 197.55 8 1166.0 763.33 1186.8   # workload #4 (sum ranges) #thd 01 02 07 1 214.86 125.14 204.37 8 1246.8 873.41 1134.1   # workload #5 (order ranges) #thd 01 02 07 1 231.91 132.73 214.34 8 1268.0 904.04 1229.9

            People

              bar Alexander Barkov
              axel Axel Schwenke
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.