[MDEV-24140] connect performance regression Created: 2020-11-05  Updated: 2020-11-05  Resolved: 2020-11-05

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.1.47, 10.1.48, 10.2.34, 10.2.35, 10.3.25, 10.3.26, 10.4.15, 10.4.16, 10.5.6, 10.5.7
Fix Version/s: N/A

Type: Bug
Priority: Critical
Reporter: Axel Schwenke
Assignee: Axel Schwenke
Resolution: Not a Bug
Votes: 0
Labels: regression


 Description   

The performance regression test suite shows a significant drop in the number of "connect; SELECT 1; disconnect" cycles per second that the server sustains.

It affects the last two releases in each of the 10.1 to 10.5 branches. The table below shows the last four releases per branch (two good, two bad).

Test 't_connect-TC=default' - connect test, default thread cache size
workload: connect; SELECT 1 FROM dual; disconnect
numbers are connects per second
 
#thread count           1       8       16      32      64      128     256
mariadb-10.1.45         2352.8  17846   19485   19080   19185   19530   18095
mariadb-10.1.46         2338.2  17689   19723   19509   19097   18876   17951
mariadb-10.1.47         2222.9  16003   18200   17919   17533   18101   16975
mariadb-10.1.48         2222.1  16045   17830   17458   17415   17616   16809
 
mariadb-10.2.32         6324.0  36137   84842   106303  103993  104089  81115
mariadb-10.2.33         5632.8  35125   84811   106135  105619  105257  81129
mariadb-10.2.34         5904.1  36200   75119   77941   77911   78606   69920
mariadb-10.2.35-pre     6423.6  35748   74842   77825   76804   77706   69122
 
mariadb-10.3.23         6076.1  35412   85235   107689  104335  108150  82766
mariadb-10.3.24         6033.5  35755   86277   108305  107284  105296  81852
mariadb-10.3.25         6087.5  35786   75976   78890   78415   78507   70032
mariadb-10.3.26-pre     6069.6  35561   76046   79188   78527   78440   70107
 
mariadb-10.4.13         5913.2  35183   83286   106969  106577  107517  82435
mariadb-10.4.14         5856.8  34662   82742   105578  105473  106174  81587
mariadb-10.4.15         5995.8  34949   74532   77821   77096   77449   69387
mariadb-10.4.16-pre     5595.0  34511   74461   77918   78059   77830   70050
 
mariadb-10.5.4          6673.0  36643   103875  129540  132678  96475   98152
mariadb-10.5.5          5694.4  36919   103683  130704  133107  101547  100085
mariadb-10.5.6          6162.1  36409   97850   116103  121092  106306  99883
mariadb-10.5.7-pre      5609.8  36988   97164   118628  121102  109606  93927
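
For illustration only, a minimal sketch of how a "cycles per second at N threads" figure like the ones above could be measured. This is not the regression suite's actual harness: the function name `measure_cycles_per_second` and the placeholder `fake_cycle` are hypothetical, and a real run would replace the placeholder with a client call that performs connect; SELECT 1 FROM dual; disconnect.

```python
import threading
import time


def measure_cycles_per_second(cycle, threads, duration_s):
    """Run `cycle` in `threads` worker threads for roughly `duration_s`
    seconds and return the number of completed cycles per second."""
    counts = [0] * threads          # one counter per worker, no sharing
    stop = threading.Event()

    def worker(idx):
        while not stop.is_set():
            cycle()
            counts[idx] += 1

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    start = time.perf_counter()
    for w in workers:
        w.start()
    time.sleep(duration_s)
    stop.set()
    for w in workers:
        w.join()
    elapsed = time.perf_counter() - start
    return sum(counts) / elapsed


def fake_cycle():
    # Hypothetical placeholder for: connect; SELECT 1 FROM dual; disconnect
    time.sleep(0.001)


if __name__ == "__main__":
    for n in (1, 8):
        rate = measure_cycles_per_second(fake_cycle, n, 0.5)
        print(f"{n:>3} threads: {rate:.1f} cycles/s")
```

Python threads are used here only to keep the sketch short; a load generator for a real server would more likely be a native client such as sysbench.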



 Comments   
Comment by Axel Schwenke [ 2020-11-05 ]

OK. This regression is bogus. It is caused not by a change in the server code, but by a change in the test system. I have rerun the test for the latest good release in each branch. Below, the old result is flagged ".bak" and the one without ".bak" is from the second run.

#thread count           1       8       16      32      64      128     256
mariadb-10.1.46.bak     2338.2  17689   19723   19509   19097   18876   17951
mariadb-10.1.46         2218.2  15967   18044   17875   17941   17768   17122
mariadb-10.1.47         2222.9  16003   18200   17919   17533   18101   16975
mariadb-10.1.48         2222.1  16045   17830   17458   17415   17616   16809
 
mariadb-10.2.33.bak     5632.8  35125   84811   106135  105619  105257  81129
mariadb-10.2.33         6170.3  34484   75260   77673   77698   77599   69527
mariadb-10.2.34         5904.1  36200   75119   77941   77911   78606   69920
mariadb-10.2.35         6360.3  36381   74971   77379   77026   77170   69137
 
mariadb-10.3.24.bak     6033.5  35755   86277   108305  107284  105296  81852
mariadb-10.3.24         5929.9  35806   76284   78907   78724   79283   71101
mariadb-10.3.25         5982.4  35527   75839   78659   78432   78576   70475
mariadb-10.3.26         5960.2  36573   75883   78852   78372   78385   70697
 
mariadb-10.4.14.bak     5856.8  34662   82742   105578  105473  106174  81587
mariadb-10.4.14         5541.4  35331   74602   77749   77969   77954   70299
mariadb-10.4.15         5995.8  34949   74532   77821   77096   77449   69387
mariadb-10.4.16         6201.4  34665   74700   77877   77458   77428   69629
 
mariadb-10.5.5.bak      5694.4  36919   103683  130704  133107  101547  100085
mariadb-10.5.5          5350.9  36761   96610   117424  119512  104814  95839
mariadb-10.5.6          6162.1  36409   97850   116103  121092  106306  99883
mariadb-10.5.7-pre      5609.8  36988   97164   118628  121102  109606  93927
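
As a worked example, the relative change between the old and the new run of the same release can be computed directly from the rows above. The sketch below uses the mariadb-10.2.33.bak and mariadb-10.2.33 rows; the helper name `relative_change_percent` is made up for this illustration.

```python
# Connects per second, copied from the table above.
THREADS = [1, 8, 16, 32, 64, 128, 256]
before = [5632.8, 35125, 84811, 106135, 105619, 105257, 81129]  # mariadb-10.2.33.bak
after = [6170.3, 34484, 75260, 77673, 77698, 77599, 69527]      # mariadb-10.2.33 (rerun)


def relative_change_percent(old, new):
    """Return the per-column change from `old` to `new` in percent."""
    return [(n - o) / o * 100.0 for o, n in zip(old, new)]


changes = relative_change_percent(before, after)
for t, c in zip(THREADS, changes):
    print(f"{t:>4} threads: {c:+.1f}%")
```

At 32 threads the same binary goes from 106135 to 77673 connects per second, a change of about -26.8%, which is the same size as the drop first attributed to 10.2.34.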

The test machine was rebooted between the "good" and "bad" results. It also had its RAM replaced (hardware defect). Very possibly it booted with a newer kernel. It is now running 4.15.0-112-generic. The previous kernel was at most 4.15.0-51-generic, possibly the much older 4.4.0-145-generic.

The observed regression is more prominent for test cases that do more context switches (point select on MEMORY tables, connect). The new kernel probably contains more mitigations for not-so-recent vulnerabilities in Intel CPUs (Spectre, Meltdown etc.) that make context switches more expensive.
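
A minimal sketch, assuming a Linux host, of how the kernel release and the CPU vulnerability mitigation status could be recorded next to each benchmark run, so that an environment change like this one shows up in the stored results. The function name `environment_snapshot` is hypothetical. The sysfs directory `/sys/devices/system/cpu/vulnerabilities` is provided by kernels that report mitigation status; on hosts without it the mitigation map simply stays empty.

```python
import platform
from pathlib import Path

# Kernel-provided status files, one per known CPU vulnerability.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def environment_snapshot(vuln_dir=VULN_DIR):
    """Return the kernel release and per-vulnerability mitigation status."""
    snapshot = {"kernel": platform.release(), "mitigations": {}}
    if vuln_dir.is_dir():
        for entry in sorted(vuln_dir.iterdir()):
            try:
                snapshot["mitigations"][entry.name] = entry.read_text().strip()
            except OSError:
                snapshot["mitigations"][entry.name] = "unreadable"
    return snapshot


if __name__ == "__main__":
    snap = environment_snapshot()
    print("kernel:", snap["kernel"])
    for name, status in snap["mitigations"].items():
        print(f"{name}: {status}")
```

Comparing two such snapshots before trusting a cross-release comparison would have flagged the kernel change immediately.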

Comment by Axel Schwenke [ 2020-11-05 ]

I have now also rerun the point-select test case on MEMORY tables. Again I see slower results for the last "good" release in the current environment. In other words: this regression is again caused by changes in the test system, not in the tested code.

Test 't_1K-reads-memory-multi' - sysbench OLTP readonly
1000 point selects per iteration, no range queries
20 tables, 1 million rows total, engine MEMORY
numbers are queries per second
 
#thread count           1       8       16      32      64      128     256
mariadb-10.1.46.bak     22674   161391  248894  419076  412670  408980  411369
mariadb-10.1.46         22809   158438  240798  404404  400018  397837  399455
mariadb-10.1.47         24110   166113  243503  402993  400759  399484  398667
mariadb-10.1.48         23432   157145  240082  396930  398292  394489  398342
 
mariadb-10.2.33.bak     20348   149681  237102  402713  400152  399270  397628
mariadb-10.2.33         23130   152394  233788  391468  390511  386418  388688
mariadb-10.2.34         22879   150536  232772  390538  387388  387140  384545
mariadb-10.2.35         22417   150716  233545  391137  389270  386059  386935
 
mariadb-10.3.24.bak     21645   146903  233067  394226  389591  390952  389096
mariadb-10.3.24         22242   149264  226557  375837  375044  374895  374155
mariadb-10.3.25         21085   143859  224819  375037  374603  373816  372040
mariadb-10.3.26         21528   145186  224912  374538  373189  372417  371196
 
mariadb-10.4.14.bak     18949   137929  222598  370316  370741  372015  368066
mariadb-10.4.14         20244   135180  213022  356376  355445  356116  354143
mariadb-10.4.15         20189   135525  213835  358731  353810  355944  353213
mariadb-10.4.16         20025   136277  212890  359120  354785  356021  353390
 
mariadb-10.5.5.bak      20207   139597  222369  369957  369435  369814  367771
mariadb-10.5.5          20771   139362  216183  359823  357807  356979  354691
mariadb-10.5.6          19721   133263  212382  355269  352901  353354  350997
mariadb-10.5.7          20481   137228  214844  356719  355523  357171  355382

Comment by Axel Schwenke [ 2020-11-05 ]

The regression was caused by changes in the test environment (new Linux kernel).
