The regression benchmarks found a regression for mixed read/write workload and MEMORY tables. The following shows the last two good and the first two bad releases for each major version of the MariaDB server. From 10.2 onward there are no good releases any more.
Test 't_oltp-memory-multi' - sysbench OLTP read/write
Now while it is clear that a HASH index is no much help for a range query, it still shouldn't be that much slower for no apparent reason.
Axel Schwenke
added a comment - Narrowed it down to being range queries:
# data set 01 -> mariadb-5.5.49
# data set 02 -> mariadb-5.5.50
# workload #1 (point selects)
#thd 01 02
1 29724 28616
8 201414 200596
# workload #2 (simple ranges)
#thd 01 02
1 206.82 123.23
8 1264.4 743.30
# workload #3 (distinct ranges)
#thd 01 02
1 212.71 125.87
8 1166.0 763.33
# workload #4 (sum ranges)
#thd 01 02
1 214.86 125.14
8 1246.8 873.41
# workload #5 (order ranges)
#thd 01 02
1 231.91 132.73
8 1268.0 904.04
# workload #6 (non-index updates)
#thd 01 02
1 30755 29736
8 177581 181791
# workload #7 (index updates)
#thd 01 02
1 31660 31231
8 181892 186362
Now while it is clear that a HASH index is no much help for a range query, it still shouldn't be that much slower for no apparent reason.
3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830 is the first bad commit
commit 3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830
Author: Alexander Barkov <bar@mariadb.org>
Date: Fri Apr 22 16:04:20 2016 +0400
MDEV-9372 select 100 between 1 and 9223372036854775808 returns false
Integer comparison of INT expressions with different signess in BETWEEN
is not safe. Switching to DECIMAL comparison in case if INT arguments
have different signess.
Axel Schwenke
added a comment - bisecting commits in the 5.5 branch finds this:
3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830 is the first bad commit
commit 3f0d07e55b0a2ab8c7bdc1209d55ec2f66542830
Author: Alexander Barkov <bar@mariadb.org>
Date: Fri Apr 22 16:04:20 2016 +0400
MDEV-9372 select 100 between 1 and 9223372036854775808 returns false
Integer comparison of INT expressions with different signess in BETWEEN
is not safe. Switching to DECIMAL comparison in case if INT arguments
have different signess.
create table t1 (c1 int unsigned primary key, c2 int unsigned) engine memory;
set @i=0;
start transaction;
while (@i<100000) do
insert into t1 values (@i, 1000*rand());
set @i=@i+1;
end while;
commit;
end !!
create procedure 17698_run()
begin
declare j,k int;
set @i=1000;
while (@i>0) do
set j=99900*rand();
set k=j+99;
select sum(c2) into @dummy from t1 where c1 between j and k;
set @i=@i-1;
end while;
end !!
delimiter ;
call 17698_create();
call 17698_run();
call 17698_run();
call 17698_run();
On a good release the 17698_run() procedure needs ~4 seconds, on a bad one ~7 seconds. If the engine is changed to InnoDB and number of selects increased to 100000, it's ~6 seconds each. There seems to be a very small regression of ~0.1 seconds, but in any case it's much smaller.
Note: firstly I used variables @j and @k in the procedure but I had to switch to declared INT variables j and k to make the effect visible. In the original sysbench range query the bounds for between are both literals and strictly non-negative. I wonder how the signedness of such constants is established?
Axel Schwenke
added a comment - test case in pure SQL:
drop procedure if exists 17698_create;
drop procedure if exists 17698_run;
delimiter !!
create procedure 17698_create()
begin
drop table if exists t1;
create table t1 (c1 int unsigned primary key, c2 int unsigned) engine memory;
set @i=0;
start transaction;
while (@i<100000) do
insert into t1 values (@i, 1000*rand());
set @i=@i+1;
end while;
commit;
end !!
create procedure 17698_run()
begin
declare j,k int;
set @i=1000;
while (@i>0) do
set j=99900*rand();
set k=j+99;
select sum(c2) into @dummy from t1 where c1 between j and k;
set @i=@i-1;
end while;
end !!
delimiter ;
call 17698_create();
call 17698_run();
call 17698_run();
call 17698_run();
On a good release the 17698_run() procedure needs ~4 seconds, on a bad one ~7 seconds. If the engine is changed to InnoDB and number of selects increased to 100000, it's ~6 seconds each. There seems to be a very small regression of ~0.1 seconds, but in any case it's much smaller.
Note: firstly I used variables @j and @k in the procedure but I had to switch to declared INT variables j and k to make the effect visible. In the original sysbench range query the bounds for between are both literals and strictly non-negative. I wonder how the signedness of such constants is established?
Doing this optimization in versions 10.2 or earlier is not desirable.
Alexander Barkov
added a comment - - edited axel do you know why _oltp-memory-multi uses a mixture of signed int and unsigned int in BETWEEN?
This change in procedure 17698_run:
- declare j,k int;
+ declare j,k int unsigned;
fixes the problem.
The performance degradation caused by MDEV-9372 can be fixed quite easy in 10.3, so a mixture of signed and unsigned integers will compare both:
correctly (like in after- MDEV-9372 releases)
fast (like in pre- MDEV-9372 releases)
Doing this optimization in versions 10.2 or earlier is not desirable.
If I change the table t1 to use BTREE index (instead of the default HASH index):
dropprocedureif exists 17698_create;
delimiter !!
createprocedure 17698_create()
begin
droptableif exists t1;
createtable t1 (
c1 int,
c2 int unsigned,
uniquekey (c1) type BTREE
) engine memory;
set @i=0;
start transaction;
while (@i<100000) do
insertinto t1 values (@i, 1000*rand());
set @i=@i+1;
end while;
commit;
end !!
delimiter ;
call 17698_create();
"call 17698_run()" becomes around 100 times faster.
Alexander Barkov
added a comment - - edited If I change the table t1 to use BTREE index (instead of the default HASH index):
drop procedure if exists 17698_create;
delimiter !!
create procedure 17698_create()
begin
drop table if exists t1;
create table t1 (
c1 int ,
c2 int unsigned,
unique key (c1) type BTREE
) engine memory;
set @i=0;
start transaction ;
while (@i<100000) do
insert into t1 values (@i, 1000*rand());
set @i=@i+1;
end while;
commit ;
end !!
delimiter ;
call 17698_create();
"call 17698_run()" becomes around 100 times faster.
Narrowed it down to being range queries:
# data set 01 -> mariadb-5.5.49
# data set 02 -> mariadb-5.5.50
# workload #1 (point selects)
#thd 01 02
1 29724 28616
8 201414 200596
# workload #2 (simple ranges)
#thd 01 02
1 206.82 123.23
8 1264.4 743.30
# workload #3 (distinct ranges)
#thd 01 02
1 212.71 125.87
8 1166.0 763.33
# workload #4 (sum ranges)
#thd 01 02
1 214.86 125.14
8 1246.8 873.41
# workload #5 (order ranges)
#thd 01 02
1 231.91 132.73
8 1268.0 904.04
# workload #6 (non-index updates)
#thd 01 02
1 30755 29736
8 177581 181791
# workload #7 (index updates)
#thd 01 02
1 31660 31231
8 181892 186362
Now while it is clear that a HASH index is no much help for a range query, it still shouldn't be that much slower for no apparent reason.