[MDEV-4410] update does not want to use a covering index, but select uses it. Created: 2013-04-22  Updated: 2014-03-17  Resolved: 2014-03-17

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.1, 5.5.30, 5.3.12
Fix Version/s: 5.5.37, 10.0.10, 5.3.13

Type: Bug Priority: Major
Reporter: Oleg (Inactive) Assignee: Sergei Petrunia
Resolution: Fixed Votes: 1
Labels: optimizer, upstream
Environment:

innodb_version 5.5.30-MariaDB-30.1
version 5.5.30-MariaDB-log
version_comment MariaDB Server
version_compile_machine x86_64
version_compile_os Linux



 Description   

Good day!

I found a problem and some strange behavior of UPDATE.

There is a table:

CREATE TABLE `schedule_test` (
   `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
   `datetime_processed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
   `datetime_next` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
   `sign` bigint(20) unsigned NOT NULL DEFAULT '0',
   `lock` tinyint(4) NOT NULL DEFAULT '0',
   PRIMARY KEY (`id`),
   KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`)
) ENGINE=InnoDB;

It's a simple task scheduler, and the table has about 45k rows.

Tasks are selected this way:

update schedule_test as sa 
set 
     sa.`lock` = 1, 
     sa.sign = 2123123123,
     sa.`datetime_processed` = now()
where 
       sa.`lock` = 0 
and sa.`datetime_next` <= now()
order by sa.`datetime_next`  asc
limit 10;

The problem is that the UPDATE does not use the `idx_lock_dt_nxt` index and MySQL does a full table scan (reading all 45k rows).
I watched the read-row counters in innotop.

But if the UPDATE is rewritten as a SELECT, it works as it should: the index is used and InnoDB reads only 10 rows from the table.

select sa.`lock`, sign, datetime_next
from  schedule_test as sa 
where 
       sa.`lock` = 0 
and sa.`datetime_next` <= now()
order by sa.`datetime_next`  asc
limit 10;

EXPLAIN for this query:

id  select_type  table  type  possible_keys  key   key_len  ref    rows   Extra
1   SIMPLE       sa     ref   lock           lock  1        const  22716  Using index condition; Using where

This behaviour is hard for me to understand; as far as I know, SELECT and UPDATE should use the same execution plan.

As an experiment, I modified the index to KEY `idx_lock_dt_nxt` (`datetime_next`).

In this case the UPDATE starts using it to select rows by date, but I keep in mind that this still works like a full scan, only slightly softer:
after some time all rows with the minimal datetime will be locked (lock=1), and MySQL has to scan more and more rows to check the WHERE condition.
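
The experiment described above can be reproduced with a plain ALTER TABLE; this is a sketch against the schema from this report (the index name is kept from the original):

```sql
-- Replace the composite index with a single-column one,
-- as described in the experiment above.
ALTER TABLE schedule_test
    DROP KEY idx_lock_dt_nxt,
    ADD KEY idx_lock_dt_nxt (datetime_next);
```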

I think there may be an error in choosing a good execution plan for UPDATE?



 Comments   
Comment by Elena Stepanova [ 2013-04-22 ]

Test case:

CREATE TABLE `schedule_test` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`datetime_processed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`datetime_next` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`sign` bigint(20) unsigned NOT NULL DEFAULT '0',
`lock` tinyint(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`)
) ENGINE=InnoDB;

INSERT INTO schedule_test (datetime_processed,sign,`lock`) VALUES (NOW(),ROUND(RAND()*10000),ROUND(RAND()));
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;

FLUSH STATUS;

select sa.`lock`, sign, datetime_next
from schedule_test as sa
where
sa.`lock` = 0
and sa.`datetime_next` <= now()
order by sa.`datetime_next` asc
limit 10;

SHOW STATUS LIKE 'Handler_read%';

FLUSH STATUS;

update schedule_test as sa
set
sa.`lock` = 1,
sa.sign = 2123123123,
sa.`datetime_processed` = now()
where
sa.`lock` = 0
and sa.`datetime_next` <= now()
order by sa.`datetime_next` asc
limit 10;

SHOW STATUS LIKE 'Handler_read%';

DROP TABLE schedule_test;

Comment by Elena Stepanova [ 2013-04-22 ]

Output:

select sa.`lock`, sign, datetime_next
from schedule_test as sa
where
sa.`lock` = 0
and sa.`datetime_next` <= now()
order by sa.`datetime_next` asc
limit 10;
lock sign datetime_next
0 5802 0000-00-00 00:00:00
0 320 0000-00-00 00:00:00
0 1416 0000-00-00 00:00:00
0 1490 0000-00-00 00:00:00
0 2185 0000-00-00 00:00:00
0 8730 0000-00-00 00:00:00
0 2423 0000-00-00 00:00:00
0 7466 0000-00-00 00:00:00
0 7506 0000-00-00 00:00:00
0 7339 0000-00-00 00:00:00

SHOW STATUS LIKE 'Handler_read%';
Variable_name Value
Handler_read_first 0
Handler_read_key 1
Handler_read_last 0
Handler_read_next 9
Handler_read_prev 0
Handler_read_rnd 0
Handler_read_rnd_deleted 0
Handler_read_rnd_next 0
FLUSH STATUS;
update schedule_test as sa
set
sa.`lock` = 1,
sa.sign = 2123123123,
sa.`datetime_processed` = now()
where
sa.`lock` = 0
and sa.`datetime_next` <= now()
order by sa.`datetime_next` asc
limit 10;
SHOW STATUS LIKE 'Handler_read%';
Variable_name Value
Handler_read_first 0
Handler_read_key 1
Handler_read_last 0
Handler_read_next 2084
Handler_read_prev 0
Handler_read_rnd 10
Handler_read_rnd_deleted 0
Handler_read_rnd_next 0

Comment by Elena Stepanova [ 2013-04-22 ]

Also reproducible on mysql-5.6

Comment by Patryk Pomykalski [ 2013-12-18 ]

I don't know if it's documented anywhere, but a particular index will not be used if columns of that index are updated. I think a workaround is to update via the primary key column and a subquery:
update table set ... where id in (select * from (select id from table where ...))
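
Applied to the table from this report, the workaround might look like the following sketch. The extra derived-table level is needed because MySQL/MariaDB do not allow selecting directly from the table being updated (and an IN-subquery cannot carry a LIMIT); the values set here just mirror the original query:

```sql
-- Hypothetical sketch of the workaround for schedule_test:
-- select the target ids first, then update by primary key,
-- so idx_lock_dt_nxt is only touched for reading.
update schedule_test
set `lock` = 1,
    sign = 2123123123,
    datetime_processed = now()
where id in (
    select id from (
        select id
        from schedule_test
        where `lock` = 0
          and datetime_next <= now()
        order by datetime_next asc
        limit 10
    ) as tmp
);
```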

Comment by M Kellogg [ 2014-03-06 ]

This issue seems rather serious and could cause extreme performance degradation.

This really needs to be upgraded to "critical" IMO.

Comment by Sergei Petrunia [ 2014-03-15 ]

The problem is still repeatable on 10.0 and mysql-5.6

Comment by Sergei Petrunia [ 2014-03-15 ]

Investigated how the UPDATE query is optimized/executed.

The table has an index:

KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`)

The WHERE clause is:

sa.`lock` = 0 and sa.`datetime_next` <= now()

It produces a range access over the index.

The ORDER BY is:

order by sa.`datetime_next` asc
limit 10;

The optimizer is able to figure out that range access over idx_lock_dt_nxt produces rows in the order that matches the ORDER BY.

Comment by Sergei Petrunia [ 2014-03-17 ]

The optimizer will also see that used_key_is_modified=true (which means updates can't be done on the fly).

And then a wrong decision is made: instead of "Using buffer", the optimizer picks "Using filesort". This causes it to scan more rows than necessary.

Comment by Sergei Petrunia [ 2014-03-17 ]

Another problem: EXPLAIN shows rows = {number of rows scanned by the range scan}, not {number of rows scanned before we reach the LIMIT} (let's call this the "rows problem").
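
Both behaviours can be inspected directly once EXPLAIN UPDATE is available (MariaDB 10.0 / MySQL 5.6); a sketch against the table from this report:

```sql
-- EXPLAIN UPDATE shows the chosen access method and the rows
-- estimate discussed above (currently the full range-scan
-- estimate, not the LIMIT-bounded one).
EXPLAIN
UPDATE schedule_test AS sa
SET sa.`lock` = 1
WHERE sa.`lock` = 0
  AND sa.datetime_next <= NOW()
ORDER BY sa.datetime_next ASC
LIMIT 10;
```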

Comment by Sergei Petrunia [ 2014-03-17 ]

For 10.0, I'll fix the query plan problem but not the "rows problem". Fixing the query plan problem is safe; fixing the "rows problem" could potentially have some implications.

Comment by Sergei Petrunia [ 2014-03-17 ]

Pushed a fix into 10.0. #rows in EXPLAIN UPDATE will be addressed separately.

Comment by Sergei Petrunia [ 2014-03-17 ]

The #rows will be addressed in MDEV-5884.

Comment by Sergei Petrunia [ 2014-03-17 ]

Pushed the fix into MariaDB 10.0 tree.

boa, rebelde - thanks for your input, it is much appreciated.

Generated at Thu Feb 08 06:56:12 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.