Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4410

update does not want to use a covering index, but select uses it.

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.1, 5.5.30, 5.3.12
    • 5.5.37, 10.0.10, 5.3.13
    • None
    • innodb_version 5.5.30-MariaDB-30.1
      version 5.5.30-MariaDB-log
      version_comment MariaDB Server
      version_compile_machine x86_64
      version_compile_os Linux

    Description

      Good day!

      I found problem and strange behavior of update

      there is table

      CREATE TABLE `schedule_test` (
         `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
         `datetime_processed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
         `datetime_next` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
         `sign` bigint(20) unsigned NOT NULL DEFAULT '0',
         `lock` tinyint(4) NOT NULL DEFAULT '0',
         PRIMARY KEY (`id`),
         KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`)
      ) ENGINE=InnoDB;

      it's a simple task scheduler and table has near 45k rows

      Task selecting doing in this way:

      update schedule_test as sa 
      set 
           sa.`lock` = 1, 
           sa.sign = 2123123123,
           sa.`datetime_processed` = now()
      where 
             sa.`lock` = 0 
      and sa.`datetime_next` <= now()
      order by sa.`datetime_next`  asc
      limit 10;

      problem is that update doesnt use index `idx_lock_dt_nxt` for selecting and mysql does table fullscan (read all 45k).
      I look in innotop count readed rows.

      But if rewrite update to select, it works as should - use undex and innodb read from table only 10 rows.

      select sa.`lock`, sign, datetime_next
      from  schedule_test as sa 
      where 
             sa.`lock` = 0 
      and sa.`datetime_next` <= now()
      order by sa.`datetime_next`  asc
      limit 10;

      explain of this:

      id    select_type    table    type    possible_keys    key    key_len    ref    rows    Extra
      1    SIMPLE    sa    ref    lock    lock    1    const    22716    Using index condition; Using where

      for me is not understandable nature this behaviour. as i know select and update should use the same execution plan.

      As an experiment, i modify index and make it so KEY `idx_lock_dt_nxt` (`datetime_next`)

      in this case update begins using it for select rows by date, but i keep in my mind, it works like full scan but little softer, anyway.
      because after some time all rows with minimal datetime will be locked (lock=1) and mysql have to scan more and more rows for checking where condition.

      i think may be there is an error in choosing good query plan execution for update?

      Attachments

        Activity

          boa Oleg (Inactive) created issue -
          elenst Elena Stepanova made changes -
          Field Original Value New Value
          Description Good day!

          I found problem and strange behavior of update

          there is table

          CREATE TABLE `schedule_test` (
             `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
             `datetime_processed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
             `datetime_next` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
             `sign` bigint(20) unsigned NOT NULL DEFAULT '0',
             `lock` tinyint(4) NOT NULL DEFAULT '0',
             PRIMARY KEY (`id`),
             KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`)
          ) ENGINE=InnoDB;

          it's a simple task scheduler and table has near 45k rows

          Task selecting doing in this way:

          update schedule_test as sa
          set
               sa.`lock` = 1,
               sa.sign = 2123123123,
               sa.`datetime_processed` = now()
          where
                 sa.`lock` = 0
          and sa.`datetime_next` <= now()
          order by sa.`datetime_next` asc
          limit 10;

          problem is that update doesnt use index `idx_lock_dt_nxt` for selecting and mysql does table fullscan (read all 45k).
          I look in innotop count readed rows.

          But if rewrite update to select, it works as should - use undex and innodb read from table only 10 rows.

          select sa.`lock`, sign, datetime_next
          from schedule_test as sa
          where
                 sa.`lock` = 0
          and sa.`datetime_next` <= now()
          order by sa.`datetime_next` asc
          limit 10;


          explain of this:

          id select_type table type possible_keys key key_len ref rows Extra
          1 SIMPLE sa ref lock lock 1 const 22716 Using index condition; Using where

          for me is not understandable nature this behaviour. as i know select and update should use the same execution plan.

          As an experiment, i modify index and make it so KEY `idx_lock_dt_nxt` (`datetime_next`)

          in this case update begins using it for select rows by date, but i keep in my mind, it works like full scan but little softer, anyway.
          because after some time all rows with minimal datetime will be locked (lock=1) and mysql have to scan more and more rows for checking where condition.

          i think may be there is an error in choosing good query plan execution for update?
          Good day!

          I found problem and strange behavior of update

          there is table

          {code:sql}
          CREATE TABLE `schedule_test` (
             `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
             `datetime_processed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
             `datetime_next` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
             `sign` bigint(20) unsigned NOT NULL DEFAULT '0',
             `lock` tinyint(4) NOT NULL DEFAULT '0',
             PRIMARY KEY (`id`),
             KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`)
          ) ENGINE=InnoDB;
          {code}

          it's a simple task scheduler and table has near 45k rows

          Task selecting doing in this way:

          {code:sql}
          update schedule_test as sa
          set
               sa.`lock` = 1,
               sa.sign = 2123123123,
               sa.`datetime_processed` = now()
          where
                 sa.`lock` = 0
          and sa.`datetime_next` <= now()
          order by sa.`datetime_next` asc
          limit 10;
          {code}

          problem is that update doesnt use index `idx_lock_dt_nxt` for selecting and mysql does table fullscan (read all 45k).
          I look in innotop count readed rows.

          But if rewrite update to select, it works as should - use undex and innodb read from table only 10 rows.

          {code:sql}
          select sa.`lock`, sign, datetime_next
          from schedule_test as sa
          where
                 sa.`lock` = 0
          and sa.`datetime_next` <= now()
          order by sa.`datetime_next` asc
          limit 10;
          {code}

          explain of this:
          {noformat}
          id select_type table type possible_keys key key_len ref rows Extra
          1 SIMPLE sa ref lock lock 1 const 22716 Using index condition; Using where
          {noformat}

          for me is not understandable nature this behaviour. as i know select and update should use the same execution plan.

          As an experiment, i modify index and make it so {{KEY `idx_lock_dt_nxt` (`datetime_next`)}}

          in this case update begins using it for select rows by date, but i keep in my mind, it works like full scan but little softer, anyway.
          because after some time all rows with minimal datetime will be locked (lock=1) and mysql have to scan more and more rows for checking where condition.

          i think may be there is an error in choosing good query plan execution for update?
          Labels optimizer
          elenst Elena Stepanova made changes -
          Assignee Elena Stepanova [ elenst ]

          Test case:

          CREATE TABLE `schedule_test` (
          `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
          `datetime_processed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
          `datetime_next` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
          `sign` bigint(20) unsigned NOT NULL DEFAULT '0',
          `lock` tinyint(4) NOT NULL DEFAULT '0',
          PRIMARY KEY (`id`),
          KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`)
          ) ENGINE=InnoDB;

          INSERT INTO schedule_test (datetime_processed,sign,`lock`) VALUES (NOW(),ROUND(RAND()*10000),ROUND(RAND()));
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;
          INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test;

          FLUSH STATUS;

          select sa.`lock`, sign, datetime_next
          from schedule_test as sa
          where
          sa.`lock` = 0
          and sa.`datetime_next` <= now()
          order by sa.`datetime_next` asc
          limit 10;

          SHOW STATUS LIKE 'Handler_read%';

          FLUSH STATUS;

          update schedule_test as sa
          set
          sa.`lock` = 1,
          sa.sign = 2123123123,
          sa.`datetime_processed` = now()
          where
          sa.`lock` = 0
          and sa.`datetime_next` <= now()
          order by sa.`datetime_next` asc
          limit 10;

          SHOW STATUS LIKE 'Handler_read%';

          DROP TABLE schedule_test;

          elenst Elena Stepanova added a comment - Test case: CREATE TABLE `schedule_test` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `datetime_processed` datetime NOT NULL DEFAULT '0000-00-00 00:00:00', `datetime_next` datetime NOT NULL DEFAULT '0000-00-00 00:00:00', `sign` bigint(20) unsigned NOT NULL DEFAULT '0', `lock` tinyint(4) NOT NULL DEFAULT '0', PRIMARY KEY (`id`), KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`) ) ENGINE=InnoDB; INSERT INTO schedule_test (datetime_processed,sign,`lock`) VALUES (NOW(),ROUND(RAND()*10000),ROUND(RAND())); INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; INSERT INTO schedule_test (datetime_processed,sign,`lock`) SELECT NOW(),ROUND(RAND()*10000),ROUND(RAND()) FROM schedule_test; FLUSH STATUS; select sa.`lock`, sign, datetime_next from schedule_test as sa where sa.`lock` = 0 and sa.`datetime_next` <= now() order by sa.`datetime_next` asc limit 10; SHOW STATUS LIKE 'Handler_read%'; FLUSH STATUS; update schedule_test as sa set sa.`lock` = 1, sa.sign = 2123123123, sa.`datetime_processed` = now() where sa.`lock` = 0 and sa.`datetime_next` <= now() order by sa.`datetime_next` asc limit 10; SHOW STATUS LIKE 'Handler_read%'; DROP TABLE schedule_test;

          Output:

          select sa.`lock`, sign, datetime_next
          from schedule_test as sa
          where
          sa.`lock` = 0
          and sa.`datetime_next` <= now()
          order by sa.`datetime_next` asc
          limit 10;
          lock sign datetime_next
          0 5802 0000-00-00 00:00:00
          0 320 0000-00-00 00:00:00
          0 1416 0000-00-00 00:00:00
          0 1490 0000-00-00 00:00:00
          0 2185 0000-00-00 00:00:00
          0 8730 0000-00-00 00:00:00
          0 2423 0000-00-00 00:00:00
          0 7466 0000-00-00 00:00:00
          0 7506 0000-00-00 00:00:00
          0 7339 0000-00-00 00:00:00

          SHOW STATUS LIKE 'Handler_read%';
          Variable_name Value
          Handler_read_first 0
          Handler_read_key 1
          Handler_read_last 0
          Handler_read_next 9
          Handler_read_prev 0
          Handler_read_rnd 0
          Handler_read_rnd_deleted 0
          Handler_read_rnd_next 0
          FLUSH STATUS;
          update schedule_test as sa
          set
          sa.`lock` = 1,
          sa.sign = 2123123123,
          sa.`datetime_processed` = now()
          where
          sa.`lock` = 0
          and sa.`datetime_next` <= now()
          order by sa.`datetime_next` asc
          limit 10;
          SHOW STATUS LIKE 'Handler_read%';
          Variable_name Value
          Handler_read_first 0
          Handler_read_key 1
          Handler_read_last 0
          Handler_read_next 2084
          Handler_read_prev 0
          Handler_read_rnd 10
          Handler_read_rnd_deleted 0
          Handler_read_rnd_next 0

          elenst Elena Stepanova added a comment - Output: select sa.`lock`, sign, datetime_next from schedule_test as sa where sa.`lock` = 0 and sa.`datetime_next` <= now() order by sa.`datetime_next` asc limit 10; lock sign datetime_next 0 5802 0000-00-00 00:00:00 0 320 0000-00-00 00:00:00 0 1416 0000-00-00 00:00:00 0 1490 0000-00-00 00:00:00 0 2185 0000-00-00 00:00:00 0 8730 0000-00-00 00:00:00 0 2423 0000-00-00 00:00:00 0 7466 0000-00-00 00:00:00 0 7506 0000-00-00 00:00:00 0 7339 0000-00-00 00:00:00 SHOW STATUS LIKE 'Handler_read%'; Variable_name Value Handler_read_first 0 Handler_read_key 1 Handler_read_last 0 Handler_read_next 9 Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_deleted 0 Handler_read_rnd_next 0 FLUSH STATUS; update schedule_test as sa set sa.`lock` = 1, sa.sign = 2123123123, sa.`datetime_processed` = now() where sa.`lock` = 0 and sa.`datetime_next` <= now() order by sa.`datetime_next` asc limit 10; SHOW STATUS LIKE 'Handler_read%'; Variable_name Value Handler_read_first 0 Handler_read_key 1 Handler_read_last 0 Handler_read_next 2084 Handler_read_prev 0 Handler_read_rnd 10 Handler_read_rnd_deleted 0 Handler_read_rnd_next 0
          elenst Elena Stepanova made changes -
          Labels optimizer optimizer upstream
          elenst Elena Stepanova made changes -
          Affects Version/s 5.3.12 [ 12000 ]
          Affects Version/s 10.0.1 [ 11400 ]
          elenst Elena Stepanova made changes -
          Fix Version/s 10.0.3 [ 12900 ]
          Fix Version/s 5.5.32 [ 13000 ]

          Also reproducible on mysql-5.6

          elenst Elena Stepanova added a comment - Also reproducible on mysql-5.6
          elenst Elena Stepanova made changes -
          Assignee Elena Stepanova [ elenst ] Sergei Petrunia [ psergey ]
          serg Sergei Golubchik made changes -
          Fix Version/s 10.0.4 [ 13101 ]
          Fix Version/s 10.0.3 [ 12900 ]
          psergei Sergei Petrunia made changes -
          Fix Version/s 5.5.33 [ 13300 ]
          Fix Version/s 5.5.32 [ 13000 ]
          serg Sergei Golubchik made changes -
          Fix Version/s 10.0.5 [ 13201 ]
          Fix Version/s 10.0.4 [ 13101 ]
          serg Sergei Golubchik made changes -
          Rank Ranked higher
          psergei Sergei Petrunia made changes -
          Fix Version/s 5.5.34 [ 13500 ]
          Fix Version/s 5.5.33 [ 13300 ]
          serg Sergei Golubchik made changes -
          Fix Version/s 5.5.34 [ 13700 ]
          Fix Version/s 5.5.33a [ 13500 ]
          serg Sergei Golubchik made changes -
          Fix Version/s 10.0.6 [ 13202 ]
          Fix Version/s 10.0.5 [ 13201 ]
          psergei Sergei Petrunia made changes -
          Fix Version/s 5.5.35 [ 14000 ]
          Fix Version/s 5.5.34 [ 13700 ]
          serg Sergei Golubchik made changes -
          Fix Version/s 10.0.7 [ 14100 ]
          Fix Version/s 10.0.6 [ 13202 ]
          psergei Sergei Petrunia made changes -
          Fix Version/s 10.0.8 [ 14200 ]
          Fix Version/s 10.0.7 [ 14100 ]

          I don't know if it's documented anywhere but particular index will not be used if columns of that index are updated. I think a work around is to use update on primary column and a subquery:
          update table set ... where id in (select * from (select id from table where ...))

          pomyk Patryk Pomykalski added a comment - I don't know if it's documented anywhere but particular index will not be used if columns of that index are updated. I think a work around is to use update on primary column and a subquery: update table set ... where id in (select * from (select id from table where ...))
          psergei Sergei Petrunia made changes -
          Fix Version/s 5.5.36 [ 14600 ]
          Fix Version/s 5.5.35 [ 14000 ]
          psergei Sergei Petrunia made changes -
          Fix Version/s 10.0.9 [ 14400 ]
          Fix Version/s 10.0.8 [ 14200 ]
          psergei Sergei Petrunia made changes -
          Fix Version/s 5.5.37 [ 15000 ]
          Fix Version/s 5.5.36 [ 14600 ]
          serg Sergei Golubchik made changes -
          Fix Version/s 10.0.10 [ 14500 ]
          serg Sergei Golubchik made changes -
          Fix Version/s 5.3.13 [ 12602 ]
          Fix Version/s 10.0.9 [ 14400 ]
          rebelde M Kellogg added a comment -

          This issue seems rather serious and could cause extreme performance degradation.

          This really needs to be upgraded to "critical" IMO.

          rebelde M Kellogg added a comment - This issue seems rather serious and could cause extreme performance degradation. This really needs to be upgraded to "critical" IMO.
          psergei Sergei Petrunia made changes -
          Status Open [ 1 ] In Progress [ 3 ]

          The problem is still repeatable on 10.0 and mysql-5.6

          psergei Sergei Petrunia added a comment - The problem is still repeatable on 10.0 and mysql-5.6

          Investigated how the UPDATE query is optimized/executed.

          The table has an index:

          KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`)

          The WHERE clause is:

          sa.`lock` = 0 and sa.`datetime_next` <= now()

          it produces a range access for the index.

          the ORDER BY is:

          order by sa.`datetime_next` asc
          limit 10;

          The optimizer is able to figure out that range access over idx_lock_dt_nxt produces rows in the order that matches the ORDER BY.

          psergei Sergei Petrunia added a comment - Investigated how the UPDATE query is optimized/executed. The table has an index: KEY `idx_lock_dt_nxt` (`lock`,`datetime_next`) The WHERE clause is: sa.`lock` = 0 and sa.`datetime_next` <= now() it produces a range access for the index. the ORDER BY is: order by sa.`datetime_next` asc limit 10; The optimizer is able to figure out that range access over idx_lock_dt_nxt produces rows in the order that matches the ORDER BY.

          The optimizer also will see that used_key_is_modified=true (which means one can't do updates on the fly)

          And then, a wrong decision will be made: instead of using "Using buffer" the optimizer will pick "Using filesort". This will cause it to scan more rows than necessary

          psergei Sergei Petrunia added a comment - The optimizer also will see that used_key_is_modified=true (which means one can't do updates on the fly) And then, a wrong decision will be made: instead of using "Using buffer" the optimizer will pick "Using filesort". This will cause it to scan more rows than necessary

          Another problem: EXPLAIN shows that rows=

          {number of rows scanned by range scan}

          , not

          {number of rows scanned before we find LIMIT}

          (let's call this "rows problem").

          psergei Sergei Petrunia added a comment - Another problem: EXPLAIN shows that rows= {number of rows scanned by range scan} , not {number of rows scanned before we find LIMIT} (let's call this "rows problem").

          For 10.0, I'll fix the query plan problem but not "rows problem". fixing query plan problem is safe, fixing "rows problem" could potentially have some implications.

          psergei Sergei Petrunia added a comment - For 10.0, I'll fix the query plan problem but not "rows problem". fixing query plan problem is safe, fixing "rows problem" could potentially have some implications.

          Pushed a fix into 10.0. #rows in EXPLAIN UPDATE will be addressed separately.

          psergei Sergei Petrunia added a comment - Pushed a fix into 10.0. #rows in EXPLAIN UPDATE will be addressed separately.

          The #rows will be addressed in MDEV-5884.

          psergei Sergei Petrunia added a comment - The #rows will be addressed in MDEV-5884 .

          Pushed the fix into MariaDB 10.0 tree.

          boa, rebelde - thanks for your input, it is very appreciated.

          psergei Sergei Petrunia added a comment - Pushed the fix into MariaDB 10.0 tree. boa , rebelde - thanks for your input, it is very appreciated.
          psergei Sergei Petrunia made changes -
          Resolution Fixed [ 1 ]
          Status In Progress [ 3 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow defaullt [ 27201 ] MariaDB v2 [ 42783 ]
          ratzpo Rasmus Johansson (Inactive) made changes -
          Workflow MariaDB v2 [ 42783 ] MariaDB v3 [ 62500 ]
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 62500 ] MariaDB v4 [ 146616 ]

          People

            psergei Sergei Petrunia
            boa Oleg (Inactive)
            Votes:
            1 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.