Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27999

select~for update skip locked locks unnecessary record

    XMLWordPrintable

Details

    Description

      Here is the test scenario user provide:

      Following situation:

      • 3 connections (1x master, 2x worker)
      • autocommit = 0
      • tx_isolation = read-committed on worker

      DB layout:

      CREATE TABLE `jobs` (
      id int(11) NOT NULL AUTO_INCREMENT,
      state varchar(25) DEFAULT NULL,
      created timestamp NOT NULL DEFAULT current_timestamp(),
      PRIMARY KEY ("id")
      );

      The master writes jobs in the table and then worker nodes fetch jobs, do some processing in the application and in the end delete the job from the table. The jobs table acts as a task queue and doesn't get any updates in this simplified test. Also this test is done in a controlled environment without any concurrent connections.

      master is doing 5 insert statements

      mysql (test)> insert into jobs (state) value('pending');
      mysql (test)> insert into jobs (state) value('pending');
      mysql (test)> insert into jobs (state) value('pending');
      mysql (test)> insert into jobs (state) value('pending');
      mysql (test)> insert into jobs (state) value('pending');
      mysql (test)> commit;
      

      worker 1 takes all unlocked rows from table and locks them for update

      mysql (test)> begin;
      mysql (test)> select * from jobs for update skip locked;
      +----+---------+---------------------+
      | id | state | created |
      +----+---------+---------------------+
      | 1 | pending | 2022-03-03 17:11:33 |
      ...
      | 5 | pending | 2022-03-03 17:11:37 |
      +----+---------+---------------------+
      5 rows in set (0.000 sec)
      

      master does more inserts

      mysql (test)> insert into jobs (state) value('pending');
      mysql (test)> insert into jobs (state) value('pending');
      mysql (test)> insert into jobs (state) value('pending');
      mysql (test)> insert into jobs (state) value('pending');
      mysql (test)> insert into jobs (state) value('pending');
      

      worker 2 takes unlocked rows

      mysql (test)> select * from jobs for update skip locked;
      +----+---------+---------------------+
      | id | state | created |
      +----+---------+---------------------+
      | 6 | pending | 2022-03-03 17:18:22 |
      ...
      | 10 | pending | 2022-03-03 17:18:24 |
      +----+---------+---------------------+
      5 rows in set (0.001 sec)
      

      worker 1 finished working on his jobs and deletes his locked rows. this is where it starts to get weird. If deleting all rows in a single where in () clause it will wait for worker 2, even though both shouldn't have any dependencies.

      mysql (test)> delete from jobs where id in (1,2,3,4,5);
      ... waiting
      ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
      

      If deleting only one row at once (no matter which one) it will work.

      mysql (test)> delete from jobs where id = 1;
      Query OK, 1 row affected (0.001 sec)
      

      It will even work when some rows are deleted at once:

      mysql (test)> delete from jobs where id in (1,2,3);
      Query OK, 2 rows affected (0.001 sec)
      

      However, in this example row 4 and 5 cannot be deleted together with other rows:

      mysql (test)> delete from jobs where id in (1,2,3,5);
      ...
      ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
      

      Except for only row 4 and 5 are deleted alone

      mysql (test)> delete from jobs where id in (4,5);
      Query OK, 2 rows affected (0.001 sec)
      

      Why can it happen that some rows can be deleted together? When deleting all rows one by one it works without any problems.

      Another thing:
      Worker 1 tries to delete rows

      mysql (test)> delete from jobs where id in (1,2,3,4,5);
      ... waiting
      

      Worker 2 tries to delete rows concurrently to Worker 1

      mysql (test)> delete from jobs where id in (6,7,8,9,10);
      ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
      

      Worker 1 succeeds

      ...
      Query OK, 5 rows affected (48.561 sec)
      

      Why is Worker 2 rolled back with a deadlock? There should be no overlapping locks as both locked only some rows. Out of curiosity I ran the same test against MySQL 8.0.28 where it succeded.
      Testcase included, test.py. It requires python3, pymysql, threading

      Attachments

        1. test.py
          5 kB
        2. test-1.py
          6 kB

        Issue Links

          Activity

            People

              Johnston Rex Johnston
              allen.lee@mariadb.com Allen Lee (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.