Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30469

Support ORDER BY and LIMIT for multi-table DELETE, index hints for single-table DELETE.

Details

    Description

      Delete doesn't work with LIMIT and Hints and I don't find any specific reason behind this. Even not in documentation.

      MariaDB [test]> DELETE t1.* FROM test1 t1 WHERE ID=1;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> DELETE t1.* FROM test1 t1 WHERE ID=1 LIMIT 10;
      ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'LIMIT 10' at line 1
      MariaDB [test]> 
       
      MariaDB [test]> DELETE t1.* FROM test1 t1 use index(ix_id) WHERE ID = 1 limit 10;
      ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'limit 10' at line 1
      MariaDB [test]> 
      
      

      Doc says, "For the multiple-table syntax, DELETE deletes from each tbl_name the rows that satisfy the conditions. In this case, ORDER BY and LIMIT> cannot be used." But here there are no multiple tables.
      https://mariadb.com/kb/en/delete/

      Patch details

      (based on a talk with Sanja):

      The patch adds two pieces of functionality:

      1. Index hints for UPDATE/DELETE
      2. ORDER BY LIMIT support for multi-table DELETE (UPDATEs are already supported).

      1. Index Hints.

      Single-table mysql_update/mysql_delete do not have the code to handle
      hints.

      The patch switches to multi-table update/delete code path whenever
      the updated table uses hints (Yes. )

      2. ORDER BY LIMIT support for multi-table DELETE

      2.1 Multi-table UPDATE already supports ORDER BY LIMIT

      Multi-table UPDATE supports "buffered" operation:

      • first, collect rowids of rows to be updated and new column values.
      • then, sort the resultset and apply LIMIT if necessary.
      • then, read the temporary table and apply the updates.

      Initially, this was done to avoid the "Halloween problem" for UPDATE without ORDER BY.
      It also made it easy to support ORDER BY ... LIMIT. It was done in MDEV-13911, https://github.com/MariaDB/server/commit/26ff92f7ac2dc373769b8053e936e4593a2ee302.
      Note that there are no restrictions on which tables are updated, or which tables can be referred from ORDER BY.

      2.2 Adding ORDER BY ...LIMIT support to multi-table DELETE

      Multi-table DELETE also supports a smaller variant of buffered mode (as it doesn't need to store the updated column values) It doesn't take into account that ORDER BY ... LIMIT code can put the row combinations into a temporary table and then sort it.

      Sanja's patch tries to fix this but is not finished.

      Attachments

        Issue Links

          Activity

            serg Sergei Golubchik added a comment - - edited

            Use bb-11.8-mdev-30469

            serg Sergei Golubchik added a comment - - edited Use bb-11.8-mdev-30469

            It looks like the wrong query plan is being used for cases with order by NULL:

            Testcase:

            CREATE TABLE t1 (a INT, b INT,
                             PRIMARY KEY (a),
                             KEY i2(a,b));
            INSERT INTO t1 VALUES (1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(7,7),(8,8);
            ANALYZE TABLE t1;
             
            EXPLAIN format=json SELECT a FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2;
            EXPLAIN format=json DELETE FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2;
            

            For these queries we have the following plans (same for select and delete):

            EXPLAIN format=json SELECT a FROM t1 USE INDEX (i2) ORDER BY (a) LIMIT 2;
            EXPLAIN
            {
              "query_block": {
                "select_id": 1,
                "cost": 0.006896702,
                "nested_loop": [
                  {
                    "table": {
                      "table_name": "t1",
                      "access_type": "index",
                      "key": "i2",
                      "key_length": "9",
                      "used_key_parts": ["a", "b"],
                      "loops": 1,
                      "rows": 2,
                      "cost": 0.006896702,
                      "filtered": 100,
                      "using_index": true
                    }
                  }
                ]
              }
            }
            EXPLAIN format=json DELETE FROM t1 USE INDEX (i2) ORDER BY (a) LIMIT 2;
            EXPLAIN
            {
              "query_block": {
                "select_id": 1,
                "cost": 0.006896702,
                "nested_loop": [
                  {
                    "table": {
                      "table_name": "t1",
                      "access_type": "index",
                      "key": "i2",
                      "key_length": "9",
                      "used_key_parts": ["a", "b"],
                      "loops": 1,
                      "rows": 2,
                      "cost": 0.006896702,
                      "filtered": 100
                    }
                  }
                ]
              }
            }
            

            But for next queries (with order by NULL) plans for select and delete are differ. In spite of "USE INDEX (i2)" is set, index is ignored for delete:

            EXPLAIN format=json SELECT a FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2;
            EXPLAIN format=json DELETE FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2;
            

            Plans:

            EXPLAIN format=json SELECT a FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2;
            EXPLAIN
            {
              "query_block": {
                "select_id": 1,
                "cost": 0.006896702,
                "nested_loop": [
                  {
                    "table": {
                      "table_name": "t1",
                      "access_type": "index",
                      "key": "i2",
                      "key_length": "9",
                      "used_key_parts": ["a", "b"],
                      "loops": 1,
                      "rows": 8,
                      "cost": 0.006896702,
                      "filtered": 100,
                      "using_index": true
                    }
                  }
                ]
              }
            }
            EXPLAIN format=json DELETE FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2;
            EXPLAIN
            {
              "query_block": {
                "select_id": 1,
                "cost": 0.006896702,
                "nested_loop": [
                  {
                    "table": {
                      "table_name": "t1",
                      "access_type": "ALL",
                      "loops": 1,
                      "rows": 8,
                      "cost": 0.006896702,
                      "filtered": 100
                    }
                  }
                ]
              }
            }
            

            lstartseva Lena Startseva added a comment - It looks like the wrong query plan is being used for cases with order by NULL : Testcase: CREATE TABLE t1 (a INT , b INT , PRIMARY KEY (a), KEY i2(a,b)); INSERT INTO t1 VALUES (1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(7,7),(8,8); ANALYZE TABLE t1;   EXPLAIN format=json SELECT a FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2; EXPLAIN format=json DELETE FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2; For these queries we have the following plans (same for select and delete ): EXPLAIN format=json SELECT a FROM t1 USE INDEX (i2) ORDER BY (a) LIMIT 2; EXPLAIN { "query_block" : { "select_id" : 1, "cost" : 0.006896702, "nested_loop" : [ { "table" : { "table_name" : "t1" , "access_type" : "index" , "key" : "i2" , "key_length" : "9" , "used_key_parts" : [ "a" , "b" ], "loops" : 1, "rows" : 2, "cost" : 0.006896702, "filtered" : 100, "using_index" : true } } ] } } EXPLAIN format=json DELETE FROM t1 USE INDEX (i2) ORDER BY (a) LIMIT 2; EXPLAIN { "query_block" : { "select_id" : 1, "cost" : 0.006896702, "nested_loop" : [ { "table" : { "table_name" : "t1" , "access_type" : "index" , "key" : "i2" , "key_length" : "9" , "used_key_parts" : [ "a" , "b" ], "loops" : 1, "rows" : 2, "cost" : 0.006896702, "filtered" : 100 } } ] } } But for next queries (with order by NULL ) plans for select and delete are differ. In spite of "USE INDEX (i2)" is set, index is ignored for delete : EXPLAIN format=json SELECT a FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2; EXPLAIN format=json DELETE FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2; Plans: EXPLAIN format=json SELECT a FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2; EXPLAIN { "query_block" : { "select_id" : 1, "cost" : 0.006896702, "nested_loop" : [ { "table" : { "table_name" : "t1" , "access_type" : "index" , "key" : "i2" , "key_length" : "9" , "used_key_parts" : [ "a" , "b" ], "loops" : 1, "rows" : 8, "cost" : 0.006896702, "filtered" : 100, "using_index" : true } } ] } } EXPLAIN format=json DELETE FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2; EXPLAIN { "query_block" : { "select_id" : 1, "cost" : 0.006896702, "nested_loop" : [ { "table" : { "table_name" : "t1" , "access_type" : "ALL" , "loops" : 1, "rows" : 8, "cost" : 0.006896702, "filtered" : 100 } } ] } }

            ORDER BY NULL means basically "do not order"...

            I don't think SELECT actually follows the hint here. I modify the query: add and use a new column so that the SELECT cannot use an index-only full index scan. And it stops using it:

            alter table t1 add z int;
            EXPLAIN format=json SELECT a,z FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2;
            {
              "query_block": {
                "select_id": 1,
                "cost": 0.0121588,
                "nested_loop": [
                  {
                    "table": {
                      "table_name": "t1",
                      "access_type": "ALL",
                      "loops": 1,
                      "rows": 8,
                      "cost": 0.0121588,
                      "filtered": 100
                    }
                  }
                ]
              }
            }
            

            The same plan as with DELETE. DELETE cannot use "index-only" plans (not sure about the exact rationale behind this, but the code explicitly disables them).

            psergei Sergei Petrunia added a comment - ORDER BY NULL means basically "do not order"... I don't think SELECT actually follows the hint here. I modify the query: add and use a new column so that the SELECT cannot use an index-only full index scan. And it stops using it: alter table t1 add z int; EXPLAIN format=json SELECT a,z FROM t1 USE INDEX (i2) ORDER BY NULL LIMIT 2; { "query_block": { "select_id": 1, "cost": 0.0121588, "nested_loop": [ { "table": { "table_name": "t1", "access_type": "ALL", "loops": 1, "rows": 8, "cost": 0.0121588, "filtered": 100 } } ] } } The same plan as with DELETE. DELETE cannot use "index-only" plans (not sure about the exact rationale behind this, but the code explicitly disables them).
            Gosselin Dave Gosselin added a comment -

            Yes, I saw this in JOIN::optimize_stage2 and assumed it was intentional:

                /*
                  If we are using ORDER BY NULL or ORDER BY const_expression,
                  return result in any order (even if we are using a GROUP BY)
                */
                if (!order && org_order)
                  skip_sort_order= 1;
            

            FWIW MySQL shows NULL first in a resultset (by default).

            Gosselin Dave Gosselin added a comment - Yes, I saw this in JOIN::optimize_stage2 and assumed it was intentional: /* If we are using ORDER BY NULL or ORDER BY const_expression, return result in any order (even if we are using a GROUP BY) */ if (!order && org_order) skip_sort_order= 1; FWIW MySQL shows NULL first in a resultset (by default).

            Testing done, OK to push

            lstartseva Lena Startseva added a comment - Testing done, OK to push

            People

              Gosselin Dave Gosselin
              niljoshi Nilnandan Joshi
              Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.