Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33971

Using NAME_CONST() (or executing query from the stored procedure and referring to a local variable) changes the plan and may make execution slower

Details

    Description

      Consider the following simple table:

      Yuliyas-Air:maria10.6 Valerii$ bin/mysql test
      Reading table information for completion of table and column names
      You can turn off this feature to get a quicker startup with -A
       
      Welcome to the MariaDB monitor.  Commands end with ; or \g.
      Your MariaDB connection id is 3
      Server version: 10.6.18-MariaDB MariaDB Server
       
      Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
       
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
       
      MariaDB [test]> select * from tt1;
      +------+------+------+
      | id   | c1   | c2   |
      +------+------+------+
      |    1 |    3 | 1    |
      |    2 |    2 | 3    |
      +------+------+------+
      2 rows in set (0.014 sec)
       
      MariaDB [test]> show create table tt1\G
      *************************** 1. row ***************************
             Table: tt1
      Create Table: CREATE TABLE `tt1` (
        `id` int(11) DEFAULT NULL,
        `c1` int(11) DEFAULT NULL,
        `c2` varchar(100) DEFAULT NULL,
        KEY `c1` (`c1`),
        KEY `c2` (`c2`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci
      1 row in set (0.004 sec)
       
      MariaDB [test]> insert into tt1 select * from tt1;
      Query OK, 2 rows affected (0.016 sec)
      Records: 2  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> insert into tt1 select * from tt1;
      Query OK, 4 rows affected (0.003 sec)
      Records: 4  Duplicates: 0  Warnings: 0
       
      ...
       
      MariaDB [test]> insert into tt1 select * from tt1;
      Query OK, 32768 rows affected (0.297 sec)
      Records: 32768  Duplicates: 0  Warnings: 0
       
      MariaDB [test]> update tt1 set c1 = 4 limit 1;
      Query OK, 1 row affected (0.012 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
       
      MariaDB [test]> select c1, count(*) from tt1 group by c1;
      +------+----------+
      | c1   | count(*) |
      +------+----------+
      |    2 |    32768 |
      |    3 |    32767 |
      |    4 |        1 |
      +------+----------+
      3 rows in set (0.042 sec)
      

      The following queries get expected execution plans:

      MariaDB [test]> explain select * from tt1 where c1 = 3;
      +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
      |    1 | SIMPLE      | tt1   | ALL  | c1            | NULL | NULL    | NULL | 65758 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
      1 row in set (0.000 sec)
       
      MariaDB [test]> explain select * from tt1 where c1 = 4;
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      |    1 | SIMPLE      | tt1   | ref  | c1            | c1   | 5       | const | 1    |       |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      1 row in set (0.003 sec)
       
      MariaDB [test]> explain select * from tt1 where c1 = '4';
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      |    1 | SIMPLE      | tt1   | ref  | c1            | c1   | 5       | const | 1    |       |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-------+
      1 row in set (0.000 sec)
      

      but if we add name_const() as it happens when the query is executed from the stored routine the plan changes to use index condition pushdown:

      MariaDB [test]> explain select * from tt1 where c1 = name_const('a',4);
      +------+-------------+-------+------+---------------+------+---------+-------+------+-----------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra                 |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-----------------------+
      |    1 | SIMPLE      | tt1   | ref  | c1            | c1   | 5       | const | 1    | Using index condition |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-----------------------+
      1 row in set (0.004 sec)
       
      MariaDB [test]> explain select * from tt1 where c1 = name_const('a','4');
      +------+-------------+-------+------+---------------+------+---------+-------+------+-----------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra                 |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-----------------------+
      |    1 | SIMPLE      | tt1   | ref  | c1            | c1   | 5       | const | 1    | Using index condition |
      +------+-------------+-------+------+---------------+------+---------+-------+------+-----------------------+
      1 row in set (0.000 sec)
      

      which is strange and not expected. Moreover, we may note somewhat slower execution as a result:

      MariaDB [test]> analyze format=json select * from tt1 where c1 = 4\G
      *************************** 1. row ***************************
      ANALYZE: {
        "query_block": {
          "select_id": 1,
          "r_loops": 1,
          "r_total_time_ms": 2.356,
          "table": {
            "table_name": "tt1",
            "access_type": "ref",
            "possible_keys": ["c1"],
            "key": "c1",
            "key_length": "5",
            "used_key_parts": ["c1"],
            "ref": ["const"],
            "r_loops": 1,
            "rows": 1,
            "r_rows": 1,
            "r_table_time_ms": 0.203,
            "r_other_time_ms": 0.01,
            "r_engine_stats": {
              "pages_accessed": 4
            },
            "filtered": 100,
            "r_filtered": 100
          }
        }
      }
      1 row in set (0.004 sec)
       
      MariaDB [test]> analyze format=json select * from tt1 where c1 = name_const('a',4)\G
      *************************** 1. row ***************************
      ANALYZE: {
        "query_block": {
          "select_id": 1,
          "r_loops": 1,
          "r_total_time_ms": 2.423,
          "table": {
            "table_name": "tt1",
            "access_type": "ref",
            "possible_keys": ["c1"],
            "key": "c1",
            "key_length": "5",
            "used_key_parts": ["c1"],
            "ref": ["const"],
            "r_loops": 1,
            "rows": 1,
            "r_rows": 1,
            "r_table_time_ms": 2.396,
            "r_other_time_ms": 0.02,
            "r_engine_stats": {
              "pages_accessed": 4
            },
            "filtered": 100,
            "r_filtered": 100,
            "index_condition": "tt1.c1 = 4"
          }
        }
      }
      1 row in set (0.003 sec)
      

      Why all these happens and can we force the plans to be the same for such cases?

      Attachments

        Activity

          valerii Valerii Kravchuk created issue -
          Gosselin Dave Gosselin made changes -
          Field Original Value New Value
          Assignee Dave Gosselin [ JIRAUSER52216 ]
          Gosselin Dave Gosselin made changes -
          Fix Version/s 10.5.25 [ 29626 ]
          Gosselin Dave Gosselin made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Gosselin Dave Gosselin made changes -
          Attachment mdev-33971.numbers [ 73438 ]
          Gosselin Dave Gosselin added a comment - - edited

          I attached a spreadsheet wherein I ran the following queries, named "index" and "NAME_CONST", respectively:

          analyze format=json select * from tt1 where c1 = 3\G
          analyze format=json select * from tt1 where c1 = name_const('a', 3)\G
          

          I ran this pair of queries five times under a Debug build, capturing the result of r_total_time_ms for each query, and placing it in the corresponding column of the spreadsheet. I then repeated the five queries, but reversing their order, so that "NAME_CONST" was run first, then "index" second, capturing the same information in the spreadsheet. I repeated the entire experiment again for a Release build. I'm running the latest 10.5, git sha e73181112f684debb7246a298fabc4cbce04eb9b which is a pre-10.5.25 version. The results are then used to compute median and average run times.

          What we see is that overwhelmingly the order of the queries matters far more than whether NAME_CONST is present. Unless there is some configuration or server variables that have also been modified, there is no bug here as far as I can measure.

          I should add, I used a table with 500K rows generated according to the algorithm in the description.

          Gosselin Dave Gosselin added a comment - - edited I attached a spreadsheet wherein I ran the following queries, named "index" and "NAME_CONST", respectively: analyze format=json select * from tt1 where c1 = 3\G analyze format=json select * from tt1 where c1 = name_const('a', 3)\G I ran this pair of queries five times under a Debug build, capturing the result of r_total_time_ms for each query, and placing it in the corresponding column of the spreadsheet. I then repeated the five queries, but reversing their order, so that "NAME_CONST" was run first, then "index" second, capturing the same information in the spreadsheet. I repeated the entire experiment again for a Release build. I'm running the latest 10.5, git sha e73181112f684debb7246a298fabc4cbce04eb9b which is a pre-10.5.25 version. The results are then used to compute median and average run times. What we see is that overwhelmingly the order of the queries matters far more than whether NAME_CONST is present. Unless there is some configuration or server variables that have also been modified, there is no bug here as far as I can measure. I should add, I used a table with 500K rows generated according to the algorithm in the description.

          Do you see different plans and if you do, why are they different? Where is this documented?

          valerii Valerii Kravchuk added a comment - Do you see different plans and if you do, why are they different? Where is this documented?
          Gosselin Dave Gosselin added a comment -

          First 'analyze' output, then 'explain' output. Apart from timings, the analyze outputs appear to be identical. I'll repeat this work on 10.5.19 and see if that makes a difference.

          MariaDB [test]> analyze format=json select * from tt1 where c1 = 3\G
          *************************** 1. row ***************************
          ANALYZE: {
            "query_block": {
              "select_id": 1,
              "r_loops": 1,
              "r_total_time_ms": 161.4513492,
              "table": {
                "table_name": "tt1",
                "access_type": "ALL",
                "possible_keys": ["c1"],
                "r_loops": 1,
                "rows": 1046577,
                "r_rows": 1048584,
                "r_table_time_ms": 125.2281306,
                "r_other_time_ms": 36.21630051,
                "filtered": 49.99995041,
                "r_filtered": 50,
                "attached_condition": "tt1.c1 = 3"
              }
            }
          }
          1 row in set (0.162 sec)
           
          MariaDB [test]> analyze format=json select * from tt1 where c1 = name_const('a', 3)\G
          *************************** 1. row ***************************
          ANALYZE: {
            "query_block": {
              "select_id": 1,
              "r_loops": 1,
              "r_total_time_ms": 137.0009049,
              "table": {
                "table_name": "tt1",
                "access_type": "ALL",
                "possible_keys": ["c1"],
                "r_loops": 1,
                "rows": 1046577,
                "r_rows": 1048584,
                "r_table_time_ms": 105.7870582,
                "r_other_time_ms": 31.21084605,
                "filtered": 49.99995041,
                "r_filtered": 50,
                "attached_condition": "tt1.c1 = 3"
              }
            }
          }
          1 row in set (0.137 sec)
          

          The explain outputs look identical as well:

          MariaDB [test]> explain select * from tt1 where c1 = 3;
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          |    1 | SIMPLE      | tt1   | ALL  | c1            | NULL | NULL    | NULL | 1046577 | Using where |
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          1 row in set (0.000 sec)
           
          MariaDB [test]>
          MariaDB [test]> explain select * from tt1 where c1 = '3';
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          |    1 | SIMPLE      | tt1   | ALL  | c1            | NULL | NULL    | NULL | 1046577 | Using where |
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          1 row in set (0.001 sec)
           
          MariaDB [test]>
          MariaDB [test]> explain select * from tt1 where c1 = name_const('a', '3');
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          |    1 | SIMPLE      | tt1   | ALL  | c1            | NULL | NULL    | NULL | 1046577 | Using where |
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          1 row in set (0.001 sec)
           
          MariaDB [test]>
          MariaDB [test]> explain select * from tt1 where c1 = name_const('a', 3);
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          |    1 | SIMPLE      | tt1   | ALL  | c1            | NULL | NULL    | NULL | 1046577 | Using where |
          +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+
          1 row in set (0.000 sec)
          

          Gosselin Dave Gosselin added a comment - First 'analyze' output, then 'explain' output. Apart from timings, the analyze outputs appear to be identical. I'll repeat this work on 10.5.19 and see if that makes a difference. MariaDB [test]> analyze format=json select * from tt1 where c1 = 3\G *************************** 1. row *************************** ANALYZE: { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 161.4513492, "table": { "table_name": "tt1", "access_type": "ALL", "possible_keys": ["c1"], "r_loops": 1, "rows": 1046577, "r_rows": 1048584, "r_table_time_ms": 125.2281306, "r_other_time_ms": 36.21630051, "filtered": 49.99995041, "r_filtered": 50, "attached_condition": "tt1.c1 = 3" } } } 1 row in set (0.162 sec)   MariaDB [test]> analyze format=json select * from tt1 where c1 = name_const('a', 3)\G *************************** 1. row *************************** ANALYZE: { "query_block": { "select_id": 1, "r_loops": 1, "r_total_time_ms": 137.0009049, "table": { "table_name": "tt1", "access_type": "ALL", "possible_keys": ["c1"], "r_loops": 1, "rows": 1046577, "r_rows": 1048584, "r_table_time_ms": 105.7870582, "r_other_time_ms": 31.21084605, "filtered": 49.99995041, "r_filtered": 50, "attached_condition": "tt1.c1 = 3" } } } 1 row in set (0.137 sec) The explain outputs look identical as well: MariaDB [test]> explain select * from tt1 where c1 = 3; +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | 1 | SIMPLE | tt1 | ALL | c1 | NULL | NULL | NULL | 1046577 | Using where | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ 1 row in set (0.000 sec)   MariaDB [test]> MariaDB [test]> explain select * from tt1 where c1 = '3'; +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | 1 | SIMPLE | tt1 | ALL | c1 | NULL | NULL | NULL | 1046577 | Using where | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ 1 row in set (0.001 sec)   MariaDB [test]> MariaDB [test]> explain select * from tt1 where c1 = name_const('a', '3'); +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | 1 | SIMPLE | tt1 | ALL | c1 | NULL | NULL | NULL | 1046577 | Using where | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ 1 row in set (0.001 sec)   MariaDB [test]> MariaDB [test]> explain select * from tt1 where c1 = name_const('a', 3); +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | 1 | SIMPLE | tt1 | ALL | c1 | NULL | NULL | NULL | 1046577 | Using where | +------+-------------+-------+------+---------------+------+---------+------+---------+-------------+ 1 row in set (0.000 sec)

          Please, check also on 10.6.17.

          valerii Valerii Kravchuk added a comment - Please, check also on 10.6.17.
          Gosselin Dave Gosselin added a comment -

          I repeated my prior experiment on both 10.6.17 and 10.5.19. I considered 'Release' type builds only this time. We continue to see that the first query of the two takes longer to execute than the second, regardless of which one is first.

          • On 10.5.19, the "NAME_CONST" query first runs in a median time of 163ms, then the "index" query runs in a median time of 128ms.
          • On 10.5.19, the "index" query first runs in a median time of 153ms, then the "NAME_CONST" query runs in a median time of 139ms.
          • On 10.6.17, the "NAME_CONST" query first runs in a median time of 192ms, then the "index" query runs in a median time of 171ms.
          • On 10.6.17, the "index" query first runs in a median time of 193ms, then the "NAME_CONST" query runs in a median time of 170ms.

          Interestingly and in this latest set of comparisons, we see that when running the "NAME_CONST" query second, it never runs faster than when we run the "index" query second; 139ms < 128ms, and 170ms < 171ms. This wasn't clear at all in the previous experiment, partly because we had less data and looked at one version only.

          I have another spreadsheet locally with these data if you're interested in seeing it.

          Gosselin Dave Gosselin added a comment - I repeated my prior experiment on both 10.6.17 and 10.5.19. I considered 'Release' type builds only this time. We continue to see that the first query of the two takes longer to execute than the second, regardless of which one is first. On 10.5.19, the "NAME_CONST" query first runs in a median time of 163ms, then the "index" query runs in a median time of 128ms. On 10.5.19, the "index" query first runs in a median time of 153ms, then the "NAME_CONST" query runs in a median time of 139ms. On 10.6.17, the "NAME_CONST" query first runs in a median time of 192ms, then the "index" query runs in a median time of 171ms. On 10.6.17, the "index" query first runs in a median time of 193ms, then the "NAME_CONST" query runs in a median time of 170ms. Interestingly and in this latest set of comparisons, we see that when running the "NAME_CONST" query second, it never runs faster than when we run the "index" query second; 139ms < 128ms, and 170ms < 171ms. This wasn't clear at all in the previous experiment, partly because we had less data and looked at one version only. I have another spreadsheet locally with these data if you're interested in seeing it.

          Let's execute each query many times to remove that "first" vs "second" execution impact. With my original test data mysqlslap of 10000 executions give the following over 10 iterations:

          Yuliyas-Air:maria10.6 Valerii$ bin/mysqlslap --concurrency=1 --create-schema=test --no-drop --number-of-queries=10000 --iterations=10 --query='select * from tt1 where c1 = 4;'
          Benchmark
          	Average number of seconds to run all queries: 1.053 seconds
          	Minimum number of seconds to run all queries: 1.037 seconds
          	Maximum number of seconds to run all queries: 1.085 seconds
          	Number of clients running queries: 1
          	Average number of queries per client: 10000
           
          Yuliyas-Air:maria10.6 Valerii$ bin/mysqlslap --concurrency=1 --create-schema=test --no-drop --number-of-queries=10000 --iterations=10 --query="select * from tt1 where c1 = name_const('a',4)";
          Benchmark
          	Average number of seconds to run all queries: 1.127 seconds
          	Minimum number of seconds to run all queries: 1.110 seconds
          	Maximum number of seconds to run all queries: 1.163 seconds
          	Number of clients running queries: 1
          	Average number of queries per client: 10000
          

          The difference is clearly visible at this scale/number of runs. The server is started with --no-defaults (if you think any settings matter).

          Also the questions remain for 10.6.17+:

          "Do you see different plans and if you do, why are they different? Where is this documented?"

          valerii Valerii Kravchuk added a comment - Let's execute each query many times to remove that "first" vs "second" execution impact. With my original test data mysqlslap of 10000 executions give the following over 10 iterations: Yuliyas-Air:maria10.6 Valerii$ bin/mysqlslap --concurrency=1 --create-schema=test --no-drop --number-of-queries=10000 --iterations=10 --query='select * from tt1 where c1 = 4;' Benchmark Average number of seconds to run all queries: 1.053 seconds Minimum number of seconds to run all queries: 1.037 seconds Maximum number of seconds to run all queries: 1.085 seconds Number of clients running queries: 1 Average number of queries per client: 10000   Yuliyas-Air:maria10.6 Valerii$ bin/mysqlslap --concurrency=1 --create-schema=test --no-drop --number-of-queries=10000 --iterations=10 --query="select * from tt1 where c1 = name_const('a',4)"; Benchmark Average number of seconds to run all queries: 1.127 seconds Minimum number of seconds to run all queries: 1.110 seconds Maximum number of seconds to run all queries: 1.163 seconds Number of clients running queries: 1 Average number of queries per client: 10000 The difference is clearly visible at this scale/number of runs. The server is started with --no-defaults (if you think any settings matter). Also the questions remain for 10.6.17+: "Do you see different plans and if you do, why are they different? Where is this documented?"
          Gosselin Dave Gosselin added a comment -

          Interesting, I haven't heard of mysqlslap before. I've been running the queries over five iterations and gathering the data manually. I agree at this scale it's pretty clear there's a difference. I'll return to your question regarding different plans/documentation today. Please tell me where I can find mysqlslap because I don't see it in the CE server codebase. Thanks.

          Gosselin Dave Gosselin added a comment - Interesting, I haven't heard of mysqlslap before. I've been running the queries over five iterations and gathering the data manually. I agree at this scale it's pretty clear there's a difference. I'll return to your question regarding different plans/documentation today. Please tell me where I can find mysqlslap because I don't see it in the CE server codebase. Thanks.

          This is a standard client a.k.a. mariadb-slap:

          https://mariadb.com/kb/en/mariadb-slap/

          that should be in the basedir/bin. Here is the source code:

          https://github.com/MariaDB/server/blob/10.6/client/mysqlslap.c

          valerii Valerii Kravchuk added a comment - This is a standard client a.k.a. mariadb-slap: https://mariadb.com/kb/en/mariadb-slap/ that should be in the basedir/bin. Here is the source code: https://github.com/MariaDB/server/blob/10.6/client/mysqlslap.c
          psergei Sergei Petrunia added a comment - - edited

          Discussion take-aways:

          How NAME_CONST adds overhead

          NAME_CONST can indeed add some overhead:

          In case of full table scan, without NAME_CONST we have the WHERE condition c1=3 represented as:

          Item_func_eq( Item_field(c1),  Item_int(3))
          

          with NAME_CONST it becomes:

          Item_func_eq( Item_name_const( Item_field(c1)),  Item_int(3) )
          

          Computing that means adding one extra virtual function call. It is surprising that this is visible in performance numbers.

          In case of ref access: c1=3 is guaranteed to be true for all rows returned by the index lookup. The optimizer can figure this out and remove the check (see sql_select.cc:test_if_ref()).
          When c1= NAME_CONST(...) is used, the optimizer is apparently unable to figure out that the condition will be always true and the check remains.
          This means there's some extra CPU overhead.

          Is NAME_CONST needed?

          NAME_CONST used instead of SP vars. It is necessary in the SELECT list. Is it necessary to use it in the WHERE clause? Names of items in the WHERE clause do not matter. Some attributes might matter (e.g. coercibility? take coercibility of an SP var reference, a string literal and of NAME_CONST(). ) At execution phase, NAME_CONST is definitely not necessary.

          Suggested thing to try

          What if Item_name_const::fix_fields() substituted itself with the constant it is wrapping? It can assign ((Item*)constant)->name.

          psergei Sergei Petrunia added a comment - - edited Discussion take-aways: How NAME_CONST adds overhead NAME_CONST can indeed add some overhead: In case of full table scan, without NAME_CONST we have the WHERE condition c1=3 represented as: Item_func_eq( Item_field(c1), Item_int(3)) with NAME_CONST it becomes: Item_func_eq( Item_name_const( Item_field(c1)), Item_int(3) ) Computing that means adding one extra virtual function call. It is surprising that this is visible in performance numbers. In case of ref access: c1=3 is guaranteed to be true for all rows returned by the index lookup. The optimizer can figure this out and remove the check (see sql_select.cc:test_if_ref()). When c1= NAME_CONST(...) is used, the optimizer is apparently unable to figure out that the condition will be always true and the check remains. This means there's some extra CPU overhead. Is NAME_CONST needed? NAME_CONST used instead of SP vars. It is necessary in the SELECT list. Is it necessary to use it in the WHERE clause? Names of items in the WHERE clause do not matter. Some attributes might matter (e.g. coercibility? take coercibility of an SP var reference, a string literal and of NAME_CONST(). ) At execution phase, NAME_CONST is definitely not necessary. Suggested thing to try What if Item_name_const::fix_fields() substituted itself with the constant it is wrapping? It can assign ((Item*)constant)->name.
          Gosselin Dave Gosselin added a comment -

          Hi valerii, using mariadb-slap I can clearly see the performance differences between using ref access with and without NAME_CONST. In response to your earlier question ("Do you see different plans and if you do, why are they different? Where is this documented?"), there are no material plan differences, but in 10.6.17 we now include a new metric, r_engine_stats, which was not present in 10.5.19. With respect to the benchmark measurements, I noted that NAME_CONST is slower than direct ref access in both 10.5.19 and 10.6.17, but in the latter version it's yet even slower, by an additional 100ms. With mariadb-slap, I ran 50,000 queries over three iterations in order to clearly demonstrate the difference.

          Gosselin Dave Gosselin added a comment - Hi valerii , using mariadb-slap I can clearly see the performance differences between using ref access with and without NAME_CONST. In response to your earlier question ("Do you see different plans and if you do, why are they different? Where is this documented?"), there are no material plan differences, but in 10.6.17 we now include a new metric, r_engine_stats, which was not present in 10.5.19. With respect to the benchmark measurements, I noted that NAME_CONST is slower than direct ref access in both 10.5.19 and 10.6.17, but in the latter version it's yet even slower, by an additional 100ms. With mariadb-slap, I ran 50,000 queries over three iterations in order to clearly demonstrate the difference.
          serg Sergei Golubchik made changes -
          Fix Version/s 10.5 [ 23123 ]
          Fix Version/s 10.5.25 [ 29626 ]
          Gosselin Dave Gosselin made changes -
          Assignee Dave Gosselin [ JIRAUSER52216 ] Sergei Petrunia [ psergey ]
          Status In Progress [ 3 ] In Review [ 10002 ]
          psergei Sergei Petrunia added a comment - - edited

          A bit more detail about where/how NAME_CONST is generated:

          Example:

          create table t1 (a int, b int);
          insert into t1 values (1,1),(2,2);
           
          delimiter //;
          create procedure p1() 
          begin
            declare var1 int;
            update t1 set b=var1 where a=1;
          end
          //;
           
          delimiter ;//
          

          Shows NAME_CONST is produced here:

            #0  Item_splocal::append_for_log (this=0x7fffa096e838, thd=0x7fffa0000d78, str=0x7ffff01c4ac0) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:140
            #1  0x0000555555dd765a in Copy_query_with_rewrite::append (this=0x7ffff01c4a90, p=0x7fffa096e8c8) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/item.h:614
            #2  0x0000555555dc9170 in subst_spvars (thd=0x7fffa0000d78, instr=0x7fffa09740a8, query_str=0x7fffa09740e0) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:1129
            #3  0x0000555555dd06cf in sp_instr_stmt::execute (this=0x7fffa09740a8, thd=0x7fffa0000d78, nextp=0x7ffff01c4edc) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:3737
            #4  0x0000555555dc9e44 in sp_head::execute (this=0x7fffa096cd00, thd=0x7fffa0000d78, merge_da_on_success=true) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:1442
            #5  0x0000555555dccc4c in sp_head::execute_procedure (this=0x7fffa096cd00, thd=0x7fffa0000d78, args=0x7fffa0006088) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:2485
            #6  0x0000555555ecdd31 in do_execute_sp (thd=0x7fffa0000d78, sp=0x7fffa096cd00) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sql_parse.cc:3064
            #7  0x0000555555ece964 in Sql_cmd_call::execute (this=0x7fffa0016ce8, thd=0x7fffa0000d78) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sql_parse.cc:3310
            #8  0x0000555555ed8a3a in mysql_execute_command (thd=0x7fffa0000d78, is_called_from_prepared_stmt=false) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sql_parse.cc:6124
            #9  0x0000555555edeabe in mysql_parse (thd=0x7fffa0000d78, rawbuf=0x7fffa0016c40 "call p1()", length=9, parser_state=0x7ffff01c6360) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sql_parse.cc:8146
          

          subst_spvars does this:

              /* Find rewritable Items used in this statement */
              for (Item *item= instr->free_list; item; item= item->next)
              {
                Rewritable_query_parameter *rqp= item->get_rewritable_query_parameter();
          

          get_rewritable_query_parameter() is non-NULL for Item_splocal and Item_param.
          Rewritable_query_parameter has

              my_ptrdiff_t pos_in_query;
              uint len_in_query;
              virtual bool append_for_log(THD *thd, String *str) = 0;
          

          That is, Items use it to replace themselves in the original query text (which is just String).

          At the time the substitution is made, Items that have no idea where in the query they are located (and whether it's the same part of the query that they were in the original query text).

          Because of this, fixing the MDEV by "do not produce NAME_CONST(...) calls where it is not necessary" is hard.

          psergei Sergei Petrunia added a comment - - edited A bit more detail about where/how NAME_CONST is generated: Example: create table t1 (a int , b int ); insert into t1 values (1,1),(2,2);   delimiter //; create procedure p1() begin declare var1 int ; update t1 set b=var1 where a=1; end //;   delimiter ;// Shows NAME_CONST is produced here: #0 Item_splocal::append_for_log (this=0x7fffa096e838, thd=0x7fffa0000d78, str=0x7ffff01c4ac0) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:140 #1 0x0000555555dd765a in Copy_query_with_rewrite::append (this=0x7ffff01c4a90, p=0x7fffa096e8c8) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/item.h:614 #2 0x0000555555dc9170 in subst_spvars (thd=0x7fffa0000d78, instr=0x7fffa09740a8, query_str=0x7fffa09740e0) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:1129 #3 0x0000555555dd06cf in sp_instr_stmt::execute (this=0x7fffa09740a8, thd=0x7fffa0000d78, nextp=0x7ffff01c4edc) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:3737 #4 0x0000555555dc9e44 in sp_head::execute (this=0x7fffa096cd00, thd=0x7fffa0000d78, merge_da_on_success=true) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:1442 #5 0x0000555555dccc4c in sp_head::execute_procedure (this=0x7fffa096cd00, thd=0x7fffa0000d78, args=0x7fffa0006088) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sp_head.cc:2485 #6 0x0000555555ecdd31 in do_execute_sp (thd=0x7fffa0000d78, sp=0x7fffa096cd00) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sql_parse.cc:3064 #7 0x0000555555ece964 in Sql_cmd_call::execute (this=0x7fffa0016ce8, thd=0x7fffa0000d78) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sql_parse.cc:3310 #8 0x0000555555ed8a3a in mysql_execute_command (thd=0x7fffa0000d78, is_called_from_prepared_stmt=false) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sql_parse.cc:6124 #9 0x0000555555edeabe in mysql_parse (thd=0x7fffa0000d78, rawbuf=0x7fffa0016c40 "call p1()", length=9, parser_state=0x7ffff01c6360) at /home/psergey/dev-git2/10.6-review-mdev33971-name_const/sql/sql_parse.cc:8146 subst_spvars does this: /* Find rewritable Items used in this statement */ for (Item *item= instr->free_list; item; item= item->next) { Rewritable_query_parameter *rqp= item->get_rewritable_query_parameter(); get_rewritable_query_parameter() is non-NULL for Item_splocal and Item_param. Rewritable_query_parameter has my_ptrdiff_t pos_in_query; uint len_in_query; virtual bool append_for_log(THD *thd, String *str) = 0; That is, Items use it to replace themselves in the original query text (which is just String). At the time the substitution is made, Items that have no idea where in the query they are located (and whether it's the same part of the query that they were in the original query text). Because of this, fixing the MDEV by "do not produce NAME_CONST(...) calls where it is not necessary" is hard.

          Not producing NAME_CONST vs removing NAME_CONST on execution.

          Not producing NAME_CONST when it is not necessary

          +: is cleaner.
          -: Is a more intrusive fix. If we skip NAME_CONST where we should not have, this will produce broken binary logs.

          Removing NAME_CONST when we've got a query with it.

          +: Is a more conservative fix, although not as clean.

          psergei Sergei Petrunia added a comment - Not producing NAME_CONST vs removing NAME_CONST on execution. Not producing NAME_CONST when it is not necessary +: is cleaner. -: Is a more intrusive fix. If we skip NAME_CONST where we should not have, this will produce broken binary logs. Removing NAME_CONST when we've got a query with it. +: Is a more conservative fix, although not as clean.
          psergei Sergei Petrunia made changes -
          Assignee Sergei Petrunia [ psergey ] Dave Gosselin [ JIRAUSER52216 ]
          Status In Review [ 10002 ] Stalled [ 10000 ]
          psergei Sergei Petrunia added a comment - Review input provided in https://github.com/MariaDB/server/pull/3236
          Gosselin Dave Gosselin made changes -
          Status Stalled [ 10000 ] In Progress [ 3 ]
          Gosselin Dave Gosselin made changes -
          Assignee Dave Gosselin [ JIRAUSER52216 ] Sergei Petrunia [ psergey ]
          Status In Progress [ 3 ] In Review [ 10002 ]

          When trying to see if we could always remove NAME_CONST, I've stumbled on a case where removing it creates issues even in the WHERE clause:

          create procedure p1(var1 varchar(10)) 
          select coercibility(var1);
          call p1('abc');
          coercibility(var1)
          2
          

          SP variable has coercibility =2.

          COERCIBILITY(NAME_CONST('name','test'))
          2
          

          NAME_CONST has the same, good.
          String literal has:

          SELECT COERCIBILITY('test');
          COERCIBILITY('test')
          4
          

          Moreover, one can get NAME_CONST with string literals with explicit collation (which have COERCIBILITY=0):

          --source include/have_log_bin.inc
           
          create table t1 (
            a varchar(100) collate utf8_unicode_ci,
            b int
          );
          insert into t1 values ('foo', 1),('bar', 1);
           
          create procedure p1(var1 varchar(10)) 
            update t1 set b=b+1 where a=var1;
           
          call p1('foo');
          drop procedure p1;
           
          drop table t1;
          show binlog events;
          

          gives

          master-bin.000001       1003    Query   1       1158    use `test`; 
          update t1 set b=b+1 where a= NAME_CONST('var1',_latin1'foo' COLLATE 'latin1_swedish_ci')
          

          Trying to see how replication slave would work:

          --source include/have_log_bin.inc
          create table t1 (
            a varchar(100) collate utf8_unicode_ci,
            b int
          );
          insert into t1 values ('foo', 1),('bar', 1);
           
          explain format=json
          update t1 set b=b+1 where a= NAME_CONST('var1',_latin1'foo' COLLATE 'latin1_swedish_ci');
           
          drop table t1;
          

          before this MDEV, I get:

                "attached_condition": "t1.a = convert(NAME_CONST('var1',_latin1'foo' collate latin1_swedish_ci) using utf8mb3)"
          

          after this MDEV, I get:

          mysqltest: At line 8: query 'explain format=json 
          update t1 set b=b+1 where a= NAME_CONST('var1',_latin1'foo' COLLATE 'latin1_swedish_ci')' failed: ER_CANT_AGGREGATE_2COLLATIONS (1267): Illegal mix of collations (utf8mb3_unicode_ci,IMPLICIT) and (latin1_swedish_ci,EXPLICIT) for operation '='
          

          Gosselin, any thoughts?

          psergei Sergei Petrunia added a comment - When trying to see if we could always remove NAME_CONST, I've stumbled on a case where removing it creates issues even in the WHERE clause: create procedure p1(var1 varchar (10)) select coercibility(var1); call p1( 'abc' ); coercibility(var1) 2 SP variable has coercibility =2. COERCIBILITY(NAME_CONST('name','test')) 2 NAME_CONST has the same, good. String literal has: SELECT COERCIBILITY('test'); COERCIBILITY('test') 4 Moreover, one can get NAME_CONST with string literals with explicit collation (which have COERCIBILITY=0): --source include/have_log_bin.inc   create table t1 ( a varchar (100) collate utf8_unicode_ci, b int ); insert into t1 values ( 'foo' , 1),( 'bar' , 1);   create procedure p1(var1 varchar (10)) update t1 set b=b+1 where a=var1;   call p1( 'foo' ); drop procedure p1;   drop table t1; show binlog events; gives master-bin.000001 1003 Query 1 1158 use `test`; update t1 set b=b+1 where a= NAME_CONST('var1',_latin1'foo' COLLATE 'latin1_swedish_ci') Trying to see how replication slave would work: --source include/have_log_bin.inc create table t1 ( a varchar (100) collate utf8_unicode_ci, b int ); insert into t1 values ( 'foo' , 1),( 'bar' , 1);   explain format=json update t1 set b=b+1 where a= NAME_CONST( 'var1' ,_latin1 'foo' COLLATE 'latin1_swedish_ci' );   drop table t1; before this MDEV, I get: "attached_condition": "t1.a = convert(NAME_CONST('var1',_latin1'foo' collate latin1_swedish_ci) using utf8mb3)" after this MDEV, I get: mysqltest: At line 8: query 'explain format=json update t1 set b=b+1 where a= NAME_CONST('var1',_latin1'foo' COLLATE 'latin1_swedish_ci')' failed: ER_CANT_AGGREGATE_2COLLATIONS (1267): Illegal mix of collations (utf8mb3_unicode_ci,IMPLICIT) and (latin1_swedish_ci,EXPLICIT) for operation '=' Gosselin , any thoughts?
          psergei Sergei Petrunia made changes -
          Assignee Sergei Petrunia [ psergey ] Dave Gosselin [ JIRAUSER52216 ]
          Status In Review [ 10002 ] Stalled [ 10000 ]
          Gosselin Dave Gosselin made changes -
          Status Stalled [ 10000 ] In Progress [ 3 ]
          Gosselin Dave Gosselin made changes -
          Assignee Dave Gosselin [ JIRAUSER52216 ] Sergei Petrunia [ psergey ]
          Status In Progress [ 3 ] In Review [ 10002 ]
          julien.fritsch Julien Fritsch made changes -
          Labels triage
          julien.fritsch Julien Fritsch made changes -
          Priority Major [ 3 ] Critical [ 2 ]

          Got questions on the updated pull request.

          psergei Sergei Petrunia added a comment - Got questions on the updated pull request.
          psergei Sergei Petrunia made changes -
          Assignee Sergei Petrunia [ psergey ] Dave Gosselin [ JIRAUSER52216 ]
          Status In Review [ 10002 ] Stalled [ 10000 ]
          julien.fritsch Julien Fritsch made changes -
          Fix Version/s 10.6 [ 24028 ]
          Gosselin Dave Gosselin made changes -
          Status Stalled [ 10000 ] In Progress [ 3 ]
          psergei Sergei Petrunia added a comment - - edited

          The patch uses

             thd->change_item_tree(ref, value_item);
          

          call.
          Should it do this or just do ?

            *ref=value_item;
          

          instead?

          Discussed this with sanja_byelkin. Take-aways:
          change_item_tree() is to be used when we want/need the change to be rolled back at the end of execution.
          In our case, this seems to be unnecessary. Although it's not harmful, either.

          or wait.. what if our argument is Item_func_set_collation...

          psergei Sergei Petrunia added a comment - - edited The patch uses thd->change_item_tree( ref , value_item); call. Should it do this or just do ? * ref =value_item; instead? Discussed this with sanja_byelkin . Take-aways: change_item_tree() is to be used when we want/need the change to be rolled back at the end of execution. In our case, this seems to be unnecessary. Although it's not harmful, either. or wait.. what if our argument is Item_func_set_collation...
          Gosselin Dave Gosselin added a comment -

          If there's no harm in leaving it as is, then let's leave it, especially if that's the more general case anyhow.

          Gosselin Dave Gosselin added a comment - If there's no harm in leaving it as is, then let's leave it, especially if that's the more general case anyhow.
          Gosselin Dave Gosselin made changes -
          Assignee Dave Gosselin [ JIRAUSER52216 ] Sergei Petrunia [ psergey ]
          Status In Progress [ 3 ] In Review [ 10002 ]
          mariadb-jira-automation Jira Automation (IT) made changes -
          Zendesk Related Tickets 201680
          Zendesk active tickets 201680

          Review input provided. Almost there.

          psergei Sergei Petrunia added a comment - Review input provided. Almost there.
          psergei Sergei Petrunia made changes -
          Assignee Sergei Petrunia [ psergey ] Dave Gosselin [ JIRAUSER52216 ]
          Status In Review [ 10002 ] Stalled [ 10000 ]
          Gosselin Dave Gosselin made changes -
          Fix Version/s 10.6.19 [ 29833 ]
          Fix Version/s 10.5 [ 23123 ]
          Fix Version/s 10.6 [ 24028 ]
          Resolution Fixed [ 1 ]
          Status Stalled [ 10000 ] Closed [ 6 ]
          JIraAutomate JiraAutomate made changes -
          Fix Version/s 10.11.9 [ 29834 ]
          Fix Version/s 11.1.6 [ 29835 ]
          Fix Version/s 11.2.5 [ 29836 ]
          Fix Version/s 11.4.3 [ 29837 ]
          ralf.gebhardt Ralf Gebhardt made changes -
          mariadb-jira-automation Jira Automation (IT) made changes -
          Zendesk Related Tickets 201680
          Zendesk active tickets 201680
          julien.fritsch Julien Fritsch made changes -
          Labels triage

          People

            Gosselin Dave Gosselin
            valerii Valerii Kravchuk
            Votes:
            1 Vote for this issue
            Watchers:
            11 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.