Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24319

Create semi-join materialization tables with the collations used for performing the IN-comparison

    XMLWordPrintable

    Details

      Description

      Here is a test case demonstrating the issue

      h3 Query 1:

      CREATE TABLE t1(a VARCHAR(50) collate utf8_general_ci, b INT);
      INSERT INTO t1 VALUES ('abc',1), ('def', 2), ('ghi', 3), ('jkl', 4), ('mno', 5);
       
      CREATE TABLE t2(a VARCHAR(50) collate utf8mb4_general_ci, b INT);
      INSERT INTO t2 VALUES ('abc',1), ('def', 2), ('ghi', 3), ('jkl', 4), ('mno', 5);
      

      MariaDB [test]> EXPLAIN SELECT * FROM t1 WHERE t1.a IN (SELECT t2.a FROM t2);
      +------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                                           |
      +------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------------------+
      |    1 | PRIMARY     | t1    | ALL  | NULL          | NULL | NULL    | NULL | 5    |                                                                 |
      |    1 | PRIMARY     | t2    | ALL  | NULL          | NULL | NULL    | NULL | 5    | Using where; FirstMatch(t1); Using join buffer (flat, BNL join) |
      +------+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------------------+
      2 rows in set (0.004 sec)
      
      

      So here we don't use materialization strategy, lets look in the optimizer trace why this happens:

                        "transformation": {
                          "select_id": 2,
                          "from": "IN (SELECT)",
                          "to": "materialization",
                          "possible": false,
                          "cause": "types mismatch"
                        }
      

      So the reason why materialization does not happen is because there is a type mistmatch for the IN comparison. The column
      t2.a is utf8mb4_general_ci and t1.a us utf8_general_ci. So we could allow materilazation when their is a mistmatch of datatypes.

      IN predicate to IN subquery optimization would benefit from this too.

      IN-predicate to IN subquery conversion

      MariaDB [test]> create table t1(a int);
      Query OK, 0 rows affected (0.047 sec)
       
      MariaDB [test]> insert into t1 select seq from seq_1_to_100;
      Query OK, 100 rows affected (0.025 sec)
       
      MariaDB [test]> set in_predicate_conversion_threshold=9;
      Query OK, 0 rows affected (0.001 sec)
      
      

      Query 2:

      select * from t1 where a in ('0','1','2','3','4','5','6','7','8','9')
      the column a has different datatype as the elements in the IN list.

      MariaDB [test]> explain select * from t1 where a in ('0','1','2','3','4','5','6','7','8','9');
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      | id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |  100 | Using where |
      +------+-------------+-------+------+---------------+------+---------+------+------+-------------+
      1 row in set (0.002 sec)
      

      For query 2 it would be good if we allow materialization strategy and create the materialized table fields with the collations
      used to the comparison of the IN predicate or IN subquery.This way, current collation-based limitations will be lifted.

      With this task we could allow materialization strategy for further cases.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              psergei Sergei Petrunia
              Reporter:
              varun Varun Gupta (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Dates

                Created:
                Updated:

                  Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.