Details

    Description

      I run the following statements, in which the query returns an incorrect result.

      CREATE TABLE t1 (c1 TINYBLOB, UNIQUE (c1(2)));
      INSERT INTO t1 (c1) VALUES (1);
      SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); -- actual: {}, expected: {1}
      

      I got the following query plan.

      mysql> EXPLAIN SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1);
      +------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
      +------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
      |    1 | SIMPLE      | t1    | range | c1            | c1   | 5       | NULL | 1    | Using where |
      +------+-------------+-------+-------+---------------+------+---------+------+------+-------------+
      

      Attachments

        Issue Links

          Activity

            John Jove John Jove added a comment -

            Hi, shall we change this bug status as confirmed?
            I also try the BLOB data type with NOT NULL, in which a correct result is returned.

            CREATE TABLE t1 (c1 TINYBLOB NOT NULL);
            INSERT INTO t1 VALUES (1);
            SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); -- {0x31}
            

            It seems that the NOT NULL constraint causes a different calculation from the UNIQUE constraint against the BLOB data type.

            John Jove John Jove added a comment - Hi, shall we change this bug status as confirmed? I also try the BLOB data type with NOT NULL, in which a correct result is returned. CREATE TABLE t1 (c1 TINYBLOB NOT NULL ); INSERT INTO t1 VALUES (1); SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); -- {0x31} It seems that the NOT NULL constraint causes a different calculation from the UNIQUE constraint against the BLOB data type.

            CREATE TABLE t1 (c1 TINYBLOB, UNIQUE (c1(2))) engine=innodb;
            INSERT INTO t1 (c1) VALUES (1);
            SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # no rows
            alter table t1 engine=myisam;
            SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # 1 row
            drop table t1;
            

            serg Sergei Golubchik added a comment - CREATE TABLE t1 (c1 TINYBLOB, UNIQUE (c1(2))) engine=innodb; INSERT INTO t1 (c1) VALUES (1); SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # no rows alter table t1 engine=myisam; SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # 1 row drop table t1;
            John Jove John Jove added a comment -

            I observe that if we specify the index without the length of column c1 like follows, a correct result is returned.

            CREATE TABLE t1 (c1 TINYBLOB, UNIQUE (c1)) engine=innodb;
            INSERT INTO t1 (c1) VALUES (1);
            SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); -- {1}
            

            John Jove John Jove added a comment - I observe that if we specify the index without the length of column c1 like follows, a correct result is returned. CREATE TABLE t1 (c1 TINYBLOB, UNIQUE (c1)) engine=innodb; INSERT INTO t1 (c1) VALUES (1); SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); -- {1}
            Gosselin Dave Gosselin added a comment -

            I tried the following on MySQL 8.3.0:

            mysql> select @@version;
            +-------------+
            | @@version   |
            +-------------+
            | 8.3.0-debug |
            +-------------+
            1 row in set (0.00 sec)
             
            mysql> CREATE TABLE t1 (c1 TINYBLOB, UNIQUE (c1(2))) engine=innodb;
            Query OK, 0 rows affected (0.04 sec)
             
            mysql> INSERT INTO t1 (c1) VALUES (1);
            Query OK, 1 row affected (0.01 sec)
             
            mysql> SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # no rows
            Empty set (0.00 sec)
             
            mysql> alter table t1 engine=myisam;
            Query OK, 1 row affected (0.04 sec)
            Records: 1  Duplicates: 0  Warnings: 0
             
            mysql> SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # 1 row
            +------------+
            | c1         |
            +------------+
            | 0x31       |
            +------------+
            1 row in set, 1 warning (0.00 sec)
             
            mysql> drop table t1;
            Query OK, 0 rows affected (0.01 sec)
            

            and

            mysql> CREATE TABLE t1 (c1 TINYBLOB NOT NULL);
            Query OK, 0 rows affected (0.04 sec)
             
            mysql> INSERT INTO t1 VALUES (1);
            Query OK, 1 row affected (0.00 sec)
             
            mysql> SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1);
            +--------+
            | c1     |
            +--------+
            | 0x31   |
            +--------+
            1 row in set, 1 warning (0.00 sec)
            

            The following is not supported on MySQL 8.3.0:

            mysql> CREATE TABLE t1 (c1 TINYBLOB, UNIQUE (c1)) engine=innodb;
            ERROR 1170 (42000): BLOB/TEXT column 'c1' used in key specification without a key length
            

            Gosselin Dave Gosselin added a comment - I tried the following on MySQL 8.3.0: mysql> select @@version; +-------------+ | @@version | +-------------+ | 8.3.0-debug | +-------------+ 1 row in set (0.00 sec)   mysql> CREATE TABLE t1 (c1 TINYBLOB, UNIQUE (c1(2))) engine=innodb; Query OK, 0 rows affected (0.04 sec)   mysql> INSERT INTO t1 (c1) VALUES (1); Query OK, 1 row affected (0.01 sec)   mysql> SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # no rows Empty set (0.00 sec)   mysql> alter table t1 engine=myisam; Query OK, 1 row affected (0.04 sec) Records: 1 Duplicates: 0 Warnings: 0   mysql> SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # 1 row +------------+ | c1 | +------------+ | 0x31 | +------------+ 1 row in set, 1 warning (0.00 sec)   mysql> drop table t1; Query OK, 0 rows affected (0.01 sec) and mysql> CREATE TABLE t1 (c1 TINYBLOB NOT NULL); Query OK, 0 rows affected (0.04 sec)   mysql> INSERT INTO t1 VALUES (1); Query OK, 1 row affected (0.00 sec)   mysql> SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); +--------+ | c1 | +--------+ | 0x31 | +--------+ 1 row in set, 1 warning (0.00 sec) The following is not supported on MySQL 8.3.0: mysql> CREATE TABLE t1 (c1 TINYBLOB, UNIQUE (c1)) engine=innodb; ERROR 1170 (42000): BLOB/TEXT column 'c1' used in key specification without a key length
            Gosselin Dave Gosselin added a comment -

            This is a problem with how range optimization is performed during make_join_statistics. The following is NOT a solution for this bug; rather it illustrates the point. If we disable range analysis for all queries, then we will return a correct result. This can be done by changing the following conditional in make_join_statistics such that it always evaluates to false (in fact range analysis is disallowed for materialized subqueries, derived tables, and derived views as noted).

               6208       /*
               6209         Perform range analysis if there are keys it could use (1).
               6210         Don't do range analysis for materialized subqueries (2).
               6211         Don't do range analysis for materialized derived tables/views (3)
               6212       */
               6213       const bool DEBUG_RANGE_ENABLED= false;
               6214       if (DEBUG_RANGE_ENABLED && (!s->const_keys.is_clear_all() ||
               6215            !bitmap_is_clear_all(map: &s->table->cond_set)) &&              // (1)
               6216           !s->table->is_filled_at_execution() &&                      // (2)
               6217           !(s->table->pos_in_table_list->derived &&                   // (3)
               6218             s->table->pos_in_table_list->is_materialized_derived()))  // (3)
               6219       {
               6220         bool impossible_range= FALSE;
               6221         ha_rows records= HA_ROWS_MAX;
               6222         SQL_SELECT *select= 0;
               6223         Item **sargable_cond= NULL;
               6224         if (!s->const_keys.is_clear_all())
               

            I will continue to debug the issue to a root cause from this point.

            Gosselin Dave Gosselin added a comment - This is a problem with how range optimization is performed during make_join_statistics . The following is NOT a solution for this bug; rather it illustrates the point. If we disable range analysis for all queries, then we will return a correct result. This can be done by changing the following conditional in make_join_statistics such that it always evaluates to false (in fact range analysis is disallowed for materialized subqueries, derived tables, and derived views as noted). 6208 /* 6209 Perform range analysis if there are keys it could use (1). 6210 Don't do range analysis for materialized subqueries (2). 6211 Don't do range analysis for materialized derived tables/views (3) 6212 */ 6213 const bool DEBUG_RANGE_ENABLED= false; 6214 if (DEBUG_RANGE_ENABLED && (!s->const_keys.is_clear_all() || 6215 !bitmap_is_clear_all(map: &s->table->cond_set)) && // (1) 6216 !s->table->is_filled_at_execution() && // (2) 6217 !(s->table->pos_in_table_list->derived && // (3) 6218 s->table->pos_in_table_list->is_materialized_derived())) // (3) 6219 { 6220 bool impossible_range= FALSE; 6221 ha_rows records= HA_ROWS_MAX; 6222 SQL_SELECT *select= 0; 6223 Item **sargable_cond= NULL; 6224 if (!s->const_keys.is_clear_all()) I will continue to debug the issue to a root cause from this point.
            Gosselin Dave Gosselin added a comment - - edited

            BETWEEN AND does not correctly cast its arguments according to https://mariadb.com/kb/en/type-conversion/ when two of the arguments are constant and the third is a column reference.

            In the case of SELECT 'a' BETWEEN 0 AND 1, 'a' is correctly cast such that BETWEEN AND makes comparisons as decimal values, as documented in the Type Conversion KB article linked above. Since 'a' casts to 0 and BETWEEN AND is left-bound inclusive, the SELECT results in 1:

            MariaDB [test]> SELECT 'a' BETWEEN 0 AND 1;
            +---------------------+
            | 'a' BETWEEN 0 AND 1 |
            +---------------------+
            |                   1 |
            +---------------------+
            1 row in set, 1 warning (0.001 sec)
             
            MariaDB [test]> show warnings;
            +---------+------+----------------------------------------+
            | Level   | Code | Message                                |
            +---------+------+----------------------------------------+
            | Warning | 1292 | Truncated incorrect DECIMAL value: 'a' |
            +---------+------+----------------------------------------+
            1 row in set (0.000 sec)
            

            However, for the query from the 'Description' field of this ticket, SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1), 'a' is not cast correctly. Instead, the ASCII value for 'a', 0x61, is used during range optimization when constructing the keys that are passed to the storage engine. This happens because Item_func_between::get_func_mm_tree passes 'a' without casting it to a decimal into get_mm_parts which writes the ASCII value to the corresponding item field. This happens because it just writes the binary bits as they appear in memory directly into the destination field. Later, during TRP_RANGE::make_quick, the min/max tree returned by Item_func_between::get_func_mm_tree is used to create a new QUICK_RANGE_SELECT object with min and max keys corresponding to 0x61, which are passed to the storage engine during handler::index_read_map. We can see this by placing a debugger in that method and inspecting the memory corresponding to the keys (the key length in this example is 5, so the first five bytes here correspond to the key):

            0x12915b950: 00 01 00 61 00 a5 8f 8f a5 8f 8f 8f 8f 8f 8f 8f  ...a............
            

            The fourth byte indicates the key 0x61 (and you can see the 'a' in the ASCII readout). It should be zero here if the typecast rules were correctly followed during range optimization (as they are for the case of all constants).

            One solution is to implement get_mm_leaf on Item_func_between, overriding the implementation given by its parent Item_bool_func, to inspect the arguments of BETWEEN AND and apply the casting rules such that the correct key value is used in range construction.

            Here are some similar queries on the same table that show how casting affects BETWEEN AND:

            select c1 from t1 where '0' between 0 and (c1);  -- MRR and correct result because 0x30 ('0') is < all values in table
            select c1 from t1 where c1 in (select c1 from t1 where c1 mod 2 = 0);  -- Semi-join shows what should happen when key_len > 0 during ha_innodb::index_read, helpful to debug from join_read_key2 and index_read_map what the key looks like in memory along with key_len and flag, comparing against wrong result for MRR.
            select c1 from t1 where 1 between 0 and (c1);  -- No MRR, rr_sequential used because of collation mismatch during range construction
            select c1 from t1 where 'a' between 0 and (c1);  -- MRR, wrong result because 0x61 ('a') > any value in the table (flag is HA_READ_KEY_OR_NEXT)
            select c1 from t1 where 0x01 between 0 and (c1);  -- MRR, produces correct result because 0x01 < 0x31 and flag is HA_READ_KEY_OR_NEXT
            select c1 from t1 where 0 between 0 and (c1);  -- No MRR,  rr_sequential used because of collation mismatch during range construction
            

            Gosselin Dave Gosselin added a comment - - edited BETWEEN AND does not correctly cast its arguments according to https://mariadb.com/kb/en/type-conversion/ when two of the arguments are constant and the third is a column reference. In the case of SELECT 'a' BETWEEN 0 AND 1 , 'a' is correctly cast such that BETWEEN AND makes comparisons as decimal values, as documented in the Type Conversion KB article linked above. Since 'a' casts to 0 and BETWEEN AND is left-bound inclusive, the SELECT results in 1: MariaDB [test]> SELECT 'a' BETWEEN 0 AND 1; +---------------------+ | 'a' BETWEEN 0 AND 1 | +---------------------+ | 1 | +---------------------+ 1 row in set, 1 warning (0.001 sec)   MariaDB [test]> show warnings; +---------+------+----------------------------------------+ | Level | Code | Message | +---------+------+----------------------------------------+ | Warning | 1292 | Truncated incorrect DECIMAL value: 'a' | +---------+------+----------------------------------------+ 1 row in set (0.000 sec) However, for the query from the 'Description' field of this ticket, SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1) , 'a' is not cast correctly. Instead, the ASCII value for 'a', 0x61, is used during range optimization when constructing the keys that are passed to the storage engine. This happens because Item_func_between::get_func_mm_tree passes 'a' without casting it to a decimal into get_mm_parts which writes the ASCII value to the corresponding item field. This happens because it just writes the binary bits as they appear in memory directly into the destination field. Later, during TRP_RANGE::make_quick, the min/max tree returned by Item_func_between::get_func_mm_tree is used to create a new QUICK_RANGE_SELECT object with min and max keys corresponding to 0x61, which are passed to the storage engine during handler::index_read_map. We can see this by placing a debugger in that method and inspecting the memory corresponding to the keys (the key length in this example is 5, so the first five bytes here correspond to the key): 0x12915b950: 00 01 00 61 00 a5 8f 8f a5 8f 8f 8f 8f 8f 8f 8f ...a............ The fourth byte indicates the key 0x61 (and you can see the 'a' in the ASCII readout). It should be zero here if the typecast rules were correctly followed during range optimization (as they are for the case of all constants). One solution is to implement get_mm_leaf on Item_func_between, overriding the implementation given by its parent Item_bool_func, to inspect the arguments of BETWEEN AND and apply the casting rules such that the correct key value is used in range construction. Here are some similar queries on the same table that show how casting affects BETWEEN AND : select c1 from t1 where '0' between 0 and (c1); -- MRR and correct result because 0x30 ('0') is < all values in table select c1 from t1 where c1 in (select c1 from t1 where c1 mod 2 = 0); -- Semi-join shows what should happen when key_len > 0 during ha_innodb::index_read, helpful to debug from join_read_key2 and index_read_map what the key looks like in memory along with key_len and flag, comparing against wrong result for MRR. select c1 from t1 where 1 between 0 and (c1); -- No MRR, rr_sequential used because of collation mismatch during range construction select c1 from t1 where 'a' between 0 and (c1); -- MRR, wrong result because 0x61 ('a') > any value in the table (flag is HA_READ_KEY_OR_NEXT) select c1 from t1 where 0x01 between 0 and (c1); -- MRR, produces correct result because 0x01 < 0x31 and flag is HA_READ_KEY_OR_NEXT select c1 from t1 where 0 between 0 and (c1); -- No MRR, rr_sequential used because of collation mismatch during range construction
            bar Alexander Barkov added a comment -

            INNODB and MyISAM work differently because because MyISAM treats tables with a single row in a special way.

            If we insert two records, there is no a difference:

            CREATE OR REPLACE TABLE t1 (c1 TINYBLOB, UNIQUE (c1(2))) engine=innodb;
            INSERT INTO t1 (c1) VALUES (1),(2);
            SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # no rows
            alter table t1 engine=myisam;
            SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # no rows
            drop table t1;
            

            bar Alexander Barkov added a comment - INNODB and MyISAM work differently because because MyISAM treats tables with a single row in a special way. If we insert two records, there is no a difference: CREATE OR REPLACE TABLE t1 (c1 TINYBLOB, UNIQUE (c1(2))) engine=innodb; INSERT INTO t1 (c1) VALUES (1),(2); SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # no rows alter table t1 engine=myisam; SELECT c1 FROM t1 WHERE 'a' BETWEEN 0 AND (c1); # no rows drop table t1;
            Gosselin Dave Gosselin added a comment -

            Yes, that is true because of this special case code in sql_select. I've been testing with more than one row to avoid this special case:

                if (!table->is_filled_at_execution() &&
                    (table->s->system ||
                     (table->file->stats.records <= 1 &&
                      (table->file->ha_table_flags() & HA_STATS_RECORDS_IS_EXACT)) ||
                     all_partitions_pruned_away) &&
            	!s->dependent &&
                    !table->fulltext_searched && !join->no_const_tables)
                {
                  set_position(join,const_count++,s,(KEYUSE*) 0);
                  no_rows_const_tables |= table->map;
                }
            

            Gosselin Dave Gosselin added a comment - Yes, that is true because of this special case code in sql_select. I've been testing with more than one row to avoid this special case: if (!table->is_filled_at_execution() && (table->s->system || (table->file->stats.records <= 1 && (table->file->ha_table_flags() & HA_STATS_RECORDS_IS_EXACT)) || all_partitions_pruned_away) && !s->dependent && !table->fulltext_searched && !join->no_const_tables) { set_position(join,const_count++,s,(KEYUSE*) 0); no_rows_const_tables |= table->map; }

            People

              Gosselin Dave Gosselin
              John Jove John Jove
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.