[MDEV-6797] intcol=NUM is not used to remove intcol from ORDER BY if NUM is quoted Created: 2014-09-26  Updated: 2023-12-05

Status: Stalled
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.0.13
Fix Version/s: 10.1

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: datatype, optimizer, order-by-optimization

Sprint: 10.1.6-2

 Description   

ORDER BY optimizer removes constant items from ORDER BY list. That is, if there is a query

SELECT ... WHERE col1=const  ... ORDER BY col1, ...

then col1 can be removed from ORDER BY list (except for some charsets).

The problem is, this feature doesn't work for some cases where it should.
Test dataset:

CREATE TABLE tb_bug2 (
  pk1 int(11) NOT NULL,
  pk2 varchar(64) NOT NULL,
  col1 varchar(16) DEFAULT NULL,
  PRIMARY KEY (pk1,pk2),
  KEY key1 (pk1,col1,pk2)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
 
INSERT INTO `tb_bug2` VALUES 
(12321321,'a8f5f167f44f4964e6c998dee827110c','video'),
(12321321,'d77a17a3659ffa60c54e0ea17b6c6d16','video'),
(12321321,'wwafdsafdsafads','video'),
(12321321,'696aa249f0738e8181957dd57c2d7d0b','video-2014-09-23'),
(12321321,'802f9f29584b486f356693e3aa4ef0af','video=sdsd'),
(12321321,'2f94543ff74aab82e9a058b4e8316d75','video=sdsdsds'),
(12321321,'c1316b9df0d203fd1b9035308de52a0a','video=sdsdsdsdsd');

Now, lets try two queries, one with pk1=const and the other with pk1='const':

explain 
SELECT pk2 FROM tb_bug2 USE INDEX(key1) 
WHERE pk1 = 123 AND col1 = 'video' 
ORDER BY pk1 DESC, col1 DESC, pk2 DESC LIMIT 21;
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  tb_bug2 ref     key1    key1    55      const,const     1       Using where; Using index

explain 
SELECT pk2 FROM tb_bug2 USE INDEX(key1) 
WHERE pk1 = '123' AND col1 = 'video' 
ORDER BY pk1 DESC, col1 DESC, pk2 DESC LIMIT 21;
id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  tb_bug2 ref     key1    key1    55      const,const     1       Using where; Using index; Using filesort



 Comments   
Comment by Sergei Petrunia [ 2014-09-26 ]
  • The problem is not related to not-yet-pushed order by optimization fixes, MDEV-6657 or MDEV-6402 or MDEV-6796
  • The problem is repeatable on mysql-5.6
Comment by Sergei Petrunia [ 2014-09-26 ]

Note that the ref optimizer is able to make use of "pk1='123'". It's only ORDER BY optimizer that can't.

Comment by Sergei Petrunia [ 2014-09-26 ]

Narrowed it down to this code:

test_if_equality_guarantees_uniqueness(Item *l, Item *r)
{
  return (r->const_item() || !(r->used_tables() & ~OUTER_REF_TABLE_BIT)) &&
    item_cmp_type(l->cmp_type(), r->cmp_type()) == l->cmp_type() &&
    (l->cmp_type() != STRING_RESULT ||
     l->collation.collation == r->collation.collation);
}

ORDER BY optimizer uses it to check whether rows that satisfy "item=const" are all equal from the point of view of "ORDER BY item".

Here,

  • l is "pk1" (integer column).
  • r is either integer constant or a string constant.
  • item_cmp_type(l, r) = DOUBLE_RESULT.

Apparently, the restriction in this function is too strict.

Comment by Sergei Petrunia [ 2014-09-28 ]

Discussed the problem with bar. Came to conclusion that whenever the column's result_type is in

{ REAL_RESULT, INT_RESULT, DECIMAL_RESULT , TIME_RESULT}

then the comparison guarantees uniqueness.

Comment by Sergei Petrunia [ 2014-09-28 ]

Consider a counter-example:

(gdb) p 0xFFFF FFFF FFFF FFFF
$1 = 18446744073709551615

create table t1 (a bigint unsigned);
insert into t1 values 
(18446744073709551000),
(18446744073709551100),
(18446744073709551200);

select a, 
       a='18446744073709551200',
       a=18446744073709551200
from t1;
+----------------------+--------------------------+------------------------+
| a                    | a='18446744073709551200' | a=18446744073709551200 |
+----------------------+--------------------------+------------------------+
| 18446744073709551000 |                        0 |                      0 |
| 18446744073709551100 |                        0 |                      0 |
| 18446744073709551200 |                        1 |                      1 |
+----------------------+--------------------------+------------------------+

Hmm this was supposed to be a counter example, but it's not. If the comparison a='...' has been done by comparing double values, then all values in the table would have been equal to the constant, because of insufficient precision. This didn't happen, because of this code:

 void Item_bool_func2::fix_length_and_dec()
...
    Make a special case of compare with fields to get nicer comparisons
    of bigint numbers with constant string.
    This directly contradicts the manual (number and a string should
    be compared as doubles), but seems to provide more
    "intuitive" behavior in some cases (but less intuitive in others).
...

and the code here sets args[i]->cmp_context to show which type will be used for comparison.

Comment by Alexander Barkov [ 2014-09-28 ]

Hi Sergei,

Discussed the problem with Alexander Barkov. Came to conclusion that whenever the column's result_type is in

Unknown macro: { REAL_RESULT, INT_RESULT, DECIMAL_RESULT , TIME_RESULT}

then the comparison guarantees uniqueness.

It's cmp_type(), not result_type()!!!
Note: temporal types (DATE, TIME, DATETIME) have resul_type()==STRING_RESULT,
but cmp_type()==TIME_RESULT.

Comment by Sergei Golubchik [ 2014-10-10 ]

Apparently, the restriction in this function is too strict.

the restriction is formally correct. But it doesn't take into account that hack in Item_bool_func2::fix_length_and_dec().

Perhaps, instead of having that hack and — on top of that — introducing another hack in test_if_equality_guarantees_uniqueness() it'd be cleaner to replace the string literal with a numeric literal (under the same conditions that Item_bool_func2::fix_length_and_dec() checks now).

Comment by Julien Fritsch [ 2023-12-05 ]

Automated message:
----------------------------
Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

Generated at Thu Feb 08 07:14:39 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.