[MCOL-4417] Non-equality comparison operators do not work well with NOPAD collations Created: 2020-11-27  Updated: 2021-02-15  Resolved: 2020-12-02

Status: Closed
Project: MariaDB ColumnStore
Component/s: PrimProc
Affects Version/s: 5.4.3
Fix Version/s: 5.5.1

Type: Bug Priority: Major
Reporter: Alexander Barkov Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: datatype

Issue Links:
Blocks
is blocked by MCOL-4422 Remove mariadb.h and my_sys.h depende... Closed
Relates
relates to MCOL-495 Make string comparison not case sensi... Closed
relates to MCOL-4064 Make JOIN collation aware Closed
relates to MCOL-4498 LIKE is not collation aware Closed
relates to MCOL-4065 DISTINCT is case sensitive Closed
relates to MCOL-4388 Equality does not respect the NOPAD c... Closed
relates to MCOL-4428 WHERE char6col <='STR' OR char6col IS... Closed
relates to MCOL-4539 WHERE short_char_column='literal' ign... Closed

 Description   

Long VARCHAR:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(32) CHARACTER SET utf8 COLLATE utf8_general_nopad_ci) ENGINE=ColumnStore;
INSERT INTO t1 VALUES ('a');
INSERT INTO t1 VALUES ('ã');
INSERT INTO t1 VALUES ('a ');
INSERT INTO t1 VALUES ('ã ');
INSERT INTO t1 VALUES ('a  ');
INSERT INTO t1 VALUES ('ã  ');
SELECT quote(a) FROM t1 WHERE a<='a  ';

+----------+
| quote(a) |
+----------+
| 'a'      |
+----------+

Looks wrong. The condition should match all records.

Short VARCHAR:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(4) CHARACTER SET utf8 COLLATE utf8_general_nopad_ci) ENGINE=ColumnStore;
INSERT INTO t1 VALUES ('a');
INSERT INTO t1 VALUES ('ã');
INSERT INTO t1 VALUES ('a ');
INSERT INTO t1 VALUES ('ã ');
INSERT INTO t1 VALUES ('a  ');
INSERT INTO t1 VALUES ('ã  ');
SELECT quote(a) FROM t1 WHERE a<='a  ';

+----------+
| quote(a) |
+----------+
| 'a'      |
| 'a '     |
| 'a  '    |
+----------+

It returns a different result set, but it is still wrong.

With other engines it works as expected:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(32) CHARACTER SET utf8 COLLATE utf8_general_nopad_ci) ENGINE=MyISAM;
INSERT INTO t1 VALUES ('a');
INSERT INTO t1 VALUES ('ã');
INSERT INTO t1 VALUES ('a ');
INSERT INTO t1 VALUES ('ã ');
INSERT INTO t1 VALUES ('a  ');
INSERT INTO t1 VALUES ('ã  ');
SELECT quote(a) FROM t1 WHERE a<='a  ';

+----------+
| quote(a) |
+----------+
| 'a'      |
| 'ã'      |
| 'a '     |
| 'ã '     |
| 'a  '    |
| 'ã  '    |
+----------+

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a VARCHAR(4) CHARACTER SET utf8 COLLATE utf8_general_nopad_ci) ENGINE=MyISAM;
INSERT INTO t1 VALUES ('a');
INSERT INTO t1 VALUES ('ã');
INSERT INTO t1 VALUES ('a ');
INSERT INTO t1 VALUES ('ã ');
INSERT INTO t1 VALUES ('a  ');
INSERT INTO t1 VALUES ('ã  ');
SELECT quote(a) FROM t1 WHERE a<='a  ';

+----------+
| quote(a) |
+----------+
| 'a'      |
| 'ã'      |
| 'a '     |
| 'ã '     |
| 'a  '    |
| 'ã  '    |
+----------+



 Comments   
Comment by Daniel Lee (Inactive) [ 2020-12-02 ]

Build verified: 5.5.1 (Drone 1233)

Verified the test case in the ticket description.

Generated at Thu Feb 08 02:50:11 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.