[MCOL-4064] Make JOIN collation aware Created: 2020-06-15  Updated: 2021-05-17  Resolved: 2020-12-10

Status: Closed
Project: MariaDB ColumnStore
Component/s: PrimProc
Affects Version/s: 1.5.1
Fix Version/s: 5.5.1

Type: Bug Priority: Critical
Reporter: David Hall (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: datatype

Issue Links:
Issue split
split from MCOL-3536 Order by with UTF Closed
PartOf
includes MCOL-4724 Make disk-based JOIN collation and wi... Open
is part of MCOL-4343 umbrella for tech debt issues Open
Relates
relates to MCOL-4065 DISTINCT is case sensitive Closed
relates to MCOL-4498 LIKE is not collation aware Closed
relates to MCOL-495 Make string comparison not case sensi... Closed
relates to MCOL-4388 Equality does not respect the NOPAD c... Closed
relates to MCOL-4417 Non-equality comparison operators do ... Closed
relates to MCOL-4428 WHERE char6col <='STR' OR char6col IS... Closed
relates to MCOL-4534 MariaDB collation library: improve co... Stalled
relates to MCOL-4539 WHERE short_char_column='literal' ign... Closed
Epic Link: ColumnStore Compatibility Improvements
Sprint: 2020-8

 Description   

Columnstore performs joins on char types using a binary compare. Server uses case-insensitive compare. We need to be compatible.



 Comments   
Comment by David Hall (Inactive) [ 2020-06-19 ]

I've linked this to MCOL-495. While predicates have been modified to use collation, many "where a=b" type statements are optimized to use a JOIN rather than a predicate, so we can't say MCOL-495 is complete until JOINs are complete.

Comment by Daniel Lee (Inactive) [ 2020-12-03 ]

Build tested: 5.5.1 (Drone 1251)

Issues found. If we want to fix these issues for 5.5.1, I will test again when the fix is available. If we defer additional fix for future release, then please create a MCOL to track them. Thx

Scenarios test for both case sensitive and insensitive tables
Case sensitive tables worked fine

For case insensitive tables
1. single table, column-to-string comparison, PASSED
2. single table, column-to-column comparison, FAILED
3. single table, self-join, FAILED
4. two tables, join, FAILED

The interesting thing is right after creating the tables, test #3 passed. But after #2 failed, #3 also would fail.

Case insensitive table

MariaDB [mytest]> create table t1 (c1 char(5), c2 char(5)) engine=columnstore collate=latin1_general_ci;
Query OK, 0 rows affected (0.147 sec)

MariaDB [mytest]> insert into t1 values ('hello','Hello');
Query OK, 1 row affected (0.154 sec)

MariaDB [mytest]> create table t4 (c1 char(5), c2 char(5)) engine=columnstore collate=latin1_general_ci;
Query OK, 0 rows affected (0.182 sec)

MariaDB [mytest]> insert into t4 values ('hello','Hello');
Query OK, 1 row affected (0.158 sec)

2.
MariaDB [mytest]> select * from t1 where c1=c2;
Empty set (0.012 sec)

3.
MariaDB [mytest]> select * from t1, t1 t2 where t1.c1=t2.c2;
Empty set (0.144 sec)

4.
MariaDB [mytest]> select * from t1, t4 where t1.c1=t4.c2;
Empty set (0.030 sec)

Comment by Daniel Lee (Inactive) [ 2020-12-10 ]

Build tested: 5.5.1 (Drone 1313)

The issue has been fixed

Generated at Thu Feb 08 02:47:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.