Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-4539

WHERE short_char_column='literal' ignores the collation on a huge table

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.6.1, 5.5.1
    • Fix Version/s: 5.5.2
    • Component/s: PrimProc
    • Labels:
      None
    • Sprint:
      2021-3

      Description

      Preparing a dump file with data:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (c CHAR(4)) ENGINE=MyISAM;
      DELIMITER $$
      FOR i IN 0..90000 DO
        INSERT INTO t1 VALUES ('aaaa');
        INSERT INTO t1 VALUES ('bbbb');
        INSERT INTO t1 VALUES ('AAAA');
        INSERT INTO t1 VALUES ('BBBB');
      END FOR;
      $$
      DELIMITER ;
      SELECT c INTO OUTFILE 't1.txt' FROM t1;
      

      Creating and populating a ColumnStore table:

      DROP TABLE IF EXISTS t2;
      CREATE TABLE t2 (c CHAR(4) CHARACTER SET latin1 COLLATE latin1_bin) ENGINE=ColumnStore;
      LOAD DATA INFILE 't1.txt' INTO TABLE t2;
      

      Querying the ColumnStore table:

      SELECT * FROM t2 WHERE c='aaaa';
      

      +------+
      | c    |
      +------+
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      ........
      | aaaa |
      | AAAA |
      | aaaa |
      | AAAA |
      +------+
      147746 rows in set (0.054 sec)
      

      Notice, the leading rows in the result set consist only of the expected values, but the trailing rows contains values with capital letter 'AAAA', which is wrong.

      A DISTINCT query also returns a wrong result:

      SELECT DISTINCT c FROM t2 WHERE c='aaaa';
      

      +------+
      | c    |
      +------+
      | aaaa |
      | AAAA |
      +------+
      

      This script with INSERT..SELECT also returns a wrong result:

      DROP TABLE IF EXISTS t3;
      CREATE TABLE t3 LIKE t2;
      INSERT INTO t3 SELECT * FROM t2 WHERE c='aaaa';
      SELECT c, COUNT(*) FROM t3 GROUP BY c;
      

      +------+----------+
      | c    | COUNT(*) |
      +------+----------+
      | AAAA |    57745 |
      | aaaa |    90001 |
      +------+----------+
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              dleeyh Daniel Lee
              Reporter:
              bar Alexander Barkov
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Git Integration