Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-4539

WHERE short_char_column='literal' ignores the collation on a huge table

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 5.5.1, 5.6.1
    • 5.5.2
    • PrimProc
    • None
    • 2021-3

    Description

      Preparing a dump file with data:

      DROP TABLE IF EXISTS t1;
      CREATE TABLE t1 (c CHAR(4)) ENGINE=MyISAM;
      DELIMITER $$
      FOR i IN 0..90000 DO
        INSERT INTO t1 VALUES ('aaaa');
        INSERT INTO t1 VALUES ('bbbb');
        INSERT INTO t1 VALUES ('AAAA');
        INSERT INTO t1 VALUES ('BBBB');
      END FOR;
      $$
      DELIMITER ;
      SELECT c INTO OUTFILE 't1.txt' FROM t1;
      

      Creating and populating a ColumnStore table:

      DROP TABLE IF EXISTS t2;
      CREATE TABLE t2 (c CHAR(4) CHARACTER SET latin1 COLLATE latin1_bin) ENGINE=ColumnStore;
      LOAD DATA INFILE 't1.txt' INTO TABLE t2;
      

      Querying the ColumnStore table:

      SELECT * FROM t2 WHERE c='aaaa';
      

      +------+
      | c    |
      +------+
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      | aaaa |
      ........
      | aaaa |
      | AAAA |
      | aaaa |
      | AAAA |
      +------+
      147746 rows in set (0.054 sec)
      

      Notice, the leading rows in the result set consist only of the expected values, but the trailing rows contains values with capital letter 'AAAA', which is wrong.

      A DISTINCT query also returns a wrong result:

      SELECT DISTINCT c FROM t2 WHERE c='aaaa';
      

      +------+
      | c    |
      +------+
      | aaaa |
      | AAAA |
      +------+
      

      This script with INSERT..SELECT also returns a wrong result:

      DROP TABLE IF EXISTS t3;
      CREATE TABLE t3 LIKE t2;
      INSERT INTO t3 SELECT * FROM t2 WHERE c='aaaa';
      SELECT c, COUNT(*) FROM t3 GROUP BY c;
      

      +------+----------+
      | c    | COUNT(*) |
      +------+----------+
      | AAAA |    57745 |
      | aaaa |    90001 |
      +------+----------+
      

      Attachments

        Issue Links

          Activity

            People

              dleeyh Daniel Lee (Inactive)
              bar Alexander Barkov
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.