Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30072

Wrong ORDER BY for a partitioned prefix key + NOPAD

Details

    Description

      I create a table:

      CREATE OR REPLACE TABLE t1
      (
        id INT,
        data VARCHAR(20),
        KEY data_id (data,id)
      ) COLLATE utf8mb3_unicode_nopad_ci ENGINE=MyISAM
      PARTITION BY RANGE COLUMNS (id)
      (
        PARTITION p10 VALUES LESS THAN (20),
        PARTITION p20 VALUES LESS THAN MAXVALUE
      );
      INSERT INTO t1 VALUES (30, 'ss '), (10, 'ß ');
      

      Now I run ORDER BY queries:

      SELECT id FROM t1 WHERE data='ss ' ORDER BY id;
      

      +------+
      | id   |
      +------+
      |   10 |
      |   30 |
      +------+
      

      SELECT id FROM t1 WHERE data='ss ' ORDER BY id DESC;
      

      +------+
      | id   |
      +------+
      |   30 |
      |   10 |
      +------+
      

      Looks good so far.

      Now I alter the table changing the full key to a prefix key:

      ALTER TABLE t1 DROP KEY data_id, ADD KEY data_id2(data(10),id);
      

      And return ORDER BY queries again:

      SELECT id FROM t1 WHERE data='ss ' ORDER BY id;
      

      +------+
      | id   |
      +------+
      |   30 |
      |   10 |
      +------+
      

      SELECT id FROM t1 WHERE data='ss ' ORDER BY id DESC;
      

      +------+
      | id   |
      +------+
      |   10 |
      |   30 |
      +------+
      

      Notice the order changed to opposite.

      Attachments

        Issue Links

          Activity

            bar Alexander Barkov added a comment - - edited

            The problem was caused by MDEV-25904, by this chunk for Field_varstring (and a similar chunk for Field_blob)

            @@ -7880,8 +7854,12 @@ int Field_varstring::cmp_prefix(const uchar *a_ptr, const
             uchar *b_ptr,
                 a_length= uint2korr(a_ptr);
                 b_length= uint2korr(b_ptr);
               }
            -  return cmp_str_prefix(a_ptr+length_bytes, a_length, b_ptr+length_bytes,
            -                        b_length, prefix_len, field_charset);
            +  return field_charset->coll->strnncollsp_nchars(field_charset,
            +                                                 a_ptr + length_bytes,
            +                                                 a_length,
            +                                                 b_ptr + length_bytes,
            +                                                 b_length,
            +                                                 prefix_len / field_charset->mbmaxlen);
             }
             

            This change was OK for PAD SPACE collations and improved performance for VARCHAR/TEXT columns.

            But it appeared not to be correct for NOPAD collations - it changed the behaviour to wrong.

            It seems, to combine both performance and correct behavior, instead of one function strnncollsp_nchars() we need two separate functions:

            • strnncollsp_nchars_char()
            • strnncollsp_nchars_varchar()

            Both should truncate too long strings.
            However, only the CHAR version should (virtually) pad too short strings with trailing spaces.

            bar Alexander Barkov added a comment - - edited The problem was caused by MDEV-25904 , by this chunk for Field_varstring (and a similar chunk for Field_blob) @@ -7880,8 +7854,12 @@ int Field_varstring::cmp_prefix(const uchar *a_ptr, const uchar *b_ptr, a_length= uint2korr(a_ptr); b_length= uint2korr(b_ptr); } - return cmp_str_prefix(a_ptr+length_bytes, a_length, b_ptr+length_bytes, - b_length, prefix_len, field_charset); + return field_charset->coll->strnncollsp_nchars(field_charset, + a_ptr + length_bytes, + a_length, + b_ptr + length_bytes, + b_length, + prefix_len / field_charset->mbmaxlen); } This change was OK for PAD SPACE collations and improved performance for VARCHAR/TEXT columns. But it appeared not to be correct for NOPAD collations - it changed the behaviour to wrong. It seems, to combine both performance and correct behavior, instead of one function strnncollsp_nchars() we need two separate functions: strnncollsp_nchars_char() strnncollsp_nchars_varchar() Both should truncate too long strings. However, only the CHAR version should (virtually) pad too short strings with trailing spaces.

            Also repeatable with TEXT:

            CREATE OR REPLACE TABLE t1
            (
              id INT,
              data TEXT,
              KEY data_id (data(10),id)
            ) COLLATE utf8mb3_unicode_nopad_ci ENGINE=MyISAM
            PARTITION BY RANGE COLUMNS (id)
            (
              PARTITION p10 VALUES LESS THAN (20),
              PARTITION p20 VALUES LESS THAN MAXVALUE
            );
            INSERT INTO t1 VALUES (30, 'ss '), (10, 'ß ');
            SELECT id FROM t1 WHERE data='ss ' ORDER BY id;
            SELECT id FROM t1 WHERE data='ss ' ORDER BY id DESC;
            

            MariaDB [test]> SELECT id FROM t1 WHERE data='ss ' ORDER BY id;
            +------+
            | id   |
            +------+
            |   30 |
            |   10 |
            +------+
            2 rows in set (0.002 sec)
             
            MariaDB [test]> SELECT id FROM t1 WHERE data='ss ' ORDER BY id DESC;
            +------+
            | id   |
            +------+
            |   10 |
            |   30 |
            +------+
            2 rows in set (0.002 sec)
            

            bar Alexander Barkov added a comment - Also repeatable with TEXT: CREATE OR REPLACE TABLE t1 ( id INT , data TEXT, KEY data_id (data(10),id) ) COLLATE utf8mb3_unicode_nopad_ci ENGINE=MyISAM PARTITION BY RANGE COLUMNS (id) ( PARTITION p10 VALUES LESS THAN (20), PARTITION p20 VALUES LESS THAN MAXVALUE ); INSERT INTO t1 VALUES (30, 'ss ' ), (10, 'ß ' ); SELECT id FROM t1 WHERE data= 'ss ' ORDER BY id; SELECT id FROM t1 WHERE data= 'ss ' ORDER BY id DESC ; MariaDB [test]> SELECT id FROM t1 WHERE data='ss ' ORDER BY id; +------+ | id | +------+ | 30 | | 10 | +------+ 2 rows in set (0.002 sec)   MariaDB [test]> SELECT id FROM t1 WHERE data='ss ' ORDER BY id DESC; +------+ | id | +------+ | 10 | | 30 | +------+ 2 rows in set (0.002 sec)

            People

              bar Alexander Barkov
              bar Alexander Barkov
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.