Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21044

Wrong result when using a smaller size for sort buffer

Details

    Description

      DATASET

      create table t1(a varchar(255), b varchar(255), c varchar(255),d int);
      insert into t1 values ("jkl","mno", "pqr",1),("abc","def","ghi",2);
      insert into t1 values ("jkl","mno", "pqr",1),("abc","def","ghi",2);
      insert into t1 values ("jkl","mno", "pqr",1),("abc","def","ghi",2);
      insert into t1 values ("jkl","mno", "pqr",1),("abc","def","ghi",2);
      insert into t1 values ("jkl","mno", "pqr",1),("abc","def","ghi",2);
      insert into t1 values ("jkl","mno", "pqr",1),("abc","def","ghi",2);
      insert into t1 values ("jkl","mno", "pqr",1),("abc","def","ghi",2);
      insert into t1 values ("jkl","mno", "pqr",1),("abc","def","ghi",2);
      

      MariaDB [test]> select * from t1 order  by d;
      +------+------+------+------+
      | a    | b    | c    | d    |
      +------+------+------+------+
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      +------+------+------+------+
      16 rows in set (0.01 sec)
       
      MariaDB [test]> set sort_buffer_size=1024;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [test]> select * from t1 order  by d;
      +------+------+------+------+
      | a    | b    | c    | d    |
      +------+------+------+------+
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | jkl  | mno  | pqr  |    1 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      | abc  | def  | ghi  |    2 |
      +------+------+------+------+
      14 rows in set (0.00 sec)
      
      

      With smaller sort_buffer_size we get incorrect results, actual expected rows is 16

      Attachments

        Activity

          Investigation
          Inside the function filesort:

          (gdb) p param.max_keys_per_buffer
          $7 = 1

              /*
                Use also the space previously used by string pointers in sort_buffer
                for temporary key storage.
              */
              param.max_keys_per_buffer=((param.max_keys_per_buffer *
                                          (param.rec_length + sizeof(char*))) /
                                         param.rec_length - 1);
          

          In the example this sets the param.max_keys_per_buffer to 0.

          (gdb) p param.max_keys_per_buffer
          $8 = 0
          

          The problem with this is later this causes the merge buffers to not read anything as there is no key and one merge buffer is skipped and not written to the output file. Hence we get wrong results.

          varun Varun Gupta (Inactive) added a comment - Investigation Inside the function filesort: (gdb) p param.max_keys_per_buffer $7 = 1 /* Use also the space previously used by string pointers in sort_buffer for temporary key storage. */ param.max_keys_per_buffer=((param.max_keys_per_buffer * (param.rec_length + sizeof ( char *))) / param.rec_length - 1); In the example this sets the param.max_keys_per_buffer to 0. (gdb) p param.max_keys_per_buffer $8 = 0 The problem with this is later this causes the merge buffers to not read anything as there is no key and one merge buffer is skipped and not written to the output file. Hence we get wrong results.
          varun Varun Gupta (Inactive) added a comment - Patch http://lists.askmonty.org/pipermail/commits/2019-November/014050.html

          Ok to push.

          psergei Sergei Petrunia added a comment - Ok to push.

          People

            varun Varun Gupta (Inactive)
            varun Varun Gupta (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.