Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-22556

Incorrect result for window function when using encrypt-tmp-files=ON

Details

    Description

      Here is the mtr test case:

      source include/have_file_key_management_plugin.inc;
      source include/have_sequence.inc;
      source include/have_innodb.inc;
       
      select @@encrypt_tmp_files;
      set sort_buffer_size= 2000;
      create table t1( a DECIMAL(12,0) DEFAULT NULL, b VARCHAR(20) DEFAULT NULL, c DECIMAL(12,0) DEFAULT NULL)engine=INNODB;
      insert into t1 select seq, seq, seq from seq_1_to_5000;
      select count(*) from (select a, b, c, ROW_NUMBER() OVER (PARTITION BY a) FROM t1)q;
       
      drop table t1;
      

      and the opt file used is

      --encrypt-tmp-files=ON
      --loose-file-key-management-encryption-algorithm=aes_cbc
      --file-key-management=FORCE
      --loose-file-key-management-filename=$MTR_SUITE_DIR/t/filekeys-data.enc
      --loose-file-key-management-filekey=FILE:$MTR_SUITE_DIR/t/filekeys-data.key
      

      The result of the query is

      MariaDB [test]> select count(*) from (select a, b, c, ROW_NUMBER() OVER (PARTITION BY a) FROM t1)q;
      +----------+
      | count(*) |
      +----------+
      |     4096 |
      +----------+
      1 row in set (0.373 sec)
      
      

      Now lets run the query with encrypt-tmp-files=OFF

      MariaDB [test]> select @@encrypt_tmp_files;
      +---------------------+
      | @@encrypt_tmp_files |
      +---------------------+
      |                   0 |
      +---------------------+
      1 row in set (0.001 sec)
      

      and re run the query

      MariaDB [test]> select count(*) from (select a, b, c, ROW_NUMBER() OVER (PARTITION BY a) FROM t1)q;
      +----------+
      | count(*) |
      +----------+
      |     5000 |
      +----------+
      1 row in set (0.378 sec)
      

      The result returned with encrypt-tmp-files=OFF is correct. The table t1 has 5k rows.

      Attachments

        Issue Links

          Activity

            varun Varun Gupta (Inactive) added a comment - - edited

            Had a discussion with serg

            He said:
            lseek is definitely wrong when encryption is used in the function reinit_io_cache().
            Encryption adds storage overhead and hides it from the caller by recalculating offsets and lengths
            and anyway two different IO_CACHE cannot possibly modify the same file
            because the encryption key is randomly generated and stored in the IO_CACHE.
            So when the tempfiles are encrypted DO NOT use lseek to change end_of_file.

            Further observation, not directly related to this issue but why is lseek used to change end_of_file (related to MDEV-19117)

            1. the end_of_file update (using lseek) is ONLY used for binlog index files
            2. the whole point is to update file length when the file was modified via a different file descriptor. The temporary IO_CACHE files can never be modified via a different file descriptor
            3. Only temporary IO_CACHE can be encrypted and updating end_of_file for encrypted temporary IO_CACHE should not be done using lseek
            varun Varun Gupta (Inactive) added a comment - - edited Had a discussion with serg He said: lseek is definitely wrong when encryption is used in the function reinit_io_cache(). Encryption adds storage overhead and hides it from the caller by recalculating offsets and lengths and anyway two different IO_CACHE cannot possibly modify the same file because the encryption key is randomly generated and stored in the IO_CACHE. So when the tempfiles are encrypted DO NOT use lseek to change end_of_file. Further observation, not directly related to this issue but why is lseek used to change end_of_file (related to MDEV-19117 ) the end_of_file update (using lseek) is ONLY used for binlog index files the whole point is to update file length when the file was modified via a different file descriptor. The temporary IO_CACHE files can never be modified via a different file descriptor Only temporary IO_CACHE can be encrypted and updating end_of_file for encrypted temporary IO_CACHE should not be done using lseek
            varun Varun Gupta (Inactive) added a comment - Patch https://github.com/MariaDB/server/commit/82b6a5cb85fefc15afcbc35f3782139a76432f4e

            People

              varun Varun Gupta (Inactive)
              varun Varun Gupta (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.