  1. MariaDB ColumnStore
  2. MCOL-1396

VARCHAR returning NULL when StringStore memory limit exceeded

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 1.1.4
    • 1.1.5
    • ExeMgr
    • None
    • 2018-10, 2018-11

    Description

      Report that queries on a table with 100M rows containing a VARCHAR(48) and a VARCHAR(32) column start returning NULL and truncated versions of:

      _CpNuLl_
      

      Attachments

        1. create.sql
          0.1 kB
        2. generate.php
          0.5 kB

        Activity

          Did more testing with new build and newly generated data and could not reproduce the issue.

          dleeyh Daniel Lee (Inactive) added a comment

          Tried many ways to reproduce this. Re-assigned to Daniel to see if he can reproduce it again on a new build.

          LinuxJedi Andrew Hutchings (Inactive) added a comment

          Struggling to reproduce this. I'm wondering if it is a RAM issue. I don't have a machine that small, but I'll create one and try.

          LinuxJedi Andrew Hutchings (Inactive) added a comment
          dleeyh Daniel Lee (Inactive) added a comment (edited)

          Build tested: 1.1.5-1 source

          /root/columnstore/mariadb-columnstore-server
          commit 0c983bff02172849a174dde46b62d76aa66485f8
          Merge: 6b8a674 d5e6d89
          Author: benthompson15 <ben.thompson@mariadb.com>
          Date: Thu Apr 26 16:16:51 2018 -0500

          Merge pull request #112 from mariadb-corporation/davidhilldallas-patch-3

          update to 1.1.5

          /root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine
          commit 1ea5198e0e9ecc2a8d13e6b44bf6c632f8561199
          Merge: 4533116 59858aa
          Author: Andrew Hutchings <andrew@linuxjedi.co.uk>
          Date: Fri May 18 12:37:47 2018 +0100

          Merge pull request #475 from drrtuy/MCOL-1415

          MCOL-1415

          /root/mariadb-columnstore-tools
          commit e83f2713be574c0b98b37faf4fa61c8ce4997e90
          Author: david hill <david.hill@mariadb.com>
          Date: Wed Apr 25 14:07:58 2018 -0500

          update to 1.1.5

          I reproduced the issue in 1.1.4-1. The query ran for a while and ColumnStore eventually restarted due to swap space usage, which is expected behavior (the VM I used for testing has a limited amount of memory). When I checked the output.data file, there were 24117249 lines, with 10698751 lines of "NULL NULL" toward the end of the file.

          In 1.1.5-1, the same test produced 18692097 lines without "NULL NULL"s. That means the query did not return NULLs when memory was running out.
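The line counts above can be checked with a short script. The following is a minimal sketch, not the tester's actual method: the function name `summarize_null_rows` is hypothetical, and it assumes each row of output.data is one line of whitespace-separated fields in which a NULL value appears as the literal text NULL.

```python
def summarize_null_rows(lines):
    """Return (total_rows, null_rows, first_null_row) for an iterable of lines.

    A row counts as NULL when every whitespace-separated field is the
    literal text NULL. first_null_row is 1-based, or None if no row matches.
    """
    total = 0
    null_rows = 0
    first_null_row = None
    for total, line in enumerate(lines, start=1):
        fields = line.split()
        # Blank lines have no fields and are not counted as NULL rows.
        if fields and all(field == "NULL" for field in fields):
            null_rows += 1
            if first_null_row is None:
                first_null_row = total
    return total, null_rows, first_null_row


# Small in-memory sample; for the real file, pass an open file handle instead.
sample = ["abc\tdef\n", "NULL\tNULL\n", "NULL\tNULL\n"]
print(summarize_null_rows(sample))
```

Because the function only iterates over lines, passing an open handle to output.data streams the file without loading it into memory, which matters on a VM that is already short of RAM.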

          Therefore, the reported issue seems to have been fixed. However, an additional test uncovered a compression issue in 1.1.5-1. The same issue did not occur in 1.1.4-1.

          After creating the table and loading the data file (both the 1.1.5-1 and 1.1.4-1 tests used the same data file, copied across the network), I executed the following query to make sure there are no NULLs in the table.

          MariaDB [mytest]> select count(*), sum(isnull(a)), sum(isnull(b)), sum(isnull(c)) from mcol1396;
          ERROR 1815 (HY000): Internal error: An unexpected condition within the query caused an internal processing error within InfiniDB. Please check the log files for more details. Additional Information: error in BatchPrimitiveProces
          

          Further investigation showed that column c had a decompression issue. Here is what's in the err.log file.

          May 23 18:40:01 localhost PrimProc[666]: 01.296203 |0|0|0| C 28 CAL0061: PrimProc error reading file for OID 3791; Error decompressing block 63 code=-1 part=0 seg=2
          May 23 18:40:01 localhost PrimProc[666]: 01.305513 |0|0|0| C 28 CAL0000: Error decompressing block 63 code=-1 part=0 seg=2
          May 23 18:40:04 localhost PrimProc[666]: 04.470432 |0|0|0| C 28 CAL0061: PrimProc error reading file for OID 3791; Error decompressing block 63 code=-1 part=0 seg=2
          May 23 18:40:04 localhost PrimProc[666]: 04.471273 |0|0|0| C 28 CAL0000: Error decompressing block 63 code=-1 part=0 seg=2
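When the same error repeats many times, it can help to reduce the log to the distinct failing blocks. The following is a minimal sketch that assumes the CAL0061 message format shown above stays stable; the pattern and the function name `decompression_errors` are illustrative and are not part of ColumnStore.

```python
import re

# Matches the CAL0061 lines shown above. The CAL0000 lines repeat the same
# details without the OID, so they are intentionally not matched.
DECOMPRESS_ERROR = re.compile(
    r"PrimProc error reading file for OID (?P<oid>\d+); "
    r"Error decompressing block (?P<block>\d+) "
    r"code=(?P<code>-?\d+) part=(?P<part>\d+) seg=(?P<seg>\d+)"
)


def decompression_errors(lines):
    """Return distinct (oid, block, code, part, seg) tuples in first-seen order."""
    seen = []
    for line in lines:
        match = DECOMPRESS_ERROR.search(line)
        if match is None:
            continue
        key = tuple(
            int(match[name]) for name in ("oid", "block", "code", "part", "seg")
        )
        if key not in seen:
            seen.append(key)
    return seen


log = [
    "May 23 18:40:01 localhost PrimProc[666]: 01.296203 |0|0|0| C 28 CAL0061: "
    "PrimProc error reading file for OID 3791; "
    "Error decompressing block 63 code=-1 part=0 seg=2",
    "May 23 18:40:01 localhost PrimProc[666]: 01.305513 |0|0|0| C 28 CAL0000: "
    "Error decompressing block 63 code=-1 part=0 seg=2",
]
print(decompression_errors(log))
```

Applied to the four lines above, this collapses them into a single entry for OID 3791, block 63, partition 0, segment 2.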

          I recreated the table and imported the data file again. This time the decompression error occurred on column b instead, with similar error messages.

          For testing, I am using a CentOS 7 Vagrant VM box with 6 GB of memory configured. The ColumnStore stack is a single server with one local dbroot.


          For QA: see comment #2

          LinuxJedi Andrew Hutchings (Inactive) added a comment

          Confirmed the problem is as described in comment #1

          LinuxJedi Andrew Hutchings (Inactive) added a comment

          How to reproduce (using attachments):

          1. Import the table definition create.sql into the test database
          2. php generate.php > data.tbl (go make coffee, this will take a long time)
          3. cpimport test mcol1396 data.tbl
          4. Execute the following:

          mcsmysql -uroot test -r -q -e "select if(a > 0, b, c), if(a > 0, c, b) from (select * from mcol1396) as se;" > output.data
          

          Some of the rows will have "NULL" instead of data

          LinuxJedi Andrew Hutchings (Inactive) added a comment

          Current working theory:

          StringStore in 1.0 could hold up to 4GB of strings before it was full. In 1.1 we use the high bit of StringStore to mark long strings (TEXT/BLOB) so that we can store these separately. This means StringStore can only store 2GB. Beyond that it will be storing with the high bit set (there is no check to see if we are doing something bad), and retrieval will try to fetch from the long string storage (which is empty).

          Regardless of the outcome, we need to modify StringStore to handle more than 4GB of data (64-bit ints).
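The working theory can be illustrated with a toy model. This is a sketch of the suspected failure mode only, not ColumnStore's actual StringStore code: the class `ToyStringStore`, its fields, and the idea that the value handed back is a 32-bit offset are all assumptions made for illustration.

```python
# Hypothetical model of the theory above; names and layout are assumptions.
LONG_STRING_FLAG = 0x80000000  # high bit of a 32-bit token marks TEXT/BLOB
OFFSET_MASK = 0x7FFFFFFF       # remaining 31 bits address 2 GiB of strings


class ToyStringStore:
    """Tracks offsets only, so the 2 GiB boundary can be crossed cheaply."""

    def __init__(self):
        self.short_strings = {}  # token -> value for VARCHAR-style data
        self.long_strings = {}   # offset -> value for TEXT/BLOB (empty here)
        self.next_offset = 0

    def store(self, value: bytes) -> int:
        # Mirrors the suspected bug: nothing checks that the offset still
        # fits in 31 bits, so it silently grows into the flag bit.
        token = self.next_offset & 0xFFFFFFFF
        self.short_strings[token] = value
        self.next_offset += len(value)
        return token

    def fetch(self, token: int):
        if token & LONG_STRING_FLAG:
            # Looks like a long string, but the long string store is empty.
            return self.long_strings.get(token & OFFSET_MASK)
        return self.short_strings.get(token)


store = ToyStringStore()
store.next_offset = OFFSET_MASK - 4  # jump close to the 2 GiB boundary
ok = store.store(b"abcdefgh")        # starts below the boundary
bad = store.store(b"ijklmnop")       # starts above it, flag bit is now set
print(store.fetch(ok))
print(store.fetch(bad))
```

In this model the second string is stored successfully but can no longer be read back, because its token is misread as a long string reference. Under the same assumptions, widening the token to 64 bits, or rejecting offsets that reach the flag bit, would remove the ambiguity.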

          LinuxJedi Andrew Hutchings (Inactive) added a comment

          People

            dleeyh Daniel Lee (Inactive)
            LinuxJedi Andrew Hutchings (Inactive)