[MCOL-1396] VARCHAR returning NULL when StringStore memory limit exceeded Created: 2018-05-08 Updated: 2018-05-25 Resolved: 2018-05-25 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | ExeMgr |
| Affects Version/s: | 1.1.4 |
| Fix Version/s: | 1.1.5 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Andrew Hutchings (Inactive) | Assignee: | Daniel Lee (Inactive) |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Attachments: |
|
| Sprint: | 2018-10, 2018-11 |
| Description |
|
Report that a table with 100M rows containing a VARCHAR(48) and a VARCHAR(32) column will start returning NULL and truncated versions of the string data.
|
| Comments |
| Comment by Andrew Hutchings (Inactive) [ 2018-05-08 ] | ||
|
Current working theory: StringStore in 1.0 could hold up to 4GB of strings before it was full. In 1.1 we use the high bit of a StringStore offset to mark long strings (TEXT/BLOB) so that we can store those separately. This means StringStore can only store 2GB; beyond that it starts writing offsets that set the high bit (there is no check to detect this), and retrieval then tries to fetch from the long-string storage (which is empty). Regardless of the outcome, we need to modify StringStore to handle > 4GB of data (64-bit ints). | ||
| Comment by Andrew Hutchings (Inactive) [ 2018-05-08 ] | ||
|
How to reproduce (using attachments): 1. Import the table create.sql into a test database.
Some of the rows will have "NULL" instead of data. | ||
| Comment by Andrew Hutchings (Inactive) [ 2018-05-08 ] | ||
|
Confirmed the problem is as described in comment #1 | ||
| Comment by Andrew Hutchings (Inactive) [ 2018-05-08 ] | ||
|
For QA: see comment #2 | ||
| Comment by Daniel Lee (Inactive) [ 2018-05-23 ] | ||
|
Build tested: 1.1.5-1

source /root/columnstore/mariadb-columnstore-server: Merge pull request #112 from mariadb-corporation/davidhilldallas-patch-3 (update to 1.1.5)
/root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine: Merge pull request #475 from drrtuy/
/root/mariadb-columnstore-tools: update to 1.1.5

I reproduced the issue in 1.1.4-1. The query ran for a while and ColumnStore eventually restarted due to swap space usage, which is expected behavior (the VM I use for testing has a limited amount of memory). When I checked the output.data file, there were 24117249 lines, with 10698751 lines of "NULL NULL" toward the end of the file.

In 1.1.5-1, the same test produced 18692097 lines with no "NULL NULL"s, meaning the query did not return NULLs when memory was running out. The reported issue therefore appears to be fixed. However, an additional test uncovered a compression issue in 1.1.5-1 that did not occur in 1.1.4-1. After creating the table and loading the data file (both the 1.1.5-1 and 1.1.4-1 tests used the same data file, copied across the network), I executed the following query to make sure there are no NULLs in the table.
Further investigation showed that column c had a decompression issue. Here is what's in the err.log file:

May 23 18:40:01 localhost PrimProc[666]: 01.296203 |0|0|0| C 28 CAL0061: PrimProc error reading file for OID 3791; Error decompressing block 63 code=-1 part=0 seg=2

I recreated the table and imported the data file again; the decompression error then occurred on column b instead, with similar error messages. For testing, I am using a CentOS 7 Vagrant VM box with 6 GB of memory configured. The ColumnStore stack is single-server with one local dbroot. | ||
| Comment by Andrew Hutchings (Inactive) [ 2018-05-24 ] | ||
|
I'm struggling to reproduce this and wondering if it is a RAM issue. I don't have a machine with that little memory, but I'll create one and try. | ||
| Comment by Andrew Hutchings (Inactive) [ 2018-05-24 ] | ||
|
I have tried many ways to reproduce this; re-assigning to Daniel to see if he can reproduce it again on a new build. | ||
| Comment by Daniel Lee (Inactive) [ 2018-05-25 ] | ||
|
Did more testing with a new build and newly generated data, and could not reproduce the issue. |