[MCOL-3499] S3 with localStorage cpimport returned a 'bad length field' error message Created: 2019-09-10 Updated: 2019-10-29 Resolved: 2019-10-29 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | cpimport |
| Affects Version/s: | 1.4.0 |
| Fix Version/s: | 1.4.1 |
| Type: | Bug | Priority: | Major |
| Reporter: | Daniel Lee (Inactive) | Assignee: | Daniel Lee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
| Description |
|
Build tested: 1.4.0-1 server commit: With S3 localStorage on a single-server configuration, I created a dbt3 database and tried to cpimport a 1 GB dataset. While it was loading the lineitem table, the following error was shown, and cpimport had not finished after almost one hour; I had to kill the cpimport process. The same test with S3 cloud storage (AWS) completed successfully. Using table OID 3092 as the default JOB ID |
| Comments |
| Comment by Daniel Lee (Inactive) [ 2019-09-10 ] |
|
Although this test is for a single-server configuration, the test would pass if the -d parameter is used for postConfigure. |
| Comment by Ben Thompson (Inactive) [ 2019-10-03 ] |
|
I have not been able to reproduce this with a script running overnight doing hundreds of cpimports. |
| Comment by Ben Thompson (Inactive) [ 2019-10-04 ] |
|
Reviewed the changes since the commit shown in the ticket description. No change should have fixed this, but I am unable to reproduce it with current 1.4.0 develop builds. Will continue to monitor the issue. |
| Comment by Daniel Lee (Inactive) [ 2019-10-10 ] |
|
While testing for other tickets on the 1.2.5-1 with S3 build, I ran into the same issue on a single-node installation. |
| Comment by Patrick LeBlanc (Inactive) [ 2019-10-10 ] |
|
We'll need more details. We haven't seen this problem in testing since milestone 1, when we wrote that code (March, +/-). How big was the database you were loading? Also, which machine were you using? We may need to get on that same machine to confirm the problem. |
| Comment by Ben Thompson (Inactive) [ 2019-10-14 ] |
|
The issue has been reproduced and a fix is in progress. |
| Comment by Ben Thompson (Inactive) [ 2019-10-17 ] |
|
This appears to be the same issue as |
| Comment by Ben Thompson (Inactive) [ 2019-10-17 ] |
|
The issue is a subtle race at startup between PrefixCache populating the cache with files already present on restart and write calls: populate can pick up the first file written to the cache directory by a write call at startup, so the cache LRU list ends up containing a duplicate entry for that file. Later, when _makespace tries to flush from the LRU, the duplicate entry attempts to flush a file that cannot be found, likely because it was already removed or renamed within the metadata. The fix for now is to not run the PrefixCache populate call in the background. The more optimal solution, which requires more detailed thought, would be to make PrefixCache::populate synchronous with write calls. Opening a new issue regarding this future improvement and linking. |
| Comment by Daniel Lee (Inactive) [ 2019-10-29 ] |
|
Build verified: 1.4.1-1 engine commit: |