[MCOL-3251] FILE001.cdf:No such file or directory - occurred when query, truncate and cpimport collided Created: 2019-04-12 Updated: 2020-11-12 Resolved: 2020-03-24 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | None |
| Affects Version/s: | 1.2.2 |
| Fix Version/s: | 1.2.6, 1.4.4, 1.5.1 |
| Type: | Bug | Priority: | Minor |
| Reporter: | David Hill (Inactive) | Assignee: | Daniel Lee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
2um 2pm with local query |
||
| Sprint: | 2020-3, 2020-4, 2020-5 |
| Description |
|
Customer reported errors on both PMs:

Apr 11 19:11:05 usfit-scdb6 PrimProc[144321]: 05.309773 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 73844; /000.dir/001.dir/032.dir/116.dir/000.dir/FILE001.cdf:No such file or directory

No ExeMgr restart occurred; such restarts have caused this issue in the past.

The customer reported, via the issue, what might have caused the problem:

There was a long-running query that uses table X at the end of the query. While this query was running, a process ran that TRUNCATEs and cpimports table X. It appears the data files were updated, leaving the long query to look in the previous location for data that was no longer there. This is when our error was returned, and it remains a problem that this kind of thing can happen. It is understandable that it can happen; however, the errors given seem to indicate a detrimental loss of data. |
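For triage, a CAL0053 line like the one above can be split into its OID, file path, and error reason. The following is a minimal sketch, not part of ColumnStore: the function name `parse_cal0053` is hypothetical, the pattern is built only from the CAL0053 lines quoted in this ticket, and it assumes the last six path components are four OID directories, one partition directory, and the segment file.

```python
import re

# Pattern built from the CAL0053 lines quoted in this ticket; other
# ColumnStore versions may format the message differently.
CAL0053_RE = re.compile(
    r"CAL0053: PrimProc could not open file for OID (?P<oid>\d+); "
    r"(?P<path>\S+\.cdf):(?P<reason>.+)$"
)


def parse_cal0053(line):
    """Hypothetical helper: return the parts of a CAL0053 line, or None."""
    match = CAL0053_RE.search(line.strip())
    if match is None:
        return None
    path = match.group("path")
    parts = path.split("/")
    # Assumption: the last six components are four OID directories, one
    # partition directory and the segment file; anything before them is
    # treated as the dbroot prefix.
    dbroot_prefix = "/".join(parts[:-6])
    return {
        "oid": int(match.group("oid")),
        "path": path,
        "dbroot_prefix": dbroot_prefix,
        "reason": match.group("reason").strip(),
    }
```

Applied to the line in this description, the sketch yields an empty `dbroot_prefix`, because the logged path starts directly at `/000.dir`.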
| Comments |
| Comment by Patrick LeBlanc (Inactive) [ 2020-03-13 ] |
|
The existing method to generate filenames is excessive for what PrimProc needs. It is unclear whether other callers need that extra functionality or error handling. For now I've simply added a minimal method to PrimProc, because it just needs the filename. This introduces no new error paths, risks nothing outside of PrimProc breaking, and will be more informative to the user. When a dbroot goes offline during a query, the user will get a file-not-found error as before, but with a filename that indicates which dbroot it looked in. |
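As a rough illustration of what a minimal filename builder could look like, the sketch below reproduces the path layout visible in the log lines quoted in this ticket. It assumes the four leading directories are the bytes of a 32-bit OID, most significant first, followed by a partition directory and a segment file. That layout is inferred from the logged paths (OID 73844 appears as 000.dir/001.dir/032.dir/116.dir, OID 3033 as 000.dir/000.dir/011.dir/217.dir) and is not taken from the PrimProc source. The name `build_segment_path` and its parameters are hypothetical.

```python
def build_segment_path(dbroot_path, oid, partition, segment):
    """Hypothetical helper: map an OID to a segment file path.

    Assumed layout (inferred from log lines in this ticket):
    <dbroot>/AAA.dir/BBB.dir/CCC.dir/DDD.dir/PPP.dir/FILESSS.cdf
    where AAA..DDD are the four bytes of the OID, high byte first.
    """
    if not 0 <= oid <= 0xFFFFFFFF:
        raise ValueError("OID must fit in 32 bits")
    # Split the OID into its four bytes, most significant first.
    dirs = [f"{b:03d}.dir" for b in oid.to_bytes(4, "big")]
    dirs.append(f"{partition:03d}.dir")
    filename = f"FILE{segment:03d}.cdf"
    return "/".join([dbroot_path.rstrip("/"), *dirs, filename])
```

With the dbroot path `/usr/local/mariadb/columnstore/data1`, OID 3033, partition 0 and segment 2, this produces the path logged by 1.2.6-1 in this ticket. With an empty dbroot path and OID 73844, it produces the prefix-less path from the original report.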
| Comment by Patrick LeBlanc (Inactive) [ 2020-03-13 ] |
|
Oops. |
| Comment by Gagan Goel (Inactive) [ 2020-03-17 ] |
|
For QA: The fix might not be verifiable if the original bug cannot be reproduced. pleblanc ran some regression tests to ensure the filenames generated by the fix are correct. |
| Comment by Daniel Lee (Inactive) [ 2020-03-24 ] |
|
From the information in the ticket, I am not sure exactly what the new behavior is or whether this ticket is testable. I performed the following test on these different releases:

1) cpimport 10gb lineitem table
query: select sum(l_quantity), sum(l_extendedprice) from lineitem where l_comment > "A";

Depending on how soon I truncate the lineitem table after the query has been started, the behavior would be different.

1.2.2-1

MariaDB [mytest]> select sum(l_quantity), sum(l_extendedprice) from lineitem where l_comment > "A";

crit.log
[root@localhost columnstore]# cat crit.log

debug.log
Mar 23 14:32:31 localhost IDBFile[4952]: 31.983083 |0|0|0| D 35 CAL0002: Failed to open file: /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/207.dir/000.dir/FILE001.cdf, exception: unable to open Unbuffered file

1.2.6-1

MariaDB [mytest]> select sum(l_quantity), sum(l_extendedprice) from lineitem where l_comment > "A";

crit.log
Mar 24 18:24:50 localhost PrimProc[14561]: 50.303039 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3033; /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/217.dir/000.dir/FILE002.cdf:No such file or directory

debug.log
Mar 24 18:24:49 localhost IDBFile[14561]: 49.296938 |0|0|0| D 35 CAL0002: Failed to open file: /usr/local/mariadb/columnstore/data1/000.dir/000.dir/011.dir/217.dir/000.dir/FILE002.cdf, exception: unable to open Unbuffered file

1.4.4-1 and 1.5.0-1

Both have similar behavior. Sometimes the truncate table statement seemed to be waiting for the query to finish. Both statements processed successfully.

MariaDB [mytest]> select sum(l_quantity), sum(l_extendedprice) from lineitem where l_comment > "A";

crit.log
[root@localhost columnstore]# cat crit.log

debug.log
Mar 24 18:47:21 localhost ExeMgr[4173]: 21.795796 |14|0|0| D 16 CAL0041: Start SQL statement: select sum(l_quantity), sum(l_extendedprice) from lineitem where l_comment > "A"; |mytest| |
| Comment by Daniel Lee (Inactive) [ 2020-03-24 ] |
|
Truncating a table while a long query on the table is in flight will cause a file-not-found error, and the query will fail. This is the expected behavior. The change for this ticket makes the messages clearer regarding the file names and paths. |