[MCOL-764] redistributedata causes in-flight queries to fail with "data file does not exist" error Created: 2017-06-09  Updated: 2022-11-05  Resolved: 2022-11-05

Status: Closed
Project: MariaDB ColumnStore
Component/s: N/A
Affects Version/s: 1.0.9
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Daniel Lee (Inactive) Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None

Epic Link: ColumnStore Performance Improvements

 Description   

Build tested: 1.0.9-1

Test environment
1 UM and 4 PMs. Loaded the lineitem table with 20 GB of data (the 10 GB source loaded twice) onto PM1 only, using the -P option of cpimport.

As soon as redistributedata started, I executed the following query:

MariaDB [tpch10c]> select min(l_orderkey), min(l_partkey), min(l_suppkey), min(l_linenumber), sum(l_quantity), sum(l_extendedprice), max(l_discount), max(l_tax), count(l_returnflag), count(l_linestatus), max(l_shipdate), max(l_receiptdate), max(l_shipinstruct), count(l_shipmode), count(l_comment) from lineitem;
ERROR 1815 (HY000): Internal error: IDB-2039: Data file does not exist, please contact your system administrator for more information.

The error does not happen every time. If redistributedata completes without triggering the error, you need to truncate the table and load the data to PM1 again before repeating the test.

The query accesses all rows in all columns of the lineitem table.

At this point, I do not know if the error occurred due to
1) a file being read has been moved (can a file opened for read be moved?)
2) a file about to be read has been moved
3) something else
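
To illustrate the difference between possibilities 1) and 2): on a POSIX filesystem, a file that is already open for reading can be renamed and the reader keeps working, while a new open of the old path fails with ENOENT ("No such file or directory"). The following is a minimal Python sketch of that general behavior, not of ColumnStore's code. The file names are hypothetical, and it assumes the move is a rename within a single filesystem.

```python
import os
import tempfile


def read_after_move():
    """Open a file, move it, then read via the old handle and the old path."""
    with tempfile.TemporaryDirectory() as tmp:
        old_path = os.path.join(tmp, "FILE000.cdf")    # hypothetical name
        new_path = os.path.join(tmp, "FILE000.moved")  # hypothetical name
        with open(old_path, "wb") as f:
            f.write(b"segment-data")

        # Case 1: the file is already open for read when it is moved.
        reader = open(old_path, "rb")
        os.rename(old_path, new_path)
        data_from_open_handle = reader.read()  # the descriptor follows the inode
        reader.close()

        # Case 2: the file is opened by its old path after the move.
        try:
            open(old_path, "rb").close()
            reopen_errno = None
        except FileNotFoundError as exc:
            reopen_errno = exc.errno  # ENOENT

        return data_from_open_handle, reopen_errno


if __name__ == "__main__":
    print(read_after_move())
```

If the same semantics apply to the data files here, possibility 2) would be the more likely source of a "does not exist" error. That is an inference from general filesystem behavior, not something this test confirms.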



 Comments   
Comment by Daniel Lee (Inactive) [ 2017-06-09 ]

Additional data from log files in PM1, which had all the data to begin with.

Soon after the file was opened for redistribution, PrimProc failed to read the file in its original location. This error occurred 4 times for the same file. Each time, PrimProc tried to open the file 5 times, with a 1-second pause between attempts, before declaring the file missing.
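
The retry pattern described above can be sketched roughly as follows. This is an illustration only, not PrimProc's implementation: the function name and parameters are hypothetical, and pausing after every failed attempt (including the last) is an assumption based on the log timestamps, where the critical message appears about one second after the fifth failed open.

```python
import time


def open_with_retries(path, attempts=5, pause_seconds=1.0, sleep=time.sleep):
    """Try to open `path` up to `attempts` times, pausing after each failure."""
    last_error = None
    for _ in range(attempts):
        try:
            return open(path, "rb")
        except OSError as exc:
            last_error = exc
            sleep(pause_seconds)  # wait before the next attempt or final report
    # All attempts failed: report the file as missing.
    raise FileNotFoundError(
        f"could not open {path} after {attempts} attempts"
    ) from last_error
```

A retry loop like this only helps if the file reappears at the same path within the retry window. If the file has been moved for good, every attempt fails the same way.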

crit.log

Jun 9 19:33:08 localhost PrimProc[14810]: 08.393723 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3191; /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf:No such file or directory
Jun 9 19:33:13 localhost PrimProc[14810]: 13.464454 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3191; /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf:No such file or directory
Jun 9 19:33:18 localhost PrimProc[14810]: 18.495811 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3191; /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf:No such file or directory
Jun 9 19:33:23 localhost PrimProc[14810]: 23.508516 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3191; /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf:No such file or directory

Filtered rows from debug.log

[root@localhost columnstore]# cat debug.log |grep "/000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf"
Jun 9 19:32:43 localhost writeengineserver[14876]: 43.066410 |0|0|0| I 32 CAL0002: RED: <=redistributing: /usr/local/mariadb/columnstore/data1/000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, oid=3191, db=1, part=3, seg=0 to db=3 @workerThread:515
Jun 9 19:32:43 localhost writeengineserver[14876]: 43.066628 |0|0|0| I 32 CAL0002: RED: open /usr/local/mariadb/columnstore/data1/000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, oid=3191, dbroot=1, partition=3, segment=0. 0x7f198c0ecf80 @workerThread:539

Jun 9 19:33:03 localhost IDBFile[14810]: 03.354681 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:04 localhost IDBFile[14810]: 04.361182 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:05 localhost IDBFile[14810]: 05.368675 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:06 localhost IDBFile[14810]: 06.372826 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:07 localhost IDBFile[14810]: 07.376200 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:08 localhost PrimProc[14810]: 08.393723 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3191; /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf:No such file or directory

Jun 9 19:33:08 localhost IDBFile[14810]: 08.441235 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:09 localhost IDBFile[14810]: 09.444775 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:10 localhost IDBFile[14810]: 10.449832 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:11 localhost IDBFile[14810]: 11.450436 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:12 localhost IDBFile[14810]: 12.454114 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:13 localhost PrimProc[14810]: 13.464454 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3191; /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf:No such file or directory

Jun 9 19:33:13 localhost IDBFile[14810]: 13.465257 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:14 localhost IDBFile[14810]: 14.468139 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:15 localhost IDBFile[14810]: 15.473495 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:16 localhost IDBFile[14810]: 16.486629 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:17 localhost IDBFile[14810]: 17.489101 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:18 localhost PrimProc[14810]: 18.495811 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3191; /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf:No such file or directory

Jun 9 19:33:18 localhost IDBFile[14810]: 18.496818 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:19 localhost IDBFile[14810]: 19.498383 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:20 localhost IDBFile[14810]: 20.499360 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:21 localhost IDBFile[14810]: 21.500392 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:22 localhost IDBFile[14810]: 22.506067 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf, exception: unable to open Unbuffered file
Jun 9 19:33:23 localhost PrimProc[14810]: 23.508516 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3191; /000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf:No such file or directory
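
The "5 attempts per error" pattern can be checked mechanically. The sketch below is a hypothetical helper, not a ColumnStore tool: it counts the IDBFile "Failed to open file" lines logged before each PrimProc CAL0053 line. The regular expression assumes the syslog layout shown above, and SAMPLE holds the first retry cycle copied from the debug.log excerpt.

```python
import re

# Matches "localhost <process>[pid]: ... CALnnnn:" as seen in the excerpt above.
LINE_RE = re.compile(r"localhost (?P<proc>\w+)\[\d+\]: .*? (?P<code>CAL\d{4}):")

PATH = "/000.dir/000.dir/012.dir/119.dir/003.dir/FILE000.cdf"
FAIL = " |0|0|0| D 35 CAL0002: Failed to open file: " + PATH + ", exception: unable to open Unbuffered file"
SAMPLE = [
    "Jun 9 19:33:03 localhost IDBFile[14810]: 03.354681" + FAIL,
    "Jun 9 19:33:04 localhost IDBFile[14810]: 04.361182" + FAIL,
    "Jun 9 19:33:05 localhost IDBFile[14810]: 05.368675" + FAIL,
    "Jun 9 19:33:06 localhost IDBFile[14810]: 06.372826" + FAIL,
    "Jun 9 19:33:07 localhost IDBFile[14810]: 07.376200" + FAIL,
    "Jun 9 19:33:08 localhost PrimProc[14810]: 08.393723 |0|0|0| C 28 CAL0053: "
    "PrimProc could not open file for OID 3191; " + PATH + ":No such file or directory",
]


def retries_per_failure(lines):
    """Return the number of IDBFile open failures before each CAL0053 error."""
    counts, pending = [], 0
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        if match["proc"] == "IDBFile" and "Failed to open file" in line:
            pending += 1
        elif match["proc"] == "PrimProc" and match["code"] == "CAL0053":
            counts.append(pending)
            pending = 0
    return counts


if __name__ == "__main__":
    print(retries_per_failure(SAMPLE))
```

Run against the full filtered debug.log output above, the same function would be expected to report four cycles of five failed opens each.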

Comment by Todd Stoffel (Inactive) [ 2022-11-05 ]

Item is out of date. Closing due to inactivity. If you feel this was done in error please open a new ticket.

Generated at Thu Feb 08 02:23:39 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.