Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
None
Environment: 2-node combo configuration on AWS, using the d2.8xlarge instance type
2017-3, 2017-4, 2017-5
Description
Build tested: 1.0.7-1 AMI.
The DBT3 test failed with many missing data file errors. I checked the log files and noticed a "too many open files" error at the point where the connection to ExeMgr is lost. Soon after, PrimProc was restarted.
Does ColumnStore close or reuse the file handle when it hits a missing data file error?
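One way to check for a descriptor leak while the test is running is to watch how many file descriptors PrimProc holds relative to its limit. The following is a minimal sketch, assuming a Linux host where `/proc/<pid>/fd` and `/proc/<pid>/limits` are readable by the calling user; the helper names and the `proc_root` parameter are hypothetical and are not part of ColumnStore.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Return the number of file descriptors currently open in process `pid`.

    Assumes a Linux-style procfs mounted at `proc_root` (an assumption).
    """
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def soft_nofile_limit(pid, proc_root="/proc"):
    """Return the soft 'Max open files' limit of `pid`.

    Returns None if the limit is 'unlimited' or the row is not found.
    """
    label = "Max open files"
    with open(os.path.join(proc_root, str(pid), "limits")) as f:
        for line in f:
            if line.startswith(label):
                # Columns after the label: soft limit, hard limit, units.
                soft = line[len(label):].split()[0]
                return None if soft == "unlimited" else int(soft)
    return None


if __name__ == "__main__":
    # Inspect the current process; pass the PrimProc PID instead in practice.
    pid = os.getpid()
    print(pid, count_open_fds(pid), soft_nofile_limit(pid))
```

If the descriptor count keeps climbing toward the soft limit as missing-file errors accumulate, that would be consistent with handles not being released on the error path. Reading another user's process normally requires root.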
Feb 7 15:40:49 ip-172-30-0-236 PrimProc[66666]: 49.330422 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3053; /000.dir/000.dir/011.dir/237.dir/000.dir/FILE001.cdf:No such file or directory
Feb 7 17:41:45 ip-172-30-0-236 joblist[124597]: 45.262103 |0|0|0| C 05 CAL0000: /home/builder/mariadb-columnstore-server/mariadb-columnstore-engine/dbcon/execplan/clientrotator.cpp @ 318 Could not get a ExeMgr connection.
Feb 7 17:41:45 ip-172-30-0-236 joblist[124597]: 45.262157 |0|0|0| C 05 CAL0000: /home/builder/mariadb-columnstore-server/mariadb-columnstore-engine/dbcon/execplan/clientrotator.cpp @ 146 /home/builder/mariadb-columnstore-server/mariadb-columnstore-engine/dbcon/execplan/clientrotator.cpp: Could not get a connection to a ExeMgr
Feb 7 21:39:13 ip-172-30-0-236 PrimProc[8608]: 13.252067 |0|0|0| C 28 CAL0053: PrimProc could not open file for OID 3352; /home/mariadb-user/mariadb/columnstore/data1/000.dir/000.dir/013.dir/024.dir/058.dir/FILE002.cdf:Too many open files
Feb 7 21:39:15 ip-172-30-0-236 joblist[9083]: 15.690711 |0|0|0| C 05 CAL0000: /home/builder/mariadb-columnstore-server/mariadb-columnstore-engine/dbcon/joblist/distributedenginecomm.cpp @ 382 DEC: lost connection to 172.30.0.232
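
To see how the open failures split between missing files and descriptor exhaustion, the CAL0053 entries can be tallied by their trailing error text. This is a rough sketch that assumes every CAL0053 line follows the layout shown in the excerpt above (`...for OID <n>; <path>:<error text>`); the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the CAL0053 message layout seen in the excerpt:
#   "CAL0053: PrimProc could not open file for OID <n>; <path>:<error text>"
CAL0053 = re.compile(
    r"CAL0053: PrimProc could not open file for OID (\d+); (.+):([^:]+)$"
)


def summarize_open_failures(lines):
    """Count CAL0053 open failures grouped by the trailing error text."""
    counts = Counter()
    for line in lines:
        match = CAL0053.search(line.strip())
        if match:
            counts[match.group(3).strip()] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < /var/log/messages  (log path is an assumption)
    for reason, n in summarize_open_failures(sys.stdin).most_common():
        print(f"{n:6d}  {reason}")
```

A large "No such file or directory" count followed by the first "Too many open files" entries would support the suspicion that each failed open leaves a handle behind.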