Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
1.0.9
None
Description
A customer reported errors where PrimProc could not open the file for an OID, for unknown reasons (see MCOL-801 and MCOL-804).
I was able to reproduce this error with the steps below. I wasn't sure whether it is the same problem as MCOL-801/MCOL-804, so I opened a new bug.
1. Set up a 1 UM / 2 PM system with a 50 GB tpch1 database.
2. Run a script that continually executes the following query:
[root@ip-172-30-0-161 ~]# cat query.sh
#!/bin/bash
# Run the same count query once per second until interrupted.
while true; do
  echo "select count(*) from lineitem" | /usr/local//mariadb/columnstore/mysql/bin/mysql --defaults-extra-file=/usr/local//mariadb/columnstore/mysql/my.cnf -u root tpch100
  sleep 1
done
exit 0
3. Run pkill against PrimProc on pm2.
Errors started appearing in the pm1 logs soon after the recovery was performed:
Jul 14 16:24:46 ip-172-30-0-176 PrimProc[93531]: 46.550644 |0|0|0| W 28 CAL0000: IDB-2039: Data file does not exist, please contact your system administrator for more information.
Jul 14 16:24:47 ip-172-30-0-176 IDBFile[93531]: 47.550530 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf, exception: unable to open Unbuffered file
Jul 14 16:24:48 ip-172-30-0-176 IDBFile[93531]: 48.550839 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf, exception: unable to open Unbuffered file
Jul 14 16:24:49 ip-172-30-0-176 IDBFile[93531]: 49.551158 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf, exception: unable to open Unbuffered file
This file exists on pm2, so ExeMgr is sending the request to the wrong PrimProc (the one on pm1):
data2]# ll 000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf
-rw-r--r-- 1 root root 11345920 Jul 14 16:00 000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf
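
The check above can be repeated for any path that shows up in the CAL0002 entries. The following is a minimal sketch, not part of ColumnStore: `failed_paths` and `locate_segment_file` are hypothetical helper names, and the data roots passed to `locate_segment_file` are assumed to be the directories under which each PM's dbroot is mounted. It tallies the paths that failed to open and reports which of the given roots actually holds each file.

```shell
#!/bin/bash
# Hypothetical helpers for triage; names and arguments are illustrative only.

# Read log text on stdin and print "<count> <path>" for each distinct path
# found in a "Failed to open file" entry.
failed_paths() {
  sed -n 's/.*Failed to open file: \([^,]*\),.*/\1/p' \
    | sort | uniq -c | sed 's/^ *//'
}

# Usage: locate_segment_file <path-from-log> <data-root>...
# Print every data root under which the path exists as a regular file.
locate_segment_file() {
  local rel="${1#/}"   # drop the leading slash seen in the log entries
  shift
  local root
  for root in "$@"; do
    if [ -f "$root/$rel" ]; then
      echo "$root"
    fi
  done
}
```

For example, piping the pm1 system log into `failed_paths` lists the affected files, and running `locate_segment_file` on each PM with that PM's data directories as the remaining arguments shows where the file really lives. If the only root that reports the file belongs to pm2 while the errors are logged on pm1, that matches the misrouting described above.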