[MCOL-5242] DiskJoinStep::joinFcn() Disk join could not open file (read access) Created: 2022-09-30  Updated: 2023-03-23  Resolved: 2023-02-22

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 5.6.2
Fix Version/s: 23.02.1

Type: Bug Priority: Critical
Reporter: Allen Lee (Inactive) Assignee: Leonid Fedorov
Resolution: Fixed Votes: 0
Labels: triage

Attachments: PNG File screenshot-1.png    
Issue Links:
Duplicate
is duplicated by MCOL-4866 /tmp/columnstore_tmp_files can be own... Closed
Assigned for Testing: Daniel Lee Daniel Lee (Inactive)

 Description   

User has been hitting after changing config directory as below:
============================
#mcsSetConfig SystemConfig SystemTempFileDir /tmp/columnstore_tmp_files

#systemctl stop mariadb-columnstore
#systemctl stop mariadb

#systemctl start mariadb
#systemctl start mariadb-columnstore
============================

  • Error log:

    Sep 26 19:48:03 cpandbsv01 joblist[148328]: 03.486812 |245|0|0| C 05 CAL0000: DiskJoinStep::joinFcn() Disk join could not open file (read access) /tmp/columnstore_tmp_files/joins//Columnstore-join-data-454-large: Operation not permitted
    debug.log
    Sep 26 19:48:03 cpandbsv01 joblist[148328]: 03.486812 |245|0|0| C 05 CAL0000: DiskJoinStep::joinFcn() Disk join could not open file (read access) /tmp/columnstore_tmp_files/joins//Columnstore-join-data-454-large: Operation not permitted
    Sep 26 19:48:04 cpandbsv01 ExeMgr[148328]: 04.273731 |245|0|0| D 16 CAL0042: End SQL statement
    Sep 26 19:48:04 cpandbsv01 writeenginesplit[159918]: 04.273983 |0|0|0| I 33 CAL0000: Send EOD message to All PMs
    Sep 26 19:48:04 cpandbsv01 writeengineserver[148313]: 04.319524 |0|0|0| D 32 CAL0000: 1805 : onReceiveEOD : child ID = 159947
    Sep 26 19:48:04 cpandbsv01 writeengineserver[148313]: 04.319600 |0|0|0| D 32 CAL0000: 1805 : Message Queue is empty; Stopping CF Thread
    Sep 26 19:48:07 cpandbsv01 cpimport.bin[159947]: 07.447017 |0|0|0| I 34 CAL0083: BulkLoad: JobId-87675; finished loading table <database>.<table>; 678080 rows inserted
    Sep 26 19:48:07 cpandbsv01 writeengine[159947]: 07.447066 |0|0|0| I 19 CAL0008: Bulkload |Job: /var/log/mariadb/columnstore/data/bulk/tmpjob/<filename>.xml |For table <database>.<table>: 678080 rows processed and 678080 rows inserted.
    Sep 26 19:48:07 cpandbsv01 cpimport.bin[159947]: 07.512299 |0|0|0| I 34 CAL0082: End BulkLoad: JobId-87675; status-SUCCESS
    Sep 26 19:48:07 cpandbsv01 writeengineserver[148313]: 07.835802 |0|0|0| I 32 CAL0000: 1805 : cpimport exit on success
    Sep 26 19:48:07 cpandbsv01 writeengineserver[148313]: 07.836307 |0|0|0| D 32 CAL0000: 1805 : onCpimportSuccess BrmReport Send
    Sep 26 19:48:07 cpandbsv01 writeengineserver[148313]: 07.836352 |0|0|0| D 32 CAL0000: 1805 : onReceiveEOD : child ID = 0
    Sep 26 19:48:07 cpandbsv01 writeengineserver[148313]: 07.836386 |0|0|0| D 32 CAL0000: 1805 : onReceiveEOD : child ID = 0
    Sep 26 19:48:08 cpandbsv01 writeenginesplit[159918]: 08.174147 |0|0|0| I 33 CAL0098: Received a Cpimport Pass from PM1.
    Sep 26 19:48:08 cpandbsv01 writeengineserver[148313]: 08.174146 |0|0|0| D 32 CAL0000: 1805 : OnReceiveCleanup arrived
    Sep 26 19:48:08 cpandbsv01 writeenginesplit[159918]: 08.243808 |0|0|0| I 33 CAL0000: Released Table Lock
    Sep 26 19:48:08 cpandbsv01 dbcon[147885]: 08.744482 |245|0|0| D 24 CAL0001: End SQL statement
    

  • syslog

 
Sep 26 19:48:03 cpandbsv01 systemd-logind: Removed session 2162031.
Sep 26 19:48:03 cpandbsv01 env: DiskJoinStep::joinFcn() caught a internal exception.
Sep 26 19:48:03 cpandbsv01 joblist[148328]: 03.486812 |245|0|0| C 05 CAL0000: DiskJoinStep::joinFcn() Disk join could not open file (read access) /tmp/columnstore_tmp_files/joins//Columnstore-join-data-454-large: Operation not permitted
Sep 26 19:48:03 cpandbsv01 env: joblist[148328]: 03.486812 |245|0|0| C 05 CAL0000: DiskJoinStep::joinFcn() Disk join could not open file (read access) /tmp/columnstore_tmp_files/joins//Columnstore-join-data-454-large: Operation not permitted
Sep 26 19:48:04 cpandbsv01 env: ExeMgr[148328]: 04.273731 |245|0|0| D 16 CAL0042: End SQL statement
Sep 26 19:48:04 cpandbsv01 writeenginesplit[159918]: 04.273983 |0|0|0| I 33 CAL0000: Send EOD message to All PMs
Sep 26 19:48:04 cpandbsv01 env: writeengineserver[148313]: 04.319524 |0|0|0| D 32 CAL0000: 1805 : onReceiveEOD : child ID = 159947
Sep 26 19:48:04 cpandbsv01 env: Received EOD
Sep 26 19:48:04 cpandbsv01 env: writeengineserver[148313]: 04.319600 |0|0|0| D 32 CAL0000: 1805 : Message Queue is empty; Stopping CF Thread
Sep 26 19:48:04 cpandbsv01 env: Notified all
Sep 26 19:48:04 cpandbsv01 env: DEBUG : still not stopped
Sep 26 19:48:04 cpandbsv01 env: CpiFeedThread Stopped!!
Sep 26 19:48:04 cpandbsv01 env: Finished running Feeder Thread!!
Sep 26 19:48:04 cpandbsv01 env: Going to close call = 16
Sep 26 19:48:04 cpandbsv01 env: ----- closed both pipes -------- aRet = 0
Sep 26 19:48:05 cpandbsv01 systemd-logind: New session 2162032 of user <username1>.
Sep 26 19:48:05 cpandbsv01 systemd: Started Session 2162032 of user <username1>.
Sep 26 19:48:05 cpandbsv01 systemd: Starting Session 2162032 of user <username1>.
Sep 26 19:48:06 cpandbsv01 systemd-logind: Removed session 2162032.
Sep 26 19:48:06 cpandbsv01 env: Inside tearDown waitpid rc[0] = 0
Sep 26 19:48:07 cpandbsv01 env: 2022-09-26 19:48:07 (159947) INFO : For table <database>.<table>: 678080 rows processed and 678080 rows inserted.
Sep 26 19:48:07 cpandbsv01 cpimport.bin[159947]: 07.447017 |0|0|0| I 34 CAL0083: BulkLoad: JobId-87675; finished loading table <database>.<table>; 678080 rows inserted
Sep 26 19:48:07 cpandbsv01 env: cpimport.bin[159947]: 07.447017 |0|0|0| I 34 CAL0083: BulkLoad: JobId-87675; finished loading table <database>.<table>; 678080 rows inserted
Sep 26 19:48:07 cpandbsv01 env: writeengine[159947]: 07.447066 |0|0|0| I 19 CAL0008: Bulkload |Job: /var/log/mariadb/columnstore/data/bulk/tmpjob/<filename>.xml |For table <database>.<table>: 678080 rows processed and 678080 rows inserted.
Sep 26 19:48:07 cpandbsv01 writeengine[159947]: 07.447066 |0|0|0| I 19 CAL0008: Bulkload |Job: /var/log/mariadb/columnstore/data/bulk/tmpjob/<filename>.xml |For table <database>.<table>: 678080 rows processed and 678080 rows inserted.
Sep 26 19:48:07 cpandbsv01 env: 2022-09-26 19:48:07 (159947) INFO : Bulk load completed, total run time : 1270.12 seconds
Sep 26 19:48:07 cpandbsv01 env: cpimport.bin[159947]: 07.512299 |0|0|0| I 34 CAL0082: End BulkLoad: JobId-87675; status-SUCCESS
Sep 26 19:48:07 cpandbsv01 cpimport.bin[159947]: 07.512299 |0|0|0| I 34 CAL0082: End BulkLoad: JobId-87675; status-SUCCESS
Sep 26 19:48:07 cpandbsv01 env: Received SIGCHLD of terminated process..
Sep 26 19:48:07 cpandbsv01 env: Inside tearDown waitpid rc[1] = 159947
Sep 26 19:48:07 cpandbsv01 env: Cpimport exit on success
Sep 26 19:48:07 cpandbsv01 writeengineserver[148313]: 07.835802 |0|0|0| I 32 CAL0000: 1805 : cpimport exit on success
Sep 26 19:48:07 cpandbsv01 env: writeengineserver[148313]: 07.835802 |0|0|0| I 32 CAL0000: 1805 : cpimport exit on success
Sep 26 19:48:07 cpandbsv01 env: Sending BRMRPT
Sep 26 19:48:07 cpandbsv01 env: Closed File : /tmp/columnstore_tmp_files/BrmRpt0826192657159918.rpt 0
Sep 26 19:48:07 cpandbsv01 env: Finished Sending BRMRPT
Sep 26 19:48:07 cpandbsv01 env: Sent CPIPASS info
Sep 26 19:48:07 cpandbsv01 env: writeengineserver[148313]: 07.836307 |0|0|0| D 32 CAL0000: 1805 : onCpimportSuccess BrmReport Send
Sep 26 19:48:07 cpandbsv01 env: writeengineserver[148313]: 07.836352 |0|0|0| D 32 CAL0000: 1805 : onReceiveEOD : child ID = 0
Sep 26 19:48:07 cpandbsv01 env: Received EOD
Sep 26 19:48:07 cpandbsv01 env: writeengineserver[148313]: 07.836386 |0|0|0| D 32 CAL0000: 1805 : onReceiveEOD : child ID = 0
Sep 26 19:48:07 cpandbsv01 env: Received EOD
Sep 26 19:48:08 cpandbsv01 env: writeengineserver[148313]: 08.174146 |0|0|0| D 32 CAL0000: 1805 : OnReceiveCleanup arrived
Sep 26 19:48:08 cpandbsv01 writeenginesplit[159918]: 08.174147 |0|0|0| I 33 CAL0098: Received a Cpimport Pass from PM1.
Sep 26 19:48:08 cpandbsv01 env: ClearTableLockCmd::processCleanup for DataLoader: tableOID-87675
Sep 26 19:48:08 cpandbsv01 writeenginesplit[159918]: 08.243808 |0|0|0| I 33 CAL0000: Released Table Lock
Sep 26 19:48:08 cpandbsv01 systemd-logind: New session 2162033 of user <username1>.
Sep 26 19:48:08 cpandbsv01 systemd: Started Session 2162033 of user <username1>.
Sep 26 19:48:08 cpandbsv01 systemd: Starting Session 2162033 of user <username1>.
Sep 26 19:48:08 cpandbsv01 env: Rx Bytes 1168761463
Sep 26 19:48:08 cpandbsv01 env: TX Bytes 23652
Sep 26 19:48:08 cpandbsv01 env: Child PID 0
Sep 26 19:48:08 cpandbsv01 mariadbd: dbcon[147885]: 08.744482 |245|0|0| D 24 CAL0001: End SQL statement



 Comments   
Comment by Leonid Fedorov [ 2023-02-16 ]

allen.herrera That was awesome reproduce instruction. Video file with consoles made my day. Fixing PR is ready

Comment by Daniel Lee (Inactive) [ 2023-02-22 ]

Build verified: develop branch

engine: e6127ab9bc3b8bfb5916bb57b2f5a1f3641fae61
server: 1c8bc55d8fadfeefaf7fbaabd04a0929481a5381
buildNo: 6871

Reproduced the issue in 22.08.2 and verified the in the build mentioned above.

Generated at Thu Feb 08 02:56:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.