MariaDB ColumnStore / MCOL-833

PrimProc "could not open file for OID" errors after recovering from a pm2 PrimProc outage

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.9
    • Fix Version/s: 1.1.0
    • Component/s: ?, ExeMgr
    • Labels: None

    Description

      A customer reported PrimProc "could not open file for OID" errors occurring for unknown reasons; see MCOL-801 and MCOL-804.

      I was able to reproduce this error with the steps below. I was not sure whether it is the same problem as MCOL-801/MCOL-804, so I opened a new bug.

      1. Set up a 1 UM / 2 PM system with a 50 GB tpch1 database.
      2. Run a script that continually issues the following query:
      [root@ip-172-30-0-161 ~]# cat query.sh
      #!/bin/bash
      while [ true ]; do
      echo "select count(*) from lineitem" | /usr/local//mariadb/columnstore/mysql/bin/mysql --defaults-extra-file=/usr/local//mariadb/columnstore/mysql/my.cnf -u root tpch100
      sleep 1
      done
      exit 0

      3. Run pkill against PrimProc on pm2.

      Errors started appearing in the pm1 logs soon after the recovery was performed:

      Jul 14 16:24:46 ip-172-30-0-176 PrimProc[93531]: 46.550644 |0|0|0| W 28 CAL0000: IDB-2039: Data file does not exist, please contact your system administrator for more information.
      Jul 14 16:24:47 ip-172-30-0-176 IDBFile[93531]: 47.550530 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf, exception: unable to open Unbuffered file
      Jul 14 16:24:48 ip-172-30-0-176 IDBFile[93531]: 48.550839 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf, exception: unable to open Unbuffered file
      Jul 14 16:24:49 ip-172-30-0-176 IDBFile[93531]: 49.551158 |0|0|0| D 35 CAL0002: Failed to open file: /000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf, exception: unable to open Unbuffered file

      This file exists on pm2, so ExeMgr is sending the request to the wrong PrimProc (the one on pm1):

      data2]# ll 000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf
      -rw-r--r-- 1 root root 11345920 Jul 14 16:00 000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf
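
      To tie the failing path back to the OID named in the error, the path can be decoded. The following is a minimal sketch, not part of the original report. It assumes the usual ColumnStore segment file layout, in which the first four NNN.dir components are the bytes of the OID (most significant first), the fifth is the partition, and FILENNN.cdf carries the segment number. The function name decode_segment_path is hypothetical.

```shell
#!/bin/sh
# Decode a ColumnStore segment file path into OID, partition and segment.
# ASSUMPTION: layout is A.dir/B.dir/C.dir/D.dir/<partition>.dir/FILE<segment>.cdf,
# where A..D are the four bytes of the OID, most significant byte first.
decode_segment_path() {
  printf '%s\n' "$1" | awk -F/ '
    {
      n = NF
      # The last component must look like FILENNN.cdf, preceded by five NNN.dir parts.
      if (n < 6 || $n !~ /^FILE[0-9][0-9][0-9]\.cdf$/) { exit 1 }
      for (i = 1; i <= 5; i++) {
        part = $(n - 6 + i)
        if (part !~ /^[0-9][0-9][0-9]\.dir$/) { exit 1 }
        v[i] = substr(part, 1, 3) + 0   # "+ 0" yields a decimal number, so 012 is 12
      }
      seg = substr($n, 5, 3) + 0
      oid = ((v[1] * 256 + v[2]) * 256 + v[3]) * 256 + v[4]
      printf "oid=%d partition=%d segment=%d\n", oid, v[5], seg
    }'
}

# Path taken from the log lines above.
decode_segment_path /000.dir/000.dir/012.dir/012.dir/000.dir/FILE002.cdf
```

      Under that assumption, the path in the log corresponds to OID 3084, partition 0, segment 2, which can then be compared against where the system believes that segment lives.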

          People

            dleeyh Daniel Lee (Inactive)
            hill David Hill (Inactive)