  1. MariaDB ColumnStore
  2. MCOL-2140

Replication timeout of 1 minute is too small - timing out on a system with 4 nodes


Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Won't Do
    • None
    • Icebox
    • N/A
    • None
    • 2um 2pm with local query

    Description

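      A customer reported that replication was not working and the slaves were not being set up on their 2 PM, 2 UM system with local query. It turns out that the distribute request failed due to a timeout on PM1, where ProcMgr was waiting on the UM1 ProcMon. The distribute command took longer than 1 minute on a 4-node system, where it has to distribute to 3 slave nodes. In the logs below, PM1 gives up at 15:30:45, exactly 60 seconds after sending the request, while the three rsync runs on UM1 take about 23, 30, and 28 seconds and only finish at 15:31:06, roughly 81 seconds in total.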

      PM1

      Feb 4 15:29:45 usfit-scdb6 ProcessManager[171189]: 45.342198 |0|0|0| D 17 CAL0000: sendMsgProcMon: Process module um1
      Feb 4 15:30:45 usfit-scdb6 ProcessManager[171189]: 45.393626 |0|0|0| E 17 CAL0000: line: 6901 sendMsgProcMon: ProcMon Msg timeout on module um1

      UM1 15:29:45 to 15:31:06

      Feb 4 15:29:45 usfit-scdb1 ProcessMonitor[101017]: 45.338509 |0|0|0| I 18 CAL0000: MSG RECEIVED: Run Master DB Distribute command
      Feb 4 15:29:45 usfit-scdb1 ProcessMonitor[101017]: 45.338704 |0|0|0| D 18 CAL0000: runMasterDist function called

      Feb 4 15:29:45 usfit-scdb1 ProcessMonitor[101017]: 45.350897 |0|0|0| D 18 CAL0000: cmd = /usr/local/mariadb/columnstore/bin/rsync.sh 192.168.212.39 ssh /usr/local/mariadb/columnstore 1 > /scdbprd_tmp//master-dist_um2.log
      Feb 4 15:30:08 usfit-scdb1 ProcessMonitor[101017]: 08.522846 |0|0|0| D 18 CAL0000: runMasterDist: Success rsync to module: um2

      Feb 4 15:30:08 usfit-scdb1 ProcessMonitor[101017]: 08.522949 |0|0|0| D 18 CAL0000: cmd = /usr/local/mariadb/columnstore/bin/rsync.sh 192.168.212.47 ssh /usr/local/mariadb/columnstore 1 > /scdbprd_tmp//master-dist_pm1.log
      Feb 4 15:30:38 usfit-scdb1 ProcessMonitor[101017]: 38.745679 |0|0|0| D 18 CAL0000: runMasterDist: Success rsync to module: pm1

      Feb 4 15:30:38 usfit-scdb1 ProcessMonitor[101017]: 38.745774 |0|0|0| D 18 CAL0000: cmd = /usr/local/mariadb/columnstore/bin/rsync.sh 192.168.212.48 ssh /usr/local/mariadb/columnstore 1 > /scdbprd_tmp//master-dist_pm2.log
      Feb 4 15:31:06 usfit-scdb1 ProcessMonitor[101017]: 06.437719 |0|0|0| D 18 CAL0000: runMasterDist: Success rsync to module: pm2

      Feb 4 15:31:06 usfit-scdb1 ProcessMonitor[101017]: 06.437841 |0|0|0| I 18 CAL0000: MASTERDIST: runMasterRep - ACK back to ProcMgr return status = 0
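
      The timing in the UM1 log can be checked with a small script. This is only an illustrative sketch, not part of ColumnStore: the function name rsync_durations and the constant PROCMGR_TIMEOUT_S are made up for this example, the 60-second value is inferred from the PM1 log (15:29:45 to 15:30:45), and the script assumes whole-second syslog timestamps that do not cross midnight.

```python
import re
from datetime import datetime

# Lines copied from the UM1 log excerpt above (only the ones needed for timing).
UM1_LOG = """\
Feb 4 15:29:45 usfit-scdb1 ProcessMonitor[101017]: 45.338509 |0|0|0| I 18 CAL0000: MSG RECEIVED: Run Master DB Distribute command
Feb 4 15:30:08 usfit-scdb1 ProcessMonitor[101017]: 08.522846 |0|0|0| D 18 CAL0000: runMasterDist: Success rsync to module: um2
Feb 4 15:30:38 usfit-scdb1 ProcessMonitor[101017]: 38.745679 |0|0|0| D 18 CAL0000: runMasterDist: Success rsync to module: pm1
Feb 4 15:31:06 usfit-scdb1 ProcessMonitor[101017]: 06.437719 |0|0|0| D 18 CAL0000: runMasterDist: Success rsync to module: pm2
"""

# Assumption: the ProcMgr wait is 60 s, inferred from the PM1 log timestamps.
PROCMGR_TIMEOUT_S = 60

STAMP = re.compile(r"^\w{3}\s+\d+\s+(\d{2}:\d{2}:\d{2})")
MODULE = re.compile(r"Success rsync to module: (\w+)")


def rsync_durations(log_text):
    """Return [(module, seconds)] measured from the previous timestamped line."""
    durations = []
    previous = None
    for line in log_text.splitlines():
        stamp = STAMP.match(line)
        if not stamp:
            continue
        now = datetime.strptime(stamp.group(1), "%H:%M:%S")
        done = MODULE.search(line)
        if done and previous is not None:
            seconds = int((now - previous).total_seconds())
            durations.append((done.group(1), seconds))
        previous = now
    return durations


durations = rsync_durations(UM1_LOG)
total = sum(seconds for _, seconds in durations)
for module, seconds in durations:
    print(f"{module}: {seconds}s")
print(f"total: {total}s, timeout: {PROCMGR_TIMEOUT_S}s, exceeded: {total > PROCMGR_TIMEOUT_S}")
```

      Because the rsync runs happen one after another, the total grows with the number of slave nodes, so a fixed 60-second wait that is enough for one or two targets can be exceeded with three.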

          People

            Assignee: Unassigned
            Reporter: David Hill (Inactive)
            Votes: 1
            Watchers: 4
