MariaDB ColumnStore / MCOL-3278

ProcMon and ProcMgr crashed - Signal: 6 - libmessageqcpp.so.1

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.1.6
    • Fix Version/s: N/A
    • Component/s: oam
    • Labels:
      None
    • Environment:
      3 UM, 5 PM, Amazon EBS system

      Description

      Customer reported:

      The problem was that a colxml error prevented cpimport from running. It appears to me to be a temporary communication error.

      Found this from the support report:

      ProcMon restarted on pm1, which caused the errors on um1 and is the reason colxml and cpimport failed: they couldn't get the correct status of the DBRoots.

      ProcMgr restarted 18 minutes later.

      Only thing to note is that ProcMon and ProcMgr both crashed with similar errors.

      Based on the logs, I am not sure why ProcMon restarted or why ProcMgr followed a bit later. I will open a new bug.

      um1 logs

      Apr 23 04:59:53 mcs1-um1 joblist[125849]: 53.777357 |2147483648|0|0| C 05 CAL0000: IDB-2034: At least one DBRoot required for that query is offline.

      Apr 23 04:59:53 mcs1-um1 oamcpp[125849]: 53.714950 |0|0|0| E 08 CAL0000: OamCache::checkReload exception while getModuleStatus pm1 Invalid Parameter passed in getModuleStatus API

      Apr 23 04:59:53 mcs1-um1 writeengine[125849]: 53.833765 |0|0|0| E 19 CAL0087: BulkLoad Error: colxml runtime exception: Error reading columns for table canary.future_bigsum_tmp: IDB-2043: An internal error occurred. Check the error log file & contact support.

      pm1 logs

      Apr 23 04:59:49 mcs1-pm1 messagequeue[113607]: 49.076723 |0|0|0| W 31 CAL0000: Client read close socket for InetStreamSocket::readToMagic(): I/O error1: rc-1; poll signal interrupt ( POLLHUP POLLERR )
      Apr 23 04:59:49 mcs1-pm1 ProcessManager[113607]: 49.095118 |0|0|0| D 17 CAL0000: Set System State = ACTIVE
      Apr 23 04:59:49 mcs1-pm1 ProcessManager[113607]: 49.095953 |0|0|0| D 17 CAL0000: setQuerySystemState = 1
      Apr 23 04:59:49 mcs1-pm1 ProcessManager[113607]: 49.096429 |0|0|0| D 17 CAL0000: setQuerySystemState successful
      Apr 23 04:59:54 mcs1-pm1 ProcessMonitor[84163]: 54.101273 |0|0|0| I 18 CAL0000:
      Apr 23 04:59:54 mcs1-pm1 ProcessMonitor[84163]: 54.101385 |0|0|0| I 18 CAL0000: *********Process Monitor Started*********
      Apr 23 04:59:54 mcs1-pm1 ProcessMonitor[84163]: 54.101405 |0|0|0| D 18 CAL0000:
      Apr 23 04:59:54 mcs1-pm1 ProcessMonitor[84163]: 54.101421 |0|0|0| D 18 CAL0000: *********Process Monitor Started*********
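
For triage it can help to split log lines like the ones above into fields and keep only the critical and error entries. The following is a minimal sketch, not a ColumnStore tool: the group names (`subsec`, `ids`, `subsys`, `msgid`) are labels guessed from the layout of these lines, and reading the single letter as a severity (C, E, W, D, I) is likewise an assumption.

```python
import re
from typing import Optional

# Group names are illustrative guesses, not a documented ColumnStore schema.
LOG_RE = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d{2}:\d{2}:\d{2}) "   # syslog timestamp (no year)
    r"(?P<host>\S+) "
    r"(?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"(?P<subsec>\d+\.\d+) "                        # seconds.microseconds
    r"\|(?P<ids>\d+\|\d+\|\d+)\| "                  # three numeric ids, meaning assumed
    r"(?P<severity>[A-Z]) (?P<subsys>\d+) (?P<msgid>CAL\d+):"
    r" ?(?P<text>.*)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Return the named fields of one log line, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None


def serious(lines):
    """Keep only entries whose severity letter is C or E."""
    parsed = (parse_line(raw) for raw in lines)
    return [p for p in parsed if p and p["severity"] in ("C", "E")]


# Three lines copied from the um1 and pm1 excerpts in this ticket.
sample = [
    "Apr 23 04:59:53 mcs1-um1 joblist[125849]: 53.777357 |2147483648|0|0| C 05 CAL0000: "
    "IDB-2034: At least one DBRoot required for that query is offline.",
    "Apr 23 04:59:53 mcs1-um1 oamcpp[125849]: 53.714950 |0|0|0| E 08 CAL0000: "
    "OamCache::checkReload exception while getModuleStatus pm1 "
    "Invalid Parameter passed in getModuleStatus API",
    "Apr 23 04:59:49 mcs1-pm1 ProcessManager[113607]: 49.095118 |0|0|0| D 17 CAL0000: "
    "Set System State = ACTIVE",
]

for entry in serious(sample):
    print(entry["host"], entry["proc"], entry["msgid"], entry["text"])
```

Lines that are not in this layout, such as the `Signal: 6` header of a crash trace, return `None` and are skipped.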

      Date/time: 2019-04-23 04:59:47
      Signal: 6

      /usr/local/mariadb/columnstore/bin/ProcMon(_Z12fatalHandleri+0x150)[0x5557d5baf8c0]
      /lib64/libpthread.so.0(+0xf6d0)[0x7efc34aed6d0]
      /lib64/libc.so.6(gsignal+0x37)[0x7efc33b1a277]
      /lib64/libc.so.6(abort+0x148)[0x7efc33b1b968]
      /lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x165)[0x7efc344297d5]
      /lib64/libstdc++.so.6(+0x5e746)[0x7efc34427746]
      /lib64/libstdc++.so.6(+0x5e773)[0x7efc34427773]
      /lib64/libstdc++.so.6(+0x5e993)[0x7efc34427993]
      /usr/local/mariadb/columnstore/lib/libmessageqcpp.so.1(_ZNK11messageqcpp10ByteStream4peekERSs+0x104)[0x7efc36015d64]
      /usr/local/mariadb/columnstore/lib/libmessageqcpp.so.1(_ZN11messageqcpp10ByteStreamrsERSs+0x12)[0x7efc36015db2]
      /usr/local/mariadb/columnstore/bin/ProcMon(_Z16processStatusMSGPN11messageqcpp8IOSocketE+0x91b)[0x5557d5b7712b]
      /lib64/libpthread.so.0(+0x7e25)[0x7efc34ae5e25]
      /lib64/libc.so.6(clone+0x6d)[0x7efc33be2bad]
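
Read bottom-up, the mangled names in this trace decode (Itanium C++ ABI) to `processStatusMSG(messageqcpp::IOSocket*)` calling `messageqcpp::ByteStream::operator>>(std::string&)`, which calls `messageqcpp::ByteStream::peek(std::string&) const`. The unnamed libstdc++ frames, the verbose terminate handler and `abort` sit above that, which is consistent with an exception leaving `peek` uncaught in that thread and ending in signal 6. The sketch below only splits frames of the form `module(symbol+offset)[address]` into parts so the two traces in this ticket can be compared; the function names `parse_frame` and `frames_in` are made up for illustration.

```python
import re
from typing import Optional

# One frame looks like: module(symbol+0xOFFSET)[0xADDRESS]
# The module and the parenthesised part may both be absent, e.g. "[0x55da18a03c70]".
FRAME_RE = re.compile(
    r"^(?P<module>[^(\[]*)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line: str) -> Optional[dict]:
    """Split one backtrace line into module, symbol, offset and address."""
    m = FRAME_RE.match(line.strip())
    if not m:
        return None
    return {
        "module": m.group("module") or None,
        "symbol": m.group("symbol") or None,
        "offset": int(m.group("offset"), 16) if m.group("offset") else None,
        "address": int(m.group("address"), 16),
    }


def frames_in(lines, module_suffix: str):
    """Return the parsed frames whose module path ends with module_suffix."""
    parsed = (parse_frame(raw) for raw in lines)
    return [f for f in parsed if f and f["module"] and f["module"].endswith(module_suffix)]


# Frames copied from the ProcMon trace in this ticket.
trace = [
    "/usr/local/mariadb/columnstore/bin/ProcMon(_Z12fatalHandleri+0x150)[0x5557d5baf8c0]",
    "/lib64/libpthread.so.0(+0xf6d0)[0x7efc34aed6d0]",
    "/usr/local/mariadb/columnstore/lib/libmessageqcpp.so.1"
    "(_ZNK11messageqcpp10ByteStream4peekERSs+0x104)[0x7efc36015d64]",
    "/usr/local/mariadb/columnstore/lib/libmessageqcpp.so.1"
    "(_ZN11messageqcpp10ByteStreamrsERSs+0x12)[0x7efc36015db2]",
]

for frame in frames_in(trace, "libmessageqcpp.so.1"):
    print(frame["symbol"], hex(frame["offset"]))
```

Comparing `symbol` and `offset` rather than `address` is the useful part: the absolute addresses differ between processes, but the same symbol and offset inside `libmessageqcpp.so.1` point at the same code location.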

      And ProcMgr restarted 18 minutes after ProcMon.

      Apr 23 05:17:14 mcs1-pm1 ProcessMonitor[84163]: 14.581682 |0|0|0| C 18 CAL0000: *****MariaDB ColumnStore Process Restarting: ProcessManager, old PID = 113607
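
As a quick check of the "18 minutes" figure, the gap can be computed from the timestamps quoted in this ticket. This is only a sketch: the year 2019 is taken from the `Date/time` line of the ProcMon trace because syslog timestamps carry no year, and `syslog_time` is a helper name invented here.

```python
from datetime import datetime

# %b parses "Apr" under an English/C locale; syslog stamps have no year,
# so the year is supplied separately (2019, from the crash trace header).
FMT = "%Y %b %d %H:%M:%S"


def syslog_time(stamp: str, year: int = 2019) -> datetime:
    """Turn a syslog stamp such as 'Apr 23 04:59:54' into a datetime."""
    return datetime.strptime(f"{year} {stamp}", FMT)


procmon_crash = datetime(2019, 4, 23, 4, 59, 47)     # "Date/time" of the ProcMon trace
procmon_started = syslog_time("Apr 23 04:59:54")     # "Process Monitor Started"
procmgr_restart = syslog_time("Apr 23 05:17:14")     # "Process Restarting: ProcessManager"

gap_from_crash = procmgr_restart - procmon_crash
gap_from_start = procmgr_restart - procmon_started

print("ProcMon crash -> ProcMgr restart:", gap_from_crash)
print("ProcMon start -> ProcMgr restart:", gap_from_start)
```

Both intervals come out a little under 17 and a half minutes, so "18 minutes" in the description is a rounded figure.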

      This trace is from an earlier ProcMgr crash. I didn't see one for this restart.

      Date/time: 2019-04-22 00:02:48
      Signal: 6

      [0x55da18a03c70]
      /lib64/libpthread.so.0(+0xf6d0)[0x7fa6549ce6d0]
      /lib64/libc.so.6(gsignal+0x37)[0x7fa6539fb277]
      /lib64/libc.so.6(abort+0x148)[0x7fa6539fc968]
      /lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x165)[0x7fa65430a7d5]
      /lib64/libstdc++.so.6(+0x5e746)[0x7fa654308746]
      /lib64/libstdc++.so.6(+0x5e773)[0x7fa654308773]
      /lib64/libstdc++.so.6(+0x5e993)[0x7fa654308993]
      /usr/local/mariadb/columnstore/lib/libmessageqcpp.so.1(_ZNK11messageqcpp10ByteStream4peekERSs+0x104)[0x7fa655ef6d64]
      /usr/local/mariadb/columnstore/lib/libmessageqcpp.so.1(_ZN11messageqcpp10ByteStreamrsERSs+0x12)[0x7fa655ef6db2]
      [0x55da189f1c62]
      /lib64/libpthread.so.0(+0x7e25)[0x7fa6549c6e25]
      /lib64/libc.so.6(clone+0x6d)[0x7fa653ac3bad]

            People

            Assignee:
            pleblanc Patrick LeBlanc (Inactive)
            Reporter:
            hill David Hill (Inactive)