  1. MariaDB ColumnStore
  2. MCOL-5760

Dump failing with "Internal error: Lost connection to ExeMgr. Please contact your administrator (1815)" after upgrading to version 23.10.1

Details

    • Bug
    • Status: Needs Feedback
    • Critical
    • Resolution: Unresolved
    • 23.10.1
    • 23.10
    • ExeMgr
    • 2024-2

    Description

      The customer was hitting an error in their environment where JSON_VALUE caused PrimProc to crash with signal 11 (reported Here and Here).

      Following the Fix version given in the above JIRAs, we upgraded the customer's environment to version 23.10.1. After the upgrade, however, we started hitting the issue below:

      SELECT mo_id, JSON_VALUE(mo_crs, '$.997') FROM dmc_dg_ita_920.modello LIMIT 0, 1000
       
      Error Code: 1815
      Internal error: Lost connection to ExeMgr. Please contact your administrator
      

      As advised by CS Engineering, we attempted to take a dump of the table (after getting the customer's approval) and hit this error:

      mysqldump: Couldn't execute 'SELECT /*!40001 SQL_NO_CACHE */ `mo_id`, `mo_desc`, `mo_ma_desc`, `mo_ma_id`, `mo_fa_id`, `mo_ti_id`, `mo_id_padre`, `mo_pds`, `mo_datacreazione`, `mo_crs` FROM `modello`': Internal error: Lost connection to ExeMgr. Please contact your administrator (1815)
      

      As requested by Allen, we ran select * from calpontsys.systable;
      and it returned results. However, when we tried running a select on the affected table:

      MariaDB [dmc_dg_ita_920]> select * from modello limit 1;
      

      it hung and then failed, with these messages in crit.log:

      [QBERG] root@sw-dbmcs-uat: columnstore # tail -f crit.log
      May 20 06:27:23 fw-dbmcs-uat PrimProc[27026]: 23.399433 |0|0|0| C 28 CAL0000: /home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/columnstore/columnstore/primitives/primproc/columncommand.cpp error on projectResultRG for oid 1043 lbid 107485312: input rids 5018,  output rids 3277#012: Restarted a syscat job 120 times, bailing#012         %%10%%
      May 20 06:27:23 fw-dbmcs-uat joblist[27026]: 23.400347 |2147507803|0|0| C 05 CAL0000: st: 1 TupleBPS::receiveMultiPrimitiveMessages() caught  an exception originally thrown by PrimProc:          %%10%%
      May 20 06:27:23 fw-dbmcs-uat ExeMgr[27026]: 23.401097 |24155|0|0| C 16 CAL0055: ERROR: ExeMgr has caught an exception. MCS-2044: An internal error occurred.  Check the error log file & contact support.
      

      We need to get to the bottom of this, because the issue that was fixed, and that prompted the upgrade, is also being experienced in Production.

          People

            leonid.fedorov Leonid Fedorov
            michael.amadi Michael Amadi