MCOL-5339 (MariaDB ColumnStore)

DDLProc[xx]: Could not connect to pmX_WriteEngineServer: Connection refused

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Unresolved
    • 22.08.2, 22.08.4, 6.4.6, 23.02.3
    • 23.10
    • None
    • 2024-2

    Description

      In CS 6.4.x and CS 22.08.4, a customer's DDLProc stops working shortly after a few CREATE TABLE statements and a stored procedure. Attaching what I have.

      More triage is needed.

      There is a lot of other activity on the server, and it is unknown to what extent it is affecting ColumnStore. See /var/log/messages, which shows salt-minion, docker, slapd, pam_unix, postfix, prometheus, and more.

      Customer-facing issue:
      ERROR 1815 (HY000): Internal error: Lost connection to DDLProc

      MariaDB [test]> CREATE TABLE `t1` (
      -> `id` int(11) NOT NULL
      -> ) ENGINE=Columnstore ;
      ERROR 1815 (HY000): Internal error: Lost connection to DDLProc
      

      The workaround is to run mcsShutdown, clearShm, and mcsStart, in that order.
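
The workaround sequence can be scripted so that a failed step stops the run. The following is a minimal sketch, not an official procedure. It assumes the three command names from the ticket resolve to executables on PATH; if your installation defines them as shell aliases, substitute the underlying commands. The names `WORKAROUND_STEPS`, `run_command`, and `apply_workaround` are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence

# Command names are taken from the ticket. That they are executables on PATH
# (rather than shell aliases) is an assumption.
WORKAROUND_STEPS: List[List[str]] = [["mcsShutdown"], ["clearShm"], ["mcsStart"]]


def run_command(argv: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def apply_workaround(runner: Callable[[Sequence[str]], int] = run_command) -> List[str]:
    """Run the steps in order, stopping at the first non-zero exit code.

    Returns the names of the steps that completed successfully.
    """
    completed: List[str] = []
    for argv in WORKAROUND_STEPS:
        if runner(argv) != 0:
            break
        completed.append(argv[0])
    return completed
```

Passing a custom `runner` makes it possible to rehearse the sequence without touching a live system.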

          Activity

            bbancroft Bryan Bancroft (Inactive) added a comment

            drrtuy additional logs attached to the case

            suresh.ramagiri@mariadb.com suresh ramagiri added a comment - edited

            One of our customers hit a similar problem:

            All of a sudden, DDL statements (for example, CREATE TABLE for the ColumnStore engine) fail with the error "Internal error: Lost connection to DDLProc".

            The syslog shows recent DDLProc crashes triggered by an unhandled exception in we_clients.cpp:

             
            Jun 30 08:41:53 linxdd-edw01 DDLProc[272863]: joblist[272863]: 53.457773 |0|0|0| E 05 CAL0000: /home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/columnstore/columnstore/writeengine/client/we_clients.cpp @ 280 Could not connect to pm1_WriteEngineServer: Connection refused      %%10%%
            Jun 30 08:41:53 linxdd-edw01 DDLProc[272863]: Could not connect to pm1_WriteEngineServer: Connection refused
            Jun 30 08:41:53 linxdd-edw01 DDLProc[272863]: joblist[272863]: 53.458115 |0|0|0| E 05 CAL0000: /home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/columnstore/columnstore/writeengine/client/we_clients.cpp @ 280 Could not connect to pm2_WriteEngineServer: Connection refused      %%10%%
            Jun 30 08:41:53 linxdd-edw01 DDLProc[272863]: Could not connect to pm2_WriteEngineServer: Connection refused
            Jun 30 08:41:53 linxdd-edw01 joblist[272863]: 53.457773 |0|0|0| E 05 CAL0000: /home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/columnstore/columnstore/writeengine/client/we_clients.cpp @ 280 Could not connect to pm1_WriteEngineServer: Connection refused      %%10%%
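
To see how often each PM is affected, syslog can be scanned for the message shown above. This is a small sketch under the assumption that the message text matches the excerpt exactly; the function name `count_refused` is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the message text as it appears in the log excerpt.
REFUSED = re.compile(
    r"Could not connect to (pm\d+)_WriteEngineServer: Connection refused"
)


def count_refused(lines: Iterable[str]) -> Counter:
    """Count 'Connection refused' messages per PM name (pm1, pm2, ...)."""
    counts: Counter = Counter()
    for line in lines:
        match = REFUSED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, `count_refused(open("/var/log/messages", errors="replace"))` tallies the messages in the file named in the description. Each failure is logged more than once (with and without the `joblist` prefix), so the counts are log lines, not distinct connection attempts.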
            
            

            DDLProc crashed:

             
            Date/time: 2023-06-30 08:41:53
            Signal: 6
             
            /usr/bin/DDLProc(+0x23066)[0x563cf2f86066]
            /lib64/libc.so.6(+0x54df0)[0x7f9f62654df0]
            /lib64/libc.so.6(+0xa154c)[0x7f9f626a154c]
            /lib64/libc.so.6(raise+0x16)[0x7f9f62654d46]
            /lib64/libc.so.6(abort+0xd3)[0x7f9f626287f3]
            /lib64/libstdc++.so.6(+0xa1a01)[0x7f9f62aa1a01]
            /lib64/libstdc++.so.6(+0xad37c)[0x7f9f62aad37c]
            /lib64/libstdc++.so.6(+0xac349)[0x7f9f62aac349]
            /lib64/libstdc++.so.6(__gxx_personality_v0+0x9a)[0x7f9f62aacaca]
            /lib64/libwriteengineclient.so(+0x280ac)[0x7f9f640af0ac]
            /lib64/libwriteengineclient.so(+0x28c2e)[0x7f9f640afc2e]
            /lib64/libwriteengineclient.so(+0x11485)[0x7f9f64098485]
            /lib64/libwriteengineclient.so(_ZN11WriteEngine9WEClientsD1Ev+0x1a)[0x7f9f6409ef0a]
            /lib64/libddlpackageproc.so(_ZN19ddlpackageprocessor19DDLPackageProcessorD1Ev+0x58)[0x7f9f6434bab8]
            /usr/bin/DDLProc(+0x2137c)[0x563cf2f8437c]
            /usr/bin/DDLProc(+0x1f47e)[0x563cf2f8247e]
            /usr/bin/DDLProc(+0x208ac)[0x563cf2f838ac]
            /lib64/libthreadpool.so(_ZN10threadpool10ThreadPool11beginThreadEv+0x64c)[0x7f9f62dd552c]
            /lib64/libboost_thread.so.1.75.0(+0x14032)[0x7f9f62fb9032]
            /lib64/libc.so.6(+0x9f802)[0x7f9f6269f802]
            /lib64/libc.so.6(+0x3f450)[0x7f9f6263f450]
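
Signal 6 is SIGABRT on Linux, which matches the `abort` frame. The mangled names `_ZN11WriteEngine9WEClientsD1Ev` and `_ZN19ddlpackageprocessor19DDLPackageProcessorD1Ev` demangle to the destructors `WriteEngine::WEClients::~WEClients()` and `ddlpackageprocessor::DDLPackageProcessor::~DDLPackageProcessor()`. One plausible reading is that an exception escaped during destruction and the C++ runtime terminated the process. The sketch below parses frames in this format and flags destructor-like symbols. The suffix check is a heuristic (a member function literally named `D1` would also match), so confirm any hit with a demangler such as `c++filt`. All names in the code are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# module(symbol+0xoffset)[0xaddress]; the symbol may be empty.
FRAME = re.compile(
    r"^\s*(?P<module>[^\s(]+)\((?P<symbol>[^+)]*)\+0x[0-9a-fA-F]+\)\[0x[0-9a-fA-F]+\]"
)
# Heuristic: Itanium ABI destructors in a nested scope end in D0Ev, D1Ev or D2Ev.
DTOR = re.compile(r"^_ZN.*D[012]Ev$")


class Frame(NamedTuple):
    module: str
    symbol: Optional[str]  # None when the frame carries only an offset


def parse_frames(text: str) -> List[Frame]:
    """Extract (module, symbol) pairs from backtrace lines, skipping other lines."""
    frames: List[Frame] = []
    for line in text.splitlines():
        match = FRAME.match(line)
        if match:
            frames.append(Frame(match.group("module"), match.group("symbol") or None))
    return frames


def destructor_frames(frames: List[Frame]) -> List[Frame]:
    """Return frames whose mangled symbol looks like a destructor."""
    return [f for f in frames if f.symbol and DTOR.match(f.symbol)]
```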
            
            

            allen.herrera Allen Herrera added a comment - edited

            We believe the fix for MCOL-5352, which ships in 23.10.2, will resolve this issue.


            People

              allen.herrera Allen Herrera
              allen.herrera Allen Herrera
              Votes: 1
              Watchers: 9
