[MCOL-5339] DDLProc[xx]: Could not connect to pmX_WriteEngineServer: Connection refused
Created: 2022-12-08  Updated: 2024-02-07

Status: Stalled
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 22.08.2, 22.08.4, 6.4.6, 23.02.3
Fix Version/s: 23.10

Type: Bug Priority: Critical
Reporter: Allen Herrera Assignee: Roman
Resolution: Unresolved Votes: 0
Labels: mcs_trg

Issue Links:
Relates
relates to MCOL-5352 Truncate table failed after PrimProc ... In Progress
relates to MCOL-5378 Nessus port scans crash exeMgr Closed

 Description   

In CS 6.4.x and CS 22.08.4, a customer's DDLProc stops working shortly after running a few CREATE TABLE statements and a stored procedure. Attaching what I have.

More triage is needed.

There's a lot of other activity on the server, and it is unknown to what extent it is affecting ColumnStore. See /var/log/messages, where you'll see salt-minion, docker, slapd, pam_unix, postfix, prometheus, and more.

Customer-Facing Issue
ERROR 1815 (HY000): Internal error: Lost connection to DDLProc

MariaDB [test]> CREATE TABLE `t1` (
-> `id` int(11) NOT NULL
-> ) ENGINE=Columnstore ;
ERROR 1815 (HY000): Internal error: Lost connection to DDLProc

The workaround is to run mcsShutdown, clearShm, and mcsStart.
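To check whether a WriteEngineServer endpoint really refuses TCP connections before and after applying the workaround, a small probe along these lines may help. This is only a sketch: the function name probe_tcp is hypothetical, and the host and port must be taken from the pmX_WriteEngineServer entry in the node's Columnstore.xml (no port value is assumed here).

```python
import socket


def probe_tcp(host, port, timeout=2.0):
    """Try a plain TCP connect and classify the outcome.

    Returns "open" if the connection is accepted, "refused" if the
    peer actively rejects it (the "Connection refused" case from the
    DDLProc log), "timeout" if nothing answers in time, and "error"
    for any other socket-level failure.
    """
    try:
        # create_connection resolves the host and performs the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"
```

For example, probe_tcp("pm1-hostname", port_from_config) returning "refused" would match what DDLProc logs, while "open" after mcsStart would suggest the WriteEngineServer is listening again. Both argument values here are placeholders.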



 Comments   
Comment by Bryan Bancroft (Inactive) [ 2022-12-08 ]

drrtuy: additional logs are attached to the case.

Comment by suresh ramagiri [ 2023-06-30 ]

One of our customers also hit a similar problem:

All of a sudden, DDL execution (i.e., CREATE TABLE for the ColumnStore engine) is failing with the error "Internal error: Lost connection to DDLProc".

The syslog shows recent DDLProc crashes triggered by an unhandled exception in we_clients.cpp:

 
Jun 30 08:41:53 linxdd-edw01 DDLProc[272863]: joblist[272863]: 53.457773 |0|0|0| E 05 CAL0000: /home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/columnstore/columnstore/writeengine/client/we_clients.cpp @ 280 Could not connect to pm1_WriteEngineServer: Connection refused      %%10%%
Jun 30 08:41:53 linxdd-edw01 DDLProc[272863]: Could not connect to pm1_WriteEngineServer: Connection refused
Jun 30 08:41:53 linxdd-edw01 DDLProc[272863]: joblist[272863]: 53.458115 |0|0|0| E 05 CAL0000: /home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/columnstore/columnstore/writeengine/client/we_clients.cpp @ 280 Could not connect to pm2_WriteEngineServer: Connection refused      %%10%%
Jun 30 08:41:53 linxdd-edw01 DDLProc[272863]: Could not connect to pm2_WriteEngineServer: Connection refused
Jun 30 08:41:53 linxdd-edw01 joblist[272863]: 53.457773 |0|0|0| E 05 CAL0000: /home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/columnstore/columnstore/writeengine/client/we_clients.cpp @ 280 Could not connect to pm1_WriteEngineServer: Connection refused      %%10%%
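When triaging a larger syslog, it can be useful to see which PM modules appear in these "Connection refused" messages and how often. The following is a minimal sketch, assuming the message format shown in the lines above; the names REFUSED and refused_modules are hypothetical helpers, not part of ColumnStore.

```python
import re
from collections import Counter

# Matches the message text seen in the excerpt above, e.g.
# "Could not connect to pm1_WriteEngineServer: Connection refused".
REFUSED = re.compile(
    r"Could not connect to (pm\d+)_WriteEngineServer: Connection refused"
)


def refused_modules(lines):
    """Count log lines per PM module that report a refused connection.

    `lines` is any iterable of syslog lines. Each matching line is
    counted, so a single failed attempt that is logged more than once
    (as in the excerpt) contributes more than one count.
    """
    counts = Counter()
    for line in lines:
        counts.update(REFUSED.findall(line))
    return counts
```

Calling refused_modules(open("/var/log/messages", errors="replace")) would return a Counter keyed by module name (pm1, pm2, and so on), which makes it easier to tell whether one WriteEngineServer or all of them were unreachable at the time of the crash.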

DDLProc crashed:

 
Date/time: 2023-06-30 08:41:53
Signal: 6
 
/usr/bin/DDLProc(+0x23066)[0x563cf2f86066]
/lib64/libc.so.6(+0x54df0)[0x7f9f62654df0]
/lib64/libc.so.6(+0xa154c)[0x7f9f626a154c]
/lib64/libc.so.6(raise+0x16)[0x7f9f62654d46]
/lib64/libc.so.6(abort+0xd3)[0x7f9f626287f3]
/lib64/libstdc++.so.6(+0xa1a01)[0x7f9f62aa1a01]
/lib64/libstdc++.so.6(+0xad37c)[0x7f9f62aad37c]
/lib64/libstdc++.so.6(+0xac349)[0x7f9f62aac349]
/lib64/libstdc++.so.6(__gxx_personality_v0+0x9a)[0x7f9f62aacaca]
/lib64/libwriteengineclient.so(+0x280ac)[0x7f9f640af0ac]
/lib64/libwriteengineclient.so(+0x28c2e)[0x7f9f640afc2e]
/lib64/libwriteengineclient.so(+0x11485)[0x7f9f64098485]
/lib64/libwriteengineclient.so(_ZN11WriteEngine9WEClientsD1Ev+0x1a)[0x7f9f6409ef0a]
/lib64/libddlpackageproc.so(_ZN19ddlpackageprocessor19DDLPackageProcessorD1Ev+0x58)[0x7f9f6434bab8]
/usr/bin/DDLProc(+0x2137c)[0x563cf2f8437c]
/usr/bin/DDLProc(+0x1f47e)[0x563cf2f8247e]
/usr/bin/DDLProc(+0x208ac)[0x563cf2f838ac]
/lib64/libthreadpool.so(_ZN10threadpool10ThreadPool11beginThreadEv+0x64c)[0x7f9f62dd552c]
/lib64/libboost_thread.so.1.75.0(+0x14032)[0x7f9f62fb9032]
/lib64/libc.so.6(+0x9f802)[0x7f9f6269f802]
/lib64/libc.so.6(+0x3f450)[0x7f9f6263f450]
