Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 23.02.8
- Fix Version/s: None
- Component/s: None
- Environment: S3 Cohesity, NFS, RHEL 8
Description
The root cause remains undetermined, as the issue occurs sporadically. The customer cluster has been experiencing this problem for over a year. While the behavior is similar to what's described in https://jira.mariadb.org/browse/MCOL-5565 (which was initially suspected to be the issue but has since been resolved), an analysis by the Columnstore Engineering team suggests that the underlying cause is different.
Business Environment details:
- cpimport runs every hour
- calDropPartitions runs every night
TOP Processes
PID      USER   PR  NI  VIRT     RES     SHR     S  %CPU  %MEM  TIME+     COMMAND
3917453  mysql  20  0   4879.0g  290.6g  11464   S  6290  38.5  99658:27  /usr/bin/PrimProc
3915726  mysql  20  0   18.1g    2.4g    19004   S  9.5   0.3   2781:52   /usr/sbin/mariadbd
3917430  mysql  20  0   958208   748796  744732  S  0.0   0.1   63:49.48  /usr/bin/workernode DBRM_Worker2
3917700  mysql  20  0   237404   9172    5408    S  0.0   0.0   5:09.47   /usr/bin/WriteEngineServer
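To put the PrimProc row in perspective, the following is a minimal Python sketch that converts a row of top output into comparable numbers. It assumes top's default conventions (memory fields without a suffix are KiB, and %CPU is per-core, so 100 equals one fully busy core); the function names are illustrative and not part of any ColumnStore tooling.

```python
import re

# KiB per unit suffix; top prints unsuffixed memory values in KiB by default (assumption).
KIB_PER_UNIT = {"": 1, "k": 1, "m": 1024, "g": 1024**2, "t": 1024**3}


def to_kib(field):
    """Convert a top memory field such as '290.6g' or '748796' to KiB."""
    match = re.fullmatch(r"([\d.]+)([kmgt]?)", field.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized memory field: {field!r}")
    value, suffix = match.groups()
    return float(value) * KIB_PER_UNIT[suffix]


def parse_top_row(row):
    """Split one process row of top output into named fields."""
    parts = row.split(None, 11)  # COMMAND may itself contain spaces
    if len(parts) != 12:
        raise ValueError(f"expected 12 columns, got {len(parts)}")
    pid, user, _pr, _ni, virt, res, shr, _state, cpu, mem, _time, command = parts
    return {
        "pid": int(pid),
        "user": user,
        "virt_kib": to_kib(virt),
        "res_kib": to_kib(res),
        "shr_kib": to_kib(shr),
        "cpu_pct": float(cpu),
        "mem_pct": float(mem),
        "command": command,
    }


def summarize(row):
    """Derive resident GiB, busy cores, and the total RAM implied by %MEM."""
    proc = parse_top_row(row)
    res_gib = proc["res_kib"] / 1024**2
    busy_cores = proc["cpu_pct"] / 100
    # Total RAM is only an estimate: resident size divided by the %MEM share.
    total_gib = res_gib / (proc["mem_pct"] / 100) if proc["mem_pct"] else None
    return {"command": proc["command"], "res_gib": res_gib,
            "busy_cores": busy_cores, "implied_total_ram_gib": total_gib}


PRIMPROC_ROW = ("3917453 mysql 20 0 4879.0g 290.6g 11464 S 6290 38.5 "
                "99658:27 /usr/bin/PrimProc")

if __name__ == "__main__":
    print(summarize(PRIMPROC_ROW))
```

Under those assumptions, the PrimProc row corresponds to roughly 63 fully busy cores and 290.6 GiB resident, which implies a host with roughly 755 GiB of RAM.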
Identified Warnings:
- The certificate /usr/share/columnstore/cmapi/cmapi_server/self-signed.crt for cmapi https is expired.
- There is 1 zombie process.
- Iptables rules exist.
- QueryStats Enabled = N
- HashJoin AllowDiskBasedJoin = N
- Errors found in crit logs: reading compression header. Check for possible data file corruption.
- Unknown ref item error found in error log. Mariadb server version may not be fully compatible with columnstore version.
- There are 3 symbolic links found in /var/lib/columnstore.
Other identified issues:
HUGE NUMBER OF CONNECT RETRIES IN MARIADB LOGS (1 million entries):
ClientRotator caught exception: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 76 inet: 127.0.0.1 port: 8601
EXECMGR UNRESPONSIVE?
Could not get a ExeMgr connection.
joblist[153385]: 18.015363 |0|0|0| C 05 CAL0000: /home/jenkins/workspace/Build-Package/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX_ON_ES_BACKUP_DEBUGSOURCE/storage/columnstore/columnstore/dbcon/execplan/clientrotator.cpp @ 379 Could not get a ExeMgr connection. %%10%%
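To quantify the connect retries and check whether the target port is reachable at a given moment, a small Python sketch such as the following could be used. The regular expression is written against the single ClientRotator line quoted above, so it is an assumption that other entries follow the same layout; the helper names are hypothetical, and the host and port (127.0.0.1, 8601) are taken from that log line only.

```python
import re
import socket
from collections import Counter

# Matches the "Connection refused" entries in the format quoted above.
REFUSED = re.compile(
    r"connect\(\) error: Connection refused to: .*?"
    r"inet: (?P<host>[\d.]+) port: (?P<port>\d+)"
)

SAMPLE_LINE = (
    "ClientRotator caught exception: InetStreamSocket::connect: connect() "
    "error: Connection refused to: InetStreamSocket: sd: 76 "
    "inet: 127.0.0.1 port: 8601"
)


def count_refused(lines):
    """Count refused connections per (host, port) across log lines."""
    counts = Counter()
    for line in lines:
        match = REFUSED.search(line)
        if match:
            counts[(match["host"], int(match["port"]))] += 1
    return counts


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(dict(count_refused([SAMPLE_LINE])))
    print("port open:", port_is_open("127.0.0.1", 8601))
```

Running count_refused over the MariaDB error log, for example grouped by hour, would show whether the refusals cluster around the hourly cpimport runs or the nightly partition drop; that correlation is a suggestion to test, not a finding of this report.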