Details
Bug
Status: Confirmed
Critical
Resolution: Unresolved
23.02.3, 23.02.4, 23.10.0
None
SkySQL, Docker and AWS EC2
4 CPU, 16 GB RAM
2023-11, 2023-12, 2024-1
Description
Summary: Running the supplied workload intermittently causes the primproc process to disappear. The workload can complete once or twice without problems, then "crashes" on a later run (no core dump or stack trace was found).
Expectation: ColumnStore stays stable. Acceptable outcomes include rejecting or erroring out queries that are too large, self-recovery, or an error message indicating what is needed to complete the query (CPU/RAM). A subprocess disappearing and leaving the system broken until someone manually restarts it is not acceptable.
Workaround: Restart ColumnStore.
Reproduction: See developer comment
Client-side error:
ERROR 1815 (HY000) at line 1: Internal error: MCS-2004: Cannot connect to ExeMgr.
primproc.log
getFreeMemory : returned from getMemUsageFromCGroup : usage 5211672576 (GIB) 4
debug.log
Oct 6 17:29:47 mcs1 messagequeue[794]: 47.156748 |0|0|0| W 31 CAL0071: InetStreamSocket::read: timeout during first read: socket read error: Success; InetStreamSocket: sd: 65 inet: 127.0.0.1 port: 8601; Will retry.
mariadb-error.log
ClientRotator caught exception: InetStreamSocket::connect: connect() error: Connection refused to: InetStreamSocket: sd: 64 inet: 127.0.0.1 port: 8601
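
Because no core dump or stack trace is left behind, the "Connection refused" errors above are the clearest sign that the process has gone away. The following is a minimal sketch, not part of the original report, of a probe that checks whether anything is still accepting TCP connections on 127.0.0.1:8601 (the address and port shown in the logs). The function name `port_is_open` and the 2-second timeout are illustrative assumptions.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests that something is
    listening, not that the service behind the port is healthy.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address and port are taken from the log excerpts in this report.
    if port_is_open("127.0.0.1", 8601):
        print("port 8601 is accepting connections")
    else:
        print("port 8601 refused or timed out; the listener may be gone")
```

Run periodically while the workload executes, a check like this can help narrow down when the listener disappears, which can then be correlated with the timestamps in debug.log.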