Details
-
Task
-
Status: In Progress
-
Critical
-
Resolution: Unresolved
-
2025-6, 2025-9, 2025-10
Description
Summary
Users report new errors on existing queries that they say ran successfully on 23.02.x, after upgrading to 23.10.4+. The current guess is that the memory allocator in 23.10.4 calculates that the query would take too much memory. Either way, clearer error logs and, if possible, a clearer SQL client error are being requested.
Ideally we return how much memory was used and how much is being estimated to be needed.
Reproduction:
See Edward's comments and Instructions_for_reproducing_the_bug_MCOL-6075.txt.
Actual:
In the SQL client, users get the error:
MCS-2001: Join or subselect exceeds memory limit.
and the debug log shows std::bad_alloc:
Sep 19 18:07:13 ip-172-31-39-250 joblist[50051]: 13.552385 |0|0|0| I 05 CAL0000: (358) MCS-2001: Join or subselect exceeds memory limit. %%10%% |
Sep 19 18:07:13 ip-172-31-39-250 threadpool[50051]: 13.554690 |0|0|0| E 22 CAL0005: threadFcn: Caught exception: std::bad_alloc |
Sep 19 18:07:13 ip-172-31-39-250 threadpool[50051]: 13.555167 |0|0|0| E 22 CAL0005: threadFcn: Caught exception: std::bad_alloc |
Sep 19 18:07:13 ip-172-31-39-250 threadpool[50051]: 13.555672 |0|0|0| E 22 CAL0005: threadFcn: Caught exception: std::bad_alloc |
Sep 19 18:07:13 ip-172-31-39-250 threadpool[50051]: 13.556261 |0|0|0| E 22 CAL0005: threadFcn: Caught exception: std::bad_alloc |
Sep 19 18:07:13 ip-172-31-39-250 ExeMgr[50051]: 13.556548 |6|0|0| D 16 CAL0042: End SQL statement |
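When triaging reports like this, it can help to count how many MCS-2001 and std::bad_alloc entries a debug log contains. The following is a minimal Python sketch, not part of ColumnStore. The regular expression is an assumption derived only from the six log lines quoted above, and the names LOG_LINE and summarize are hypothetical.

```python
import re
from collections import Counter

# Assumed line layout, inferred from the excerpt above only:
# "<Mon> <day> <time> <host> <process>[<pid>]: <secs> |n|n|n| <level> <nn> <CALcode>: <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]+)\s(?P<host>\S+)\s(?P<proc>\w+)\[(?P<pid>\d+)\]:\s"
    r"[\d.]+\s\|\d+\|\d+\|\d+\|\s(?P<level>[A-Z])\s\d+\s(?P<code>CAL\d+):\s(?P<msg>.*)$"
)


def summarize(lines):
    """Count MCS-2001 and std::bad_alloc messages in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if "MCS-2001" in msg:
            counts["MCS-2001"] += 1
        if "std::bad_alloc" in msg:
            counts["std::bad_alloc"] += 1
    return counts
```

Calling summarize on an open log file (for example, summarize(open(path)) with a path of your choosing) returns a Counter keyed by the two message types.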
Expected:
1) If the query ran in 23.02.x, then the query should also run in 23.10.x.
2) Clearer error messages stating how much RAM is needed.
Example:
Client:
MCS-2001: Join or subselect exceeds memory limit of x.xGB. Estimated need of x.xGB
Logs:
Sep 19 18:07:13 ip-172-31-39-250 threadpool[50051]: 13.554690 |0|0|0| E 22 CAL0005: threadFcn: Caught exception: std::bad_alloc: Ran out of allocated memory xGB. Need approximately x GB |
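To make the requested client message concrete, here is a minimal Python sketch of how the two figures could be rendered in the "x.xGB" form shown above. It is an illustration only, not ColumnStore code: the function names format_gb and memory_limit_message are hypothetical, and treating 1GB as 1024^3 bytes is an assumption.

```python
def format_gb(n_bytes):
    """Render a byte count as 'x.xGB' (assumes 1GB = 1024**3 bytes)."""
    return f"{n_bytes / 1024**3:.1f}GB"


def memory_limit_message(limit_bytes, estimated_bytes):
    """Build the proposed MCS-2001 text with the limit and the estimated need."""
    return (
        "MCS-2001: Join or subselect exceeds memory limit of "
        f"{format_gb(limit_bytes)}. Estimated need of {format_gb(estimated_bytes)}"
    )
```

Reporting both numbers side by side lets a user see at a glance how far the limit would have to be raised for the query to run.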
-----------------------------------------------------
The old ticket description also included the following error messages, which users experienced after upgrading, but no reproduction has been found for them:
Internal error: MCS-2004: Cannot connect to ExeMgr.
Internal error: MCS-2033: Error occurred when calling system catalog.
Internal error: InetStreamSocket::readToMagic: Remote is closed