[MCOL-4626] Columnstore cluster becomes non operational when running out of memory on a query Created: 2021-03-20  Updated: 2024-02-05

Status: Stalled
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 5.5.2
Fix Version/s: 23.10

Type: Bug
Priority: Critical
Reporter: Todd Stoffel (Inactive)
Assignee: Alan Mologorsky
Resolution: Unresolved
Votes: 2
Labels: rm_stability

Issue Links:
Issue split
split to MCOL-4733 Memory allocations are kept longer th... Closed
Problem/Incident
is caused by MCOL-4810 Redundant copying and wasting memory ... Closed
Relates
relates to MCOL-4746 Prevent MCS cluster from starting on ... Open
relates to MCOL-4841 ExeMgr Crashes after large join - IDB... Closed
relates to MCOL-5187 OOM happening when querying large dat... Stalled
Sprint: 2021-5, 2021-6, 2021-7, 2021-8, 2021-9, 2021-10, 2021-11, 2021-12, 2021-13, 2021-14, 2021-15, 2021-16, 2021-17, 2022-22, 2022-23, 2023-4, 2023-5, 2023-6

 Description   

When too little memory is available in Docker, a destructive problem occurs. The original symptom is a crash of PrimProc, after which no queries can be run. Prior commits improved the situation, but not enough. The failure is now an error message rather than a crash, and in some cases the cluster remains operational. But not always: there are cases where the cluster is non-operational after the error message.

In order to complete this ticket, the remaining problem needs to be corrected. Even after the memory is exceeded and an error message is generated, the cluster should be operational, and be able to execute queries that fit in memory.

We will continue working on this problem under MCOL-4733. The goal for that ticket is to prevent the error message from happening in the first place. Memory should never be over-allocated.

Notes:
1. The problem is not restricted to Docker environments or to clusters with low-memory nodes. Some bigger jobs, such as a big insert into ... select from..., may crash PrimProc even when 16GB of memory is available, and that can happen on-prem or in Docker.

Three things need to happen:

  • With larger memory (e.g. 16GB), things should just work with the defaults.
  • If someone wants to run on lower memory (like 4GB), they should get a reasonable error message saying that memory is insufficient for the job. Smaller jobs should continue to work.
  • In smaller-memory deployments, one should be able to lower TotalUMMemory (25% default) and NumBlocksPCT (50% default) and be able to run even bigger jobs.
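
As a rough illustration of the sizing in the list above, the sketch below computes the byte budgets that the two percentage settings would imply on nodes of different sizes. It assumes that both settings are plain percentages of total node RAM and that whatever is left over is the headroom for everything else. The function name `memory_budgets` is hypothetical and is not part of ColumnStore.

```python
GIB = 1024 ** 3


def memory_budgets(total_ram_bytes, total_um_memory_pct=25, num_blocks_pct=50):
    """Return (um_bytes, block_cache_bytes, remaining_bytes).

    Assumption: both settings are percentages of total node RAM.
    The defaults mirror the 25% and 50% quoted in this ticket.
    """
    for pct in (total_um_memory_pct, num_blocks_pct):
        if not 0 <= pct <= 100:
            raise ValueError("percentages must be between 0 and 100")
    if total_um_memory_pct + num_blocks_pct > 100:
        raise ValueError("the two budgets together exceed total RAM")
    um_bytes = total_ram_bytes * total_um_memory_pct // 100
    block_cache_bytes = total_ram_bytes * num_blocks_pct // 100
    remaining_bytes = total_ram_bytes - um_bytes - block_cache_bytes
    return um_bytes, block_cache_bytes, remaining_bytes


if __name__ == "__main__":
    # 16 GiB node with defaults
    print([b / GIB for b in memory_budgets(16 * GIB)])  # [4.0, 8.0, 4.0]
    # 4 GiB node with defaults
    print([b / GIB for b in memory_budgets(4 * GIB)])   # [1.0, 2.0, 1.0]
```

Under these assumptions, a 4GB node with defaults leaves about 1 GiB outside the two budgets, which is why lowering either percentage frees headroom for larger jobs.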

There may be deeper problems on very small settings like 1GB. Once we fix and verify the above, we should investigate what to do in those cases.

2. Technical description is below:
The goal is to implement internal real-time tracking of the memory used by each process, ExeMgr and PrimProc. This removes the need to poll the system at intervals to check memory usage and compare it against a threshold (MaxPct). By checking the tracked usage prior to each further allocation, we can detect more quickly that a process is approaching OOM. (Previously, detection relied on the refresh interval of the system's view of process memory, so usage could climb further above MaxPct than could be recovered from.) This should allow queries to be killed before they consume so much memory that future queries are blocked.
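
The tracking idea can be sketched as a per-process counter that every allocation must pass through before it proceeds. This is a minimal illustration only, not the ColumnStore implementation: the class `MemoryTracker`, the exception `MemoryLimitExceeded`, and the byte limit are all hypothetical names chosen for the example.

```python
import threading


class MemoryLimitExceeded(Exception):
    """Raised when a reservation would push usage past the limit."""


class MemoryTracker:
    """Hypothetical in-process accounting of memory in use.

    The check happens before each allocation, so the limit is enforced
    immediately rather than at the next polling interval.
    """

    def __init__(self, limit_bytes):
        self._limit = limit_bytes
        self._used = 0
        self._lock = threading.Lock()

    def reserve(self, nbytes):
        # Check and update atomically so concurrent threads cannot
        # jointly overshoot the limit.
        with self._lock:
            if self._used + nbytes > self._limit:
                raise MemoryLimitExceeded(
                    f"need {nbytes} bytes, {self._limit - self._used} available"
                )
            self._used += nbytes

    def release(self, nbytes):
        with self._lock:
            self._used = max(0, self._used - nbytes)

    @property
    def used(self):
        with self._lock:
            return self._used


if __name__ == "__main__":
    tracker = MemoryTracker(limit_bytes=100)
    tracker.reserve(60)
    try:
        tracker.reserve(50)  # would exceed the limit, so it is rejected
    except MemoryLimitExceeded as exc:
        print("query rejected:", exc)
    print(tracker.used)  # 60
```

Because the rejected reservation leaves the counter unchanged, a smaller request that fits can still succeed afterwards, which is the behavior this ticket asks for.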

To complete the solution, we would implement better management of who holds memory and allow the system to free most of the held memory. At an OOM event, this would essentially reset the block cache and all instances holding any memory, so that the next query runs as if the processes had been reset, without having to restart them.
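
One way to picture the "who holds memory" bookkeeping is a registry in which each holder registers a callback that frees what it holds. The sketch below is an assumption-laden illustration: `HolderRegistry`, `BlockCache`, and their methods are hypothetical and do not correspond to actual ColumnStore classes.

```python
class BlockCache:
    """Toy stand-in for a cache that holds memory between queries."""

    def __init__(self):
        self._blocks = {}

    def put(self, key, data):
        self._blocks[key] = data

    def size(self):
        return sum(len(data) for data in self._blocks.values())

    def clear(self):
        # Drop everything and report how many bytes were released.
        freed = self.size()
        self._blocks.clear()
        return freed


class HolderRegistry:
    """Hypothetical registry of components that hold memory."""

    def __init__(self):
        self._holders = {}

    def register(self, name, release_fn):
        # release_fn must free the holder's memory and return bytes freed.
        self._holders[name] = release_fn

    def reset_all(self):
        # Called on an OOM event: ask every holder to let go, so the
        # next query starts from a clean state without a process restart.
        return sum(release_fn() for release_fn in self._holders.values())


if __name__ == "__main__":
    cache = BlockCache()
    cache.put("block-1", b"x" * 10)
    cache.put("block-2", b"y" * 5)

    registry = HolderRegistry()
    registry.register("block_cache", cache.clear)

    print(registry.reset_all())  # 15
    print(cache.size())          # 0
```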



 Comments   
Comment by Todd Stoffel (Inactive) [ 2021-03-23 ]

The Docker host needs a minimum of 9 GB of RAM available to the containers in order to avoid this error.
A minimum limit of 3 GB of RAM per container can be set via the mem_limit option.

PrimProc should not crash with an OOM error, but I'm reducing the priority of this ticket since the cause is known and can be avoided.

Comment by Ben Thompson (Inactive) [ 2021-05-03 ]

PrimProc will throw a critical log message and fail to become operational.

PrimProc[1004]: 35.668435 |0|0|0| C 28 CAL0000: Error total memory available is less than 3GB.

Attempts to interact with Columnstore while in this state will return errors:
MariaDB [mytest]> create table if not exists quicktest (c1 int, c2 char(15)) engine=columnstore;
ERROR 1815 (HY000): Internal error: System is not ready yet. Please try again.

Comment by Gregory Dorman (Inactive) [ 2021-05-03 ]

Good as the explanation may be, it is not good enough. If this is the way PrimProc does it, find someone else to do the test (ExeMgr, maybe even CMAPI, I don't know). Or teach PrimProc to do it in a more usable way.

Guys - the days of Open Source attitudes are gone. We are enterprise software now. Especially in a cloud. People will not tolerate these kinds of things anymore.

Comment by Roman [ 2021-08-27 ]

gdorman, Denis has implemented the fix in develop-6 for the original case, namely INSERT..SELECT with text or long varchar columns crashing PP.

Comment by Manjot Singh (Inactive) [ 2022-03-24 ]

Could the S3 storage engine be leveraged to maintain global metadata?

Comment by alexey vorovich (Inactive) [ 2022-04-05 ]

As per David.Hall at today's standup, this is not going to be easy.

Should we move this to the next release?

Comment by alexey vorovich (Inactive) [ 2022-04-06 ]

Moved to 641 as per Todd.

Generated at Thu Feb 08 02:51:46 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.