[MCOL-5057] EM index code miscalculates RAM needed to allocate its structures Created: 2022-04-18 Updated: 2022-05-05 Resolved: 2022-04-22 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | ExeMgr, PrimProc |
| Affects Version/s: | 5.6.5, 6.3.1 |
| Fix Version/s: | 6.3.1 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Roman | Assignee: | Daniel Lee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||
| Sprint: | 2021-17 | ||||||||||||
| Description |
|
As of MCS 5.6.5 there are cases where the initial EM load (load_brm) causes a boost::interprocess::bad_alloc exception in ExtentMap::loadVersion4() when populating the EM index. An example extent map that reproduces the issue is attached. At a certain record, the managed shmem segment for the EM index has 1.2 MB free, but the pool is so fragmented that 1.2 KB cannot be allocated as one contiguous chunk. unordered_map rehashing throws bad_alloc in this case.
Moreover, I think this can also happen in real time, which would set the cluster to read-only. |
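The fragmentation failure described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyFreeList` and `makeFragmentedPool` are hypothetical names, the block sizes are made-up numbers chosen to match the 1.2 MB / 1.2 KB figures, and nothing here reflects the real allocator's layout. It shows that total free space can far exceed a request while no single free block can satisfy it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical model of a fragmented pool: each entry is the size in bytes
// of one free block. Blocks are assumed non-adjacent, so they cannot merge.
struct ToyFreeList
{
    std::vector<std::size_t> blocks;

    // Sum of all free bytes, regardless of contiguity.
    std::size_t totalFree() const
    {
        return std::accumulate(blocks.begin(), blocks.end(), std::size_t{0});
    }

    // Size of the largest single free block (0 for an empty list).
    std::size_t largestBlock() const
    {
        return blocks.empty() ? 0 : *std::max_element(blocks.begin(), blocks.end());
    }

    // A contiguous request succeeds only if one free block can hold it.
    bool canAllocate(std::size_t bytes) const
    {
        return bytes <= largestBlock();
    }
};

// Builds a pool of `blockCount` free blocks, each `blockSize` bytes.
ToyFreeList makeFragmentedPool(std::size_t blockCount, std::size_t blockSize)
{
    assert(blockSize > 0);
    return ToyFreeList{std::vector<std::size_t>(blockCount, blockSize)};
}
```

With `makeFragmentedPool(1200, 1000)` the pool holds 1,200,000 free bytes in total, yet `canAllocate(1200)` is false because the largest block is only 1000 bytes.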
| Comments |
| Comment by David Hall (Inactive) [ 2022-04-22 ] | |||||||
|
We reverted
Linux maintains a maximum limit on how large shared memory is allowed to grow. On a normal Linux system, this is set to one half of total available RAM, but Docker defaults it to 64 MB. The file we were attempting to load tries to allocate 423 MB of shared memory, hence the crash. The solution:
If using docker compose, add
to your docker-compose.yml.
I've tried to discover why Docker sets the default so small and whether there are any negative effects of setting it larger. I have found nothing useful. Perhaps those with more Docker experience can help.
512 MB may be far more than we need. Remember, the system we got this file from is gigantic, far larger than anything sky is likely to ever see. If there are negative effects of increasing shm-size, we should look at scaling it to the Docker container size. If there are not, 512m is a good maximum that will likely not be reached for a good many years. | |||||||
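One way to see the limit discussed above from inside a container is to query the filesystem that backs POSIX shared memory. The sketch below assumes that the segment lives on the tmpfs mounted at /dev/shm, which is the usual case on Linux; the helper names `filesystemSizeBytes` and `shmLimitSufficient` are hypothetical and are not part of ColumnStore. It uses only the POSIX `statvfs()` call.

```cpp
#include <sys/statvfs.h>

#include <cstdint>

// Stores the total size in bytes of the filesystem containing `path` in
// `sizeBytes`. Returns false if statvfs() fails (e.g. the path is missing).
bool filesystemSizeBytes(const char* path, std::uint64_t& sizeBytes)
{
    struct statvfs st{};
    if (statvfs(path, &st) != 0)
        return false;
    // f_frsize is the fragment size; f_blocks counts fragments of that size.
    sizeBytes = static_cast<std::uint64_t>(st.f_frsize) * st.f_blocks;
    return true;
}

// Hypothetical pre-flight check: would a segment of `requiredBytes` fit
// under a shared memory limit of `limitBytes`?
bool shmLimitSufficient(std::uint64_t requiredBytes, std::uint64_t limitBytes)
{
    return requiredBytes <= limitBytes;
}
```

Calling `filesystemSizeBytes("/dev/shm", size)` and passing the result to `shmLimitSufficient` with the 423 MB figure from this comment would report a shortfall under a 64 MB limit and success under a 512 MB one.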
| Comment by alexey vorovich (Inactive) [ 2022-04-22 ] | |||||||
|
drrtuy, we should consider adding a log message for shared memory issues, if possible. | |||||||
| Comment by Daniel Lee (Inactive) [ 2022-04-22 ] | |||||||
|
Build verified: 6.3.1-1 (#4308), cmapi 1.6.3 (#628)
With a new, modified BRM_saves_em file provided by development, load_brm loaded the file successfully on VMs. On Docker images, the same test ended with a core dump. As noted in David.Hall's comment above, increasing the shm size would fix the core dump problem. The recommended amount of shm for running ColumnStore in Docker should be determined and published for internal and external use. | |||||||
| Comment by Roman [ 2022-05-05 ] | |||||||
|
The solution is to use try/catch in EM::insert3dLayer() to grow() EM index shmem and re-try the insert. |
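The grow-and-retry pattern described in this comment can be sketched as follows. This is not the actual ColumnStore code: `ToySegment` and `insertWithRetry` are hypothetical stand-ins, the toy throws `std::bad_alloc` where the real code would see boost::interprocess::bad_alloc, and the retry cap is an assumption added so the loop always terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the EM index shared memory segment.
class ToySegment
{
public:
    explicit ToySegment(std::size_t capacity) : capacity_(capacity) {}

    // Throws std::bad_alloc when the request does not fit in the free space.
    void allocate(std::size_t bytes)
    {
        if (bytes > capacity_ - used_)
            throw std::bad_alloc();
        used_ += bytes;
    }

    // Enlarges the segment by `extraBytes`.
    void grow(std::size_t extraBytes) { capacity_ += extraBytes; }

    std::size_t capacity() const { return capacity_; }
    std::size_t used() const { return used_; }

private:
    std::size_t capacity_;
    std::size_t used_ = 0;
};

// Attempts the allocation; on bad_alloc, grows the segment by `growStep`
// and retries, at most `maxRetries` times. Returns true on success.
bool insertWithRetry(ToySegment& seg, std::size_t bytes,
                     std::size_t growStep, int maxRetries)
{
    assert(growStep > 0);
    for (int attempt = 0;; ++attempt)
    {
        try
        {
            seg.allocate(bytes);
            return true;
        }
        catch (const std::bad_alloc&)
        {
            if (attempt >= maxRetries)
                return false;  // give up instead of looping forever
            seg.grow(growStep);
        }
    }
}
```

Capping the retries keeps a request that can never fit from growing the segment without bound; what the real code should do when growth itself fails is left open here.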