[MCOL-5544] StatisticsManager crashes on PP startup unable to read the file. Created: 2023-07-28 Updated: 2024-01-09 |
|
| Status: | In Progress |
| Project: | MariaDB ColumnStore |
| Component/s: | PrimProc |
| Affects Version/s: | 23.02.3 |
| Fix Version/s: | 23.10 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Roman | Assignee: | Denis Khalikov |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None |
| Attachments: |
|
| Sprint: | 2023-8, 2023-10, 2023-11, 2023-12 |
| Description |
|
PP crashed continuously on startup with the following crash trace:
{format}
Date/time: 2023-07-28 14:13:27
Signal: 11
/usr/bin/PrimProc(+0xb8116)[0x55b7b95d5116]
/lib64/libpthread.so.0(+0xf630)[0x7f7e1de5d630]
/lib64/libcommon.so(_ZN10statistics17StatisticsManager26convertStatsFromDataStreamESt10unique_ptrIA_cSt14default_deleteIS2_EE+0x14e)[0x7f7e1e833dce]
/lib64/libcommon.so(_ZN10statistics17StatisticsManager12loadFromFileEv+0x244)[0x7f7e1e834204]
/usr/bin/PrimProc(+0xabb4d)[0x55b7b95c8b4d]
/usr/bin/PrimProc(+0x4f1c5)[0x55b7b956c1c5]
/usr/bin/PrimProc(+0x1b1a80)[0x55b7b96cea80]
/lib64/libpthread.so.0(+0x7ea5)[0x7f7e1de55ea5]
/lib64/libc.so.6(clone+0x6d)[0x7f7e1ca01b0d]
{format}
Presumably the /var/lib/columnstore/local/statistics file is corrupted. I am attaching the file. |
| Comments |
| Comment by Roman [ 2023-11-03 ] |
|
Right, denis0x0D, but before the control flow loads the data, it allocates a buffer using the data size read from the statistics storage file. If that data size is absurdly large, allocating the buffer causes a SEGV. We need failure detection here, e.g. save a hash of the data size counter, and if hash(dataSize) != saved_hash, StatisticsManager should clean the statistics storage file and proceed. |
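
A minimal sketch of the check proposed above, not the actual StatisticsManager code. Everything here is an assumption made for illustration: the StatsHeader layout, the hashSize mixer (a splitmix64-style stand-in for whatever hash would really be used), and the loadPayload / makeFile helpers. The idea is to validate the stored size against its saved hash and against the bytes actually available before any buffer is allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical header layout; the real statistics file format may differ.
struct StatsHeader
{
  uint64_t dataSize;      // payload size in bytes
  uint64_t dataSizeHash;  // hash of dataSize, saved when the file is written
};

// Stand-in hash (splitmix64-style mixer); any stable hash would do.
uint64_t hashSize(uint64_t v)
{
  uint64_t z = v + 0x9e3779b97f4a7c15ULL;
  z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
  z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
  return z ^ (z >> 31);
}

// Builds an in-memory "file": header followed by payload.
std::vector<char> makeFile(const std::vector<char>& payload)
{
  const uint64_t size = static_cast<uint64_t>(payload.size());
  StatsHeader h{size, hashSize(size)};
  std::vector<char> file(sizeof h + payload.size());
  std::memcpy(file.data(), &h, sizeof h);
  if (!payload.empty())
    std::memcpy(file.data() + sizeof h, payload.data(), payload.size());
  return file;
}

// Returns the payload, or std::nullopt if the header cannot be trusted.
// No buffer is sized from dataSize until both checks have passed.
std::optional<std::vector<char>> loadPayload(const std::vector<char>& file)
{
  if (file.size() < sizeof(StatsHeader))
    return std::nullopt;  // truncated file

  StatsHeader h;
  std::memcpy(&h, file.data(), sizeof h);

  if (h.dataSizeHash != hashSize(h.dataSize))
    return std::nullopt;  // size counter is corrupted

  const uint64_t available = file.size() - sizeof h;
  if (h.dataSize > available)
    return std::nullopt;  // size claims more bytes than the file holds

  assert(h.dataSize <= available);
  const auto first = file.begin() + static_cast<std::ptrdiff_t>(sizeof h);
  return std::vector<char>(first, first + static_cast<std::ptrdiff_t>(h.dataSize));
}
```

On std::nullopt the caller would discard the statistics file and continue startup with empty statistics, as suggested in the comment. Comparing dataSize with the number of bytes actually present is an extra guard added in this sketch; it catches a truncated file even when the hash matches.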
| Comment by Roman [ 2023-11-03 ] |
|
We have the actual file this time. |
| Comment by JiraAutomate [ 2023-12-17 ] |
|
Automated message: |
| Comment by Massimo [ 2023-12-18 ] |
|
hi |
| Comment by Roman [ 2023-12-18 ] |
|
We decided to re-open the issue. |