Details
-
Bug
-
Status: Closed
-
Critical
-
Resolution: Unresolved
-
23.02.3
-
None
-
None
-
2023-8, 2023-10, 2023-11, 2023-12, 2024-1
Description
PrimProc (PP) crashed repeatedly on startup. The crash trace was:
{format}Date/time: 2023-07-28 14:13:27
Signal: 11
/usr/bin/PrimProc(+0xb8116)[0x55b7b95d5116]
/lib64/libpthread.so.0(+0xf630)[0x7f7e1de5d630]
/lib64/libcommon.so(_ZN10statistics17StatisticsManager26convertStatsFromDataStreamESt10unique_ptrIA_cSt14default_deleteIS2_EE+0x14e)[0x7f7e1e833dce]
/lib64/libcommon.so(_ZN10statistics17StatisticsManager12loadFromFileEv+0x244)[0x7f7e1e834204]
/usr/bin/PrimProc(+0xabb4d)[0x55b7b95c8b4d]
/usr/bin/PrimProc(+0x4f1c5)[0x55b7b956c1c5]
/usr/bin/PrimProc(+0x1b1a80)[0x55b7b96cea80]
/lib64/libpthread.so.0(+0x7ea5)[0x7f7e1de55ea5]
/lib64/libc.so.6(clone+0x6d)[0x7f7e1ca01b0d]{format}
Presumably the /var/lib/columnstore/local/statistics file is corrupted. I am attaching the file.
Attachments
Activity
Field | Original Value | New Value |
---|---|---|
Rank | Ranked higher |
Assignee | Denis Khalikov [ JIRAUSER48434 ] |
Fix Version/s | 23.08.1 [ 29105 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Sprint | 2023-8 [ 728 ] | 2023-8, 2023-9 [ 728, 733 ] |
Sprint | 2023-8, 2023-9 [ 728, 733 ] | 2023-8, 2023-10 [ 728, 734 ] |
Fix Version/s | 23.08 [ 28540 ] | |
Fix Version/s | 23.08.1 [ 29105 ] |
Sprint | 2023-8, 2023-10 [ 728, 734 ] | 2023-8, 2023-10, 2023-11 [ 728, 734, 737 ] |
Attachment | statistics_backup [ 72415 ] |
Status | In Progress [ 3 ] | Stalled [ 10000 ] |
Resolution | Cannot Reproduce [ 5 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Resolution | Cannot Reproduce [ 5 ] | |
Status | Closed [ 6 ] | Stalled [ 10000 ] |
Sprint | 2023-8, 2023-10, 2023-11 [ 728, 734, 737 ] | 2023-8, 2023-10, 2023-11, 2023-13 [ 728, 734, 737, 748 ] |
Status | Stalled [ 10000 ] | In Progress [ 3 ] |
Status | In Progress [ 3 ] | Stalled [ 10000 ] |
Status | Stalled [ 10000 ] | Needs Feedback [ 10501 ] |
Sprint | 2023-8, 2023-10, 2023-11, 2023-12 [ 728, 734, 737, 748 ] | 2023-8, 2023-10, 2023-11, 2023-12, 2023-13 [ 728, 734, 737, 748, 755 ] |
Status | Needs Feedback [ 10501 ] | Closed [ 6 ] |
Zendesk Related Tickets | 201981 120763 126323 |
Fix Version/s | 23.10 [ 28540 ] |
Right, denis0x0D, but before the control flow loads the data, it allocates a buffer using the data size read from the statistics storage file. If that data size is implausibly large, allocating the buffer causes a SEGV. We need failure detection here, e.g. save a hash of the data size counter, and if hash(dataSize) != saved_hash, StatisticsManager should clean the statistics storage file and proceed.
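A minimal sketch of the check proposed above, not the actual StatisticsManager code. The on-disk layout (`Header`), the helper names `hashSize`, `serialize` and `loadChecked`, and the choice of mixer are all assumptions made for illustration. The extra comparison of the size against the bytes actually present is an addition beyond the hash check suggested in the comment.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical layout: [dataSize : u64][sizeHash : u64][payload bytes].
struct Header
{
  uint64_t dataSize;
  uint64_t sizeHash;
};

// Fixed 64-bit mixer (splitmix64 finalizer), so the stored hash does not
// depend on the standard library's std::hash implementation.
inline uint64_t hashSize(uint64_t v)
{
  v += 0x9e3779b97f4a7c15ULL;
  v = (v ^ (v >> 30)) * 0xbf58476d1ce4e5b9ULL;
  v = (v ^ (v >> 27)) * 0x94d049bb133111ebULL;
  return v ^ (v >> 31);
}

// Writes the size, the hash of the size, then the payload.
std::vector<char> serialize(const std::vector<char>& payload)
{
  const uint64_t size = static_cast<uint64_t>(payload.size());
  const Header h{size, hashSize(size)};
  std::vector<char> out(sizeof(h) + payload.size());
  std::memcpy(out.data(), &h, sizeof(h));
  if (!payload.empty())
    std::memcpy(out.data() + sizeof(h), payload.data(), payload.size());
  return out;
}

// Returns true and fills `payload` only if the header is consistent.
// Nothing is allocated from dataSize until both checks have passed.
bool loadChecked(const std::vector<char>& file, std::vector<char>& payload)
{
  Header h;
  if (file.size() < sizeof(h))
    return false;  // truncated file
  std::memcpy(&h, file.data(), sizeof(h));
  if (hashSize(h.dataSize) != h.sizeHash)
    return false;  // size counter is corrupted
  if (h.dataSize > file.size() - sizeof(h))
    return false;  // size claims more bytes than the file holds
  assert(h.dataSize <= file.size());
  const char* begin = file.data() + sizeof(h);
  payload.assign(begin, begin + h.dataSize);
  return true;
}
```

When `loadChecked` returns false, the caller would discard the statistics storage file and continue with empty statistics, as the comment suggests, instead of crashing on startup.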