Details
Bug
Status: Closed
Major
Resolution: Fixed
23.02.18
None
None
None
MariaDB Enterprise Server: 10.6.15_10_23.02.18-1.el8.x86_64
Component: MariaDB ColumnStore (`PrimProc`)
2026-4
Description
Under heavy concurrent workloads, the `PrimProc` process crashes fatally with an `Assertion 'px != 0' failed` error. The crash is caused by an unconditional dereference of a null shared pointer in the error-handling path of the `TupleBPS` class.
Steps to Reproduce / Trigger
1. Initiate a heavy workload of concurrent `cpimport` bulk loading jobs.
2. Simultaneously run massive `AGG` (aggregation) `SELECT` queries that heavily utilize `DictScanJob` evaluations.
3. Wait for resource contention or a transient timeout to force the system to report an error via `TupleBPS::sendError()`.
4. `PrimProc` aborts immediately, taking the cluster down.
Root Cause Analysis (Code Level)
The crash occurs in `storage/columnstore/columnstore/dbcon/joblist/tuple-bps.cpp` inside `void TupleBPS::sendError(uint16_t status)`.
Starting at line 1558:
1558: void TupleBPS::sendError(uint16_t status)
1559: {
1560:   SBS msgBpp;
1561:   fBPP->setCount(1);
1562:   fBPP->setStatus(status);
1563:   fBPP->runErrorBPP(*msgBpp);
On line 1560, `msgBpp` is declared as a default-constructed, and therefore null, `messageqcpp::ByteStream` shared pointer (`SBS msgBpp;`). No `ByteStream` is allocated for it before it is dereferenced (`*msgBpp`) on line 1563. The dereference trips the internal `px != 0` assertion in `boost::shared_ptr`, which aborts the `PrimProc` daemon immediately.
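The failure condition can be illustrated in isolation. The following is a minimal sketch, not ColumnStore code: `ByteStream` here is a hypothetical stand-in for `messageqcpp::ByteStream`, `std::shared_ptr` is swapped in for `boost::shared_ptr` to keep the example dependency-free, and `checkedDeref`, `throwsOnDeref`, and `makeStream` are names invented for illustration. It shows that a declared-only pointer owns nothing, and that a checked dereference can report the problem instead of aborting the process.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for messageqcpp::ByteStream (assumption, not the real class).
struct ByteStream
{
  std::string bytes;
};

// ColumnStore's SBS is a boost::shared_ptr; std::shared_ptr is used here
// only so the sketch compiles without Boost.
using SBS = std::shared_ptr<ByteStream>;

// Checked dereference: throws instead of aborting when the pointer is null.
ByteStream& checkedDeref(const SBS& p)
{
  if (!p)
    throw std::logic_error("null ByteStream pointer");
  return *p;
}

// Returns true when dereferencing p would have hit the null case.
bool throwsOnDeref(const SBS& p)
{
  try
  {
    checkedDeref(p);
    return false;
  }
  catch (const std::logic_error&)
  {
    return true;
  }
}

// Allocates a stream, so the returned pointer is safe to dereference.
SBS makeStream()
{
  SBS p = std::make_shared<ByteStream>();
  assert(p);
  return p;
}
```

A pointer declared as `SBS msgBpp;` makes `throwsOnDeref(msgBpp)` return true, while the result of `makeStream()` does not. In the reported code path there is no such check, so the null case reaches the library assertion directly.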
Suggested Fix
Properly allocate the `messageqcpp::ByteStream` object before attempting to dereference it inside `sendError()`.
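One possible shape of that change is sketched below. This is an illustration under stated assumptions, not the actual patch: `ByteStream` and `FakeBPP` are hypothetical stand-ins for `messageqcpp::ByteStream` and the object behind `fBPP`, `std::shared_ptr` replaces `boost::shared_ptr`, and `sendErrorSketch` mirrors only the four statements quoted in the report. What `runErrorBPP` writes into the stream is invented here so the result can be checked.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for messageqcpp::ByteStream (assumption).
struct ByteStream
{
  std::vector<uint8_t> data;
};

// std::shared_ptr is used instead of boost::shared_ptr to avoid a Boost dependency.
using SBS = std::shared_ptr<ByteStream>;

// Hypothetical stand-in for the object behind fBPP (assumption).
struct FakeBPP
{
  uint32_t count = 0;
  uint16_t status = 0;

  void setCount(uint32_t c) { count = c; }
  void setStatus(uint16_t s) { status = s; }

  // Invented behavior: writes the status, low byte first, into the caller's stream.
  void runErrorBPP(ByteStream& bs)
  {
    bs.data.push_back(static_cast<uint8_t>(status & 0xFF));
    bs.data.push_back(static_cast<uint8_t>(status >> 8));
  }
};

// Mirrors the quoted statements, with the stream allocated before use.
SBS sendErrorSketch(FakeBPP& bpp, uint16_t status)
{
  SBS msgBpp(new ByteStream());  // the allocation the report says is missing
  bpp.setCount(1);
  bpp.setStatus(status);
  assert(msgBpp);                // the pointer is non-null at this point
  bpp.runErrorBPP(*msgBpp);      // dereference is now well-defined
  return msgBpp;
}
```

Calling `sendErrorSketch(bpp, status)` on a `FakeBPP` returns a non-null stream holding the serialized status. The only change relative to the quoted code is the allocation on the declaration line; the order of the remaining calls is unchanged.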
Customer Impact
Critical / S2. The crash drops in-flight queries and causes a full production outage; the cluster must be restarted manually to recover.