[MCOL-3273] ExeMgr crash - __memcpy_ssse3_back Created: 2019-04-22 Updated: 2021-07-08 Resolved: 2021-07-08 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | ExeMgr |
| Affects Version/s: | 1.2.3 |
| Fix Version/s: | 1.4.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | David Hill (Inactive) | Assignee: | David Hall (Inactive) |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Environment: |
2um 2pm system |
||
| Sprint: | 2021-4 |
| Description |
|
The customer reported an ExeMgr crash; the system recovered and continued working. According to the customer, memory usage was normal, and no logs report memory usage problems.

From the crash: Program terminated with signal 11, Segmentation fault.

From the um1 log: Apr 22 10:17:36 usfit-scdb1 ProcessMonitor[112390]: 36.751600 |0|0|0| C 18 CAL0000: *****MariaDB ColumnStore Process Restarting: ExeMgr, old PID = 179302

From the trace: Date/time: 2019-04-22 10:17:30 [0x55a9d60a4b40] |
| Comments |
| Comment by David Hill (Inactive) [ 2019-05-17 ] |
|
Support Issue number corrected, requested corefile. |
| Comment by David Hill (Inactive) [ 2019-05-17 ] |
|
Compressed the core file (it was 22 GB) and sent it via FTP as core.ExeMgr.179302.gz |
| Comment by Andrew Hutchings (Inactive) [ 2019-05-30 ] |
|
Core file analysed. The crash happens because rgData is a null pointer, but RGData::serialize() tries to append it anyway; the length is also random data read through the bad pointer. This happens at tupleannexstep.cpp:280. We reach that line because fDie is true, i.e. the query was aborted. I suspect the abort came before any data was delivered, so fRowGroupDeliver.setData() was never called. I am not sure how TAS (TupleAnnexStep) gets into that state, but it might be an empty result set. |
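The failure mode described in this comment (serializing a buffer whose data pointer is null while its length field holds garbage) can be illustrated with a minimal sketch. Everything here is hypothetical: `RowBufferView` and `serializeGuarded` are stand-ins invented for illustration, not the real `RGData` class or its `serialize()` method, and the guard shown is one possible defensive check, not the fix applied in ColumnStore.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a row-group buffer (NOT the real RGData).
// data stays null until a producer attaches a payload.
struct RowBufferView {
    const std::uint8_t* data = nullptr;
    std::uint32_t length = 0;
};

// Appends a 4-byte length prefix followed by the payload to `out`.
// Assumption: an unset buffer is written as an empty payload instead of
// copying `length` bytes from a null pointer.
void serializeGuarded(const RowBufferView& rb, std::vector<std::uint8_t>& out) {
    // Ignore the stored length when there is no payload to read from.
    const std::uint32_t len = (rb.data != nullptr) ? rb.length : 0;

    // Write the length prefix in host byte order.
    const auto* lenBytes = reinterpret_cast<const std::uint8_t*>(&len);
    out.insert(out.end(), lenBytes, lenBytes + sizeof(len));

    // Copy the payload only when there is one.
    if (len > 0) {
        out.insert(out.end(), rb.data, rb.data + len);
    }
}
```

With a guard of this kind, a buffer that never received data serializes as a zero-length record rather than triggering a copy from an invalid address. It would not help if the enclosing object itself has already been freed.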
| Comment by Andrew Hutchings (Inactive) [ 2019-05-30 ] |
|
OK. The cause is that nextBand() is called during abort after the TupleAnnexStep has been freed, which of course shouldn't happen. |
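A call made on a step after it has been freed is a lifetime problem, so one general way to rule it out is to let the caller observe the step through a `std::weak_ptr` and call into it only while it is still alive. The sketch below is an assumption-laden illustration: `Step` and `Consumer` are hypothetical types, the `nextBand()` signature here is invented, and nothing in this ticket says ColumnStore uses or adopted this approach.

```cpp
#include <cassert>
#include <memory>

// Hypothetical job step (NOT the real TupleAnnexStep interface).
struct Step {
    int bandsLeft = 2;

    // Returns true if a band was produced, false when the data is exhausted.
    bool nextBand() {
        if (bandsLeft == 0) {
            return false;
        }
        --bandsLeft;
        return true;
    }
};

// Hypothetical consumer that observes the step without owning it.
class Consumer {
public:
    explicit Consumer(const std::shared_ptr<Step>& step) : step_(step) {}

    // Calls nextBand() only if the step still exists; a step that was
    // destroyed (for example during an abort) is treated as "no more data".
    bool fetch() {
        if (std::shared_ptr<Step> alive = step_.lock()) {
            return alive->nextBand();
        }
        return false;
    }

private:
    std::weak_ptr<Step> step_;
};
```

The design choice is that `lock()` either returns a live owner that keeps the step alive for the duration of the call or returns an empty pointer, so the call after destruction becomes a checked no-op instead of undefined behaviour.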