Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not a Bug
Description
Steps to reproduce:
1. Build and start a 3-node cluster, with or without MXS. Build and verification pass (green).
2. Exec into slave node 2 using docker exec -it mcs2 bash. For me it is mcs2 every time.
3. Check the process list (ps aux or mcs cluster status). All MCS processes should be present.
4. Wait 1-2 minutes without touching the cluster.
5. Check the process list again. The PrimProc process is now gone.
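Steps 3 to 5 can be automated with a small poller. The following is only a sketch: the helper names has_process and watch, the polling interval, and the timeout are assumptions made for illustration; only the container name mcs2, the process name PrimProc, and the docker exec / ps aux commands come from the steps above.

```python
import subprocess
import time


def has_process(ps_output, name):
    """Return True if the COMMAND column of any `ps aux` row mentions name."""
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # `ps aux` prints 10 fixed columns followed by the full command line.
        fields = line.split(None, 10)
        if len(fields) == 11 and name in fields[10]:
            return True
    return False


def watch(container="mcs2", name="PrimProc", interval=10.0, timeout=180.0):
    """Poll the container until `name` disappears.

    Returns the elapsed seconds when the process is first missing,
    or None if it is still running when the timeout expires.
    The interval and timeout defaults are arbitrary (hypothetical) values.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        result = subprocess.run(
            ["docker", "exec", container, "ps", "aux"],
            capture_output=True, text=True, check=True,
        )
        if not has_process(result.stdout, name):
            return time.monotonic() - start
        time.sleep(interval)
    return None

# Example (requires a running container, so it is left commented out):
# elapsed = watch()
# print("PrimProc still running" if elapsed is None else f"gone after {elapsed:.0f}s")
```

Recording the elapsed time this way would make the "1-2 minutes" in step 4 reproducible across runs.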
After PrimProc disappeared, I collected the following additional information:
From /var/log/mariadb/columnstore/trace/PrimProc****
Date/time: 2022-12-13 16:11:25
Signal: 11

/usr/bin/PrimProc(+0xbe6c6)[0x55dacb8066c6]
/lib64/libpthread.so.0(+0x12cf0)[0x7f29c217dcf0]
/lib64/libjoblist.so(_ZN7joblist21DistributedEngineComm5SetupEv+0x1384)[0x7f29c3634b14]
/lib64/libjoblist.so(_ZN7joblist21DistributedEngineComm6ListenEN5boost10shared_ptrIN11messageqcpp18MessageQueueClientEEEj+0x522)[0x7f29c3635e02]
/lib64/libjoblist.so(+0x13b046)[0x7f29c3636046]
/usr/bin/PrimProc(+0xc01a7)[0x55dacb8081a7]
/lib64/libpthread.so.0(+0x81cf)[0x7f29c21731cf]
/lib64/libc.so.6(clone+0x43)[0x7f29c0b87e73]
Using the MariaDB-columnstore-engine-debuginfo package, I resolved the crashing frame as follows:
nm /usr/lib/debug/usr/lib64/libjoblist.so-10.6.11_6_22.08.4-1.el8.x86_64.debug | grep _ZN7joblist21DistributedEngineComm5SetupEv
0000000000138790 T _ZN7joblist21DistributedEngineComm5SetupEv
00000000000acd94 t _ZN7joblist21DistributedEngineComm5SetupEv.cold

0x1384 + 0x138790 = 0x139B14

addr2line -f -e /lib64/libjoblist.so 0x139b14
_ZN7joblist21DistributedEngineComm5SetupEv
/usr/src/debug/MariaDB-/src_0/storage/columnstore/columnstore/.boost/boost-lib/include/boost/smart_ptr/shared_ptr.hpp:786
At the same time, I observed the following messages in debug.log.
They seem unrelated, but I am including them anyway.
Dec 13 16:55:49 mcs2 joblist[564]: 49.623685 |0|0|0| W 05 CAL0000: /mdb/verylongdirnameforverystrangecpackbehavior/storage/columnstore/columnstore/dbcon/joblist/distributedenginecomm.cpp @ 308 Could not connect to PMS0: Connection refused from PMS0 %%10%%
Dec 13 16:55:49 mcs2 joblist[564]: 49.624413 |0|0|0| W 05 CAL0000: /mdb/verylongdirnameforverystrangecpackbehavior/storage/columnstore/columnstore/dbcon/joblist/distributedenginecomm.cpp @ 308 Could not connect to PMS0: Connection refused from PMS0 %%10%%
Dec 13 16:55:49 mcs2 joblist[564]: 49.624919 |0|0|0| W 05 CAL0000: /mdb/verylongdirnameforverystrangecpackbehavior/storage/columnstore/columnstore/dbcon/joblist/distributedenginecomm.cpp @ 308 Could not connect to PMS0: Connection refused from PMS0 %%10%%
Dec 13 16:55:49 mcs2 joblist[564]: 49.625456 |0|0|0| W 05 CAL0000: /mdb/verylongdirnameforverystrangecpackbehavior/storage/columnstore/columnstore/dbcon/joblist/distributedenginecomm.cpp @ 308 Could not connect to PMS0: Connection refused from PMS0 %%10%%