[MCOL-371] ExeMgr crash on ubuntu 16.04 regression testing, test200 Created: 2016-10-21  Updated: 2016-10-24  Resolved: 2016-10-24

Status: Closed
Project: MariaDB ColumnStore
Component/s: ExeMgr
Affects Version/s: 1.0.4
Fix Version/s: 1.0.4

Type: Bug Priority: Major
Reporter: David Hill (Inactive) Assignee: David Hill (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

Amazon AWS, Ubuntu 16.04 (the instance I build and regression test on)


Sprint: 2016-20

 Description   

Noticed during regression testing that some crashes were reported in crit.log, in the time frame when test200 ran. I set up a run of test000, test100, and test200 and was able to reproduce the crash.

In go.sh:

tests="test000.sh test100.sh test200.sh"

From crit.log:

Oct 21 19:44:50 ip-10-81-19-26 ProcessMonitor[5132]: 50.917898 |0|0|0| C 18 CAL0000: *****Calpont Process Restarting: ExeMgr, old PID = 30676
Oct 21 19:57:46 ip-10-81-19-26 ProcessMonitor[5132]: 46.887395 |0|0|0| C 18 CAL0000: *****Calpont Process Restarting: ExeMgr, old PID = 7195



 Comments   
Comment by David Hill (Inactive) [ 2016-10-21 ]

Program terminated with signal SIGABRT, Aborted.
#0 0x00007feb52f18418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7feb4bdec700 (LWP 8646))]
(gdb) bt
#0 0x00007feb52f18418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007feb52f1a01a in __GI_abort () at abort.c:89
#2 0x00007feb52f10bd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x7feb58afb988 "!res",
file=file@entry=0x7feb58afba88 "/usr/include/boost/thread/pthread/mutex.hpp", line=line@entry=111,
function=function@entry=0x7feb58b1a040 <_ZZN5boost5mutexD4EvE19__PRETTY_FUNCTION__> "boost::mutex::~mutex()") at assert.c:92
#3 0x00007feb52f10c82 in __GI___assert_fail (assertion=0x7feb58afb988 "!res", file=0x7feb58afba88 "/usr/include/boost/thread/pthread/mutex.hpp", line=111,
function=0x7feb58b1a040 <_ZZN5boost5mutexD4EvE19__PRETTY_FUNCTION__> "boost::mutex::~mutex()") at assert.c:101
#4 0x00007feb58aaa761 in joblist::TupleAggregateStep::~TupleAggregateStep() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#5 0x00007feb58ab7a05 in boost::detail::sp_counted_impl_p<joblist::TupleAggregateStep>::dispose() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#6 0x0000000000414d4a in boost::detail::sp_counted_base::release() [clone .part.21] [clone .constprop.514] ()
#7 0x000000000042080b in std::_Rb_tree<int, std::pair<int const, boost::shared_ptr<joblist::JobStep> >, std::_Select1st<std::pair<int const, boost::shared_ptr<joblist::JobStep> > >, std::less<int>, std::allocator<std::pair<int const, boost::shared_ptr<joblist::JobStep> > > >::_M_erase(std::_Rb_tree_node<std::pair<int const, boost::shared_ptr<joblist::JobStep> > >*) ()
#8 0x00007feb58a2328b in joblist::JobList::~JobList() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#9 0x00007feb58a24029 in joblist::TupleJobList::~TupleJobList() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#10 0x0000000000414d4a in boost::detail::sp_counted_base::release() [clone .part.21] [clone .constprop.514] ()
#11 0x0000000000419c92 in (anonymous namespace)::SessionThread::operator()() [clone .constprop.488] ()
#12 0x00007feb550aa5d5 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.58.0
#13 0x00007feb545606fa in start_thread (arg=0x7feb4bdec700) at pthread_create.c:333
#14 0x00007feb52fe9b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Comment by David Hill (Inactive) [ 2016-10-21 ]

First crash from the run:

(gdb) bt
#0 0x00007f07f9ca8418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007f07f9caa01a in __GI_abort () at abort.c:89
#2 0x00007f07f9ca0bd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x7f07ff88b988 "!res",
file=file@entry=0x7f07ff88ba88 "/usr/include/boost/thread/pthread/mutex.hpp", line=line@entry=111,
function=function@entry=0x7f07ff8aa040 <_ZZN5boost5mutexD4EvE19__PRETTY_FUNCTION__> "boost::mutex::~mutex()") at assert.c:92
#3 0x00007f07f9ca0c82 in __GI___assert_fail (assertion=0x7f07ff88b988 "!res", file=0x7f07ff88ba88 "/usr/include/boost/thread/pthread/mutex.hpp", line=111,
function=0x7f07ff8aa040 <_ZZN5boost5mutexD4EvE19__PRETTY_FUNCTION__> "boost::mutex::~mutex()") at assert.c:101
#4 0x00007f07ff83a761 in joblist::TupleAggregateStep::~TupleAggregateStep() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#5 0x00007f07ff847a05 in boost::detail::sp_counted_impl_p<joblist::TupleAggregateStep>::dispose() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#6 0x0000000000414d4a in boost::detail::sp_counted_base::release() [clone .part.21] [clone .constprop.514] ()
#7 0x000000000042080b in std::_Rb_tree<int, std::pair<int const, boost::shared_ptr<joblist::JobStep> >, std::_Select1st<std::pair<int const, boost::shared_ptr<joblist::JobStep> > >, std::less<int>, std::allocator<std::pair<int const, boost::shared_ptr<joblist::JobStep> > > >::_M_erase(std::_Rb_tree_node<std::pair<int const, boost::shared_ptr<joblist::JobStep> > >*) ()
#8 0x00007f07ff7b328b in joblist::JobList::~JobList() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#9 0x00007f07ff7b4029 in joblist::TupleJobList::~TupleJobList() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#10 0x00007f07ff7f04da in boost::detail::sp_counted_base::release() [clone .part.20] [clone .constprop.403] () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#11 0x00007f07ff7f2d05 in joblist::SubQueryStep::~SubQueryStep() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#12 0x00007f07ff7f2d99 in joblist::SubQueryStep::~SubQueryStep() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#13 0x00007f07ff7f04da in boost::detail::sp_counted_base::release() [clone .part.20] [clone .constprop.403] () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#14 0x00007f07ff7f3255 in joblist::SubAdapterStep::~SubAdapterStep() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#15 0x00007f07ff7f3379 in joblist::SubAdapterStep::~SubAdapterStep() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#16 0x00007f07ff76d028 in std::vector<boost::shared_ptr<joblist::JobStep>, std::allocator<boost::shared_ptr<joblist::JobStep> > >::~vector() ()
from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#17 0x00007f07ff7b3262 in joblist::JobList::~JobList() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#18 0x00007f07ff7b4029 in joblist::TupleJobList::~TupleJobList() () from /usr/local/mariadb/columnstore/lib/libjoblist.so.1
#19 0x0000000000414d4a in boost::detail::sp_counted_base::release() [clone .part.21] [clone .constprop.514] ()
#20 0x0000000000419c92 in (anonymous namespace)::SessionThread::operator()() [clone .constprop.488] ()
#21 0x00007f07fbe3a5d5 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.58.0
#22 0x00007f07fb2f06fa in start_thread (arg=0x7f07f2b7c700) at pthread_create.c:333
#23 0x00007f07f9d79b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb)

Comment by Andrew Hutchings (Inactive) [ 2016-10-23 ]

Managed to reproduce in Ubuntu 16.04 release builds only (didn't happen on my debug build). This patch made it go away.

Comment by David Hill (Inactive) [ 2016-10-24 ]

Ubuntu 16.04 test200 runs are now working without ExeMgr crashes.

The problem has been fixed.

Generated at Thu Feb 08 02:20:34 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.