[MCOL-850] ProcMgr crashes when too many 'getactivealarm' calls are made. Created: 2017-08-01  Updated: 2023-10-26  Resolved: 2017-08-14

Status: Closed
Project: MariaDB ColumnStore
Component/s: ?
Affects Version/s: 1.0.9, 1.1.0
Fix Version/s: 1.0.11, 1.1.0

Type: Bug Priority: Major
Reporter: David Hill (Inactive) Assignee: David Hill (Inactive)
Resolution: Fixed Votes: 1
Labels: None

Sprint: 2017-16

 Description   

A customer reported ProcMgr crashing on an idle system with 8 modules configured.
Analysis showed that many getactivealarm commands were being processed by ProcMgr at the same time, and that these requests were coming from the 8 ServerMonitors on the system.

I was not able to reproduce the crash on my setup in an idle state alone, but did when I ran the following script from pm5, which sends a lot of getactivealarm requests to ProcMgr:

#!/bin/bash
# Flood ProcMgr with getactivealarm requests until interrupted (Ctrl-C).
while true; do
    /home/mariadb-user/mariadb/columnstore/bin/mcsadmin getactivea
done
exit 0

2715 Thread 0x7ff4effff700 (LWP 5932) "ProcMgr" 0x00007ff57a09666d in nanosleep () from /lib64/libc.so.6
2714 Thread 0x7ff507fff700 (LWP 5931) "ProcMgr" 0x00007ff57a09666d in nanosleep () from /lib64/libc.so.6
--Type <return> to continue, or q <return> to quit--q
Quit
(gdb) bt
#0 0x00007ff57a00d1d7 in raise () from /lib64/libc.so.6
#1 0x00007ff57a00e8c8 in abort () from /lib64/libc.so.6
#2 0x00007ff57a9119d5 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
#3 0x00007ff57a90f946 in ?? () from /lib64/libstdc++.so.6
#4 0x00007ff57a90f973 in std::terminate() () from /lib64/libstdc++.so.6
#5 0x00007ff57a90fb93 in __cxa_throw () from /lib64/libstdc++.so.6
#6 0x00007ff57c35ad44 in messageqcpp::ByteStream::peek (this=0x7ff489ff8460, s="")
at /home/builder/mariadb-columnstore-server/mariadb-columnstore-engine/utils/messageqcpp/bytestream.cpp:416
#7 0x00007ff57c35a45b in messageqcpp::ByteStream::operator>> (this=0x7ff489ff8460, s="")
at /home/builder/mariadb-columnstore-server/mariadb-columnstore-engine/utils/messageqcpp/bytestream.cpp:310
#8 0x00000000004605bb in processmanager::processMSG (cfIos=0x7ff56bffeab0)
at /home/builder/mariadb-columnstore-server/mariadb-columnstore-engine/procmgr/processmanager.cpp:2743
#9 0x00007ff57adc9dc5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007ff57a0cf73d in clone () from /lib64/libc.so.6
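
Frames #5 through #1 show __cxa_throw going straight to std::terminate() and abort(), which is what happens when an exception (here thrown from ByteStream::peek while reading a string) finds no handler on the thread that raised it. The change made in the commits referenced in the comments is not shown in this ticket, so the following is only a minimal sketch of the general guard pattern, under stated assumptions: TinyStream, encodeString and handleMessage are hypothetical names, and TinyStream is not messageqcpp::ByteStream; it merely mimics "throw when the buffer holds less data than the read needs".

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a length-prefixed message buffer.
// NOT the real messageqcpp::ByteStream; illustration only.
class TinyStream {
public:
    explicit TinyStream(std::vector<std::uint8_t> bytes) : buf_(std::move(bytes)) {}

    // Reads a 4-byte length prefix followed by that many characters.
    // Throws std::underflow_error if the buffer is too short.
    std::string readString() {
        std::uint32_t len = 0;
        if (buf_.size() - pos_ < sizeof(len))
            throw std::underflow_error("readString: missing length prefix");
        std::memcpy(&len, buf_.data() + pos_, sizeof(len));
        if (buf_.size() - pos_ - sizeof(len) < len)
            throw std::underflow_error("readString: truncated payload");
        pos_ += sizeof(len);
        std::string out(reinterpret_cast<const char*>(buf_.data() + pos_), len);
        pos_ += len;
        return out;
    }

private:
    std::vector<std::uint8_t> buf_;
    std::size_t pos_ = 0;
};

// Hypothetical helper: encodes a string in the format readString() expects.
std::vector<std::uint8_t> encodeString(const std::string& s) {
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    std::vector<std::uint8_t> out(sizeof(len) + s.size());
    std::memcpy(out.data(), &len, sizeof(len));
    std::memcpy(out.data() + sizeof(len), s.data(), s.size());
    return out;
}

// Hypothetical per-request handler. Catching here means a short or empty
// message is reported as a failure instead of unwinding out of the thread
// entry point, where an uncaught exception would end in std::terminate().
bool handleMessage(TinyStream& msg, std::string& command) {
    try {
        command = msg.readString();
        return true;
    } catch (const std::exception&) {
        command.clear();
        return false;
    }
}
```

With this shape, a thread servicing a request can log and drop a malformed message when handleMessage returns false, and the process keeps running. Whether dropping, retrying, or closing the connection is the right reaction depends on the protocol and is left open here.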



 Comments   
Comment by David Hill (Inactive) [ 2017-08-09 ]

fixed in develop-1.0 1.0.11

commit 06199595763c363dc36a6bd8d1c1d52a1eb2071c
Author: David Hill <david.hill@mariadb.com>
Date: Wed Aug 9 13:53:54 2017 -0500

Comment by David Hill (Inactive) [ 2017-08-09 ]

1.1.0 merged from 1.0.10

commit 76fb89c13f8e1495e127078474f315850d5b86ea
Merge: 42867bc 6ed975d
Author: David Hill <david.hill@mariadb.com>
Date: Wed Aug 9 16:01:38 2017 -0500

Comment by David Hill (Inactive) [ 2017-08-14 ]

tested by DH and customer

Generated at Thu Feb 08 02:24:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.