[MCOL-5309] mariadb service crashing and restarting intermittently - status=11/SEGV (long in-list) Created: 2022-11-16  Updated: 2023-12-20

Status: Stalled
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 6.3.1
Fix Version/s: 22.08.7

Type: Bug
Priority: Critical
Reporter: Michael Amadi
Assignee: Roman
Resolution: Unresolved
Votes: 1
Labels: None
Environment:

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"


Attachments: File CSX2_PIXID_logs.tar.gz     XML File Columnstore_CSX2.xml     HTML File gigantic_query_list     HTML File mariadb_errlog    
Issue Links:
Problem/Incident
causes MCOL-5321 MCS plugin crashes MDB runtime for IN... Closed

 Description   

On a three-node ColumnStore cluster, when a particular gigantic SELECT query is run on either of the replica nodes, the mariadb service goes into a restart loop. The logs show the following:

Nov 16 06:57:54 pixid-csx2 systemd: mariadb.service: main process exited, code=killed, status=11/SEGV
Nov 16 06:57:54 pixid-csx2 systemd: Unit mariadb.service entered failed state.
Nov 16 06:57:54 pixid-csx2 systemd: mariadb.service failed.
Nov 16 06:57:59 pixid-csx2 systemd: mariadb.service holdoff time over, scheduling restart.
Nov 16 06:57:59 pixid-csx2 systemd: Cannot add dependency job for unit systemd-tmpfiles-clean.timer, ignoring: Unit is masked.
Nov 16 06:57:59 pixid-csx2 systemd: Stopping pt-kill service...
Nov 16 06:57:59 pixid-csx2 systemd: Stopped pt-kill service.
Nov 16 06:57:59 pixid-csx2 systemd: Stopped MariaDB 10.6.7-3 database server.
Nov 16 06:57:59 pixid-csx2 systemd: Starting MariaDB 10.6.7-3 database server...

Cluster health is restored only after a restart, and only once the query is no longer executing.



 Comments   
Comment by Rick Pizzi [ 2022-11-21 ]

I checked the crashing thread from the MariaDB server crash, and it looks interesting:

[ ... 6000+ more frames... ]
#6014 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6015 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6016 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6017 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6018 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6019 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6020 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6021 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6022 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6023 0x00007f8335e557f8 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6024 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6025 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6026 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6027 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6028 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6029 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6030 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6031 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6032 0x00007f8335e557e9 in execplan::ParseTree::~ParseTree() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6033 0x00007f83343b3291 in execplan::CalpontSelectExecutionPlan::~CalpontSelectExecutionPlan() () from /lib64/libexecplan.so
#6034 0x00007f83343b3521 in execplan::CalpontSelectExecutionPlan::~CalpontSelectExecutionPlan() () from /lib64/libexecplan.so
#6035 0x00007f8335dd251e in boost::detail::sp_counted_base::release() () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6036 0x00007f8335dfcc2e in ha_mcs_impl_pushdown_init(mcs_handler_info*, TABLE*) () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6037 0x00007f8335de0155 in create_columnstore_select_handler(THD*, st_select_lex*) () from /usr/lib64/mysql/plugin/ha_columnstore.so
#6038 0x0000558397379d88 in mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) ()
#6039 0x000055839737a6a4 in handle_select(THD*, LEX*, select_result*, unsigned long) ()
#6040 0x00005583971dd476 in execute_sqlcom_select(THD*, TABLE_LIST*) ()
#6041 0x000055839731d002 in mysql_execute_command(THD*, bool) ()
#6042 0x000055839731f3db in mysql_parse(THD*, char*, unsigned int, Parser_state*) ()
#6043 0x000055839732177d in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool) ()
#6044 0x0000558397322cee in do_command(THD*, bool) ()
#6045 0x00005583974188b7 in do_handle_one_connection(CONNECT*, bool) ()
#6046 0x0000558397418b54 in handle_one_connection ()
#6047 0x00005583977b1cec in pfs_spawn_thread ()
#6048 0x00007f833ab6aea5 in start_thread (arg=0x7f8304f1c700) at pthread_create.c:307
#6049 0x00007f833a085b0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
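
The thousands of repeated `execplan::ParseTree::~ParseTree()` frames are consistent with a recursive destructor walking a very deep tree until the thread stack is exhausted, which would fit the "long in-list" in the issue title. This ticket does not confirm that root cause. As a minimal sketch only, assuming a hypothetical `Node` type and helper names that are not ColumnStore's actual code, a deep tree can be torn down with an explicit heap-allocated work list instead of recursion:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical binary tree node for illustration; NOT execplan::ParseTree.
struct Node {
    Node* left = nullptr;
    Node* right = nullptr;
};

// Build a degenerate left-leaning chain of `depth` nodes. Assumption: a very
// long IN-list expanded into chained predicates could yield a similar shape.
Node* buildChain(std::size_t depth) {
    Node* root = nullptr;
    for (std::size_t i = 0; i < depth; ++i) {
        Node* n = new Node;
        n->left = root;  // the previous chain becomes the left subtree
        root = n;
    }
    return root;
}

// Free every node without recursion, so the call stack stays shallow
// regardless of tree depth. Returns the number of nodes freed.
std::size_t destroyIteratively(Node* root) {
    std::size_t freed = 0;
    std::vector<Node*> pending;  // explicit work list, lives on the heap
    if (root != nullptr) pending.push_back(root);
    while (!pending.empty()) {
        Node* n = pending.back();
        pending.pop_back();
        if (n->left != nullptr) pending.push_back(n->left);
        if (n->right != nullptr) pending.push_back(n->right);
        delete n;
        ++freed;
    }
    return freed;
}
```

With this approach the memory needed for traversal grows on the heap rather than on the thread stack, so a chain of a million nodes is freed without deep call nesting.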

Comment by Todd Stoffel (Inactive) [ 2022-11-27 ]

Possible duplicate of MCOL-5310.
