[MCOL-472] mysqld not shutdown by shutdownsystem commands, sometimes
Created: 2016-12-19  Updated: 2023-10-26  Resolved: 2017-07-19

Status: Closed
Project: MariaDB ColumnStore
Component/s: ?
Affects Version/s: 1.0.6, 1.0.7
Fix Version/s: 1.0.10

Type: Bug Priority: Minor
Reporter: David Hill (Inactive) Assignee: David Hill (Inactive)
Resolution: Fixed Votes: 1
Labels: None
Environment:

nightly regression test system



 Description   

This has happened on my nightly regression test system. I haven't seen it happen on a regular mysql install.

As shown below, a script shuts down the system and then runs postConfigure on the nightly build. Shutdown reports success, but postConfigure reports an error that port 3306 is in use, because mysqld is still running.

Because it's the build machine, it's probably happening because:

1. it didn't get shut down cleanly at the start of the build
2. since the install directories are deleted, the pid file is deleted
3. when the shutdown is done, mysqld is not shut down because the pid file isn't there. There should be code in mysql-Columnstore to do a force shutdown when the pid file isn't there, so that logic needs to be checked
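The fallback suspected in item 3 could look roughly like the following sketch. It is not the actual mysql-Columnstore code: `resolve_mysqld_pid` is a hypothetical helper, and the field position used in awk assumes the `ps -ef` layout seen in the listings in this ticket (command path in the eighth field).

```shell
# Hypothetical helper: print the pid that a stop routine should signal.
#   $1 = path of the pid file
#   $2 = full path of the mysqld binary as it appears in `ps -ef`
# Prints nothing when neither the pid file nor the process table has a match.
resolve_mysqld_pid() {
    pid_file=$1
    binary=$2

    if [ -s "$pid_file" ]; then
        # Normal case: the pid file exists and is non-empty.
        cat "$pid_file"
        return 0
    fi

    # Fallback for a missing pid file: take the first process whose
    # command (field 8) is exactly the mysqld binary. An exact match
    # skips the mysqld_safe wrapper, whose field 8 is /bin/sh.
    ps -ef | awk -v bin="$binary" '$8 == bin && !found { print $2; found = 1 }'
}
```

A caller would then signal the printed pid only if the result is non-empty. An empty result means there is nothing left to stop, or that the path passed in does not match the running binary.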

[root@ip-172-30-0-119 nightly]# ./localSystemRestart.sh
shutdownsystem Mon Dec 19 14:51:41 2016

This command stops the processing of applications on all Modules within the MariaDB Columnstore System

Checking for active transactions

Stopping System...
Successful stop of System

Shutting Down System...
Successful shutdown of System

Mariab Columnstore uninstall completed
The next step is:

/usr/local/mariadb/columnstore/bin/postConfigure

This is the MariaDB Columnstore System Configuration and Installation tool.
It will Configure the MariaDB Columnstore System and will perform a Package
Installation of all of the Servers within the System that is being configured.

IMPORTANT: This tool should only be run on the Parent OAM Module
which is a Performance Module, preferred Module #1

With the no-Prompting Option being specified, you will be required to have the following:

1. Root user ssh keys setup between all nodes in the system or
use the password command line option.
2. A Configure File to use to retrieve configure data, default to Columnstore.xml.rpmsave
or use the '-c' option to point to a configuration file.

The Calpont Configuration Data is taken from /usr/local/mariadb/columnstore/etc/Columnstore.xml.rpmsave

Do you want to utilize the configuration data from the saved copy? [y,n] >
NOTE: my.cnf file was upgraded based on my.cnf.rpmsave

===== Setup System Server Type Configuration =====

There are 2 options when configuring the System Server Type: single and multi

'single' - Single-Server install is used when there will only be 1 server configured
on the system. It can also be used for production systems, if the plan is
to stay single-server.

'multi' - Multi-Server install is used when you want to configure multiple servers now or
in the future. With Multi-Server install, you can still configure just 1 server
now and add on addition servers/modules in the future.

Select the type of System Server install [1=single, 2=multi] (1) >

Performing the Single Server Install.
Enter System Name (columnstore-1) >

===== Storage Configuration = internal =====

Enter the list (Nx,Ny,Nz) or range (Nx-Nz) of DBRoot IDs assigned to module 'pm1' (1) >

The MariaDB Columnstore port of '3306' is already in-use
For No-prompt install, use the command line argument of 'port' to enter a different number

ps -ef | grep mysql
root 5215 1 0 Dec15 ? 00:00:00 /bin/sh /usr/local/mariadb/columnstore/mysql//bin/mysqld_safe --datadir=/usr/local/mariadb/columnstore/mysql/db --pid-file=/usr/local/mariadb/columnstore/mysql/db/ip-172-30-0-119.pid --ledir=/usr/local/mariadb/columnstore/mysql//bin
mysql 5398 5215 0 Dec15 ? 00:00:48 /usr/local/mariadb/columnstore/mysql//bin/mysqld --basedir=/usr/local/mariadb/columnstore/mysql/ --datadir=/usr/local/mariadb/columnstore/mysql/db --plugin-dir=/usr/local/mariadb/columnstore/mysql/lib/plugin --user=mysql --log-error=/usr/local/mariadb/columnstore/mysql/db/ip-172-30-0-119.err --pid-file=/usr/local/mariadb/columnstore/mysql/db/ip-172-30-0-119.pid --socket=/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock --port=3306
root 66142 62289 0 15:01 pts/0 00:00:00 grep --color=auto mysql
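
One way to confirm the condition postConfigure complains about, before re-running it, is to look for a listener on the port. This is a minimal sketch, assuming the `ss` utility from iproute2 is available on the host. `port_in_use` is a hypothetical helper name, not part of ColumnStore.

```shell
# Hypothetical helper: succeed if any line of an `ss -ltn` listing read from
# stdin has a local address (fourth column) ending in ":<port>".
port_in_use() {
    awk -v port="$1" '
        $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Example use (not run here):
#   ss -ltn | port_in_use 3306 && echo "port 3306 still has a listener"
```

Reading the listing from stdin keeps the check independent of how the listing is produced, so it can be tried against saved output as well.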



 Comments   
Comment by David Hill (Inactive) [ 2016-12-19 ]

Nightly regression test work-around: doing a 'pkill -9 mysqld' after the shutdownsystem command.

Comment by David Hill (Inactive) [ 2016-12-22 ]

same issue on a newly installed amazon system where the binary package was untarred and then the shutdown was done.

[mariadb-user@ip-172-30-0-108 ~]$ ma shutd y
shutdownsystem Thu Dec 22 17:01:38 2016

This command stops the processing of applications on all Modules within the MariaDB ColumnStore System

Checking for active transactions

Stopping System...
Successful stop of System

Shutting Down System...
Successful shutdown of System

[mariadb-user@ip-172-30-0-108 ~]$ ps -ef | grep mysql
mariadb+ 16669 1 0 16:07 ? 00:00:00 /bin/sh /home/mariadb-user/mariadb/columnstore/mysql//bin/mysqld_safe --datadir=/home/mariadb-user/mariadb/columnstore/mysql/db --pid-file=/home/mariadb-user/mariadb/columnstore/mysql/db/ip-172-30-0-108.us-west-2.compute.internal.pid --ledir=/home/mariadb-user/mariadb/columnstore/mysql//bin
mariadb+ 16838 16669 0 16:07 ? 00:00:00 /home/mariadb-user/mariadb/columnstore/mysql//bin/mysqld --basedir=/home/mariadb-user/mariadb/columnstore/mysql/ --datadir=/home/mariadb-user/mariadb/columnstore/mysql/db --plugin-dir=/home/mariadb-user/mariadb/columnstore/mysql/lib/plugin --log-error=/home/mariadb-user/mariadb/columnstore/mysql/db/ip-172-30-0-108.us-west-2.compute.internal.err --pid-file=/home/mariadb-user/mariadb/columnstore/mysql/db/ip-172-30-0-108.us-west-2.compute.internal.pid --socket=/home/mariadb-user/mariadb/columnstore/mysql/lib/mysql/mysql.sock --port=3306
mariadb+ 28016 26772 0 17:02 pts/0 00:00:00 grep --color=auto mysql
[mariadb-user@ip-172-30-0-108 ~]$

Comment by David Hill (Inactive) [ 2016-12-23 ]

work-around

  1. mcsadmin shutdownsystem y
  2. pkill -9 mysqld
  3. mcsadmin startsystem
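
The three work-around steps could be wrapped as in the sketch below. The function name `restart_with_workaround` is hypothetical, and the `pgrep` guard is an addition that is not in the original work-around. It only avoids sending SIGKILL when no process named like mysqld is left.

```shell
# Hypothetical wrapper around the three work-around steps from this ticket.
restart_with_workaround() {
    # Step 1: regular shutdown; stop here if mcsadmin itself fails.
    mcsadmin shutdownsystem y || return 1

    # Step 2: force-kill only if a process whose name contains "mysqld"
    # (mysqld or mysqld_safe) survived the shutdown.
    if pgrep mysqld >/dev/null 2>&1; then
        pkill -9 mysqld
    fi

    # Step 3: bring the system back up.
    mcsadmin startsystem
}
```

Since SIGKILL cannot be caught, mysqld gets no chance to shut down cleanly, so this is a stopgap until the stop logic itself is fixed.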

Comment by David Hill (Inactive) [ 2017-06-05 ]

Found the error: it's a mysqld path issue in the mysql-columnstore script.

Comment by David Hill (Inactive) [ 2017-06-05 ]

changed line 313

eval $(ps -ef | grep "$COLUMNSTORE_INSTALL_DIR/mysql//sbin/mysqld" | grep -v grep | head -1 | awk '{printf "pid=%d\n", $2}')

to

eval $(ps -ef | grep "$COLUMNSTORE_INSTALL_DIR/mysql//bin/mysqld" | grep -v grep | head -1 | awk '{printf "pid=%d\n", $2}')

commit ba7825cce669e3b7d592a4236a1c0de40e14fbac
Author: david hill <david.hill@mariadb.com>
Date: Mon Jun 5 16:55:45 2017 -0500

MCOL-472 - fixed mysqld path issue on kill by pid

dbcon/mysql/mysql-Columnstore | 2 +-
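
To see why the path matters, the lookup from line 313 can be tried against a saved `ps -ef` line. In this sketch the pipeline is the one quoted in this comment, wrapped in a hypothetical function `find_mysqld_pid` that reads the listing from stdin instead of calling `ps` itself.

```shell
# Hypothetical wrapper around the lookup pipeline quoted in this comment.
#   $1 = path fragment to search for in the process listing
# Reads a `ps -ef` listing from stdin and prints "pid=<n>" for the first
# matching line, or nothing at all when no line matches.
find_mysqld_pid() {
    grep "$1" | grep -v grep | head -1 | awk '{printf "pid=%d\n", $2}'
}

# Example use (not run here), with a line taken from this ticket:
#   ps -ef | find_mysqld_pid "/usr/local/mariadb/columnstore/mysql//bin/mysqld"
```

With the `sbin` path nothing matches, so the `eval` receives no assignment and `pid` is never set. The `bin` pattern is a substring match, so it also matches the mysqld_safe wrapper line when that line comes first in the listing.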

Comment by David Hill (Inactive) [ 2017-06-05 ]

How to reproduce: start MCS and then delete the pid file.
Then run the following to test that a force shutdown is done:

./mysql-columnstore stop

Comment by David Hill (Inactive) [ 2017-06-12 ]

fixed for 1.0.10

commit 2dd99eabf7ace7cee3bf1e1452d58034ce3ce948
Author: david hill <david.hill@mariadb.com>
Date: Mon Jun 12 16:29:49 2017 -0500

MCOL-472 - fix the force shutdown command

dbcon/mysql/mysql-Columnstore | 2 +-

Comment by Daniel Lee (Inactive) [ 2017-06-12 ]

Build tested: Github source 1.1.0

[root@localhost mariadb-columnstore-server]# git show
commit 594ef1807a5d6cba45cf7c2bed03cccdc32f177a
Merge: a5f191d ce815f9
Author: David.Hall <david.hall@mariadb.com>
Date: Thu Jun 8 10:12:50 2017 -0500

[root@localhost mariadb-columnstore-engine]# git show
commit 3e1bdfb1e97490f3d66339eb11d1d1de1222487a
Author: david hill <david.hill@mariadb.com>
Date: Wed Jun 7 15:09:09 2017 -0500

Reproduced the issue in 1.0.10 source (before Mr. Hill made the change to 1.0.10, as described in the last comment).
Verified that "sbin" is in the mysql-Columnstore file for 1.0.10, but not in 1.0.1

Repeated the test using the stopsystem command in the ma console and it did not stop the mysqld process.

I tried the mysql-Columnstore script and shutdownsystem and they both worked.

After the shutdownsystem command, the mysqld process no longer exists. There is also no pid file (I renamed it during testing and removed it afterwards). I did a startsystem. The system came up and a new pid file was created, but ma did not show a pid for the mysqld process.

mcsadmin> getprocessstatus
getprocessstatus Mon Jun 12 21:28:29 2017

MariaDB ColumnStore Process statuses

Process            Module Status          Last Status Change       Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor     um1    ACTIVE          Mon Jun 12 21:27:04 2017 10748
ServerMonitor      um1    ACTIVE          Mon Jun 12 21:27:23 2017 11061
DBRMWorkerNode     um1    ACTIVE          Mon Jun 12 21:27:23 2017 11088
ExeMgr             um1    ACTIVE          Mon Jun 12 21:27:36 2017 12550
DDLProc            um1    ACTIVE          Mon Jun 12 21:27:40 2017 12564
DMLProc            um1    ACTIVE          Mon Jun 12 21:27:45 2017 12575
mysqld             um1    ACTIVE          Mon Jun 12 21:27:34 2017 <------pid missing here.

Comment by David Hill (Inactive) [ 2017-06-12 ]

1. The missing pid is normal since the pid file was deleted.
2. I reproduced the same issue, but on my system the status of mysqld went to MAN_OFFLINE after I deleted the pid file. Then when I did a stop, it didn't kill mysqld, just like on Daniel's. I think on mine, since it was already showing MAN_OFFLINE, it didn't try to do a stop. So I will need to look into that code.

[root@virtualbox-centos7 db]# rm -f virtualbox-centos7.pid
[root@virtualbox-centos7 db]# ma

MariaDB ColumnStore Admin Console
enter 'help' for list of commands
enter 'exit' to exit the MariaDB ColumnStore Command Console
use up/down arrows to recall commands

Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0

Critical Active Alarms:

mcsadmin> getsystemi
getsysteminfo Mon Jun 12 17:21:47 2017

System columnstore-1

System and Module statuses

Component    Status                     Last Status Change
------------ -------------------------- ------------------------
System       ACTIVE                     Mon Jun 12 17:21:12 2017

Module pm1   DEGRADED                   Mon Jun 12 17:21:43 2017

MariaDB ColumnStore Process statuses

Process            Module Status          Last Status Change       Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor     pm1    ACTIVE          Mon Jun 12 17:19:05 2017 15644
ProcessManager     pm1    ACTIVE          Mon Jun 12 17:19:11 2017 15732
DBRMControllerNode pm1    ACTIVE          Mon Jun 12 17:20:47 2017 17335
ServerMonitor      pm1    ACTIVE          Mon Jun 12 17:20:49 2017 17355
DBRMWorkerNode     pm1    ACTIVE          Mon Jun 12 17:20:49 2017 17375
DecomSvr           pm1    ACTIVE          Mon Jun 12 17:20:53 2017 17403
PrimProc           pm1    ACTIVE          Mon Jun 12 17:20:55 2017 17450
ExeMgr             pm1    ACTIVE          Mon Jun 12 17:20:59 2017 17508
WriteEngineServer  pm1    ACTIVE          Mon Jun 12 17:21:03 2017 17560
DDLProc            pm1    ACTIVE          Mon Jun 12 17:21:07 2017 17628
DMLProc            pm1    ACTIVE          Mon Jun 12 17:21:11 2017 17660
mysqld             pm1    MAN_OFFLINE     Mon Jun 12 17:21:43 2017

Active Alarm Counts: Critical = 0, Major = 0, Minor = 0, Warning = 0, Info = 0
mcsadmin> stops y
stopsystem Mon Jun 12 17:22:03 2017

This command stops the processing of applications on all Modules within the MariaDB ColumnStore System

Checking for active transactions

System being stopped now...
Successful stop of System

mcsadmin> getsystemi
getsysteminfo Mon Jun 12 17:22:16 2017

System columnstore-1

System and Module statuses

Component    Status                     Last Status Change
------------ -------------------------- ------------------------
System       MAN_OFFLINE                Mon Jun 12 17:22:09 2017

Module pm1   MAN_OFFLINE                Mon Jun 12 17:22:07 2017

MariaDB ColumnStore Process statuses

Process            Module Status          Last Status Change       Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor     pm1    ACTIVE          Mon Jun 12 17:19:05 2017 15644
ProcessManager     pm1    ACTIVE          Mon Jun 12 17:19:11 2017 15732
DBRMControllerNode pm1    MAN_OFFLINE     Mon Jun 12 17:22:04 2017
ServerMonitor      pm1    MAN_OFFLINE     Mon Jun 12 17:22:04 2017
DBRMWorkerNode     pm1    MAN_OFFLINE     Mon Jun 12 17:22:04 2017
DecomSvr           pm1    MAN_OFFLINE     Mon Jun 12 17:22:04 2017
PrimProc           pm1    MAN_OFFLINE     Mon Jun 12 17:22:04 2017
ExeMgr             pm1    MAN_OFFLINE     Mon Jun 12 17:22:04 2017
WriteEngineServer  pm1    MAN_OFFLINE     Mon Jun 12 17:22:04 2017
DDLProc            pm1    MAN_OFFLINE     Mon Jun 12 17:22:04 2017
DMLProc            pm1    MAN_OFFLINE     Mon Jun 12 17:22:04 2017
mysqld             pm1    MAN_OFFLINE     Mon Jun 12 17:22:06 2017

Active Alarm Counts: Critical = 1, Major = 1, Minor = 9, Warning = 0, Info = 0
mcsadmin> exit
exit Mon Jun 12 17:22:22 2017
Exiting the MariaDB ColumnStore Admin Console
[root@virtualbox-centos7 db]# ps -ef | grep mysql
mysql 17217 1 0 17:20 pts/2 00:00:00 /usr/local/mariadb/columnstore/mysql//bin/mysqld --basedir=/usr/local/mariadb/columnstore/mysql/ --datadir=/usr/local/mariadb/columnstore/mysql/db --plugin-dir=/usr/local/mariadb/columnstore/mysql/lib/plugin --user=mysql --log-error=/usr/local/mariadb/columnstore/mysql/db/virtualbox-centos7.err --pid-file=/usr/local/mariadb/columnstore/mysql/db/virtualbox-centos7.pid --socket=/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock --port=3306
root 18204 10118 0 17:22 pts/2 00:00:00 grep --color=auto mysql
[root@virtualbox-centos7 db]#

Comment by Daniel Lee (Inactive) [ 2017-06-13 ]

Reopen per the issues we observed

Comment by David Hill (Inactive) [ 2017-06-13 ]

additional fix for stopsystem issue

1.1.0 develop

commit d16dfec7f7c87b95ae5a5dc210ea0dd4c5c9b71a
Author: david hill <david.hill@mariadb.com>
Date: Tue Jun 13 09:50:47 2017 -0500

MCOL-472 - additional tweak to full shutdown both mysql processes

dbcon/mysql/mysql-Columnstore | 2 +-

1.0.10 develop-1.0

commit 84741a7eb29fdd6976b1b9f96a57a677ccc2a3f0
Author: david hill <david.hill@mariadb.com>
Date: Tue Jun 13 09:51:46 2017 -0500

MCOL-472 - additional tweak to full shutdown both mysql processes

dbcon/mysql/mysql-Columnstore | 2 +-

Comment by Daniel Lee (Inactive) [ 2017-07-19 ]

Build verified: 1.0.10-1

Comment by Daniel Lee (Inactive) [ 2017-07-19 ]

Build tested: Github source 1.1.0

[root@localhost mariadb-columnstore-server]# git show
commit 8e07495da650d922c4d1f3f09d77382168132b11
Merge: 80e57a8 c27e1e5
Author: David.Hall <david.hall@mariadb.com>
Date: Wed Jul 12 13:07:42 2017 -0500

[root@localhost mariadb-columnstore-engine]# git show
commit 853eb1388ea5f880abc9b11e9498bd8a3407fa5a
Merge: d138692 9ad2b0c
Author: David.Hall <david.hall@mariadb.com>
Date: Tue Jul 18 09:21:39 2017 -0500

mysql processes remained after stopsystem command.

1. ColumnStore is up and running
2. Renamed mysql's pid file
3. shutdownsystem

root 11194 1 0 19:43 ? 00:00:00 /bin/sh /usr/local/mariadb/columnstore/mysql//bin/mysqld_safe --datadir=/usr/local/mariadb/columnstore/mysql/db --pid-file=/usr/local/mariadb/columnstore/mysql/db/localhost.localdomain.pid --ledir=/usr/local/mariadb/columnstore/mysql//bin
mysql 11382 11194 0 19:43 ? 00:00:05 /usr/local/mariadb/columnstore/mysql//bin/mysqld --basedir=/usr/local/mariadb/columnstore/mysql/ --datadir=/usr/local/mariadb/columnstore/mysql/db --plugin-dir=/usr/local/mariadb/columnstore/mysql/lib/plugin --user=mysql --log-error=/usr/local/mariadb/columnstore/mysql/db/localhost.localdomain.err --pid-file=/usr/local/mariadb/columnstore/mysql/db/localhost.localdomain.pid --socket=/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock --port=3306
root 14819 13361 0 19:56 pts/0 00:00:00 grep --color=auto mysql

Comment by Daniel Lee (Inactive) [ 2017-07-19 ]

Reopened per my last comment.

Comment by Daniel Lee (Inactive) [ 2017-07-19 ]

Remaining issue for 1.1.0 is being tracked by ticket MCOL-819. Closing this ticket as it worked in 1.0.10-1.

Generated at Thu Feb 08 02:21:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.