[MCOL-486] can not login to server after PM failover Created: 2016-12-27  Updated: 2019-07-10  Resolved: 2019-07-10

Status: Closed
Project: MariaDB ColumnStore
Component/s: MariaDB Server
Affects Version/s: 1.0.6.1
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Si Tong Assignee: David Thompson (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None
Environment:

CentOS release 6.5


Attachments: File columnstoreSupportReport.columnstore-1.tar.gz    
Sprint: 2016-25, 2017-01, 2017-2, 2017-3, 2017-4, 2017-5, 2017-6, 2017-7, 2017-8

 Description   

Environment description:
mcs is installed in multi+separate+external mode,
with 1 UM and 2 PMs,
using GlusterFS as the external storage;
/etc/fstab has been modified accordingly.
Originally, PM1 is the OAM.

related shell script
a.sh:
while true
do
date > /tmp/b
mysql -uroot -h127.0.0.1 -P3306 -e "set names utf8; select * from test.t2 limit 100" >> /tmp/b
done
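
A variant of a.sh that also records the client's exit status might make it easier to see the exact moment the UM stops accepting logins. This is only a sketch: the `probe` function name is hypothetical, and unlike a.sh (which truncates /tmp/b on every iteration) it appends to the log and keeps the client's output only when the command fails.

```shell
#!/bin/sh
# probe LOGFILE CMD...: run CMD once, append a timestamp and its exit status
# to LOGFILE, and keep CMD's output only when it failed.
probe() {
    log=$1
    shift
    if out=$("$@" 2>&1); then
        rc=0
    else
        rc=$?
    fi
    printf '%s rc=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$rc" >> "$log"
    if [ "$rc" -ne 0 ] && [ -n "$out" ]; then
        printf '%s\n' "$out" >> "$log"
    fi
    return "$rc"
}

# Intended use during the failover (same client command as a.sh):
# while true; do
#     probe /tmp/b mysql -uroot -h127.0.0.1 -P3306 \
#         -e "set names utf8; select * from test.t2 limit 100"
# done
```

With this in place, the first `rc=` line that is non-zero in the log marks when logins started failing, together with the client's error message.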

Steps to reproduce:
1. Create a table:
CREATE TABLE `t2` (
`c0` varchar(30) DEFAULT NULL,
`c1` varchar(30) DEFAULT NULL,
`c2` varchar(30) DEFAULT NULL,
`c3` varchar(30) DEFAULT NULL,
`c4` varchar(30) DEFAULT NULL,
`c5` varchar(30) DEFAULT NULL,
`c6` varchar(30) DEFAULT NULL,
`c7` varchar(30) DEFAULT NULL,
`c8` varchar(30) DEFAULT NULL,
`c9` varchar(30) DEFAULT NULL,
`c10` varchar(30) DEFAULT NULL,
`c11` varchar(30) DEFAULT NULL,
`c12` varchar(30) DEFAULT NULL,
`c13` varchar(30) DEFAULT NULL,
`c14` varchar(30) DEFAULT NULL,
`c15` varchar(30) DEFAULT NULL,
`c16` varchar(30) DEFAULT NULL,
`c17` varchar(30) DEFAULT NULL,
`c18` varchar(30) DEFAULT NULL,
`c19` varchar(30) DEFAULT NULL,
`c20` varchar(30) DEFAULT NULL,
`c21` varchar(30) DEFAULT NULL,
`c22` varchar(30) DEFAULT NULL,
`c23` varchar(30) DEFAULT NULL,
`c24` varchar(30) DEFAULT NULL,
`c25` varchar(30) DEFAULT NULL,
`c26` varchar(30) DEFAULT NULL,
`c27` varchar(30) DEFAULT NULL,
`c28` varchar(30) DEFAULT NULL,
`c29` varchar(30) DEFAULT NULL
) ENGINE=Columnstore DEFAULT CHARSET=utf8;

2. Load 25,000,000 rows of data into t2.
3. Run a.sh; the script must keep running throughout the failover.
4. Shut down the PM1 machine with the command "shutdown -h now".
5. After mcs handles the failover, it reports that everything is OK, but you cannot log in from the mysql client (mysql -uroot -P3306 -h127.0.0.1) to the UM's mysqld instance. In fact, the failover has failed.
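
The report does not say how the 25,000,000 rows in step 2 were produced. One possible way to generate comparable test data is sketched below; the `gen_rows` helper, the value format, and the `|` field separator are all assumptions (the separator is chosen on the assumption that the file would be bulk-loaded with ColumnStore's cpimport using its default delimiter).

```shell
#!/bin/sh
# gen_rows N: print N rows of 30 pipe-delimited values, one per column c0..c29.
# Each value looks like "r<row>c<col>", which stays well under varchar(30).
gen_rows() {
    awk -v n="$1" 'BEGIN {
        for (r = 1; r <= n; r++) {
            line = "r" r "c0"
            for (c = 1; c < 30; c++)
                line = line "|r" r "c" c
            print line
        }
    }'
}

# Intended use (the output path is hypothetical):
# gen_rows 25000000 > /tmp/t2.tbl
```

The content of the rows is unlikely to matter for the failover itself; the point is only to have a table of the same shape and size as in the report.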

Comments:
1. After some checks, I found that /usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock does not exist, but no error was logged in the mysqld instance's startup log.
2. I stopped a.sh and started the PM1 machine; after mcs finished the failover again, the system worked again.
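
The socket check from comment 1 can be scripted so it is easy to repeat after each failover. This is a minimal sketch: the `check_socket` name and its return codes are hypothetical, and the path is the one quoted above. Note that the mysql client treats -h127.0.0.1 as a TCP connection, so a missing socket file is only an indirect sign that mysqld did not come up properly.

```shell
#!/bin/sh
# Socket path as quoted in the report; override via the SOCK variable if needed.
SOCK="${SOCK:-/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock}"

# check_socket PATH: print a one-line status.
# Returns 0 if PATH is a Unix socket, 1 if it exists but is something else,
# and 2 if it does not exist at all.
check_socket() {
    if [ -S "$1" ]; then
        echo "socket present: $1"
        return 0
    elif [ -e "$1" ]; then
        echo "exists but is not a socket: $1"
        return 1
    else
        echo "socket missing: $1"
        return 2
    fi
}

check_socket "$SOCK" || true
```

Running it on the UM before and after the failover shows whether the socket disappears at the same time that logins start to fail.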

The columnstoreSupport report file is attached.



 Comments   
Comment by David Thompson (Inactive) [ 2016-12-28 ]

Can you try to reproduce this, as I've not seen it? Maybe the repetitive queries are something our tests don't cover?

Comment by Si Tong [ 2016-12-28 ]

No, the query is nothing special, only a SET and a SELECT ... LIMIT 100.
I reproduced this every time the script was running during the failover. If the script is not running during the failover, the failover succeeds.

The machines are all KVM virtual machines with 1 GB of memory each.

Comment by David Thompson (Inactive) [ 2016-12-28 ]

As mentioned in 487, can you retest with a higher memory limit and see if this still happens? If not, we can try to do this based on our bug backlog priorities.

Comment by Si Tong [ 2016-12-30 ]

Retested in a new environment with 8 GB of memory per machine and enough disk space, and the problem reproduced.
I also tested further: when the PM1 machine is started up again while a.sh is running, the failover fails too.

Comment by David Thompson (Inactive) [ 2016-12-30 ]

Thanks - appreciate your help! Will get this reviewed and prioritized.
