[MCOL-1034] DDL/DML incorrectly starts when active on um2 during a pm outage Created: 2017-11-15  Updated: 2018-02-01  Resolved: 2018-02-01

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 1.0.11, 1.1.0
Fix Version/s: 1.1.3

Type: Bug Priority: Major
Reporter: David Hill (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: relnote
Environment:

amazon ec2 2um / 3pm with ebs


Sprint: 2017-23, 2017-24, 2018-02, 2018-03

 Description   

This test case is still failing:
um2 is the active module (ddl/dml are active).
Take down pm3 and bring it back up.
The system ends up with um1 as the active module, but replication master/slave is not set up correctly: um1 is the slave and um2 is the master.



 Comments   
Comment by David Hill (Inactive) [ 2018-01-25 ]

Also seeing multiple instances of ddl/dml active after a pm1 outage and bringing it back up.

mcsadmin> getprocessstatus
getprocessstatus Thu Jan 25 17:22:21 2018

MariaDB ColumnStore Process statuses

Process            Module Status          Last Status Change       Process ID
------------------ ------ --------------- ------------------------ ----------
ProcessMonitor     pm1    ACTIVE          Thu Jan 25 17:14:46 2018 1215
ProcessManager     pm1    COLD_STANDBY    Thu Jan 25 17:14:54 2018
DBRMControllerNode pm1    COLD_STANDBY    Thu Jan 25 17:14:54 2018
ServerMonitor      pm1    ACTIVE          Thu Jan 25 17:14:57 2018 11529
DBRMWorkerNode     pm1    ACTIVE          Thu Jan 25 17:15:23 2018 11708
DecomSvr           pm1    ACTIVE          Thu Jan 25 17:15:02 2018 11585
PrimProc           pm1    ACTIVE          Thu Jan 25 17:15:38 2018 11766
ExeMgr             pm1    ACTIVE          Thu Jan 25 17:15:51 2018 11815
WriteEngineServer  pm1    ACTIVE          Thu Jan 25 17:16:26 2018 12057
DDLProc            pm1    ACTIVE          Thu Jan 25 17:16:40 2018 12107
DMLProc            pm1    ACTIVE          Thu Jan 25 17:16:50 2018 12143
mysqld             pm1    ACTIVE          Thu Jan 25 17:14:56 2018 11404

ProcessMonitor     pm2    ACTIVE          Thu Jan 25 16:50:30 2018 3362
ProcessManager     pm2    ACTIVE          Thu Jan 25 16:58:59 2018 3527
DBRMControllerNode pm2    ACTIVE          Thu Jan 25 17:15:21 2018 24052
ServerMonitor      pm2    ACTIVE          Thu Jan 25 16:58:57 2018 6874
DBRMWorkerNode     pm2    ACTIVE          Thu Jan 25 17:15:27 2018 24159
DecomSvr           pm2    ACTIVE          Thu Jan 25 16:59:01 2018 6962
PrimProc           pm2    ACTIVE          Thu Jan 25 17:15:42 2018 24368
ExeMgr             pm2    ACTIVE          Thu Jan 25 17:15:55 2018 24540
WriteEngineServer  pm2    ACTIVE          Thu Jan 25 17:16:30 2018 25172
DDLProc            pm2    ACTIVE          Thu Jan 25 17:16:44 2018 25378
DMLProc            pm2    ACTIVE          Thu Jan 25 17:16:51 2018 25528
mysqld             pm2    ACTIVE          Thu Jan 25 16:58:55 2018 6648

ProcessMonitor     pm3    ACTIVE          Thu Jan 25 16:50:31 2018 3370
ProcessManager     pm3    HOT_STANDBY     Thu Jan 25 16:59:35 2018 5386
DBRMControllerNode pm3    COLD_STANDBY    Thu Jan 25 17:15:20 2018
ServerMonitor      pm3    ACTIVE          Thu Jan 25 16:50:49 2018 3881
DBRMWorkerNode     pm3    ACTIVE          Thu Jan 25 17:15:32 2018 7769
DecomSvr           pm3    ACTIVE          Thu Jan 25 16:50:53 2018 3925
PrimProc           pm3    ACTIVE          Thu Jan 25 17:15:46 2018 7812
ExeMgr             pm3    ACTIVE          Thu Jan 25 17:15:59 2018 7888
WriteEngineServer  pm3    ACTIVE          Thu Jan 25 17:16:34 2018 7978
DDLProc            pm3    COLD_STANDBY    Thu Jan 25 17:16:46 2018
DMLProc            pm3    COLD_STANDBY    Thu Jan 25 17:16:50 2018
mysqld             pm3    ACTIVE          Thu Jan 25 16:59:24 2018 5267
[11:23 AM]
that's hard to read
[11:24 AM]
anyway, pm2 is active, should DDL and DML procs on pm1 be ACTIVE?
[11:24 AM]
they are cold standby on pm3
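
The condition questioned above (DDLProc/DMLProc reported ACTIVE on more than one module) could be checked mechanically with something like the following. This is only a sketch: `find_multiple_active` and `SINGLETON_PROCESSES` are hypothetical names, the script does not call mcsadmin itself, and it assumes the whitespace-separated row layout of the getprocessstatus output pasted in this comment.

```python
from collections import defaultdict

# Processes expected to be ACTIVE on only one module at a time.
# Assumption taken from this ticket, not from ColumnStore documentation.
SINGLETON_PROCESSES = ("DDLProc", "DMLProc")


def find_multiple_active(status_text, names=SINGLETON_PROCESSES):
    """Return {process: [modules]} for processes ACTIVE on more than one module."""
    active = defaultdict(list)
    for line in status_text.splitlines():
        fields = line.split()
        # Assumed row layout: <process> <module> <status> <timestamp...> [pid]
        if len(fields) >= 3 and fields[0] in names and fields[2] == "ACTIVE":
            active[fields[0]].append(fields[1])
    return {proc: mods for proc, mods in active.items() if len(mods) > 1}


# DDLProc/DMLProc rows copied from the output above.
sample = """\
DDLProc pm1 ACTIVE Thu Jan 25 17:16:40 2018 12107
DMLProc pm1 ACTIVE Thu Jan 25 17:16:50 2018 12143
DDLProc pm2 ACTIVE Thu Jan 25 17:16:44 2018 25378
DMLProc pm2 ACTIVE Thu Jan 25 17:16:51 2018 25528
DDLProc pm3 COLD_STANDBY Thu Jan 25 17:16:46 2018
DMLProc pm3 COLD_STANDBY Thu Jan 25 17:16:50 2018
"""

if __name__ == "__main__":
    # Prints {'DDLProc': ['pm1', 'pm2'], 'DMLProc': ['pm1', 'pm2']}
    print(find_multiple_active(sample))
```

An empty result would mean each of the listed processes is ACTIVE on at most one module.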

Comment by David Hill (Inactive) [ 2018-01-30 ]

The amazon 3pm with ebs failover test scenarios all worked great.

1. no double ddl/dml active
2. no dbrm read-only states
3. able to create new tables with a pm down, and they showed up once the pm was back up.

Next, separate um/pm system testing

Comment by David Hill (Inactive) [ 2018-01-31 ]

https://github.com/mariadb-corporation/mariadb-columnstore-engine/pull/390

Comment by Ben Thompson (Inactive) [ 2018-01-31 ]

Reviewed / Merged

Comment by Daniel Lee (Inactive) [ 2018-02-01 ]

Build verified: mcs-1.1.3 ami (ami-7e2b9006) released to QA on 02/01/2018.

Generated at Thu Feb 08 02:25:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.