[MCOL-983] DMLProc rollback took 10 minutes when there are 0 tables to rollback, non-root install Created: 2017-10-23  Updated: 2021-01-17  Resolved: 2021-01-17

Status: Closed
Project: MariaDB ColumnStore
Component/s: N/A
Affects Version/s: None
Fix Version/s: N/A

Type: New Feature Priority: Major
Reporter: David Hill (Inactive) Assignee: Todd Stoffel (Inactive)
Resolution: Won't Do Votes: 0
Labels: None
Environment:

amazon EC2 centos 7 - 2um/1pm with local query



 Description   

Started with um1 as the Active UM on a 2um/1pm system with Local Query enabled. Non-root install using local disk.

1. Stopped instance um1.
2. um2 was failed over and made the Active UM. DMLProc gets restarted during this process so it can run the rollback, and it shows BUSY_INIT while the rollback runs.
3. DMLProc showed BUSY_INIT for more than 10 minutes, and the logs showed that the rollback had not completed.

Component     Status                      Last Status Change
------------  --------------------------  ------------------------
System        BUSY_INIT                   Mon Oct 23 15:53:50 2017

Module um1    AUTO_DISABLED/DEGRADED      Mon Oct 23 15:53:31 2017
Module um2    ACTIVE                      Mon Oct 23 15:53:51 2017
Module pm1    ACTIVE                      Mon Oct 23 15:44:58 2017

Active Parent OAM Performance Module is 'pm1'
Primary Front-End MariaDB ColumnStore Module is 'um2'
Local Query Feature is enabled
MariaDB ColumnStore Replication Feature is enabled

MariaDB ColumnStore Process statuses

Process             Module  Status           Last Status Change        Process ID
------------------  ------  ---------------  ------------------------  ----------
ProcessMonitor      um1     AUTO_OFFLINE     Mon Oct 23 15:53:31 2017
ServerMonitor       um1     AUTO_OFFLINE     Mon Oct 23 15:53:31 2017
DBRMWorkerNode      um1     AUTO_OFFLINE     Mon Oct 23 15:53:31 2017
ExeMgr              um1     AUTO_OFFLINE     Mon Oct 23 15:53:31 2017
DDLProc             um1     AUTO_OFFLINE     Mon Oct 23 15:53:31 2017
DMLProc             um1     AUTO_OFFLINE     Mon Oct 23 15:53:31 2017
mysqld              um1     AUTO_OFFLINE     Mon Oct 23 15:53:31 2017

ProcessMonitor      um2     ACTIVE           Mon Oct 23 15:44:36 2017  11625
ServerMonitor       um2     ACTIVE           Mon Oct 23 15:44:51 2017  12081
DBRMWorkerNode      um2     ACTIVE           Mon Oct 23 15:44:53 2017  12107
ExeMgr              um2     ACTIVE           Mon Oct 23 15:53:35 2017  13973
DDLProc             um2     ACTIVE           Mon Oct 23 15:53:44 2017  14013
DMLProc             um2     BUSY_INIT        Mon Oct 23 15:53:50 2017  14026
mysqld              um2     ACTIVE           Mon Oct 23 15:54:00 2017  13022

ProcessMonitor      pm1     ACTIVE           Mon Oct 23 15:44:27 2017  6913
ProcessManager      pm1     ACTIVE           Mon Oct 23 15:44:33 2017  7069
DBRMControllerNode  pm1     ACTIVE           Mon Oct 23 15:44:43 2017  7803
ServerMonitor       pm1     ACTIVE           Mon Oct 23 15:44:45 2017  7835
DBRMWorkerNode      pm1     ACTIVE           Mon Oct 23 15:44:45 2017  7888
DecomSvr            pm1     ACTIVE           Mon Oct 23 15:44:49 2017  8070
PrimProc            pm1     ACTIVE           Mon Oct 23 15:44:51 2017  8166
ExeMgr              pm1     ACTIVE           Mon Oct 23 15:53:39 2017  24025
WriteEngineServer   pm1     ACTIVE           Mon Oct 23 15:44:59 2017  8579
mysqld              pm1     ACTIVE           Mon Oct 23 15:48:26 2017  15057
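
For anyone reproducing this, a small script can pick out the processes that are not ACTIVE from a status dump like the one above. This is only a sketch: the names `ProcStatus`, `parse_status_row`, and `not_active` are made up for illustration and are not part of ColumnStore. It assumes each process row is whitespace-separated as process, module, status, a five-token timestamp, and an optional PID.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class ProcStatus(NamedTuple):
    """One row of the process status table (hypothetical helper type)."""
    process: str
    module: str
    status: str
    changed: datetime
    pid: Optional[int]


def parse_status_row(line: str) -> Optional[ProcStatus]:
    """Parse a row such as 'DMLProc um2 BUSY_INIT Mon Oct 23 15:53:50 2017 14026'."""
    tokens = line.split()
    # A data row has process, module, status, five timestamp tokens, optional PID.
    if len(tokens) not in (8, 9):
        return None
    try:
        changed = datetime.strptime(" ".join(tokens[3:8]), "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None  # header, separator, or unrelated line
    pid = int(tokens[8]) if len(tokens) == 9 and tokens[8].isdigit() else None
    return ProcStatus(tokens[0], tokens[1], tokens[2], changed, pid)


def not_active(report: str) -> List[ProcStatus]:
    """Return every parsed process row whose status is not ACTIVE."""
    rows = (parse_status_row(line) for line in report.splitlines())
    return [row for row in rows if row is not None and row.status != "ACTIVE"]


# Three rows copied from the um2 section of the status output above.
sample = """\
DDLProc um2 ACTIVE Mon Oct 23 15:53:44 2017 14013
DMLProc um2 BUSY_INIT Mon Oct 23 15:53:50 2017 14026
mysqld um2 ACTIVE Mon Oct 23 15:54:00 2017 13022
"""

stuck = not_active(sample)
for row in stuck:
    print(row.process, row.module, row.status, row.changed.time())
```

On the three sample rows this reports only DMLProc on um2. Run against the full dump it would also list the AUTO_OFFLINE um1 processes, which are expected here because um1 was stopped.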

Active Alarm Counts: Critical = 1, Major = 1, Minor = 0, Warning = 0, Info = 0

Oct 23 15:53:46 ip-172-30-0-152 DMLProc[14026]: 46.224661 |0|0|0| I 20 CAL0002: DMLProc starts rollbackAll.
Oct 23 15:53:46 ip-172-30-0-152 DMLProc[14026]: 46.235930 |0|0|0| I 20 CAL0002: DMLProc will rollback 0 tables.
Oct 23 15:53:47 ip-172-30-0-152 ProcessMonitor[11625]: 47.193851 |0|0|0| I 18 CAL0000: MSG RECEIVED: Start All process request...
Oct 23 15:53:47 ip-172-30-0-152 ProcessMonitor[11625]: 47.240041 |0|0|0| I 18 CAL0000: STARTALL: ACK back to ProcMgr, return status = 0

Oct 23 16:03:35 ip-172-30-0-152 DMLProc[14026]: 35.587322 |0|0|0| I 20 CAL0002: DMLProc finished rollbackAll.
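
The elapsed rollback time can be read off the two rollbackAll log lines above. The following is a rough sketch of that arithmetic; `log_time` and `rollback_duration` are hypothetical helper names, and the year is passed in as an assumption because the syslog timestamps do not carry one.

```python
from datetime import datetime, timedelta
from typing import Optional


def log_time(line: str, year: int) -> datetime:
    """Read the leading 'Mon DD HH:MM:SS' syslog timestamp; the year is supplied."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def rollback_duration(log: str, year: int) -> Optional[timedelta]:
    """Time between 'starts rollbackAll' and 'finished rollbackAll', if both appear."""
    start = end = None
    for line in log.splitlines():
        if "DMLProc starts rollbackAll" in line:
            start = log_time(line, year)
        elif "DMLProc finished rollbackAll" in line:
            end = log_time(line, year)
    if start is None or end is None:
        return None
    return end - start


# The DMLProc lines from the log excerpt above.
log = """\
Oct 23 15:53:46 ip-172-30-0-152 DMLProc[14026]: 46.224661 |0|0|0| I 20 CAL0002: DMLProc starts rollbackAll.
Oct 23 15:53:46 ip-172-30-0-152 DMLProc[14026]: 46.235930 |0|0|0| I 20 CAL0002: DMLProc will rollback 0 tables.
Oct 23 16:03:35 ip-172-30-0-152 DMLProc[14026]: 35.587322 |0|0|0| I 20 CAL0002: DMLProc finished rollbackAll.
"""

elapsed = rollback_duration(log, 2017)
print(elapsed)  # 0:09:49
```

That is 9 minutes 49 seconds between the start and finish messages, with 0 tables to roll back.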



 Comments   
Comment by Daniel Lee (Inactive) [ 2017-10-31 ]

Build tested: GitHub source 1.0.12-1

/root/columnstore/mariadb-columnstore-server
commit a42eb6d1e74e44c9e8fd9bb8290e6ce7dbf909f5
Merge: 2965fc8 6a14ced
Author: David.Hall <david.hall@mariadb.com>
Date: Tue Oct 3 10:12:33 2017 -0500

Merge pull request #69 from mariadb-corporation/MCOL-940

MCOL-940

/root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine
commit 28d26c89018faa3ec02fd49559b2fb53e6847e97
Merge: a8414b9 5ab7538
Author: Andrew Hutchings <andrew@linuxjedi.co.uk>
Date: Thu Oct 26 20:22:27 2017 +0300

Merge pull request #304 from mariadb-corporation/MCOL-979-1.0

MCOL-979 getNullValueByType() should return string for all char types

user=root

1. Installed a 2um2pm stack with mysqlrep off
2. Enabled mysqlrep
3. Verified mysqlrep is working
4. Suspended um1; the UM failed over to um2

DMLProc remained in BUSY_INIT state.

5. Resumed um1; um1 came back up

mysqld on um1 remained AUTO_OFFLINE

Comment by Daniel Lee (Inactive) [ 2017-10-31 ]

I don't know how long I waited, but the two processes mentioned above eventually returned to the ACTIVE state.
