Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Version: 6.1.1
- Component/s: None
- Sprint: 2021-15, 2021-16, 2021-17
Description
There are known customer installations that do not use shared storage, so the failover mechanism might break such clusters.
There must be a knob in the cmapi configuration file to disable the failover facility if needed.
The following changes have been made:
- Added an [application] section with an auto_failover = False parameter to the default cmapi_server.conf.
- Failover is now turned off by default, even if there is no [application] section or no auto_failover parameter in cmapi_server.conf.
- Failover now has three distinct logical states:
- Turned off: no failover thread is started. To turn it on, set auto_failover = True in the [application] section of the cmapi_server.conf file on each node and restart cmapi (see the sketch below).
- Turned on and inactive: the failover thread exists but is not doing any work. It becomes active automatically when the node count is >= 3.
- Turned on and active: the failover thread is running and activated. It is deactivated automatically if the node count drops below 3.
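A minimal sketch of enabling failover on one node, assuming the [application] section is not already present in /etc/columnstore/cmapi_server.conf; the same change has to be applied on every node:

# Append the [application] section that enables failover (skip this if the section already exists)
cat >> /etc/columnstore/cmapi_server.conf <<'EOF'
[application]
auto_failover = True
EOF

# Restart cmapi so the new setting is picked up
systemctl restart mariadb-columnstore-cmapi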
Activity
dleeyh
Build tested: ColumnStore Engine (build 3561)
CMAPI (585)
Test cluster: 3-node cluster
Reproduced the reported issue in 6.2.2-1, with CMAPI 1.6 as released.
Non-shared storage (local dbroot)
With auto_failover=False, failover did not occur when PM1 was suspended.
With auto_failover=True, this is a misconfiguration, since non-shared storage is used. When PM1 was suspended, I expected failover to occur and the cluster to end up in a non-operational state, but failover did not occur. Was that because CMAPI detected non-shared storage and did not kick off the failover process, or did failover simply not occur?
Glusterfs
With auto_failover=False, failover did not occur when PM1 was suspended.
With auto_failover=True, I expected failover to occur, with PM2 taking over as the master node. It did not happen.
The same test worked in 6.1.1 and 6.2.2.
David.Hall, I moved this to testing. Was that incorrect?
Is the action item for alan.mologorsky instead?
alan.mologorsky, I moved this to testing by mistake.
Please post the status.
dleeyh, question: did all 3 scenarios work in a previous version? Which one?
alan.mologorsky, please see Daniel's request for steps.
Note that we are discussing a possible regression in the overall failover functionality in this release, not the actual change that Alan made on this ticket.
toddstoffel, should MaxScale be involved in this?
Eventually we will need to review this in Sky as well, petko.vasilev.
Build tested: 6.3.1-1 (#4101), cmapi 1.6 (#580), 3-node glusterfs
alexey.vorovich Yes, it was tested before and it worked fine. Just in case, I retested the same build of ColumnStore 6.3.1-1 using an older build of CMAPI.
1. When restarting ColumnStore (mcsShutdown and mcsStart, not a failover situation), PM1 remained the master node. ColumnStore continued to function properly, as expected.
2. In the failover scenario, PM1 also came back as the master node. ColumnStore continued to function properly, as expected.
MariaDB [mytest]> select count(*) from lineitem;
+----------+
| count(*) |
+----------+
|  6001215 |
+----------+
1 row in set (0.186 sec)

MariaDB [mytest]> create table t1 (c1 int) engine=columnstore;
Query OK, 0 rows affected (1.619 sec)

use near 'table t1 values (1)' at line 1

MariaDB [mytest]> insert t1 values (1);
Query OK, 1 row affected (0.159 sec)

MariaDB [mytest]> insert t1 values (2);
Query OK, 1 row affected (0.074 sec)

MariaDB [mytest]> select * from t1;
+------+
| c1   |
+------+
|    1 |
|    2 |
+------+
2 rows in set (0.521 sec)
Guys, I would suggest this:
Start with something that works for at least one of you: "ColumnStore 6.3.1-1 using an older build of CMAPI."
Indeed, Daniel, please create an annotated script in your repo to perform the actions for tests 1, 2, and 3, to minimize the chance of miscommunication.
Then Alan can try to repeat them on the version he believes cannot work, and we will go from there.
And please always use build numbers in notes instead of "latest" and "older build".
I always start my test comments with a line like the following:
Build tested: 6.3.1-1 (#4101), cmapi 1.6 (#580)
When closing a ticket, I use:
Build verified: 6.3.1-1 (#4101), cmapi 1.6 (#580)
The number in () is the build number from Drone.
For example, the CMAPI build that I had issues with was "cmapi 1.6.2 (#612)" and the older one that I retested was "cmapi 1.6 (#580)".
alan.mologorsky
Yes, in the future we will share QA scripts (and they will include k8s commands as well).
For today, test 1 has these 4 steps that you can run one after the other. Please try them with "cmapi 1.6 (#580)" and see if you can confirm what Daniel is seeing (he sees it working and you believe it cannot work). Let's start from there.
1. Set auto_failover to True in /etc/columnstore/cmapi_server.conf on all nodes
2. Run "systemctl restart mariadb-columnstore-cmapi" on all nodes
3. mcsShutdown
4. mcsStart
As a side note: commands that are supposed to be run on multiple separate hosts could be executed via a loop over SSH (see the sketch below) or via kubectl. We will need to decide how to do that in the future. There is also the SkyTf framework for this, created by georgi.harizanov.
Eventually we will integrate multi-node tests with that framework. For now, this is just a heads-up.
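A minimal sketch of such an SSH loop for steps 1 and 2, assuming passwordless root SSH, hypothetical host names pm1/pm2/pm3, and that an auto_failover line already exists in the config; none of these details are specified in this ticket:

# Hypothetical node names; replace with the real hosts of the cluster
NODES="pm1 pm2 pm3"
for node in $NODES; do
  # Step 1: enable failover in cmapi_server.conf on every node
  ssh root@"$node" "sed -i 's/^auto_failover.*/auto_failover = True/' /etc/columnstore/cmapi_server.conf"
  # Step 2: restart cmapi so the setting takes effect
  ssh root@"$node" "systemctl restart mariadb-columnstore-cmapi"
done
# Steps 3 and 4 are run once, from the primary node
mcsShutdown
mcsStart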
I kind of doubt that Rocky and the types of data loaded are important, but they could be.
What is important is to have the same common scripts to install the system to begin with.
If Daniel has these scripts, then Alan should use them. I will try this install script as well.
After these scripts are used, I would start with the older build that works for both and then move to newer builds.
Well, if Alan can reproduce the problem using a separate setup, then good.
However, I would definitely invest in a common installation script.
We have 3 candidates for a common multi-node install script:
- Daniel's QA setup. Needs work, per Daniel, to make it truly standalone.
- Direct MOE that brings up a cluster with pods. Pending this week, I hope and pray.
- Docker Compose from toddstoffel. If this supports shared disk, we could use it to start/test/validate and share an identical setup between different people, short term and maybe long term as well.
Todd, do you agree?
The link leads to step 2 of a 9-step procedure. Many of these steps require the user to execute commands on each host. This takes a lot of time and leaves many opportunities for error.
By contrast, the Direct MOE will allow something along these lines:
moe -topology CS -replicas 3 -nodetype verylarge
This will create all the nodes, S3, NFS, config files, etc.
I suspect the Docker Compose approach is simple to use; Todd will clarify.
My real concern is debugging: how can one do symbolic debugging inside Docker? There are tools for that as well (at least for Python).
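One such option, as a sketch only: run the Python process under debugpy inside the container and attach an IDE from the host. The image name, script name, and port below are hypothetical placeholders, not taken from this ticket, and debugpy is assumed to be installed in the image:

# Inside the container: start the target script under debugpy and wait for a debugger to attach
docker run -p 5678:5678 my-cmapi-image \
  python3 -m debugpy --listen 0.0.0.0:5678 --wait-for-client app.py
# On the host: attach an IDE debugger (for example VS Code) to localhost:5678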
alan.mologorsky, can you point me to a repo where your scripts are so I can have a look?
alan.mologorsky, did I summarize the meeting correctly? If so, please reverse the default and pass this to dleeyh so that we can move on.
toddstoffel Here is a suggestion/question from gdorman: what if we ALWAYS require MaxScale to be present to enable HA? This is currently the case for Sky.
This would reduce the number of options. Before we discuss this in dev, what is our take from the PM point of view?
alexey.vorovich As a result, we decided that I will ask toddstoffel offline about the default behavior and wait for his decision.
Build tested: 6.3.1-1 (#4234), CMAPI-1.6.3 (619)
Cluster: 3-node
Test #1, PASSED
Default auto_failover value
In /etc/columnstore/cmapi_server.conf, auto_failover is set to True by default.
------
Test #2, PASSED
Setup: no shared storage
auto_failover: False
This is the use case in which the user does not have a shared storage setup and failover is not desired.
Failover did not occur
------
Test #3, PASSED
Setup: gluster
auto_failover: False
This is the use case in which the user has a glusterfs setup and failover IS NOT desired.
Failover did not occur
------
Test #4, FAILED
Setup: gluster
auto_failover: True
This is the use case in which the user has a glusterfs setup and failover IS desired.
Observation:
When PM1, which is the master node, was taken offline:
mcsStatus on PM2 showed PM2 as the master, PM3 as slave (2-node cluster)
MaxScale showed PM2 as the master, PM1 was down
So far, this is expected
When PM1 was put back online:
mcsStatus showed that PM1 eventually became the master node again,
but MaxScale showed that PM2 should be the master node.
It was expected that ColumnStore would set the master according to what MaxScale selected, but this did not happen, so ColumnStore and MaxScale are now out of sync.
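A quick way to compare the two views, as a sketch (it assumes maxctrl is available on the MaxScale host; node and server names will differ per setup):

# On each ColumnStore node: which node does cmapi/DBRM consider the master?
mcsStatus | grep '"dbrm_mode"'
# On the MaxScale host: which server does MaxScale consider the master?
maxctrl list servers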
------
At the time of this writing, the fixVersion of the ticket has been set to cmapi-1.6.3, but the package has been named for 1.6.2, e.g. MariaDB-columnstore-cmapi-1.6.2.x86_64.rpm. The package name should be corrected.
Guys,
1. I tend to agree that the default file section created at a new install should be empty.
2. Let's go back to Test #4, which failed, in https://jira.mariadb.org/browse/MCOL-4939?focusedCommentId=219832&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-219832
I am trying to reproduce it myself, but results have been inconclusive so far.
dleeyh and toddstoffel
Besides the discrepancy between MCS and MXS with respect to the master node choice, what issues with DDL/DML updates do we observe?
Please list what has been found. My understanding is that MaxScale will direct updates to PM2.
Also, Daniel, for whatever symptoms we see, please confirm in which old release we did not see them.
alan.mologorsky dleeyh I opened a new ticket, https://jira.mariadb.org/browse/MCOL-5052, for that mismatch discussion.
The only remaining item here is for Alan and is described above.
Build tested: 6.3.1-1 (#4234), CMAPI-1.6.3-1 (#623)
Preliminary test results for failover behavior. More functional tests will be done.
3-node cluster, with gluster, schema replication, MaxScale
For each of the following tests, a newly installed 3-node cluster is used.
Test #1
Default installation; the auto_failover parameter has been removed from /etc/columnstore/cmapi_server.conf, so the default behavior is auto failover enabled.
Failover now works the same way as it used to. When PM1 was put back online, PM2 remained the master node, in sync with MaxScale.
Test #2
On each node, added the following to /etc/columnstore/cmapi_server.conf
[application]
auto_failover = False

and restarted cmapi:

systemctl restart mariadb-columnstore-cmapi
mcsStatus on all three (3) nodes showed that there is only one (1) node (pm1) in the cluster; pm2 and pm3 are no longer part of the cluster. The output was like the following:
[rocky8:root~]# mcsStatus
{
    "timestamp": "2022-04-14 00:43:02.932548",
    "s1pm1": {
        "timestamp": "2022-04-14 00:43:02.938951",
        "uptime": 1149,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [],
        "module_id": 1,
        "services": [
            {
                "name": "workernode",
                "pid": 9290
            },
            {
                "name": "controllernode",
                "pid": 9301
            },
            {
                "name": "PrimProc",
                "pid": 9317
            },
            {
                "name": "ExeMgr",
                "pid": 9365
            },
            {
                "name": "WriteEngine",
                "pid": 9382
            },
            {
                "name": "DDLProc",
                "pid": 9413
            }
        ]
    },
    "num_nodes": 1
}
I tried the same test again and all nodes returned something like the following:
[rocky8:root~]# mcsStatus
{
    "timestamp": "2022-04-14 01:46:02.956786",
    "s1pm1": {
        "timestamp": "2022-04-14 01:46:02.963366",
        "uptime": 1631,
        "dbrm_mode": "offline",
        "cluster_mode": "readonly",
        "dbroots": [],
        "module_id": 1,
        "services": []
    },
    "num_nodes": 1
}
Failover was not tested since there is only one node in the cluster now.
Test #3
On each node, added the following to /etc/columnstore/cmapi_server.conf
[application]
auto_failover = True

and restarted cmapi:

systemctl restart mariadb-columnstore-cmapi
I got the same result as in Test #1 above.
Build verified: ColumnStore 6.3.1-1 (#4278), cmapi (#625)
Following the steps above and using the new cmapi build, test #2 worked as expected: failover did not take place, as it is disabled in the cmapi_server.conf file.
Build verified: ColumnStore 6.3.1-1 (#4299), cmapi 1.6.3 (#626)
The cmapi package name has been corrected from 1.6.2 to 1.6.3: MariaDB-columnstore-cmapi-1.6.3-1.x86_64.rpm.
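A quick way to confirm the installed package name and version on a node (a sketch; the exact package name can vary by platform and packaging):

# Query the installed cmapi package on an RPM-based system
rpm -q MariaDB-columnstore-cmapi
# Expected output similar to: MariaDB-columnstore-cmapi-1.6.3-1.x86_64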
Verified along with the latest build of ColumnStore. Created a 3-node docker cluster.
4QA: Previously there was no way to disable failover for clusters with >= 3 nodes. This particularly affects clusters with non-shared storage.