MXS-4983: Schemarouter monitoring requirements are not documented clearly

Details

    Description

      When we have a sharding cluster with the SchemaRouter, we might want to drain a shard or set it to maintenance (for example, to upgrade the database or the server, including a reboot). For this I would have expected drain/maintenance to work:
      maxctrl list servers
      ┌────────┬──────────────┬──────┬─────────────┬─────────────────┬──────────────┬─────────────────┐
      │ Server │ Address      │ Port │ Connections │ State           │ GTID         │ Monitor         │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼──────────────┼─────────────────┤
      │ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Master, Running │ 0-3364-69817 │ MariaDB-Monitor │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼──────────────┼─────────────────┤
      │ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running         │ 0-3365-63166 │ MariaDB-Monitor │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼──────────────┼─────────────────┤
      │ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running         │ 0-3366-6902  │ MariaDB-Monitor │
      └────────┴──────────────┴──────┴─────────────┴─────────────────┴──────────────┴─────────────────┘
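
      For context, a minimal MaxScale configuration for such a setup might look roughly like the sketch below. The server addresses, ports and the monitor name are taken from the output above; the service name, listener and credentials are placeholders for illustration only.

      # Shard servers (addresses and ports as listed above)
      [shard2]
      type=server
      address=10.139.158.1
      port=3364

      [shard3]
      type=server
      address=10.139.158.1
      port=3365

      [shard4]
      type=server
      address=10.139.158.1
      port=3366

      # Monitor as used in this report
      [MariaDB-Monitor]
      type=monitor
      module=mariadbmon
      servers=shard2,shard3,shard4
      # placeholder credentials
      user=maxuser
      password=maxpwd
      monitor_interval=2s

      # SchemaRouter service and listener (names are hypothetical)
      [Sharded-Service]
      type=service
      router=schemarouter
      servers=shard2,shard3,shard4
      user=maxuser
      password=maxpwd

      [Sharded-Listener]
      type=listener
      service=Sharded-Service
      protocol=MariaDBClient
      port=3306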

      maxctrl set server shard2 drain
      maxctrl set server shard2 maintenance

      But I am getting an error:

      Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `PUT servers/shard2/set?state=drain`
      {
          "errors": [
              {
                  "detail": "The server is primary, so it cannot be set in maintenance or draining mode. First perform a switchover and then retry the operation."
              }
          ]
      }

      As I wrote before, I do not understand the concept of a "Master" in a sharding system.

      So when I try to switch over the master role to another shard, I get the following error:

      maxctrl call command mariadbmon switchover MariaDB-Monitor shard3 shard2
      Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `POST maxscale/modules/mariadbmon/switchover?MariaDB-Monitor&shard3&shard2`
      {
          "links": {
              "self": "http://127.0.0.1:8989/v1/maxscale/modules/mariadbmon/switchover/"
          },
          "meta": {
              "errors": [
                  {
                      "detail": "'shard3' is not a valid promotion target for switchover because it is not replicating from 'shard2'."
                  },
                  {
                      "detail": "Switchover cancelled."
                  }
              ]
          }
      }

      So I am a bit clueless now about how I should proceed...

      @Markus: I wrote everything down for a blog article; if you want to review it before publishing, please let me know...


          Activity

            Oli Sennhauser added a comment -

            These do NOT help either:
            maxctrl call command mariadbmon switchover-force MariaDB-Monitor shard3 shard2
            maxctrl call command mariadbmon failover MariaDB-Monitor

            Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `POST maxscale/modules/mariadbmon/switchover-force?MariaDB-Monitor&shard3&shard2`
            {
                "links": {
                    "self": "http://127.0.0.1:8989/v1/maxscale/modules/mariadbmon/switchover-force/"
                },
                "meta": {
                    "errors": [
                        {
                            "detail": "'shard3' is not a valid promotion target for switchover because it is not replicating from 'shard2'."
                        },
                        {
                            "detail": "Switchover cancelled."
                        }
                    ]
                }
            }

            Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `POST maxscale/modules/mariadbmon/failover?MariaDB-Monitor`
            {
                "links": {
                    "self": "http://127.0.0.1:8989/v1/maxscale/modules/mariadbmon/failover/"
                },
                "meta": {
                    "errors": [
                        {
                            "detail": "Can not select 'shard2' as a demotion target for failover because it is a running master."
                        },
                        {
                            "detail": "Failover cancelled."
                        }
                    ]
                }
            }

            markus makela added a comment -

            I think that this is actually a limitation of sorts that's mainly in the monitor. The intention of preventing maintenance on the Master node is to avoid making the cluster unwritable. I think that for the schemarouter we'll need some form of an override that allows this to be set even if it's a Master node.

            markus makela added a comment - edited

            One workaround might be to actually use galeramon to monitor the nodes instead of mariadbmon.

            This is what the state of a normal async replication cluster looks like when monitored by galeramon. As long as it's not a Galera cluster, it should be possible to set all the nodes into maintenance mode.

            ┌─────────┬───────────┬──────┬─────────────┬─────────┬──────────┬─────────────────┐
            │ Server  │ Address   │ Port │ Connections │ State   │ GTID     │ Monitor         │
            ├─────────┼───────────┼──────┼─────────────┼─────────┼──────────┼─────────────────┤
            │ server1 │ 127.0.0.1 │ 3000 │ 0           │ Running │ 0-3000-9 │ MariaDB-Monitor │
            ├─────────┼───────────┼──────┼─────────────┼─────────┼──────────┼─────────────────┤
            │ server2 │ 127.0.0.1 │ 3001 │ 0           │ Running │ 0-3000-9 │ MariaDB-Monitor │
            ├─────────┼───────────┼──────┼─────────────┼─────────┼──────────┼─────────────────┤
            │ server3 │ 127.0.0.1 │ 3002 │ 0           │ Running │ 0-3000-9 │ MariaDB-Monitor │
            └─────────┴───────────┴──────┴─────────────┴─────────┴──────────┴─────────────────┘
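
            A minimal sketch of what the monitor section might look like if it were switched from mariadbmon to galeramon for this kind of setup; the monitor name, credentials and interval below are assumptions, only the module and server list matter here:

            # Hypothetical monitor section using galeramon instead of mariadbmon
            [Shard-Monitor]
            type=monitor
            module=galeramon
            servers=shard2,shard3,shard4
            # placeholder credentials
            user=maxuser
            password=maxpwd
            monitor_interval=2s

            With no Master state assigned by the monitor, `maxctrl set server shard2 maintenance` should then be accepted.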
            

            Oli Sennhauser added a comment - edited

            Thanks for the hint. I will try the other monitor...

            But then the documentation should be adapted here:
            https://mariadb.com/kb/en/mariadb-maxscale-6-simple-sharding-with-two-servers/

