MXS-4983: Schemarouter monitoring requirements are not documented clearly

Details

    Description

      When we have a sharding cluster with the SchemaRouter, we might want to drain a shard or set it to maintenance (for example, to upgrade the database or the server, including a reboot). For this I would have expected drain/maintenance to work:
      maxctrl list servers
      ┌────────┬──────────────┬──────┬─────────────┬─────────────────┬──────────────┬─────────────────┐
      │ Server │ Address      │ Port │ Connections │ State           │ GTID         │ Monitor         │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼──────────────┼─────────────────┤
      │ shard2 │ 10.139.158.1 │ 3364 │ 1           │ Master, Running │ 0-3364-69817 │ MariaDB-Monitor │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼──────────────┼─────────────────┤
      │ shard3 │ 10.139.158.1 │ 3365 │ 1           │ Running         │ 0-3365-63166 │ MariaDB-Monitor │
      ├────────┼──────────────┼──────┼─────────────┼─────────────────┼──────────────┼─────────────────┤
      │ shard4 │ 10.139.158.1 │ 3366 │ 1           │ Running         │ 0-3366-6902  │ MariaDB-Monitor │
      └────────┴──────────────┴──────┴─────────────┴─────────────────┴──────────────┴─────────────────┘
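
      For context, a minimal MaxScale configuration for such a setup might look roughly like the sketch below. The server addresses, ports and the monitor name are taken from the output above; the service name, listener and credentials are placeholders for illustration only.

      # Shard servers (addresses and ports as listed above)
      [shard2]
      type=server
      address=10.139.158.1
      port=3364

      [shard3]
      type=server
      address=10.139.158.1
      port=3365

      [shard4]
      type=server
      address=10.139.158.1
      port=3366

      # Monitor as used in this report
      [MariaDB-Monitor]
      type=monitor
      module=mariadbmon
      servers=shard2,shard3,shard4
      # placeholder credentials
      user=maxuser
      password=maxpwd
      monitor_interval=2s

      # SchemaRouter service and listener (names are hypothetical)
      [Sharded-Service]
      type=service
      router=schemarouter
      servers=shard2,shard3,shard4
      user=maxuser
      password=maxpwd

      [Sharded-Listener]
      type=listener
      service=Sharded-Service
      protocol=MariaDBClient
      port=3306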

      maxctrl set server shard2 drain
      maxctrl set server shard2 maintenance

      But I am getting an error:

      Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `PUT servers/shard2/set?state=drain`
      {
          "errors": [
              {
                  "detail": "The server is primary, so it cannot be set in maintenance or draining mode. First perform a switchover and then retry the operation."
              }
          ]
      }

      As I wrote before, I do not understand the concept of a "Master" in a sharding system.

      So when I try to switch over the master role to another shard, I get the following error:

      maxctrl call command mariadbmon switchover MariaDB-Monitor shard3 shard2
      Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `POST maxscale/modules/mariadbmon/switchover?MariaDB-Monitor&shard3&shard2`
      {
          "links": {
              "self": "http://127.0.0.1:8989/v1/maxscale/modules/mariadbmon/switchover/"
          },
          "meta": {
              "errors": [
                  {
                      "detail": "'shard3' is not a valid promotion target for switchover because it is not replicating from 'shard2'."
                  },
                  {
                      "detail": "Switchover cancelled."
                  }
              ]
          }
      }

      So I am a bit clueless now about how I should proceed...

      @Markus: I wrote everything down for a blog article; if you want to review it before publishing, please let me know...


          Activity

            Oli Sennhauser added a comment -

            These do NOT help either:
            maxctrl call command mariadbmon switchover-force MariaDB-Monitor shard3 shard2
            maxctrl call command mariadbmon failover MariaDB-Monitor

            Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `POST maxscale/modules/mariadbmon/switchover-force?MariaDB-Monitor&shard3&shard2`
            {
                "links": {
                    "self": "http://127.0.0.1:8989/v1/maxscale/modules/mariadbmon/switchover-force/"
                },
                "meta": {
                    "errors": [
                        {
                            "detail": "'shard3' is not a valid promotion target for switchover because it is not replicating from 'shard2'."
                        },
                        {
                            "detail": "Switchover cancelled."
                        }
                    ]
                }
            }

            Error: Server at http://127.0.0.1:8989 responded with 400 Bad Request to `POST maxscale/modules/mariadbmon/failover?MariaDB-Monitor`
            {
                "links": {
                    "self": "http://127.0.0.1:8989/v1/maxscale/modules/mariadbmon/failover/"
                },
                "meta": {
                    "errors": [
                        {
                            "detail": "Can not select 'shard2' as a demotion target for failover because it is a running master."
                        },
                        {
                            "detail": "Failover cancelled."
                        }
                    ]
                }
            }

            markus makela added a comment -

            I think that this is actually a limitation of sorts that's mainly in the monitor. The intention of preventing maintenance on the Master node is to avoid making the cluster unwritable. I think that for the schemarouter we'll need some form of an override that allows this to be set even if it's a Master node.

            markus makela added a comment - edited

            One workaround might be to actually use galeramon to monitor the nodes instead of mariadbmon.

            This is what the state of a normal async replication cluster looks like when monitored by galeramon. As long as it's not a Galera cluster, it should be possible to set all the nodes into maintenance mode.

            ┌─────────┬───────────┬──────┬─────────────┬─────────┬──────────┬─────────────────┐
            │ Server  │ Address   │ Port │ Connections │ State   │ GTID     │ Monitor         │
            ├─────────┼───────────┼──────┼─────────────┼─────────┼──────────┼─────────────────┤
            │ server1 │ 127.0.0.1 │ 3000 │ 0           │ Running │ 0-3000-9 │ MariaDB-Monitor │
            ├─────────┼───────────┼──────┼─────────────┼─────────┼──────────┼─────────────────┤
            │ server2 │ 127.0.0.1 │ 3001 │ 0           │ Running │ 0-3000-9 │ MariaDB-Monitor │
            ├─────────┼───────────┼──────┼─────────────┼─────────┼──────────┼─────────────────┤
            │ server3 │ 127.0.0.1 │ 3002 │ 0           │ Running │ 0-3000-9 │ MariaDB-Monitor │
            └─────────┴───────────┴──────┴─────────────┴─────────┴──────────┴─────────────────┘
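
            A minimal sketch of what the monitor section might look like if it were switched from mariadbmon to galeramon for this kind of setup; the monitor name, credentials and interval below are assumptions, only the module and server list matter here:

            # Hypothetical monitor section using galeramon instead of mariadbmon
            [Shard-Monitor]
            type=monitor
            module=galeramon
            servers=shard2,shard3,shard4
            # placeholder credentials
            user=maxuser
            password=maxpwd
            monitor_interval=2s

            With no Master state assigned by the monitor, `maxctrl set server shard2 maintenance` should then be accepted.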
            

            Oli Sennhauser added a comment - edited

            Thanks for the hint. I will try the other monitor...

            But then the documentation should be adapted here:
            https://mariadb.com/kb/en/mariadb-maxscale-6-simple-sharding-with-two-servers/

