Uploaded image for project: 'MariaDB MaxScale'
  1. MariaDB MaxScale
  2. MXS-1310

schemarouter ignores local copy of duplicate schemas on JOIN

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.1.3
    • Fix Version/s: 2.1.5
    • Component/s: schemarouter
    • Labels:
      None
    • Environment:
      Ubuntu 14.04

      Description

      To resolve scaling issues we have a legacy database that we're sharding on client schemas across physical hosts.

      Original database layout:
      common_data
      more_common_data
      clientdb1
      clientdb2
      clientdb3
      ...
      clientdb99

      In our lab environment we've done the following:

      Central Masters:
      common_data
      more_common_data

      Shard1:
      common_data (replicated from centraldb)
      more_common_data (replicated from centraldb)
      clientdb1
      clientdb2

      Shard2:
      common_data (replicated from centraldb)
      more_common_data (replicated from centraldb)
      clientdb3
      clientdb4

      Our problem occurs when a query like the following is run:

      select `dateCreated`, `displayName`, `newsDetails`, `relatedToID`, `relatedToType`, `officeName` from `common_data`.`news_reel` inner join `user` on `recordCreatedBy` = `user`.`userID` inner join `office` on `recordCreatedByOffice` = `office`.`officeID` where `news_reel`.`clientID` = 6 and `news_reel`.`departmentType` = 1 and `news_reel`.`dateCreated` >= 2017-07-03 order by `news_reel`.`dateCreated` desc limit 4
      

      The initial connection connects to clientdb4 on shard 2 then attempts to run the above query. However the router ignores the presence of common_data on shard2 and routes to the first live server in the list, thus breaking the query as the clientdb4 schema that's expected to be on the other side of the JOIN, no longer exists.

      Queries involving joins should at least attempt the query in the DSN before any routing decisions are made. Perhaps an additional configuration directive, replicated databases/replicated databases regex similar to ignore databases, but that lets us make the schemarouter aware that it's ok there's X copies and in the case of at least reads/joins, use our shard's copy.

      We're anxious to leverage MaxScale in this scenario for our live platform and it's almost perfect, but being unable to select/join on common data is a show stopper.

      Route hints won't help as we've no easy way of knowing which shard we want at execution time, as the client shards could be anywhere and as discussed on IRC with Markus, controlling this in app logic is the wrong place to do it.

        Attachments

          Activity

            People

            Assignee:
            markus makela markus makela
            Reporter:
            Ninpo Alex Boag-Munroe
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved:

                Git Integration