[MXS-1310] schemarouter ignores local copy of duplicate schemas on JOIN Created: 2017-07-04  Updated: 2017-07-06  Resolved: 2017-07-06

Status: Closed
Project: MariaDB MaxScale
Component/s: schemarouter
Affects Version/s: 2.1.3
Fix Version/s: 2.1.5

Type: Bug Priority: Blocker
Reporter: Alex Boag-Munroe Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None
Environment:

Ubuntu 14.04



 Description   

To resolve scaling issues we have a legacy database that we're sharding on client schemas across physical hosts.

Original database layout:
common_data
more_common_data
clientdb1
clientdb2
clientdb3
...
clientdb99

In our lab environment we've done the following:

Central Masters:
common_data
more_common_data

Shard1:
common_data (replicated from centraldb)
more_common_data (replicated from centraldb)
clientdb1
clientdb2

Shard2:
common_data (replicated from centraldb)
more_common_data (replicated from centraldb)
clientdb3
clientdb4

Our problem occurs when a query like the following is run:

select `dateCreated`, `displayName`, `newsDetails`, `relatedToID`, `relatedToType`, `officeName` from `common_data`.`news_reel` inner join `user` on `recordCreatedBy` = `user`.`userID` inner join `office` on `recordCreatedByOffice` = `office`.`officeID` where `news_reel`.`clientID` = 6 and `news_reel`.`departmentType` = 1 and `news_reel`.`dateCreated` >= 2017-07-03 order by `news_reel`.`dateCreated` desc limit 4

The initial connection connects to clientdb4 on shard 2 then attempts to run the above query. However the router ignores the presence of common_data on shard2 and routes to the first live server in the list, thus breaking the query as the clientdb4 schema that's expected to be on the other side of the JOIN, no longer exists.

Queries involving joins should at least attempt the query in the DSN before any routing decisions are made. Perhaps an additional configuration directive, replicated databases/replicated databases regex similar to ignore databases, but that lets us make the schemarouter aware that it's ok there's X copies and in the case of at least reads/joins, use our shard's copy.

We're anxious to leverage MaxScale in this scenario for our live platform and it's almost perfect, but being unable to select/join on common data is a show stopper.

Route hints won't help as we've no easy way of knowing which shard we want at execution time, as the client shards could be anywhere and as discussed on IRC with Markus, controlling this in app logic is the wrong place to do it.



 Comments   
Comment by markus makela [ 2017-07-04 ]

The query explicitly uses only one database, common_data. In addition to this explicit database, it also implicitly uses another database if a default database has been selected with a COM_INIT_DB or a USE <database> query. The routing should take the active database into consideration if tables with explicit and implicit databases are used in a query.

In practice this means that any query that uses implied databases must be routed to the server with the currently active database even if it refers to a table that is defined on another server. This will cause correct behavior with tables replicated to shards.

Comment by markus makela [ 2017-07-05 ]

I've created packages that contain a fix which routes any query that implicitly uses the current database to the shard which contains this database. You can find the packages here: http://max-tst-01.mariadb.com/ci-repository/MXS-1310/mariadb-maxscale/

The packages were built from commit 0c33572a96eb9c296efa567e0028f853fb3bc09a.

Comment by markus makela [ 2017-07-05 ]

I've built a new set of packages that add a new parameter that allows for deterministic conflict resolution when a database exists on more than one database. The documentation for the new parameter can be found here: https://github.com/mariadb-corporation/MaxScale/blob/MXS-1310/Documentation/Routers/SchemaRouter.md#preferred_server

The packages can be found here: http://max-tst-01.mariadb.com/ci-repository/MXS-1310-jul5/

The packages were built from commit 9618e63b5ff50613a23d3f0e628f0c2259dcea3e

Generated at Thu Feb 08 04:05:46 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.