[MXS-1657] Use authentication fails on backend servers with a large number of grants Created: 2018-02-08  Updated: 2018-02-21  Resolved: 2018-02-21

Status: Closed
Project: MariaDB MaxScale
Component/s: Authenticator
Affects Version/s: 2.1.13
Fix Version/s: N/A

Type: Bug Priority: Minor
Reporter: Augusto Bott Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: None
Environment:

Ubuntu, DEB packages from MariaDB.org


Issue Links:
Relates
relates to MXS-306 User authentication fails when using ... Closed
relates to MXS-356 Connection timeouts for authenticatio... Closed

 Description   

While similar to MXS-306 (and possibly related to MXS-356), this doesn't seem to be the same bug. We had two different environments on which we're trying to use MaxScale for security purposes (by locking down access from some machines to specific users, via dbfwfilter).

TCPdump shows the following query running from the MaxScale node to a MySQL backend:

SELECT u.user, u.host, d.db, u.select_priv, u.password
FROM mysql.user AS u 
LEFT JOIN mysql.db AS d
ON (u.user = d.user AND u.host = d.host) WHERE u.user NOT IN ('root')
UNION
SELECT u.user, u.host, t.db, u.select_priv, u.password
FROM mysql.user AS u LEFT JOIN mysql.tables_priv AS t
ON (u.user = t.user AND u.host = t.host) WHERE u.user NOT IN ('root');

On server A, the query above returns 1620 rows and everything works fine. On server B, the same query (when run it manually, on the mysql CLI), it returns 2485 rows.

As I said before, for server A, it loads the authenticantion information just fine, but for backend server B, we get the following in the maxscale logs:

2018-02-08 14:38:46   notice : Started MaxScale log flusher.
2018-02-08 14:38:49   error  : [MySQLAuth] Failed to load users: Lost connection to MySQL server during query
2018-02-08 14:38:49   info   : [MySQLAuth] Added user: INSERT OR REPLACE INTO mysqlauth_users VALUES ('maxuser', '%', NULL, 1, 'A1328B81CED7AA9EFFAF8CC6DB1D0C540CDED1DA')
2018-02-08 14:38:49   info   : [MySQLAuth] Added user: INSERT OR REPLACE INTO mysqlauth_users VALUES ('maxuser', 'localhost', NULL, 1, 'A1328B81CED7AA9EFFAF8CC6DB1D0C540CDED1DA')
2018-02-08 14:38:49   notice : [MySQLAuth] [Firewall Service DEV] No users were loaded but 'inject_service_user' is enabled. Enabling service credentials for authentication until database users have been successfully loaded.

This seems to be related to the internal, in-memory, sqlite database (or memory allocation allowed for it): it works 'fine' for relatively small/simpler complexity for credentials, but fails with larger sets.



 Comments   
Comment by markus makela [ 2018-02-08 ]

The error message states why the loading of the users failed.

2018-02-08 14:38:49   error  : [MySQLAuth] Failed to load users: Lost connection to MySQL server during query

I'd recommend adding query_retries=2 under the [maxscale] section to see if the query retry feature helps.

Comment by Augusto Bott [ 2018-02-08 ]

Thanks for your prompt response. We did try adding query_retries=X to our maxscale.cnf (where X ranged from 2 to 5), but no dice: it still fails to load users from server B (2485 rows returned by that query above, in the description - all works fine for server A, with 1620 rows being loaded).

What did the trick for us was the following patch (against branch 2.1.13, BTW):

diff --git a/server/modules/authenticator/MySQLAuth/dbusers.c b/server/modules/authenticator/MySQLAuth/dbusers.c
index eb72bba..67e558c 100644
--- a/server/modules/authenticator/MySQLAuth/dbusers.c
+++ b/server/modules/authenticator/MySQLAuth/dbusers.c
@@ -59,7 +59,7 @@ static bool get_hostname(DCB *dcb, char *client_hostname, size_t size);
 static char* get_new_users_query(const char *server_version, bool include_root)
 {
     const char* password = strstr(server_version, "5.7.") ? MYSQL57_PASSWORD : MYSQL_PASSWORD;
-    const char *with_root = include_root ? "" : "WHERE u.user NOT IN ('root')";
+    const char *with_root = include_root ? "" : "WHERE u.user NOT IN ('root') AND u.user LIKE 'crew%'";
 
     size_t n_bytes = snprintf(NULL, 0, NEW_LOAD_DBUSERS_QUERY, password, with_root, password, with_root);
     char *rval = MXS_MALLOC(n_bytes + 1);
lines 1-13/13 (END)

This time, 31 and 73 users/credentials are (pre) loaded by MaxScale (from servers A and B, respectively) and everything works fine (since for this particular application we're only interested on those users).

We'll still trying to figure out what could be wrong and we're currently working with two hypothesis on our side:

  • there might be some malformed/invalid string somewhere in the authentication/credentials data for server B
  • it might be simply a matter some memory allocation limit being reached

We'll post more details as we have them.
Thanks for your time.

Comment by Augusto Bott [ 2018-02-09 ]

Ok - so we just tested one of the hypothesis mentioned above by artificially injecting 6000+ users on a mysql.user table via a stored procedure on a MariaDB 10.2 instance and... it worked: MaxScale loaded all those users just fine (this sorta rules out our 2nd hypothesis). We'll keep digging for the needle in the haystack (some malformed/invalid string somewhere) and report back when we find it so such a fringe case can be possibly addressed on a future release.
Thanks.

Comment by markus makela [ 2018-02-09 ]

As the monitors in MaxScale use the MariaDB Connector-C, it could also be a bug in that component. To test whether this affects newer versions of the connector, you could try to test with the 2.2.1 beta version of MaxScale. It upgrades the connector to the 3.0 version whereas the 2.1 version of MaxScale uses the 2.3 version of the connector.

Comment by Augusto Bott [ 2018-02-11 ]

Hi there - just downloaded MaxScale 2.2.1 from http://goo.gl/U43755 (maxscale-2.2.1.ubuntu.xenial.tar.gz) and deployed on our test server: no dice (still getting "Lost connection to MySQL server during query" - only for that particular server). The connector upgrade doesn't seem to be making any difference. We're left with diff'ing the auth credentials and shall post here when/if we find the culprit.
Thanks.

Comment by Augusto Bott [ 2018-02-14 ]

Still no luck in finding the culprit (that is, we're still working on the hypothesis of having a specific combination of an 'old' version of MySQL with MaxScale). We'll post news as we find them. Thanks.

Comment by Augusto Bott [ 2018-02-20 ]

Just a quick status update on this one: we've narrowed down the possiblities down to the fact the problematic server runs MySQL v5.1. We've moved on with MaxScale v2.2.1 (again, compiled with the patch above) for now, and plan to upgrade that server to a newer version (the bug doesn't occur with that same auth/credential information on MySQL v5.5 onwards or MariaDB v10.2 and up).
Feel free to close this one as we won't be pursuing fixing such old setup on our end.
Thanks for your time.

Comment by markus makela [ 2018-02-21 ]

OK, thanks for letting us know.

Generated at Thu Feb 08 04:08:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.