[MDEV-6901] [PATCH] session.gtid_domain_id should be able to be set by non-SUPER user Created: 2014-10-21  Updated: 2022-02-09

Status: Stalled
Project: MariaDB Server
Component/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: Daniel Black Assignee: Unassigned
Resolution: Unresolved Votes: 3
Labels: None

Attachments: File MDEV-6901.patch    
Issue Links:
Relates
relates to MDEV-8017 current_user() in definer event/trigg... Closed
relates to MDEV-26706 Enable per table gtid domain_id Open

 Description   

Using different gtid_domain_id are a way to allow a multi purpose database server to define a set of different gtids to assist in parallel replication. Unfortunately since gtid_domain_id can only be set by a super user this restricts its use. It would be useful to have an allocation of gtid_domain_id per user to facilitate application assisted parallel replication.

https://mariadb.com/kb/en/mariadb/documentation/replication/standard-replication/global-transaction-id/#gtid_domain_id (doesn't list it as SUPER only)

mysql>  set session gtid_domain_id=59;
ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER privilege(s) for this operation

If having multiple users set the same gtid_domain_id is risky for some reason consider:
a) an entry in the mysql.user table that defines the default gtid_domain_id for that user; and optionally extra
b) a min/max gtid_domain_id values in the users table to prevent conflicts (min/max available as readonly variable) where a non-super user can only set gtid_domain_id within this range.

b) would be useful if an application support a set of operations (e.g. db cache, batch payments) that it knows are independent.



 Comments   
Comment by Daniel Black [ 2014-10-22 ]

given my confusion on MDEV-6900 this bug/feature request is invalid too. the feature a) mentioned can be implemented in stored procedures with an init_connect and good db admin can work out b)

Comment by Arjen Lentz [ 2014-10-22 ]

Daniel - well it can, but I think the question posed by this issue is valid: why can't a session change its own gtid_domain_id? Generally speaking, aconnect may change its own session variables, but requires SUPER for global variables.

Comment by Daniel Black [ 2014-10-22 ]

quite right Arjen. Withdraw my self declaration of invalidity.

Comment by Daniel Black [ 2014-11-19 ]

The longer, less functional, workaround for the time being http://openquery.com.au/blog/optimising-multithreaded-replication

Comment by Daniel Black [ 2014-11-24 ]

Attached patch adds functionality.

The mysql.user table now has 3 extra columns.

added the following to grant option DOMAIN_ID_DEFAULT, DOMAIN_ID_MIN and DOMAIN_ID_MAX (which isn't getting parsed - got stuck stuck on this bit).

The following read only variables where added:

gtid_domain_id_default,gtid_domain_id_min,gtid_domain_id_max.

Migration is clean as the default mysql.user domain_id_default,domain_id_min,domain_id_max are all 0 the same as the current gtid_domain_id.

Comment by Kristian Nielsen [ 2014-11-24 ]

@Arjen: About the reason for non-privileged user not being allowed to set
@@SESSION.domain_id: A DBA might not want untrusted users to change it.

One problem is that changing domain_id causes the slave GTID position to
become multi-dimensional - one position per domain ever used. This might not
be desired.

If domain-based parallel replication is used, setting domain_id also makes it
possible to break replication, by assigning conflicting transactions different
domain_id. For example:

SET gtid_domain_id=1;
UPDATE t1 SET a=1 WHERE id=42;
SET gtid_domain_id=2;
UPDATE t1 SET a=2 WHERE id=42;

On a parallel slave, the UPDATES might run in the opposite order, causing
incorrect data.

So domain_id is restricted to prevent such things. However, it is clear that
some mechanism is needed for non-privileged user to change domain_id, it
just hasn't been done yet. This MDEV is a reasonable suggestion for one way...

Comment by Kristian Nielsen [ 2014-11-24 ]

So this idea looks reasonable, with a couple minor adjustments.

DOMAIN_ID_DEFAULT, DOMAIN_ID_MIN, DOMAIN_ID_MAX should be INT UNSIGNED NULL
DEFAULT NULL.

If DOMAIN_ID_DEFAULT is non-NULL, then upon authentication,
@@SESSION.gtid_domain_id is set to this value, else it has no effect.

Similarly, if DOMAIN_ID_MIN or DOMAIN_ID_MAX are NULL, then setting
@@SESSION.gtid_domain_id requires SUPER, else that range is valid for
non-provileged user.

I wonder if the names should be GTID_DOMAIN_ID_DEFAULT, to match the system
variable? But I'll leave that up to your preference.

I would like someone else (probably Serg?) to comment on the use of GRANT /
mysql.user for this purpose, as well as upgrade/downgrade issues, as this is
not something I am familiar with.

Otherwise looks pretty useful. Something like this is definitely needed.

Comment by Pavel Ivanov [ 2014-11-24 ]

If you end up including this into MariaDB, please add a flag that will turn off this functionality completely. I.e. with a flag set (or unset) I want for these additional fields in mysql.user to have no effect, additional GRANT syntax to give error and any non-SUPER user not being able to change its session domain_id.

Comment by Kristian Nielsen [ 2014-11-24 ]

@Pavel: Can you elaborate? You are probably the most experienced user of GTID,
so your input is especially valuable.

The additional functionality would not have effect unless explicitly GRANTed,
which already requires privileged user.

Is it that you want to use GTID in a way that you can ignore the domain_id
facility, so that the replication stream is always a single strictly ordered
sequence? This is a valid use-case, I believe. It might be reasonable to have
something to enforce that, this might then apply to other places as well...

Or do you have other concerns about this? Eg. do you see a use case for
multiple domain ids (eg. parallel replication), but envision a better way to
control the domain ids than suggested here?

Comment by Pavel Ivanov [ 2014-11-24 ]

We have an environment where we allow users basically to do with MySQL anything they want, except a few things we consider dangerous. One of those things is SUPER privilege, another thing is to be able to affect replication in any way. So ability of users to change their domain_id would be a non-starter for us. But as part of "users can do anything" they can modify mysql.user table directly any way they want and execute FLUSH PRIVILEGES statement after that (we have a special code . So we need a way for the new fields DOMAIN_ID_DEFAULT, DOMAIN_ID_MIN, DOMAIN_ID_MAX to not have any effect no matter what's written in them. The most logical way for this seem to be to have a flag that turns off this feature completely.

Comment by Daniel Black [ 2014-11-25 ]

> On a parallel slave, the UPDATES might run in the opposite order, causing
incorrect data.

Happy to leave the misuse of gtid_domain_id within a range by users as a consequence for them. We'd only grant them if they understood this.

> I wonder if the names should be GTID_DOMAIN_ID_DEFAULT, to match the system
variable?

I'm happy to make these consistent.

> So we need a way for the new fields DOMAIN_ID_DEFAULT, DOMAIN_ID_MIN, DOMAIN_ID_MAX to not have any effect no matter what's written in them

I think I've written the code so those columns don't need to be there and it will revert to existing behaviour. It does limit future expansion of the mysql.user table as a number of things directly expect fields an order with fixed offsets here.

A option for pivanof may be for the special code that does flush privileges to ensure these fields are NULL/0/explicitly set from global gtid_domain_id.

Another option is remove privileges on the mysql.user domain_id* columns.

Another option is to put these fields in a different table.

Or have session variables domain_id_min/max able to be set by SUPER only and use a "CREATE PROCEDURE setdomain(IN cuser varchar(140)) DETERMINISTIC READS SQL DATA SQL SECURITY DEFINER" as super to set this (like our prototype http://openquery.com.au/blog/optimising-multithreaded-replication). Deficiencies (which could be overcome) include - no init_connect for SUPER user, no access to CURRENT_USER() that returns the calling user and not the SECURITY DEFINER and caching these values in the acl cache if deterministic.

I see why you consider the flag as the most logical way, its easy to implement. I'd rather do something else as its a flag that controls the meaning of some columns that aren't normally modifiable by non-SUPER users looks like specialised case and introducing a flag for such a case looks like a software debt. A different table is probably my first preference of an alternate mechanism currently.

Comment by Daniel Black [ 2014-11-25 ]

Another option for pivanof, alter mysql.user to have virtual columns that returns NULL for the specific columns.

Comment by Pavel Ivanov [ 2014-11-25 ]

Among all the options offered I think the only one worth discussing is a different table, but it would be an inconsistent design – if the table is always read (doesn't have a flag to turn it off) and is always in 1:1 relation to mysql.user table, then why that data is separated from mysql.user table at all?

So, I guess if you and Kristian think that a flag is inappropriate here then don't do anything, we'll have to implement the flag ourselves in our tree. At least it's good that I know about this before we tried to upgrade, and it won't be a surprise...

Comment by Sergei Golubchik [ 2014-12-01 ]

pivanof, if users can do anything and that includes modifying mysql.user table directly, how can you prevent them from obtaining a SUPER privilege? Some special code of yours again?

Comment by Pavel Ivanov [ 2014-12-01 ]

Yes, it's a special code that prohibits any users in mysql.user table to have SUPER privilege. We keep all users that need to have SUPER in a separate table with the same structure as mysql.user.
I hope I'll be able to give you a link to this code by the end of this week.

Comment by Pavel Ivanov [ 2014-12-03 ]

Now that our code is opensourced you can look at the special code I was talking about if you wish. It's here https://code.google.com/p/google-mysql/source/list?name=mariadb-10.0.12%2F05-isolation, see the commit "Add parameter for blocking certain user privileges" which depends on the previous one "Add special system users support".

Comment by Daniel Black [ 2014-12-03 ]

Thanks pivanof I'll definitely look at integrating that in some way, especially if it gets merged and alternately using similar checks in a #ifdef xxx to reduce your maintenance.

For the moment I'm waiting on a little more of a conceptual review and perhaps a quick check if I've don't anything dumb in sql_yacc.yy.

Comment by Kristian Nielsen [ 2015-01-16 ]

I think Serg wanted to reply to this, but I guess he has been busy with other
stuff so far (as have I - I apologise for the delay in replying to your
several patches, but things are too busy lately).

Serg and I discussed this briefly, and we both felt that this (adding another
priviliege and new user table columns) does not seem the best way to do
things?

For one, it seems too ad-hoc - there are probably other settings that could be
useful for non-superuser access in some cases, should they also have separate
privileges and columns?

And besides - this method of new privilege does not seem to be the most useful
way for the DBA? One thing that could be really useful would be the ability to
make the domain_id be a hash function of the currently selected database, to
simulate mysql per-database parallelism. Or a hash function of the currently
selected user. Eg. pseudo-code:

SET GLOBAL gtid_domain_id = HASH(CURRENT DATABASE) MOD 10;

or something like that - but the exact syntax needs some more thought to be
flexible and consistent. Maybe just a GLOBAL gtid_domain_id_function variable,
which contains the name of a stored function that sets the value?

Or what if there was a general mechanism to set --init-connect, but on a
per-user basis? Then this could be used to set the domain id for a as desired,
but also be used to control any other setting.

Do you think it is really useful to set a range of allowed domain ids for a
non-superuser? It seems to me that as soon as a user has the ability to set
two different domain ids, that user has the ability to break replication
anyway, by creating conflicting transactions with different domain ids. So not
sure how restricting the range would help in practice.

It also seems that a stored procedure with SQL SECURITY DEFINER could have a
role here? The superuser could define a stored procedure and give a
non-privileged user execute access. Such a procedure could provide the ability
to change the domain_id, with a high degree of flexibility?

Comment by Daniel Black [ 2015-01-16 ]

--init-connect, but on a per-user basis? Then this could be used to set the domain id for a as desired,

but also be used to control any other setting.

The init-connect method is what i used in combination with a user lookup in the blog above and this could also do the same as gtid_domain_id_function.

Do you think it is really useful to set a range of allowed domain ids for a non-superuser? It seems to me that as soon as a user has the ability to set two different domain ids, that user has the ability to break replication anyway, by creating conflicting transactions with different domain ids. So not sure how restricting the range would help in practice.

Yes is only really useful if this is resolved such that breaks only stop the replication of conflicting domains and giving a user the ability to resolve this which I think is about two features beyond what really needs to be considered now.

It also seems that a stored procedure with SQL SECURITY DEFINER...

yes that was done in the blog above too.

I was aiming at a generic interface that allows a Drupal application for instance to keep disparate sections of their code (cache, vs entities vs users vs event logs) to have a different domain-id and hence explicitly paralleling them. Something that is clear and obvious and could be incorporated into the Drupal distribution code as a pseudo data paralleling interface.

So I'll experiment some more with stored procedures, multi source replication (each with domain id filters and the same master) to avoid total replication faults (concept flawed?) maybe. And I'll see how useful that is. Stats on in-order parallel replication will see if there is much to be improved by going explicit.

MDEV-6594 Use separate domain_id for Galera transactions kind of needs to be considered.

Comment by Arjen Lentz [ 2020-08-26 ]

Noting the irony of Dan getting allocated to this issue - he himself reported it from Open Query in 2014...

Comment by Daniel Black [ 2020-08-26 ]

I'm sure it won't be the last one

Comment by Daniel Black [ 2021-08-23 ]

Given this was largely a discussion from years ago, I value the reviews that have taken place. I'll let this issue wait until there's people still in operational roles.

To people that come across this and are interested. Please watch/vote for this issue and perhaps include a few works why its of interest to you.

Generated at Thu Feb 08 07:15:27 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.