[MDEV-32356] Setting gtid_slave_pos is not atomic Created: 2023-10-05  Updated: 2023-11-22

Status: Stalled
Project: MariaDB Server
Component/s: Galera, Replication
Affects Version/s: 10.4
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Jan Lindström Assignee: Kristian Nielsen
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Problem/Incident
causes MDEV-32193 Assertion `state() == s_executing || ... Stalled

 Description   

Consider first a normal master-slave topology with gtid_strict_mode=0, where the user stops the slave and sets:

SET GLOBAL gtid_slave_pos= '1-2-3,2-4-6';

Note that this value could be entirely incorrect, i.e. there might not even be any node with domain_id 1 or 2. The command is executed like this:

rpl_slave_state::record_gtid (this=0x55fb9f8f6c90, thd=0x7f2e80031480, gtid=0x7f2e89efa710, sub_id=2, in_transaction=false, 
    in_statement=true, out_hton=0x7f2e89efa6f8) at /home/jan/work/mariadb/10.4/sql/rpl_gtid.cc:690
#1  0x000055fb9b9ff12e in rpl_slave_state::load (this=0x55fb9f8f6c90, thd=0x7f2e80031480, state_from_master=0x7f2e8003e053 "", len=11, 
    reset=true, in_statement=true) at /home/jan/work/mariadb/10.4/sql/rpl_gtid.cc:1409
#2  0x000055fb9b81d972 in rpl_gtid_pos_update (thd=0x7f2e80031480, str=0x7f2e8003e048 "1-2-3,2-4-6", len=11)
    at /home/jan/work/mariadb/10.4/sql/sql_repl.cc:4728
#3  0x000055fb9b99469a in Sys_var_gtid_slave_pos::global_update (this=0x55fb9d1fde20 <Sys_gtid_slave_pos>, thd=0x7f2e80031480, 
    var=0x7f2e8003dff8) at /home/jan/work/mariadb/10.4/sql/sys_vars.cc:1858
#4  0x000055fb9b6a8c5e in sys_var::update (this=0x55fb9d1fde20 <Sys_gtid_slave_pos>, thd=0x7f2e80031480, var=0x7f2e8003dff8)
    at /home/jan/work/mariadb/10.4/sql/set_var.cc:208
#5  0x000055fb9b6aab8e in set_var::update (this=0x7f2e8003dff8, thd=0x7f2e80031480) at /home/jan/work/mariadb/10.4/sql/set_var.cc:837
#6  0x000055fb9b6aa7f0 in sql_set_variables (thd=0x7f2e80031480, var_list=0x7f2e80036360, free=true)
    at /home/jan/work/mariadb/10.4/sql/set_var.cc:740
#7  0x000055fb9b7db3f1 in mysql_execute_command (thd=0x7f2e80031480) at /home/jan/work/mariadb/10.4/sql/sql_parse.cc:5047
#8  0x000055fb9b7e5303 in mysql_parse (thd=0x7f2e80031480, rawbuf=0x7f2e8003de68 "SET GLOBAL gtid_slave_pos= '1-2-3,2-4-6'", length=40, 
    parser_state=0x7f2e89efb300, is_com_multi=false, is_next_command=false) at /home/jan/work/mariadb/10.4/sql/sql_parse.cc:8012
#9  0x000055fb9b7e499d in wsrep_mysql_parse (thd=0x7f2e80031480, rawbuf=0x7f2e8003de68 "SET GLOBAL gtid_slave_pos= '1-2-3,2-4-6'", length=40, 
    parser_state=0x7f2e89efb300, is_com_multi=false, is_next_command=false) at /home/jan/work/mariadb/10.4/sql/sql_parse.cc:7814
#10 0x000055fb9b7d0979 in dispatch_command (command=COM_QUERY, thd=0x7f2e80031480, 
    packet=0x7f2e8004fa01 "SET GLOBAL gtid_slave_pos= '1-2-3,2-4-6'", packet_length=40, is_com_multi=false, is_next_command=false)
    at /home/jan/work/mariadb/10.4/sql/sql_parse.cc:1843
#11 0x000055fb9b7cf2ce in do_command (thd=0x7f2e80031480) at /home/jan/work/mariadb/10.4/sql/sql_parse.cc:1378
#12 0x000055fb9b975bbe in do_handle_one_connection (connect=0x55fb9fded720) at /home/jan/work/mariadb/10.4/sql/sql_connect.cc:1420
#13 0x000055fb9b97591a in handle_one_connection (arg=0x55fb9fded720) at /home/jan/work/mariadb/10.4/sql/sql_connect.cc:1324
#14 0x000055fb9bf1154b in pfs_spawn_thread (arg=0x55fb9f972430) at /home/jan/work/mariadb/10.4/storage/perfschema/pfs.cc:1869
#15 0x00007f2e97c97ada in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:444
#16 0x00007f2e97d282e4 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Anyway, this is not atomic, because rpl_slave_state::load contains a loop: it takes one GTID at a time and calls rpl_slave_state::record_gtid, where we have

    if (err || (err= ha_commit_trans(thd, FALSE)))
      ha_rollback_trans(thd, FALSE);

The fact that storing these GTIDs is not atomic can cause problems in the following cases:

  • In Galera, we replicate the gtid_slave_pos table to other nodes and assume it uses InnoDB. This replication is required at least in the case where a slave node is configured with skip_slave_start=0 and the node goes down and then starts again.
  • What happens if we have stored the first GTID position and committed that transaction, and then the node crashes?
  • For Galera we need a Galera transaction, so we start one in rpl_slave_state::record_gtid, but we lose that transaction because of ha_commit or ha_rollback. We might be able to fix this by cleaning up the Galera transaction context and starting a new transaction, but that is not optimal, because the GTID position update is still not atomic.


 Comments   
Comment by Kristian Nielsen [ 2023-10-05 ]

I think even without considering Galera, it would be preferable for SET GLOBAL gtid_slave_pos=<...> to be a single transaction.

Looking at the code, it seems this should be simple to achieve:

  • In rpl_slave_state::load(), add an "in_transaction" parameter that is passed into record_gtid() instead of always passing in_transaction=false.
  • In rpl_gtid_pos_update(),
    • First start a transaction.
    • Pass in_transaction=true to rpl_slave_state::load().
    • Finally commit the transaction (or roll it back in case of error).
  • In all other callers of rpl_slave_state::load(), pass in_transaction=false.

Hope this helps,

- Kristian.

Comment by Jan Lindström [ 2023-11-22 ]

knielsen I think the above is not enough. rpl_slave_state::record_gtid does open_and_lock_tables, and we do not want to do that for every GTID we store on SET. For the Galera case, a transaction is naturally needed only for a gtid_slave_pos table using the InnoDB storage engine (I thought there could be more than one of them). I think we need a new function for SET, or a bigger refactoring.

Generated at Thu Feb 08 10:30:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.