MariaDB Server / MDEV-22605

SHOW SLAVE STATUS does not correctly reflect broken replication



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.2.27
    • Fix Version/s: 10.2
    • Component/s: Replication
    • Environment:
      10.1.34 to 10.2.27 bin log pos multi-source replication



      Here is the phenomenon:

      I use multi-source replication, and I have found that one connection stopped replicating data from its master node without showing any error in `show slave status\G` or `show all slaves status\G`.

      The weird thing is that not only do `Slave_IO_Running`/`Slave_SQL_Running` show `Yes`, but `Exec_Master_Log_Pos`/`Read_Master_Log_Pos` also keep increasing. So I went to check the relay log on the slave, and I found that the DDL statements written to the relay log were never executed on the slave node itself, and the server log does not show any error.

      I have tested it several times and I think I have found out how to reproduce it. I believe it is not the same as MDEV-21687 or MDEV-10703. Sorry if there are other issues already describing this that I have not found.

      The master node version I am using is 10.1.34 and the slave is 10.2.27.

      1. create a master and a slave node;
      2. use `mysqldump` to copy the data and record the master log position;
      3. > `set @@default_master_connection='test-master';`
      4. > `set global replicate_do_db='xxx';`
      5. > `change master to 'xxxxxx'...`
      6. > `start slave 'test-master';`
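
      A minimal sketch of steps 3–6 on the slave. The host, credentials, binlog file name, and position below are hypothetical placeholders standing in for the values elided as `'xxxxxx'` in the report:

      ```sql
      -- Point the session at the named connection; subsequent CHANGE MASTER /
      -- replicate_do_db then apply to 'test-master' rather than the default ''.
      SET @@default_master_connection = 'test-master';
      SET GLOBAL replicate_do_db = 'mydb';          -- 'mydb' is a placeholder

      CHANGE MASTER TO
        MASTER_HOST     = '10.0.0.1',               -- hypothetical master host
        MASTER_USER     = 'repl',                   -- hypothetical repl user
        MASTER_PASSWORD = 'secret',
        MASTER_LOG_FILE = 'mysql-bin.000001',       -- position recorded in step 2
        MASTER_LOG_POS  = 4;

      START SLAVE 'test-master';
      ```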

      Everything is okay by now

      7. add `test-master.replicate_do_db='xxx'` to my.cnf/mysqld.cnf;
      8. systemctl restart mariadb.service
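
      For step 7, the config fragment looks like the following sketch; `mydb` stands in for the database name elided as `'xxx'` in the report:

      ```ini
      # my.cnf / mysqld.cnf — per-connection replication filter for the
      # multi-source connection named 'test-master' (placeholder db name).
      [mysqld]
      test-master.replicate_do_db = mydb
      ```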

      9. > show all slaves status\G

      The slave replication has actually stopped, but the status still shows `Yes`, and `Exec_Master_Log_Pos` keeps incrementing.

      I think the key is the `connection_name`. I have tried `reset slave 'connection_name' all`, then recreated the master connection under a different name and started the slave from the original position (the data was in fact not growing in the first place). That does fix the non-replicating connection whose old `connection_name` was left behind. But if you do not use `reset slave 'connection_name' all` to clean up all the state left behind before starting a new connection that replicates the same database, or if you simply add `connection_name.replicate_do_db='xxxxx'` to your config file and restart the slave, the problem reappears.
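
      The workaround described above can be sketched as follows; the new connection name, host, credentials, and position are hypothetical placeholders, not values from the report:

      ```sql
      -- Drop the stale connection state entirely, then recreate the
      -- connection under a new name from the originally recorded position.
      STOP SLAVE 'test-master';
      RESET SLAVE 'test-master' ALL;   -- removes all state for this connection

      SET @@default_master_connection = 'test-master-2';  -- new connection name
      CHANGE MASTER TO
        MASTER_HOST     = '10.0.0.1',
        MASTER_USER     = 'repl',
        MASTER_PASSWORD = 'secret',
        MASTER_LOG_FILE = 'mysql-bin.000001',             -- original position
        MASTER_LOG_POS  = 4;
      START SLAVE 'test-master-2';
      ```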




            Assignee: Andrei Elkin
            Reporter: rockid zhang


