[MDEV-22605] SHOW SLAVE STATUS does not correctly reflect broken replication - Jira

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 10.2.27
Fix Version/s: 10.2(EOL)
Component/s: Replication
Labels:
- replication
Environment:
Master 10.1.34 replicating to slave 10.2.27, binary log position based, multi-source replication

Description

MariaDB SHOW SLAVE STATUS does not correctly reflect broken replication

Here is what I observed:

I use multi-source replication, and I found that one connection had stopped replicating data from the master node, yet neither `show slave status\G` nor `show all slaves status\G` showed any error.

The weird thing is that not only do `Slave_IO_Running` and `Slave_SQL_Running` show `Yes`, but `Exec_Master_Log_Pos` and `Read_Master_Log_Pos` also keep increasing. So I checked the relay log on the slave and found that DDL statements that had been written to the relay log were never executed on the slave itself, and the server log does not show any error.
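The symptom above means that the two checks a monitor would typically run against `show all slaves status\G` both pass. A minimal Python sketch of those checks follows, assuming the vertical output has been captured as text. The helper names and the sample values are hypothetical and for illustration only; they were not captured from the affected server.

```python
# Illustrative (made-up) snapshots of vertical status output, taken a few seconds apart.
SAMPLE_BEFORE = """
*************************** 1. row ***************************
              Connection_name: test-master
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Relay_Master_Log_File: mysql-bin.000001
          Exec_Master_Log_Pos: 1000
"""

SAMPLE_AFTER = SAMPLE_BEFORE.replace("Exec_Master_Log_Pos: 1000",
                                     "Exec_Master_Log_Pos: 2500")


def parse_vertical_status(text):
    """Split vertical status output into one dict per row (one per connection)."""
    rows = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("***"):
            # Row separator such as "*** 1. row ***": start a new record.
            current = {}
            rows.append(current)
        elif current is not None and ":" in stripped:
            # Split on the first colon only, so values may contain colons.
            key, _, value = stripped.partition(":")
            current[key.strip()] = value.strip()
    return rows


def reports_healthy(row):
    """True when both replication threads are reported as running."""
    return (row.get("Slave_IO_Running") == "Yes"
            and row.get("Slave_SQL_Running") == "Yes")


def exec_position_advanced(before, after):
    """True when the executed master position moved between two snapshots."""
    if before.get("Relay_Master_Log_File") != after.get("Relay_Master_Log_File"):
        return True
    return int(after["Exec_Master_Log_Pos"]) > int(before["Exec_Master_Log_Pos"])


if __name__ == "__main__":
    before = parse_vertical_status(SAMPLE_BEFORE)[0]
    after = parse_vertical_status(SAMPLE_AFTER)[0]
    print(before["Connection_name"], reports_healthy(after),
          exec_position_advanced(before, after))
```

In the situation reported here both checks would pass, so they cannot detect the problem on their own; a data-level comparison, such as checking the relay log contents against the slave's tables, is still needed.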

I have tested it several times and I think I may have found how to reproduce it. I believe it is not the same as 21687 or 10703. Sorry if another issue already covers this and I have not found it.

The master node version I am using is 10.1.34 and the slave is 10.2.27.

1. create a master and a slave node;
2. use `mysqldump` to copy the data and record the master log position;
3. > `set @@default_master_connection='test-master'`;
4. > `set global replicate_do_db = 'xxx'`;
5. > change master to 'xxxxxx'...........
6. > `start slave 'test-master'`;

Everything works up to this point.

7. add `test-master.replicate_do_db='xxx'` to my.cnf/mysqld.cnf;
8. `systemctl restart mariadb.service`

9. > show all slaves status\G

Replication on the slave has actually stopped, but `Slave_IO_Running` and `Slave_SQL_Running` still show `Yes`, and `Exec_Master_Log_Pos` keeps incrementing.

I think the key is the connection name. I tried `reset slave 'connection_name' all`, then recreated the master connection under a different name and started the slave from the position where it originally started (the data had not actually been growing in the first place). This fixes the problem for the database that had stopped growing under the old connection name. However, the problem can be reproduced in two ways: if you do not use `reset slave xxx all` to clear the state left over from the old connection and then start a new connection that replicates the same database, or if you just add `connection_name.replicate_do_db='xxxxx'` to your config file and restart the slave.
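The reset-and-recreate workaround described above can be sketched as an ordered list of statements. This is a minimal Python sketch, not a verified fix: the function names and the new connection name are hypothetical, the leading `STOP SLAVE` is an assumption (the report does not mention it, but `RESET SLAVE` normally requires the connection to be stopped), and the `CHANGE MASTER` options are passed through untouched because the report elides them.

```python
def quote_name(name):
    """Quote a connection name as a single-quoted SQL string literal."""
    # Escape backslashes first, then double any embedded single quotes.
    return "'" + name.replace("\\", "\\\\").replace("'", "''") + "'"


def reset_and_recreate(old_name, new_name, change_master_options):
    """Return the workaround statements in the order they would be run.

    change_master_options is inserted verbatim after CHANGE MASTER ... TO;
    the caller supplies it, since the original report does not list it.
    """
    old, new = quote_name(old_name), quote_name(new_name)
    return [
        f"STOP SLAVE {old}",          # assumption: stop before resetting
        f"RESET SLAVE {old} ALL",     # drop all state kept for the old name
        f"CHANGE MASTER {new} TO {change_master_options}",
        f"START SLAVE {new}",
    ]


if __name__ == "__main__":
    # 'test-master-2' is a made-up name; the options string is left elided.
    for statement in reset_and_recreate("test-master", "test-master-2", "..."):
        print(statement + ";")
```

Generating the statements rather than executing them keeps the sketch independent of any particular client library and lets the sequence be reviewed before it is run.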

People

Assignee: Andrei Elkin

Reporter: zhang

Votes: 0

Watchers: 1

Dates

Created: 2020-05-17 16:35

Updated: 2020-05-26 07:08
