[MDEV-22605] SHOW SLAVE STATUS does not correctly reflect broken replication Created: 2020-05-17 Updated: 2020-05-26
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Affects Version/s: | 10.2.27 |
| Fix Version/s: | 10.2 |
| Type: | Bug | Priority: | Major |
| Reporter: | zhang | Assignee: | Andrei Elkin |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | replication | ||
| Environment: | 10.1.34 to 10.2.27, binlog-position-based multi-source replication |
| Description |
MariaDB SHOW SLAVE STATUS does not correctly reflect broken replication.

Here is the phenomenon: I use multi-source replication, and I found a connection that has stopped replicating data from its master node without `show slave status\G` or `show all slaves status\G` reporting any error. The strange part is that not only do `Slave_IO_Running`/`Slave_SQL_Running` show `Yes`, but `Exec_Master_Log_Pos`/`Read_Master_Log_Pos` also keep increasing. So I checked the relay log on the slave and found that DDL statements which had been written to the relay log were never executed on the slave itself, and the server error log shows nothing. I have tested this several times and I believe I have found how to reproduce it. I do not think it is the same as MDEV-21687 or MDEV-10703; sorry if this has already been reported elsewhere and I missed it. The master version I am using is 10.1.34 and the slave is 10.2.27.

1. Create a master and a slave node. Everything is okay by now.
7. Add `test-master.replicate_do_db='xxx'` to my.cnf/mysqld.cnf.
9. Run `show all slaves status\G`. Replication has actually stopped, but the status columns still show `Yes`, and `Exec_Master_Log_Pos` keeps incrementing.

I think the key is the `connection_name`. I tried `reset slave 'connection_name' all`, then re-created the master connection under a different name and started the slave from the position where replication first stalled (the data was not growing in the first place). This does fix the non-replicating database tied to the old connection_name. But if you do not use `reset slave 'connection_name' all` to clean up the old state before starting a new connection replicating the same database, or if you just add `connection_name.replicate_do_db='xxxxx'` to your config file and restart the slave, the problem reproduces.
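To make the reproduction step concrete, here is a minimal sketch of the slave-side configuration involved. The connection name `test-master` and the database name `xxx` are placeholders taken from the report; the option spelling follows MariaDB's convention of prefixing per-connection replication filters with the connection name:

```ini
# my.cnf / mysqld.cnf on the slave (sketch; names are placeholders)
[mysqld]
server_id = 2

# Per-connection replication filter for the multi-source
# connection named 'test-master'. Per the report, adding this
# line and restarting the slave is enough to trigger the silent
# replication stall while SHOW SLAVE STATUS still reports Yes.
test-master.replicate_do_db = xxx
```

The workaround described above corresponds to `RESET SLAVE 'test-master' ALL` to discard the stale connection state, followed by `CHANGE MASTER 'new-name' TO ...` and `START SLAVE 'new-name'` to re-create the connection under a different name from the position where replication first stalled.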