[MDEV-8945] Avoid overloading the master NIC on restarting IO_THREAD on lagging slave. Created: 2015-10-15 Updated: 2015-12-16 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major |
| Reporter: | Jean-François Gagné | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 4 |
| Labels: | None | ||
| Issue Links: |
|
| Description |
|
When GTID slave negotiation is enabled, the relay logs are wiped on starting the IO_THREAD. This might not be a big issue in most cases, but it is very bad on a lagging slave. I recently ran "STOP SLAVE; START SLAVE UNTIL ...;" on a GTID-enabled slave, and it saturated the network interface of the master for more than one hour (this slave was lagging by ~4 days and had more than 250 GB of unprocessed relay logs). I could have been more careful and restarted only the SQL_THREAD (not the IO_THREAD), but there are still situations where restarting the IO_THREAD cannot be avoided (restarting MariaDB, for example). It would be much better to avoid re-downloading the binary logs on starting the IO_THREAD, as most of the relay logs on disk are still good. Some more details in the following: Thanks, JFG |
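For illustration, the more careful restart mentioned in the description could look roughly like the sketch below. It assumes a single default replication connection and uses only the standard `STOP SLAVE` / `START SLAVE` thread-type options. It does not help when the IO_THREAD itself has to be restarted, which is the case this task is about.

```sql
-- Stop only the SQL (applier) thread. The IO thread keeps running,
-- so it is never restarted and the relay logs on disk are kept.
STOP SLAVE SQL_THREAD;

-- Restart only the SQL thread. An UNTIL clause, as in the description,
-- could be appended here; it is omitted because the report elides it.
START SLAVE SQL_THREAD;

-- Check that both threads are running and how far the slave lags
-- (see Slave_IO_Running, Slave_SQL_Running, Seconds_Behind_Master).
SHOW SLAVE STATUS;
```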
| Comments |
| Comment by Elena Stepanova [ 2015-10-17 ] |
|
See also MDEV-4698 |
| Comment by Daniel Black [ 2015-11-18 ] |
|
GTID indexing (MDEV-4991) is probably the first step towards solving this. |
| Comment by Jean-François Gagné [ 2015-11-18 ] |
|
I am not sure I understand how GTID indexing would solve the problem. |
| Comment by Daniel Black [ 2015-11-18 ] |
|
Quite right, GTID indexing only helps in this scenario with finding the offset in the master's first binlog, to the extent that it has already been processed on the slave. |
| Comment by Daniel Black [ 2015-12-16 ] |
|
fyi - stumbled upon https://github.com/percona/percona-server/pull/240 - haven't looked at code |