Description
| Id | User | Host | db | Command | Time | State | Info | Progress |
| 150 | system user | | NULL | Slave_IO | 66361 | Waiting for master to send event | NULL | 0.000 |
| 152 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 153 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 154 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 155 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 156 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 157 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 158 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 159 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 160 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 162 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 161 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 163 | system user | | NULL | Slave_worker | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 151 | system user | | NULL | Slave_SQL | 1090 | Reading event from the relay log | NULL | 0.000 |
| 2415 | root | 127.0.0.1:42392 | NULL | Query | 835 | Killing slave | STOP SLAVE | 0.000 |
Slave machine configuration:
sync_master_info = 500000
sync_relay_log = 100000
sync_relay_log_info = 500000
slave_parallel_max_queued = 67108864
slave_parallel_mode = optimistic
slave_parallel_threads = 12
Is that normal?
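For reference, the settings and thread states quoted above can be re-checked on the slave with standard statements. This is only a minimal sketch: the filter on the Command column assumes the thread names that appear in the listing above, and the second pattern also matches unrelated variables such as sync_binlog.

```sql
-- Show the parallel replication settings (threads, mode, queue size).
SHOW GLOBAL VARIABLES LIKE 'slave_parallel%';

-- Show the sync_* settings; this also lists sync_binlog and similar variables.
SHOW GLOBAL VARIABLES LIKE 'sync_%';

-- List only the replication threads and what each one is currently doing.
SELECT ID, COMMAND, TIME, STATE
FROM information_schema.PROCESSLIST
WHERE COMMAND IN ('Slave_IO', 'Slave_SQL', 'Slave_worker')
ORDER BY ID;
```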
Issue Links
- blocks MDEV-30458 Consolidate Serial Replica to Parallel Replica with 1 Worker Thread (Open)
Indeed, parallel replication can make STOP SLAVE take much longer when sizable transactions are running. The test case below demonstrates this. It updates many rows on the master in a table without a PK, so that RBR is really slow; it then waits until the updates start running on the slave, executes STOP SLAVE, waits until it has finished, and checks what has happened to the contents of the table on the slave.
If the test is run without parallel replication, STOP SLAVE finishes very quickly and the contents of the table remain unchanged; that is, the updates on the slave are interrupted and rolled back rather than committed.
If the test is run with parallel replication, even with slave-parallel-threads=1, STOP SLAVE takes a long time, and at the end the contents of the table are updated; that is, the slave executes the transaction to the end and commits it before stopping.
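One way to compare the two cases above by hand is to switch parallel replication off and on between runs on a disposable test slave. The statements below are a rough sketch rather than part of the test case; they rely on slave_parallel_threads being changeable only while the slave threads are stopped, and the step where the large update is run on the master is left as a comment.

```sql
-- Run 1: serial replication (no worker threads).
STOP SLAVE;
SET GLOBAL slave_parallel_threads = 0;
START SLAVE;
-- ... start the large update on the master, wait until the slave is applying it,
-- then issue STOP SLAVE and note how long it takes ...

-- Run 2: parallel replication with a single worker thread.
STOP SLAVE;
SET GLOBAL slave_parallel_threads = 1;
START SLAVE;
-- ... repeat the same update and time STOP SLAVE again ...
```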
I expect this to be a design choice of parallel replication; hopefully Elkin will check and confirm it (or not).
Note: the test case below is for reproducing only, do not put it into the regression suite!
--source include/have_innodb.inc
--source include/master-slave.inc
--source include/have_binlog_format_row.inc
--sync_slave_with_master
--connection master
--connection slave
--let $show_statement= SHOW PROCESSLIST
--let $field= State
--let $condition= like 'Update_rows_log_event::find_row%'
--source include/wait_show_condition.inc
send stop slave;
--connection slave1
{
}
# Cleanup
--connection slave
--reap
--connection master