[MDEV-16333] Slow replication after upgrade from 10.1.33 to 10.3.7 Created: 2018-05-30  Updated: 2023-04-27

Status: Stalled
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.3.7
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Maikel Punie Assignee: Andrei Elkin
Resolution: Unresolved Votes: 2
Labels: replication
Environment:

centos7 64bit


Attachments: Text File diff.txt     PNG File slave-slow-while-gtid_slave_pos-uses-innodb.png    

 Description   

We have 2 identical slaves running, same hardware and same software.
After upgrading one of the slaves from 10.1.33 to 10.3.7 (48 hours ago now), the upgraded slave (which is set to maintenance mode, so it receives no extra work on top of the replication) cannot keep up with the master server. It is behind, and the backlog is constantly increasing.

Both servers have the exact same hardware and work fine on 10.1.33.
They are neither CPU- nor I/O-bound.
These servers never ran 10.2.
99% of the tables are InnoDB.

Attached is a config diff of the SHOW VARIABLES output.
If anything else is needed, let me know.



 Comments   
Comment by Andrei Elkin [ 2018-05-30 ]

There is a chance this is MDEV-15242, though that requires binlog_format = ROW, while in your attachment the master's global variable is set to STATEMENT. Could you check out this possibility anyway, i.e. whether there are partitioned tables and row-binlog-format events on them?

Comment by Maikel Punie [ 2018-05-30 ]
  • master has binlog_format = STATEMENT
  • we have a couple of partitioned tables, and a lot of inserts happen on these tables.
Comment by Andrei Elkin [ 2018-05-30 ]

Thanks for the partial confirmation. So we have the partitioned tables. As to the actual binlog format, we can't take for granted that the global binlog_format matches the session value of your master's clients. Could you please double-check in what format exactly these INSERTs are binlogged on the master?

Comment by Maikel Punie [ 2018-05-30 ]

How exactly can we check this?

If I just check the binlog on the master, I see the plain SQL statements in there.

Comment by Andrei Elkin [ 2018-05-30 ]

Could you try mysqlbinlog `a-binlog-file` | grep Write_rows ? Also grep for Update_rows and Delete_rows.

A typical snippet containing RBR events looks like this:

# at 329
#180530 17:03:55 server id 1  end_log_pos 371 CRC32 0x5b1bff3b  GTID 0-1-1
/*!100101 SET @@session.skip_parallel_replication=0*//*!*/;
/*!100001 SET @@session.gtid_domain_id=0*//*!*/;
/*!100001 SET @@session.server_id=1*//*!*/;
/*!100001 SET @@session.gtid_seq_no=1*//*!*/;
BEGIN
/*!*/;
# at 371
# at 417
#180530 17:03:55 server id 1  end_log_pos 417 CRC32 0xe4ab6fb5  Annotate_rows:
#Q> insert into t_x set a=1
#180530 17:03:55 server id 1  end_log_pos 463 CRC32 0xdc9694c4  Table_map: `test`.`t_x` mapped to number 49
# at 463
#180530 17:03:55 server id 1  end_log_pos 501 CRC32 0x767e9d85  Write_rows: table id 49 flags: STMT_END_F
 
BINLOG '
S68OWxMBAAAALgAAAM8BAAAAADEAAAAAAAEABHRlc3QAA3RfeAABAwABxJSW3A==
S68OWxcBAAAAJgAAAPUBAAAAADEAAAAAAAEAAf/+AQAAAIWdfnY=
'/*!*/;

Comment by Maikel Punie [ 2018-05-31 ]

There is nothing like that in the binlog ...

[root@einstein data]# mysqlbinlog mysql-bin.001727 | grep Write_rows
[root@einstein data]# mysqlbinlog mysql-bin.001727 | grep Update_rows
[root@einstein data]# mysqlbinlog mysql-bin.001727 | grep Delete_rows
[root@einstein data]# 

Comment by Andrei Elkin [ 2018-05-31 ]

Could you show the processlist on the slave to catch slow events or execution states? It's also possible to get such info from performance_schema if it's configured.
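For instance, a query along these lines (an illustrative sketch, assuming performance_schema=ON and the statement-history consumers enabled; TIMER_WAIT is in picoseconds) would surface the slowest recently executed statements:

```sql
-- Sketch: rank recent statements by execution time to spot slow
-- events applied by the slave worker threads.
SELECT thread_id, event_name,
       timer_wait / 1000000000000 AS seconds,
       LEFT(sql_text, 80) AS sql_text
FROM performance_schema.events_statements_history_long
ORDER BY timer_wait DESC
LIMIT 10;
```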

Comment by Maikel Punie [ 2018-05-31 ]

What exactly do you need? I'll enable the performance schema to get the info

MariaDB [(none)]> show processlist;
+-------+-------------+----------------------+--------------+--------------+-------+-----------------------------------------------+------------------------------------------------------------------------------------------------------+----------+
| Id    | User        | Host                 | db           | Command      | Time  | State                                         | Info                                                                                                 | Progress |
+-------+-------------+----------------------+--------------+--------------+-------+-----------------------------------------------+------------------------------------------------------------------------------------------------------+----------+
|     1 | system user |                      | NULL         | Daemon       |  NULL | InnoDB purge coordinator                      | NULL                                                                                                 |    0.000 |
|     2 | system user |                      | NULL         | Daemon       |  NULL | InnoDB purge worker                           | NULL                                                                                                 |    0.000 |
|     3 | system user |                      | NULL         | Daemon       |  NULL | InnoDB purge worker                           | NULL                                                                                                 |    0.000 |
|     4 | system user |                      | NULL         | Daemon       |  NULL | InnoDB purge worker                           | NULL                                                                                                 |    0.000 |
|     5 | system user |                      | NULL         | Daemon       |  NULL | InnoDB purge worker                           | NULL                                                                                                 |    0.000 |
|     6 | system user |                      | NULL         | Daemon       |  NULL | InnoDB purge worker                           | NULL                                                                                                 |    0.000 |
|     7 | system user |                      | NULL         | Daemon       |  NULL | InnoDB purge worker                           | NULL                                                                                                 |    0.000 |
|     8 | system user |                      | NULL         | Daemon       |  NULL | InnoDB purge worker                           | NULL                                                                                                 |    0.000 |
|     9 | system user |                      | NULL         | Daemon       |  NULL | InnoDB shutdown handler                       | NULL                                                                                                 |    0.000 |
|    94 | root        | 138.203.36.219:56946 | NULL         | Sleep        |    23 |                                               | NULL                                                                                                 |    0.000 |
|   202 | root        | 138.203.36.219:57494 | NULL         | Sleep        |    38 |                                               | NULL                                                                                                 |    0.000 |
| 13551 | root        | 138.203.240.20:38886 | mysql        | Sleep        |     4 |                                               | NULL                                                                                                 |    0.000 |
| 16918 | system user |                      | NULL         | Slave_IO     | 99511 | Waiting for master to send event              | NULL                                                                                                 |    0.000 |
| 16920 | system user |                      | regressiondb | Slave_worker |    58 | Waiting for prior transaction to commit       | COMMIT                                                                                               |    0.000 |
| 16921 | system user |                      | regressiondb | Slave_worker |    58 | Updating                                      | UPDATE testAndSuite_status SET registered=0 WHERE testAndSuite_id IN (880023, 880024, 880022, 880021 |    0.000 |
| 16922 | system user |                      | NULL         | Slave_worker |    58 | Waiting for prior transaction to commit       | NULL                                                                                                 |    0.000 |
| 16923 | system user |                      | NULL         | Slave_worker |    54 | Waiting for prior transaction to commit       | NULL                                                                                                 |    0.000 |
| 16924 | system user |                      | NULL         | Slave_worker |    56 | Waiting for prior transaction to commit       | NULL                                                                                                 |    0.000 |
| 16926 | system user |                      | regressiondb | Slave_worker |    49 | Waiting for prior transaction to commit       | COMMIT                                                                                               |    0.000 |
| 16925 | system user |                      | regressiondb | Slave_worker |    53 | Waiting for prior transaction to commit       | COMMIT                                                                                               |    0.000 |
| 16927 | system user |                      | NULL         | Slave_worker |    53 | Waiting for prior transaction to commit       | NULL                                                                                                 |    0.000 |
| 16919 | system user |                      | NULL         | Slave_SQL    | 21272 | Waiting for room in worker thread event queue | NULL                                                                                                 |    0.000 |
| 22039 | root        | localhost            | NULL         | Sleep        |    28 |                                               | NULL                                                                                                 |    0.000 |
| 22042 | root        | localhost            | NULL         | Query        |     0 | Init                                          | show processlist                                                                                     |    0.000 |
+-------+-------------+----------------------+--------------+--------------+-------+-----------------------------------------------+------------------------------------------------------------------------------------------------------+----------+

Comment by Andrei Elkin [ 2018-05-31 ]

Perhaps we have got what we need.
The worker thread 16921 has been updating that partitioned table for 58 seconds. Probably the UPDATE was quicker on the master, and likewise on the non-upgraded slave. This would indicate https://bugs.mysql.com/bug.php?id=73648.

Could you proceed from here and compare how long it actually takes to execute such an UPDATE on the master and on the upgraded slave?

Comment by Maikel Punie [ 2018-05-31 ]

That is not a partitioned table, but the table is a single file of almost 300 GB on disk.

Comment by Andrei Elkin [ 2018-05-31 ]

Thanks for clarifying that it's not partitioned. The UPDATE is nevertheless slow. We can see it running alone while the other workers wait for it to finish. If the pasted SHOW PROCESSLIST snapshot is typical (UPDATEs on this big table are slow), it would make sense to compare UPDATE execution times between the slaves.

Comment by Maikel Punie [ 2018-05-31 ]

If I run these manually, the speed is about the same for the 2 servers:
about 0.2 seconds.

Comment by Andrei Elkin [ 2018-05-31 ]

0.2 seconds on which two servers, the two slaves? Have you also benchmarked the UPDATE by feeding it manually to the slow upgraded slave?

Comment by Andrei Elkin [ 2018-05-31 ]

Generally, I could suspect rollbacks taking place on the slave because of the OPTIMISTIC parallel mode.
Check out

show global status like 'Com_rollback';
show global status like 'Handler_rollback';

to estimate the rollback contribution to the slowness.
If the values are rather high, you can work around it with SET @@global.slave_parallel_mode=CONSERVATIVE.
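Since the absolute counter values accumulate over the server's whole uptime, sampling the counter twice gives a better estimate of the current rollback rate (a simple sketch; the 60-second interval is arbitrary):

```sql
SHOW GLOBAL STATUS LIKE 'Handler_rollback';  -- note the value
DO SLEEP(60);
SHOW GLOBAL STATUS LIKE 'Handler_rollback';  -- (new - old) / 60 = rollbacks/sec
```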

Comment by Maikel Punie [ 2018-06-01 ]

Indeed, there are a lot of rollbacks happening

MariaDB [(none)]> show global status like '%_rollback';
+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Com_rollback               | 0      |
| Com_xa_rollback            | 0      |
| Handler_rollback           | 225715 |
| Handler_savepoint_rollback | 0      |
+----------------------------+--------+
4 rows in set (0.001 sec)

I switched to CONSERVATIVE on the 10.3 slave.

Those rollbacks are also happening on the 10.1 slave:

MariaDB [(none)]> show global status like '%_rollback';
+----------------------------+---------+
| Variable_name              | Value   |
+----------------------------+---------+
| Com_rollback               | 0       |
| Com_xa_rollback            | 0       |
| Handler_rollback           | 2403122 |
| Handler_savepoint_rollback | 0       |
+----------------------------+---------+
4 rows in set (0.06 sec)

Comment by Andrei Elkin [ 2018-06-01 ]

Let's see how CONSERVATIVE does. Since you say the 10.1 slave can keep up, we may suspect that the cost of rollback has grown in 10.3. I doubt that, though; better to consult with marko.

Secondly, sorry to repeat the question: could you please confirm/clarify the 0.2 sec execution time of a manually fed UPDATE (like the one from SHOW PROCESSLIST) on the slaves?

Comment by Maikel Punie [ 2018-06-01 ]

So the 0.2 seconds is for a query similar to the one in the processlist.

CONSERVATIVE is not helping.

Comment by Andrei Elkin [ 2018-06-01 ]

So we have to dig in and explain the 0.2 seconds vs. the 58 seconds in your SHOW PROCESSLIST.
If you could create and share a binlog whose replaying on 10.3 is slow, it would be extremely helpful.

Comment by Maikel Punie [ 2018-06-01 ]

I have a file like that, but it's too big to attach to this ticket.
I'll upload it somewhere else and link to it here.

Comment by Andrei Elkin [ 2018-06-01 ]

Thank you!

We will need the table definitions as well. If you have tested the file, I would appreciate it arranged, ideally, as a relay log (I sense we don't need the master and the slave IO thread), or set up with a "dumb" master server that doesn't hold the actual data, which would roughly match my environment.
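One way to test such a file without a live master, sketched here with hypothetical host/credential details (the binlog name is the one mentioned earlier in this ticket), is to decode it with mysqlbinlog and pipe it straight into the upgraded server while timing the run:

```shell
# Sketch: replay a captured binlog against the otherwise idle upgraded
# server and time how long applying the SQL takes.
time mysqlbinlog mysql-bin.001727 | mysql -h upgraded-slave -u root -p
```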

Comment by Marko Mäkelä [ 2018-06-04 ]

The cost of rolling back an update or delete may have slightly increased in MariaDB Server 10.3 due to MDEV-13697 (a follow-up fix of MDEV-12288).

Comment by Maikel Punie [ 2018-08-29 ]

OK, digging this item back up:
it seems that the problem is not the slow replication after all.

I have 2 slaves with the exact same data; one runs 10.1.35 and the other 10.3.9.
If I run the same query on both servers, the 10.1.35 one takes about 0.5 seconds to return the data.
The second server, running 10.3.9, never returns data for the same query; it keeps hanging in "Sending data" (still running after 2 hours).

Both systems are idle; they only have replication running, and the data comes from the same master server. When the test started, both were at about the same GTID.

Comment by Andrei Elkin [ 2018-12-20 ]

I think this case should be closed on the basis of can't-reproduce. The reporter was asked to provide a piece of binlog data that could be replayed to witness the slowness, if any, but nothing has been provided.

Comment by Andrei Elkin [ 2018-12-20 ]

Reopening after having found the latest comments.

Comment by Andrei Elkin [ 2018-12-20 ]

Maikel Punie, if I understood correctly, the slowness is on the 10.3.9 server, and it can be caused by running replication? In contrast, 10.1.35 responds fine while serving the same slave duty.

Could you provide details of the query that hangs this way?

Comment by Maikel Punie [ 2019-07-17 ]

This was not reproducible after a few weeks, so I am not sure what happened. This report can be closed.

Comment by Sergei Golubchik [ 2019-08-19 ]

Let's give it one more month, just to be sure.
If we don't hear from you during that time, we'll close it.

Comment by david schumacher [ 2019-09-18 ]

hi all and @Maikel Punie ,

I experienced comparable behaviour when switching from MySQL 5.1 to MariaDB 10.3.17-0+deb10u1 (Debian 10), kernel 4.9.110-3+deb9u6 (Debian 9).
Replication with MariaDB 10.3.17 or 10.1.37-MariaDB-0+deb9u1 was multiple times slower on SSDs than before with MySQL 5.1 on spinning disks.

As a workaround, I changed the table mysql.gtid_slave_pos from engine=innodb to engine=myisam on the slave (vs-db-misc-multidb-02), and replication was immediately as fast as normal; see the attached graph of the behaviour while an hourly cronjob inserts data:

Just guessing at the moment: something is slowing down InnoDB transactions (in general, or only for gtid_slave_pos?) dramatically.
I have not done more research on this yet.
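The engine change described above amounts to the following (a sketch, not a recommendation: replication should be stopped while the engine is changed, and a MyISAM gtid_slave_pos is not crash-safe, so the slave position can be lost on a crash):

```sql
STOP SLAVE;
ALTER TABLE mysql.gtid_slave_pos ENGINE=MyISAM;
START SLAVE;
```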

environment:
master: vs-db-misc-multidb-01
slave: vs-db-misc-multidb-02
100% MyISAM (exception before the change: mysql.gtid_slave_pos)
MariaDB 10.3.17-0+deb10u1 inside LXC (host and guest: Debian 10)
innodb-config: debian defaults
myisam-config: key_buffer_size = 4Gb

Parallel-replication-specific settings (I experimented with different settings; they changed more or less nothing regarding replication speed while gtid_slave_pos was InnoDB):
MariaDB [(none)]> show variables like '%parallel%';
+-------------------------------+------------+
| Variable_name                 | Value      |
+-------------------------------+------------+
| skip_parallel_replication     | OFF        |
| slave_domain_parallel_threads | 20         |
| slave_parallel_max_queued     | 32001072   |
| slave_parallel_mode           | aggressive |
| slave_parallel_threads        | 20         |
| slave_parallel_workers        | 20         |
+-------------------------------+------------+

(slave_parallel_max_queued was increased until "Waiting for room in worker thread event queue" vanished from Slave_SQL_Running_State.)

Any idea which information could be helpful, or what to try to gain useful insights?

thanks & regards,
david

Comment by Andrei Elkin [ 2019-09-18 ]

david_schumacher: what table type do the replicated inserts go into?

Notice that aggressive mode may involve "internal" rollbacks, and then the MyISAM engine of gtid_slave_pos would fire back with a duplicate-key error at replay.
If it's a pure insert load, though, no rollback can occur.

Comment by david schumacher [ 2019-09-20 ]

hi,

I tested using BLACKHOLE, MyISAM and InnoDB for the insert table, and even with BLACKHOLE, replication is extremely delayed as long as gtid_slave_pos uses InnoDB.

An important correction, sorry for that; maybe this behaviour is completely unrelated to Maikel's findings, or Maikel upgraded his kernel in the meantime:
I just noticed that the host kernels are different (now corrected in my first post): both slaves, on different physical servers, use 4.9.110-3+deb9u6 (Debian 9), while the master uses 4.19.67-2 (Debian 10). The hardware of all three servers is the same (4 SATA SSDs in md-raid10, XFS with the discard mount option). After moving one slave to the master host, InnoDB performance was back to normal. After moving back to the older kernel, performance dropped again.

Side test: 10.1.37-MariaDB-0+deb9u1 on the same hardware with the older kernel suffers the same low InnoDB performance as 10.3.17-0+deb10u1, as long as the global innodb_flush_log_at_trx_commit is set to 1 (see below).

If gtid_slave_pos is MyISAM and only the insert table is InnoDB, all inserts into this insert table are extremely slow too (25 minutes for 140k inserts, while MyISAM takes 18 seconds for the same inserts). slave_parallel_mode was conservative and slave_parallel_threads/workers = 2 for other tests; that changed nearly nothing.

I tested direct inserts into the slave without using replication, and the results are comparable, so I think it's a pure InnoDB problem and not replication-specific, and it only happens on kernel 4.9.110-3+deb9u6, but with the older MariaDB too. MyISAM performance and I/O benchmarks show identical results on both kernels.

After I set innodb_flush_log_at_trx_commit to 2 or 0, inserts into the InnoDB table were as fast as usual, for both gtid_slave_pos and the normal insert table; the test inserts took 41 seconds, which matches what I saw in other cases comparing MyISAM vs. InnoDB performance.
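For reference, the setting change amounts to the following; note the durability trade-off: with a value of 2 or 0 the redo log is flushed roughly once per second instead of at every commit, so up to about a second of committed transactions can be lost on a crash.

```sql
-- Flush the InnoDB redo log about once per second
-- instead of fsyncing on every commit.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
```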

Different values for innodb_use_native_aio and innodb_flush_method did not change much.

For me this is solved too, but as long as I run these older kernels, I can do tests if needed.

cheers,
david

Generated at Thu Feb 08 08:28:07 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.