[MDEV-17490] MariaDB 10.3 slave fails to replicate from a MariaDB 10.2 master - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Incomplete
Affects Version/s: 10.2.17, 10.2.18, 10.3.10, 10.8.5, 10.4.28
Fix Version/s: N/A
Component/s: Replication, Server
Labels:
- binlog
- replication
- slave
Environment:
CentOS 7.x

Description

We have a 10.2.18 master and 4 slaves, all running 10.2.18. Because of a bug that I reported last week (https://jira.mariadb.org/browse/MDEV-17420), we upgraded one of the replicas to 10.3.10 to see whether the bug was still present in that version.

After the upgrade, the slave began to sync binary log events from the master, but after a few seconds it stopped with the following errors:

mysqld[15138]: 2018-10-17  9:22:28 156 [ERROR] Slave IO thread did not receive an expected Rows-log end-of-statement for event starting at log 'main.010755' position 89390527 whose last block was seen at log 'main.010755' position 89390527. The end-of-statement should have been delivered before the current one at log 'main.010755' position 89390610

mysqld[15138]: 2018-10-17  9:22:28 156 [ERROR] Slave I/O: Relay log write failure: could not queue event from master, Internal MariaDB error code: 1595

We checked the binary log on the master and it does not seem to be corrupted; the other 3 replicas (still on 10.2.18) are working fine. The relay log on the slave also seems fine.

The slave is stopped with the following status:

mainro [(none)]> show slave status\G
*************************** 1. row ***************************
               Master_Log_File: main.010755
           Read_Master_Log_Pos: 89411517
                Relay_Log_File: relay-bin.000002
                 Relay_Log_Pos: 550
         Relay_Master_Log_File: main.010755
              Slave_IO_Running: No
             Slave_SQL_Running: Yes
           Exec_Master_Log_Pos: 89410432
               Relay_Log_Space: 1657
                 Last_IO_Errno: 1595
                 Last_IO_Error: Relay log write failure: could not queue event from master
                Last_SQL_Errno: 0
                Last_SQL_Error:
       Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
              Slave_DDL_Groups: 16763
Slave_Non_Transactional_Groups: 7974
    Slave_Transactional_Groups: 107415

We have already tried RESET SLAVE, clearing the relay logs, and executing CHANGE MASTER TO again, but the result is the same.
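For readers who want to repeat that reset, the sequence can be sketched roughly as follows. This is only an illustration, not the exact statements that were run: the host, user and password are hypothetical placeholders, the coordinates are taken from Relay_Master_Log_File and Exec_Master_Log_Pos in the status output above, and it assumes file/position based replication (the report does not say whether GTID is in use).

```sql
-- Run on the 10.3.10 slave.
STOP SLAVE;
RESET SLAVE;  -- discards the existing relay logs and the stored replication position
CHANGE MASTER TO
  MASTER_HOST     = 'master.example.com',  -- hypothetical host name
  MASTER_USER     = 'repl',                -- hypothetical replication user
  MASTER_PASSWORD = '...',                 -- placeholder
  MASTER_LOG_FILE = 'main.010755',         -- Relay_Master_Log_File from the status output
  MASTER_LOG_POS  = 89410432;              -- Exec_Master_Log_Pos from the status output
START SLAVE;
SHOW SLAVE STATUS\G
```

Restarting from Exec_Master_Log_Pos makes the IO thread fetch again everything that the SQL thread has not applied yet, which is why the failure comes back at the same place in main.010755.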

I've dumped the master binary log at that position and, interestingly, the slave IO thread stops inside a binlog block (the master binlog format is MIXED).
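One way to look at the events around that position without leaving the SQL client is sketched below. The report does not say which tool was used for the dump, so this is an assumption; the file name and starting offset are the ones quoted in the IO thread error message.

```sql
-- Run on the 10.2.18 master.
-- Lists the events beginning at the offset where the error says the Rows-log event started.
SHOW BINLOG EVENTS IN 'main.010755' FROM 89390527 LIMIT 10;
```

The Event_type and End_log_pos columns of the result show which event sits at the start offset and where each following event ends, which can be compared against position 89390610 from the error message.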

The master/slave set that is failing is the main database server for our organization. It has 1.2 TB of data with 38 databases, views, stored procedures, triggers and complex queries.

On the same servers, we have 3 other small master/slave sets with the same combination (10.2.18 master and 10.3.10 slaves), and they are working perfectly. Of course, these sets are smaller and simpler than the instance that is failing.

Thanks

Issue Links

blocks: MDEV-17420 MariaDB slave 10.2 leaks temporary tables (Closed)

People

Assignee: suresh ramagiri
Reporter: Gabriel Gomiz
Votes: 0
Watchers: 8

Dates

Created: 2018-10-18 12:59
Updated: 2024-07-07 23:38
Resolved: 2023-05-02 06:37
