[MDEV-605] LP:865108 - Could not execute Delete_rows event Created: 2011-10-03  Updated: 2014-11-17  Resolved: 2013-03-10

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Walter Heck (Inactive) Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug865108.xml    

 Description   

For a few weeks now, replication from one 5.2.8 install to another 5.2.8 install has been failing occasionally, where it was running fine before, with no changes in config according to our puppet files. The only change I see on that server is an upgrade from 5.2.7, but I'm not 100% sure that is when it started happening. Replication stops with the following:

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.203
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mariadb-bin.001802
          Read_Master_Log_Pos: 87563717
               Relay_Log_File: relay-bin.000705
                Relay_Log_Pos: 259550619
        Relay_Master_Log_File: mariadb-bin.001797
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1032
                   Last_Error: Could not execute Delete_rows event on table zabbix.history_uint; Can't find record in 'history_uint', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mariadb-bin.001797, end_log_pos 259552671
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 259550472
              Relay_Log_Space: 1353580511
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Delete_rows event on table zabbix.history_uint; Can't find record in 'history_uint', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mariadb-bin.001797, end_log_pos 259552671
1 row in set (0.00 sec)

I presume you'll need more info, so feel free to ask.
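
In case it is useful, the failing event can be located from the coordinates in the status output above; a minimal sketch (the file names and positions are the ones reported by SHOW SLAVE STATUS):

-- On the master: show the events starting at the position the slave
-- had executed up to when it stopped (Exec_Master_Log_Pos above).
SHOW BINLOG EVENTS IN 'mariadb-bin.001797' FROM 259550472 LIMIT 5;

-- The same events as seen from the slave's relay log (Relay_Log_Pos above):
SHOW RELAYLOG EVENTS IN 'relay-bin.000705' FROM 259550619 LIMIT 5;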



 Comments   
Comment by Walter Heck (Inactive) [ 2011-10-18 ]

After reporting this bug, I was told on IRC that it was probably just data drift. I couldn't dispute that at the time, so I let it go. Now I'm seeing almost exactly the same problem, except that the chances are microscopic that it's data drift this time. I cloned a slave by stopping the original slave, rsyncing the datadir and binary logs over, and starting it in the new location with the same my.cnf. Within hours the new slave stopped with the following error, while the original machine has been humming along for months.

Could not execute Update_rows event on table yomamma.albums; Can't find record in 'albums', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.002006, end_log_pos 680669835

That seems too much of a coincidence to be data drift, right?

Comment by Rasmus Johansson (Inactive) [ 2011-10-18 ]

Launchpad bug id: 865108

Comment by Kristian Nielsen [ 2011-10-18 ]

Well, is it data drift or not?
You need to compare the table between the master and the slave to check.
Or at least check whether the row that replication is complaining about is indeed missing on the slave.

If the row is missing, the problem is most likely data drift, and the bug happened earlier; it is then necessary to track down which event was replicated incorrectly to cause this.

If the row is not missing, it seems to be a problem with the replication of this specific event, and ideally we need the relevant binlog and table data to reproduce.
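
For example, something like this on both master and slave (a sketch only: the itemid/clock values are placeholders that you would read off the event after decoding it with mysqlbinlog -v, and I'm assuming the usual Zabbix (itemid, clock) key on history_uint):

-- Check whether the row the Delete_rows event refers to exists.
-- Run on both master and slave; key values are placeholders.
SELECT * FROM zabbix.history_uint WHERE itemid = 23296 AND clock = 1317600000;

-- A coarse whole-table comparison between master and slave
-- (note: this reads the whole table and can take a while):
CHECKSUM TABLE zabbix.history_uint;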

Comment by Elena Stepanova [ 2013-03-10 ]

There has been no response in the LP bug report, and we don't have any information to analyze here, so closing it as incomplete.

Comment by j [ 2014-11-17 ]

Hello,

I just experienced this issue with MariaDB 5.5.40. One node fell over with the same error, but the other kept chugging along. What information is needed? I have the mysqlbinlog output, the exact query, and the crash information from the logs. I can also guarantee that this is not resolved.

Comment by Elena Stepanova [ 2014-11-17 ]

Hi,

First of all, what do you mean by node? Are you running a Galera cluster or traditional replication? Please describe your replication topology.
Investigation paths can be quite different depending on the answer, so I won't start asking further questions until we know which path to choose.

Comment by j [ 2014-11-17 ]

The environment uses Galera WAN replication through an IPsec VPN with 100 ms latency. We have three nodes, one in each datacenter:

  • MariaDB 5.5.40 in datacenter A.
  • MariaDB 5.5.40 in datacenter B.
  • Galera Arbitrator 25.3.5.rXXXX in datacenter C.

Comment by Elena Stepanova [ 2014-11-17 ]

Thanks.

Then, if you don't mind, please create a separate bug report. While the error looks the same (it's generic by nature), it has nothing to do with the causes of the original report, whatever they were. Galera works quite differently from traditional replication; for example, the main suspect mentioned in earlier comments on this report, data drift, can hardly apply to your case.

In the new report, please provide the exact error that you got, preferably a generous excerpt from the error log, and the structure of the table on which it happened. The assignee of the report will then ask additional questions.
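
For instance (a sketch; the database and table names are placeholders for whatever table the error names):

-- Structure of the affected table:
SHOW CREATE TABLE dbname.tablename\G

-- Galera configuration and state at the time often help too:
SHOW GLOBAL VARIABLES LIKE 'wsrep%';
SHOW GLOBAL STATUS LIKE 'wsrep%';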
