[MDEV-29252] error code returned by queue_event() is always ignored
Created: 2022-08-05
Updated: 2023-11-28

Status: Open
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.5, 10.6, 10.7, 10.8, 10.9
Fix Version/s: 10.5, 10.6

Type: Bug
Priority: Major
Reporter: Hartmut Holzgraefe
Assignee: Andrei Elkin
Resolution: Unresolved
Votes: 2
Labels: None


Description

The queue_event() function in sql/slave.cc returns an integer: zero on success, or a non-zero error code on failure. For example:

static int queue_event(Master_info* mi, const uchar *buf, ulong event_len)
{
  int error= 0;
  [...]
  if (event_checksum_test((uchar*) buf, event_len, checksum_alg))
  {
    error= ER_NETWORK_READ_EVENT_CHECKSUM_FAILURE;
    unlock_data_lock= FALSE;
    goto err;
  }
  [...]
err:
  [...]
  DBUG_RETURN(error);
}

However, the only caller of queue_event() (as far as I can tell) treats the return value purely as a success/failure flag. It discards the specific error code and always reports ER_SLAVE_RELAY_LOG_WRITE_FAILURE instead:

      if (queue_event(mi, event_buf, event_len))
      {
        mi->report(ERROR_LEVEL, ER_SLAVE_RELAY_LOG_WRITE_FAILURE, NULL,
                   ER_THD(thd, ER_SLAVE_RELAY_LOG_WRITE_FAILURE),
                   "could not queue event from master");
        goto err;
      }

So in the checksum failure case quoted above, instead of reporting ER_NETWORK_READ_EVENT_CHECKSUM_FAILURE, the last I/O error shown in SHOW SLAVE STATUS output will be

Last_IO_Errno: 1595
Last_IO_Error: Relay log write failure: could not queue event from master

and not

Last_IO_Errno: 1743
Last_IO_Error: Replication event checksum verification failed while reading from network.
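
To illustrate the expected behaviour, here is a minimal standalone sketch of a caller that forwards the code returned by the queueing function instead of replacing it. It is not the server code and not a proposed patch: MockMasterInfo, mock_queue_event(), message_for() and handle_event() are hypothetical stand-ins invented for this example, and only the two error numbers and message texts are taken from the output quoted above.

```cpp
#include <cassert>
#include <string>

// Error numbers as quoted in the SHOW SLAVE STATUS output above.
constexpr int ER_SLAVE_RELAY_LOG_WRITE_FAILURE = 1595;
constexpr int ER_NETWORK_READ_EVENT_CHECKSUM_FAILURE = 1743;

// Hypothetical stand-in for Master_info: it only remembers the last error.
struct MockMasterInfo {
  int last_io_errno = 0;
  std::string last_io_error;

  void report(int code, const std::string &msg) {
    assert(code != 0);  // reporting only makes sense for a real error
    last_io_errno = code;
    last_io_error = msg;
  }
};

// Hypothetical stand-in for queue_event(): 0 on success, error code otherwise.
int mock_queue_event(bool checksum_ok, bool relay_write_ok) {
  if (!checksum_ok)
    return ER_NETWORK_READ_EVENT_CHECKSUM_FAILURE;
  if (!relay_write_ok)
    return ER_SLAVE_RELAY_LOG_WRITE_FAILURE;
  return 0;
}

// Hypothetical message lookup; texts are copied from the report above.
std::string message_for(int code) {
  switch (code) {
    case ER_NETWORK_READ_EVENT_CHECKSUM_FAILURE:
      return "Replication event checksum verification failed while "
             "reading from network.";
    case ER_SLAVE_RELAY_LOG_WRITE_FAILURE:
      return "Relay log write failure: could not queue event from master";
    default:
      return "unknown error";
  }
}

// Caller that keeps the returned code and reports it, rather than
// always reporting ER_SLAVE_RELAY_LOG_WRITE_FAILURE.
bool handle_event(MockMasterInfo &mi, bool checksum_ok, bool relay_write_ok) {
  if (int error = mock_queue_event(checksum_ok, relay_write_ok)) {
    mi.report(error, message_for(error));
    return false;
  }
  return true;
}
```

With this shape, a checksum failure leaves 1743 in last_io_errno, while a genuine relay log write failure still leaves 1595. Whether the real caller should forward every code unchanged or map only some of them is a design decision this sketch does not settle.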

