[MDEV-25912] wsrep does not identify checksummed events correctly Created: 2021-06-14  Updated: 2023-11-27  Resolved: 2022-03-29

Status: Closed
Project: MariaDB Server
Component/s: wsrep
Affects Version/s: 10.2, 10.3, 10.4, 10.5, 10.6
Fix Version/s: 10.4.25, 10.5.16, 10.6.8, 10.7.4

Type: Bug Priority: Major
Reporter: Andrei Elkin Assignee: Mario Karuza (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Attachments: File var.7z    

 Description   

A recent assert added to the Gtid_log_event constructor has revealed that the Galera applier does not identify checksummed events correctly.
Running galera.galera_as_slave_gtid_auto_engine
on the current 10.6 (cb0cad8156f) produces the following stack:

#2  0x00007ffff4de148a in __assert_fail_base (fmt=0x7ffff4f68750 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x555559840fa0 "static_cast<uint>(buf - buf_0) == event_len || buf_0[event_len - 1] == 0", file=file@entry=0x55555983d5e0 "/home3/MDB/WTs/Fixes/1/TMP/base/sql/log_event.cc", line=line@entry=2628, function=function@entry=0x5555598425e0 <Gtid_log_event::Gtid_log_event(unsigned char const*, unsigned int, Format_description_log_event const*)::__PRETTY_FUNCTION__> "Gtid_log_event::Gtid_log_event(const uchar*, uint, const Format_description_log_event*)") at assert.c:92
#3  0x00007ffff4de1502 in __GI___assert_fail (assertion=0x555559840fa0 "static_cast<uint>(buf - buf_0) == event_len || buf_0[event_len - 1] == 0", file=0x55555983d5e0 "/home3/MDB/WTs/Fixes/1/TMP/base/sql/log_event.cc", line=2628, function=0x5555598425e0 <Gtid_log_event::Gtid_log_event(unsigned char const*, unsigned int, Format_description_log_event const*)::__PRETTY_FUNCTION__> "Gtid_log_event::Gtid_log_event(const uchar*, uint, const Format_description_log_event*)") at assert.c:101
#4  0x0000555557c098e6 in Gtid_log_event::Gtid_log_event (this=0x617000016288, buf=0x7fffec2efa21 "", event_len=42, description_event=0x6140000208c8) at log_event.cc:2627
#5  0x0000555557bfca77 in Log_event::read_log_event (buf=0x7fffec2efa00 "\247I\307`\242\003", event_len=42, error=0x7fffe902f730, fdle=0x6140000208c8, crc_check=1 '\001') at log_event.cc:1141
#6  0x0000555558347691 in wsrep_read_log_event (arg_buf=0x7fffe902f880, arg_buf_len=0x7fffe902f810, description_event=0x614000020

The assert also fires in a pre-21117 10.6 version.
E.g. check out 82c07b178ab, apply

diff --git a/sql/log_event.cc b/sql/log_event.cc
index 9c7c56b1c34..d75fe0c17f3 100644
--- a/sql/log_event.cc
+++ b/sql/log_event.cc
@@ -2564,6 +2564,7 @@ Gtid_log_event::Gtid_log_event(const uchar *buf, uint event_len,
 {
   uint8 header_size= description_event->common_header_len;
   uint8 post_header_len= description_event->post_header_len[GTID_EVENT-1];
+  const uchar *buf_0= buf;
   if (event_len < (uint) header_size + (uint) post_header_len ||
       post_header_len < GTID_HEADER_LEN)
     return;
@@ -2597,6 +2598,8 @@ Gtid_log_event::Gtid_log_event(const uchar *buf, uint event_len,
     memcpy(xid.data, buf, data_length);
     buf+= data_length;
   }
+  DBUG_ASSERT(static_cast<uint>(buf - buf_0) == event_len ||
+              buf_0[event_len - 1] == 0);
 }

and run the test to end up with

#2  0x00007ffff4de148a in __assert_fail_base (fmt=0x7ffff4f68750 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x555559829fa0 "static_cast<uint>(buf - buf_0) == event_len || buf_0[event_len - 1] == 0", file=file@entry=0x5555598266e0 "/home3/MDB/WTs/Fixes/1/TMP/pre-base/sql/log_event.cc", line=line@entry=2602, function=function@entry=0x55555982b5e0 <Gtid_log_event::Gtid_log_event(unsigned char const*, unsigned int, Format_description_log_event const*)::__PRETTY_FUNCTION__> "Gtid_log_event::Gtid_log_event(const uchar*, uint, const Format_description_log_event*)") at assert.c:92
#3  0x00007ffff4de1502 in __GI___assert_fail (assertion=0x555559829fa0 "static_cast<uint>(buf - buf_0) == event_len || buf_0[event_len - 1] == 0", file=0x5555598266e0 "/home3/MDB/WTs/Fixes/1/TMP/pre-base/sql/log_event.cc", line=2602, function=0x55555982b5e0 <Gtid_log_event::Gtid_log_event(unsigned char const*, unsigned int, Format_description_log_event const*)::__PRETTY_FUNCTION__> "Gtid_log_event::Gtid_log_event(const uchar*, uint, const Format_description_log_event*)") at assert.c:101
#4  0x0000555557bf47b8 in Gtid_log_event::Gtid_log_event (this=0x617000016288, buf=0x7fffec1eca20 "", event_len=42, description_event=0x6140000208c8) at log_event.cc:2601
#5  0x0000555557be7cd3 in Log_event::read_log_event (buf=0x7fffec1eca00 "\374N\307`\242\003", event_len=42, error=0x7fffe8f2c730, fdle=0x6140000208c8, crc_check=1 '\001') at log_event.cc:1140
#6  0x0000555558331f0d in wsrep_read_log_event (arg_buf=0x7fffe8f2c880, arg_buf_len=0x7fffe8f2c810, description_event=0x614

From a cursory analysis, the WSREP applier does not have the correct value of
description_event->checksum_alg at the time this Gtid event arrives.
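
To illustrate why a wrong checksum_alg would trip the new assert, here is a minimal standalone sketch. It is not the server code: ChecksumAlg, payload_len, parse_consumes_all and make_event are hypothetical names, and the 38-byte body plus 4-byte CRC32 trailer are assumed sizes chosen only because they add up to the event_len=42 seen in the traces. The idea is that a reader which believes checksums are off hands the full length, trailer included, to the event constructor, so the parser neither consumes all bytes nor finds a zero padding byte at the end.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for this sketch; not the MariaDB definitions.
enum class ChecksumAlg { Off, Crc32 };
constexpr unsigned kChecksumLen = 4;  // assumed: a CRC32 trailer is 4 bytes

// Length the event constructor should see: the trailing checksum is
// stripped only when the reader knows the event is checksummed.
unsigned payload_len(unsigned event_len, ChecksumAlg alg) {
  if (alg == ChecksumAlg::Crc32 && event_len >= kChecksumLen)
    return event_len - kChecksumLen;
  return event_len;
}

// Toy parser mirroring the shape of the assert in the patch above:
// after consuming fields_len bytes, either the whole event was read
// or the last byte of the event is zero padding.
bool parse_consumes_all(const unsigned char *buf, unsigned event_len,
                        unsigned fields_len) {
  const unsigned char *buf_0 = buf;
  if (event_len == 0 || event_len < fields_len)
    return false;
  buf += fields_len;
  return static_cast<unsigned>(buf - buf_0) == event_len ||
         buf_0[event_len - 1] == 0;
}

// Builds a sample event: a zero-filled body followed by a non-zero trailer.
std::vector<unsigned char> make_event(unsigned body_len) {
  std::vector<unsigned char> ev(body_len + kChecksumLen, 0);
  for (unsigned i = 0; i < kChecksumLen; i++)
    ev[body_len + i] = static_cast<unsigned char>(0xA0 + i);
  return ev;
}
```

With a 42-byte sample from make_event(38), passing payload_len(42, ChecksumAlg::Off) to the toy parser fails the check, while payload_len(42, ChecksumAlg::Crc32) passes it. Whether the real applier fails for exactly this reason still needs to be confirmed against wsrep_read_log_event.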

