Details
-
Bug
-
Status: Confirmed (View Workflow)
-
Major
-
Resolution: Unresolved
-
None
-
None
Description
We have a client application to read from the mariadb binary log on MariaDB server version 10.11.5-MariaDB-log.
We started getting below error on trying to ready the mariadb binlog event,
Failed to replace GTID event with backwards-compatible event: corrupt event.; the first event 'archive_mysql.005039' at 16814957, the last event read from 'archive_mysql.005039' at 16814957, the last byte read from 'archive_mysql.005039' at 16815003.
1)
The application works fine with all other GTID_EVENT "START TRANSACTION", but problem is there in reading only this particular event with length 2e. This event has come only once per binlog. Below is that event,
- at 16814957
#240314 17:59:51 server id 1 end_log_pos 16815003 CRC32 0x81e297df - Position
Timestamp Type Master ID Size Master Pos Flags - 100936d |17 3b f3 65 |a2 |01 00 00 00 |2e 00 00 00 |9b 93 00 01 |08 00
# - 1009380 38 94 2d 0b 00 00 00 00 00 00 00 00 0a 96 9e 61 |8.-............a|
- 1009390 04 00 00 00 00 01 ff df 97 e2 81 |...........|
# - Event: GTID 0-1-187536440 cid=73506454
/Unable to render embedded object: File (100001 SET @@session.gtid_seq_no=187536440*//*) not found./;
START TRANSACTION
/!/;
There is no problem reading other GTID_EVENT "START TRANSACTION" event as given below,
- at 17157329
#240314 18:00:02 server id 1 end_log_pos 17157373 CRC32 0xd95a7577 - Position
Timestamp Type Master ID Size Master Pos Flags - 105ccd1 |22 3b f3 65 |a2 |01 00 00 00 |2c 00 00 00 |fd cc 05 01 |08 00
# - 105cce4 5c 96 2d 0b 00 00 00 00 00 00 00 00 0a 1e af 61 |\.-............a|
- 105ccf4 04 00 00 00 00 77 75 5a d9 |.....wuZ.|
# - Event: GTID 0-1-187536988 cid=73510686
/Unable to render embedded object: File (100001 SET @@session.gtid_seq_no=187536988*//*) not found./;
START TRANSACTION
/!/;
So far we are unable to understand the nature of this GTID_EVENT that has 2e length and what could cause this event to appear in the binlog. Can you please provide some information regarding the nature of the event and how does it appear in the binary log.
Attaching the binary log 5039 , that has the binlog event GTID_EVENT in question at offset 16814957. Also Attaching the output "show variables" command to get the db configurations.
1.1) What exactly is this GTID_EVENT with length 2e ?
1.2) How to get this GTID_EVENT in the binlog ?
2) On searching over , there are two issues reported related to this,
https://jira.mariadb.org/browse/CONC-650
https://jira.mariadb.org/browse/MDEV-5754
But not none of them seems same. The CONC-650 seems related to XA transactions and MDEV-5754 fixed in 10.0.9. The version we are using is 10.11.5.
In CONC-650 , it is suggested to set the mariadb_slave_capability to 4 , which I understand it MARIA_SLAVE_CAPABILITY_GTID. After setting the mariadb_slave_capability to 4, the client application is receiving that GTID_EVENT properly , without any problem. But we are not certain what would be the implications of setting this capability.
2.1) What is the default capability of the slave application ?
2.2) What are the effects of setting mariadb_slave_capability to 4 ? Could there be a problem somewhere , in some scenarios ?
Attachments
Issue Links
- relates to
-
MDEV-7850 Extend GTID Binlog Events with Thread Id
-
- Closed
-
> We have a client application to read from the mariadb binary log on
> MariaDB server version 10.11.5-MariaDB-log.
From the description, it sounds like this application is connecting to the
mariadb server to read binlog data from the running server using
COM_BINLOG_DUMP.
> 2.2) What are the effects of setting mariadb_slave_capability to 4 ? Could
> there be a problem somewhere , in some scenarios ?
The consequence of setting mariadb_slave_capability to 4
(MARIA_SLAVE_CAPABILITY_GTID) is that the server will send its binlog data
unmodified to your application, including mariadb-specific events such as
ANNOTATE_ROWS_EVENT, BINLOG_CHECKPOINT_EVENT, and GTID_EVENT. It sounds like
this is what you want to do.
The consequence of not setting mariadb-specific to 4 is that the server
will think your application is a MySQL or MariaDB pre-5.3 slave that does
not know about these new events and would break if it receives them. Thus
the server will re-write the data on the fly to replace these events with
something that the old servers can handle.
> 2.1) What is the default capability of the slave application ?
The capability sent by all MariaDB versions since 10.0 is 4
(MARIA_SLAVE_CAPABILITY_GTID).
So this is a backwards-compatibility mechanism, to allow MariaDB to
replicate to server versions earlier than 10.0 (or MySQL versions), without
the old slave getting an error from event types it doesn't recognize.
> Failed to replace GTID event with backwards-compatible event: corrupt event.; the first event 'archive_mysql.005039' at 16814957, the last event read from 'archive_mysql.005039' at 16814957, the last byte read from 'archive_mysql.005039' at 16815003.
This is a bug in the MariaDB code. It causes certain kinds of GTID events to
fail to be rewritten for old slave versions. And your application, since it
is not setting mariadb_slave_capability and thus treated the same as an old
slave.
This bug is actually fixed in this patch:
https://github.com/MariaDB/server/pull/3035
But that has not yet been merged, and looks like it will only be fixed in
11.4.
bnestere, for your info, it looks like your patch for
MDEV-7850will fixthis bug as well.
> Can you please provide some information regarding the nature of the event
> and how does it appear in the binary log.
> 100936d |17 3b f3 65 |a2 |01 00 00 00 |2e 00 00 00 |9b 93 00 01 |08 00
> 1009380 38 94 2d 0b 00 00 00 00 00 00 00 00 0a 96 9e 61 |8.-............a|
> 1009390 04 00 00 00 00 01 ff df 97 e2 81 |...........|
This GTID event has a couple extra bytes 01 ff at the end (before the CRC32
checksum).
These are the flags_extra containing FL_EXTRA_MULTI_ENGINE and the value
UCHAR_MAX for extra_engines.
The bug is that such GTID events with FL_EXTRA_MULTI_ENGINE set are not
converted correctly for old slaves, instead throwing the error you see.
> Attaching the binary log 5039 , that has the binlog event GTID_EVENT in
> question at offset 16814957.
Thanks for attaching the binlog. There we can see that this an empty
transaction, probably resulting from some quirk in the server code when
rolling back a savepoint:
START TRANSACTION
SAVEPOINT `save_invoice_draftc516c65faf7b06712782c44132be0efb3e20f5cea0550fc86e990e20`
COMMIT
The value UCHAR_MAX for extra_engines means the number of storage engines
participating in the (empty) transactions is 0.
> 1.1) What exactly is this GTID_EVENT with length 2e ?
> 1.2) How to get this GTID_EVENT in the binlog ?
This is a GTID event containing the flag FL_EXTRA_MULTI_ENGINE as well as
the field extra_engines.
It looks like one can obtain this kind of event from committing an empty
transaction after rolling back to a savepoint. Probably it can also be
obtained from a transaction that changes both InnoDB and RocksDB tables in
the same commit.
Note that for this to trigger, the FL_GROUP_COMMIT_ID flag also has to be
set (seen as cid=73506454 in your binlog output). This happens when the
transaction was group-committed to the binlog along with some other
transaction(s); whether this happens or not depends on the exact timing of
different commits running in parallel on the master.
> After setting the mariadb_slave_capability to 4, the client application is
> receiving that GTID_EVENT properly , without any problem. But we are not
> certain what would be the implications of setting this capability.
Yes, because then the server knows your application can understand the GTID
event, so the event is sent without rewriting, bypassing the bug. There
should be no bad implications for your application since it already knows
how to parse the GTID events.