[MDEV-25588] Atomic DDL: Binlog query event written upon recovery is corrupt Created: 2021-05-03  Updated: 2022-11-01  Resolved: 2021-05-04

Status: Closed
Project: MariaDB Server
Component/s: Replication, Server
Affects Version/s: N/A
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Michael Widenius
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-17567 Atomic DDL Closed

 Description   

As a part of atomic DDL testing, the server was intentionally crashed (SIGKILL-ed) while executing concurrent DDL. One of the threads was running

DROP TABLE IF EXISTS transforms.insert_select_18282

on an existing InnoDB table.
After the crash .frm and .ibd files still existed.
Upon recovery the table was dropped, no error messages produced.
A binlog event for the drop was written.
However, the event is corrupt: each character in the DROP statement is followed by \0:

bb-10.6-monty 9cd2bc6b4

# at 384
#700101  2:00:00 server id 1  end_log_pos 614 CRC32 0xac0af65a  Query   thread_id=0     exec_time=1620081899    error_code=0
use `test`/*!*/;
SET TIMESTAMP=0/*!*/;
SET @@session.pseudo_thread_id=0/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1, @@session.check_constraint_checks=1, @@session.sql_if_exists=0/*!*/;
SET @@session.sql_mode=1411383296/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=640/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
^@D^@R^@O^@P^@ ^@T^@A^@B^@L^@E^@ ^@I^@F^@ ^@E^@X^@I^@S^@T^@S^@ ^@`^@t^@r^@a^@n^@s^@f^@o^@r^@m^@s^@`.^@`^@i^@n^@s^@e^@r^@t^@_^@s^@e^@l^@e^@c^@t^@_^@1^@8^@2^@8^@2^@`^@ ^@/^@*^@ ^@g^@e^@n^@e^@r^@a^@t^@e^@d^@ 
^@b^@y^@ ^@d^@d^@l^@ ^@l^@o^@g^@ ^@*^@/
/*!*/;
DELIMITER ;
# End of log file

000000b0  44 45 46 41 55 4c 54 2f  2a 21 2a 2f 3b 0a 00 44  |DEFAULT/*!*/;..D|
000000c0  00 52 00 4f 00 50 00 20  00 54 00 41 00 42 00 4c  |.R.O.P. .T.A.B.L|
000000d0  00 45 00 20 00 49 00 46  00 20 00 45 00 58 00 49  |.E. .I.F. .E.X.I|
000000e0  00 53 00 54 00 53 00 20  00 60 00 74 00 72 00 61  |.S.T.S. .`.t.r.a|
000000f0  00 6e 00 73 00 66 00 6f  00 72 00 6d 00 73 00 60  |.n.s.f.o.r.m.s.`|
00000100  2e 00 60 00 69 00 6e 00  73 00 65 00 72 00 74 00  |..`.i.n.s.e.r.t.|
00000110  5f 00 73 00 65 00 6c 00  65 00 63 00 74 00 5f 00  |_.s.e.l.e.c.t._.|
00000120  31 00 38 00 32 00 38 00  32 00 60 00 20 00 2f 00  |1.8.2.8.2.`. ./.|
00000130  2a 00 20 00 67 00 65 00  6e 00 65 00 72 00 61 00  |*. .g.e.n.e.r.a.|
00000140  74 00 65 00 64 00 20 00  62 00 79 00 20 00 64 00  |t.e.d. .b.y. .d.|
00000150  64 00 6c 00 20 00 6c 00  6f 00 67 00 20 00 2a 00  |d.l. .l.o.g. .*.|
00000160  2f 0a 2f 2a 21 2a 2f 3b  0a 44 45 4c 49 4d 49 54  |/./*!*/;.DELIMIT|

When the binlog is replayed, no errors are issued upon the corrupt DROP (surprisingly), it just gets ignored.

Please also note TIMESTAMP=0. I don't yet know if it can have any negative effect, but it doesn't look healthy.

The server was running with the following options (just in case any of them are important):

--plugin-load-add=ha_blackhole \
--innodb-purge-threads=32 \
--innodb-checksum-algorithm=full_crc32 \
--innodb-compression-algorithm=none \
--innodb-log-file-size=256M \
--innodb-write-io-threads=2 \
--innodb-flush-method=fsync \
--innodb-sort-buffer-size=64K \
--log_output=FILE \
--max-statement-time=30 \
--lock-wait-timeout=10 \
--innodb-lock-wait-timeout=5 \
--performance-schema=on \
--performance-schema-instrument=%=ON \
--performance-schema-consumer-events-stages-current=ON \
--performance-schema-consumer-events-stages-history=ON \
--performance-schema-consumer-events-stages-history-long=ON \
--performance-schema-consumer-events-statements-current=ON \
--performance-schema-consumer-events-statements-history=ON \
--performance-schema-consumer-events-statements-history-long=ON \
--performance-schema-consumer-events-waits-current=ON \
--performance-schema-consumer-events-waits-history=ON \
--performance-schema-consumer-events-waits-history-long=ON \
--performance-schema-consumer-global-instrumentation=ON \
--performance-schema-consumer-thread-instrumentation=ON \
--performance-schema-consumer-statements-digest=ON \
--performance-schema-consumer-events-transactions-current=ON \
--performance-schema-consumer-events-transactions-history=ON \
--performance-schema-consumer-events-transactions-history-long=ON \
--binlog-format=mixed \
--log-bin \
--log_bin_trust_function_creators=1 \
--character-set-server=ucs2 \
--collation-server=ucs2_croatian_ci \
--log-tc-size=1M \
--thread-handling=pool-of-threads \
--explicit-defaults-for-timestamp=on \
--max-digest-length=0 \
--skip-external-locking=OFF



 Comments   
Comment by Michael Widenius [ 2021-05-04 ]

Fixed in the original commit.
Fix tested manually with the data from the bug report

Generated at Thu Feb 08 09:38:49 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.