Details
-
Task
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Incomplete
Description
*Hi,
In our production lab we found one issue that the bin log update got interrupted and hung there for minutes after C++ API mysql_query with "commit" param returned success.
Timeline:*
1. Transaction committed by calling of mysql_query with "commit" param which returned success (product log which confirmed successful return of mysql_query):
+++ 2022/05/26 11:32:46.065 DB MEDIUM remotedb:11310 E:105110 S:2657 (RemotedbNotify.cpp 1026 v-0001 169.254.64.1 oam-a)
INFO: Completed notify connection attempt.
2. bin log started to be updated, but only updated the first 3 records:
- at 2296562
#220526 11:32:46 server id 1 end_log_pos 2296604 CRC32 0x30537424 GTID 0-1-29252 trans
/Unable to render embedded object: File (100001 SET @@session.gtid_seq_no=29252*//*) not found./;
START TRANSACTION
/!/; - at 2296604
#220526 11:32:46 server id 1 end_log_pos 2297037 CRC32 0xed2efe0a Query thread_id=11783 exec_time=0 error_code=0
SET TIMESTAMP=1653564766/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 48, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4101000000000000000000000000000000', schema_class = 'cfg' where operation = 1 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'resource_parms' and process_name = 'SSPcntl_agt'
/!/; - at 2297037
#220526 11:32:46 server id 1 end_log_pos 2297470 CRC32 0xa716708c Query thread_id=11783 exec_time=0 error_code=0
SET TIMESTAMP=1653564766/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 48, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4101000000000000000000000000000000', schema_class = 'cfg' where operation = 2 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'resource_parms' and process_name = 'SSPcntl_agt'
/!/; - at 2297470
#220526 11:32:46 server id 1 end_log_pos 2297903 CRC32 0x3144ec9e Query thread_id=11783 exec_time=0 error_code=0
SET TIMESTAMP=1653564766/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 48, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4101000000000000000000000000000000', schema_class = 'cfg' where operation = 3 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'resource_parms' and process_name = 'SSPcntl_agt'
/!/; - at 2297903
#220526 11:32:46 server id 1 end_log_pos 2297934 CRC32 0x3214e514 Xid = 249264
COMMIT/!/;
*3. About >6m later, it started to write bin log for the remaining records for that transaction (Note in-between there were bin logs writes for other transactions): *
- at 2443522
#220526 11:39:18 server id 1 end_log_pos 2443961 CRC32 0xd83cc209 Query thread_id=72 exec_time=0 error_code=0
SET TIMESTAMP=1653565158/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 824, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4001000000000000000000000000000000', schema_class = 'cfg' where operation = 1 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'gateway_family_base' and process_name = 'SSPcntl_agt'
/!/;
...... - at 2498796
#220526 11:39:20 server id 1 end_log_pos 2499229 CRC32 0x2b4c10a9 Query thread_id=72 exec_time=0 error_code=0
SET TIMESTAMP=1653565160/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 48, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4001000000000000000000000000000000', schema_class = 'cfg' where operation = 3 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'resource_parms' and process_name = 'SSPcntl_agt'
/!/;
Then finally the DB contents were overwritten incorrectly by the delayed bin log updates causing product failures.
*Mariadb Server version: 10.4.21-MariaDB-log Source distribution
Attached config file my.cnf and bin log as well as decoded bin log.
Your advice is highly appreciated.*