[MDEV-28681] bin log interrupted and hung for minutes after transaction commit - Jira

XML

Word

Printable

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Won't Fix
Affects Version/s: 10.4.21
Fix Version/s: N/A
Component/s: Data Manipulation - Update, Platform RedHat, Storage Engine - InnoDB
Labels:
None

Description

*Hi,

In our production lab we found one issue that the bin log update got interrupted and hung there for minutes after C++ API mysql_query with "commit" param returned success.

Timeline:*

1. Transaction committed by calling of mysql_query with "commit" param which returned success (product log which confirmed successful return of mysql_query):

+++ 2022/05/26 11:32:46.065 DB MEDIUM remotedb:11310 E:105110 S:2657 (RemotedbNotify.cpp 1026 v-0001 169.254.64.1 oam-a)
INFO: Completed notify connection attempt.

2. bin log started to be updated, but only updated the first 3 records:

at 2296562
#220526 11:32:46 server id 1 end_log_pos 2296604 CRC32 0x30537424 GTID 0-1-29252 trans
/Unable to render embedded object: File (100001 SET @@session.gtid_seq_no=29252*//*) not found./;
START TRANSACTION
/!/;
at 2296604
#220526 11:32:46 server id 1 end_log_pos 2297037 CRC32 0xed2efe0a Query thread_id=11783 exec_time=0 error_code=0
SET TIMESTAMP=1653564766/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 48, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4101000000000000000000000000000000', schema_class = 'cfg' where operation = 1 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'resource_parms' and process_name = 'SSPcntl_agt'
/!/;
at 2297037
#220526 11:32:46 server id 1 end_log_pos 2297470 CRC32 0xa716708c Query thread_id=11783 exec_time=0 error_code=0
SET TIMESTAMP=1653564766/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 48, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4101000000000000000000000000000000', schema_class = 'cfg' where operation = 2 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'resource_parms' and process_name = 'SSPcntl_agt'
/!/;
at 2297470
#220526 11:32:46 server id 1 end_log_pos 2297903 CRC32 0x3144ec9e Query thread_id=11783 exec_time=0 error_code=0
SET TIMESTAMP=1653564766/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 48, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4101000000000000000000000000000000', schema_class = 'cfg' where operation = 3 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'resource_parms' and process_name = 'SSPcntl_agt'
/!/;
at 2297903
#220526 11:32:46 server id 1 end_log_pos 2297934 CRC32 0x3214e514 Xid = 249264
COMMIT/!/;

*3. About >6m later, it started to write bin log for the remaining records for that transaction (Note in-between there were bin logs writes for other transactions): *

at 2443522
#220526 11:39:18 server id 1 end_log_pos 2443961 CRC32 0xd83cc209 Query thread_id=72 exec_time=0 error_code=0
SET TIMESTAMP=1653565158/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 824, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4001000000000000000000000000000000', schema_class = 'cfg' where operation = 1 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'gateway_family_base' and process_name = 'SSPcntl_agt'
/!/;
......
at 2498796
#220526 11:39:20 server id 1 end_log_pos 2499229 CRC32 0x2b4c10a9 Query thread_id=72 exec_time=0 error_code=0
SET TIMESTAMP=1653565160/!/;
update db01host.notify_remdb_operation set process_id = 11312, objlen = 48, flag = 0, socket = 171, retry_count = 0, host_ip = x'00a9fe4001000000000000000000000000000000', schema_class = 'cfg' where operation = 3 and notification_type = 2 and remote_ip = x'00a9fe4001000000000000000000000000000000' and table_name = 'resource_parms' and process_name = 'SSPcntl_agt'
/!/;

Then finally the DB contents were overwritten incorrectly by the delayed bin log updates causing product failures.

*Mariadb Server version: 10.4.21-MariaDB-log Source distribution

Attached config file my.cnf and bin log as well as decoded bin log.

Your advice is highly appreciated.*

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

mysql-lcpdb-bin.000003
4.38 MB
2022-05-27 10:52
bin.log
5.06 MB
2022-05-27 10:51
my.cnf
3 kB
2022-05-27 10:51

Activity

People

Assignee:: Unassigned

Reporter:: Andy Liu

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2022-05-27 10:52

Updated:: 2022-08-29 22:33

Resolved:: 2022-08-29 22:33

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.