[MDEV-35217] Provide information for debugging parallel replication conflicts and hangs - Jira

Details

Type: New Feature
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Fix Version/s: 10.6
Component/s: Replication
Labels: None

Description

When there is a problem with optimistic parallel replication, such as excessive conflicts or a hang or slowdown caused by a bug, very little information is available to investigate it. This makes debugging extremely difficult, and many occurrences end up being ignored that could otherwise have been used to track down and fix a bug, to the benefit of all users.

A simple idea that should greatly improve this is to implement an option --slave-parallel-print-all-deadlocks, inspired by --innodb-print-all-deadlocks. This option, when enabled, will output additional information in the error log about parallel replication conflicts:

When a conflict is detected: the blocked GTID and the blocking GTID that will be aborted, along with the state and active query of each associated worker thread.
When an event group needs to retry: a dump of the chain of wait-for-prior-commit threads, and the SHOW ENGINE <engine> STATUS output for every engine participating in the transaction.
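
The chain dump in the second item can be pictured as a walk along the commit order, from the retrying event group back to the group at the head. The following is a minimal Python sketch, not MariaDB code: the `waits_for` mapping, the function name, and the GTID values are all hypothetical and serve only to illustrate what such a dump might contain.

```python
from typing import Dict, List, Optional


def prior_commit_chain(waits_for: Dict[str, Optional[str]], start: str) -> List[str]:
    """Follow wait-for-prior-commit links from `start` to the head of the chain.

    `waits_for` maps a GTID to the GTID whose commit it is waiting for,
    or to None if it is not waiting on anything (hypothetical structure).
    """
    chain: List[str] = []
    seen = set()
    current: Optional[str] = start
    # Stop at the head of the chain, or if a cycle is found (which would
    # itself be worth reporting in a real dump).
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        current = waits_for.get(current)
    return chain


# Hypothetical example: 0-1-103 waits for 0-1-102, which waits for 0-1-101.
waits_for = {
    "0-1-103": "0-1-102",
    "0-1-102": "0-1-101",
    "0-1-101": None,
}
chain = prior_commit_chain(waits_for, "0-1-103")
print(" -> ".join(chain))
```

In a real dump each entry would presumably carry the worker thread state and active query as well, as described for the conflict case.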

Two cases are of particular interest: when an event group needs to retry because of a lock wait timeout, and when it needs to retry more than once. Neither is expected in normal operation, and either may indicate a bug. It will therefore be useful to be able to enable the new option for these cases only, so that it is feasible to leave it enabled permanently in production environments.

It should also be possible to enable the option for all conflicts. This is useful for collecting information during a specific investigation, but may produce too much output or overhead for normal use.
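
The two levels of reporting described above amount to a small filtering rule. The sketch below expresses it in Python, assuming a three-valued setting. The issue does not specify how the option would be configured, so the mode names, the `should_report` function, and its parameters are hypothetical.

```python
from enum import Enum


class PrintMode(Enum):
    """Hypothetical settings for the proposed option."""
    OFF = "off"                # report nothing
    SUSPICIOUS = "suspicious"  # report only the two cases of particular interest
    ALL = "all"                # report every conflict


def should_report(mode: PrintMode, retry_count: int, lock_wait_timeout: bool) -> bool:
    """Decide whether a retry of an event group should be written to the error log.

    retry_count: how many times this event group has had to retry,
                 including the current retry (assumed bookkeeping).
    lock_wait_timeout: True if the current retry was caused by a lock wait timeout.
    """
    if mode is PrintMode.OFF:
        return False
    if mode is PrintMode.ALL:
        return True
    # SUSPICIOUS: a lock wait timeout, or more than one retry, is not
    # expected in normal operation and might indicate a bug.
    return lock_wait_timeout or retry_count > 1


# An ordinary first retry after a conflict is only reported in ALL mode.
print(should_report(PrintMode.SUSPICIOUS, retry_count=1, lock_wait_timeout=False))
print(should_report(PrintMode.ALL, retry_count=1, lock_wait_timeout=False))
```

With a rule like this, the restricted mode should stay quiet during normal operation, which is what would make it reasonable to leave enabled in production.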

People

Assignee: Kristian Nielsen

Reporter: Kristian Nielsen

Votes: 1

Watchers: 5

Dates

Created: 2024-10-21 07:47

Updated: 1 week ago 23:21
