[MDEV-5296] No status information available for parallel replication - Jira

Details

Type: Task
Status: Open
Priority: Major
Resolution: Unresolved
Fix Version/s: None
Component/s: None
Labels:
- foundation
- parallelslave

Description

There is very little information available to the DBA about how parallel
replication works. This makes it very hard to manage/tune it properly.

Here is a rough idea of what could be useful, from discussions with Giuseppe
Maxia:

It could be useful to have in I_S or P_S a table that gives for each worker
thread stuff like:

- The GTID of the event group currently executing, or NULL if idle
- Status (executing, waiting for prior transaction before starting, waiting
  for prior transaction before committing, stuff like that)
- The current database (USE xxx)
- Currently executing query
- Whether this worker was scheduled in parallel with something else, and if
  so, why that was possible (group commit id or replication domain id)
- Total number of events and event groups executed by worker thread
- Possibly time spent idle, time spent executing, and time spent waiting for
  prior transactions to commit (if such times can be obtained without too
  high performance overhead).
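
As a rough illustration of what one row of such a table might carry, here is a sketch in Python. Every field and state name in it is hypothetical: nothing like this exists in I_S or P_S today, and the states simply mirror the examples listed above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class WorkerState(Enum):
    # Hypothetical states, taken from the examples in the list above.
    IDLE = "idle"
    EXECUTING = "executing"
    WAITING_TO_START = "waiting for prior transaction before starting"
    WAITING_TO_COMMIT = "waiting for prior transaction before committing"


@dataclass
class WorkerStatus:
    # One row per worker thread; all column names are assumptions.
    worker_id: int
    current_gtid: Optional[str]        # None (NULL) when the worker is idle
    state: WorkerState
    current_database: Optional[str]    # the database selected by USE xxx
    current_query: Optional[str]
    parallel_reason: Optional[str]     # "group commit id" or "replication domain id"
    events_executed: int = 0
    event_groups_executed: int = 0
    seconds_idle: float = 0.0
    seconds_executing: float = 0.0
    seconds_waiting_for_prior_commit: float = 0.0

    def is_idle(self) -> bool:
        # Idle is signalled by a NULL GTID, as proposed above.
        return self.current_gtid is None


# Example row for a worker that is busy applying an event group.
row = WorkerStatus(
    worker_id=1,
    current_gtid="0-1-100",
    state=WorkerState.EXECUTING,
    current_database="test",
    current_query="INSERT INTO t1 VALUES (1)",
    parallel_reason="group commit id",
)
```

The timing fields default to zero so that a server that cannot collect them cheaply could leave them unset.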

On top of this, I think we could also add some statistics for the SQL
thread. Like, how often did it have to wait for a worker to become free to
schedule a potentially parallel transaction (might indicate a too-low
--slave-parallel-threads). And how many transactions could / could not be
scheduled in parallel (could indicate the need to tune the master to provide
more parallelism in the binlog).
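
To make the intended use of such SQL thread statistics concrete, here is a small Python sketch of how a DBA script might read them. The three counter names are assumptions made up for illustration; no such counters are exposed today.

```python
def sql_thread_summary(waits_for_free_worker: int,
                       scheduled_parallel: int,
                       scheduled_serial: int) -> dict:
    """Summarise hypothetical SQL thread counters.

    waits_for_free_worker: times the SQL thread had to wait for a free worker
    scheduled_parallel:    transactions that could be scheduled in parallel
    scheduled_serial:      transactions that could not be scheduled in parallel
    """
    total = scheduled_parallel + scheduled_serial
    if total == 0:
        # Nothing replicated yet, so there is nothing to report.
        return {"parallel_fraction": 0.0, "waits_per_transaction": 0.0}
    return {
        # A low value might indicate the master provides little parallelism.
        "parallel_fraction": scheduled_parallel / total,
        # A high value might indicate a too-low --slave-parallel-threads.
        "waits_per_transaction": waits_for_free_worker / total,
    }


summary = sql_thread_summary(waits_for_free_worker=500,
                             scheduled_parallel=250,
                             scheduled_serial=750)
```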

What would be really nice is to have two numbers in SHOW SLAVE STATUS. One is
the wall-clock time since START SLAVE. The other is the total time spent by
workers on executing events for this master connection (excluding waiting for
other replication threads). The ratio between these two numbers would
immediately give an indication of how effective parallel replication is at
utilising the machine, same as the cpu% numbers in the `top` Linux utility.
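
The ratio itself is simple arithmetic. A minimal Python sketch, assuming the two numbers were available in seconds (the parameter names are hypothetical; SHOW SLAVE STATUS has no such fields at present):

```python
def parallel_utilisation(wall_clock_seconds: float,
                         worker_busy_seconds: float) -> float:
    """Return total worker execution time divided by wall-clock time.

    wall_clock_seconds:  time elapsed since START SLAVE
    worker_busy_seconds: total time workers spent executing events,
                         excluding waits for other replication threads
    """
    if wall_clock_seconds <= 0:
        raise ValueError("wall-clock time must be positive")
    if worker_busy_seconds < 0:
        raise ValueError("worker time cannot be negative")
    return worker_busy_seconds / wall_clock_seconds


# 1500 s of worker execution over 600 s of wall-clock time gives 2.5,
# which reads like 250% in top: about two and a half workers kept busy.
ratio = parallel_utilisation(600.0, 1500.0)
```

A value near 1.0 would suggest the workers are effectively running one at a time, while a value close to the number of worker threads would suggest they are all kept busy.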

Attachments

Issue Links

relates to
- MDEV-4506 MWL#184: Parallel replication of group-committed transactions (Closed)
- MDEV-7340 [PATCH] parallel replication status variables (Open)

Activity

There are no comments yet on this issue.

People

Assignee: Kristian Nielsen

Reporter: Kristian Nielsen

Votes: 0

Watchers: 3

Dates

Created: 2013-11-15 12:14

Updated: 2025-02-06 07:19

MariaDB Server