[MDEV-20220] Merge 5.7 P_S replication table 'replication_applier_status_by_worker' Created: 2019-07-30  Updated: 2021-04-23  Resolved: 2021-04-08

Status: Closed
Project: MariaDB Server
Component/s: Replication, Variables
Fix Version/s: 10.6.0

Type: Task Priority: Critical
Reporter: Geoff Montee (Inactive) Assignee: Sujatha Sivakumar (Inactive)
Resolution: Fixed Votes: 2
Labels: None


 Description   

On a replication slave that has parallel replication enabled, if slave_parallel_max_queued is set too small, then the SQL thread's queue can sometimes be too small to assign work to all of the slave worker threads. When this happens, the SQL thread's state will be "Waiting for room in worker thread event", and the idle worker thread's state will be "Waiting for work from SQL thread".

|  4 | system user |           | NULL | Connect |    139 | Waiting for work from SQL thread              | NULL             |    0.000 |
...
| 21 | system user |           | NULL | Connect |     45 | Waiting for room in worker thread event queue | NULL             |    0.000 |
...

https://mariadb.com/kb/en/library/parallel-replication/#configuring-the-maximum-size-of-the-parallel-slave-queue

https://mariadb.com/kb/en/library/replication-and-binary-log-system-variables/#slave_parallel_max_queued

I think it would be useful to add a status variable that gets incremented if slave's parallel worker thread is idle because the SQL thread's queue is full.



 Comments   
Comment by Sujatha Sivakumar (Inactive) [ 2020-05-05 ]

Hello Andrei,

Please review the fix for MDEV-20220.

https://github.com/MariaDB/server/commit/92fbb4e911d700576d9b894637a476b18324f1ed

Added a new status variable 'Slave_idle_parallel_worker_count'.
The counter variable gets incremented when a slave worker is in
idle state. On START SLAVE all workers are initially in idle state.

MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.006 sec)
 
MariaDB [(none)]> show status like '%idle%';
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_master_thread_idle_loops  | 61    |
| Slave_idle_parallel_worker_count | 4     |
| Threadpool_idle_threads          | 0     |
+----------------------------------+-------+
3 rows in set (0.005 sec)
 
MariaDB [(none)]> show processlist;
+----+-------------+-----------+------+--------------+------+--------------------------------------------------------+------------------+----------+
| Id | User        | Host      | db   | Command      | Time | State                                                  | Info             | Progress |
+----+-------------+-----------+------+--------------+------+--------------------------------------------------------+------------------+----------+
|  4 | root        | localhost | NULL | Query        |    0 | starting                                               | show processlist |    0.000 |
|  5 | system user |           | NULL | Slave_IO     |   13 | Waiting for master to send event                       | NULL             |    0.000 |
|  8 | system user |           | NULL | Slave_worker |   13 | Waiting for work from SQL thread                       | NULL             |    0.000 |
| 10 | system user |           | NULL | Slave_worker |   13 | Waiting for work from SQL thread                       | NULL             |    0.000 |
|  7 | system user |           | NULL | Slave_worker |   13 | Waiting for work from SQL thread                       | NULL             |    0.000 |
|  9 | system user |           | NULL | Slave_worker |   13 | Waiting for work from SQL thread                       | NULL             |    0.000 |
|  6 | system user |           | NULL | Slave_SQL    |   13 | Slave has read all relay log; waiting for more updates | NULL             |    0.000 |
+----+-------------+-----------+------+--------------+------+--------------------------------------------------------+------------------+----------+

On master:

MariaDB [test]> create table t ( f int) engine=innodb;
Query OK, 0 rows affected (0.064 sec)

On slave: One of the workers will execute the above DDL and
on completion will get back to idle state. Hence the count gets
incremented.

MariaDB [test]> show tables;
+----------------+
| Tables_in_test |
+----------------+
| t              |
+----------------+
1 row in set (0.002 sec)
 
MariaDB [test]> show status like '%idle%';
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_master_thread_idle_loops  | 122   |
| Slave_idle_parallel_worker_count | 5     |
| Threadpool_idle_threads          | 0     |
+----------------------------------+-------+
3 rows in set (0.004 sec)
 
MariaDB [test]> stop slave;
Query OK, 0 rows affected (0.003 sec)
 
MariaDB [test]> show status like '%idle%';
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_master_thread_idle_loops  | 177   |
| Slave_idle_parallel_worker_count | 5     |
| Threadpool_idle_threads          | 0     |
+----------------------------------+-------+
3 rows in set (0.004 sec)

Counter gets cleared during START SLAVE.

MariaDB [test]> start slave;
Query OK, 0 rows affected (0.004 sec)
 
MariaDB [test]> show status like '%idle%';
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_master_thread_idle_loops  | 187   |
| Slave_idle_parallel_worker_count | 4     |
| Threadpool_idle_threads          | 0     |
+----------------------------------+-------+
3 rows in set (0.004 sec)

Comment by Sujatha Sivakumar (Inactive) [ 2020-06-03 ]

Hello GeoffMontee

A draft patch has been implemented as per the task description.
Please refer the following comment for detailed description.

https://jira.mariadb.org/browse/MDEV-20220?focusedCommentId=152038&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-152038

Would it be helpful, if the status variable displays the total
count of idle time spent by workers instead of proposed counter.

Please let us know.

Comment by Andrei Elkin [ 2020-06-08 ]

I would go with the most "lazy" one of FLUSH STATUS.

Comment by Geoff Montee (Inactive) [ 2020-06-16 ]

Hi sujatha.sivakumar and Elkin,

Would it be helpful, if the status variable displays the total count of idle time spent by workers instead of proposed counter.

This raises a question to me. Let's say that you have a slave with 4 parallel worker threads. And let's say that during a 1 second window, 3 of those worker threads were idle because their queues were full.

What would the "the total count of idle time spent by workers" be in this case?

Would it be "1 second", because all 3 worker threads were idle for the same 1 second time period?

Or would it be "3 seconds", because there were 3 workers threads, and each one was idle for 1 second?

Or would it be something else?

Thanks!

Comment by Andrei Elkin [ 2020-06-16 ]

GeoffMontee And let's say that during a 1 second window, 3 of those worker threads were idle because their queues were full. - that means 3 are 'busy', are not? I am guessing you meant that, as 'idle' that is one with an empty queue.

Then we count the total idleness. If all N have nothing to do within time t then the idle period is N*t.

Comment by Geoff Montee (Inactive) [ 2020-06-16 ]

Hi Elkin,

that means 3 are 'busy', are not? I am guessing you meant that, as 'idle' that is one with an empty queue.

No, I meant what I said for the example--that 3 worker threads are idle because their queues do not have room, and only 1 worker thread is performing work. I am specifically referring to cases like the one shown in the ticket description, where the worker thread has the state "Waiting for work from SQL thread", and the SQL thread has the state "Waiting for room in worker thread event queue":

|  4 | system user |           | NULL | Connect |    139 | Waiting for work from SQL thread              | NULL             |    0.000 |
...
| 21 | system user |           | NULL | Connect |     45 | Waiting for room in worker thread event queue | NULL             |    0.000 |
...

I don't completely understand why this happens. It seems somewhat contradictory to me. If the worker thread isn't performing work, then shouldn't the worker thread's queue have room? I assume that this happens because of some internal implementation details. Do you know?

Then we count the total idleness. If all N have nothing to do within time t then the idle period is N*t.

OK, I see. So if 3 worker threads are idle for the same 1 second period, then the status variable would say that the total idle time is 3 seconds. Thanks!

Comment by Andrei Elkin [ 2020-06-16 ]

3 worker threads are idle because their queues do not have room

Well, they are not really idle, unless

Waiting for work from SQL thread 

The other form of idling is by the Driver thread

Waiting for room in worker thread event queue 

In your paste the idler is really one Worker.

So if 3 worker threads are idle for the same 1 second period, then the status variable would say that the total idle time is 3 seconds. Thanks!

Comment by Geoff Montee (Inactive) [ 2020-06-16 ]

Hi Elkin,

Well, they are not really idle, unless

Waiting for work from SQL thread

The other form of idling is by the Driver thread

Waiting for room in worker thread event queue

Yes, I understand that.

In your paste the idler is really one Worker.

Yes, I understand that. I do not have example text showing "3 idle worker threads, 1 active worker thread". That was a hypothetical example that I came up with to ask you how your new status variable would behave in that scenario.

Comment by Sujatha Sivakumar (Inactive) [ 2020-08-17 ]

Hello serg, GeoffMontee and Elkin

As part of current task can we add a new `performance schema` table similar to upstream?
The table can capture following details. It can be extended further to include few more details.

performance_schema.replication_applier_status_by_worker

Column Name Value
THREAD_ID None
SERVICE_STATE None
LAST_SEEN_TRANSACTION None
LAST_ERROR_NUMBER Last_SQL_Errno
LAST_ERROR_MESSAGE Last_SQL_Error
LAST_ERROR_TIMESTAMP Last_SQL_Error_Timestamp
WORKER_IDLE_TIME Total idle time by worker "Waiting for work from SQL thread"
LAST_TRANS_WAIT_FOR_COMMIT_ORDER Total waiting time spent by current transaction to reach its turn for commit
LAST_TRANS_RETRY_COUNT Total number retries attempted by last transaction
Comment by Geoff Montee (Inactive) [ 2020-08-17 ]

Hi sujatha.sivakumar,

That sounds good to me.

Comment by Sujatha Sivakumar (Inactive) [ 2020-09-23 ]

Hello serg

Can you please review the changes for MDEV-20220.

Patch: https://github.com/MariaDB/server/commit/0922edb1b162c5aa73c11867157e0e118f79bbc6
Test results: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.5-sujatha

Comment by Sujatha Sivakumar (Inactive) [ 2020-11-05 ]

Hello serg

Since fix version is 10.6, ported my 10.5 changes to 10.6 and tested them.

Please find the 10.6 patch: https://github.com/MariaDB/server/commit/1d5f0db0158c5a4c2e82abd3b881de46a719412d
Test results: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.6-sujatha

This patch is for 'replication_applier_status_by_worker' table alone.
Rest of the tables are implemented as part of MDEV-16437.

Current patch will be used as base of MDEV-16437 and each PFS table specific change will be added as a separate commit.

Thank you.

Comment by Sujatha Sivakumar (Inactive) [ 2021-03-30 ]

Hello Sergei,

Good Morning.

I have addressed your review comments. Please review the new set of changes.

Patch: https://github.com/MariaDB/server/commit/9abbb589e6d6f61e03cad0fc5c5aee31077dd00c

BuildBot Results: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.6-sujatha

Thank you.

Generated at Thu Feb 08 08:57:47 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.