[MDEV-10761] read_only applies to slave parallel worker threads

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: 10.1.17, 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5(EOL), 10.6, 10.7(EOL)
Fix Version/s: N/A
Component/s: Locking, Replication
Labels: maybe_fixed
Bug Category: Can result in hang or crash
Sprint: 10.0.28

Description

Hi,

I ran into a strange issue when setting a server to read_only.

Settings:
slave_parallel_mode=optimistic
slave_parallel_threads=8

The system was replicating a lot of changes (it had just been restored from a backup).
The process I executed was:
1. Restore backup
2. SET GLOBAL gtid_slave_pos='slave pos from backup'
3. CHANGE MASTER TO MASTER_USE_GTID=slave_pos
4. START SLAVE
5. SET GLOBAL read_only=1;
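
Written out as statements, steps 2 to 5 might look like the sketch below. Two things here are assumptions: the report does not say whether the parallel settings were set at runtime or in the configuration file, and any other CHANGE MASTER options (host, credentials) are omitted. The GTID position is left as the placeholder used in the report.

```sql
-- Parallel replication settings from the report.
-- Assumption: set at runtime; both require the slave threads to be stopped.
SET GLOBAL slave_parallel_mode = 'optimistic';
SET GLOBAL slave_parallel_threads = 8;

-- Step 2: position taken from the backup (placeholder, not a real GTID)
SET GLOBAL gtid_slave_pos = 'slave pos from backup';

-- Step 3: replicate from the GTID position set above
CHANGE MASTER TO MASTER_USE_GTID = slave_pos;

-- Step 4: start replication; the workers begin applying the backlog
START SLAVE;

-- Step 5: issued while the workers were still busy; this is the statement that hung
SET GLOBAL read_only = 1;
```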

The system started to hang with this processlist:

MariaDB [(none)]> show processlist;
+----+-------------+--------------------+--------------+---------+------+-----------------------------------------------+------------------------+----------+
| Id | User        | Host               | db           | Command | Time | State                                         | Info                   | Progress |
+----+-------------+--------------------+--------------+---------+------+-----------------------------------------------+------------------------+----------+
|  4 | root        | a.b.c.d:53344      | NULL         | Sleep   |    0 |                                               | NULL                   |    0.000 |
|  5 | root        | localhost          | NULL         | Query   |   33 | Waiting for commit lock                       | set global read_only=1 |    0.000 |
|  6 | system user |                    | NULL         | Connect |   41 | Waiting for master to send event              | NULL                   |    0.000 |
|  7 | system user |                    | NULL         | Connect |   33 | Waiting for global read lock                  | NULL                   |    0.000 |
|  8 | system user |                    | NULL         | Connect |   33 | Waiting for global read lock                  | NULL                   |    0.000 |
|  9 | system user |                    | NULL         | Connect |   33 | Waiting for prior transaction to commit       | NULL                   |    0.000 |
| 10 | system user |                    | NULL         | Connect |   33 | Waiting for global read lock                  | NULL                   |    0.000 |
| 11 | system user |                    | NULL         | Connect |   33 | Waiting for prior transaction to commit       | NULL                   |    0.000 |
| 12 | system user |                    | NULL         | Connect |   33 | Waiting for prior transaction to commit       | NULL                   |    0.000 |
| 13 | system user |                    | NULL         | Connect |   32 | Waiting for global read lock                  | NULL                   |    0.000 |
| 14 | system user |                    | NULL         | Connect |   33 | Waiting for global read lock                  | NULL                   |    0.000 |
| 15 | system user |                    | NULL         | Connect |   40 | Waiting for room in worker thread event queue | NULL                   |    0.000 |
| 16 | root        | 10.255.10.32:38644 | regressiondb | Sleep   |    3 |                                               | NULL                   |    0.000 |
| 20 | root        | a.b.c.d:53514      | NULL         | Sleep   |   16 |                                               | NULL                   |    0.000 |
| 22 | root        | localhost          | NULL         | Query   |    0 | init                                          | show processlist       |    0.000 |
+----+-------------+--------------------+--------------+---------+------+-----------------------------------------------+------------------------+----------+
15 rows in set (0.00 sec)
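
To see the same picture at a glance on another system, the replication threads can be grouped by state. This is only a sketch for anyone trying to reproduce the hang, not something run on the affected server. For the processlist above it would show five threads in "Waiting for global read lock", three in "Waiting for prior transaction to commit", and one each in the other two states.

```sql
-- Count replication ("system user") threads per wait state
SELECT STATE, COUNT(*) AS threads
FROM information_schema.PROCESSLIST
WHERE USER = 'system user'
GROUP BY STATE
ORDER BY threads DESC;
```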

I think the slave worker threads ran into the read-only state while they were applying transactions, and the server started to hang.

I tried to stop the slave with STOP SLAVE, but that hung as well.
Then I killed the SET GLOBAL read_only=1 statement and the server freed up. The slave threads stopped as well.
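
The recovery described above might be sketched as follows. The thread id 5 is taken from the processlist in this report. Whether KILL or KILL QUERY was used is not stated, so KILL QUERY here is an assumption, and the two checks afterwards are suggested follow-ups rather than something the report says was run.

```sql
-- Abort the hanging SET GLOBAL read_only=1 (thread 5 in the processlist above).
-- Assumption: KILL QUERY; the report only says the statement was "killed".
KILL QUERY 5;

-- Suggested checks afterwards (not part of the original report):
SHOW SLAVE STATUS\G          -- have the slave threads stopped?
SELECT @@GLOBAL.read_only;   -- did read_only stay at its old value?
```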

The server on which I experienced this had to go back into production. Hopefully we can reproduce it on another system, or you can reproduce it in a lab.

People

Assignee: Susil Behera

Reporter: Michaël de groot

Votes: 0

Watchers: 11

Dates

Created: 2016-09-07 14:37

Updated: 2025-10-29 10:07

Resolved: 2025-10-29 10:07