MariaDB Server / MDEV-10761

read_only applies to slave parallel worker threads

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Cannot Reproduce
    • 10.1.17, 10.6, 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5(EOL), 10.7(EOL)
    • N/A
    • Locking, Replication
    • Can result in hang or crash
    • 10.0.28

    Description

      Hi,

      I ran into a strange issue when setting a server to read_only.

      Settings:
      slave_parallel_mode=optimistic
      slave_parallel_threads=8

      The system was replicating a lot of changes (it had just been restored from a backup).
      The process I executed was:
      1. Restore backup
      2. SET GLOBAL gtid_slave_pos='slave pos from backup'
      3. CHANGE MASTER TO MASTER_USE_GTID=slave_pos
      4. START SLAVE
      5. SET GLOBAL read_only=1;

      The system started to hang with this processlist:

      MariaDB [(none)]> show processlist;
      +----+-------------+--------------------+--------------+---------+------+-----------------------------------------------+------------------------+----------+
      | Id | User        | Host               | db           | Command | Time | State                                         | Info                   | Progress |
      +----+-------------+--------------------+--------------+---------+------+-----------------------------------------------+------------------------+----------+
      |  4 | root        | a.b.c.d:53344      | NULL         | Sleep   |    0 |                                               | NULL                   |    0.000 |
      |  5 | root        | localhost          | NULL         | Query   |   33 | Waiting for commit lock                       | set global read_only=1 |    0.000 |
      |  6 | system user |                    | NULL         | Connect |   41 | Waiting for master to send event              | NULL                   |    0.000 |
      |  7 | system user |                    | NULL         | Connect |   33 | Waiting for global read lock                  | NULL                   |    0.000 |
      |  8 | system user |                    | NULL         | Connect |   33 | Waiting for global read lock                  | NULL                   |    0.000 |
      |  9 | system user |                    | NULL         | Connect |   33 | Waiting for prior transaction to commit       | NULL                   |    0.000 |
      | 10 | system user |                    | NULL         | Connect |   33 | Waiting for global read lock                  | NULL                   |    0.000 |
      | 11 | system user |                    | NULL         | Connect |   33 | Waiting for prior transaction to commit       | NULL                   |    0.000 |
      | 12 | system user |                    | NULL         | Connect |   33 | Waiting for prior transaction to commit       | NULL                   |    0.000 |
      | 13 | system user |                    | NULL         | Connect |   32 | Waiting for global read lock                  | NULL                   |    0.000 |
      | 14 | system user |                    | NULL         | Connect |   33 | Waiting for global read lock                  | NULL                   |    0.000 |
      | 15 | system user |                    | NULL         | Connect |   40 | Waiting for room in worker thread event queue | NULL                   |    0.000 |
      | 16 | root        | 10.255.10.32:38644 | regressiondb | Sleep   |    3 |                                               | NULL                   |    0.000 |
      | 20 | root        | a.b.c.d:53514      | NULL         | Sleep   |   16 |                                               | NULL                   |    0.000 |
      | 22 | root        | localhost          | NULL         | Query   |    0 | init                                          | show processlist       |    0.000 |
      +----+-------------+--------------------+--------------+---------+------+-----------------------------------------------+------------------------+----------+
      15 rows in set (0.00 sec)
      

      I think the slave threads were caught by the read-only state partway through their work. The server then started to hang.
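
      As a rough aid for reading the processlist above, here is a minimal Python sketch that tallies the states of the replication ("system user") threads. The names ROWS and tally_states are made up for this illustration, and only the Id, User, State and Info columns are copied from the output. It summarizes the output and does not confirm a diagnosis.

```python
from collections import Counter

# Id, User, State and Info columns copied from the processlist above.
# The remaining columns are dropped to keep the sample short.
ROWS = """\
4|root||NULL
5|root|Waiting for commit lock|set global read_only=1
6|system user|Waiting for master to send event|NULL
7|system user|Waiting for global read lock|NULL
8|system user|Waiting for global read lock|NULL
9|system user|Waiting for prior transaction to commit|NULL
10|system user|Waiting for global read lock|NULL
11|system user|Waiting for prior transaction to commit|NULL
12|system user|Waiting for prior transaction to commit|NULL
13|system user|Waiting for global read lock|NULL
14|system user|Waiting for global read lock|NULL
15|system user|Waiting for room in worker thread event queue|NULL
16|root||NULL
20|root||NULL
22|root|init|show processlist
"""


def tally_states(rows: str, user: str = "system user") -> Counter:
    """Count processlist states for the threads owned by `user`."""
    counts = Counter()
    for line in rows.splitlines():
        _id, owner, state, _info = (field.strip() for field in line.split("|"))
        if owner == user:
            counts[state] += 1
    return counts


replication_states = tally_states(ROWS)
for state, count in replication_states.most_common():
    print(f"{count:2d}  {state}")
```

      With the rows above, the tally gives five threads waiting for the global read lock and three waiting for a prior transaction to commit, while the SET GLOBAL read_only=1 statement itself waits for the commit lock. That pattern would fit the threads blocking each other, but I have not verified it.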

      I tried to stop the slave with STOP SLAVE, but that hung as well.
      Then I killed the SET GLOBAL read_only=1 statement and the server freed up. The slave threads stopped as well.

      The server on which I experienced this had to go back into production. Hopefully we can reproduce it on another system, or you can reproduce it in a lab.

          People

            susil.behera Susil Behera
            michaeldg Michaël de groot