MariaDB Server / MDEV-30619

Parallel Slave SQL Thread Can Update Seconds_Behind_Master with Active Workers

Details

    Description

      If the workers of a parallel replica are busy (potentially with long queues) but the SQL thread has no events left to distribute, the SQL thread goes idle. The next event that arrives from the primary then updates last_master_timestamp (LMT) with its timestamp, even though the workers may be quite far behind. As a result, Seconds_Behind_Master can under-report the replica's actual lag.

      The proposed fix is for the SQL thread to additionally check whether there are uncommitted events. That is, we should add an atomic counter (displayable as a new system status variable) that the SQL thread increments on reads and that the workers decrement on commits. The SQL thread should update last_master_timestamp with the MDEV-29639 logic only if this counter is 0.
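
      The counter idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the actual patch: `ReplicaState`, `sql_thread_on_read`, and `worker_on_commit` are hypothetical names, and the MDEV-29639 idle-detection logic is reduced to a plain timestamp store.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <ctime>

// Hypothetical stand-in for the replica's shared state (not MariaDB's real struct).
struct ReplicaState {
  // Events read by the SQL thread that no worker has committed yet.
  std::atomic<std::uint64_t> pending_events{0};
  // Timestamp used to derive Seconds_Behind_Master.
  std::atomic<std::time_t> last_master_timestamp{0};
};

// SQL (driver) thread: call once per event read from the relay log.
// Returns true if last_master_timestamp was updated.
inline bool sql_thread_on_read(ReplicaState &s, std::time_t event_ts) {
  // fetch_add returns the value held *before* the increment, so a result
  // of 0 means no earlier event was still waiting to be committed.
  const bool workers_idle = s.pending_events.fetch_add(1) == 0;
  if (workers_idle)
    s.last_master_timestamp.store(event_ts);
  return workers_idle;
}

// Worker thread: call once per committed event.
// How workers themselves advance the timestamp is outside this sketch.
inline void worker_on_commit(ReplicaState &s) {
  const std::uint64_t prev = s.pending_events.fetch_sub(1);
  assert(prev > 0 && "commit without a matching read");
  (void)prev;  // silence unused-variable warnings when NDEBUG is defined
}
```

      With this shape, a read that arrives while an earlier event is still uncommitted leaves last_master_timestamp untouched; only after the workers have drained the counter back to 0 does the next read update it. Combining the check and the increment in one `fetch_add` avoids a window between testing the counter and bumping it.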

          Activity

            bnestere Brandon Nesterenko created issue -
            bnestere Brandon Nesterenko made changes -
            Field Original Value New Value
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.7 [ 24805 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.3 [ 22126 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.8 [ 26121 ]
            bnestere Brandon Nesterenko made changes -
            Status Open [ 1 ] In Progress [ 3 ]

            bnestere Brandon Nesterenko added a comment - Hi Andrei! This is ready for review as PR-2682
            bnestere Brandon Nesterenko made changes -
            Assignee Brandon Nesterenko [ JIRAUSER48702 ] Andrei Elkin [ elkin ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            pandi.gurusamy Pandikrishnan Gurusamy made changes -
            Labels CS0610910
            Roel Roel Van de Paar made changes -
            Status In Review [ 10002 ] In Testing [ 10301 ]
            julien.fritsch Julien Fritsch made changes -
            Labels CS0610910
            Roel Roel Van de Paar made changes -
            Assignee Andrei Elkin [ elkin ] Roel Van de Paar [ roel ]
            Roel Roel Van de Paar made changes -
            Description Capitalization only: "lmt" changed to "LMT"; the text is otherwise identical to the Description above.
            Roel Roel Van de Paar made changes -
            Assignee Roel Van de Paar [ roel ] Brandon Nesterenko [ JIRAUSER48702 ]
            Roel Roel Van de Paar made changes -
            Assignee Brandon Nesterenko [ JIRAUSER48702 ] Roel Van de Paar [ roel ]
            Roel Roel Van de Paar made changes -
            Assignee Roel Van de Paar [ roel ] Andrei Elkin [ elkin ]
            Status In Testing [ 10301 ] Stalled [ 10000 ]

            Roel Roel Van de Paar added a comment - Except for MDEV-31749 this is OK to push.

            Roel Roel Van de Paar added a comment - Please note that rpl.rpl_parallel_optimistic_until test failures (ref MDEV-23021) may be more pronounced after the implementation of this patch.
            Elkin Andrei Elkin made changes -
            Fix Version/s 10.4.31 [ 29010 ]
            Fix Version/s 10.5.22 [ 29011 ]
            Fix Version/s 10.6.15 [ 29013 ]
            Fix Version/s 10.9.8 [ 29015 ]
            Fix Version/s 10.10.6 [ 29017 ]
            Fix Version/s 10.11.5 [ 29019 ]
            Fix Version/s 11.0.3 [ 28920 ]
            Fix Version/s 11.1.2 [ 28921 ]
            Fix Version/s 11.2.1 [ 29034 ]
            Fix Version/s 10.8.8 [ 28518 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.9 [ 26905 ]
            Fix Version/s 10.10 [ 27530 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 202138

            People

              Assignee: Elkin Andrei Elkin
              Reporter: bnestere Brandon Nesterenko
              Votes: 0
              Watchers: 7
