MariaDB Server / MDEV-31211

START ALTER hang with multi-source replication and --gtid-ignore-duplicates

Details

    Description

      When a START ALTER event is applied, it is registered in
      mi->start_alter_list. However, mi (the per-connection Master_info) is the
      wrong place for this list: there is no guarantee that the matching COMMIT
      ALTER event will be applied in the context of the same mi.

      In multi-source replication A<->B, A->C, B->C with --gtid-ignore-duplicates,
      C will receive duplicates of all events on both the A and B master
      connections, and it is random which copy will be applied and which one
      ignored. If START ALTER runs in the context of A, but COMMIT ALTER runs in
      the context of B, then COMMIT ALTER will not find the start_alter_info and
      will try to do the full ALTER TABLE itself. But this deadlocks, because the
      SA thread of START ALTER is holding the locks on the table, waiting to be
      signalled from COMMIT ALTER.
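
      The lookup miss described above can be illustrated with a small
      stand-alone model. This is only a sketch, not the server code:
      ConnectionModel, apply_start_alter, commit_alter_finds_start and the
      integer sequence number used as the key are all hypothetical names chosen
      for illustration, and no locking or SA thread is modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the per-connection state (the real Master_info
// is far larger). Each connection owns its own list of pending START ALTERs.
struct ConnectionModel {
  std::string name;
  // Pending START ALTERs, keyed by an illustrative sequence number.
  std::map<std::uint64_t, std::string> start_alter_list;
};

// Models applying START ALTER in the context of one connection:
// the entry is visible only through that connection's list.
inline void apply_start_alter(ConnectionModel &mi, std::uint64_t seq_no,
                              const std::string &table) {
  mi.start_alter_list[seq_no] = table;
}

// Models COMMIT ALTER searching for its START ALTER. Only the list of the
// connection that applies the COMMIT ALTER is consulted.
inline bool commit_alter_finds_start(const ConnectionModel &mi,
                                     std::uint64_t seq_no) {
  return mi.start_alter_list.count(seq_no) != 0;
}

// Reproduces the mismatch: START ALTER applied via A, COMMIT ALTER via B.
// Returns true when COMMIT ALTER fails to find the pending START ALTER.
inline bool demo_cross_connection_miss() {
  ConnectionModel a{"A", {}};
  ConnectionModel b{"B", {}};
  apply_start_alter(a, 1, "t1");
  return !commit_alter_finds_start(b, 1);
}
```

      In this model the miss is the whole bug: in the server, the COMMIT ALTER
      side would then fall back to a full ALTER TABLE while the SA thread still
      holds the table locks.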

      I have a testcase on my knielsen_start_alter branch on GitHub (Jira<->GitHub
      integration will hopefully keep it referenced over rebases). It currently
      needs some more work, but does manage to reproduce the issue if run a
      sufficient number of times.

      Suggested fix: The start_alter_list needs to be global, shared between all
      mi's (and rli's). Then some thought needs to be given to which pending START
      ALTERs (and SA threads) to abort when one multi-source slave connection is
      stopped. I think it is reasonable to stop those START ALTERs that originated
      from the master connection that is stopped. This is overly conservative in
      scenarios like the one described above, but that's probably ok.
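
      A rough sketch of the shape such a shared list could take, under stated
      assumptions. GlobalStartAlterRegistry, PendingStartAlter and all member
      names are hypothetical, the key is an illustrative sequence number, and a
      plain std::mutex stands in for whatever locking the server would actually
      use. It is not the eventual fix, only an illustration of "one list for all
      connections, entries tagged with their originating connection".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// Hypothetical record for one pending START ALTER.
struct PendingStartAlter {
  std::string origin_connection;  // connection that applied START ALTER
  std::string table;
};

// Hypothetical process-wide registry shared by every connection.
class GlobalStartAlterRegistry {
public:
  // Called when START ALTER is applied in the context of `origin`.
  void register_start(std::uint64_t seq_no, const std::string &origin,
                      const std::string &table) {
    std::lock_guard<std::mutex> guard(lock_);
    pending_[seq_no] = PendingStartAlter{origin, table};
  }

  // Called by COMMIT ALTER from any connection. Removes the entry and
  // returns true if a matching START ALTER was pending.
  bool take_for_commit(std::uint64_t seq_no) {
    std::lock_guard<std::mutex> guard(lock_);
    return pending_.erase(seq_no) == 1;
  }

  // Called when one connection is stopped: drops only the START ALTERs that
  // originated from it and returns how many were dropped.
  std::size_t abort_for_connection(const std::string &origin) {
    std::lock_guard<std::mutex> guard(lock_);
    std::size_t aborted = 0;
    for (auto it = pending_.begin(); it != pending_.end();) {
      if (it->second.origin_connection == origin) {
        it = pending_.erase(it);
        ++aborted;
      } else {
        ++it;
      }
    }
    return aborted;
  }

  // Number of START ALTERs still waiting for their COMMIT ALTER.
  std::size_t pending_count() const {
    std::lock_guard<std::mutex> guard(lock_);
    return pending_.size();
  }

private:
  mutable std::mutex lock_;
  std::map<std::uint64_t, PendingStartAlter> pending_;
};
```

      With a registry of this kind, a COMMIT ALTER applied via B can find a
      START ALTER that was registered via A, and stopping connection A aborts
      only the entries tagged with A.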

      I am working on a fix; it might require some time, as it is not entirely trivial.

      - Kristian.

          People

            knielsen Kristian Nielsen