MariaDB Server / MDEV-30541

IST always fails -- wsrep_sst_mariabackup does not handle "secret" correctly when doing an IST

    Description

      I am rewriting this description completely, having learned a lot more. I have also changed the title of this report.

      I have a Galera cluster using [sst] encrypt=3 with SST mode mariabackup. Whenever a node gracefully shuts down and then comes back up, its first attempt to rejoin the cluster fails. Retrying (which by default happens automatically because of systemd) ultimately succeeds, but only after a full SST is done.

      That is, IST always fails. Trying again results in an SST which succeeds.

      My database is big enough that this is not really acceptable, and it doesn't seem to be the intended behavior. I narrowed it down to an error in syslog "Donor does not know my secret!".

      Sure enough, in wsrep_sst_mariabackup, when we are NOT bypassing (that is, in full SST mode), there is the following:

      if [ -n "$WSREP_SST_OPT_REMOTE_PSWD" ]; then
          # Let joiner know that we know its secret
          echo "$SECRET_TAG $WSREP_SST_OPT_REMOTE_PSWD" >> "$MAGIC_FILE"
      fi

      And when we ARE bypassing (that is, in IST mode) it is missing.

      I've modified wsrep_sst_mariabackup to add that statement in bypass mode, just after the $MAGIC_FILE is initially written, and now my nodes can come up with a quick IST rather than a long SST.
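
To illustrate the behavior described above, here is a small standalone model of the handshake. It is a sketch and not the actual wsrep_sst_mariabackup code. The value of SECRET_TAG, the example password, the placeholder first line of the magic file, and the helper functions write_magic_file and donor_knows_secret are all assumptions made up for this illustration. Only the if/echo/fi statement is taken from the script as quoted above.

```shell
#!/bin/sh
# Illustrative model only; NOT the real wsrep_sst_mariabackup script.
# Every value and helper below is a hypothetical stand-in.
SECRET_TAG='secret'                       # assumed tag value
WSREP_SST_OPT_REMOTE_PSWD='example-pw'    # hypothetical secret sent by the joiner
MAGIC_FILE=$(mktemp)
trap 'rm -f "$MAGIC_FILE"' EXIT

# Donor side: write the magic file.
# $1 = "with" to append the secret line, anything else to omit it.
write_magic_file() {
    # Placeholder for whatever the script initially writes to the file.
    echo 'example-state-id' > "$MAGIC_FILE"
    if [ "$1" = 'with' ] && [ -n "$WSREP_SST_OPT_REMOTE_PSWD" ]; then
        # Let joiner know that we know its secret
        echo "$SECRET_TAG $WSREP_SST_OPT_REMOTE_PSWD" >> "$MAGIC_FILE"
    fi
}

# Joiner side: succeed only if the magic file contains the expected secret line.
donor_knows_secret() {
    grep -q -x -F "$SECRET_TAG $WSREP_SST_OPT_REMOTE_PSWD" "$MAGIC_FILE"
}

# Bypass (IST) as described in this report: the secret line is never written.
write_magic_file without
if donor_knows_secret; then
    RESULT_WITHOUT='ok'
else
    RESULT_WITHOUT='Donor does not know my secret!'
fi

# Bypass (IST) with the statement added after the initial write.
write_magic_file with
if donor_knows_secret; then
    RESULT_WITH='ok'
else
    RESULT_WITH='Donor does not know my secret!'
fi

echo "bypass without secret line: $RESULT_WITHOUT"
echo "bypass with secret line:    $RESULT_WITH"
```

In this model the joiner rejects the transfer whenever the secret line is absent, which matches the symptom I saw for IST, and accepts it once the donor appends the line in bypass mode as well.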


            People

              sysprg Julius Goryavsky
              xan@biblionix.com Xan Charbonnet