MariaDB Server: MDEV-14256

MariaDB 10.2.10 can't SST with xtrabackup-v2

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.2.8
    • Fix Version/s: 10.2.11
    • Component/s: Galera SST, wsrep
    • Environment:
      [jg4461@db2 ~]$ cat /etc/redhat-release
      CentOS Linux release 7.4.1708 (Core)
      [jg4461@db2 ~]$ uname -r
      3.10.0-693.5.2.el7.x86_64
    • 10.2.11

    Description

      Following an upgrade to MariaDB-server-10.2.10-1.el7.centos.x86_64, wsrep_sst_xtrabackup-v2 is unable to initiate an SST to join a node to the cluster.

      It fails with the following errors:

      2017-11-01  7:23:33 140136889161472 [Note] WSREP: Flow-control interval: [23, 23]
      2017-11-01  7:23:33 140136889161472 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 4823208097)
      2017-11-01  7:23:33 140136880768768 [Note] WSREP: State transfer required: 
              Group state: c37249ee-cc56-11e3-8839-da7603c8db1b:4823208097
              Local state: c37249ee-cc56-11e3-8839-da7603c8db1b:4822595673
      2017-11-01  7:23:33 140136880768768 [Note] WSREP: New cluster view: global state: c37249ee-cc56-11e3-8839-da7603c8db1b:4823208097, view# 1879: Primary, number of nodes: 2, my index: 0, protocol version 3
      2017-11-01  7:23:33 140136880768768 [Warning] WSREP: Gap in state sequence. Need state transfer.
      2017-11-01  7:23:33 140136880039680 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '137.222.8.66' --datadir '/var/lib/mysql/data/'   --parent '27765'  '' '
      /usr//bin/wsrep_sst_xtrabackup-v2: line 646: WSREP_SST_OPT_PORT: unbound variable
      2017-11-01  7:23:34 140136880039680 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '137.222.8.66' --datadir '/var/lib/mysql/data/'   --parent '27765'  '' 
              Read: '(null)'
      2017-11-01  7:23:34 140136880039680 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '137.222.8.66' --datadir '/var/lib/mysql/data/'   --parent '27765'  '' : 1 (Operation not permitted)
      2017-11-01  7:23:34 140136880768768 [ERROR] WSREP: Failed to prepare for 'xtrabackup-v2' SST. Unrecoverable.
      2017-11-01  7:23:34 140136880768768 [ERROR] Aborting
      

      The root cause appears to be:

      /usr/bin/wsrep_sst_xtrabackup-v2: line 646: WSREP_SST_OPT_PORT: unbound variable
      

      WSREP_SST_OPT_PORT has no default value: none is set in wsrep_sst_xtrabackup-v2 or in wsrep_sst_common.

      The following diff sets a default value for the WSREP_SST_OPT_PORT variable, and allows the SST to proceed.

      --- wsrep_sst_common    2017-11-02 11:02:09.561266862 +0000
      +++ wsrep_sst_common_modified   2017-11-02 11:02:00.473166368 +0000
      @@ -27,6 +27,7 @@
       WSREP_SST_OPT_PSWD=${WSREP_SST_OPT_PSWD:-}
       WSREP_SST_OPT_DEFAULT=""
       WSREP_SST_OPT_EXTRA_DEFAULT=""
      +WSREP_SST_OPT_PORT=4444
       
       while [ $# -gt 0 ]; do
       case "$1" in
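
The failure mode and the effect of the diff above can be reproduced in isolation. The following is a minimal sketch, not code from the SST scripts: it assumes the script runs with bash's nounset option (`-u`), which is what the "unbound variable" message indicates, and it uses the `${VAR:-default}` form already used for WSREP_SST_OPT_PSWD rather than the unconditional assignment in the diff. The value 4444 is taken from the diff.

```shell
#!/bin/bash
# Case 1: under nounset (-u), expanding a never-assigned variable is fatal,
# so the child shell exits with a non-zero status.
if bash -uc 'unset WSREP_SST_OPT_PORT; : "$WSREP_SST_OPT_PORT"' 2>/dev/null; then
    unset_ok=yes
else
    unset_ok=no
fi

# Case 2: assigning a default first makes the same expansion safe.
# The ${VAR:-default} form is itself exempt from the nounset check.
port_line=$(bash -uc 'unset WSREP_SST_OPT_PORT
WSREP_SST_OPT_PORT=${WSREP_SST_OPT_PORT:-4444}
echo "port=$WSREP_SST_OPT_PORT"')

echo "expansion without default succeeded: $unset_ok"
echo "$port_line"
```

Note that a variable which is set but empty does not trigger the error; only a variable that was never assigned does.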
      

          Activity

            anikitin Andrii Nikitin (Inactive) added a comment:

            serg, it looks like the change was merged in this commit; before that, the script parsed the address expression directly:
            https://github.com/MariaDB/server/commit/83664e21e4fb6755c8c0c90d3dee8819d36928c9#diff-cca56af3f0ce3e7f4fbc13dc62cc2823R640

            serg Sergei Golubchik added a comment:

            I'm not sure that was it. The old code used

                 '--address')
                     readonly WSREP_SST_OPT_ADDR="$2"
            ...
                    SST_PORT=$(echo ${WSREP_SST_OPT_ADDR} | awk -F ':' '{ print $2 }')

            That would have thrown an error if --address was not used. The new code does

                 '--address')
                    readonly WSREP_SST_OPT_ADDR="$2"
            ...
                    readonly WSREP_SST_OPT_PORT=$(echo $WSREP_SST_OPT_ADDR | \
                            cut -d ']' -f 2 | cut -s -d ':' -f 2 | cut -d '/' -f 1)
            ...
                    SST_PORT=$WSREP_SST_OPT_PORT

            Assuming that --address is used (because the old code didn't fail), I don't see how WSREP_SST_OPT_PORT could be unset. There was a later relevant commit, 4c2c057d404, but I don't see how it could have left WSREP_SST_OPT_PORT unset either.
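
To make the comparison concrete, the two extraction pipelines quoted in this comment can be wrapped in functions and run against a few addresses. This is only an illustrative sketch: the function names `old_port` and `new_port` are hypothetical, and every sample address except the port-less 137.222.8.66 from the log is made up.

```shell
#!/bin/bash
# Old-style extraction: the second ':'-separated field of the address.
old_port() {
    echo "$1" | awk -F ':' '{ print $2 }'
}

# New-style extraction: skip a bracketed IPv6 host, take the field after ':',
# then drop anything after a '/'.
new_port() {
    echo "$1" | cut -d ']' -f 2 | cut -s -d ':' -f 2 | cut -d '/' -f 1
}

# Print both results side by side for each sample address.
for addr in '137.222.8.66' '137.222.8.66:4444' \
            '137.222.8.66:4444/xtrabackup_sst' '[::1]:4444'; do
    printf '%-36s old=[%s] new=[%s]\n' "$addr" "$(old_port "$addr")" "$(new_port "$addr")"
done
```

For an address without a port, both pipelines produce an empty string, so the variable assigned from them is set but empty rather than unset. That is consistent with the observation that this parsing change alone should not cause an "unbound variable" error.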
            anikitin Andrii Nikitin (Inactive) added a comment (edited):

            Yes, correct. Then, after the fix for MDEV-13968, the variable became WSREP_SST_OPT_ADDR_PORT, which is initialized in the --address branch, and WSREP_SST_OPT_PORT was left unset.
            The patch from the duplicate MDEV-14299 is probably better than the one I suggested above.
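
The situation described in this comment can be sketched as follows. This is a simplified, hypothetical model and not the code from wsrep_sst_common or the committed patch: `parse_args` and `resolve_port` are invented names, only `--address` is handled, and the fallback shown is just one plausible way to reconcile the two variables. The port 4444 in the usage lines is an assumed example value.

```shell
#!/bin/bash
# Hypothetical, simplified option parser: --address fills
# WSREP_SST_OPT_ADDR_PORT, and nothing assigns WSREP_SST_OPT_PORT.
parse_args() {
    WSREP_SST_OPT_ADDR=""
    WSREP_SST_OPT_ADDR_PORT=""
    unset WSREP_SST_OPT_PORT
    while [ $# -gt 0 ]; do
        case "$1" in
            '--address')
                WSREP_SST_OPT_ADDR="$2"
                WSREP_SST_OPT_ADDR_PORT=$(echo "$2" | \
                    cut -d ']' -f 2 | cut -s -d ':' -f 2 | cut -d '/' -f 1)
                shift
                ;;
        esac
        shift
    done
}

# Assumed fallback (not necessarily the committed fix): when no port was
# given explicitly, reuse the one parsed from the address, so that later
# code reading WSREP_SST_OPT_PORT always finds the variable set.
resolve_port() {
    if [ -z "${WSREP_SST_OPT_PORT:-}" ]; then
        WSREP_SST_OPT_PORT="${WSREP_SST_OPT_ADDR_PORT:-}"
    fi
}

parse_args --role joiner --address '137.222.8.66:4444'
resolve_port
echo "SST_PORT=${WSREP_SST_OPT_PORT}"
```

Without the `resolve_port` step, any later `$WSREP_SST_OPT_PORT` expansion under nounset would abort the script, which matches the error in the report.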

            serg Sergei Golubchik added a comment:

            Committed a patch.

            anikitin Andrii Nikitin (Inactive) added a comment:

            serg, the patch is good. I verified it by patching 10.2.10 in the Docker image from the earlier case, like this: https://github.com/AndriiNikitin/bugs/blob/master/MDEV-14256-test1.sh#L24

            People

              Assignee: serg Sergei Golubchik
              Reporter: jgazeley Jonathan Gazeley