  1. MariaDB Server
  2. MDEV-35387

wsrep_sst_rsync crashes if aria_log_dir_path is defined

Details

    Description

      After updating from 10.11.9 to 10.11.10 we see a problem with rsync SST. wsrep_sst_common appears to have been heavily rewritten, and when a dedicated log directory is configured for Aria, the rsync SST method fails:

      aria_log_dir_path=/var/lib/mysql/logs
      datadir=/var/lib/mysql/data
      

      The problem is in the function create_dirs in wsrep_sst_common: a non-zero exit code there is probably caught by a trap somewhere. With a custom Aria log directory, the last exit code is 1 because of the condition on line 1866:

      [ $simplify -ne 0 -a "$ar_log_dir" = "$DATA_DIR" ] && ar_log_dir=""
      

      After this line the script flow goes to simple_cleanup and exits. If you add a command that returns exit code 0 on the line below it, for example echo "1", the SST continues and everything works.
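The status behavior described above can be reproduced outside the SST scripts. The following is a minimal sketch, not the code from wsrep_sst_common: only the `[ ... ] && ar_log_dir=""` line is taken from the report, while the function names `create_dirs_like` and `create_dirs_fixed` and the variable values are hypothetical. It shows that when such a list is the last command of a function and the test is false, the function itself returns 1, and that an `if` form of the same logic returns 0.

```shell
#!/bin/sh
# Hypothetical stand-ins for the variables used on the reported line.
simplify=0
DATA_DIR=/var/lib/mysql/data

# Mirrors the reported pattern: the "[ ... ] && ..." list is the last
# command, so its status becomes the return status of the function.
create_dirs_like() {
    ar_log_dir=/var/lib/mysql/logs
    [ $simplify -ne 0 -a "$ar_log_dir" = "$DATA_DIR" ] && ar_log_dir=""
}

# Same logic written with "if": a false condition with no else branch
# leaves the status at 0.
create_dirs_fixed() {
    ar_log_dir=/var/lib/mysql/logs
    if [ $simplify -ne 0 -a "$ar_log_dir" = "$DATA_DIR" ]; then
        ar_log_dir=""
    fi
}

# Capture the return status of each variant without aborting this script.
status_like=0
create_dirs_like || status_like=$?
status_fixed=0
create_dirs_fixed || status_fixed=$?

echo "like=$status_like fixed=$status_fixed"
```

Adding any command that succeeds after the list, such as the `echo "1"` mentioned in the report, has the same effect on the return status as the `if` form. Whether the actual fix uses either approach is not stated here.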

      Output from bash -x

      ....
      + '[' 0 -ne 0 -a /var/lib/mysql/logs = /var/lib/mysql/data ']'
      + simple_cleanup
      + local estatus=1
      + '[' 1 -ne 0 ']'
      + wsrep_log_error 'Cleanup after exit with status: 1'
      + wsrep_log '[ERROR] Cleanup after exit with status: 1'
      + local t
      + '[' Linux = Linux ']'
      ++ date '+%Y%m%d %H:%M:%S.%3N'
      + t='20241111 19:27:39.773'
      + echo 'WSREP_SST: [ERROR] Cleanup after exit with status: 1 (20241111 19:27:39.773)'
      WSREP_SST: [ERROR] Cleanup after exit with status: 1 (20241111 19:27:39.773)
      + '[' -n /var/lib/mysql/data/wsrep_sst.pid ']'
      ++ pwd
      + '[' /usr/bin '!=' /usr/bin ']'
      + '[' -f /var/lib/mysql/data/wsrep_sst.pid ']'
      + rm -f /var/lib/mysql/data/wsrep_sst.pid
      + exit 1
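The trace is consistent with an exit handler running after a function returned a non-zero status. The sketch below reproduces that flow under two assumptions that the report does not confirm: that the script runs with `set -e`, and that the cleanup function is registered with `trap ... EXIT`. The names `cleanup_sketch` and `risky` are hypothetical. The demo runs in a child `sh` process so that it cannot terminate the calling shell.

```shell
#!/bin/sh
# Child script kept in a variable so the demo does not end the caller.
demo_script='
set -e                                  # assumed: abort on any failing command
cleanup_sketch() {
    estatus=$?                          # status the script is exiting with
    if [ "$estatus" -ne 0 ]; then
        echo "Cleanup after exit with status: $estatus"
    fi
    exit "$estatus"
}
trap cleanup_sketch EXIT                # assumed registration of the handler

risky() {
    # Last command is a false "[ ... ] &&" list, so risky returns 1.
    [ 0 -ne 0 ] && unused=""
}
risky                                   # non-zero return triggers set -e
echo "not reached"
'

# Run the child script and record its exit status and output.
demo_status=0
demo_output=$(sh -c "$demo_script") || demo_status=$?
echo "status=$demo_status output=$demo_output"
```

In this sketch the child script never reaches its final `echo`: the call to `risky` ends the script, and the handler prints the cleanup message before exiting with the same status.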
      

      How to test:
      Configure different Aria log and data directories, e.g.:

      aria_log_dir_path=/var/lib/mysql/logs
      datadir=/var/lib/mysql/data
      

      and run an rsync SST. The problematic part is:

      wsrep_sst_rsync --role 'joiner' --address '10.11.2.31' --datadir '/var/lib/mysql/data/' --parent 2603575 --progress 0 --binlog '/var/lib/mysql/logs/mysql-bin' --binlog-index '/var/lib/mysql/logs/mysql-bin.index' --mysqld-args --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1,0-0-0
      

      If you need some more data, let me know.

        Activity

          TM Tomas Merta created issue -
          serg Sergei Golubchik made changes -
          Assignee Julius Goryavsky [ sysprg ]
          serg Sergei Golubchik made changes -
          Fix Version/s 10.11 [ 27614 ]
          sysprg Julius Goryavsky made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          sysprg Julius Goryavsky added a comment - Fixed, https://github.com/MariaDB/server/commit/b52f88edf8b4660311c47fc0ffe05675ac06a5a9
          sysprg Julius Goryavsky made changes -
          Fix Version/s 10.5.28 [ 29952 ]
          Fix Version/s 10.11 [ 27614 ]
          Resolution Fixed [ 1 ]
          Status In Progress [ 3 ] Closed [ 6 ]
          JIraAutomate JiraAutomate made changes -
          Fix Version/s 10.6.21 [ 29953 ]
          Fix Version/s 10.11.11 [ 29954 ]
          Fix Version/s 11.4.5 [ 29956 ]
          Fix Version/s 11.7.2 [ 29914 ]

          sysprg Julius Goryavsky added a comment - TM Thank you for reporting this error!

          People

            sysprg Julius Goryavsky
            TM Tomas Merta
            Votes: 0
            Watchers: 3
