Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13517

SST fails when donor node has problem with logging file descriptor

Details

    Description

      Hi there,

      we are running a MariaDB Galera Cluster on Debian Jessie and we are sometimes experiencing the issue analyzed here: https://serverfault.com/a/715940/115361

      I suspect it happens when systemd-journald is being restarted while everything else is running.

      The immediately visible hint to that is when systemctl status mariadb.service on a node does not show any recent logging output anymore (although there should be some).
      mysqld is still running normally, accepting queries and replicating to the other nodes, but an SST that tries to use such a defective node as donor will always fail with Will never receive state. Need to abort.

      Restarting mariadb.service on the defective donor node fixes the issue until the logger process dies the next time.

      mariadb-service-convert created the dropin configuration file to make mysqld log directly to syslog:

      StandardOutput=syslog
      StandardError=syslog
      SyslogFacility=daemon
      SyslogLevel=err
      SyslogIdentifier=mysqld

      I'd be happy to assist with more information to solve this issue soon.

      Attachments

        Activity

          The problem has been verified in 10.1 on Jessie and Stretch using steps below:

          1. Setup Environs framework for easy local cluster management

          sudo apt-get -y install m4 git
          git clone http://github.com/AndriiNikitin/mariadb-environs
          cd mariadb-environs
          

          2. (optional) Install 10.1, unless already installed

          # danger !!! will wipe out datadir
          ./replant.sh m0-system
          # run these commands with sudo to wipe out current installation:
          _system/uninstall.sh
          rm -rf /var/lib/mysql/*
          m0-system/install.sh 10.1
          

          3. Configure new cluster
          Make sure that following galera configuration is in my.cnf

          [galera]
          wsrep_on=ON
          wsrep_provider=/usr/lib/galera/libgalera_smm.so
          bind-address=0.0.0.0
          wsrep_cluster_address=gcomm://127.0.0.1
          wsrep_cluster_name=t1
          binlog_format=row
          default_storage_engine=InnoDB
          innodb_autoinc_lock_mode=2
          wsrep_sst_method=rsync
          wsrep_node_address=127.0.0.1
          wsrep_node_name=n0
          

          4. Start new cluster and make sure it is started properly:

          sudo systemctl stop mariadb
          sudo galera_new_cluster
          sleep 15
          # this must show ON
          mysql -uroot -e 'show variables like "wsrep_on"'
          

          5. configure another local instance on 3308

          ./replant.sh m2-system2
          m2*/gen_cnf.sh
          m2*/install_db.sh
          cat << EOF >> m2*/mysqldextra.cnf
          binlog_format=ROW
          default-storage-engine=innodb
          innodb_autoinc_lock_mode=2
          bind-address=0.0.0.0
          wsrep_on=ON
          wsrep_provider=/usr/lib/galera/libgalera_smm.so
          wsrep_cluster_name=t1
          wsrep_cluster_address=gcomm://127.0.0.1:4567
          wsrep_sst_method=rsync
          wsrep_node_address=127.0.0.1:4569
          wsrep_node_name=127.0.0.1_4569
          EOF
          

          6. Join new instance to the cluster, observe that it joined properly

          mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) &
          sleep 60
          # must show ON
          mysql --defaults-file=m2-system2/my.cnf -e 'show variables like "wsrep_on"'
          # must show 2
          mysql --defaults-file=m2-system2/my.cnf -e 'show status like "wsrep_cluster_size"'
          

          7. Shutdown joiner, cleanup-reset it

          m2*/shutdown.sh
          rm -rf m2*/dt/*
          m2*/install_db.sh
          

          8. (optional) repeat step 6 and 7 many times to confirm it joins every time without problem

          9. restart journal service

          sudo systemctl stop systemd-journald
          

          10. try step 6 again, observe join failure and following error in m2-system2/dt/error.log

          WSREP: 0.0 (n0): State transfer to 1.0 (127.0.0.1_4569) failed: -141 (Unknown error 141)

          mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) &
          sleep 30
          grep -B1 ERROR m2-system2/dt/error.log
          

          anikitin Andrii Nikitin (Inactive) added a comment - The problem has been verified in 10.1 on Jessie and Stretch using steps below: 1. Setup Environs framework for easy local cluster management sudo apt-get -y install m4 git git clone http: //github .com /AndriiNikitin/mariadb-environs cd mariadb-environs 2. (optional) Install 10.1, unless already installed # danger !!! will wipe out datadir . /replant .sh m0-system # run these commands with sudo to wipe out current installation: _system /uninstall .sh rm -rf /var/lib/mysql/ * m0-system /install .sh 10.1 3. Configure new cluster Make sure that following galera configuration is in my.cnf [galera] wsrep_on=ON wsrep_provider= /usr/lib/galera/libgalera_smm .so bind-address=0.0.0.0 wsrep_cluster_address=gcomm: //127 .0.0.1 wsrep_cluster_name=t1 binlog_format=row default_storage_engine=InnoDB innodb_autoinc_lock_mode=2 wsrep_sst_method= rsync wsrep_node_address=127.0.0.1 wsrep_node_name=n0 4. Start new cluster and make sure it is started properly: sudo systemctl stop mariadb sudo galera_new_cluster sleep 15 # this must show ON mysql -uroot -e 'show variables like "wsrep_on"' 5. configure another local instance on 3308 . /replant .sh m2-system2 m2* /gen_cnf .sh m2* /install_db .sh cat << EOF >> m2* /mysqldextra .cnf binlog_format=ROW default-storage-engine=innodb innodb_autoinc_lock_mode=2 bind-address=0.0.0.0 wsrep_on=ON wsrep_provider= /usr/lib/galera/libgalera_smm .so wsrep_cluster_name=t1 wsrep_cluster_address=gcomm: //127 .0.0.1:4567 wsrep_sst_method= rsync wsrep_node_address=127.0.0.1:4569 wsrep_node_name=127.0.0.1_4569 EOF 6. Join new instance to the cluster, observe that it joined properly mysqld_safe --defaults- file =m2-system2 /my .cnf --user=$( whoami ) & sleep 60 # must show ON mysql --defaults- file =m2-system2 /my .cnf -e 'show variables like "wsrep_on"' # must show 2 mysql --defaults- file =m2-system2 /my .cnf -e 'show status like "wsrep_cluster_size"' 7. Shutdown joiner, cleanup-reset it m2* /shutdown .sh rm -rf m2* /dt/ * m2* /install_db .sh 8. (optional) repeat step 6 and 7 many times to confirm it joins every time without problem 9. restart journal service sudo systemctl stop systemd-journald 10. try step 6 again, observe join failure and following error in m2-system2/dt/error.log WSREP: 0.0 (n0): State transfer to 1.0 (127.0.0.1_4569) failed: -141 (Unknown error 141) mysqld_safe --defaults- file =m2-system2 /my .cnf --user=$( whoami ) & sleep 30 grep -B1 ERROR m2-system2 /dt/error .log

          strace shows:

          14229 write(2, "2017-08-15 10:47:26 140592914888448 [Note] WSREP: (5c7243d7, 'tcp://0.0.0.0:4567') connection established to a7a45725 tcp://127.0.0.1:4569\n", 139) = -1 EPIPE (Broken pipe)
          

          anikitin Andrii Nikitin (Inactive) added a comment - strace shows: 14229 write(2, "2017-08-15 10:47:26 140592914888448 [Note] WSREP: (5c7243d7, 'tcp://0.0.0.0:4567') connection established to a7a45725 tcp://127.0.0.1:4569\n", 139) = -1 EPIPE (Broken pipe)

          Related external bugs :
          https://bugs.freedesktop.org/show_bug.cgi?id=84923
          https://github.com/systemd/systemd/issues/6620
          https://github.com/codership/galera/issues/478

          Workaround is to avoid syslog for MariaDB Server logging :
          Make sure that Error log is enabled in configuration files and syslog is disabled, e.g.

          [mysqld_safe]
          skip_syslog
          [mysqld]
          log_error=/your/path/error.log
          

          With configuration above original error is not shown and journal restart doesn't affect sst

          anikitin Andrii Nikitin (Inactive) added a comment - Related external bugs : https://bugs.freedesktop.org/show_bug.cgi?id=84923 https://github.com/systemd/systemd/issues/6620 https://github.com/codership/galera/issues/478 Workaround is to avoid syslog for MariaDB Server logging : Make sure that Error log is enabled in configuration files and syslog is disabled, e.g. [mysqld_safe] skip_syslog [mysqld] log_error=/your/path/error.log With configuration above original error is not shown and journal restart doesn't affect sst

          Since the error is caused by external components (systemd and codership/galera), I am going to close this ticket , unless any objections.
          Please contribute / comment to provided links above, unless any justification present that the problem may be fixed on Server part.

          anikitin Andrii Nikitin (Inactive) added a comment - Since the error is caused by external components (systemd and codership/galera), I am going to close this ticket , unless any objections. Please contribute / comment to provided links above, unless any justification present that the problem may be fixed on Server part.

          I am not sure at the moment why the problem reproduces only in Debian (including buster with the most recent systemd 234), while e.g. on Fedora 25/26 with systemd 233/231 and Centos 7 with older systemd 219-30 the worst behavior is just mariadb service becomes silent.
          Anyway - it looks that nothing can be done on MariaDB Server side besides suggested workaround.

          anikitin Andrii Nikitin (Inactive) added a comment - I am not sure at the moment why the problem reproduces only in Debian (including buster with the most recent systemd 234), while e.g. on Fedora 25/26 with systemd 233/231 and Centos 7 with older systemd 219-30 the worst behavior is just mariadb service becomes silent. Anyway - it looks that nothing can be done on MariaDB Server side besides suggested workaround.
          schlarbm Moritz Schlarb added a comment - - edited

          Hey @Andrii Nikitin,

          sorry that I can only answer now!
          Thanks for investigating the issue and even properly reporting it to upstream.

          So as a followup to your last comment, we should add that systemctl restart systemd-jounald.service is different from systemctl stop systemd-jounald.service; systemctl start systemd-jounald.service (at least for Systemd versions >= 219), right?

          Then I would suspect that the other possible workaround for Debian Jessie systems would be to install
          systemd from jessie-backports (the fix that Lennart mentions in the freedesktop.org bug, where journald "stores" its file descriptors in PID 1 on restart is available since version 219, I think):

            Installed: 215-17+deb8u7
            Candidate: 215-17+deb8u7
            Version table:
               230-7~bpo8+2 0
                  600 http://ftp.uni-mainz.de/debian/ jessie-backports/main amd64 Packages
                  600 http://ftp.de.debian.org/debian/ jessie-backports/main amd64 Packages
           *** 215-17+deb8u7 0
                  700 http://ftp.uni-mainz.de/debian/ jessie/main amd64 Packages
                  700 http://ftp.de.debian.org/debian/ jessie/main amd64 Packages
                  100 /var/lib/dpkg/status
          

          I understand that there is not all too much that you can fix there from MariaDB Server PoV, but since the configuration is currently the default on a new installation of MariaDB on Debian, I think it would be nice if you could at least add a prominent warning to README or the config file (/etc/mysql/conf.d/mysqld_safe_syslog.cnf maybe) that this can happen.

          Regards,
          Moritz

          schlarbm Moritz Schlarb added a comment - - edited Hey @Andrii Nikitin, sorry that I can only answer now! Thanks for investigating the issue and even properly reporting it to upstream. So as a followup to your last comment, we should add that systemctl restart systemd-jounald.service is different from systemctl stop systemd-jounald.service; systemctl start systemd-jounald.service (at least for Systemd versions >= 219), right? Then I would suspect that the other possible workaround for Debian Jessie systems would be to install systemd from jessie-backports (the fix that Lennart mentions in the freedesktop.org bug, where journald "stores" its file descriptors in PID 1 on restart is available since version 219, I think): Installed: 215-17+deb8u7 Candidate: 215-17+deb8u7 Version table: 230-7~bpo8+2 0 600 http://ftp.uni-mainz.de/debian/ jessie-backports/main amd64 Packages 600 http://ftp.de.debian.org/debian/ jessie-backports/main amd64 Packages *** 215-17+deb8u7 0 700 http://ftp.uni-mainz.de/debian/ jessie/main amd64 Packages 700 http://ftp.de.debian.org/debian/ jessie/main amd64 Packages 100 /var/lib/dpkg/status I understand that there is not all too much that you can fix there from MariaDB Server PoV, but since the configuration is currently the default on a new installation of MariaDB on Debian, I think it would be nice if you could at least add a prominent warning to README or the config file ( /etc/mysql/conf.d/mysqld_safe_syslog.cnf maybe) that this can happen. Regards, Moritz
          anikitin Andrii Nikitin (Inactive) added a comment - - edited

          Yes in my understanding 'restart' should be safer than 'stop&start' , at least in recent versions. (because https://github.com/systemd/systemd/issues/6620 should not trigger after 'restart' ).
          But on practice I see that mariadb service becomes silent after restart even on systemd (230-7~bpo8+2) on jessie .

          But good news is that I can confirm that the problem doesn't occur with my steps when using both 'stop' or 'restart' with your suggested workaround (i.e. SST is performed properly, just service becomes silent anyway):

          sudo apt-get -t jessie-backports install systemd
          

          $ apt-cache policy systemd
          systemd:
            Installed: 215-17
            Candidate: 215-17+deb8u7
            Version table:
               230-7~bpo8+2 0
                  100 http://ftp.debian.org/debian/ jessie-backports/main amd64 Packages
               215-17+deb8u7 0
                  500 http://debian.osuosl.org/debian/ jessie/main amd64 Packages
           *** 215-17 0
                  100 /var/lib/dpkg/status
          $ sudo apt-get -t jessie-backports install systemd
          ...
          $ git clone http://github.com/AndriiNikitin/mariadb-environs
          ...
          $ ./replant.sh m0-system
          $ sudo m0-system/install.sh 10.1
          ...
          $ sudo vim /etc/mysql/my.cnf 
          $ ls /usr/lib/galera/
          libgalera_smm.so
          $ sudo systemctl stop mariadb
          $ sudo galera_new_cluster
          $ mysql -uroot -e 'show variables like "wsrep_on"'
          +---------------+-------+
          | Variable_name | Value |
          +---------------+-------+
          | wsrep_on      | ON    |
          +---------------+-------+
          $ ./replant.sh m2-system2
          $ m2*/gen_cnf.sh
          $ m2*/install_db.sh
          $ cat << EOF >> m2*/mysqldextra.cnf
          > binlog_format=ROW
          > default-storage-engine=innodb
          > innodb_autoinc_lock_mode=2
          > bind-address=0.0.0.0
          > wsrep_on=ON
          > wsrep_provider=/usr/lib/galera/libgalera_smm.so
          > wsrep_cluster_name=t1
          > wsrep_cluster_address=gcomm://127.0.0.1:4567
          > wsrep_sst_method=rsync
          > wsrep_node_address=127.0.0.1:4569
          > wsrep_node_name=127.0.0.1_4569
          > EOF
          $ mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) &
          [1] 27407
          $ mysql --defaults-file=m2-system2/my.cnf -e 'show status like "wsrep_cluster_size"'
          +--------------------+-------+
          | Variable_name      | Value |
          +--------------------+-------+
          | wsrep_cluster_size | 2     |
          +--------------------+-------+
          $ m2*/shutdown.sh
          [1]+  Done                    mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami)
          $ rm -rf m2*/dt/*
          $ m2*/install_db.sh
          $ sudo systemctl restart systemd-journald
          $ sudo systemctl status mariadb
          ● mariadb.service - MariaDB database server
             Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
            Drop-In: /etc/systemd/system/mariadb.service.d
                     └─migrated-from-my.cnf-settings.conf
             Active: active (running) since Fri 2017-08-18 10:26:46 EDT; 6min ago
            Process: 28088 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
            Process: 28086 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
            Process: 27909 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-env
            Process: 27905 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
            Process: 27902 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
           Main PID: 28053 (mysqld)
             Status: "Taking your SQL requests now..."
             CGroup: /system.slice/mariadb.service
                     └─28053 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=afe4dfc2-8420-11e7-b8c2-aa002665b1b9:0
           
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: group UUID = afe4dfc2-8420-11e7-b8c2-aa002665b1b9
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203415037696 [Note] WSREP: Flow-control interval: [16, 16]
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: New cluster view: global state: afe4dfc2-8420-11e7-b8c
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: REPL Protocols: 7 (3, 2)
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: Assign initial position for certification: 0, protocol
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203473073920 [Note] WSREP: Service thread queue flushed.
          Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Note] WSREP: (436428c9, 'tcp://0.0.0.0:4567') connection establishe
          Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Warning] WSREP: discarding established (time wait) 4e987697 (tcp://
          Aug 18 10:31:28 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:28 140203423430400 [Note] WSREP:  cleaning up 4e987697 (tcp://127.0.0.1:4569)
          $ mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) &
          [1] 28557
          $ mysql --defaults-file=m2-system2/my.cnf -e 'show status like "wsrep_cluster_size"'
          +--------------------+-------+
          | Variable_name      | Value |
          +--------------------+-------+
          | wsrep_cluster_size | 2     |
          +--------------------+-------+
          $ sudo systemctl status mariadb
          ● mariadb.service - MariaDB database server
             Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
            Drop-In: /etc/systemd/system/mariadb.service.d
                     └─migrated-from-my.cnf-settings.conf
             Active: active (running) since Fri 2017-08-18 10:26:46 EDT; 8min ago
            Process: 28088 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
            Process: 28086 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
            Process: 27909 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-env
            Process: 27905 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
            Process: 27902 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
           Main PID: 28053 (mysqld)
             Status: "Taking your SQL requests now..."
             CGroup: /system.slice/mariadb.service
                     └─28053 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=afe4dfc2-8420-11e7-b8c2-aa002665b1b9:0
           
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: group UUID = afe4dfc2-8420-11e7-b8c2-aa002665b1b9
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203415037696 [Note] WSREP: Flow-control interval: [16, 16]
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: New cluster view: global state: afe4dfc2-8420-11e7-b8c
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: REPL Protocols: 7 (3, 2)
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: Assign initial position for certification: 0, protocol
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203473073920 [Note] WSREP: Service thread queue flushed.
          Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Note] WSREP: (436428c9, 'tcp://0.0.0.0:4567') connection establishe
          Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Warning] WSREP: discarding established (time wait) 4e987697 (tcp://
          Aug 18 10:31:28 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:28 140203423430400 [Note] WSREP:  cleaning up 4e987697 (tcp://127.0.0.1:4569)
          $ m2*/shutdown.sh
          [1]+  Done                    mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami)
          $ rm -rf m2*/dt/*
          $ m2*/install_db.sh
          $ sudo systemctl stop systemd-journald
          $ mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) &
          [1] 29020
          $ mysql --defaults-file=m2-system2/my.cnf -e 'show status like "wsrep_cluster_size"'
          +--------------------+-------+
          | Variable_name      | Value |
          +--------------------+-------+
          | wsrep_cluster_size | 2     |
          +--------------------+-------+
          $ sudo systemctl status mariadb
          ● mariadb.service - MariaDB database server
             Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
            Drop-In: /etc/systemd/system/mariadb.service.d
                     └─migrated-from-my.cnf-settings.conf
             Active: active (running) since Fri 2017-08-18 10:26:46 EDT; 11min ago
            Process: 28088 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
            Process: 28086 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
            Process: 27909 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-envi
            Process: 27905 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
            Process: 27902 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
           Main PID: 28053 (mysqld)
             Status: "Taking your SQL requests now..."
             CGroup: /system.slice/mariadb.service
                     └─28053 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=afe4dfc2-8420-11e7-b8c2-aa002665b1b9:0
           
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: group UUID = afe4dfc2-8420-11e7-b8c2-aa002665b1b9
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203415037696 [Note] WSREP: Flow-control interval: [16, 16]
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: New cluster view: global state: afe4dfc2-8420-11e7-b8c2
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: REPL Protocols: 7 (3, 2)
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: Assign initial position for certification: 0, protocol 
          Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203473073920 [Note] WSREP: Service thread queue flushed.
          Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Note] WSREP: (436428c9, 'tcp://0.0.0.0:4567') connection established
          Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Warning] WSREP: discarding established (time wait) 4e987697 (tcp://1
          Aug 18 10:31:28 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:28 140203423430400 [Note] WSREP:  cleaning up 4e987697 (tcp://127.0.0.1:4569)
          

          anikitin Andrii Nikitin (Inactive) added a comment - - edited Yes in my understanding 'restart' should be safer than 'stop&start' , at least in recent versions. (because https://github.com/systemd/systemd/issues/6620 should not trigger after 'restart' ). But on practice I see that mariadb service becomes silent after restart even on systemd (230-7~bpo8+2) on jessie . But good news is that I can confirm that the problem doesn't occur with my steps when using both 'stop' or 'restart' with your suggested workaround (i.e. SST is performed properly, just service becomes silent anyway): sudo apt-get -t jessie-backports install systemd $ apt-cache policy systemd systemd: Installed: 215-17 Candidate: 215-17+deb8u7 Version table: 230-7~bpo8+2 0 100 http://ftp.debian.org/debian/ jessie-backports/main amd64 Packages 215-17+deb8u7 0 500 http://debian.osuosl.org/debian/ jessie/main amd64 Packages *** 215-17 0 100 /var/lib/dpkg/status $ sudo apt-get -t jessie-backports install systemd ... $ git clone http://github.com/AndriiNikitin/mariadb-environs ... $ ./replant.sh m0-system $ sudo m0-system/install.sh 10.1 ... $ sudo vim /etc/mysql/my.cnf $ ls /usr/lib/galera/ libgalera_smm.so $ sudo systemctl stop mariadb $ sudo galera_new_cluster $ mysql -uroot -e 'show variables like "wsrep_on"' +---------------+-------+ | Variable_name | Value | +---------------+-------+ | wsrep_on | ON | +---------------+-------+ $ ./replant.sh m2-system2 $ m2*/gen_cnf.sh $ m2*/install_db.sh $ cat << EOF >> m2*/mysqldextra.cnf > binlog_format=ROW > default-storage-engine=innodb > innodb_autoinc_lock_mode=2 > bind-address=0.0.0.0 > wsrep_on=ON > wsrep_provider=/usr/lib/galera/libgalera_smm.so > wsrep_cluster_name=t1 > wsrep_cluster_address=gcomm://127.0.0.1:4567 > wsrep_sst_method=rsync > wsrep_node_address=127.0.0.1:4569 > wsrep_node_name=127.0.0.1_4569 > EOF $ mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) & [1] 27407 $ mysql --defaults-file=m2-system2/my.cnf -e 'show status like "wsrep_cluster_size"' +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | wsrep_cluster_size | 2 | +--------------------+-------+ $ m2*/shutdown.sh [1]+ Done mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) $ rm -rf m2*/dt/* $ m2*/install_db.sh $ sudo systemctl restart systemd-journald $ sudo systemctl status mariadb ● mariadb.service - MariaDB database server Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled) Drop-In: /etc/systemd/system/mariadb.service.d └─migrated-from-my.cnf-settings.conf Active: active (running) since Fri 2017-08-18 10:26:46 EDT; 6min ago Process: 28088 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS) Process: 28086 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS) Process: 27909 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-env Process: 27905 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS) Process: 27902 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS) Main PID: 28053 (mysqld) Status: "Taking your SQL requests now..." CGroup: /system.slice/mariadb.service └─28053 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=afe4dfc2-8420-11e7-b8c2-aa002665b1b9:0   Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: group UUID = afe4dfc2-8420-11e7-b8c2-aa002665b1b9 Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203415037696 [Note] WSREP: Flow-control interval: [16, 16] Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: New cluster view: global state: afe4dfc2-8420-11e7-b8c Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: REPL Protocols: 7 (3, 2) Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: Assign initial position for certification: 0, protocol Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203473073920 [Note] WSREP: Service thread queue flushed. Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Note] WSREP: (436428c9, 'tcp://0.0.0.0:4567') connection establishe Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Warning] WSREP: discarding established (time wait) 4e987697 (tcp:// Aug 18 10:31:28 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:28 140203423430400 [Note] WSREP: cleaning up 4e987697 (tcp://127.0.0.1:4569) $ mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) & [1] 28557 $ mysql --defaults-file=m2-system2/my.cnf -e 'show status like "wsrep_cluster_size"' +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | wsrep_cluster_size | 2 | +--------------------+-------+ $ sudo systemctl status mariadb ● mariadb.service - MariaDB database server Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled) Drop-In: /etc/systemd/system/mariadb.service.d └─migrated-from-my.cnf-settings.conf Active: active (running) since Fri 2017-08-18 10:26:46 EDT; 8min ago Process: 28088 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS) Process: 28086 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS) Process: 27909 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-env Process: 27905 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS) Process: 27902 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS) Main PID: 28053 (mysqld) Status: "Taking your SQL requests now..." CGroup: /system.slice/mariadb.service └─28053 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=afe4dfc2-8420-11e7-b8c2-aa002665b1b9:0   Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: group UUID = afe4dfc2-8420-11e7-b8c2-aa002665b1b9 Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203415037696 [Note] WSREP: Flow-control interval: [16, 16] Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: New cluster view: global state: afe4dfc2-8420-11e7-b8c Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: REPL Protocols: 7 (3, 2) Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: Assign initial position for certification: 0, protocol Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203473073920 [Note] WSREP: Service thread queue flushed. Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Note] WSREP: (436428c9, 'tcp://0.0.0.0:4567') connection establishe Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Warning] WSREP: discarding established (time wait) 4e987697 (tcp:// Aug 18 10:31:28 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:28 140203423430400 [Note] WSREP: cleaning up 4e987697 (tcp://127.0.0.1:4569) $ m2*/shutdown.sh [1]+ Done mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) $ rm -rf m2*/dt/* $ m2*/install_db.sh $ sudo systemctl stop systemd-journald $ mysqld_safe --defaults-file=m2-system2/my.cnf --user=$(whoami) & [1] 29020 $ mysql --defaults-file=m2-system2/my.cnf -e 'show status like "wsrep_cluster_size"' +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | wsrep_cluster_size | 2 | +--------------------+-------+ $ sudo systemctl status mariadb ● mariadb.service - MariaDB database server Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled) Drop-In: /etc/systemd/system/mariadb.service.d └─migrated-from-my.cnf-settings.conf Active: active (running) since Fri 2017-08-18 10:26:46 EDT; 11min ago Process: 28088 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS) Process: 28086 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS) Process: 27909 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-envi Process: 27905 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS) Process: 27902 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS) Main PID: 28053 (mysqld) Status: "Taking your SQL requests now..." CGroup: /system.slice/mariadb.service └─28053 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=afe4dfc2-8420-11e7-b8c2-aa002665b1b9:0   Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: group UUID = afe4dfc2-8420-11e7-b8c2-aa002665b1b9 Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203415037696 [Note] WSREP: Flow-control interval: [16, 16] Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: New cluster view: global state: afe4dfc2-8420-11e7-b8c2 Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: REPL Protocols: 7 (3, 2) Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203716564736 [Note] WSREP: Assign initial position for certification: 0, protocol Aug 18 10:31:23 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:23 140203473073920 [Note] WSREP: Service thread queue flushed. Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Note] WSREP: (436428c9, 'tcp://0.0.0.0:4567') connection established Aug 18 10:31:27 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:27 140203423430400 [Warning] WSREP: discarding established (time wait) 4e987697 (tcp://1 Aug 18 10:31:28 debian-jessie-amd64 mysqld[28053]: 2017-08-18 10:31:28 140203423430400 [Note] WSREP: cleaning up 4e987697 (tcp://127.0.0.1:4569)

          People

            anikitin Andrii Nikitin (Inactive)
            schlarbm Moritz Schlarb
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.