MariaDB Server / MDEV-27682

Bundled wsrep_notify.sh causes mariadbd to freeze during start

Details

    Description

      Starting the server with galera_new_cluster hangs at this point:

      mysql      29610  0.6  2.5 1159880 99828 ?       Ssl  02:48   0:00 /usr/sbin/mariadbd --wsrep-new-cluster --wsrep_start_position=09e652bc-8169-11ec-848c-dfaaaea09e0a:999
      mysql      29638  0.0  0.0   2420   528 ?        S    02:48   0:00  \_ sh -c /usr/local/bin/wsrep_notify.sh --status initialized
      mysql      29639  0.0  0.0   2420   524 ?        S    02:48   0:00      \_ /bin/sh -eu /usr/local/bin/wsrep_notify.sh --status initialized
      mysql      29641  0.0  0.1  20104  7828 ?        S    02:48   0:00          \_ mysql -B
      

      root@frytka:~# strace -p 29610
      strace: Process 29610 attached
      wait4(29638,
      

      error.log shows a connection error on every attempt to execute wsrep_notify.sh:

      2022-01-30  2:48:48 1 [Note] WSREP: Server status change disconnected -> connected
      ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/run/mysqld/mysqld.sock' (111)
      2022-01-30  2:48:48 1 [ERROR] WSREP: Process completed with error: /usr/local/bin/wsrep_notify.sh --status connected: 1 (Operation not permitted)
      2022-01-30  2:48:48 1 [ERROR] WSREP: Notification command failed: 1 (Operation not permitted): "/usr/local/bin/wsrep_notify.sh --status connected"
       
      2022-01-30  2:48:48 1 [Note] WSREP: Server status change connected -> joiner
      ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/run/mysqld/mysqld.sock' (111)
      2022-01-30  2:48:48 1 [ERROR] WSREP: Process completed with error: /usr/local/bin/wsrep_notify.sh --status joiner: 1 (Operation not permitted)
      2022-01-30  2:48:48 1 [ERROR] WSREP: Notification command failed: 1 (Operation not permitted): "/usr/local/bin/wsrep_notify.sh --status joiner"
       
      2022-01-30  2:48:48 1 [Note] WSREP: Server status change joiner -> initializing
      ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/run/mysqld/mysqld.sock' (111)
      2022-01-30  2:48:48 1 [ERROR] WSREP: Process completed with error: /usr/local/bin/wsrep_notify.sh --status initializing: 1 (Operation not permitted)
      2022-01-30  2:48:48 1 [ERROR] WSREP: Notification command failed: 1 (Operation not permitted): "/usr/local/bin/wsrep_notify.sh --status initializing"
      
      

      However:
      1. While mariadbd is in the frozen state, I can't connect to it either.
      2. After killing mariadbd (SIGKILL) and restarting it without wsrep_notify_cmd, everything works normally.
      3. I modified the script to connect as the service account (mysql) via the socket, but I had the same issue when trying TCP to localhost.
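
      The process tree and strace output above show mariadbd blocked in wait4() on the notify script, which in turn is waiting on the mysql client. This suggests the server and the client are each waiting for the other. As a rough workaround sketch, and not the fix referenced later in this issue, the client call could be wrapped in a hard time limit so the hook always returns. The helper name run_bounded and the 10-second limit are assumptions for illustration; timeout is the GNU coreutils command.

```shell
#!/bin/sh
# Hypothetical helper (not part of the bundled wsrep_notify.sh):
# run a command under a hard time limit so a notification hook
# cannot block its caller indefinitely.
run_bounded() {
    limit="$1"
    shift
    rc=0
    # GNU timeout exits with status 124 when the limit is reached.
    # The "|| rc=$?" form keeps this safe under "sh -eu".
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "notify: command timed out after ${limit}s" >&2
    elif [ "$rc" -ne 0 ]; then
        echo "notify: command failed with status $rc" >&2
    fi
    return "$rc"
}
```

      Inside a notify script, the client invocation might then look like "run_bounded 10 mysql -B" with the SQL piped in on stdin. This only bounds the wait. The status update is still lost if the server is not yet accepting connections.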

      Config I'm running:

      [sst]
      encrypt=3
      sst-log-archive-dir=/var/log/mysql/
      sst-log-archive=0
      tkey = /etc/mysql/ssl/key.pem
      tcert = /etc/mysql/ssl/cert.pem
      tca = /etc/mysql/ssl/ca.pem
      ssl-mode=VERIFY_CA
       
      [galera]
      wsrep_on                 = ON
      wsrep_cluster_name       = "clustername"
      wsrep_provider           = /usr/lib/galera/libgalera_smm.so
      wsrep_cluster_address    = gcomm://host02.domain.com,host01.domain.com,host03.domain.com
      binlog_format            = row
      default_storage_engine   = InnoDB
      innodb_autoinc_lock_mode = 2
      innodb_doublewrite = 1
       
      wsrep_notify_cmd=/usr/local/bin/wsrep_notify.sh
       
      wsrep_provider_options="socket.ssl_cert=/etc/mysql/ssl/cert.pem;socket.ssl_key=/etc/mysql/ssl/key.pem;socket.ssl_ca=/etc/mysql/ssl/ca.pem"
      #wsrep_sst_method = rsync_wan
      wsrep_sst_method = mariabackup
      wsrep_sst_auth = mysql:
       
       
      wsrep_node_address = host01.domain.com
      
      

      Attachments

        Activity

          I'm facing almost the same issue. When I set wsrep_notify_cmd=/home/mh/instances/10508/wsrep_notify.sh in my.cnf and try to bootstrap the very first node, the server starts but then hangs.

          2021-12-18  2:40:39 0 [Note] InnoDB: 128 rollback segments are active.
          2021-12-18  2:40:39 0 [Note] InnoDB: Creating shared tablespace for temporary tables
          2021-12-18  2:40:39 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
          2021-12-18  2:40:39 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
          2021-12-18  2:40:39 0 [Note] InnoDB: 10.5.8 started; log sequence number 556538; transaction id 1404
          2021-12-18  2:40:39 0 [Note] Plugin 'FEEDBACK' is disabled.
          2021-12-18  2:40:39 0 [Note] InnoDB: Loading buffer pool(s) from /home/mh/instances/mariadb-10.5.8-linux-x86_64.10508/data/ib_buffer_pool
          2021-12-18  2:40:39 0 [Note] InnoDB: Buffer pool(s) load completed at 211218  2:40:39
          2021-12-18  2:40:39 0 [Note] Server socket created on IP: '::'.
          2021-12-18  2:40:39 0 [Note] WSREP: wsrep_init_schema_and_SR 0x0
          2021-12-18  2:40:39 0 [Note] WSREP: Server initialized
          2021-12-18  2:40:39 0 [Note] WSREP: Server status change initializing -> initialized
          2021-12-18  2:40:39 2 [Note] WSREP: Bootstrapping a new cluster, setting initial position to 00000000-0000-0000-0000-000000000000:-1
          2021-12-18  2:40:39 5 [Note] WSREP: Recovered cluster id 5e868df3-5fd5-11ec-beab-eef60befafd4
          2021-12-18  2:40:39 2 [Note] WSREP: Server status change initialized -> joined
          

          I can't even log in to the server.

          [root@nilcentos7 10508]# mysql -h127.0.0.1 -uneel -pnil@123 -P 10508
          ^C
          [root@nilcentos7 10508]# 
          

          I created a user as shown below and added its details to the wsrep_notify.sh script:

          MariaDB [(none)]> GRANT ALL PRIVILEGES ON *.* TO 'neel'@'localhost' IDENTIFIED BY 'neel@123';
          Query OK, 0 rows affected (0.032 sec)
          

          My script has the following changes. (I have created the neel@localhost user on the MariaDB server.)

          USER=neel
          PASS=nil@123
          HOST=127.0.0.1
          PORT=10508
          SCHEMA="mtr_wsrep_notify"
          MEMB_TABLE="$SCHEMA.membership"
          STATUS_TABLE="$SCHEMA.status"
          ..
          ..
          .
          case $STATUS in
              "joined" | "donor" | "synced")
                  $COM | mysql -B -u$USER -p$PASS -h$HOST -P$PORT
                  ;;
              *)
                  exit 0
                  ;;
          esac
          

          niljoshi Nilnandan Joshi added a comment


          jplindst Jan Lindström (Inactive) added a comment - ok to push
          sysprg Julius Goryavsky added a comment - Fixed, https://github.com/MariaDB/server/commit/19f0b96d53dec47d7b8680c44997afba2ed7431e

          sysprg, this patch caused conflicts on merge to 10.4, and the test galera.galera_var_notify_ssl_ipv6 started to hang in both local environments where I tested it. Also, some test (name unknown, but hopefully it is that one) started to hang on several buildbot workers. Please fix.

          In the future, please provide branches for newer versions if there are conflicts. 10.3 uses Galera 3 and newer versions use Galera 4. That can make a huge difference.

          marko Marko Mäkelä added a comment

          People

            sysprg Julius Goryavsky
            mkozlowski Michal Kozlowski