Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16488

Cannot bootstrap the cluster for [ERROR] WSREP: failed to open gcomm backend connection: 131: invalid UUID: 00000000

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Not a Bug
    • 10.3.7
    • N/A
    • Galera
    • 3 Galera Nodes Master-Master,CentOS 7.4

    Description

      Cannot bootstrap the cluster for [ERROR] WSREP: failed to open gcomm backend connection: 131: invalid UUID: 00000000

      Problem occurred on Galera cluster composed from 3 Nodes.
      Node2 was desynced manually from the group with wsrep_desync.

      Cluster remained standby in that state until the occurrence of power down - power up events.It seems that Node2 succeed to sync with the group as it shows the same GTID
      as Nodes' 1 and 3 in the recovery logs ,
      wsrep_desync was not set permanently and it's cleaned after rebooting.
      On all Nodes however mysqld is not started and it's received the Error:

       [ERROR] WSREP: failed to open gcomm backend connection: 131: invalid UUID: 00000000 (FATAL)
               at gcomm/src/pc.cpp:PC():267
      

      Note: that Errors was received after trying to bootstraping 1 Node with
      galera_new_cluster command , also with
      wsrep-recover option service mysql start --wsrep-recover ;

      The same Error was received also from rejoining Nodes with commands
      mysql start , or
      mysql start --wsrep_cluster_address="gcomm://192.168.104.193,192.168.104.195"

      Initial GTID positions:

      Node1,Node3:  3c15149f-5766-11e8-9a99-22bc53d40581:26987
      Node2:             3c15149f-5766-11e8-9a99-22bc53d40581:26985
      

      The GTID positions seen in recovery logs:

      Node1,Node2,Node3:  3c15149f-5766-11e8-9a99-22bc53d40581:26987
      

      ]# systemctl status mysql.service
      ● mariadb.service - MariaDB 10.3.7 database server
         Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/mariadb.service.d
                 └─migrated-from-my.cnf-settings.conf
         Active: failed (Result: exit-code) since Thu 2018-06-14 08:30:35 EEST; 3h 11min ago
           Docs: man:mysqld(8)
                 https://mariadb.com/kb/en/library/systemd/
        Process: 987 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
        Process: 880 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
        Process: 876 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
       Main PID: 987 (code=exited, status=1/FAILURE)
         Status: "MariaDB server is down"
      

      bootstraping cluster

      [root@t4w3 ~]# galera_new_cluster
      Job for mariadb.service failed because the control process exited with error code. See "systemctl status mariadb.service" and "journalctl                     -xe" for details.
      

      2018-06-14 11:58:49 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
      2018-06-14 11:58:49 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
      2018-06-14 11:58:49 0 [Note] WSREP: wsrep_load(): Galera 25.3.22(r3764) by Codership Oy <info@codership.com> loaded successfully.
      2018-06-14 11:58:49 0 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
      2018-06-14 11:58:49 0 [Note] WSREP: Found saved state: 3c15149f-5766-11e8-9a99-22bc53d40581:-1, safe_to_bootstrap: 1
      2018-06-14 11:58:49 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.104.195; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; p
      2018-06-14 11:58:49 0 [Note] WSREP: GCache history reset: 3c15149f-5766-11e8-9a99-22bc53d40581:0 -> 3c15149f-5766-11e8-9a99-22bc53d40581:26987
      2018-06-14 11:58:49 0 [Note] WSREP: Assign initial position for certification: 26987, protocol version: -1
      2018-06-14 11:58:49 0 [Note] WSREP: wsrep_sst_grab()
      2018-06-14 11:58:49 0 [Note] WSREP: Start replication
      2018-06-14 11:58:49 0 [Note] WSREP: 'wsrep-new-cluster' option used, bootstrapping the cluster
      2018-06-14 11:58:49 0 [Note] WSREP: Setting initial position to 3c15149f-5766-11e8-9a99-22bc53d40581:26987
      2018-06-14 11:58:49 0 [Note] WSREP: protonet asio version 0
      2018-06-14 11:58:49 0 [Note] WSREP: Using CRC-32C for message checksums.
      2018-06-14 11:58:49 0 [Note] WSREP: backend: asio
      2018-06-14 11:58:49 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
      2018-06-14 11:58:49 0 [Note] WSREP: restore pc from disk successfully
      2018-06-14 11:58:49 0 [Note] WSREP: GMCast version 0
      2018-06-14 11:58:49 0 [Note] WSREP: (00000000, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
      2018-06-14 11:58:49 0 [Note] WSREP: (00000000, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
      2018-06-14 11:58:49 0 [ERROR] WSREP: failed to open gcomm backend connection: 131: invalid UUID: 00000000 (FATAL)
               at gcomm/src/pc.cpp:PC():267
      2018-06-14 11:58:49 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -131 (State not recoverable)
      2018-06-14 11:58:49 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1458: Failed to open channel 'cluster1' at 'gcomm://192.168.104.191,192.168.104.195,192.168.104.196': -131 (State not recoverable)
      2018-06-14 11:58:49 0 [ERROR] WSREP: gcs connect failed: State not recoverable
      2018-06-14 11:58:49 0 [ERROR] WSREP: wsrep::connect(gcomm://192.168.104.191,192.168.104.195,192.168.104.196) failed: 7
      2018-06-14 11:58:49 0 [ERROR] Aborting
      
      

      node 6

      [root@t4w6 ~]#  service mysql start  --wsrep-recover
      

      2018-06-14 12:40:23 0 [Note] /usr/sbin/mysqld (mysqld 10.3.7-MariaDB) starting as process 9273 ...
      2018-06-14 12:40:23 0 [ERROR] mysqld: Can't open shared library 'handlersocket.so' (errno: 1, Loading of beta plugin handlersocket is prohibited by --plugin-maturity=gamma)
      2018-06-14 12:40:23 0 [Note] InnoDB: For Galera, using innodb_lock_schedule_algorithm=fcfs
      2018-06-14 12:40:23 0 [Note] InnoDB: Using Linux native AIO
      2018-06-14 12:40:23 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2018-06-14 12:40:23 0 [Note] InnoDB: Uses event mutexes
      2018-06-14 12:40:23 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
      2018-06-14 12:40:23 0 [Note] InnoDB: Number of pools: 1
      2018-06-14 12:40:23 0 [Note] InnoDB: Using generic crc32 instructions
      2018-06-14 12:40:23 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
      2018-06-14 12:40:24 0 [Note] InnoDB: Completed initialization of buffer pool
      2018-06-14 12:40:24 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      2018-06-14 12:40:24 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
      2018-06-14 12:40:24 0 [Note] InnoDB: Creating shared tablespace for temporary tables
      2018-06-14 12:40:24 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
      2018-06-14 12:40:24 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
      2018-06-14 12:40:24 0 [Note] InnoDB: Waiting for purge to start
      2018-06-14 12:40:24 0 [Note] InnoDB: 10.3.7 started; log sequence number 7796129428; transaction id 80278
      2018-06-14 12:40:24 0 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
      2018-06-14 12:40:24 0 [Note] Plugin 'FEEDBACK' is disabled.
      2018-06-14 12:40:24 0 [Note] Server socket created on IP: '0.0.0.0'.
      2018-06-14 12:40:24 0 [Note] WSREP: Recovered position: 3c15149f-5766-11e8-9a99-22bc53d40581:26987
      
      

      root      9126  0.0  0.0 113260  1728 pts/0    S+   12:40   0:00 /bin/sh /usr/sbin/service mysql start --wsrep-recover
      root      9143  0.0  0.0 115520  1836 pts/0    S+   12:40   0:00 /bin/sh /etc/init.d/mysql start --wsrep-recover
      root      9153  0.0  0.0 128428  1448 pts/0    S+   12:40   0:00 /bin/systemctl start mysql.service
      mysql    10013 29.0  5.8 1874804 116608 ?      Ssl  12:40   0:00 /usr/sbin/mysqld --wsrep_start_position=3c15149f-5766-11e8-9a99-22bc53d40581:26987
      
      

      2018-06-14 12:40:26 0 [ERROR] mysqld: Can't open shared library 'handlersocket.so' (errno: 1, Loading of beta plugin handlersocket is prohibited by --plugin-maturity=gamma)
      2018-06-14 12:40:26 0 [Note] InnoDB: For Galera, using innodb_lock_schedule_algorithm=fcfs
      2018-06-14 12:40:26 0 [Note] InnoDB: Using Linux native AIO
      2018-06-14 12:40:26 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2018-06-14 12:40:26 0 [Note] InnoDB: Uses event mutexes
      2018-06-14 12:40:26 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
      2018-06-14 12:40:26 0 [Note] InnoDB: Number of pools: 1
      2018-06-14 12:40:26 0 [Note] InnoDB: Using generic crc32 instructions
      2018-06-14 12:40:26 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
      2018-06-14 12:40:26 0 [Note] InnoDB: Completed initialization of buffer pool
      2018-06-14 12:40:26 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      2018-06-14 12:40:26 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
      2018-06-14 12:40:26 0 [Note] InnoDB: Creating shared tablespace for temporary tables
      2018-06-14 12:40:26 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
      2018-06-14 12:40:26 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
      2018-06-14 12:40:26 0 [Note] InnoDB: Waiting for purge to start
      2018-06-14 12:40:26 0 [Note] InnoDB: 10.3.7 started; log sequence number 7796129437; transaction id 80278
      2018-06-14 12:40:26 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
      2018-06-14 12:40:26 0 [Note] Plugin 'FEEDBACK' is disabled.
      2018-06-14 12:40:26 0 [Note] InnoDB: Buffer pool(s) load completed at 180614 12:40:26
      2018-06-14 12:40:26 0 [Note] Server socket created on IP: '0.0.0.0'.
      2018-06-14 12:40:26 0 [Note] WSREP: Initial position: 3c15149f-5766-11e8-9a99-22bc53d40581:26987
      2018-06-14 12:40:26 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
      2018-06-14 12:40:26 0 [Note] WSREP: wsrep_load(): Galera 25.3.23(r3789) by Codership Oy <info@codership.com> loaded successfully.
      2018-06-14 12:40:26 0 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
      2018-06-14 12:40:26 0 [Note] WSREP: Found saved state: 3c15149f-5766-11e8-9a99-22bc53d40581:-1, safe_to_bootstrap: 1
      2018-06-14 12:40:26 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.104.196; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; p
      2018-06-14 12:40:26 0 [Note] WSREP: GCache history reset: 3c15149f-5766-11e8-9a99-22bc53d40581:0 -> 3c15149f-5766-11e8-9a99-22bc53d40581:26987
      2018-06-14 12:40:26 0 [Note] WSREP: Assign initial position for certification: 26987, protocol version: -1
      2018-06-14 12:40:26 0 [Note] WSREP: Start replication
      2018-06-14 12:40:26 0 [Note] WSREP: Setting initial position to 3c15149f-5766-11e8-9a99-22bc53d40581:26987
      2018-06-14 12:40:26 0 [Note] WSREP: protonet asio version 0
      2018-06-14 12:40:26 0 [Note] WSREP: Using CRC-32C for message checksums.
      2018-06-14 12:40:26 0 [Note] WSREP: backend: asio
      2018-06-14 12:40:26 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
      2018-06-14 12:40:26 0 [Note] WSREP: restore pc from disk successfully
      2018-06-14 12:40:26 0 [Note] WSREP: GMCast version 0
      2018-06-14 12:40:26 0 [Note] WSREP: (00000000, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
      2018-06-14 12:40:26 0 [Note] WSREP: (00000000, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
      2018-06-14 12:40:26 0 [ERROR] WSREP: failed to open gcomm backend connection: 131: invalid UUID: 00000000 (FATAL)
               at gcomm/src/pc.cpp:PC():267
      2018-06-14 12:40:26 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -131 (State not recoverable)
      2018-06-14 12:40:26 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1458: Failed to open channel 'cluster1' at 'gcomm://192.168.104.191,192.168.104.195,192.168.104.196': -131 (State not recoverable)
      2018-06-14 12:40:26 0 [ERROR] WSREP: gcs connect failed: State not recoverable
      2018-06-14 12:40:26 0 [ERROR] WSREP: wsrep::connect(gcomm://192.168.104.191,192.168.104.195,192.168.104.196) failed: 7
      2018-06-14 12:40:26 0 [ERROR] Aborting
      
      

      Attachments

        Activity

          People

            jplindst Jan Lindström (Inactive)
            winstone Zdravelina Sokolovska (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.