  1. MariaDB Server
  2. MDEV-11048

Galera / SST Fails when running under Vagrant

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not a Bug
    • Affects Version/s: 10.0.25-galera
    • Fix Version/s: N/A
    • Component/s: Galera, Galera SST, wsrep
    • Labels: None
    • Environment: Vagrant 1.8.6
      Ubuntu 14.04
      Galera galera-3 25.3.18-trusty
      MariaDB 10.1.18+maria-1~trusty

    Description

      When creating a Galera cluster under Vagrant, SST fails because an incorrect address (the NAT address on eth0) is used for the state transfer between the Joiner and the Donor.

      Vagrant provisions eth0 for NAT, so the eth0 address is not valid for traffic between the Donor and the Joiner.

      core0 is the established database node.

      The wsrep configuration is:

      wsrep_cluster_address = gcomm://core2,core0,core1
      

      host eth1 ip
      core0 192.168.50.3
      core1 192.168.50.4
      core2 192.168.50.5
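
Given the eth1 addresses listed above, one possible workaround is to pin each node's Galera address explicitly instead of letting it be auto-detected. The sketch below is an assumption, not something confirmed in this report: it relies on the MariaDB system variables `wsrep_node_address` and `wsrep_sst_receive_address`, and `render_wsrep_fragment` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print a my.cnf fragment that pins the Galera node
# address and the SST receive address to one IP (the node's eth1 address).
render_wsrep_fragment() {
    node_ip="$1"
    printf '[mysqld]\n'
    printf 'wsrep_node_address = %s\n' "$node_ip"
    printf 'wsrep_sst_receive_address = %s\n' "$node_ip"
}

# Example for core1, using the eth1 address from the table in this report.
render_wsrep_fragment 192.168.50.4
```

The output would have to be placed in a configuration file that mysqld reads on each node, with that node's own eth1 address. Where that file lives depends on the installation.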

      core1 finds core0 at 192.168.50.3:4567

      Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: gcomm: connecting to group 'TestSystem', peer 'core2:,core0:,core1:'
      Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: (a61950db, 'tcp://0.0.0.0:4567') connection established to a61950db tcp://127.0.0.1:4567
      Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: (a61950db, 'tcp://0.0.0.0:4567') connection established to a61950db tcp://127.0.1.1:4567
      Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Warning] WSREP: (a61950db, 'tcp://0.0.0.0:4567') address 'tcp://127.0.1.1:4567' points to own listening address, blacklisting
      Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: (a61950db, 'tcp://0.0.0.0:4567') connection established to a5301480 tcp://192.168.50.3:4567
      Oct 12 15:15:02 core1 mysqld: 2016-10-12 15:15:02 140403237877696 [Note] WSREP: (a61950db, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 
      Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237877696 [Note] WSREP: declaring a5301480 at tcp://192.168.50.3:4567 stable
      Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237877696 [Note] WSREP: Node a5301480 state prim
      

      core1 attempts to perform SST using `10.0.2.15`

      Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237563136 [Note] WSREP: State transfer required: 
      Oct 12 15:15:03 core1 mysqld: #011Group state: a530f9fd-908d-11e6-a72a-b2e3a6b91029:1113
      Oct 12 15:15:03 core1 mysqld: #011Local state: 00000000-0000-0000-0000-000000000000:-1
      Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237563136 [Note] WSREP: New cluster view: global state: a530f9fd-908d-11e6-a72a-b2e3a6b91029:1113, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 3
      Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140403237563136 [Warning] WSREP: Gap in state sequence. Need state transfer.
      Oct 12 15:15:03 core1 mysqld: 2016-10-12 15:15:03 140402002753280 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.0.2.15' --datadir '/var/lib/mysql/'   --parent '9043' --binlog '/var/log/mariadb_bin/mariadb-bin' '
      Oct 12 15:15:03 core1 mysqld: WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20161012 15:15:03.985)
      Oct 12 15:15:03 core1 -wsrep-sst-joiner: Streaming with xbstream
      Oct 12 15:15:03 core1 -wsrep-sst-joiner: Using socat as streamer
      Oct 12 15:15:04 core1 -wsrep-sst-joiner: Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} )
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Note] WSREP: Prepared SST request: xtrabackup-v2|10.0.2.15:4444/xtrabackup_sst//1
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Note] WSREP: REPL Protocols: 7 (3, 2)
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402075592448 [Note] WSREP: Service thread queue flushed.
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Note] WSREP: Assign initial position for certification: 1113, protocol version: 3
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402075592448 [Note] WSREP: Service thread queue flushed.
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (a530f9fd-908d-11e6-a72a-b2e3a6b91029): 1 (Operation not permitted)
      Oct 12 15:15:04 core1 mysqld: #011 at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402019526400 [Note] WSREP: Member 1.0 (core1) requested state transfer from '*any*'. Selected 0.0 (core0)(SYNCED) as donor.
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402019526400 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 1113)
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140403237563136 [Note] WSREP: Requesting state transfer: success, donor: 0
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402019526400 [Warning] WSREP: 0.0 (core0): State transfer to 1.0 (core1) failed: -32 (Broken pipe)
      Oct 12 15:15:04 core1 mysqld: 2016-10-12 15:15:04 140402019526400 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.
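
To check which address a joiner advertises for SST, the host part of the `Prepared SST request` line in the log above can be extracted. This is a minimal sketch: `sst_request_address` is a hypothetical helper name, and it assumes the line format is exactly as shown in this report.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print the host part of
# every "Prepared SST request: <method>|<host>:<port>/..." entry.
sst_request_address() {
    sed -n 's/.*Prepared SST request: [^|]*|\([^:]*\):.*/\1/p'
}

# Possible usage (the log path is an assumption; adjust to where mysqld logs):
#   grep 'Prepared SST request' /var/log/syslog | sst_request_address
```

If the printed address is the NAT address rather than the eth1 address, the donor cannot reach the joiner for the transfer.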
      

      wsrep information from core0

      SHOW GLOBAL STATUS LIKE 'wsrep_%'
       
       
      +------------------------------+--------------------------------------+
      | Variable_name                | Value                                |
      +------------------------------+--------------------------------------+
      ...
      | wsrep_cluster_state_uuid     | a530f9fd-908d-11e6-a72a-b2e3a6b91029 |
      | wsrep_cluster_status         | Primary                              |
      | wsrep_gcomm_uuid             | a5301480-908d-11e6-a84e-0b2444c3985f |
      | wsrep_incoming_addresses     | 10.0.2.15:3306                       |
      | wsrep_local_state            | 4                                    |
      | wsrep_local_state_comment    | Synced                               |
      | wsrep_local_state_uuid       | a530f9fd-908d-11e6-a72a-b2e3a6b91029 |
      ...
      +------------------------------+--------------------------------------+
      

      core0 ifconfig

      vagrant@core0:~$ ifconfig 
      eth0      Link encap:Ethernet  HWaddr 08:00:27:de:04:89  
                inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
                inet6 addr: fe80::a00:27ff:fede:489/64 Scope:Link
                UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
                RX packets:218886 errors:0 dropped:0 overruns:0 frame:0
                TX packets:81596 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:1000 
                RX bytes:205966097 (205.9 MB)  TX bytes:6015101 (6.0 MB)
       
      eth1      Link encap:Ethernet  HWaddr 08:00:27:bc:f7:ee  
                inet addr:192.168.50.3  Bcast:192.168.50.255  Mask:255.255.255.0
                inet6 addr: fe80::a00:27ff:febc:f7ee/64 Scope:Link
                UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
                RX packets:261637 errors:0 dropped:0 overruns:0 frame:0
                TX packets:244284 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:1000 
                RX bytes:59467905 (59.4 MB)  TX bytes:114065906 (114.0 MB)
       
      lo        Link encap:Local Loopback  
                inet addr:127.0.0.1  Mask:255.0.0.0
                inet6 addr: ::1/128 Scope:Host
                UP LOOPBACK RUNNING  MTU:65536  Metric:1
                RX packets:246320 errors:0 dropped:0 overruns:0 frame:0
                TX packets:246320 errors:0 dropped:0 overruns:0 carrier:0
                collisions:0 txqueuelen:0 
                RX bytes:64552545 (64.5 MB)  TX bytes:64552545 (64.5 MB)
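
The address of a specific interface can be read out of this output with a small filter. The sketch below assumes the net-tools `ifconfig` format shown above (`inet addr:<ip>`); newer `ifconfig` versions print a different format. `iface_ipv4` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the IPv4 address of interface $1, reading
# net-tools style `ifconfig` output on stdin.
iface_ipv4() {
    awk -v ifname="$1" '
        /^[^ \t]/ { current = $1 }      # a non-indented line starts an interface block
        current == ifname && /inet addr:/ {
            sub(/.*inet addr:/, "")     # drop everything up to the address
            print $1                    # the address is now the first field
            exit
        }'
}

# Possible usage on a node:
#   ifconfig | iface_ipv4 eth1
```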
      

      This behaviour seems inconsistent: if the node is able to retrieve the current cluster state, shouldn't the synchronization then use the same address to perform the transfer?

      I think this issue is related:
      https://jira.mariadb.org/browse/MDEV-5487

            People

              nirbhay_c Nirbhay Choubey (Inactive)
              georgealton George Alton