Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15453

IST failed during upgrade of version and SST failed too from 10.2.12 to 10.2.13

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Cannot Reproduce
    • 10.2.13
    • N/A
    • Galera, Galera SST
    • one master , one galera cluster on slave with 10.2.12 under Debian 9.3. We try to upgrade the galera cluster 10.2.12 to 10.2.13

    Description

      Hello I tried to upgrade one galera cluster behind on master.

      they way : upgrade node by node and reconnect to the cluster.
      (SST method : Xtrabackup)

      1) upgraded the node 3 who is slave (no problem, no event on cluster)
      2) we let replication, we disconect the node 2 we upgrade MariaDB from 10.2.12 to 10.2.13
      (maybe one or 2 minutes)
      impossible to rejoin the cluster IST failed, so SST and SST failed.
      3) stop the slave on slave 1, remove node 3 from cluster, upgrade to 10.2.13 reconnect to cluster, everthing fine.

      2018-03-02 10:47:43 140596706543808 [Note] WSREP: GCache history reset: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:0 -> 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467319538
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: Assign initial position for certification: 467319538, protocol version: -1
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: wsrep_sst_grab()
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: Start replication
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: Setting initial position to 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467319538
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: protonet asio version 0
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: Using CRC-32C for message checksums.
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: backend: asio
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: gcomm thread scheduling priority set to other:0
      2018-03-02 10:47:43 140596706543808 [Warning] WSREP: access file(/RDC/mysql/data//gvwstate.dat) failed(No such file or directory)
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: restore pc from disk failed
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: GMCast version 0
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: (c144d827, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: (c144d827, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: EVS version 0
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: gcomm: connecting to group 'rdc-cluster-1', peer '10.241.5.221:,10.241.5.222:,10.241.5.220:'
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: (c144d827, 'tcp://0.0.0.0:4567') connection established to c144d827 tcp://10.241.5.221:4567
      2018-03-02 10:47:43 140596706543808 [Warning] WSREP: (c144d827, 'tcp://0.0.0.0:4567') address 'tcp://10.241.5.221:4567' points to own listening address, blacklisting
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: (c144d827, 'tcp://0.0.0.0:4567') connection established to 40c1e704 tcp://10.241.5.220:4567
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: (c144d827, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
      2018-03-02 10:47:43 140596706543808 [Note] WSREP: (c144d827, 'tcp://0.0.0.0:4567') connection established to 0a0faad9 tcp://10.241.5.222:4567
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: declaring 0a0faad9 at tcp://10.241.5.222:4567 stable
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: declaring 40c1e704 at tcp://10.241.5.220:4567 stable
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: Node 0a0faad9 state prim
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: view(view_id(PRIM,0a0faad9,23) memb {
              0a0faad9,0
              40c1e704,0
              c144d827,0
      } joined {
      } left {
      } partitioned {
      })
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: save pc into disk
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: gcomm: connected
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: Opened channel 'rdc-cluster-1'
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
      2018-03-02 10:47:44 140596706543808 [Note] WSREP: Waiting for SST to complete.
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: STATE EXCHANGE: sent state msg: c191bd3a-1dfe-11e8-b54c-4f0a6cf28162
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: STATE EXCHANGE: got state msg: c191bd3a-1dfe-11e8-b54c-4f0a6cf28162 from 0 (hb01-galera-cl01-03)
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: STATE EXCHANGE: got state msg: c191bd3a-1dfe-11e8-b54c-4f0a6cf28162 from 1 (hb01-galera-cl01-01)
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: STATE EXCHANGE: got state msg: c191bd3a-1dfe-11e8-b54c-4f0a6cf28162 from 2 (hb01-galera-cl01-02)
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: Quorum results:
              version    = 4,
              component  = PRIMARY,
              conf_id    = 22,
              members    = 2/3 (joined/total),
              act_id     = 467382351,
              last_appl. = -1,
              protocols  = 0/7/3 (gcs/repl/appl),
              group UUID = 6b51589d-0b5f-11e8-94e6-82aa9d3be300
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: Flow-control interval: [28, 28]
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: Trying to continue unpaused monitor
      2018-03-02 10:47:44 140596507354880 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 467382351)
      2018-03-02 10:47:44 140596498659072 [Note] WSREP: State transfer required:
              Group state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467382351
              Local state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467319538
      2018-03-02 10:47:44 140596498659072 [Note] WSREP: New cluster view: global state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467382351, view# 23: Primary, number of nodes: 3, my index: 2, protocol version 3
      2018-03-02 10:47:44 140596498659072 [Warning] WSREP: Gap in state sequence. Need state transfer.
      2018-03-02 10:47:44 140574950680320 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '21401' --binlog '/RDC/mysql/binlog/mariadb-bin' '
      WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20180302 10:47:44.863)
      2018-03-02 10:47:46 140596515747584 [Note] WSREP: (c144d827, 'tcp://0.0.0.0:4567') connection to peer c144d827 with addr tcp://10.241.5.221:4567 timed out, no messages seen in PT3S
      2018-03-02 10:47:47 140596515747584 [Note] WSREP: (c144d827, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2018-03-02 10:49:13 140574950680320 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '21401' --binlog '/RDC/mysql/binlog/mariadb-bin'
              Read: '(null)'
      2018-03-02 10:49:13 140574950680320 [ERROR] WSREP: Process was aborted.
      2018-03-02 10:49:13 140574950680320 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '21401' --binlog '/RDC/mysql/binlog/mariadb-bin' : 2 (No such file or directory)
      2018-03-02 10:49:13 140596498659072 [ERROR] WSREP: Failed to prepare for 'xtrabackup-v2' SST. Unrecoverable.
      2018-03-02 10:49:13 140596498659072 [ERROR] Aborting
       
      Error in my_thread_global_end(): 1 threads didn't exit
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: Read nil XID from storage engines, skipping position init
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: wsrep_load(): Galera 25.3.23(r3789) by Codership Oy <info@codership.com> loaded successfully.
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: CRC-32C: using hardware acceleration.
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: Found saved state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:-1, safe_to_bootstrap: 0
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: Passing config to GCS: base_dir = /RDC/mysql/data/; base_host = 10.241.5.221; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /RDC/mysql/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /RDC/mysql/data//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 20G; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: GCache history reset: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:0 -> 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467319538
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: Assign initial position for certification: 467319538, protocol version: -1
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: wsrep_sst_grab()
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: Start replication
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: Setting initial position to 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467319538
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: protonet asio version 0
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: Using CRC-32C for message checksums.
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: backend: asio
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: gcomm thread scheduling priority set to other:0
      2018-03-02 11:06:51 139691983460544 [Warning] WSREP: access file(/RDC/mysql/data//gvwstate.dat) failed(No such file or directory)
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: restore pc from disk failed
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: GMCast version 0
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: (6d87b0f9, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: (6d87b0f9, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: EVS version 0
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: gcomm: connecting to group 'rdc-cluster-1', peer '10.241.5.221:,10.241.5.222:,10.241.5.220:'
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: (6d87b0f9, 'tcp://0.0.0.0:4567') connection established to 6d87b0f9 tcp://10.241.5.221:4567
      2018-03-02 11:06:51 139691983460544 [Warning] WSREP: (6d87b0f9, 'tcp://0.0.0.0:4567') address 'tcp://10.241.5.221:4567' points to own listening address, blacklisting
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: (6d87b0f9, 'tcp://0.0.0.0:4567') connection established to 40c1e704 tcp://10.241.5.220:4567
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: (6d87b0f9, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: (6d87b0f9, 'tcp://0.0.0.0:4567') connection established to 0a0faad9 tcp://10.241.5.222:4567
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: declaring 0a0faad9 at tcp://10.241.5.222:4567 stable
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: declaring 40c1e704 at tcp://10.241.5.220:4567 stable
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: Node 0a0faad9 state prim
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: view(view_id(PRIM,0a0faad9,25) memb {
              0a0faad9,0
              40c1e704,0
              6d87b0f9,0
      } joined {
      } left {
      } partitioned {
      })
      2018-03-02 11:06:51 139691983460544 [Note] WSREP: save pc into disk
      2018-03-02 11:06:52 139691983460544 [Note] WSREP: gcomm: connected
      2018-03-02 11:06:52 139691983460544 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
      2018-03-02 11:06:52 139691983460544 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
      2018-03-02 11:06:52 139691983460544 [Note] WSREP: Opened channel 'rdc-cluster-1'
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
      2018-03-02 11:06:52 139691983460544 [Note] WSREP: Waiting for SST to complete.
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: STATE EXCHANGE: sent state msg: 6d8f82e8-1e01-11e8-b761-8af7ed4b3c00
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: STATE EXCHANGE: got state msg: 6d8f82e8-1e01-11e8-b761-8af7ed4b3c00 from 0 (hb01-galera-cl01-03)
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: STATE EXCHANGE: got state msg: 6d8f82e8-1e01-11e8-b761-8af7ed4b3c00 from 1 (hb01-galera-cl01-01)
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: STATE EXCHANGE: got state msg: 6d8f82e8-1e01-11e8-b761-8af7ed4b3c00 from 2 (hb01-galera-cl01-02)
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: Quorum results:
              version    = 4,
              component  = PRIMARY,
              conf_id    = 24,
              members    = 2/3 (joined/total),
              act_id     = 467641595,
              last_appl. = -1,
              protocols  = 0/7/3 (gcs/repl/appl),
              group UUID = 6b51589d-0b5f-11e8-94e6-82aa9d3be300
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: Flow-control interval: [28, 28]
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: Trying to continue unpaused monitor
      2018-03-02 11:06:52 139670239299328 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 467641595)
      2018-03-02 11:06:52 139691800753920 [Note] WSREP: State transfer required:
              Group state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467641595
              Local state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467319538
      2018-03-02 11:06:52 139691800753920 [Note] WSREP: New cluster view: global state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467641595, view# 25: Primary, number of nodes: 3, my index: 2, protocol version 3
      2018-03-02 11:06:52 139691800753920 [Warning] WSREP: Gap in state sequence. Need state transfer.
      2018-03-02 11:06:52 139670230906624 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '24625' --binlog '/RDC/mysql/binlog/mariadb-bin' '
      WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20180302 11:06:52.380)
      2018-03-02 11:06:54 139670247692032 [Note] WSREP: (6d87b0f9, 'tcp://0.0.0.0:4567') connection to peer 6d87b0f9 with addr tcp://10.241.5.221:4567 timed out, no messages seen in PT3S
      2018-03-02 11:06:55 139670247692032 [Note] WSREP: (6d87b0f9, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2018-03-02 11:08:21 139670230906624 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '24625' --binlog '/RDC/mysql/binlog/mariadb-bin'
              Read: '(null)'
      2018-03-02 11:08:21 139670230906624 [ERROR] WSREP: Process was aborted.
      2018-03-02 11:08:21 139670230906624 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '24625' --binlog '/RDC/mysql/binlog/mariadb-bin' : 2 (No such file or directory)
      2018-03-02 11:08:21 139691800753920 [ERROR] WSREP: Failed to prepare for 'xtrabackup-v2' SST. Unrecoverable.
      2018-03-02 11:08:21 139691800753920 [ERROR] Aborting
       
      Error in my_thread_global_end(): 1 threads didn't exit
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Read nil XID from storage engines, skipping position init
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: wsrep_load(): Galera 25.3.23(r3789) by Codership Oy <info@codership.com> loaded successfully.
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: CRC-32C: using hardware acceleration.
      2018-03-02 11:15:16 140552530852032 [Warning] WSREP: Could not open state file for reading: '/RDC/mysql/data//grastate.dat'
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Passing config to GCS: base_dir = /RDC/mysql/data/; base_host = 10.241.5.221; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /RDC/mysql/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /RDC/mysql/data//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 20G; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 00000000-0000-0000-0000-000000000000:-1
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: wsrep_sst_grab()
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Start replication
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: protonet asio version 0
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Using CRC-32C for message checksums.
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: backend: asio
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: gcomm thread scheduling priority set to other:0
      2018-03-02 11:15:16 140552530852032 [Warning] WSREP: access file(/RDC/mysql/data//gvwstate.dat) failed(No such file or directory)
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: restore pc from disk failed
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: GMCast version 0
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: (9a6e409c, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: (9a6e409c, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: EVS version 0
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: gcomm: connecting to group 'rdc-cluster-1', peer '10.241.5.221:,10.241.5.222:,10.241.5.220:'
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: (9a6e409c, 'tcp://0.0.0.0:4567') connection established to 9a6e409c tcp://10.241.5.221:4567
      2018-03-02 11:15:16 140552530852032 [Warning] WSREP: (9a6e409c, 'tcp://0.0.0.0:4567') address 'tcp://10.241.5.221:4567' points to own listening address, blacklisting
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: (9a6e409c, 'tcp://0.0.0.0:4567') connection established to 40c1e704 tcp://10.241.5.220:4567
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: (9a6e409c, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: (9a6e409c, 'tcp://0.0.0.0:4567') connection established to 0a0faad9 tcp://10.241.5.222:4567
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: declaring 0a0faad9 at tcp://10.241.5.222:4567 stable
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: declaring 40c1e704 at tcp://10.241.5.220:4567 stable
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Node 0a0faad9 state prim
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: view(view_id(PRIM,0a0faad9,27) memb {
              0a0faad9,0
              40c1e704,0
              9a6e409c,0
      } joined {
      } left {
      } partitioned {
      })
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: save pc into disk
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: gcomm: connected
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Opened channel 'rdc-cluster-1'
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
      2018-03-02 11:15:16 140552530852032 [Note] WSREP: Waiting for SST to complete.
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: STATE EXCHANGE: sent state msg: 9a6ee7ea-1e02-11e8-9075-071fb0408cc9
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: STATE EXCHANGE: got state msg: 9a6ee7ea-1e02-11e8-9075-071fb0408cc9 from 0 (hb01-galera-cl01-03)
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: STATE EXCHANGE: got state msg: 9a6ee7ea-1e02-11e8-9075-071fb0408cc9 from 1 (hb01-galera-cl01-01)
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: STATE EXCHANGE: got state msg: 9a6ee7ea-1e02-11e8-9075-071fb0408cc9 from 2 (hb01-galera-cl01-02)
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: Quorum results:
              version    = 4,
              component  = PRIMARY,
              conf_id    = 26,
              members    = 2/3 (joined/total),
              act_id     = 467915709,
              last_appl. = -1,
              protocols  = 0/7/3 (gcs/repl/appl),
              group UUID = 6b51589d-0b5f-11e8-94e6-82aa9d3be300
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: Flow-control interval: [28, 28]
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: Trying to continue unpaused monitor
      2018-03-02 11:15:16 140530784655104 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 467915709)
      2018-03-02 11:15:16 140552339752704 [Note] WSREP: State transfer required:
              Group state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467915709
              Local state: 00000000-0000-0000-0000-000000000000:-1
      2018-03-02 11:15:16 140552339752704 [Note] WSREP: New cluster view: global state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:467915709, view# 27: Primary, number of nodes: 3, my index: 2, protocol version 3
      2018-03-02 11:15:16 140552339752704 [Warning] WSREP: Gap in state sequence. Need state transfer.
      2018-03-02 11:15:16 140530776262400 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '26402' --binlog '/RDC/mysql/binlog/mariadb-bin' '
      WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20180302 11:15:17.186)
      2018-03-02 11:15:19 140530793047808 [Note] WSREP: (9a6e409c, 'tcp://0.0.0.0:4567') connection to peer 9a6e409c with addr tcp://10.241.5.221:4567 timed out, no messages seen in PT3S
      2018-03-02 11:15:19 140530793047808 [Note] WSREP: (9a6e409c, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2018-03-02 11:16:46 140530776262400 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '26402' --binlog '/RDC/mysql/binlog/mariadb-bin'
              Read: '(null)'
      2018-03-02 11:16:46 140530776262400 [ERROR] WSREP: Process was aborted.
      2018-03-02 11:16:46 140530776262400 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '26402' --binlog '/RDC/mysql/binlog/mariadb-bin' : 2 (No such file or directory)
      2018-03-02 11:16:46 140552339752704 [ERROR] WSREP: Failed to prepare for 'xtrabackup-v2' SST. Unrecoverable.
      2018-03-02 11:16:46 140552339752704 [ERROR] Aborting
       
      Error in my_thread_global_end(): 1 threads didn't exit
      2018-03-02 11:45:51 140065529612480 [Note] WSREP: Read nil XID from storage engines, skipping position init
      2018-03-02 11:45:51 140065529612480 [Note] WSREP: wsrep_load(): loading provider library 'none'
      2018-03-02 11:45:51 140065529612480 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2018-03-02 11:45:51 140065529612480 [Note] InnoDB: Uses event mutexes
      2018-03-02 11:45:51 140065529612480 [Note] InnoDB: Compressed tables use zlib 1.2.8
      2018-03-02 11:45:51 140065529612480 [Note] InnoDB: Using Linux native AIO
      2018-03-02 11:45:51 140065529612480 [Note] InnoDB: Number of pools: 1
      2018-03-02 11:45:51 140065529612480 [Note] InnoDB: Using SSE2 crc32 instructions
      2018-03-02 11:45:51 140065529612480 [Note] InnoDB: Initializing buffer pool, total size = 378G, instances = 8, chunk size = 128M
      2018-03-02 11:46:02 140065529612480 [Note] InnoDB: Completed initialization of buffer pool
      2018-03-02 11:46:02 139632422274816 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      2018-03-02 11:46:02 140065529612480 [Note] InnoDB: Highest supported file format is Barracuda.
      2018-03-02 11:46:04 140065529612480 [Note] InnoDB: 128 out of 128 rollback segments are active.
      2018-03-02 11:46:04 140065529612480 [Note] InnoDB: Creating shared tablespace for temporary tables
      2018-03-02 11:46:04 140065529612480 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
      2018-03-02 11:46:04 140065529612480 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
      2018-03-02 11:46:04 140065529612480 [Note] InnoDB: Waiting for purge to start
      2018-03-02 11:46:04 140065529612480 [Note] InnoDB: 5.7.21 started; log sequence number 1619267
      2018-03-02 11:46:04 139641726428928 [Note] InnoDB: Loading buffer pool(s) from /RDC/mysql/data/ib_buffer_pool
      2018-03-02 11:46:04 139641726428928 [Note] InnoDB: Buffer pool(s) load completed at 180302 11:46:04
      2018-03-02 11:46:04 140065529612480 [Note] Plugin 'FEEDBACK' is disabled.
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: Read nil XID from storage engines, skipping position init
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: wsrep_load(): Galera 25.3.23(r3789) by Codership Oy <info@codership.com> loaded successfully.
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: CRC-32C: using hardware acceleration.
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: Passing config to GCS: base_dir = /RDC/mysql/data/; base_host = 10.241.5.221; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /RDC/mysql/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /RDC/mysql/data//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 20G; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 00000000-0000-0000-0000-000000000000:-1
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: wsrep_sst_grab()
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: Start replication
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: protonet asio version 0
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: Using CRC-32C for message checksums.
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: backend: asio
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: gcomm thread scheduling priority set to other:0
      2018-03-02 11:46:48 140111382202560 [Warning] WSREP: access file(/RDC/mysql/data//gvwstate.dat) failed(No such file or directory)
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: restore pc from disk failed
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: GMCast version 0
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: (025fa3ee, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: (025fa3ee, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: EVS version 0
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: gcomm: connecting to group 'rdc-cluster-1', peer '10.241.5.221:,10.241.5.222:,10.241.5.220:'
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: (025fa3ee, 'tcp://0.0.0.0:4567') connection established to 025fa3ee tcp://10.241.5.221:4567
      2018-03-02 11:46:48 140111382202560 [Warning] WSREP: (025fa3ee, 'tcp://0.0.0.0:4567') address 'tcp://10.241.5.221:4567' points to own listening address, blacklisting
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: (025fa3ee, 'tcp://0.0.0.0:4567') connection established to 8fa6765f tcp://10.241.5.220:4567
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: (025fa3ee, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: (025fa3ee, 'tcp://0.0.0.0:4567') connection established to 0a0faad9 tcp://10.241.5.222:4567
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: declaring 0a0faad9 at tcp://10.241.5.222:4567 stable
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: declaring 8fa6765f at tcp://10.241.5.220:4567 stable
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: Node 0a0faad9 state prim
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: view(view_id(PRIM,025fa3ee,31) memb {
              025fa3ee,0
              0a0faad9,0
              8fa6765f,0
      } joined {
      } left {
      } partitioned {
      })
      2018-03-02 11:46:48 140111382202560 [Note] WSREP: save pc into disk
      2018-03-02 11:46:49 140111382202560 [Note] WSREP: gcomm: connected
      2018-03-02 11:46:49 140111382202560 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
      2018-03-02 11:46:49 140111382202560 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
      2018-03-02 11:46:49 140111382202560 [Note] WSREP: Opened channel 'rdc-cluster-1'
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
      2018-03-02 11:46:49 140111382202560 [Note] WSREP: Waiting for SST to complete.
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 02ac5985-1e07-11e8-9509-d60e7fa9376f
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: STATE EXCHANGE: sent state msg: 02ac5985-1e07-11e8-9509-d60e7fa9376f
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: STATE EXCHANGE: got state msg: 02ac5985-1e07-11e8-9509-d60e7fa9376f from 0 (hb01-galera-cl01-02)
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: STATE EXCHANGE: got state msg: 02ac5985-1e07-11e8-9509-d60e7fa9376f from 1 (hb01-galera-cl01-03)
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: STATE EXCHANGE: got state msg: 02ac5985-1e07-11e8-9509-d60e7fa9376f from 2 (hb01-galera-cl01-01)
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: Quorum results:
              version    = 4,
              component  = PRIMARY,
              conf_id    = 30,
              members    = 2/3 (joined/total),
              act_id     = 469231120,
              last_appl. = -1,
              protocols  = 0/8/3 (gcs/repl/appl),
              group UUID = 6b51589d-0b5f-11e8-94e6-82aa9d3be300
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: Flow-control interval: [28, 28]
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: Trying to continue unpaused monitor
      2018-03-02 11:46:49 140089639511808 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 469231120)
      2018-03-02 11:46:49 140111229683456 [Note] WSREP: State transfer required:
              Group state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:469231120
              Local state: 00000000-0000-0000-0000-000000000000:-1
      2018-03-02 11:46:49 140111229683456 [Note] WSREP: New cluster view: global state: 6b51589d-0b5f-11e8-94e6-82aa9d3be300:469231120, view# 31: Primary, number of nodes: 3, my index: 0, protocol version 3
      2018-03-02 11:46:49 140111229683456 [Warning] WSREP: Gap in state sequence. Need state transfer.
      2018-03-02 11:46:49 140089631119104 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '31376' --binlog '/RDC/mysql/binlog/mariadb-bin' '
      WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20180302 11:46:49.566)
      2018-03-02 11:46:52 140089647904512 [Note] WSREP: (025fa3ee, 'tcp://0.0.0.0:4567') connection to peer 025fa3ee with addr tcp://10.241.5.221:4567 timed out, no messages seen in PT3S
      2018-03-02 11:46:52 140089647904512 [Note] WSREP: (025fa3ee, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2018-03-02 11:48:18 140089631119104 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '31376' --binlog '/RDC/mysql/binlog/mariadb-bin'
              Read: '(null)'
      2018-03-02 11:48:18 140089631119104 [ERROR] WSREP: Process was aborted.
      2018-03-02 11:48:18 140089631119104 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.241.5.221' --datadir '/RDC/mysql/data/'   --parent '31376' --binlog '/RDC/mysql/binlog/mariadb-bin' : 2 (No such file or directory)
      2018-03-02 11:48:18 140111229683456 [ERROR] WSREP: Failed to prepare for 'xtrabackup-v2' SST. Unrecoverable.
      2018-03-02 11:48:18 140111229683456 [ERROR] Aborting
       
      Error in my_thread_global_end(): 1 threads didn't exit
      

      on init (12.2.12) I load node 3 with slave, then node 1 & 2 with SST and it's worked perfectly

      Attachments

        Issue Links

          Activity

            Is it not the same problem as MDEV-15254?

            elenst Elena Stepanova added a comment - Is it not the same problem as MDEV-15254 ?

            i seem i saw request on donor, i check it too on monday

            Aurelien_LEQUOY Aurélien LEQUOY added a comment - i seem i saw request on donor, i check it too on monday
            Aurelien_LEQUOY Aurélien LEQUOY added a comment - - edited

            the problem here, I should take IST and not SST. there are 2 differents problems there.

            and SST totally failed. even on new cluster maybe fixed with 15254.

            Aurelien_LEQUOY Aurélien LEQUOY added a comment - - edited the problem here, I should take IST and not SST. there are 2 differents problems there. and SST totally failed. even on new cluster maybe fixed with 15254.

            XtraBackup 2.4 is not safe to use with MariaDB 10.2, and Mariabackup should be used instead.

            The reason is that unlike MySQL 5.7, which XtraBackup 2.4 is based on, MariaDB 10.2 made an effort (MDEV-12289) to not break compatibility with older InnoDB versions. The fix of this compatibility issue makes MariaDB 10.2 slightly less compatible with MySQL 5.7 and XtraBackup 2.4.

            marko Marko Mäkelä added a comment - XtraBackup 2.4 is not safe to use with MariaDB 10.2, and Mariabackup should be used instead. The reason is that unlike MySQL 5.7, which XtraBackup 2.4 is based on, MariaDB 10.2 made an effort ( MDEV-12289 ) to not break compatibility with older InnoDB versions. The fix of this compatibility issue makes MariaDB 10.2 slightly less compatible with MySQL 5.7 and XtraBackup 2.4.
            Aurelien_LEQUOY Aurélien LEQUOY added a comment - https://jira.mariadb.org/browse/MDEV-15383?focusedCommentId=108624&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-108624
            Aurelien_LEQUOY Aurélien LEQUOY added a comment - @Marko Mäkelä Xtrabackup or Mariabackup : the problem is the same : https://jira.mariadb.org/browse/MDEV-15383?focusedCommentId=108624&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-108624
            jplindst Jan Lindström (Inactive) added a comment - - edited

            I could not repeat the problem, in my system 10.2.12 -> 10.2.13 upgrade worked fine. So, either my configuration was different or system was different. Can you provide configuration files, joiner and donor full logs and system information?

            jplindst Jan Lindström (Inactive) added a comment - - edited I could not repeat the problem, in my system 10.2.12 -> 10.2.13 upgrade worked fine. So, either my configuration was different or system was different. Can you provide configuration files, joiner and donor full logs and system information?

            People

              jplindst Jan Lindström (Inactive)
              Aurelien_LEQUOY Aurélien LEQUOY
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.