Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-11280

IST time is far superior to SST

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Not a Bug
    • 10.1.18
    • N/A
    • Galera SST
    • Debian Jessie

    Description

      Ok, so I have 3 nodes cluster (with about 150GB of data on it), and I am setting up a backup system on another node.

      Since it's a big amount of data, my idea was to set up galera.gcache to 5GB (in order to avoid sst when adding new node), add a new Node, sync it, shut it down, and do backups of /var/lib/mysql folder.

      System works fine as long as there are small amount of writes in IST cache, but when the time delta is about 1 day, following weird thing happens:

      Starting Joiner on 09.38:

      Nov 14 09:38:24 recovery systemd[1]: Starting MariaDB database server...
      Nov 14 09:38:25 recovery sh[6950]: 2016-11-14  9:38:25 139816457574336 [Note] /usr/sbin/mysqld (mysqld 10.1.18-MariaDB-1~jessie) starting as process 7127 ...
      Nov 14 09:38:29 recovery sh[6950]: WSREP: Recovered position 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] /usr/sbin/mysqld (mysqld 10.1.18-MariaDB-1~jessie) starting as process 7160 ...
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: Read nil XID from storage engines, skipping position init
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: wsrep_load(): Galera 25.3.18(r3632) by Codership Oy <info@codership.com> loaded successfully.
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: CRC-32C: using hardware acceleration.
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: Found saved state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = ccc.ccc.169.121; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 5G; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; ist.recv_addr = ccc.ccc.169.121; ist.recv_bind = 10.2.0.201; pc.announce_ti
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139635027445504 [Note] WSREP: Service thread queue flushed.
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: Assign initial position for certification: 11125081, protocol version: -1
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: wsrep_sst_grab()
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: Start replication
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: Setting initial position to 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: protonet asio version 0
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: Using CRC-32C for message checksums.
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: backend: asio
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: gcomm thread scheduling priority set to other:0
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: restore pc from disk failed
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: GMCast version 0
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: EVS version 0
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: gcomm: connecting to group 'Bw Galera Cluster', peer 'aa.aaa.249.61:,ddd.ddd.156.26:,bb.bbb.15.127:'
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') connection established to 17f1f617 tcp://bb.bbb.15.127:4567
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') connection established to 15890d99 tcp://aa.aaa.249.61:4567
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') connection established to 5b4b3258 tcp://ddd.ddd.156.26:4567
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: declaring 15890d99 at tcp://aa.aaa.249.61:4567 stable
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: declaring 17f1f617 at tcp://bb.bbb.15.127:4567 stable
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: declaring 5b4b3258 at tcp://ddd.ddd.156.26:4567 stable
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: Node 15890d99 state prim
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: view(view_id(PRIM,15890d99,78) memb {
       
      Nov 14 09:38:29 recovery mysqld[7160]: 15890d99,0
      Nov 14 09:38:29 recovery mysqld[7160]: 17f1f617,0
      Nov 14 09:38:29 recovery mysqld[7160]: 197dad25,0
      Nov 14 09:38:29 recovery mysqld[7160]: 5b4b3258,0
      Nov 14 09:38:29 recovery mysqld[7160]: } joined {
      Nov 14 09:38:29 recovery mysqld[7160]: } left {
      Nov 14 09:38:29 recovery mysqld[7160]: } partitioned {
      Nov 14 09:38:29 recovery mysqld[7160]: })
      Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14  9:38:29 139640505280448 [Note] WSREP: save pc into disk
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139640505280448 [Note] WSREP: gcomm: connected
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139640505280448 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139640505280448 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139640505280448 [Note] WSREP: Opened channel 'Bw Galera Cluster'
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139640505280448 [Note] WSREP: Waiting for SST to complete.
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 4
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: sent state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 0 (bwgalera-node01.xxx.com)
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 1 (sd-87324)
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 3 (sd-77104)
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 2 (recovery.xxx.com)
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: Quorum results:
      Nov 14 09:38:30 recovery mysqld[7160]: version    = 4,
      Nov 14 09:38:30 recovery mysqld[7160]: component  = PRIMARY,
      Nov 14 09:38:30 recovery mysqld[7160]: conf_id    = 77,
      Nov 14 09:38:30 recovery mysqld[7160]: members    = 3/4 (joined/total),
      Nov 14 09:38:30 recovery mysqld[7160]: act_id     = 13870669,
      Nov 14 09:38:30 recovery mysqld[7160]: last_appl. = -1,
      Nov 14 09:38:30 recovery mysqld[7160]: protocols  = 0/7/3 (gcs/repl/appl),
      Nov 14 09:38:30 recovery mysqld[7160]: group UUID = 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: Flow-control interval: [32, 32]
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634956826368 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 13870669)
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139640504965888 [Note] WSREP: State transfer required:
      Nov 14 09:38:30 recovery mysqld[7160]: Group state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:13870669
      Nov 14 09:38:30 recovery mysqld[7160]: Local state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139640504965888 [Note] WSREP: New cluster view: global state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:13870669, view# 78: Primary, number of nodes: 4, my index: 2, protocol version 3
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139640504965888 [Warning] WSREP: Gap in state sequence. Need state transfer.
      Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14  9:38:30 139634927466240 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'ccc.ccc.169.121' --datadir '/var/lib/mysql/'   --parent '7160' --binlog '/var/lib/mysql/mysql-bin' '
      Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Streaming with xbstream (20161114 09:38:31.031)
      Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Using socat as streamer (20161114 09:38:31.033)
      Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | pv -f  -i 10 -N joiner -F '%N => Rate:%r Avg:%a Elapsed:%t %e Bytes: %b %p'  2>>/tmp/log-sst | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20161114 09:38:31.069)
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640504965888 [Note] WSREP: Prepared SST request: xtrabackup-v2|ccc.ccc.169.121:4444/xtrabackup_sst//1
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640504965888 [Note] WSREP: REPL Protocols: 7 (3, 2)
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139635027445504 [Note] WSREP: Service thread queue flushed.
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640504965888 [Note] WSREP: Assign initial position for certification: 13870669, protocol version: 3
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139635027445504 [Note] WSREP: Service thread queue flushed.
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640504965888 [Note] WSREP: IST receiver addr using tcp://ccc.ccc.169.121:4568
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640504965888 [Note] WSREP: IST receiver bind using tcp://10.2.0.201:4568
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640504965888 [Note] WSREP: Prepared IST receiver, listening at: tcp://10.2.0.201:4568
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139634956826368 [Note] WSREP: Member 2.0 (recovery.xxx.com) requested state transfer from '*any*'. Selected 0.0 (bwgalera-node01.xxx.com)(SYNCED) as donor.
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139634956826368 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 13870683)
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640504965888 [Note] WSREP: Requesting state transfer: success, donor: 0
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139634956826368 [Note] WSREP: 0.0 (bwgalera-node01.xxx.com): State transfer to 2.0 (recovery.xxx.com) complete.
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139634956826368 [Note] WSREP: Member 0.0 (bwgalera-node01.xxx.com) synced with group.
      Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] NOTE: Joiner-Recv-gtid took 0 seconds (20161114 09:38:31.454)
      Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] xtrabackup_ist received from donor: Running IST (20161114 09:38:31.476)
      Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Galera co-ords from recovery: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081 0 (20161114 09:38:31.478)
      Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Total time on joiner: 0 seconds (20161114 09:38:31.480)
      Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Removing the sst_in_progress file (20161114 09:38:31.484)
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] WSREP: SST complete, seqno: 11125081
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] Using unique option prefix 'myisam-recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead.
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: Using mutexes to ref count buffer pool pages
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: The InnoDB memory heap is disabled
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: Compressed tables use zlib 1.2.8
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: Using Linux native AIO
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: Using SSE crc32 instructions
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: Initializing buffer pool, size = 2.7G
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: Completed initialization of buffer pool
      Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14  9:38:31 139640505280448 [Note] InnoDB: Highest supported file format is Barracuda.
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139640505280448 [Note] InnoDB: 128 rollback segment(s) are active.
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139640505280448 [Note] InnoDB: Waiting for purge to start
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139640505280448 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.32-78.1 started; log sequence number 2788257505299
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139631500691200 [Note] InnoDB: Dumping buffer pool(s) not yet started
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139640505280448 [Note] Plugin 'FEEDBACK' is disabled.
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139634973603584 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') turning message relay requesting off
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139640505280448 [Note] Server socket created on IP: '::'.
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139640505280448 [Note] WSREP: Signalling provider to continue.
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139640505280448 [Note] WSREP: SST received: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139640504965888 [Note] WSREP: Receiving IST: 2745588 writesets, seqnos 11125081-13870669
      Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14  9:38:32 139640505280448 [Note] /usr/sbin/mysqld: ready for connections.
      Nov 14 09:38:32 recovery mysqld[7160]: Version: '10.1.18-MariaDB-1~jessie'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
      Nov 14 09:38:32 recovery debian-start[7502]: Checking for corrupt, not cleanly closed and upgrade needing tables.
      Nov 14 09:38:32 recovery systemd[1]: Started MariaDB database server.
      

      Donor:

      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066827532032 [Note] WSREP: (15890d99, 'tcp://0.0.0.0:4567') connection established to 197dad25 tcp://ccc.ccc.169.121:4567
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066827532032 [Note] WSREP: (15890d99, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066827532032 [Note] WSREP: declaring 17f1f617 at tcp://bb.bbb.15.127:4567 stable
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066827532032 [Note] WSREP: declaring 197dad25 at tcp://ccc.ccc.169.121:4567 stable
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066827532032 [Note] WSREP: declaring 5b4b3258 at tcp://ddd.ddd.156.26:4567 stable
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066827532032 [Note] WSREP: Node 15890d99 state prim
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066827532032 [Note] WSREP: view(view_id(PRIM,15890d99,78) memb {
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 15890d99,0
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 17f1f617,0
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 197dad25,0
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 5b4b3258,0
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: } joined {
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: } left {
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: } partitioned {
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: })
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066827532032 [Note] WSREP: save pc into disk
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066819139328 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 4
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066819139328 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 19cac395-aa4e-11e6-8208-2a55f97f848c
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066819139328 [Note] WSREP: STATE EXCHANGE: sent state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066819139328 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 0 (bwgalera-node01.xxx.com)
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066819139328 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 1 (sd-87324)
      Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:29 140066819139328 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 3 (sd-77104)
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:30 140066819139328 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 2 (recovery.xxx.com)
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:30 140066819139328 [Note] WSREP: Quorum results:
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: version    = 4,
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: component  = PRIMARY,
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: conf_id    = 77,
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: members    = 3/4 (joined/total),
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: act_id     = 13870669,
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: last_appl. = 13870643,
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: protocols  = 0/7/3 (gcs/repl/appl),
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: group UUID = 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:30 140066819139328 [Note] WSREP: Flow-control interval: [32, 32]
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:30 140072341510912 [Note] WSREP: New cluster view: global state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:13870669, view# 78: Primary, number of nodes: 4, my index: 0, protocol version 3
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:30 140072341510912 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:30 140072341510912 [Note] WSREP: REPL Protocols: 7 (3, 2)
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:30 140066881373952 [Note] WSREP: Service thread queue flushed.
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:30 140072341510912 [Note] WSREP: Assign initial position for certification: 13870669, protocol version: 3
      Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:30 140066881373952 [Note] WSREP: Service thread queue flushed.
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140066819139328 [Note] WSREP: Member 2.0 (recovery.xxx.com) requested state transfer from '*any*'. Selected 0.0 (bwgalera-node01.xxx.com)(SYNCED) as donor.
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140066819139328 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 13870683)
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140072341510912 [Note] WSREP: IST request: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081-13870669|tcp://ccc.ccc.169.121:4568
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140072341510912 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 139988716861184 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address 'ccc.ccc.169.121:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/'    --binlog '/var/lib/mysql/mysql-bin' --gtid '9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081' --gtid-domain-id '0' --bypass'
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140072341510912 [Note] WSREP: sst_donor_thread signaled with 0
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 139988708468480 [Note] WSREP: async IST sender starting to serve tcp://ccc.ccc.169.121:4568 sending 11125082-13870669
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Streaming with xbstream (20161114 09:38:31.422)
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Using socat as streamer (20161114 09:38:31.424)
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Bypassing the SST for IST (20161114 09:38:31.427)
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} ${IST_FILE} | socat -u stdio TCP:ccc.ccc.169.121:4444; RC=( ${PIPESTATUS[@]} ) (20161114 09:38:31.432)
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] NOTE: donor-IST took 0 seconds (20161114 09:38:31.436)
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140066819139328 [Note] WSREP: 0.0 (bwgalera-node01.xxx.com): State transfer to 2.0 (recovery.xxx.com) complete.
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140066819139328 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 13870683)
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Total time on donor: 0 seconds (20161114 09:38:31.437)
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140066819139328 [Note] WSREP: Member 0.0 (bwgalera-node01.xxx.com) synced with group.
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140066819139328 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 13870683)
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140072357923584 [Note] WSREP: Synchronized with group, ready for connections
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:31 140072357923584 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Cleaning up temporary directories (20161114 09:38:31.439)
      Nov 14 09:38:32 bwgalera-node01 mysqld[45233]: 2016-11-14  9:38:32 140066827532032 [Note] WSREP: (15890d99, 'tcp://0.0.0.0:4567') turning message relay requesting off
      

      Right now (after 1 hour of IST) this is situation on joiner

      SHOW GLOBAL STATUS LIKE 'wsrep_%';
      +------------------------------+--------------------------------------------------------------------------------+
      | Variable_name                | Value                                                                          |
      +------------------------------+--------------------------------------------------------------------------------+
      | wsrep_apply_oooe             | 0.973601                                                                       |
      | wsrep_apply_oool             | 0.000003                                                                       |
      | wsrep_apply_window           | 3.953276                                                                       |
      | wsrep_causal_reads           | 0                                                                              |
      | wsrep_cert_deps_distance     | 0.000000                                                                       |
      | wsrep_cert_index_size        | 0                                                                              |
      | wsrep_cert_interval          | 0.000000                                                                       |
      | wsrep_cluster_conf_id        | 78                                                                             |
      | wsrep_cluster_size           | 4                                                                              |
      | wsrep_cluster_state_uuid     | 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268                                           |
      | wsrep_cluster_status         | Primary                                                                        |
      | wsrep_commit_oooe            | 0.000000                                                                       |
      | wsrep_commit_oool            | 0.000002                                                                       |
      | wsrep_commit_window          | 2.928580                                                                       |
      | wsrep_connected              | ON                                                                             |
      | wsrep_desync_count           | 0                                                                              |
      | wsrep_evs_delayed            |                                                                                |
      | wsrep_evs_evict_list         |                                                                                |
      | wsrep_evs_repl_latency       | 0/0/0/0/0                                                                      |
      | wsrep_evs_state              | OPERATIONAL                                                                    |
      | wsrep_flow_control_paused    | 0.000000                                                                       |
      | wsrep_flow_control_paused_ns | 0                                                                              |
      | wsrep_flow_control_recv      | 0                                                                              |
      | wsrep_flow_control_sent      | 0                                                                              |
      | wsrep_gcomm_uuid             | 197dad25-aa4e-11e6-a2c5-ab393356446b                                           |
      | wsrep_incoming_addresses     | aa.aaa.249.61:3306,bb.bbb.15.127:3306,ccc.ccc.169.121:3306,ddd.ddd.156.26:3306 |
      | wsrep_last_committed         | 13787724                                                                       |
      | wsrep_local_bf_aborts        | 0                                                                              |
      | wsrep_local_cached_downto    | 18446744073709551615                                                           |
      | wsrep_local_cert_failures    | 0                                                                              |
      | wsrep_local_commits          | 0                                                                              |
      | wsrep_local_index            | 2                                                                              |
      | wsrep_local_recv_queue       | 20343                                                                          |
      | wsrep_local_recv_queue_avg   | 10170.500049                                                                   |
      | wsrep_local_recv_queue_max   | 20343                                                                          |
      | wsrep_local_recv_queue_min   | 0                                                                              |
      | wsrep_local_replays          | 0                                                                              |
      | wsrep_local_send_queue       | 0                                                                              |
      | wsrep_local_send_queue_avg   | 0.333333                                                                       |
      | wsrep_local_send_queue_max   | 2                                                                              |
      | wsrep_local_send_queue_min   | 0                                                                              |
      | wsrep_local_state            | 1                                                                              |
      | wsrep_local_state_comment    | Joining: receiving State Transfer                                              |
      | wsrep_local_state_uuid       | 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268                                           |
      | wsrep_protocol_version       | 7                                                                              |
      | wsrep_provider_name          | Galera                                                                         |
      | wsrep_provider_vendor        | Codership Oy <info@codership.com>                                              |
      | wsrep_provider_version       | 25.3.18(r3632)                                                                 |
      | wsrep_ready                  | OFF                                                                            |
      | wsrep_received               | 1                                                                              |
      | wsrep_received_bytes         | 396                                                                            |
      | wsrep_repl_data_bytes        | 0                                                                              |
      | wsrep_repl_keys              | 0                                                                              |
      | wsrep_repl_keys_bytes        | 0                                                                              |
      | wsrep_repl_other_bytes       | 0                                                                              |
      | wsrep_replicated             | 0                                                                              |
      | wsrep_replicated_bytes       | 0                                                                              |
      | wsrep_thread_count           | 5                                                                              |
      +------------------------------+--------------------------------------------------------------------------------+
      

      And this is "master" node providing IST

      SHOW GLOBAL STATUS LIKE 'wsrep_%';
      +------------------------------+------------------------------------------------------------+
      | Variable_name                | Value                                                      |
      +------------------------------+------------------------------------------------------------+
      | wsrep_apply_oooe             | 0.022080                                                   |
      | wsrep_apply_oool             | 0.000002                                                   |
      | wsrep_apply_window           | 1.024199                                                   |
      | wsrep_causal_reads           | 0                                                          |
      | wsrep_cert_deps_distance     | 71.001237                                                  |
      | wsrep_cert_index_size        | 29                                                         |
      | wsrep_cert_interval          | 0.050889                                                   |
      | wsrep_cluster_conf_id        | 79                                                         |
      | wsrep_cluster_size           | 3                                                          |
      | wsrep_cluster_state_uuid     | 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268                       |
      | wsrep_cluster_status         | Primary                                                    |
      | wsrep_commit_oooe            | 0.000000                                                   |
      | wsrep_commit_oool            | 0.000001                                                   |
      | wsrep_commit_window          | 1.002219                                                   |
      | wsrep_connected              | ON                                                         |
      | wsrep_desync_count           | 0                                                          |
      | wsrep_evs_delayed            |                                                            |
      | wsrep_evs_evict_list         |                                                            |
      | wsrep_evs_repl_latency       | 0.000336649/0.000641091/0.000783475/8.88954e-05/44         |
      | wsrep_evs_state              | OPERATIONAL                                                |
      | wsrep_flow_control_paused    | 0.000007                                                   |
      | wsrep_flow_control_paused_ns | 4177144290                                                 |
      | wsrep_flow_control_recv      | 338                                                        |
      | wsrep_flow_control_sent      | 1                                                          |
      | wsrep_gcomm_uuid             | 15890d99-a506-11e6-9af8-7b2457495062                       |
      | wsrep_incoming_addresses     | aa.aaa.249.61:3306,bb.bbb.15.127:3306,ddd.ddd.156.26:3306 |
      | wsrep_last_committed         | 13892160                                                   |
      | wsrep_local_bf_aborts        | 25                                                         |
      | wsrep_local_cached_downto    | 5984759                                                    |
      | wsrep_local_cert_failures    | 1                                                          |
      | wsrep_local_commits          | 7556981                                                    |
      | wsrep_local_index            | 0                                                          |
      | wsrep_local_recv_queue       | 0                                                          |
      | wsrep_local_recv_queue_avg   | 0.359858                                                   |
      | wsrep_local_recv_queue_max   | 456                                                        |
      | wsrep_local_recv_queue_min   | 0                                                          |
      | wsrep_local_replays          | 0                                                          |
      | wsrep_local_send_queue       | 0                                                          |
      | wsrep_local_send_queue_avg   | 0.004681                                                   |
      | wsrep_local_send_queue_max   | 28                                                         |
      | wsrep_local_send_queue_min   | 0                                                          |
      | wsrep_local_state            | 4                                                          |
      | wsrep_local_state_comment    | Synced                                                     |
      | wsrep_local_state_uuid       | 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268                       |
      | wsrep_protocol_version       | 7                                                          |
      | wsrep_provider_name          | Galera                                                     |
      | wsrep_provider_vendor        | Codership Oy <info@codership.com>                          |
      | wsrep_provider_version       | 25.3.18(r3632)                                             |
      | wsrep_ready                  | ON                                                         |
      | wsrep_received               | 514789                                                     |
      | wsrep_received_bytes         | 186876645                                                  |
      | wsrep_repl_data_bytes        | 2829795019                                                 |
      | wsrep_repl_keys              | 37148075                                                   |
      | wsrep_repl_keys_bytes        | 471116441                                                  |
      | wsrep_repl_other_bytes       | 0                                                          |
      | wsrep_replicated             | 7556985                                                    |
      | wsrep_replicated_bytes       | 3784558500                                                 |
      | wsrep_thread_count           | 5                                                          |
      +------------------------------+------------------------------------------------------------+
      

      EDIT:
      Finally on 10:45 (about an hour and 15 minutes) IST was completed on backup node

      Nov 14 10:45:15 recovery mysqld[7160]: 2016-11-14 10:45:15 139640504965888 [Note] WSREP: IST received: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:13870669
      Nov 14 10:45:15 recovery mysqld[7160]: 2016-11-14 10:45:15 139634956826368 [Note] WSREP: 2.0 (recovery.xxx.com): State transfer from 0.0 (bwgalera-node01.xxx.com) complete.
      Nov 14 10:45:15 recovery mysqld[7160]: 2016-11-14 10:45:15 139634956826368 [Note] WSREP: Shifting JOINER -> JOINED (TO: 13891875)
      Nov 14 10:45:50 recovery mysqld[7160]: 2016-11-14 10:45:50 139634956826368 [Note] WSREP: Member 2.0 (recovery.xxx.com) synced with group.
      Nov 14 10:45:50 recovery mysqld[7160]: 2016-11-14 10:45:50 139634956826368 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 13892087)
      Nov 14 10:45:50 recovery mysqld[7160]: 2016-11-14 10:45:50 139640504965888 [Note] WSREP: Synchronized with group, ready for connections
      Nov 14 10:45:54 recovery systemd[1]: Stopping MariaDB database server...
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] /usr/sbin/mysqld: Normal shutdown
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: Stop replication
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: Closing send monitor...
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: Closed send monitor.
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: terminating thread
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: joining thread
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: closing backend
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: view(view_id(NON_PRIM,15890d99,78) memb {
      Nov 14 10:45:54 recovery mysqld[7160]: 197dad25,0
      Nov 14 10:45:54 recovery mysqld[7160]: } joined {
      Nov 14 10:45:54 recovery mysqld[7160]: } left {
      Nov 14 10:45:54 recovery mysqld[7160]: } partitioned {
      Nov 14 10:45:54 recovery mysqld[7160]: 15890d99,0
      Nov 14 10:45:54 recovery mysqld[7160]: 17f1f617,0
      Nov 14 10:45:54 recovery mysqld[7160]: 5b4b3258,0
      Nov 14 10:45:54 recovery mysqld[7160]: })
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: view((empty))
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: closed
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
      Nov 14 10:45:54 recovery mysqld[7160]: 5b4b3258,0
      Nov 14 10:45:54 recovery mysqld[7160]: })
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: view((empty))
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: closed
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: Flow-control interval: [16, 16]
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: Received NON-PRIMARY.
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 13892122)
      Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: Received self-leave message.
      

      So, it would take me about 45 minutes to complete SST, and 1 hour and 15 for IST. Is it a normal behaviour or I am missing something?

      Attachments

        Activity

          People

            anikitin Andrii Nikitin (Inactive)
            alekc Alexander Chernov
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.