Details
-
Bug
-
Status: Closed (View Workflow)
-
Minor
-
Resolution: Not a Bug
-
10.1.18
-
Debian Jessie
Description
Ok, so I have 3 nodes cluster (with about 150GB of data on it), and I am setting up a backup system on another node.
Since it's a big amount of data, my idea was to set up galera.gcache to 5GB (in order to avoid sst when adding new node), add a new Node, sync it, shut it down, and do backups of /var/lib/mysql folder.
System works fine as long as there are small amount of writes in IST cache, but when the time delta is about 1 day, following weird thing happens:
Starting Joiner on 09.38:
Nov 14 09:38:24 recovery systemd[1]: Starting MariaDB database server... |
Nov 14 09:38:25 recovery sh[6950]: 2016-11-14 9:38:25 139816457574336 [Note] /usr/sbin/mysqld (mysqld 10.1.18-MariaDB-1~jessie) starting as process 7127 ... |
Nov 14 09:38:29 recovery sh[6950]: WSREP: Recovered position 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] /usr/sbin/mysqld (mysqld 10.1.18-MariaDB-1~jessie) starting as process 7160 ... |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: Read nil XID from storage engines, skipping position init |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so' |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: wsrep_load(): Galera 25.3.18(r3632) by Codership Oy <info@codership.com> loaded successfully. |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: CRC-32C: using hardware acceleration. |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: Found saved state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = ccc.ccc.169.121; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 5G; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; ist.recv_addr = ccc.ccc.169.121; ist.recv_bind = 10.2.0.201; pc.announce_ti |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139635027445504 [Note] WSREP: Service thread queue flushed. |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: Assign initial position for certification: 11125081, protocol version: -1 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: wsrep_sst_grab() |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: Start replication |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: Setting initial position to 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: protonet asio version 0 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: Using CRC-32C for message checksums. |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: backend: asio |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: gcomm thread scheduling priority set to other:0 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory) |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: restore pc from disk failed |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: GMCast version 0 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') multicast: , ttl: 1 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: EVS version 0 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: gcomm: connecting to group 'Bw Galera Cluster', peer 'aa.aaa.249.61:,ddd.ddd.156.26:,bb.bbb.15.127:' |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') connection established to 17f1f617 tcp://bb.bbb.15.127:4567 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') connection established to 15890d99 tcp://aa.aaa.249.61:4567 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') connection established to 5b4b3258 tcp://ddd.ddd.156.26:4567 |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: declaring 15890d99 at tcp://aa.aaa.249.61:4567 stable |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: declaring 17f1f617 at tcp://bb.bbb.15.127:4567 stable |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: declaring 5b4b3258 at tcp://ddd.ddd.156.26:4567 stable |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: Node 15890d99 state prim |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: view(view_id(PRIM,15890d99,78) memb { |
 |
Nov 14 09:38:29 recovery mysqld[7160]: 15890d99,0 |
Nov 14 09:38:29 recovery mysqld[7160]: 17f1f617,0 |
Nov 14 09:38:29 recovery mysqld[7160]: 197dad25,0 |
Nov 14 09:38:29 recovery mysqld[7160]: 5b4b3258,0 |
Nov 14 09:38:29 recovery mysqld[7160]: } joined { |
Nov 14 09:38:29 recovery mysqld[7160]: } left { |
Nov 14 09:38:29 recovery mysqld[7160]: } partitioned { |
Nov 14 09:38:29 recovery mysqld[7160]: }) |
Nov 14 09:38:29 recovery mysqld[7160]: 2016-11-14 9:38:29 139640505280448 [Note] WSREP: save pc into disk |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139640505280448 [Note] WSREP: gcomm: connected |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139640505280448 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636 |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139640505280448 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0) |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139640505280448 [Note] WSREP: Opened channel 'Bw Galera Cluster' |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139640505280448 [Note] WSREP: Waiting for SST to complete. |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 4 |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID. |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: sent state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 0 (bwgalera-node01.xxx.com) |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 1 (sd-87324) |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 3 (sd-77104) |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 2 (recovery.xxx.com) |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: Quorum results: |
Nov 14 09:38:30 recovery mysqld[7160]: version = 4, |
Nov 14 09:38:30 recovery mysqld[7160]: component = PRIMARY, |
Nov 14 09:38:30 recovery mysqld[7160]: conf_id = 77, |
Nov 14 09:38:30 recovery mysqld[7160]: members = 3/4 (joined/total), |
Nov 14 09:38:30 recovery mysqld[7160]: act_id = 13870669, |
Nov 14 09:38:30 recovery mysqld[7160]: last_appl. = -1, |
Nov 14 09:38:30 recovery mysqld[7160]: protocols = 0/7/3 (gcs/repl/appl), |
Nov 14 09:38:30 recovery mysqld[7160]: group UUID = 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268 |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: Flow-control interval: [32, 32] |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634956826368 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 13870669) |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139640504965888 [Note] WSREP: State transfer required: |
Nov 14 09:38:30 recovery mysqld[7160]: Group state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:13870669 |
Nov 14 09:38:30 recovery mysqld[7160]: Local state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081 |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139640504965888 [Note] WSREP: New cluster view: global state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:13870669, view# 78: Primary, number of nodes: 4, my index: 2, protocol version 3 |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139640504965888 [Warning] WSREP: Gap in state sequence. Need state transfer. |
Nov 14 09:38:30 recovery mysqld[7160]: 2016-11-14 9:38:30 139634927466240 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'ccc.ccc.169.121' --datadir '/var/lib/mysql/' --parent '7160' --binlog '/var/lib/mysql/mysql-bin' ' |
Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Streaming with xbstream (20161114 09:38:31.031) |
Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Using socat as streamer (20161114 09:38:31.033) |
Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | pv -f -i 10 -N joiner -F '%N => Rate:%r Avg:%a Elapsed:%t %e Bytes: %b %p' 2>>/tmp/log-sst | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20161114 09:38:31.069) |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640504965888 [Note] WSREP: Prepared SST request: xtrabackup-v2|ccc.ccc.169.121:4444/xtrabackup_sst//1 |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640504965888 [Note] WSREP: REPL Protocols: 7 (3, 2) |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139635027445504 [Note] WSREP: Service thread queue flushed. |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640504965888 [Note] WSREP: Assign initial position for certification: 13870669, protocol version: 3 |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139635027445504 [Note] WSREP: Service thread queue flushed. |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640504965888 [Note] WSREP: IST receiver addr using tcp://ccc.ccc.169.121:4568 |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640504965888 [Note] WSREP: IST receiver bind using tcp://10.2.0.201:4568 |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640504965888 [Note] WSREP: Prepared IST receiver, listening at: tcp://10.2.0.201:4568 |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139634956826368 [Note] WSREP: Member 2.0 (recovery.xxx.com) requested state transfer from '*any*'. Selected 0.0 (bwgalera-node01.xxx.com)(SYNCED) as donor. |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139634956826368 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 13870683) |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640504965888 [Note] WSREP: Requesting state transfer: success, donor: 0 |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139634956826368 [Note] WSREP: 0.0 (bwgalera-node01.xxx.com): State transfer to 2.0 (recovery.xxx.com) complete. |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139634956826368 [Note] WSREP: Member 0.0 (bwgalera-node01.xxx.com) synced with group. |
Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] NOTE: Joiner-Recv-gtid took 0 seconds (20161114 09:38:31.454) |
Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] xtrabackup_ist received from donor: Running IST (20161114 09:38:31.476) |
Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Galera co-ords from recovery: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081 0 (20161114 09:38:31.478) |
Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Total time on joiner: 0 seconds (20161114 09:38:31.480) |
Nov 14 09:38:31 recovery mysqld[7160]: WSREP_SST: [INFO] Removing the sst_in_progress file (20161114 09:38:31.484) |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] WSREP: SST complete, seqno: 11125081 |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] Using unique option prefix 'myisam-recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead. |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: Using mutexes to ref count buffer pool pages |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: The InnoDB memory heap is disabled |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: Compressed tables use zlib 1.2.8 |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: Using Linux native AIO |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: Using SSE crc32 instructions |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: Initializing buffer pool, size = 2.7G |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: Completed initialization of buffer pool |
Nov 14 09:38:31 recovery mysqld[7160]: 2016-11-14 9:38:31 139640505280448 [Note] InnoDB: Highest supported file format is Barracuda. |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139640505280448 [Note] InnoDB: 128 rollback segment(s) are active. |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139640505280448 [Note] InnoDB: Waiting for purge to start |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139640505280448 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.32-78.1 started; log sequence number 2788257505299 |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139631500691200 [Note] InnoDB: Dumping buffer pool(s) not yet started |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139640505280448 [Note] Plugin 'FEEDBACK' is disabled. |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139634973603584 [Note] WSREP: (197dad25, 'tcp://0.0.0.0:4567') turning message relay requesting off |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139640505280448 [Note] Server socket created on IP: '::'. |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139640505280448 [Note] WSREP: Signalling provider to continue. |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139640505280448 [Note] WSREP: SST received: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081 |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139640504965888 [Note] WSREP: Receiving IST: 2745588 writesets, seqnos 11125081-13870669 |
Nov 14 09:38:32 recovery mysqld[7160]: 2016-11-14 9:38:32 139640505280448 [Note] /usr/sbin/mysqld: ready for connections. |
Nov 14 09:38:32 recovery mysqld[7160]: Version: '10.1.18-MariaDB-1~jessie' socket: '/var/run/mysqld/mysqld.sock' port: 3306 mariadb.org binary distribution |
Nov 14 09:38:32 recovery debian-start[7502]: Checking for corrupt, not cleanly closed and upgrade needing tables. |
Nov 14 09:38:32 recovery systemd[1]: Started MariaDB database server. |
Donor:
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066827532032 [Note] WSREP: (15890d99, 'tcp://0.0.0.0:4567') connection established to 197dad25 tcp://ccc.ccc.169.121:4567 |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066827532032 [Note] WSREP: (15890d99, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066827532032 [Note] WSREP: declaring 17f1f617 at tcp://bb.bbb.15.127:4567 stable |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066827532032 [Note] WSREP: declaring 197dad25 at tcp://ccc.ccc.169.121:4567 stable |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066827532032 [Note] WSREP: declaring 5b4b3258 at tcp://ddd.ddd.156.26:4567 stable |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066827532032 [Note] WSREP: Node 15890d99 state prim |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066827532032 [Note] WSREP: view(view_id(PRIM,15890d99,78) memb { |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 15890d99,0 |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 17f1f617,0 |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 197dad25,0 |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 5b4b3258,0 |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: } joined { |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: } left { |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: } partitioned { |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: }) |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066827532032 [Note] WSREP: save pc into disk |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066819139328 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 4 |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066819139328 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 19cac395-aa4e-11e6-8208-2a55f97f848c |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066819139328 [Note] WSREP: STATE EXCHANGE: sent state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066819139328 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 0 (bwgalera-node01.xxx.com) |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066819139328 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 1 (sd-87324) |
Nov 14 09:38:29 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:29 140066819139328 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 3 (sd-77104) |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:30 140066819139328 [Note] WSREP: STATE EXCHANGE: got state msg: 19cac395-aa4e-11e6-8208-2a55f97f848c from 2 (recovery.xxx.com) |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:30 140066819139328 [Note] WSREP: Quorum results: |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: version = 4, |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: component = PRIMARY, |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: conf_id = 77, |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: members = 3/4 (joined/total), |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: act_id = 13870669, |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: last_appl. = 13870643, |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: protocols = 0/7/3 (gcs/repl/appl), |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: group UUID = 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268 |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:30 140066819139328 [Note] WSREP: Flow-control interval: [32, 32] |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:30 140072341510912 [Note] WSREP: New cluster view: global state: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:13870669, view# 78: Primary, number of nodes: 4, my index: 0, protocol version 3 |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:30 140072341510912 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:30 140072341510912 [Note] WSREP: REPL Protocols: 7 (3, 2) |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:30 140066881373952 [Note] WSREP: Service thread queue flushed. |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:30 140072341510912 [Note] WSREP: Assign initial position for certification: 13870669, protocol version: 3 |
Nov 14 09:38:30 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:30 140066881373952 [Note] WSREP: Service thread queue flushed. |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140066819139328 [Note] WSREP: Member 2.0 (recovery.xxx.com) requested state transfer from '*any*'. Selected 0.0 (bwgalera-node01.xxx.com)(SYNCED) as donor. |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140066819139328 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 13870683) |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140072341510912 [Note] WSREP: IST request: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081-13870669|tcp://ccc.ccc.169.121:4568 |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140072341510912 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 139988716861184 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address 'ccc.ccc.169.121:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --binlog '/var/lib/mysql/mysql-bin' --gtid '9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:11125081' --gtid-domain-id '0' --bypass' |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140072341510912 [Note] WSREP: sst_donor_thread signaled with 0 |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 139988708468480 [Note] WSREP: async IST sender starting to serve tcp://ccc.ccc.169.121:4568 sending 11125082-13870669 |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Streaming with xbstream (20161114 09:38:31.422) |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Using socat as streamer (20161114 09:38:31.424) |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Bypassing the SST for IST (20161114 09:38:31.427) |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} ${IST_FILE} | socat -u stdio TCP:ccc.ccc.169.121:4444; RC=( ${PIPESTATUS[@]} ) (20161114 09:38:31.432) |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] NOTE: donor-IST took 0 seconds (20161114 09:38:31.436) |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140066819139328 [Note] WSREP: 0.0 (bwgalera-node01.xxx.com): State transfer to 2.0 (recovery.xxx.com) complete. |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140066819139328 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 13870683) |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Total time on donor: 0 seconds (20161114 09:38:31.437) |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140066819139328 [Note] WSREP: Member 0.0 (bwgalera-node01.xxx.com) synced with group. |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140066819139328 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 13870683) |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140072357923584 [Note] WSREP: Synchronized with group, ready for connections |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:31 140072357923584 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. |
Nov 14 09:38:31 bwgalera-node01 mysqld[45233]: WSREP_SST: [INFO] Cleaning up temporary directories (20161114 09:38:31.439) |
Nov 14 09:38:32 bwgalera-node01 mysqld[45233]: 2016-11-14 9:38:32 140066827532032 [Note] WSREP: (15890d99, 'tcp://0.0.0.0:4567') turning message relay requesting off |
Right now (after 1 hour of IST) this is situation on joiner
SHOW GLOBAL STATUS LIKE 'wsrep_%'; |
+------------------------------+--------------------------------------------------------------------------------+
|
| Variable_name | Value |
|
+------------------------------+--------------------------------------------------------------------------------+
|
| wsrep_apply_oooe | 0.973601 | |
| wsrep_apply_oool | 0.000003 | |
| wsrep_apply_window | 3.953276 | |
| wsrep_causal_reads | 0 | |
| wsrep_cert_deps_distance | 0.000000 | |
| wsrep_cert_index_size | 0 | |
| wsrep_cert_interval | 0.000000 | |
| wsrep_cluster_conf_id | 78 | |
| wsrep_cluster_size | 4 | |
| wsrep_cluster_state_uuid | 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268 |
|
| wsrep_cluster_status | Primary |
|
| wsrep_commit_oooe | 0.000000 | |
| wsrep_commit_oool | 0.000002 | |
| wsrep_commit_window | 2.928580 | |
| wsrep_connected | ON |
|
| wsrep_desync_count | 0 | |
| wsrep_evs_delayed | |
|
| wsrep_evs_evict_list | |
|
| wsrep_evs_repl_latency | 0/0/0/0/0 | |
| wsrep_evs_state | OPERATIONAL |
|
| wsrep_flow_control_paused | 0.000000 | |
| wsrep_flow_control_paused_ns | 0 | |
| wsrep_flow_control_recv | 0 | |
| wsrep_flow_control_sent | 0 | |
| wsrep_gcomm_uuid | 197dad25-aa4e-11e6-a2c5-ab393356446b |
|
| wsrep_incoming_addresses | aa.aaa.249.61:3306,bb.bbb.15.127:3306,ccc.ccc.169.121:3306,ddd.ddd.156.26:3306 | |
| wsrep_last_committed | 13787724 | |
| wsrep_local_bf_aborts | 0 | |
| wsrep_local_cached_downto | 18446744073709551615 | |
| wsrep_local_cert_failures | 0 | |
| wsrep_local_commits | 0 | |
| wsrep_local_index | 2 | |
| wsrep_local_recv_queue | 20343 | |
| wsrep_local_recv_queue_avg | 10170.500049 | |
| wsrep_local_recv_queue_max | 20343 | |
| wsrep_local_recv_queue_min | 0 | |
| wsrep_local_replays | 0 | |
| wsrep_local_send_queue | 0 | |
| wsrep_local_send_queue_avg | 0.333333 | |
| wsrep_local_send_queue_max | 2 | |
| wsrep_local_send_queue_min | 0 | |
| wsrep_local_state | 1 | |
| wsrep_local_state_comment | Joining: receiving State Transfer |
|
| wsrep_local_state_uuid | 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268 |
|
| wsrep_protocol_version | 7 | |
| wsrep_provider_name | Galera |
|
| wsrep_provider_vendor | Codership Oy <info@codership.com> | |
| wsrep_provider_version | 25.3.18(r3632) | |
| wsrep_ready | OFF |
|
| wsrep_received | 1 | |
| wsrep_received_bytes | 396 | |
| wsrep_repl_data_bytes | 0 | |
| wsrep_repl_keys | 0 | |
| wsrep_repl_keys_bytes | 0 | |
| wsrep_repl_other_bytes | 0 | |
| wsrep_replicated | 0 | |
| wsrep_replicated_bytes | 0 | |
| wsrep_thread_count | 5 | |
+------------------------------+--------------------------------------------------------------------------------+
|
And this is "master" node providing IST
SHOW GLOBAL STATUS LIKE 'wsrep_%'; |
+------------------------------+------------------------------------------------------------+
|
| Variable_name | Value |
|
+------------------------------+------------------------------------------------------------+
|
| wsrep_apply_oooe | 0.022080 | |
| wsrep_apply_oool | 0.000002 | |
| wsrep_apply_window | 1.024199 | |
| wsrep_causal_reads | 0 | |
| wsrep_cert_deps_distance | 71.001237 | |
| wsrep_cert_index_size | 29 | |
| wsrep_cert_interval | 0.050889 | |
| wsrep_cluster_conf_id | 79 | |
| wsrep_cluster_size | 3 | |
| wsrep_cluster_state_uuid | 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268 |
|
| wsrep_cluster_status | Primary |
|
| wsrep_commit_oooe | 0.000000 | |
| wsrep_commit_oool | 0.000001 | |
| wsrep_commit_window | 1.002219 | |
| wsrep_connected | ON |
|
| wsrep_desync_count | 0 | |
| wsrep_evs_delayed | |
|
| wsrep_evs_evict_list | |
|
| wsrep_evs_repl_latency | 0.000336649/0.000641091/0.000783475/8.88954e-05/44 | |
| wsrep_evs_state | OPERATIONAL |
|
| wsrep_flow_control_paused | 0.000007 | |
| wsrep_flow_control_paused_ns | 4177144290 | |
| wsrep_flow_control_recv | 338 | |
| wsrep_flow_control_sent | 1 | |
| wsrep_gcomm_uuid | 15890d99-a506-11e6-9af8-7b2457495062 |
|
| wsrep_incoming_addresses | aa.aaa.249.61:3306,bb.bbb.15.127:3306,ddd.ddd.156.26:3306 | |
| wsrep_last_committed | 13892160 | |
| wsrep_local_bf_aborts | 25 | |
| wsrep_local_cached_downto | 5984759 | |
| wsrep_local_cert_failures | 1 | |
| wsrep_local_commits | 7556981 | |
| wsrep_local_index | 0 | |
| wsrep_local_recv_queue | 0 | |
| wsrep_local_recv_queue_avg | 0.359858 | |
| wsrep_local_recv_queue_max | 456 | |
| wsrep_local_recv_queue_min | 0 | |
| wsrep_local_replays | 0 | |
| wsrep_local_send_queue | 0 | |
| wsrep_local_send_queue_avg | 0.004681 | |
| wsrep_local_send_queue_max | 28 | |
| wsrep_local_send_queue_min | 0 | |
| wsrep_local_state | 4 | |
| wsrep_local_state_comment | Synced |
|
| wsrep_local_state_uuid | 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268 |
|
| wsrep_protocol_version | 7 | |
| wsrep_provider_name | Galera |
|
| wsrep_provider_vendor | Codership Oy <info@codership.com> | |
| wsrep_provider_version | 25.3.18(r3632) | |
| wsrep_ready | ON |
|
| wsrep_received | 514789 | |
| wsrep_received_bytes | 186876645 | |
| wsrep_repl_data_bytes | 2829795019 | |
| wsrep_repl_keys | 37148075 | |
| wsrep_repl_keys_bytes | 471116441 | |
| wsrep_repl_other_bytes | 0 | |
| wsrep_replicated | 7556985 | |
| wsrep_replicated_bytes | 3784558500 | |
| wsrep_thread_count | 5 | |
+------------------------------+------------------------------------------------------------+
|
EDIT:
Finally on 10:45 (about an hour and 15 minutes) IST was completed on backup node
Nov 14 10:45:15 recovery mysqld[7160]: 2016-11-14 10:45:15 139640504965888 [Note] WSREP: IST received: 9a7742ee-a1b7-11e6-98c0-d2f1fa8da268:13870669 |
Nov 14 10:45:15 recovery mysqld[7160]: 2016-11-14 10:45:15 139634956826368 [Note] WSREP: 2.0 (recovery.xxx.com): State transfer from 0.0 (bwgalera-node01.xxx.com) complete. |
Nov 14 10:45:15 recovery mysqld[7160]: 2016-11-14 10:45:15 139634956826368 [Note] WSREP: Shifting JOINER -> JOINED (TO: 13891875) |
Nov 14 10:45:50 recovery mysqld[7160]: 2016-11-14 10:45:50 139634956826368 [Note] WSREP: Member 2.0 (recovery.xxx.com) synced with group. |
Nov 14 10:45:50 recovery mysqld[7160]: 2016-11-14 10:45:50 139634956826368 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 13892087) |
Nov 14 10:45:50 recovery mysqld[7160]: 2016-11-14 10:45:50 139640504965888 [Note] WSREP: Synchronized with group, ready for connections |
Nov 14 10:45:54 recovery systemd[1]: Stopping MariaDB database server... |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] /usr/sbin/mysqld: Normal shutdown |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: Stop replication |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: Closing send monitor... |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: Closed send monitor. |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: terminating thread |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: joining thread |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: closing backend |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: view(view_id(NON_PRIM,15890d99,78) memb { |
Nov 14 10:45:54 recovery mysqld[7160]: 197dad25,0 |
Nov 14 10:45:54 recovery mysqld[7160]: } joined { |
Nov 14 10:45:54 recovery mysqld[7160]: } left { |
Nov 14 10:45:54 recovery mysqld[7160]: } partitioned { |
Nov 14 10:45:54 recovery mysqld[7160]: 15890d99,0 |
Nov 14 10:45:54 recovery mysqld[7160]: 17f1f617,0 |
Nov 14 10:45:54 recovery mysqld[7160]: 5b4b3258,0 |
Nov 14 10:45:54 recovery mysqld[7160]: }) |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: view((empty)) |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: closed |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1 |
Nov 14 10:45:54 recovery mysqld[7160]: 5b4b3258,0 |
Nov 14 10:45:54 recovery mysqld[7160]: }) |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: view((empty)) |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139640486193920 [Note] WSREP: gcomm: closed |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1 |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: Flow-control interval: [16, 16] |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: Received NON-PRIMARY. |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 13892122) |
Nov 14 10:45:54 recovery mysqld[7160]: 2016-11-14 10:45:54 139634956826368 [Note] WSREP: Received self-leave message. |
So, it would take me about 45 minutes to complete SST, and 1 hour and 15 for IST. Is it a normal behaviour or I am missing something?