[MDEV-9423] cannot add new node to the cluser: Binlog file '/var/log/mysql/mariadb-bin.000001' not found in binlog index, needed for recovery. Aborting. - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.1.10
Fix Version/s: 10.1.15
Component/s: Galera
Labels:
None
Environment:
Linux fm-first 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u1 (2015-12-14) x86_64 GNU/Linux

Sprint:
10.1.15

Description

Looks like the master hasn't been shutdown properly and now the new node that tries to connect to this cluster tries to run a recovery , but the binlog file doesn't exist.

starting the daemon with

mysqld --wsrep_cluster_address=......

2016-01-15  9:33:23 140341928318912 [Note] mysqld (mysqld 10.1.10-MariaDB-1~jessie-log) starting as process 1 ...

2016-01-15  9:33:23 140341928318912 [Note] WSREP: Read nil XID from storage engines, skipping position init

2016-01-15  9:33:23 140341928318912 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'

2016-01-15  9:33:23 140341928318912 [Note] WSREP: wsrep_load(): Galera 3.9(rXXXX) by Codership Oy <info@codership.com> loaded successfully.

2016-01-15  9:33:23 140341928318912 [Note] WSREP: CRC-32C: using hardware acceleration.

2016-01-15  9:33:23 140341928318912 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1

2016-01-15  9:33:23 140341928318912 [Note] WSREP: Passing config to GCS: base_host = 10.0.0.8; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 3G; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = 1;

2016-01-15  9:33:23 140338580059904 [Note] WSREP: Service thread queue flushed.

2016-01-15  9:33:23 140341928318912 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1

2016-01-15  9:33:23 140341928318912 [Note] WSREP: wsrep_sst_grab()

2016-01-15  9:33:23 140341928318912 [Note] WSREP: Start replication

2016-01-15  9:33:23 140341928318912 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1

2016-01-15  9:33:23 140341928318912 [Note] WSREP: protonet asio version 0

2016-01-15  9:33:23 140341928318912 [Note] WSREP: Using CRC-32C for message checksums.

2016-01-15  9:33:23 140341928318912 [Note] WSREP: backend: asio

2016-01-15  9:33:23 140341928318912 [Warning] WSREP: access file(gvwstate.dat) failed(No such file or directory)

2016-01-15  9:33:23 140341928318912 [Note] WSREP: restore pc from disk failed

2016-01-15  9:33:23 140341928318912 [Note] WSREP: GMCast version 0

2016-01-15  9:33:23 140341928318912 [Warning] WSREP: Failed to resolve tcp://compose_mariadb_1:4567

2016-01-15  9:33:23 140341928318912 [Note] WSREP: (05b59999, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567

2016-01-15  9:33:23 140341928318912 [Note] WSREP: (05b59999, 'tcp://0.0.0.0:4567') multicast: , ttl: 1

2016-01-15  9:33:23 140341928318912 [Note] WSREP: EVS version 0

2016-01-15  9:33:23 140341928318912 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer 'compose_garbd_1:4567,compose_mariadb_1:4567,second_mariadb_1:4567'

2016-01-15  9:33:23 140341928318912 [Note] WSREP: (05b59999, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:

2016-01-15  9:33:24 140341928318912 [Note] WSREP: declaring 1afe1669 at tcp://10.0.0.9:4567 stable

2016-01-15  9:33:24 140341928318912 [Note] WSREP: declaring 41c70c8c at tcp://10.0.0.12:4567 stable

2016-01-15  9:33:24 140341928318912 [Note] WSREP: Node 1afe1669 state prim

2016-01-15  9:33:25 140341928318912 [Note] WSREP: view(view_id(PRIM,05b59999,96) memb {

	05b59999,0

	1afe1669,0

	41c70c8c,0

} joined {

} left {

} partitioned {

})

2016-01-15  9:33:25 140341928318912 [Note] WSREP: save pc into disk

2016-01-15  9:33:25 140341928318912 [Note] WSREP: gcomm: connected

2016-01-15  9:33:25 140341928318912 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636

2016-01-15  9:33:25 140341928318912 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)

2016-01-15  9:33:25 140341928318912 [Note] WSREP: Opened channel 'my_wsrep_cluster'

2016-01-15  9:33:25 140341928318912 [Note] WSREP: Waiting for SST to complete.

2016-01-15  9:33:25 140338517763840 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3

2016-01-15  9:33:25 140338517763840 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 069bc67c-bb6b-11e5-a9b3-1ad7790ea186

2016-01-15  9:33:25 140338517763840 [Note] WSREP: STATE EXCHANGE: sent state msg: 069bc67c-bb6b-11e5-a9b3-1ad7790ea186

2016-01-15  9:33:25 140338517763840 [Note] WSREP: STATE EXCHANGE: got state msg: 069bc67c-bb6b-11e5-a9b3-1ad7790ea186 from 0 ()

2016-01-15  9:33:25 140338517763840 [Note] WSREP: STATE EXCHANGE: got state msg: 069bc67c-bb6b-11e5-a9b3-1ad7790ea186 from 1 (garb)

2016-01-15  9:33:25 140338517763840 [Note] WSREP: STATE EXCHANGE: got state msg: 069bc67c-bb6b-11e5-a9b3-1ad7790ea186 from 2 ()

2016-01-15  9:33:25 140338517763840 [Note] WSREP: Quorum results:

	version    = 3,

	component  = PRIMARY,

	conf_id    = 93,

	members    = 2/3 (joined/total),

	act_id     = 205387,

	last_appl. = -1,

	protocols  = 0/7/3 (gcs/repl/appl),

	group UUID = 0600869f-af4f-11e5-8c89-a35fa6cdd2ef

2016-01-15  9:33:25 140338517763840 [Note] WSREP: Flow-control interval: [28, 28]

2016-01-15  9:33:25 140338517763840 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 205387)

2016-01-15  9:33:25 140341928011520 [Note] WSREP: State transfer required:

	Group state: 0600869f-af4f-11e5-8c89-a35fa6cdd2ef:205387

	Local state: 00000000-0000-0000-0000-000000000000:-1

2016-01-15  9:33:25 140341928011520 [Note] WSREP: New cluster view: global state: 0600869f-af4f-11e5-8c89-a35fa6cdd2ef:205387, view# 94: Primary, number of nodes: 3, my index: 0, protocol version 3

2016-01-15  9:33:25 140341928011520 [Warning] WSREP: Gap in state sequence. Need state transfer.

2016-01-15  9:33:25 140338488407808 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.0.0.8' --datadir '/var/lib/mysql/'   --parent '1' --binlog '/var/log/mysql/mariadb-bin' '

WSREP_SST: [INFO] Logging all stderr of SST/Innobackupex to syslog (20160115 09:33:25.515)

2016-01-15  9:33:25 140341928011520 [Note] WSREP: Prepared SST request: xtrabackup-v2|10.0.0.8:4444/xtrabackup_sst//1

2016-01-15  9:33:25 140341928011520 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

2016-01-15  9:33:25 140341928011520 [Note] WSREP: REPL Protocols: 7 (3, 2)

2016-01-15  9:33:25 140338580059904 [Note] WSREP: Service thread queue flushed.

2016-01-15  9:33:25 140341928011520 [Note] WSREP: Assign initial position for certification: 205387, protocol version: 3

2016-01-15  9:33:25 140338580059904 [Note] WSREP: Service thread queue flushed.

2016-01-15  9:33:25 140341928011520 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (0600869f-af4f-11e5-8c89-a35fa6cdd2ef): 1 (Operation not permitted)

	 at galera/src/replicator_str.cpp:prepare_for_IST():456. IST will be unavailable.

2016-01-15  9:33:25 140338517763840 [Note] WSREP: Member 0.0 () requested state transfer from '*any*'. Selected 2.0 ()(SYNCED) as donor.

2016-01-15  9:33:25 140338517763840 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 205387)

2016-01-15  9:33:25 140341928011520 [Note] WSREP: Requesting state transfer: success, donor: 2

2016-01-15  9:33:27 140338526156544 [Note] WSREP: (05b59999, 'tcp://0.0.0.0:4567') turning message relay requesting off

2016-01-15 10:25:15 140338517763840 [Note] WSREP: 2.0 (): State transfer to 0.0 () complete.

2016-01-15 10:25:15 140338517763840 [Note] WSREP: Member 2.0 () synced with group.

2016-01-15 10:25:20 140341928318912 [Note] WSREP: SST complete, seqno: 205388

2016-01-15 10:25:21 140341928318912 [Note] InnoDB: Using mutexes to ref count buffer pool pages

2016-01-15 10:25:21 140341928318912 [Note] InnoDB: The InnoDB memory heap is disabled

2016-01-15 10:25:21 140341928318912 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins

2016-01-15 10:25:21 140341928318912 [Note] InnoDB: Memory barrier is not used

2016-01-15 10:25:21 140341928318912 [Note] InnoDB: Compressed tables use zlib 1.2.8

2016-01-15 10:25:21 140341928318912 [Note] InnoDB: Using Linux native AIO

2016-01-15 10:25:21 140341928318912 [Note] InnoDB: Using SSE crc32 instructions

2016-01-15 10:25:21 140341928318912 [Note] InnoDB: Initializing buffer pool, size = 9.8G

2016-01-15 10:25:21 140341928318912 [Note] InnoDB: Completed initialization of buffer pool

2016-01-15 10:25:22 140341928318912 [Note] InnoDB: Highest supported file format is Barracuda.

2016-01-15 10:25:22 140341928318912 [Note] InnoDB: 128 rollback segment(s) are active.

2016-01-15 10:25:22 140341928318912 [Note] InnoDB: Waiting for purge to start

2016-01-15 10:25:22 140341928318912 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.26-76.0 started; log sequence number 79527883504

2016-01-15 10:25:22 140318926145280 [Note] InnoDB: Dumping buffer pool(s) not yet started

2016-01-15 10:25:22 140341928318912 [Note] Plugin 'FEEDBACK' is disabled.

2016-01-15 10:25:22 140341928318912 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin

2016-01-15 10:25:22 140341928318912 [ERROR] Binlog file '/var/log/mysql/mariadb-bin.000001' not found in binlog index, needed for recovery. Aborting.

2016-01-15 10:25:22 140341928318912 [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error) and restart, or delete (or rename) binary log and start mysqld with --tc-heuristic-recover={commit|rollback}

2016-01-15 10:25:22 140341928318912 [ERROR] Can't init tc log

2016-01-15 10:25:22 140341928318912 [ERROR] Aborting

Attachments

Issue Links

duplicates

MDEV-10188 MariaDB server can't join Galera cluster after restart

Closed

links to

Commit

Activity

People

Assignee:: Nirbhay Choubey (Inactive)

Reporter:: Krasimir

Votes:: 4 Vote for this issue

Watchers:: 13 Start watching this issue

Dates

Created:: 2016-01-15 13:32

Updated:: 2016-08-21 20:20

Resolved:: 2016-06-29 21:06

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server