[MDEV-4163] Galera: WSREP fails to guess address to accept state on a localized OS Created: 2013-02-11 Updated: 2013-09-30 Resolved: 2013-09-30 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 5.5.28a-galera |
| Fix Version/s: | 5.5.33a-galera |
| Type: | Bug | Priority: | Major |
| Reporter: | Michée Lengronne | Assignee: | Seppo Jaakola |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | galera | ||
| Environment: |
Debian Squeeze on KVM. |
||
| Attachments: |
|
| Description |
|
Using the two bash scripts I made, the first for installing the service on each server, the second for connecting the cluster on the second server, I have the following error on the second server:
Feb 11 10:41:27 mariadb2 mysqld: 130211 10:41:27 [Note] /usr/sbin/mysqld: Normal shutdown
Feb 11 10:41:34 mariadb2 mysqld: 130211 10:41:34 [Warning] WSREP: Failed to guess base node address. Set it explicitly via wsrep_node_address.
Feb 11 10:41:34 mariadb2 mysqld: 130211 10:41:34 [Warning] WSREP: Guessing address for incoming client connections failed. Try setting wsrep_node_incoming_address explicitly.
Feb 11 10:41:34 mariadb2 mysqld: 130211 10:41:34 [Note] WSREP: Found saved state: 64dc5b89-742e-11e2-0800-9315c248faac:160
Feb 11 10:41:34 mariadb2 mysqld: 130211 10:41:34 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
Feb 11 10:41:34 mariadb2 mysqld: 130211 10:41:34 [Note] WSREP: Passing config to GCS: base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: Assign initial position for certification: 160, protocol version: -1
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: wsrep_sst_grab()
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: Start replication
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: Setting initial position to 64dc5b89-742e-11e2-0800-9315c248faac:160
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: protonet asio version 0
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: backend: asio
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: GMCast version 0
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: (3972dc6f-742f-11e2-0800-02e5048bee79, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: (3972dc6f-742f-11e2-0800-02e5048bee79, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: EVS version 0
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: PC version 0
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '192.168.122.137:'
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: declaring c71af152-742e-11e2-0800-b2d6fce1fcbe stable
Feb 11 10:41:35 mariadb2 mysqld: 130211 10:41:35 [Note] WSREP: view(view_id(PRIM,3972dc6f-742f-11e2-0800-02e5048bee79,2) memb {
Feb 11 10:41:35 mariadb2 mysqld: #0113972dc6f-742f-11e2-0800-02e5048bee79,
Feb 11 10:41:35 mariadb2 mysqld: #011c71af152-742e-11e2-0800-b2d6fce1fcbe,
Feb 11 10:41:35 mariadb2 mysqld: } joined {
Feb 11 10:41:35 mariadb2 mysqld: } left {
Feb 11 10:41:35 mariadb2 mysqld: } partitioned {
Feb 11 10:41:35 mariadb2 mysqld: })
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: gcomm: connected
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: Opened channel 'my_wsrep_cluster'
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: Waiting for SST to complete.
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 3a0d6651-742f-11e2-0800-786cdd42b5c4
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: STATE EXCHANGE: sent state msg: 3a0d6651-742f-11e2-0800-786cdd42b5c4
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: STATE EXCHANGE: got state msg: 3a0d6651-742f-11e2-0800-786cdd42b5c4 from 1 (mariadb1)
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: STATE EXCHANGE: got state msg: 3a0d6651-742f-11e2-0800-786cdd42b5c4 from 0 (mariadb2)
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: Quorum results:
Feb 11 10:41:36 mariadb2 mysqld: #011version = 2,
Feb 11 10:41:36 mariadb2 mysqld: #011component = PRIMARY,
Feb 11 10:41:36 mariadb2 mysqld: #011conf_id = 1,
Feb 11 10:41:36 mariadb2 mysqld: #011members = 1/2 (joined/total),
Feb 11 10:41:36 mariadb2 mysqld: #011act_id = 160,
Feb 11 10:41:36 mariadb2 mysqld: #011last_appl. = -1,
Feb 11 10:41:36 mariadb2 mysqld: #011protocols = 0/4/2 (gcs/repl/appl),
Feb 11 10:41:36 mariadb2 mysqld: #011group UUID = c71c6b4f-742e-11e2-0800-c1770497f8e0
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: Flow-control interval: [23, 23]
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 160)
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: State transfer required:
Feb 11 10:41:36 mariadb2 mysqld: #011Group state: c71c6b4f-742e-11e2-0800-c1770497f8e0:160
Feb 11 10:41:36 mariadb2 mysqld: #011Local state: 64dc5b89-742e-11e2-0800-9315c248faac:160
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Note] WSREP: New cluster view: global state: c71c6b4f-742e-11e2-0800-c1770497f8e0:160, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 2
Feb 11 10:41:36 mariadb2 mysqld: 130211 10:41:36 [Warning] WSREP: Gap in state sequence. Need state transfer.
Feb 11 10:41:38 mariadb2 mysqld: 130211 10:41:38 [ERROR] WSREP: Failed to read output of: '/sbin/ifconfig | grep -E '^[[:space:]]+inet addr:' | grep -m1 -v 'inet addr:127' | sed 's/:/ /' | awk '{ print $3 } '' joined {
Feb 11 10:41:41 mariadb2 mysqld: }left {
Feb 11 10:41:41 mariadb2 mysqld: }partitioned {
Feb 11 10:41:41 mariadb2 mysqld: #011c71af152-742e-11e2-0800-b2d6fce1fcbe,
Feb 11 10:41:41 mariadb2 mysqld: }) |
| Comments |
| Comment by Elena Stepanova [ 2013-02-11 ] |
|
The obvious suspect is this: Did you try to run the command manually on your machine and see if it works? |
| Comment by Michée Lengronne [ 2013-02-11 ] |
|
I tried. It doesn't give anything:
root@mariadb2:~# /sbin/ifconfig | grep -E '^[[:space:]]+inet addr:' | grep -m1 -v 'inet addr:127' | sed 's/:/ /' | awk ' { print $3 }'
root@mariadb2:~# /sbin/ifconfig | grep -E '^[[:space:]]+inet addr:' | grep -m1 -v 'inet addr:127' | sed 's/:/ /' | awk '{ print $3 } ' ' |
| Comment by Elena Stepanova [ 2013-02-11 ] |
|
Yes, I figured that.. I mean, did you try to find out why? For example, run /sbin/ifconfig and see if it returns anything apart from the loopback, and if it does (which I doubt), then gradually reduce the command above to find out where it fails. But most likely your network configuration does not have (or does not show, for whatever reason) your network interface and, just as it says, WSREP cannot "guess address to accept state transfer at". If you are sure that your network is functioning nevertheless, and you can connect to your other node, you might want to try what WSREP suggests: "wsrep_sst_receive_address must be set manually" |
| Comment by Michée Lengronne [ 2013-02-11 ] |
|
When I run /sbin/ifconfig on both servers:
server1:
eth0 Link encap:Ethernet HWaddr 52:54:00:83:4c:f2
lo Link encap:Boucle locale
server2:
eth0 Link encap:Ethernet HWaddr 52:54:00:56:92:49
lo Link encap:Boucle locale
and I can ping from one server to the other. |
| Comment by Michée Lengronne [ 2013-02-11 ] |
|
/sbin/ifconfig | grep -E '^[[:space:]]+inet addr:' doesn't give anything. /sbin/ifconfig | sed 's/:/ /' gives the same output as /sbin/ifconfig, and /sbin/ifconfig | grep -m1 -v 'inet addr:127' gives: So, apparently the first grep is the problem. |
| Comment by Elena Stepanova [ 2013-02-11 ] |
|
Okay, the problem with grep is a localization issue. I will pass it over to Codership to see what they can do about it. For the problem with "Unknown Command", I think it would be better to create another report and provide, again, the error log etc. |
| Comment by Elena Stepanova [ 2013-02-11 ] |
|
Hi Seppo, See comments above: a localized version of Debian prints network configuration with 'adr', while WSREP relies on 'addr' in its grep command. |
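To illustrate the mismatch described above, here is a minimal Python sketch. The sample ifconfig lines and the address 192.0.2.10 are made up for illustration (the reporter's full localized output was not captured in this report), and the tolerant pattern is only one possible workaround, not the fix that was eventually applied.

```python
import re

# Roughly equivalent to the first grep in the WSREP command:
#   grep -E '^[[:space:]]+inet addr:'
WSREP_PATTERN = re.compile(r"^\s+inet addr:")

# Hypothetical tolerant pattern that accepts both 'addr' and 'adr'.
# This is an assumption for illustration, not the actual fix.
TOLERANT_PATTERN = re.compile(r"^\s+inet ad+r:")

# Illustrative lines only; the address is from a documentation range.
ENGLISH_LINE = "          inet addr:192.0.2.10  Bcast:192.0.2.255  Mask:255.255.255.0"
LOCALIZED_LINE = "          inet adr:192.0.2.10  Bcast:192.0.2.255  Masque:255.255.255.0"


def matches(pattern, line):
    """Return True if the pattern is found in the given ifconfig line."""
    return pattern.search(line) is not None


if __name__ == "__main__":
    for name, line in (("english", ENGLISH_LINE), ("localized", LOCALIZED_LINE)):
        print(name, "wsrep:", matches(WSREP_PATTERN, line),
              "tolerant:", matches(TOLERANT_PATTERN, line))
```

With the hard-coded 'addr' pattern the localized line is skipped, so the rest of the pipeline receives no input and prints nothing, which is consistent with the empty output reported above.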
| Comment by Michée Lengronne [ 2013-02-11 ] |
|
What can I modify to make it work? Do I have to wait for the next release? |
| Comment by Elena Stepanova [ 2013-02-11 ] |
|
I think your best bet is to provide the address manually via wsrep_sst_receive_address, as WSREP suggests, but for that we should find out why it didn't work for you. So, please open a bug report about wsrep_sst_receive_address not working and we'll try to investigate what's going on. |
| Comment by Michée Lengronne [ 2013-02-11 ] |
|
It's already done )) |
| Comment by Seppo Jaakola [ 2013-09-30 ] |
|
The address guessing has been re-factored to use 'ip addr'. Here is the code snippet as it currently stands:
#if (TARGET_OS_LINUX == 1)
/.*//'";
#elif defined(_sun_)
const char cmd[] = "/sbin/ifconfig -a | "
"/usr/gnu/bin/grep -m1 -1 -E 'net[0-9]:' | tail -n 1 | awk '{ print $2 } '";
'"; |
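The Linux branch of the snippet above did not survive in this export, so the following is not a reconstruction of it. It is only a hedged Python sketch of the general idea: picking the first global IPv4 address out of 'ip addr'-style output. The sample text imitates the one-line format of `ip -o -4 addr show` with made-up addresses, and the function name `first_global_ipv4` is hypothetical.

```python
import re

# Illustrative output in the one-line format of `ip -o -4 addr show`.
# Interface names and addresses are made up for this example.
SAMPLE_OUTPUT = (
    "1: lo    inet 127.0.0.1/8 scope host lo\n"
    "2: eth0    inet 192.0.2.10/24 brd 192.0.2.255 scope global eth0\n"
)

# Capture the IPv4 address of an 'inet' entry that has global scope.
INET_RE = re.compile(r"\binet (\d+\.\d+\.\d+\.\d+)/\d+ .*\bscope global\b")


def first_global_ipv4(ip_output):
    """Return the first global-scope IPv4 address found, or None."""
    for line in ip_output.splitlines():
        match = INET_RE.search(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    print(first_global_ipv4(SAMPLE_OUTPUT))
```

Matching on the 'inet' and 'scope global' keywords avoids depending on the 'inet addr:' label that broke the original ifconfig-based guess.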
| Comment by Seppo Jaakola [ 2013-09-30 ] |
|
Closing the issue. This was not verified with the latest MGC build, but we have had no reports related to IP address guessing with the MySQL version since the code re-factoring.