[MXS-2158] Node rejoin fails if the node was never a slave (but was master before going down)

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 2.3.0
Fix Version/s: 2.3.2
Component/s: mariadbmon
Labels: None
Sprint: MXS-SPRINT-70

Description

(1) Start a fresh cluster: server1 = master; server2, server3, server4 = slaves.
(2) Bring the master down (without having performed any transactions).
(3) server2 gets promoted to master.
(4) Perform a couple of transactions.
(5) Bring server1 back up.

server1 is not rejoined to the cluster as a slave, and the following messages appear in the log:

2018-11-12 08:48:42   notice : Server changed state: server1[127.0.0.1:33061]: server_up. [Down] -> [Running]
2018-11-12 08:48:42   warning: Automatic rejoin was not attempted on server 'server1' even though it is a valid candidate. Will keep retrying with this message suppressed for all servers. Errors:
Server 'server1' could not be queried.

MaxCtrl shows the following:

    maxctrl list servers                                           Mon Nov 12 08:53:40 2018

    Server      Address       Port      Connections     State               GTID
    server1     127.0.0.1     33061     0               Running
    server2     127.0.0.1     33062     1               Master, Running     0-2-4
    server3     127.0.0.1     33063     1               Slave, Running      0-2-4
    server4     127.0.0.1     33064     1               Slave, Running      0-2-4
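
To spot this condition without reading the listing by eye, a small script can flag servers that are Running but hold neither the Master nor the Slave role. The following is only a minimal sketch: it assumes whitespace-separated text shaped like the listing in this report (actual maxctrl output formatting may differ), and the names `parse_servers` and `unjoined` are hypothetical helpers, not part of MaxScale or MaxCtrl.

```python
import re

# Sample rows copied from the listing in this report (assumed text layout).
SAMPLE = """\
Server      Address       Port      Connections     State               GTID
server1     127.0.0.1     33061     0               Running
server2     127.0.0.1     33062     1               Master, Running     0-2-4
server3     127.0.0.1     33063     1               Slave, Running      0-2-4
server4     127.0.0.1     33064     1               Slave, Running      0-2-4
"""

# A MariaDB GTID has the form domain-server-sequence, e.g. 0-2-4.
GTID_RE = re.compile(r"^\d+-\d+-\d+$")


def parse_servers(text):
    """Turn the server listing into a list of dicts (hypothetical helper)."""
    servers = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have a numeric port and connection count; this skips
        # blank lines, the header row, and any other surrounding text.
        if len(fields) < 5 or not (fields[2].isdigit() and fields[3].isdigit()):
            continue
        rest = fields[4:]
        # The GTID column may be empty, as it is for server1 above.
        gtid = rest.pop() if GTID_RE.match(rest[-1]) else None
        states = {s.strip() for s in " ".join(rest).split(",") if s.strip()}
        servers.append({"name": fields[0], "states": states, "gtid": gtid})
    return servers


def unjoined(servers):
    """Names of servers that are Running but are neither Master nor Slave."""
    return [
        s["name"]
        for s in servers
        if "Running" in s["states"] and s["states"].isdisjoint({"Master", "Slave"})
    ]


if __name__ == "__main__":
    print(unjoined(parse_servers(SAMPLE)))
```

Run against the listing from this report, the sketch singles out server1, the old master that came back up but was not rejoined.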

The monitor configuration is as follows:

[TheMonitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3,server4
user=maxuser
password=maxpwd
auto_failover=true
auto_rejoin=true

These are the server's semi-sync replication settings:

MariaDB [test]> SHOW VARIABLES LIKE "rp%sync%";
+---------------------------------------+--------------+
| Variable_name                         | Value        |
+---------------------------------------+--------------+
| rpl_semi_sync_master_enabled          | OFF          |
| rpl_semi_sync_master_timeout          | 10000        |
| rpl_semi_sync_master_trace_level      | 32           |
| rpl_semi_sync_master_wait_no_slave    | ON           |
| rpl_semi_sync_master_wait_point       | AFTER_COMMIT |
| rpl_semi_sync_slave_delay_master      | OFF          |
| rpl_semi_sync_slave_enabled           | OFF          |
| rpl_semi_sync_slave_kill_conn_timeout | 5            |
| rpl_semi_sync_slave_trace_level       | 32           |
+---------------------------------------+--------------+
9 rows in set (0.001 sec)

A couple of issues here:
(1) The error message is not descriptive enough: it does not say which query the rejoining server failed to answer.
(2) server1 should have been allowed to rejoin.

People

Assignee: Esa Korhonen
Reporter: Dipti Joshi (Inactive)
Votes: 0
Watchers: 2

Dates

Created: 2018-11-12 09:01
Updated: 2018-11-20 09:28
Resolved: 2018-11-19 08:28
