[MDEV-5703] [PATCH] Slave disconnects and fails to reconnect on Error_code: 1159 Created: 2014-02-19 Updated: 2014-03-06 Resolved: 2014-03-06 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 5.5.35 |
| Fix Version/s: | 5.5.37, 10.0.9 |
| Type: | Bug | Priority: | Major |
| Reporter: | Tomas Matejicek | Assignee: | Kristian Nielsen |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | replication | ||
| Environment: |
Linux (slackware) |
||
| Description |
|
While replicating, the slave server randomly prints this error and disconnects from the master:
[ERROR] Slave I/O: The slave I/O thread stops because a fatal error is encountered when it try to get the value of SERVER_ID variable from master. Error: , Error_code: 1159
Error code 1159 is in fact ER_NET_READ_INTERRUPTED: Got timeout reading communication packets
Executing STOP SLAVE; START SLAVE; on the slave server resumes replication without any problem. The slave server should reconnect automatically, though, and it does not.
I believe the issue is in mariadb-sources/sql/slave.cc. The function is_network_error() checks whether a given error is network related, and it is missing a check for ER_NET_READ_INTERRUPTED. Patch is very trivial:
With that check added, MariaDB will recognize the error as network related and will try to reconnect automatically. |
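The patch text itself did not survive in this export. As a rough illustration of the kind of change the description refers to, here is a minimal, self-contained sketch and not the committed patch. It assumes is_network_error() takes a numeric error code and returns true for errors that should trigger a reconnect. The error constants are redefined locally so the snippet compiles on its own, and the set of codes other than ER_NET_READ_INTERRUPTED is an assumption about what the function already checks.

```cpp
// Illustrative sketch only; not the actual MariaDB source or patch.
// In the server these constants come from the client/server error headers;
// they are redefined here so the example is self-contained.
constexpr unsigned int ER_CON_COUNT_ERROR      = 1040;
constexpr unsigned int ER_SERVER_SHUTDOWN      = 1053;
constexpr unsigned int ER_NET_READ_INTERRUPTED = 1159;  // the code from this report
constexpr unsigned int CR_CONNECTION_ERROR     = 2002;
constexpr unsigned int CR_CONN_HOST_ERROR      = 2003;
constexpr unsigned int CR_SERVER_GONE_ERROR    = 2006;
constexpr unsigned int CR_SERVER_LOST          = 2013;

// Returns true when the error should be treated as a transient network
// problem, so that the slave I/O thread retries instead of stopping.
bool is_network_error(unsigned int errorno)
{
  switch (errorno) {
  case CR_CONNECTION_ERROR:
  case CR_CONN_HOST_ERROR:
  case CR_SERVER_GONE_ERROR:
  case CR_SERVER_LOST:
  case ER_CON_COUNT_ERROR:
  case ER_SERVER_SHUTDOWN:
  case ER_NET_READ_INTERRUPTED:  // the check the report says is missing
    return true;
  default:
    return false;
  }
}
```

With a check like this, a read timeout (1159) while querying the master would be classified as a network error and handled by the reconnect path, rather than as a fatal error that requires a manual STOP SLAVE; START SLAVE;.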
| Comments |
| Comment by Elena Stepanova [ 2014-02-20 ] |
|
Hi Kristian, Could you please take a look at the suggested patch to see if it's valid (and maybe push it if it is)? |
| Comment by Kristian Nielsen [ 2014-03-04 ] |
|
Pushed to 10.0-base (will be later merged to 10.0) |
| Comment by Kristian Nielsen [ 2014-03-04 ] |
|
And btw, thanks a lot for the report and patch, Tomas Matejicek! |
| Comment by Tomas Matejicek [ 2014-03-04 ] |
|
You are welcome. May I ask you why the fix is not added to MariaDB 5.5.* like 5.5.36 or so? Tomas M |
| Comment by Kristian Nielsen [ 2014-03-04 ] |
|
> May I ask you why the fix is not added to MariaDB 5.5.* like 5.5.36 or so?
No particular reason. I've now pushed to 5.5 as well.
|
| Comment by Laurynas Biveinis [ 2014-03-05 ] |
|
This is also https://bugs.launchpad.net/percona-server/+bug/1268729 aka http://bugs.mysql.com/bug.php?id=71374. There is also a related bug https://bugs.launchpad.net/percona-server/+bug/1268735 aka http://bugs.mysql.com/bug.php?id=71375. |
| Comment by Ives Stoddard [ 2014-03-05 ] |
|
Will this patch also make its way into the 10.0.9 release? I was about to start with 10.0.8 for the new multi-source replication, until 10.0.10 GA is available. |
| Comment by Sergei Golubchik [ 2014-03-05 ] |
|
most probably — yes, I've just merged it into 10.0. |