MariaDB Server / MDEV-5703

[PATCH] Slave disconnects and fails to reconnect on Error_code: 1159

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.5.35
    • Fix Version/s: 5.5.37, 10.0.9
    • None
    • Environment: Linux (slackware)

    Description

      While replicating, the slave server intermittently logs this error and disconnects from the master:

      [ERROR] Slave I/O: The slave I/O thread stops because a fatal error is encountered when it try to get the value of SERVER_ID variable from master. Error: , Error_code: 1159
      [Note] Slave I/O thread exiting, read up to log 'mysql-bin.xxxxxx', position xxxxxx

      Error code 1159 is ER_NET_READ_INTERRUPTED: "Got timeout reading communication packets".

      Executing STOP SLAVE; START SLAVE; on the slave server resumes replication without any problem. The slave server should reconnect automatically, but it does not.

      I believe the issue is in mariadb-sources/sql/slave.cc.

      The function is_network_error() checks whether a given error is network-related. It is missing a check for ER_NET_READ_INTERRUPTED. The patch is trivial:

      --- sql/slave.cc	2013-07-17 09:51:31.000000000 -0500
      +++ sql/slave.cc	2014-02-19 02:06:55.591593796 -0600
      @@ -1215,6 +1215,7 @@ bool is_network_error(uint errorno)
             errorno == ER_CON_COUNT_ERROR ||
             errorno == ER_CONNECTION_KILLED ||
             errorno == ER_NEW_ABORTING_CONNECTION ||
      +      errorno == ER_NET_READ_INTERRUPTED ||
             errorno == ER_SERVER_SHUTDOWN)
           return TRUE;

      With this change, MariaDB will treat the error as network-related and will try to reconnect automatically.
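
      The effect of the patch can be sketched in isolation. This is a simplified illustration, not the code from sql/slave.cc: it covers only the error codes visible in the diff context (the real function presumably checks further codes above the hunk), and the names is_network_error_sketch, IoThreadAction and action_after_error are hypothetical. Only the value 1159 comes from this report; the other numeric values are recalled from the server's error tables and should be verified against the headers of the version in use.

```cpp
#include <cstdint>

// Server error numbers. 1159 is quoted in the report; the remaining values
// are assumptions to be checked against the server's generated error header.
constexpr std::uint32_t ER_CON_COUNT_ERROR         = 1040;
constexpr std::uint32_t ER_SERVER_SHUTDOWN         = 1053;
constexpr std::uint32_t ER_NET_READ_INTERRUPTED    = 1159;
constexpr std::uint32_t ER_NEW_ABORTING_CONNECTION = 1184;
constexpr std::uint32_t ER_CONNECTION_KILLED       = 1927;

// Hypothetical stand-in for is_network_error(): returns true when the error
// should be treated as a transient network problem.
constexpr bool is_network_error_sketch(std::uint32_t errorno)
{
  return errorno == ER_CON_COUNT_ERROR ||
         errorno == ER_CONNECTION_KILLED ||
         errorno == ER_NEW_ABORTING_CONNECTION ||
         errorno == ER_NET_READ_INTERRUPTED ||   // the check added by the patch
         errorno == ER_SERVER_SHUTDOWN;
}

// Hypothetical decision the I/O thread makes after a failed request to the
// master: retry the connection for network errors, stop otherwise.
enum class IoThreadAction { Reconnect, Stop };

constexpr IoThreadAction action_after_error(std::uint32_t errorno)
{
  return is_network_error_sketch(errorno) ? IoThreadAction::Reconnect
                                          : IoThreadAction::Stop;
}
```

      In this sketch, action_after_error(1159) yields IoThreadAction::Reconnect. Without the ER_NET_READ_INTERRUPTED line it would yield IoThreadAction::Stop, which corresponds to the "Slave I/O thread exiting" behaviour described in the report.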

        Activity

          elenst Elena Stepanova added a comment -

          Hi Kristian,

          Could you please take a look at the suggested patch to see if it's valid (and maybe push it if it is)?

          knielsen Kristian Nielsen added a comment -

          Pushed to 10.0-base (will be later merged to 10.0)

          knielsen Kristian Nielsen added a comment -

          And btw, thanks a lot for the report and patch, Tomas Matejicek!

          TomasM Tomas Matejicek added a comment -

          You are welcome. May I ask you why the fix is not added to MariaDB 5.5.* like 5.5.36 or so?
          Thank you

          Tomas M

          On Tue, Mar 4, 2014 at 2:46 PM, Kristian Nielsen (JIRA)

          knielsen Kristian Nielsen added a comment -

          > May I ask you why the fix is not added to MariaDB 5.5.* like 5.5.36 or so?

          No particular reason. I've now pushed to 5.5 as well.

          - Kristian.

          laurynas Laurynas Biveinis added a comment -

          This is also https://bugs.launchpad.net/percona-server/+bug/1268729 aka http://bugs.mysql.com/bug.php?id=71374 . There is also a related bug https://bugs.launchpad.net/percona-server/+bug/1268735 aka http://bugs.mysql.com/bug.php?id=71375 .

          neurogenesis Ives Stoddard added a comment -

          will this patch also make its way into the 10.0.9 release? i was about to start with 10.0.8, for the new multi-source replication until 10.0.10 GA is available.

          serg Sergei Golubchik added a comment -

          most probably, yes. I've just merged it into 10.0.

          People

            Assignee: knielsen Kristian Nielsen
            Reporter: TomasM Tomas Matejicek