  1. MariaDB Server
  2. MDEV-253 Multi-source replication
  3. MDEV-3793

Multi-source: Semisync replication is not fully supported for multiple masters and can cause replication failure and relay log corruption

    Details

    • Type: Technical task
    • Status: Closed
    • Priority: Minor
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 10.3.23
    • Component/s: Replication
    • Labels:
      None

      Description

      Semisync replication does not properly distinguish between multiple master connections, which causes various problems. For example, if one master has the semisync plugin and another does not, enabling semisync on the slave makes replication from both masters abort. The actual errors vary. With the test case below, I most often get the following.

      On the connection with the master which does not have the semisync plugin:

      Last_IO_Errno   1593
      Last_IO_Error   Fatal error: Failed to run 'after_read_event' hook

      On the connection with the master which has the semisync plugin:
      either

      Last_SQL_Errno  1594
      Last_SQL_Error  Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.

      or

      Last_IO_Errno   1595
      Last_IO_Error   Relay log write failure: could not queue event from master

      If we decide to support it, the corresponding variables Rpl_semi_sync_slave_status and rpl_semi_sync_slave_enabled should probably be made session-aware.
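
      To illustrate the limitation, this is how the two values can be read on the slave. It is only a sketch, on the assumption that both are still server-wide in the affected versions, so a single value covers 'master1' and 'master2' alike:

      ```sql
      # Run on the slave. Neither result is broken down per master connection.
      show global variables like 'rpl_semi_sync_slave_enabled';
      show global status like 'Rpl_semi_sync_slave_status';
      ```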

      The test case is a draft and should not be added to the suite as is. It contains sleeps, which are unreliable and slow; once the problem is fixed, they should be replaced with proper waits and syncs.

      Please make sure you have at least revno 3438 from 10.0-base, since the test uses the reset_master_slave.inc include file, which was added there.

      If you do not get the error on the first attempt, try again. The test sometimes gets lucky and passes, so there is apparently some kind of race condition.

      Test case:
      cat semisync.test

      # TODO: when the problem is fixed,
      # instead of the sleeps below there should be proper
      # waits for slaves to start, and also synchronization
      # with each master. For now, that would just make the test
      # hang for a long time, so I won't put it here.
      # Also, an error log suppression will need to be added.
       
       
      --connect (master1,127.0.0.1,root,,,$SERVER_MYPORT_1)
      install soname 'semisync_master.so';
       
      --connect (slave,127.0.0.1,root,,,$SERVER_MYPORT_3)
       
      install soname 'semisync_slave.so';
      set global rpl_semi_sync_slave_enabled = 1;
       
      --replace_result $SERVER_MYPORT_1 MYPORT_1
      eval change master 'master1' to
      master_port=$SERVER_MYPORT_1,
      master_host='127.0.0.1',
      master_user='root';
       
      start slave 'master1';
      --sleep 2
       
      --replace_result $SERVER_MYPORT_2 MYPORT_2
      eval change master 'master2' to
      master_port=$SERVER_MYPORT_2,
      master_host='127.0.0.1',
      master_user='root';
       
      start slave 'master2';
      --sleep 2
       
      stop all slaves;
      --sleep 2
      start all slaves;
      --sleep 3
      --replace_result $SERVER_MYPORT_1 MYPORT_1 $SERVER_MYPORT_2 MYPORT_2
      query_vertical show all slaves status;
       
      # Cleanup
       
      --source reset_master_slave.inc
      uninstall plugin rpl_semi_sync_slave;
      --disconnect slave
       
      --connection master1
      --source reset_master_slave.inc
      uninstall plugin rpl_semi_sync_master;
      --disconnect master1
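
      Once the problem is fixed, the sleep after each start slave could be replaced roughly as follows. This is only a sketch for the 'master1' connection. It assumes that the stock include/wait_for_slave_to_start.inc follows default_master_connection, and that sync_with_master accepts a connection name after the offset, as the existing multi_source tests appear to rely on. On an affected server the wait is expected to time out rather than pass.

      ```sql
      --connection slave
      start slave 'master1';
      # The wait include checks whichever connection is named here.
      set default_master_connection = 'master1';
      --source include/wait_for_slave_to_start.inc

      # Wait until the slave has applied everything master1 has logged so far.
      --connection master1
      --save_master_pos
      --connection slave
      --sync_with_master 0,'master1'
      set default_master_connection = '';
      ```

      The same pattern would apply to 'master2', given a --connect for that server, which the draft test does not open.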
       

              People

              Assignee:
              monty Michael Widenius
              Reporter:
              elenst Elena Stepanova
