Details

    • Task
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.0
    • None

    Description

      This task is about implementing a way for a MariaDB replication slave to
      replicate from more than one master simultaneously.

      Each master is handled by a specific replicator instance in the slave
      server. Each replicator instance consists of separate I/O thread, SQL thread,
      and associated state and configuration. In efffect, several replication slaves
      are running at the same time, each replicating from a separate master, but all
      replicating into a common data store (typically, but not necessarily to
      separate databases/tables).

      A replicator instance is identified with a user-chosen name used in
      replication SQL statements such as CHANGE MASTER TO ...
      This name is also included in file names to distinguish the files that keep
      the replication state (relay logs, master.info, relay-log.info). This way,
      each separate instance can be configured separately, but otherwise the same
      way as existing single-source replication.

      In order to remain backwards-compatible with existing third-party scripts
      etc., the replicator instance name is made optional in all existing
      replication statements. If it is omitted, the name "default" is used, and for
      this particular name, master.info and the other files retain their old names
      to allow seamless upgrades of slaves.

      In this worklog, there is no extra conflict resolution proposed. The effect of
      updates from one master on replication from another master will be the same as
      the effect of direct user SQL queries on the slave server, ie. it is the
      responsibility of the user/DBA to ensure no conflicts occur. If a conflict
      causes replication of an event to fail (with duplicate key violation eg.), the
      corresponding slave SQL thread will stop, requiring manual intervention to fix
      and start.

      An easy and typical way to avoid conflicts would be to eg. to use separate
      databases for each master->slave replication channel. RBR idempotent slave
      application can also be used to help resolve conflicts, for example.

      See also MySQL WL#1697: http://forge.mysql.com/worklog/task.php?id=1697

      High-Level Specification

      Here is a preliminary list of things that need to be changed/extended to
      handle multi-source:

      File names
      ----------

      These files used to store replication state must be extended to include the
      replicator instance name. For "default", as a special case, the old name
      should be used for backwards compatibility:

      HOST-relay-bin.XXXXXX
      HOST-relay-bin.index
      relay-log.info
      master.info

      SQL statements
      --------------

      These statements need to be extended to take an optional replicator instance
      name. If omitted, "default" is used:

      CHANGE MASTER TO
      LOAD DATA FROM MASTER
      LOAD TABLE xxx FROM MASTER
      MASTER_POS_WAIT(file, position, timeout)
      RESET SLAVE
      SHOW SLAVE STATUS
      START SLAVE
      START SLAVE UNTIL ...
      STOP SLAVE

      System variables
      ----------------

      These system variables need to handle multiple replicator instances. Some can
      remain global, and apply to all instances. Some will need to be per-instance,
      and can probably use structured system variables, same was as for multiple key
      caches:

      SQL_SLAVE_SKIP_COUNTER
      log_slave_updates
      master_connect_retry
      master_info_file
      master_retry_count
      max_relay_log_size
      read_only
      relay_log
      relay_log_index
      relay_log_info_file
      relay_log_purge
      relay_log_space_limit
      replicate_do_db
      replicate_do_table
      replicate_ignore_db
      replicate_ignore_table
      replicate_rewrite_db
      replicate_same_server_id
      replicate_wild_do_table
      replicate_wild_ignore_table
      report_host
      report_port
      skip_slave_start
      slave_compressed_protocol
      slave_load_tmpdir
      slave_net_timeout
      slave_skip_errors

      The deprecated options master_host, master_port, master_user, master_password,
      and master_ssl_* should not be supported for multi source replication.

      Low-Level Design

      A main part of this worklog will be to modify the server code so that all
      slave objects and data can have multiple instances, not use global variables,
      etc. And so that all user-visible interfaces (SQL statements, system
      variables, status variables, files) are extended to support multiple
      replicator instances.

      Another major part is to carefully develop the feature and not the least test
      it for full backwards compatibility in the case where only a single, default
      master is used. If this is not done correctly, a lot of users will get
      problems when slaves are upgraded and their monitoring scripts, management
      scripts, or replication state breaks.

      Attachments

        1.
        Multi-source: sql_slave_skip_counter doesn't work for a connection with a non-default name Technical task Closed Michael Widenius
        2.
        Multi-source: Behavior of "RESET SLAVE 'name'" is inconsistent with the normal RESET SLAVE Technical task Closed Michael Widenius
        3.
        Multi-source: sync_with_master doesn't accept a space after the comma Technical task Closed Michael Widenius
        4.
        Multi-source: [patch] Get rid of sleep in multi_source.simple test Technical task Closed Michael Widenius
        5.
        Multi-source: Non-descriptive error message in the error log on creating a duplicate replication configuration Technical task Closed Michael Widenius
        6.
        Multi-source: Slave allows multiple masters to have the same server id (might be bad for complicated replication setups) Technical task Closed Michael Widenius
        7.
        Multi-source: SHOW RELAYLOG EVENTS doesn't work for named connections Technical task Closed Michael Widenius
        8.
        Multi-source: More status variables need to be either local for a replication connection, or truly global across all slaves Technical task Closed Michael Widenius
        9.
        Multi-source: "Freeing overrun buffer" error and valgrind warnings on change master with a named connection Technical task Closed Michael Widenius
        10.
        Multi-source: [tests] Basic tests to review and add to the multi_source suite Technical task Closed Michael Widenius
        11.
        Multi-source: RESET SLAVE 'name' ALL creates master.info.index file instead of updating multi-master.info Technical task Closed Michael Widenius
        12.
        Multi-source: Memory loss warnings on an attempt to create a duplicate connection with a different name Technical task Closed Michael Widenius
        13.
        Multi-source: Semisync replication is not fully supported for multiple masters and can cause replication failure and relay log corruption Technical task Closed Michael Widenius

        Activity

          ratzpo Rasmus Johansson (Inactive) created issue -
          ratzpo Rasmus Johansson (Inactive) made changes -
          Field Original Value New Value
          Description This task is about implementing a way for a MariaDB replication slave to
          replicate from more than one master simultaneously.

          Each master is handled by a specific replicator instance in the slave
          server. Each replicator instance consists of separate I/O thread, SQL thread,
          and associated state and configuration. In efffect, several replication slaves
          are running at the same time, each replicating from a separate master, but all
          replicating into a common data store (typically, but not necessarily to
          separate databases/tables).

          A replicator instance is identified with a user-chosen name used in
          replication SQL statements such as CHANGE MASTER TO ...
          This name is also included in file names to distinguish the files that keep
          the replication state (relay logs, master.info, relay-log.info). This way,
          each separate instance can be configured separately, but otherwise the same
          way as existing single-source replication.

          In order to remain backwards-compatible with existing third-party scripts
          etc., the replicator instance name is made optional in all existing
          replication statements. If it is omitted, the name "default" is used, and for
          this particular name, master.info and the other files retain their old names
          to allow seamless upgrades of slaves.

          In this worklog, there is no extra conflict resolution proposed. The effect of
          updates from one master on replication from another master will be the same as
          the effect of direct user SQL queries on the slave server, ie. it is the
          responsibility of the user/DBA to ensure no conflicts occur. If a conflict
          causes replication of an event to fail (with duplicate key violation eg.), the
          corresponding slave SQL thread will stop, requiring manual intervention to fix
          and start.

          An easy and typical way to avoid conflicts would be to eg. to use separate
          databases for each master->slave replication channel. RBR idempotent slave
          application can also be used to help resolve conflicts, for example.

          See also MySQL WL#1697: http://forge.mysql.com/worklog/task.php?id=1697
          This task is about implementing a way for a MariaDB replication slave to
          replicate from more than one master simultaneously.

          Each master is handled by a specific replicator instance in the slave
          server. Each replicator instance consists of separate I/O thread, SQL thread,
          and associated state and configuration. In efffect, several replication slaves
          are running at the same time, each replicating from a separate master, but all
          replicating into a common data store (typically, but not necessarily to
          separate databases/tables).

          A replicator instance is identified with a user-chosen name used in
          replication SQL statements such as CHANGE MASTER TO ...
          This name is also included in file names to distinguish the files that keep
          the replication state (relay logs, master.info, relay-log.info). This way,
          each separate instance can be configured separately, but otherwise the same
          way as existing single-source replication.

          In order to remain backwards-compatible with existing third-party scripts
          etc., the replicator instance name is made optional in all existing
          replication statements. If it is omitted, the name "default" is used, and for
          this particular name, master.info and the other files retain their old names
          to allow seamless upgrades of slaves.

          In this worklog, there is no extra conflict resolution proposed. The effect of
          updates from one master on replication from another master will be the same as
          the effect of direct user SQL queries on the slave server, ie. it is the
          responsibility of the user/DBA to ensure no conflicts occur. If a conflict
          causes replication of an event to fail (with duplicate key violation eg.), the
          corresponding slave SQL thread will stop, requiring manual intervention to fix
          and start.

          An easy and typical way to avoid conflicts would be to eg. to use separate
          databases for each master->slave replication channel. RBR idempotent slave
          application can also be used to help resolve conflicts, for example.

          See also MySQL WL#1697: http://forge.mysql.com/worklog/task.php?id=1697

          h1. High-Level Specification
          Here is a preliminary list of things that need to be changed/extended to
          handle multi-source:


          File names
          ----------

          These files used to store replication state must be extended to include the
          replicator instance name. For "default", as a special case, the old name
          should be used for backwards compatibility:

              HOST-relay-bin.XXXXXX
              HOST-relay-bin.index
              relay-log.info
              master.info

          SQL statements
          --------------

          These statements need to be extended to take an optional replicator instance
          name. If omitted, "default" is used:

              CHANGE MASTER TO
              LOAD DATA FROM MASTER
              LOAD TABLE xxx FROM MASTER
              MASTER_POS_WAIT(file, position, timeout)
              RESET SLAVE
              SHOW SLAVE STATUS
              START SLAVE
              START SLAVE UNTIL ...
              STOP SLAVE

          System variables
          ----------------

          These system variables need to handle multiple replicator instances. Some can
          remain global, and apply to all instances. Some will need to be per-instance,
          and can probably use structured system variables, same was as for multiple key
          caches:

              SQL_SLAVE_SKIP_COUNTER
              log_slave_updates
              master_connect_retry
              master_info_file
              master_retry_count
              max_relay_log_size
              read_only
              relay_log
              relay_log_index
              relay_log_info_file
              relay_log_purge
              relay_log_space_limit
              replicate_do_db
              replicate_do_table
              replicate_ignore_db
              replicate_ignore_table
              replicate_rewrite_db
              replicate_same_server_id
              replicate_wild_do_table
              replicate_wild_ignore_table
              report_host
              report_port
              skip_slave_start
              slave_compressed_protocol
              slave_load_tmpdir
              slave_net_timeout
              slave_skip_errors

          The deprecated options master_host, master_port, master_user, master_password,
          and master_ssl_* should not be supported for multi source replication.
          ratzpo Rasmus Johansson (Inactive) made changes -
          Description This task is about implementing a way for a MariaDB replication slave to
          replicate from more than one master simultaneously.

          Each master is handled by a specific replicator instance in the slave
          server. Each replicator instance consists of separate I/O thread, SQL thread,
          and associated state and configuration. In efffect, several replication slaves
          are running at the same time, each replicating from a separate master, but all
          replicating into a common data store (typically, but not necessarily to
          separate databases/tables).

          A replicator instance is identified with a user-chosen name used in
          replication SQL statements such as CHANGE MASTER TO ...
          This name is also included in file names to distinguish the files that keep
          the replication state (relay logs, master.info, relay-log.info). This way,
          each separate instance can be configured separately, but otherwise the same
          way as existing single-source replication.

          In order to remain backwards-compatible with existing third-party scripts
          etc., the replicator instance name is made optional in all existing
          replication statements. If it is omitted, the name "default" is used, and for
          this particular name, master.info and the other files retain their old names
          to allow seamless upgrades of slaves.

          In this worklog, there is no extra conflict resolution proposed. The effect of
          updates from one master on replication from another master will be the same as
          the effect of direct user SQL queries on the slave server, ie. it is the
          responsibility of the user/DBA to ensure no conflicts occur. If a conflict
          causes replication of an event to fail (with duplicate key violation eg.), the
          corresponding slave SQL thread will stop, requiring manual intervention to fix
          and start.

          An easy and typical way to avoid conflicts would be to eg. to use separate
          databases for each master->slave replication channel. RBR idempotent slave
          application can also be used to help resolve conflicts, for example.

          See also MySQL WL#1697: http://forge.mysql.com/worklog/task.php?id=1697

          h1. High-Level Specification
          Here is a preliminary list of things that need to be changed/extended to
          handle multi-source:


          File names
          ----------

          These files used to store replication state must be extended to include the
          replicator instance name. For "default", as a special case, the old name
          should be used for backwards compatibility:

              HOST-relay-bin.XXXXXX
              HOST-relay-bin.index
              relay-log.info
              master.info

          SQL statements
          --------------

          These statements need to be extended to take an optional replicator instance
          name. If omitted, "default" is used:

              CHANGE MASTER TO
              LOAD DATA FROM MASTER
              LOAD TABLE xxx FROM MASTER
              MASTER_POS_WAIT(file, position, timeout)
              RESET SLAVE
              SHOW SLAVE STATUS
              START SLAVE
              START SLAVE UNTIL ...
              STOP SLAVE

          System variables
          ----------------

          These system variables need to handle multiple replicator instances. Some can
          remain global, and apply to all instances. Some will need to be per-instance,
          and can probably use structured system variables, same was as for multiple key
          caches:

              SQL_SLAVE_SKIP_COUNTER
              log_slave_updates
              master_connect_retry
              master_info_file
              master_retry_count
              max_relay_log_size
              read_only
              relay_log
              relay_log_index
              relay_log_info_file
              relay_log_purge
              relay_log_space_limit
              replicate_do_db
              replicate_do_table
              replicate_ignore_db
              replicate_ignore_table
              replicate_rewrite_db
              replicate_same_server_id
              replicate_wild_do_table
              replicate_wild_ignore_table
              report_host
              report_port
              skip_slave_start
              slave_compressed_protocol
              slave_load_tmpdir
              slave_net_timeout
              slave_skip_errors

          The deprecated options master_host, master_port, master_user, master_password,
          and master_ssl_* should not be supported for multi source replication.
          This task is about implementing a way for a MariaDB replication slave to
          replicate from more than one master simultaneously.

          Each master is handled by a specific replicator instance in the slave
          server. Each replicator instance consists of separate I/O thread, SQL thread,
          and associated state and configuration. In efffect, several replication slaves
          are running at the same time, each replicating from a separate master, but all
          replicating into a common data store (typically, but not necessarily to
          separate databases/tables).

          A replicator instance is identified with a user-chosen name used in
          replication SQL statements such as CHANGE MASTER TO ...
          This name is also included in file names to distinguish the files that keep
          the replication state (relay logs, master.info, relay-log.info). This way,
          each separate instance can be configured separately, but otherwise the same
          way as existing single-source replication.

          In order to remain backwards-compatible with existing third-party scripts
          etc., the replicator instance name is made optional in all existing
          replication statements. If it is omitted, the name "default" is used, and for
          this particular name, master.info and the other files retain their old names
          to allow seamless upgrades of slaves.

          In this worklog, there is no extra conflict resolution proposed. The effect of
          updates from one master on replication from another master will be the same as
          the effect of direct user SQL queries on the slave server, ie. it is the
          responsibility of the user/DBA to ensure no conflicts occur. If a conflict
          causes replication of an event to fail (with duplicate key violation eg.), the
          corresponding slave SQL thread will stop, requiring manual intervention to fix
          and start.

          An easy and typical way to avoid conflicts would be to eg. to use separate
          databases for each master->slave replication channel. RBR idempotent slave
          application can also be used to help resolve conflicts, for example.

          See also MySQL WL#1697: http://forge.mysql.com/worklog/task.php?id=1697

          h4. High-Level Specification
          Here is a preliminary list of things that need to be changed/extended to
          handle multi-source:


          File names
          ----------

          These files used to store replication state must be extended to include the
          replicator instance name. For "default", as a special case, the old name
          should be used for backwards compatibility:

              HOST-relay-bin.XXXXXX
              HOST-relay-bin.index
              relay-log.info
              master.info

          SQL statements
          --------------

          These statements need to be extended to take an optional replicator instance
          name. If omitted, "default" is used:

              CHANGE MASTER TO
              LOAD DATA FROM MASTER
              LOAD TABLE xxx FROM MASTER
              MASTER_POS_WAIT(file, position, timeout)
              RESET SLAVE
              SHOW SLAVE STATUS
              START SLAVE
              START SLAVE UNTIL ...
              STOP SLAVE

          System variables
          ----------------

          These system variables need to handle multiple replicator instances. Some can
          remain global, and apply to all instances. Some will need to be per-instance,
          and can probably use structured system variables, same was as for multiple key
          caches:

              SQL_SLAVE_SKIP_COUNTER
              log_slave_updates
              master_connect_retry
              master_info_file
              master_retry_count
              max_relay_log_size
              read_only
              relay_log
              relay_log_index
              relay_log_info_file
              relay_log_purge
              relay_log_space_limit
              replicate_do_db
              replicate_do_table
              replicate_ignore_db
              replicate_ignore_table
              replicate_rewrite_db
              replicate_same_server_id
              replicate_wild_do_table
              replicate_wild_ignore_table
              report_host
              report_port
              skip_slave_start
              slave_compressed_protocol
              slave_load_tmpdir
              slave_net_timeout
              slave_skip_errors

          The deprecated options master_host, master_port, master_user, master_password,
          and master_ssl_* should not be supported for multi source replication.

          h4. Low-Level Design
          A main part of this worklog will be to modify the server code so that all
          slave objects and data can have multiple instances, not use global variables,
          etc. And so that all user-visible interfaces (SQL statements, system
          variables, status variables, files) are extended to support multiple
          replicator instances.

          Another major part is to carefully develop the feature and not the least test
          it for full backwards compatibility in the case where only a single, default
          master is used. If this is not done correctly, a _lot_ of users will get
          problems when slaves are upgraded and their monitoring scripts, management
          scripts, or replication state breaks.
          ratzpo Rasmus Johansson (Inactive) made changes -
          Assignee Michael Widenius [ monty ]
          colin Colin Charles made changes -
          ratzpo Rasmus Johansson (Inactive) made changes -
          Labels pf1

          Trying bzr+ssh://bazaar.launchpad.net/~maria-captains/maria/10.0-mdev253 revno 3436.

          Here is the simplest test case (more of a template for future test cases). It contains cnf and test files now, we'll record a result file later when we're satisfied with the way it works.

          1. cat t/multisource1.cnf
            !include include/default_mysqld.cnf
            !include include/default_client.cnf

          [mysqld.1]
          server-id=1
          log-bin=master-bin

          [mysqld.2]
          server-id=2
          log-bin=master-bin

          [mysqld.3]
          server-id=3

          [ENV]
          SERVER_MYPORT_1= @mysqld.1.port
          SERVER_MYSOCK_1= @mysqld.1.socket
          SERVER_MYPORT_2= @mysqld.2.port
          SERVER_MYSOCK_2= @mysqld.2.socket
          SERVER_MYPORT_3= @mysqld.3.port
          SERVER_MYSOCK_3= @mysqld.3.socket

          1. cat t/multisource1.test
            --connect (slave,127.0.0.1,root,,,$SERVER_MYPORT_3)

          eval change master 'slave1' to master_port=$SERVER_MYPORT_1, master_host='127.0.0.1', master_user='root';
          eval change master 'slave2' to master_port=$SERVER_MYPORT_2, master_host='127.0.0.1', master_user='root';
          start slave 'slave1';
          start slave 'slave2';

          query_vertical show full slave status;

          stop slave 'slave1';

          query_vertical show slave 'slave1' status;

          elenst Elena Stepanova added a comment - Trying bzr+ssh://bazaar.launchpad.net/~maria-captains/maria/10.0-mdev253 revno 3436. Here is the simplest test case (more of a template for future test cases). It contains cnf and test files now, we'll record a result file later when we're satisfied with the way it works. cat t/multisource1.cnf !include include/default_mysqld.cnf !include include/default_client.cnf [mysqld.1] server-id=1 log-bin=master-bin [mysqld.2] server-id=2 log-bin=master-bin [mysqld.3] server-id=3 [ENV] SERVER_MYPORT_1= @mysqld.1.port SERVER_MYSOCK_1= @mysqld.1.socket SERVER_MYPORT_2= @mysqld.2.port SERVER_MYSOCK_2= @mysqld.2.socket SERVER_MYPORT_3= @mysqld.3.port SERVER_MYSOCK_3= @mysqld.3.socket cat t/multisource1.test --connect (slave,127.0.0.1,root,,,$SERVER_MYPORT_3) eval change master 'slave1' to master_port=$SERVER_MYPORT_1, master_host='127.0.0.1', master_user='root'; eval change master 'slave2' to master_port=$SERVER_MYPORT_2, master_host='127.0.0.1', master_user='root'; start slave 'slave1'; start slave 'slave2'; query_vertical show full slave status; stop slave 'slave1'; query_vertical show slave 'slave1' status;

          I observe some problems with how server executes this test case.

          1. stop slave 'slave1'; command throws a warning:

          Warnings:
          Note 1255 Slave already has been stopped

          However, both before and after the command slave status shows that the slave is running.

          Slave_IO_Running Yes
          Slave_SQL_Running Yes

          2. Slave produces memory-related errors:

          Error: Safemalloc overrun buffer mysys/safemalloc.c:303, mysys/safemalloc.c:325, ??:0, ??:0, sql/mysqld.cc:1758, sql/mysqld.cc:5022, sql/main.cc:26, ??:0
          Allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
          Error: Safemalloc overrun buffer mysys/safemalloc.c:303, mysys/safemalloc.c:325, ??:0, ??:0, sql/mysqld.cc:1758, sql/mysqld.cc:5022, sql/main.cc:26, ??:0
          Allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
          Warning: 14 bytes lost, allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
          Warning: 14 bytes lost, allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
          Memory lost: 28 bytes in 2 chunks

          3. In show full slave status, it looks like connection_name and slave_IO_state are switched:

          Connection_name Checking master version
          Slave_IO_State slave1

          elenst Elena Stepanova added a comment - I observe some problems with how server executes this test case. 1. stop slave 'slave1'; command throws a warning: Warnings: Note 1255 Slave already has been stopped However, both before and after the command slave status shows that the slave is running. Slave_IO_Running Yes Slave_SQL_Running Yes 2. Slave produces memory-related errors: Error: Safemalloc overrun buffer mysys/safemalloc.c:303, mysys/safemalloc.c:325, ??:0, ??:0, sql/mysqld.cc:1758, sql/mysqld.cc:5022, sql/main.cc:26, ??:0 Allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017 Error: Safemalloc overrun buffer mysys/safemalloc.c:303, mysys/safemalloc.c:325, ??:0, ??:0, sql/mysqld.cc:1758, sql/mysqld.cc:5022, sql/main.cc:26, ??:0 Allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017 Warning: 14 bytes lost, allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017 Warning: 14 bytes lost, allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017 Memory lost: 28 bytes in 2 chunks 3. In show full slave status, it looks like connection_name and slave_IO_state are switched: Connection_name Checking master version Slave_IO_State slave1
          elenst Elena Stepanova made changes -

          A couple times got the assertion failure:

          mysqld: /home/elenst/10.0-mdev253/mysys/mf_iocache.c:1287: _my_b_seq_read: Assertion `pos_in_file == info->end_of_file' failed.
          120925 5:11:52 [ERROR] mysqld got signal 6 ;

          sporadic, no test case yet, will try to create it shortly.

          elenst Elena Stepanova added a comment - A couple times got the assertion failure: mysqld: /home/elenst/10.0-mdev253/mysys/mf_iocache.c:1287: _my_b_seq_read: Assertion `pos_in_file == info->end_of_file' failed. 120925 5:11:52 [ERROR] mysqld got signal 6 ; sporadic, no test case yet, will try to create it shortly.
          elenst Elena Stepanova added a comment - - edited

          Getting

          Error: Freeing overrun buffer mysys/safemalloc.c:179, mysys/my_malloc.c:116, sql/rpl_mi.cc:76, sql/rpl_mi.cc:84, sql/rpl_mi.cc:596, mysys/hash.c:605, sql/rpl_mi.cc:984, sql/sql_reload.cc:338

          while executing muti_source.simple
          revno 3436
          revision-id: monty@askmonty.org-20120925162756-ad39vfvitte0fulf
          date: 2012-09-25 19:27:56 +0300

          Server built as
          cmake . -DCMAKE_BUILD_TYPE=Debug && make
          on ubuntu 11.10 oneiric, x86_64

          Error log excerpt:

          120925 20:57:00 [Note] Master 'slave1': Slave I/O thread: connected to master 'root@127.0.0
          .1:16000',replication started in log 'FIRST' at position 4
          120925 20:57:00 [Note] Master 'slave2': Slave SQL thread initialized, starting replication
          in log 'FIRST' at position 0, relay log './mysqld-relay-bin-slave2.000001' position: 4
          120925 20:57:00 [Note] Master 'slave2': Slave I/O thread: connected to master 'root@127.0.0.1:16001',replication started in log 'FIRST' at position 4
          120925 20:57:05 [Note] Master 'slave1': Error reading relay log event: slave SQL thread was killed
          120925 20:57:05 [ERROR] Master 'slave1': Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
          120925 20:57:05 [Note] Master 'slave1': Slave I/O thread killed while reading event
          120925 20:57:05 [Note] Master 'slave1': Slave I/O thread exiting, read up to log 'master-bin.000001', position 286
          120925 20:57:05 [Note] Deleted Master_info file '/home/elenst/10.0-mdev253/mysql-test/var/mysqld.3/data/master.info.slave1'.
          120925 20:57:05 [Note] Deleted Master_info file '/home/elenst/10.0-mdev253/mysql-test/var/mysqld.3/data/relay-log.info.slave1'.
          Error: Freeing overrun buffer mysys/safemalloc.c:179, mysys/my_malloc.c:116, sql/rpl_mi.cc:76, sql/rpl_mi.cc:84, sql/rpl_mi.cc:596, mysys/hash.c:605, sql/rpl_mi.cc:984, sql/sql_reload.cc:338
          Allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
          120925 20:57:05 [Note] Master 'slave2': Error reading relay log event: slave SQL thread was killed

          elenst Elena Stepanova added a comment - - edited Getting Error: Freeing overrun buffer mysys/safemalloc.c:179, mysys/my_malloc.c:116, sql/rpl_mi.cc:76, sql/rpl_mi.cc:84, sql/rpl_mi.cc:596, mysys/hash.c:605, sql/rpl_mi.cc:984, sql/sql_reload.cc:338 while executing muti_source.simple revno 3436 revision-id: monty@askmonty.org-20120925162756-ad39vfvitte0fulf date: 2012-09-25 19:27:56 +0300 Server built as cmake . -DCMAKE_BUILD_TYPE=Debug && make on ubuntu 11.10 oneiric, x86_64 Error log excerpt: 120925 20:57:00 [Note] Master 'slave1': Slave I/O thread: connected to master 'root@127.0.0 .1:16000',replication started in log 'FIRST' at position 4 120925 20:57:00 [Note] Master 'slave2': Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log './mysqld-relay-bin-slave2.000001' position: 4 120925 20:57:00 [Note] Master 'slave2': Slave I/O thread: connected to master 'root@127.0.0.1:16001',replication started in log 'FIRST' at position 4 120925 20:57:05 [Note] Master 'slave1': Error reading relay log event: slave SQL thread was killed 120925 20:57:05 [ERROR] Master 'slave1': Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013) 120925 20:57:05 [Note] Master 'slave1': Slave I/O thread killed while reading event 120925 20:57:05 [Note] Master 'slave1': Slave I/O thread exiting, read up to log 'master-bin.000001', position 286 120925 20:57:05 [Note] Deleted Master_info file '/home/elenst/10.0-mdev253/mysql-test/var/mysqld.3/data/master.info.slave1'. 120925 20:57:05 [Note] Deleted Master_info file '/home/elenst/10.0-mdev253/mysql-test/var/mysqld.3/data/relay-log.info.slave1'. Error: Freeing overrun buffer mysys/safemalloc.c:179, mysys/my_malloc.c:116, sql/rpl_mi.cc:76, sql/rpl_mi.cc:84, sql/rpl_mi.cc:596, mysys/hash.c:605, sql/rpl_mi.cc:984, sql/sql_reload.cc:338 Allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017 120925 20:57:05 [Note] Master 'slave2': Error reading relay log event: slave SQL thread was killed
          elenst Elena Stepanova added a comment - - edited

          Slave_IO_state and Connection_name still seem to be switched in SHOW FULL SLAVE STATUS:

          show full slave status;
          Connection_name Waiting for master to send event
          Slave_IO_State master2
          Master_Host 127.0.0.1

          revno 3436
          revision-id: monty@askmonty.org-20120925162756-ad39vfvitte0fulf
          date: 2012-09-25 19:27:56 +0300

          elenst Elena Stepanova added a comment - - edited Slave_IO_state and Connection_name still seem to be switched in SHOW FULL SLAVE STATUS: show full slave status; Connection_name Waiting for master to send event Slave_IO_State master2 Master_Host 127.0.0.1 revno 3436 revision-id: monty@askmonty.org-20120925162756-ad39vfvitte0fulf date: 2012-09-25 19:27:56 +0300
          elenst Elena Stepanova made changes -
          Comment [ Sporadically getting the following mismatch on multi_source.simple, same revision as above:

          @@ -49,6 +49,8 @@
           Replicate_Ignore_Server_Ids
           Master_Server_Id 1
           reset slave 'slave1';
          +Warnings:
          +Warning 1612 Being purged log ./mysqld-relay-bin-slave1.000002 was not found
           show full slave status;

          not yet sure what it means ]
          elenst Elena Stepanova added a comment - - edited

          --sync_with_master 0, 'master1' doesn't work
          (space AFTER the comma)

          Upd: moved the description of the problem into MDEV-549

          elenst Elena Stepanova added a comment - - edited --sync_with_master 0, 'master1' doesn't work (space AFTER the comma) Upd: moved the description of the problem into MDEV-549
          elenst Elena Stepanova added a comment - - edited

          "RESET SLAVE 'masterX'" totally removes the masterX configuration.

          Update: moved the description of the problem into subtask MDEV-548

          elenst Elena Stepanova added a comment - - edited "RESET SLAVE 'masterX'" totally removes the masterX configuration. Update: moved the description of the problem into subtask MDEV-548

          Created a subtask https://mariadb.atlassian.net/browse/MDEV-547 for a problem with sql_slave_skip_counter (with a test file and result file).

          Hopefully subtasks will be easier to track, a list of comments becomes messy.

          elenst Elena Stepanova added a comment - Created a subtask https://mariadb.atlassian.net/browse/MDEV-547 for a problem with sql_slave_skip_counter (with a test file and result file). Hopefully subtasks will be easier to track, a list of comments becomes messy.

          Have not been able to repeat the problem with memory allocation until very late last night. Will fix tomorrow.
          The problem with sql_slave_skip_counter was that this variable is not yet multi-source aware.

          Have fixed the following issues:

          • "Slave_IO_state and Connection_name still seem to be switched in SHOW FULL SLAVE STATUS"
          • --sync_with_master 0, 'master1' doesn't work
          • Made sql_slave_skip_counter

          It's intentionally that RESET SLAVE removes the master configuration (this comes from the original patch).
          The reason this method was used, is probably that there was no other logical way to remove a connection.
          This does however create the problem of how to be able to remove the relay logs for a named connection.
          The suggestions I have regarding this are:

          • Let RESET SLAVE remove relay logs and the connection but FLUSH SLAVE would only remove relay logs.
          • Add a DROP SLAVE 'connection_name' command.
          monty Michael Widenius added a comment - Have not been able to repeat the problem with memory allocation until very late last night. Will fix tomorrow. The problem with sql_slave_skip_counter was that this variable is not yet multi-source aware. Have fixed the following issues: "Slave_IO_state and Connection_name still seem to be switched in SHOW FULL SLAVE STATUS" --sync_with_master 0, 'master1' doesn't work Made sql_slave_skip_counter It's intentionally that RESET SLAVE removes the master configuration (this comes from the original patch). The reason this method was used, is probably that there was no other logical way to remove a connection. This does however create the problem of how to be able to remove the relay logs for a named connection. The suggestions I have regarding this are: Let RESET SLAVE remove relay logs and the connection but FLUSH SLAVE would only remove relay logs. Add a DROP SLAVE 'connection_name' command.

          Cannot we still use "RESET SLAVE 'name' ALL" to remove everything and "RESET SLAVE 'name'" to only remove the logs and position? It would be consistent with what we have for single-source: now 'RESET SLAVE ALL' removes the entire configuration, while 'RESET SLAVE' only removes logs and the position.

          For the memory problem, I'll try to create a shorter test case, to reduce the amount of noise in the trace. The problem itself is reproducible on my machines and on perro, both with the plain cmake build and with BUILD/compile-pentium-debug-max. So if it doesn't happen on your boxes, maybe you could try there. I can set up the environment so you would only have to run the test.

          elenst Elena Stepanova added a comment - Cannot we still use "RESET SLAVE 'name' ALL" to remove everything and "RESET SLAVE 'name'" to only remove the logs and position? It would be consistent with what we have for single-source: now 'RESET SLAVE ALL' removes the entire configuration, while 'RESET SLAVE' only removes logs and the position. For the memory problem, I'll try to create a shorter test case, to reduce the amount of noise in the trace. The problem itself is reproducible on my machines and on perro, both with the plain cmake build and with BUILD/compile-pentium-debug-max. So if it doesn't happen on your boxes, maybe you could try there. I can set up the environment so you would only have to run the test.

          With the new usage of max_relay_log_size we are getting extra warnings in the standard test suite (as below). Should we add the warnings to the result files, or are you planning to change the algorithm somehow?

          CURRENT_TEST: rpl.rpl_deadlock_innodb
          — mysql-test/suite/rpl/r/rpl_deadlock_innodb.result 2012-09-28 03:23:07.482695000 +0400
          +++ mysql-test/suite/rpl/r/rpl_deadlock_innodb.reject 2012-10-01 22:52:05.107989763 +0400
          @@ -76,6 +76,8 @@

              • Test lock wait timeout and purged relay logs ***
                SET @my_max_relay_log_size= @@global.max_relay_log_size;
                SET global max_relay_log_size=0;
                +Warnings:
                +Warning 1292 Truncated incorrect max_relay_log_size value: '0'
                include/stop_slave.inc
                DELETE FROM t2;
                CHANGE MASTER TO MASTER_LOG_POS=<master_pos_begin>;

          mysqltest: Result length mismatch

          elenst Elena Stepanova added a comment - With the new usage of max_relay_log_size we are getting extra warnings in the standard test suite (as below). Should we add the warnings to the result files, or are you planning to change the algorithm somehow? CURRENT_TEST: rpl.rpl_deadlock_innodb — mysql-test/suite/rpl/r/rpl_deadlock_innodb.result 2012-09-28 03:23:07.482695000 +0400 +++ mysql-test/suite/rpl/r/rpl_deadlock_innodb.reject 2012-10-01 22:52:05.107989763 +0400 @@ -76,6 +76,8 @@ Test lock wait timeout and purged relay logs *** SET @my_max_relay_log_size= @@global.max_relay_log_size; SET global max_relay_log_size=0; +Warnings: +Warning 1292 Truncated incorrect max_relay_log_size value: '0' include/stop_slave.inc DELETE FROM t2; CHANGE MASTER TO MASTER_LOG_POS=<master_pos_begin>; mysqltest: Result length mismatch

          The warning for max_relay_log_size is ok in my opinion. Now it works like max_binlog_size and other variables.

          monty Michael Widenius added a comment - The warning for max_relay_log_size is ok in my opinion. Now it works like max_binlog_size and other variables.

          Pushed into 10.0-base

          monty Michael Widenius added a comment - Pushed into 10.0-base
          monty Michael Widenius made changes -
          Fix Version/s 10.0.0 [ 10000 ]
          Resolution Fixed [ 1 ]
          Status Open [ 1 ] Closed [ 6 ]
          serg Sergei Golubchik made changes -
          Workflow defaullt [ 11652 ] MariaDB v2 [ 43984 ]
          ratzpo Rasmus Johansson (Inactive) made changes -
          Workflow MariaDB v2 [ 43984 ] MariaDB v3 [ 63230 ]
          elenst Elena Stepanova made changes -
          elenst Elena Stepanova made changes -
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 63230 ] MariaDB v4 [ 131924 ]

          People

            monty Michael Widenius
            ratzpo Rasmus Johansson (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.