[MDEV-253] Multi-source replication - Jira

Details

Type: Task
Status: Closed
Priority: Major
Resolution: Fixed
Fix Version/s: 10.0.0
Component/s: None
Labels:
- pf1

Description

This task is about implementing a way for a MariaDB replication slave to
replicate from more than one master simultaneously.

Each master is handled by a specific replicator instance in the slave
server. Each replicator instance consists of a separate I/O thread, SQL thread,
and associated state and configuration. In effect, several replication slaves
are running at the same time, each replicating from a separate master, but all
replicating into a common data store (typically, but not necessarily to
separate databases/tables).

A replicator instance is identified with a user-chosen name used in
replication SQL statements such as CHANGE MASTER TO ...
This name is also included in file names to distinguish the files that keep
the replication state (relay logs, master.info, relay-log.info). This way,
each separate instance can be configured separately, but otherwise the same
way as existing single-source replication.

In order to remain backwards-compatible with existing third-party scripts
etc., the replicator instance name is made optional in all existing
replication statements. If it is omitted, the name "default" is used, and for
this particular name, master.info and the other files retain their old names
to allow seamless upgrades of slaves.

In this worklog, there is no extra conflict resolution proposed. The effect of
updates from one master on replication from another master will be the same as
the effect of direct user SQL queries on the slave server, i.e. it is the
responsibility of the user/DBA to ensure no conflicts occur. If a conflict
causes replication of an event to fail (e.g. with a duplicate key violation), the
corresponding slave SQL thread will stop, requiring manual intervention to fix
the problem and restart the thread.

An easy and typical way to avoid conflicts would be, e.g., to use separate
databases for each master->slave replication channel. RBR idempotent slave
application can also be used to help resolve conflicts, for example.
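
One way to check the "separate databases" convention before setting up the channels is sketched below in Python. The helper overlapping_databases and the channel-to-database mapping are hypothetical and only illustrate the idea; nothing in this worklog makes the server perform such a check.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def overlapping_databases(
    channels: Dict[str, Set[str]],
) -> Dict[Tuple[str, str], Set[str]]:
    """Return, per pair of channels, the databases both of them write to.

    `channels` maps a replicator instance name to the set of databases
    that its master is expected to update (an assumption supplied by the DBA).
    """
    overlaps = {}
    # Compare every unordered pair of channel names exactly once.
    for first, second in combinations(sorted(channels), 2):
        shared = channels[first] & channels[second]
        if shared:
            overlaps[(first, second)] = shared
    return overlaps
```

An empty result means no two channels share a database, so conflicts between masters at the database level are not expected under that assumption.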

See also MySQL WL#1697: http://forge.mysql.com/worklog/task.php?id=1697

High-Level Specification

Here is a preliminary list of things that need to be changed/extended to
handle multi-source:

File names
----------

These files used to store replication state must be extended to include the
replicator instance name. For "default", as a special case, the old name
should be used for backwards compatibility:

HOST-relay-bin.XXXXXX
HOST-relay-bin.index
relay-log.info
master.info
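
A minimal sketch of this naming rule in Python, for illustration only. The suffix scheme (master.info.NAME, relay-log.info.NAME, HOST-relay-bin-NAME) is taken from the error log excerpt quoted in the comments of this issue for revno 3436. The index file name and the helper state_file_names itself are assumptions, and the names used by a released server may differ.

```python
from typing import Dict


def state_file_names(connection_name: str, host: str = "mysqld") -> Dict[str, str]:
    """Return the replication state file names for one replicator instance."""
    # The "default" instance keeps the legacy names so upgrades stay seamless.
    if connection_name == "default":
        return {
            "master_info": "master.info",
            "relay_log_info": "relay-log.info",
            "relay_log_base": f"{host}-relay-bin",
            "relay_log_index": f"{host}-relay-bin.index",
        }
    # Named instances carry the name in every file (scheme seen in the log excerpt).
    return {
        "master_info": f"master.info.{connection_name}",
        "relay_log_info": f"relay-log.info.{connection_name}",
        "relay_log_base": f"{host}-relay-bin-{connection_name}",
        # Assumption: the index file follows the relay log base name.
        "relay_log_index": f"{host}-relay-bin-{connection_name}.index",
    }
```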

SQL statements
--------------

These statements need to be extended to take an optional replicator instance
name. If omitted, "default" is used:

CHANGE MASTER TO
LOAD DATA FROM MASTER
LOAD TABLE xxx FROM MASTER
MASTER_POS_WAIT(file, position, timeout)
RESET SLAVE
SHOW SLAVE STATUS
START SLAVE
START SLAVE UNTIL ...
STOP SLAVE
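
As an illustration of the optional name, the following Python sketch builds the statement text with or without a replicator instance name. The placement of the quoted name follows the test case posted in the comments of this issue (change master 'slave1' to ..., start slave 'slave1', show slave 'slave1' status). The helper replication_statement is hypothetical, and the final syntax may differ.

```python
from typing import Optional

# Statement templates; "{name}" marks where the optional instance name goes.
_TEMPLATES = {
    "change_master": "CHANGE MASTER{name} TO",
    "start": "START SLAVE{name}",
    "stop": "STOP SLAVE{name}",
    "reset": "RESET SLAVE{name}",
    "status": "SHOW SLAVE{name} STATUS",
}


def replication_statement(kind: str, connection_name: Optional[str] = None) -> str:
    """Return the statement text for `kind`, naming an instance if one is given."""
    if connection_name is None:
        # No name: the statement keeps its existing single-source form.
        quoted = ""
    else:
        # Double any single quote so the name stays a valid string literal.
        escaped = connection_name.replace("'", "''")
        quoted = f" '{escaped}'"
    return _TEMPLATES[kind].format(name=quoted)
```

Omitting the name yields the unchanged legacy statement, which is what keeps existing scripts working.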

System variables
----------------

These system variables need to handle multiple replicator instances. Some can
remain global, and apply to all instances. Some will need to be per-instance,
and can probably use structured system variables, the same way as for multiple key
caches:

SQL_SLAVE_SKIP_COUNTER
log_slave_updates
master_connect_retry
master_info_file
master_retry_count
max_relay_log_size
read_only
relay_log
relay_log_index
relay_log_info_file
relay_log_purge
relay_log_space_limit
replicate_do_db
replicate_do_table
replicate_ignore_db
replicate_ignore_table
replicate_rewrite_db
replicate_same_server_id
replicate_wild_do_table
replicate_wild_ignore_table
report_host
report_port
skip_slave_start
slave_compressed_protocol
slave_load_tmpdir
slave_net_timeout
slave_skip_errors

The deprecated options master_host, master_port, master_user, master_password,
and master_ssl_* should not be supported for multi-source replication.
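
For the per-instance variables, a structured name could be resolved roughly as in the Python sketch below. It assumes the key-cache convention, where an unqualified name refers to the instance called "default". The function split_structured_name and the example name slave1.max_relay_log_size are illustrative only; this worklog does not fix the final syntax.

```python
from typing import Tuple


def split_structured_name(name: str) -> Tuple[str, str]:
    """Split 'INSTANCE.variable' into (instance, variable).

    An unqualified variable name is attributed to the "default" instance,
    mirroring how an unqualified key cache variable refers to the default cache.
    """
    # Split on the last dot, since variable names themselves contain no dots.
    instance, separator, variable = name.rpartition(".")
    if not separator:
        return ("default", name)
    return (instance, variable)
```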

Low-Level Design

A main part of this worklog will be to modify the server code so that all
slave objects and data can have multiple instances, do not use global variables,
etc., and so that all user-visible interfaces (SQL statements, system
variables, status variables, files) are extended to support multiple
replicator instances.

Another major part is to develop the feature carefully and, not least, to test
it for full backwards compatibility in the case where only a single, default
master is used. If this is not done correctly, a lot of users will run into
problems when slaves are upgraded and their monitoring scripts, management
scripts, or replication state break.

Attachments

Sub-Tasks

1.	Multi-source: sql_slave_skip_counter doesn't work for a connection with a non-default name	Closed	Michael Widenius
2.	Multi-source: Behavior of "RESET SLAVE 'name'" is inconsistent with the normal RESET SLAVE	Closed	Michael Widenius
3.	Multi-source: sync_with_master doesn't accept a space after the comma	Closed	Michael Widenius
4.	Multi-source: [patch] Get rid of sleep in multi_source.simple test	Closed	Michael Widenius
5.	Multi-source: Non-descriptive error message in the error log on creating a duplicate replication configuration	Closed	Michael Widenius
6.	Multi-source: Slave allows multiple masters to have the same server id (might be bad for complicated replication setups)	Closed	Michael Widenius
7.	Multi-source: SHOW RELAYLOG EVENTS doesn't work for named connections	Closed	Michael Widenius
8.	Multi-source: More status variables need to be either local for a replication connection, or truly global across all slaves	Closed	Michael Widenius
9.	Multi-source: "Freeing overrun buffer" error and valgrind warnings on change master with a named connection	Closed	Michael Widenius
10.	Multi-source: [tests] Basic tests to review and add to the multi_source suite	Closed	Michael Widenius
11.	Multi-source: RESET SLAVE 'name' ALL creates master.info.index file instead of updating multi-master.info	Closed	Michael Widenius
12.	Multi-source: Memory loss warnings on an attempt to create a duplicate connection with a different name	Closed	Michael Widenius
13.	Multi-source: Semisync replication is not fully supported for multiple masters and can cause replication failure and relay log corruption	Closed	Michael Widenius

Activity

Rasmus Johansson (Inactive) created issue - 2012-05-04 11:32

Rasmus Johansson (Inactive) made changes - 2012-05-04 11:33

Field: Description
Change: appended the High-Level Specification section (file names, SQL statements, system variables) after the "See also" link; the rest of the description is unchanged.

Rasmus Johansson (Inactive) made changes - 2012-05-04 11:34

Field: Description
Change: the heading markup of the High-Level Specification section changed from "h1." to "h4.", and the Low-Level Design section was added; the rest of the description is unchanged.

Rasmus Johansson (Inactive) made changes - 2012-05-04 11:35

Assignee

Michael Widenius [ monty ]

Colin Charles made changes - 2012-05-16 09:42

Link

This issue blocks TODO-160 [ TODO-160 ]

Rasmus Johansson (Inactive) made changes - 2012-07-17 22:46

Labels

pf1

Elena Stepanova added a comment - 2012-09-25 03:40

Trying bzr+ssh://bazaar.launchpad.net/~maria-captains/maria/10.0-mdev253 revno 3436.

Here is the simplest test case (more of a template for future test cases). It contains the cnf and test files for now; we'll record a result file later, once we're satisfied with the way it works.

cat t/multisource1.cnf
!include include/default_mysqld.cnf
!include include/default_client.cnf

[mysqld.1]
server-id=1
log-bin=master-bin

[mysqld.2]
server-id=2
log-bin=master-bin

[mysqld.3]
server-id=3

[ENV]
SERVER_MYPORT_1= @mysqld.1.port
SERVER_MYSOCK_1= @mysqld.1.socket
SERVER_MYPORT_2= @mysqld.2.port
SERVER_MYSOCK_2= @mysqld.2.socket
SERVER_MYPORT_3= @mysqld.3.port
SERVER_MYSOCK_3= @mysqld.3.socket

cat t/multisource1.test
--connect (slave,127.0.0.1,root,,,$SERVER_MYPORT_3)

eval change master 'slave1' to master_port=$SERVER_MYPORT_1, master_host='127.0.0.1', master_user='root';
eval change master 'slave2' to master_port=$SERVER_MYPORT_2, master_host='127.0.0.1', master_user='root';
start slave 'slave1';
start slave 'slave2';

query_vertical show full slave status;

stop slave 'slave1';

query_vertical show slave 'slave1' status;

Elena Stepanova added a comment - 2012-09-25 03:45

I observe some problems with how the server executes this test case.

1. stop slave 'slave1'; command throws a warning:

Warnings:
Note 1255 Slave already has been stopped

However, both before and after the command slave status shows that the slave is running.

Slave_IO_Running Yes
Slave_SQL_Running Yes

2. Slave produces memory-related errors:

Error: Safemalloc overrun buffer mysys/safemalloc.c:303, mysys/safemalloc.c:325, ??:0, ??:0, sql/mysqld.cc:1758, sql/mysqld.cc:5022, sql/main.cc:26, ??:0
Allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
Error: Safemalloc overrun buffer mysys/safemalloc.c:303, mysys/safemalloc.c:325, ??:0, ??:0, sql/mysqld.cc:1758, sql/mysqld.cc:5022, sql/main.cc:26, ??:0
Allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
Warning: 14 bytes lost, allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
Warning: 14 bytes lost, allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
Memory lost: 28 bytes in 2 chunks

3. In show full slave status, it looks like Connection_name and Slave_IO_State are switched:

Connection_name Checking master version
Slave_IO_State slave1

Elena Stepanova made changes - 2012-09-25 03:49

Link

This issue relates to TODO-295 [ TODO-295 ]

Elena Stepanova added a comment - 2012-09-25 04:20

I got the following assertion failure a couple of times:

mysqld: /home/elenst/10.0-mdev253/mysys/mf_iocache.c:1287: _my_b_seq_read: Assertion `pos_in_file == info->end_of_file' failed.
120925 5:11:52 [ERROR] mysqld got signal 6 ;

It is sporadic and there is no test case yet; I will try to create one shortly.

Elena Stepanova added a comment - 2012-09-25 20:05 - edited

Getting

Error: Freeing overrun buffer mysys/safemalloc.c:179, mysys/my_malloc.c:116, sql/rpl_mi.cc:76, sql/rpl_mi.cc:84, sql/rpl_mi.cc:596, mysys/hash.c:605, sql/rpl_mi.cc:984, sql/sql_reload.cc:338

while executing multi_source.simple
revno 3436
revision-id: monty@askmonty.org-20120925162756-ad39vfvitte0fulf
date: 2012-09-25 19:27:56 +0300

Server built as
cmake . -DCMAKE_BUILD_TYPE=Debug && make
on ubuntu 11.10 oneiric, x86_64

Error log excerpt:

120925 20:57:00 [Note] Master 'slave1': Slave I/O thread: connected to master 'root@127.0.0.1:16000',replication started in log 'FIRST' at position 4
120925 20:57:00 [Note] Master 'slave2': Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log './mysqld-relay-bin-slave2.000001' position: 4
120925 20:57:00 [Note] Master 'slave2': Slave I/O thread: connected to master 'root@127.0.0.1:16001',replication started in log 'FIRST' at position 4
120925 20:57:05 [Note] Master 'slave1': Error reading relay log event: slave SQL thread was killed
120925 20:57:05 [ERROR] Master 'slave1': Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
120925 20:57:05 [Note] Master 'slave1': Slave I/O thread killed while reading event
120925 20:57:05 [Note] Master 'slave1': Slave I/O thread exiting, read up to log 'master-bin.000001', position 286
120925 20:57:05 [Note] Deleted Master_info file '/home/elenst/10.0-mdev253/mysql-test/var/mysqld.3/data/master.info.slave1'.
120925 20:57:05 [Note] Deleted Master_info file '/home/elenst/10.0-mdev253/mysql-test/var/mysqld.3/data/relay-log.info.slave1'.
Error: Freeing overrun buffer mysys/safemalloc.c:179, mysys/my_malloc.c:116, sql/rpl_mi.cc:76, sql/rpl_mi.cc:84, sql/rpl_mi.cc:596, mysys/hash.c:605, sql/rpl_mi.cc:984, sql/sql_reload.cc:338
Allocated at sql/rpl_mi.cc:48, sql/sql_parse.cc:2361, sql/sql_parse.cc:5813, sql/sql_parse.cc:1071, sql/sql_parse.cc:808, sql/sql_connect.cc:1253, sql/sql_connect.cc:1169, perfschema/pfs.cc:1017
120925 20:57:05 [Note] Master 'slave2': Error reading relay log event: slave SQL thread was killed

Elena Stepanova added a comment - 2012-09-25 20:18 - edited

Slave_IO_state and Connection_name still seem to be switched in SHOW FULL SLAVE STATUS:

show full slave status;
Connection_name Waiting for master to send event
Slave_IO_State master2
Master_Host 127.0.0.1

revno 3436
revision-id: monty@askmonty.org-20120925162756-ad39vfvitte0fulf
date: 2012-09-25 19:27:56 +0300

Elena Stepanova made changes - 2012-09-25 21:23

Comment

[ Sporadically getting the following mismatch on multi_source.simple, same revision as above:

@@ -49,6 +49,8 @@
Replicate_Ignore_Server_Ids
Master_Server_Id 1
reset slave 'slave1';
+Warnings:
+Warning 1612 Being purged log ./mysqld-relay-bin-slave1.000002 was not found
show full slave status;

not yet sure what it means ]

Elena Stepanova added a comment - 2012-09-25 22:22 - edited

--sync_with_master 0, 'master1' doesn't work
(space AFTER the comma)

Upd: moved the description of the problem into ~~MDEV-549~~

Elena Stepanova added a comment - 2012-09-25 22:53 - edited

"RESET SLAVE 'masterX'" totally removes the masterX configuration.

Update: moved the description of the problem into subtask ~~MDEV-548~~

Elena Stepanova added a comment - 2012-09-26 03:47

Created a subtask https://mariadb.atlassian.net/browse/MDEV-547 for a problem with sql_slave_skip_counter (with a test file and result file).

Hopefully subtasks will be easier to track; a long list of comments becomes messy.

Michael Widenius added a comment - 2012-09-27 04:50

Have not been able to repeat the problem with memory allocation until very late last night. Will fix tomorrow.
The problem with sql_slave_skip_counter was that this variable is not yet multi-source aware.

Have fixed the following issues:

- "Slave_IO_state and Connection_name still seem to be switched in SHOW FULL SLAVE STATUS"
- --sync_with_master 0, 'master1' doesn't work
- Made sql_slave_skip_counter multi-source aware

It is intentional that RESET SLAVE removes the master configuration (this comes from the original patch).
This method was probably used because there was no other logical way to remove a connection.
It does, however, raise the question of how to remove only the relay logs for a named connection.
My suggestions regarding this are:

- Let RESET SLAVE remove the relay logs and the connection, while FLUSH SLAVE would only remove the relay logs.
- Add a DROP SLAVE 'connection_name' command.

Elena Stepanova added a comment - 2012-09-27 05:11

Can't we still use "RESET SLAVE 'name' ALL" to remove everything and "RESET SLAVE 'name'" to remove only the logs and the position? That would be consistent with what we have for single-source replication: 'RESET SLAVE ALL' now removes the entire configuration, while 'RESET SLAVE' removes only the logs and the position.
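For illustration, the semantics proposed in this comment could be sketched as follows. The connection name 'master1' and the CHANGE MASTER parameters are made-up placeholders (host and port are borrowed from the error log quoted earlier in this issue), and the comments describe the proposal, not behavior confirmed for this revision.

```sql
-- Hypothetical named connection, set up only so there is something to reset.
CHANGE MASTER 'master1' TO
  MASTER_HOST='127.0.0.1', MASTER_PORT=16000, MASTER_USER='root';
START SLAVE 'master1';

-- The slave threads must be stopped before the connection can be reset.
STOP SLAVE 'master1';

-- Proposed: remove only the relay logs and the replication position,
-- keeping the connection configuration (host, port, user).
RESET SLAVE 'master1';

-- Proposed: remove the whole connection, including its configuration,
-- mirroring single-source RESET SLAVE ALL.
RESET SLAVE 'master1' ALL;
```

With this split, neither a FLUSH SLAVE variant nor a separate DROP SLAVE command would be needed to tell the two cases apart.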

For the memory problem, I'll try to create a shorter test case, to reduce the amount of noise in the trace. The problem itself is reproducible on my machines and on perro, both with the plain cmake build and with BUILD/compile-pentium-debug-max. So if it doesn't happen on your boxes, maybe you could try there. I can set up the environment so you would only have to run the test.

Elena Stepanova added a comment - 2012-10-01 21:57

With the new usage of max_relay_log_size we are getting extra warnings in the standard test suite (as below). Should we add the warnings to the result files, or are you planning to change the algorithm somehow?

CURRENT_TEST: rpl.rpl_deadlock_innodb
--- mysql-test/suite/rpl/r/rpl_deadlock_innodb.result 2012-09-28 03:23:07.482695000 +0400
+++ mysql-test/suite/rpl/r/rpl_deadlock_innodb.reject 2012-10-01 22:52:05.107989763 +0400
@@ -76,6 +76,8 @@
 *** Test lock wait timeout and purged relay logs ***
 SET @my_max_relay_log_size= @@global.max_relay_log_size;
 SET global max_relay_log_size=0;
+Warnings:
+Warning 1292 Truncated incorrect max_relay_log_size value: '0'
 include/stop_slave.inc
 DELETE FROM t2;
 CHANGE MASTER TO MASTER_LOG_POS=<master_pos_begin>;

mysqltest: Result length mismatch

Michael Widenius added a comment - 2012-10-03 22:08

The warning for max_relay_log_size is ok in my opinion. Now it works like max_binlog_size and other variables.
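A minimal way to observe the behavior discussed here on a test server, modeled on the statements in the rpl_deadlock_innodb excerpt above. The user variable name is taken from that test; the value the server ends up with after the truncation is not stated in this issue, so the sketch reads it back instead of assuming it.

```sql
-- Save the current setting so it can be restored afterwards.
SET @my_max_relay_log_size = @@global.max_relay_log_size;

-- With the new handling, 0 is outside the accepted range; the diff above
-- shows this raising Warning 1292 ("Truncated incorrect ... value: '0'").
SET GLOBAL max_relay_log_size = 0;
SHOW WARNINGS;

-- Check which value the server actually applied after truncation.
SELECT @@global.max_relay_log_size;

-- Restore the original setting.
SET GLOBAL max_relay_log_size = @my_max_relay_log_size;
```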

Michael Widenius added a comment - 2012-11-18 19:32

Pushed into 10.0-base

Michael Widenius made changes - 2012-11-18 19:32

Fix Version/s		10.0.0 [ 10000 ]
Resolution		Fixed [ 1 ]
Status	Open [ 1 ]	Closed [ 6 ]

Sergei Golubchik made changes - 2014-06-13 15:06

Workflow

defaullt [ 11652 ]

MariaDB v2 [ 43984 ]

Rasmus Johansson (Inactive) made changes - 2015-05-18 17:51

Workflow

MariaDB v2 [ 43984 ]

MariaDB v3 [ 63230 ]

Elena Stepanova made changes - 2019-08-22 12:00

Link

This issue causes MENT-349 [ MENT-349 ]

Elena Stepanova made changes - 2019-08-22 12:05

Link

This issue causes MENT-349 [ MENT-349 ]

Sergei Golubchik made changes - 2021-12-06 21:22

Workflow

MariaDB v3 [ 63230 ]

MariaDB v4 [ 131924 ]

People

Assignee: Michael Widenius

Reporter: Rasmus Johansson (Inactive)

Votes: 0

Watchers: 5

Dates

Created: 2012-05-04 11:32

Updated: 2019-08-22 12:05

Resolved: 2012-11-18 19:32
