[MDEV-8134] The relay-log is not flushed after the slave-relay-log.999999 showed - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.0.14, 10.0(EOL), 10.1(EOL)
Fix Version/s: 10.2.37, 10.3.28, 10.4.18, 10.5.9
Component/s: Replication
Labels:
- contribution
- slave-relay-log
Environment:
Centos 6.5 64bit

Description

I am using MariaDB 10.0.14 at CentOS 6.5.
And, as I mentioned at the title of this question, the relay-log is not
flushed after the slave-relay-log.999999 showed when using
"Salve_parallel_threads:10" setting. like showed blow.

binlog_format: ROW
Slave_parallel_threads:10

Everything are working fine except the "slave-relay-log.******" files
continue to exist at the disk which will finally cause the disk full.
If I change the value of Slave_parallel_threads setting from 10 to 0,
the log will be flushed. Howevery "PK duplicate warning" error logs come
next.

Is there any setting should be used with Slave_parallel_threads setting?
Any help will be great.

Best regareds,

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

999999.diff
3 kB
2019-04-25 08:50

Issue Links

mentioned in: Page Failed to load; Page Failed to load; Page Failed to load; Page Failed to load; Page Failed to load

Activity

Ascending order - Click to sort in descending order

Elena Stepanova added a comment - 2015-06-05 19:00

Hi,

Could you please clarify: by "not flushed" do you mean that the logs are not closed/re-opened? Or that they do not get rotated? Or that they do not get purged?

So far, given that the eventual problem is "disk full", I assume the latter, that old relay logs do not get purged automatically. However, I cannot reproduce it, seems to be working fine for me:

# Relay log files, after restart

slave-relay-bin.999998

slave-relay-bin.999999

slave-relay-bin.index

# Contents of relay-log.info:

./slave-relay-bin.999999

master-bin.000001

# Contents of relay index file:

./slave-relay-bin.999998

./slave-relay-bin.999999

#-----------------------------------------

flush relay logs;

#-----------------------------------------

# Relay log files

slave-relay-bin.1000000

slave-relay-bin.999999

slave-relay-bin.index

# Contents of relay-log.info:

./slave-relay-bin.1000000

master-bin.000001

# Contents of relay index file:

./slave-relay-bin.999999

./slave-relay-bin.1000000

#-----------------------------------------

flush relay logs;

#-----------------------------------------

# Relay log files

slave-relay-bin.1000000

slave-relay-bin.1000001

slave-relay-bin.index

# Contents of relay-log.info:

./slave-relay-bin.1000001

master-bin.000001

# Contents of relay index file:

./slave-relay-bin.1000000

./slave-relay-bin.1000001

#-----------------------------------------

flush relay logs;

#-----------------------------------------

# Relay log files

slave-relay-bin.1000001

slave-relay-bin.1000002

slave-relay-bin.index

# Contents of relay-log.info:

./slave-relay-bin.1000002

master-bin.000001

# Contents of relay index file:

./slave-relay-bin.1000001

./slave-relay-bin.1000002

select @@slave_parallel_threads, @@version;

@@slave_parallel_threads	@@version

10	10.0.14-MariaDB-log

Files won't get purged if replication is stopped or aborted, but since you are saying that everything else works fine, I assume it is not the case.

Can you somehow demonstrate the problem? E.g. list the contents of the datadir, then run flush logs on the slave or or on the master, wait a bit, list the contents of the datadir again, the contents of relay-log.info file, and show slave status output?

Also, if you are using multi-source replication or some other non-standard replication setup, please do say so. And please attach your cnf file(s).

Thanks.

Elena Stepanova added a comment - 2015-06-05 19:00 Hi, Could you please clarify: by "not flushed" do you mean that the logs are not closed/re-opened? Or that they do not get rotated? Or that they do not get purged? So far, given that the eventual problem is "disk full", I assume the latter, that old relay logs do not get purged automatically. However, I cannot reproduce it, seems to be working fine for me: # Relay log files, after restart slave-relay-bin.999998 slave-relay-bin.999999 slave-relay-bin.index # # Contents of relay-log.info: ./slave-relay-bin.999999 536 master-bin.000001 313 # # Contents of relay index file: ./slave-relay-bin.999998 ./slave-relay-bin.999999 #----------------------------------------- flush relay logs; #----------------------------------------- # Relay log files slave-relay-bin.1000000 slave-relay-bin.999999 slave-relay-bin.index # # Contents of relay-log.info: ./slave-relay-bin.1000000 492 master-bin.000001 313 # # Contents of relay index file: ./slave-relay-bin.999999 ./slave-relay-bin.1000000 #----------------------------------------- flush relay logs; #----------------------------------------- # Relay log files slave-relay-bin.1000000 slave-relay-bin.1000001 slave-relay-bin.index # # Contents of relay-log.info: ./slave-relay-bin.1000001 492 master-bin.000001 313 # # Contents of relay index file: ./slave-relay-bin.1000000 ./slave-relay-bin.1000001 #----------------------------------------- flush relay logs; #----------------------------------------- # Relay log files slave-relay-bin.1000001 slave-relay-bin.1000002 slave-relay-bin.index # # Contents of relay-log.info: ./slave-relay-bin.1000002 492 master-bin.000001 313 # # Contents of relay index file: ./slave-relay-bin.1000001 ./slave-relay-bin.1000002 select @@slave_parallel_threads, @@version; @@slave_parallel_threads @@version 10 10.0.14-MariaDB-log Files won't get purged if replication is stopped or aborted, but since you are saying that everything else works fine, I assume it is not the case. Can you somehow demonstrate the problem? E.g. list the contents of the datadir, then run flush logs on the slave or or on the master, wait a bit, list the contents of the datadir again, the contents of relay-log.info file, and show slave status output? Also, if you are using multi-source replication or some other non-standard replication setup, please do say so. And please attach your cnf file(s). Thanks.

ilhwan Kim added a comment - 2015-06-08 08:45 - edited

Hi,

Our Replication configuration : master 1node , slave 2node ( slave2 - master - slave1 )

1. slave1 server status

# Slave_parallel_threads = 10

# show slave status\G

*************************** 1. row ***************************

               Slave_IO_State: Waiting for master to send event

                  Master_Host: 10.0.2.120

                  Master_User: repluser

                  Master_Port: 3306

                Connect_Retry: 10

              Master_Log_File: server-bin.000003

          Read_Master_Log_Pos: 327

               Relay_Log_File: slave1-relay-log.999999

                Relay_Log_Pos: 7979

        Relay_Master_Log_File: server-bin.000003

             Slave_IO_Running: Yes

            Slave_SQL_Running: Yes

              Replicate_Do_DB:

          Replicate_Ignore_DB:

           Replicate_Do_Table:

       Replicate_Ignore_Table:

      Replicate_Wild_Do_Table:

  Replicate_Wild_Ignore_Table:

                   Last_Errno: 0

                   Last_Error:

                 Skip_Counter: 0

          Exec_Master_Log_Pos: 327

              Relay_Log_Space: 8161616

              Until_Condition: None

               Until_Log_File:

                Until_Log_Pos: 0

           Master_SSL_Allowed: No

           Master_SSL_CA_File:

           Master_SSL_CA_Path:

              Master_SSL_Cert:

            Master_SSL_Cipher:

               Master_SSL_Key:

        Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

                Last_IO_Errno: 0

                Last_IO_Error:

               Last_SQL_Errno: 0

               Last_SQL_Error:

  Replicate_Ignore_Server_Ids:

             Master_Server_Id: 1

               Master_SSL_Crl:

           Master_SSL_Crlpath:

                   Using_Gtid: No

                  Gtid_IO_Pos:

1 row in set (0.00 sec)

# Contents of slave1-relay-log.index

/Maria_DBLog/slave1-relay-log.999999

/Maria_DBLog/slave1-relay-log.1000000

/Maria_DBLog/slave1-relay-log.1000001

....

/Maria_DBLog/slave1-relay-log.1000937

/Maria_DBLog/slave1-relay-log.1000938

/Maria_DBLog/slave1-relay-log.1000939

# -----------------------------------------------------------

# Contents of slave1-relay-log.info

/Maria_DBLog/slave1-relay-log.999999

server-bin.000002

Maria_DBLog/slave1-relay-log.999999

server-bin.000001

# -----------------------------------------------------------

2. slave2 server status

# Slave_parallel_threads = 10 -> Same as slave1

# Stop slave; Slave_paralle_threads = 0 ; start slave
-> change to slave2-relay-log.1000936 after purge slave2-relay-log.999999

# show slave status\G

*************************** 1. row ***************************

               Slave_IO_State: Waiting for master to send event

                  Master_Host: 10.0.2.120

                  Master_User: repluser

                  Master_Port: 3306

                Connect_Retry: 10

              Master_Log_File: server-bin.000003

          Read_Master_Log_Pos: 327

               Relay_Log_File: slave2-relay-log.1000936

                Relay_Log_Pos: 536

        Relay_Master_Log_File: server-bin.000003

             Slave_IO_Running: Yes

            Slave_SQL_Running: Yes

              Replicate_Do_DB:

          Replicate_Ignore_DB:

           Replicate_Do_Table:

       Replicate_Ignore_Table:

      Replicate_Wild_Do_Table:

  Replicate_Wild_Ignore_Table:

                   Last_Errno: 0

                   Last_Error:

                 Skip_Counter: 0

          Exec_Master_Log_Pos: 327

              Relay_Log_Space: 1202

              Until_Condition: None

               Until_Log_File:

                Until_Log_Pos: 0

           Master_SSL_Allowed: No

           Master_SSL_CA_File:

           Master_SSL_CA_Path:

              Master_SSL_Cert:

            Master_SSL_Cipher:

               Master_SSL_Key:

        Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

                Last_IO_Errno: 0

                Last_IO_Error:

               Last_SQL_Errno: 0

               Last_SQL_Error:

  Replicate_Ignore_Server_Ids:

             Master_Server_Id: 1

               Master_SSL_Crl:

           Master_SSL_Crlpath:

                   Using_Gtid: No

                  Gtid_IO_Pos:

1 row in set (0.00 sec)

# Contents of slave2-relay-log.index

/Maria_DBLog/slave2-relay-log.1000935

/Maria_DBLog/slave2-relay-log.1000936

# -----------------------------------------------------------

# Contents of slave2-relay-log.info

/Maria_DBLog/slave2-relay-log.1000936

server-bin.000003

Maria_DBLog/slave2-relay-log.999999

server-bin.000001

/Maria_DBLog/slave2-relay-log.999999

server-bin.000001

# -----------------------------------------------------------

master , slave1, slave2 -> data is same

# -----------------------------------------------------------

slave1 my.cnf
# -----------------------------------------------------------------------------
server-id = 2
report-host = dev-dbslave1

max-relay-log-size = 10K
master-info-file = /Maria_DBLog/master.info
relay-log = /Maria_DBLog/slave1-relay-log.bin
relay-log-info-file = /Maria_DBLog/slave1-relay-log.info
relay-log-index = /Maria_DBLog/slave1-relay-log.index
slave-skip-errors = all
low-priority-updates

slave_parallel_threads = 10
slave_parallel_max_queued = 524288
# -----------------------------------------------------------------------------

Thanks.

ilhwan Kim added a comment - 2015-06-08 08:45 - edited Hi, Our Replication configuration : master 1node , slave 2node ( slave2 - master - slave1 ) 1. slave1 server status # Slave_parallel_threads = 10 # show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.0.2.120 Master_User: repluser Master_Port: 3306 Connect_Retry: 10 Master_Log_File: server-bin.000003 Read_Master_Log_Pos: 327 Relay_Log_File: slave1-relay-log.999999 Relay_Log_Pos: 7979 Relay_Master_Log_File: server-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 327 Relay_Log_Space: 8161616 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 1 row in set (0.00 sec) # Contents of slave1-relay-log.index /Maria_DBLog/slave1-relay-log.999999 /Maria_DBLog/slave1-relay-log.1000000 /Maria_DBLog/slave1-relay-log.1000001 .... /Maria_DBLog/slave1-relay-log.1000937 /Maria_DBLog/slave1-relay-log.1000938 /Maria_DBLog/slave1-relay-log.1000939 # ----------------------------------------------------------- # Contents of slave1-relay-log.info /Maria_DBLog/slave1-relay-log.999999 7979 server-bin.000002 327 686 Maria_DBLog/slave1-relay-log.999999 7979 server-bin.000001 499387 # ----------------------------------------------------------- 2. slave2 server status # Slave_parallel_threads = 10 -> Same as slave1 # Stop slave; Slave_paralle_threads = 0 ; start slave -> change to slave2-relay-log.1000936 after purge slave2-relay-log.999999 # show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.0.2.120 Master_User: repluser Master_Port: 3306 Connect_Retry: 10 Master_Log_File: server-bin.000003 Read_Master_Log_Pos: 327 Relay_Log_File: slave2-relay-log.1000936 Relay_Log_Pos: 536 Relay_Master_Log_File: server-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 327 Relay_Log_Space: 1202 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_SSL_Crl: Master_SSL_Crlpath: Using_Gtid: No Gtid_IO_Pos: 1 row in set (0.00 sec) # Contents of slave2-relay-log.index /Maria_DBLog/slave2-relay-log.1000935 /Maria_DBLog/slave2-relay-log.1000936 # ----------------------------------------------------------- # Contents of slave2-relay-log.info /Maria_DBLog/slave2-relay-log.1000936 536 server-bin.000003 327 686 Maria_DBLog/slave2-relay-log.999999 7979 server-bin.000001 7691686 /Maria_DBLog/slave2-relay-log.999999 7979 server-bin.000001 7691686 # ----------------------------------------------------------- master , slave1, slave2 -> data is same # ----------------------------------------------------------- slave1 my.cnf # ----------------------------------------------------------------------------- server-id = 2 report-host = dev-dbslave1 max-relay-log-size = 10K master-info-file = /Maria_DBLog/master.info relay-log = /Maria_DBLog/slave1-relay-log.bin relay-log-info-file = /Maria_DBLog/slave1-relay-log.info relay-log-index = /Maria_DBLog/slave1-relay-log.index slave-skip-errors = all low-priority-updates slave_parallel_threads = 10 slave_parallel_max_queued = 524288 # ----------------------------------------------------------------------------- Thanks.

Elena Stepanova added a comment - 2015-06-08 17:44

Thanks.
Apparently, my previous test with flush relay logs was irrelevant.
Here is another one (it's a crude one just to demonstrate the problem, not to be included into the regression suite):

Test case
# IMPORTANT: Run with --mysqld=--slave-parallel-threads=1 --mysqld=--max-relay-log-size=100K

--source include/master-slave.inc
--source include/have_binlog_format_row.inc

--enable_connect_log

--connection master
create table t1 (i int, c varchar(1024));
flush binary logs;
flush binary logs;

--sync_slave_with_master

--echo #
--echo # Stop slave server
--echo #

--let $datadir = `select @@datadir`

--exec echo "wait" > $MYSQLTEST_VARDIR/tmp/mysqld.2.expect
--shutdown_server 10
--source include/wait_until_disconnected.inc

--echo #
--echo # Cheat: get close to 999999
--echo # by renaming relay logs and modifying index/info files
--echo #

--exec sed -i "s/slave-relay-bin.00000/slave-relay-bin.99999/g" $datadir/relay-log.info
--exec sed -i "s/slave-relay-bin.00000/slave-relay-bin.99999/g" $datadir/slave-relay-bin.index
--disable_abort_on_error
--move_file $datadir/slave-relay-bin.000001 $datadir/slave-relay-bin.999991
--move_file $datadir/slave-relay-bin.000002 $datadir/slave-relay-bin.999992
--move_file $datadir/slave-relay-bin.000003 $datadir/slave-relay-bin.999993
--move_file $datadir/slave-relay-bin.000004 $datadir/slave-relay-bin.999994
--move_file $datadir/slave-relay-bin.000005 $datadir/slave-relay-bin.999995
--move_file $datadir/slave-relay-bin.000006 $datadir/slave-relay-bin.999996
--enable_abort_on_error

--echo #
--echo # Restart slave server
--echo #

--exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.2.expect
--enable_reconnect
--source include/wait_until_connected_again.inc
--source include/start_slave.inc

--echo #-----------------------------------------
--echo # Relay log files after restart:
--list_files $datadir slave-relay-bin.*
--echo #
--echo # Contents of relay-log.info:
--cat_file $datadir/relay-log.info
--echo #
--echo # Contents of relay index file:
--cat_file $datadir/slave-relay-bin.index
--echo #-----------------------------------------

--connection master
--echo #
--echo # Insert some data to generate enough amount of binary logs
--echo #
--let $count = 1000
--disable_query_log
while ($count)
{
eval insert into t1 values (1001 - $count, repeat('a',1000));
dec $count;
}
--enable_query_log

--sync_slave_with_master

--echo #-----------------------------------------
--echo # Relay log files:
--list_files $datadir slave-relay-bin.*
--echo #
--echo # Contents of relay-log.info:
--cat_file $datadir/relay-log.info
--echo #
--echo # Contents of relay index file:
--cat_file $datadir/slave-relay-bin.index
--echo #-----------------------------------------

--let $status_items=Master_Log_File,Relay_Log_File,Relay_Log_Pos,Relay_Master_Log_File,Exec_Master_Log_Pos,Seconds_Behind_Master,Slave_IO_Running,Slave_SQL_Running
--source include/show_slave_status.inc

Actual result
...
# Insert some data to generate enough amount of binary logs
#
connection slave;
#-----------------------------------------
# Relay log files:
slave-relay-bin.1000000
slave-relay-bin.1000001
slave-relay-bin.1000002
slave-relay-bin.1000003
slave-relay-bin.1000004
slave-relay-bin.1000005
slave-relay-bin.1000006
slave-relay-bin.1000007
slave-relay-bin.1000008
slave-relay-bin.1000009
slave-relay-bin.999999
slave-relay-bin.index
#
# Contents of relay-log.info:
./slave-relay-bin.999999
101456
master-bin.000003
1187367
#
# Contents of relay index file:
./slave-relay-bin.999999
./slave-relay-bin.1000000
./slave-relay-bin.1000001
./slave-relay-bin.1000002
./slave-relay-bin.1000003
./slave-relay-bin.1000004
./slave-relay-bin.1000005
./slave-relay-bin.1000006
./slave-relay-bin.1000007
./slave-relay-bin.1000008
./slave-relay-bin.1000009
#-----------------------------------------
Master_Log_File = 'master-bin.000003'
Relay_Log_File = 'slave-relay-bin.999999'
Relay_Log_Pos = '101456'
Relay_Master_Log_File = 'master-bin.000003'
Exec_Master_Log_Pos = '1187367'
Seconds_Behind_Master = '0'
Slave_IO_Running = 'Yes'
Slave_SQL_Running = 'Yes'

Elena Stepanova added a comment - 2015-06-08 17:44 Thanks. Apparently, my previous test with flush relay logs was irrelevant. Here is another one (it's a crude one just to demonstrate the problem, not to be included into the regression suite): Test case # IMPORTANT: Run with --mysqld=--slave-parallel-threads=1 --mysqld=--max-relay-log-size=100K --source include/master-slave.inc --source include/have_binlog_format_row.inc --enable_connect_log --connection master create table t1 (i int, c varchar(1024)); flush binary logs; flush binary logs; --sync_slave_with_master --echo # --echo # Stop slave server --echo # --let $datadir = `select @@datadir` --exec echo "wait" > $MYSQLTEST_VARDIR/tmp/mysqld.2.expect --shutdown_server 10 --source include/wait_until_disconnected.inc --echo # --echo # Cheat: get close to 999999 --echo # by renaming relay logs and modifying index/info files --echo # --exec sed -i "s/slave-relay-bin.00000/slave-relay-bin.99999/g" $datadir/relay-log.info --exec sed -i "s/slave-relay-bin.00000/slave-relay-bin.99999/g" $datadir/slave-relay-bin.index --disable_abort_on_error --move_file $datadir/slave-relay-bin.000001 $datadir/slave-relay-bin.999991 --move_file $datadir/slave-relay-bin.000002 $datadir/slave-relay-bin.999992 --move_file $datadir/slave-relay-bin.000003 $datadir/slave-relay-bin.999993 --move_file $datadir/slave-relay-bin.000004 $datadir/slave-relay-bin.999994 --move_file $datadir/slave-relay-bin.000005 $datadir/slave-relay-bin.999995 --move_file $datadir/slave-relay-bin.000006 $datadir/slave-relay-bin.999996 --enable_abort_on_error --echo # --echo # Restart slave server --echo # --exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.2.expect --enable_reconnect --source include/wait_until_connected_again.inc --source include/start_slave.inc --echo #----------------------------------------- --echo # Relay log files after restart: --list_files $datadir slave-relay-bin.* --echo # --echo # Contents of relay-log.info: --cat_file $datadir/relay-log.info --echo # --echo # Contents of relay index file: --cat_file $datadir/slave-relay-bin.index --echo #----------------------------------------- --connection master --echo # --echo # Insert some data to generate enough amount of binary logs --echo # --let $count = 1000 --disable_query_log while ($count) { eval insert into t1 values (1001 - $count, repeat('a',1000)); dec $count; } --enable_query_log --sync_slave_with_master --echo #----------------------------------------- --echo # Relay log files: --list_files $datadir slave-relay-bin.* --echo # --echo # Contents of relay-log.info: --cat_file $datadir/relay-log.info --echo # --echo # Contents of relay index file: --cat_file $datadir/slave-relay-bin.index --echo #----------------------------------------- --let $status_items=Master_Log_File,Relay_Log_File,Relay_Log_Pos,Relay_Master_Log_File,Exec_Master_Log_Pos,Seconds_Behind_Master,Slave_IO_Running,Slave_SQL_Running --source include/show_slave_status.inc Actual result ... # Insert some data to generate enough amount of binary logs # connection slave; #----------------------------------------- # Relay log files: slave-relay-bin.1000000 slave-relay-bin.1000001 slave-relay-bin.1000002 slave-relay-bin.1000003 slave-relay-bin.1000004 slave-relay-bin.1000005 slave-relay-bin.1000006 slave-relay-bin.1000007 slave-relay-bin.1000008 slave-relay-bin.1000009 slave-relay-bin.999999 slave-relay-bin.index # # Contents of relay-log.info: ./slave-relay-bin.999999 101456 master-bin.000003 1187367 # # Contents of relay index file: ./slave-relay-bin.999999 ./slave-relay-bin.1000000 ./slave-relay-bin.1000001 ./slave-relay-bin.1000002 ./slave-relay-bin.1000003 ./slave-relay-bin.1000004 ./slave-relay-bin.1000005 ./slave-relay-bin.1000006 ./slave-relay-bin.1000007 ./slave-relay-bin.1000008 ./slave-relay-bin.1000009 #----------------------------------------- Master_Log_File = 'master-bin.000003' Relay_Log_File = 'slave-relay-bin.999999' Relay_Log_Pos = '101456' Relay_Master_Log_File = 'master-bin.000003' Exec_Master_Log_Pos = '1187367' Seconds_Behind_Master = '0' Slave_IO_Running = 'Yes' Slave_SQL_Running = 'Yes'

jasung gu added a comment - 2015-06-09 11:42

I have same problem.

How I can solve this problem?

jasung gu added a comment - 2015-06-09 11:42 I have same problem. How I can solve this problem?

ilhwan Kim added a comment - 2016-06-29 06:26

mariadb 10.0.26 version same result .. ㅜㅜ

ilhwan Kim added a comment - 2016-06-29 06:26 mariadb 10.0.26 version same result .. ㅜㅜ

David Zhao added a comment - 2019-04-25 08:51 - edited

I also came into this issue a few days ago, and I have located this bug and fixed it, I've attached my patch here.

The problem is that in Relay_log_info::inc_group_relay_log_pos() function, they way you compare two log name is via strcmp() function, this is fine when log name sequence number are both of the same digits( 6 digits). but when the number goes to 7 digits, a 99999 compares greater than 1000000, which is wrong, hence the bug.

Besides this bug, I also located another bug caused by the same mistake in handle_queued_pos_update(), which could cause parallel replication issues when log name sequence number goes to 7 digits.

999999.diff

David Zhao added a comment - 2019-04-25 08:51 - edited I also came into this issue a few days ago, and I have located this bug and fixed it, I've attached my patch here. The problem is that in Relay_log_info::inc_group_relay_log_pos() function, they way you compare two log name is via strcmp() function, this is fine when log name sequence number are both of the same digits( 6 digits). but when the number goes to 7 digits, a 99999 compares greater than 1000000, which is wrong, hence the bug. Besides this bug, I also located another bug caused by the same mistake in handle_queued_pos_update(), which could cause parallel replication issues when log name sequence number goes to 7 digits. 999999.diff

Maria M Pflaum (Inactive) added a comment - 2021-01-06 23:09 - edited

julien.fritsch It turns out I don't need a custom build for the customer. Sorry for the confusion. I just need to let them know when it's available in the next 10.2.x CS release.

Maria M Pflaum (Inactive) added a comment - 2021-01-06 23:09 - edited julien.fritsch It turns out I don't need a custom build for the customer. Sorry for the confusion. I just need to let them know when it's available in the next 10.2.x CS release.

Sujatha Sivakumar (Inactive) added a comment - 2021-01-12 10:37

Hello Andrei,

Can you please review fix for ~~MDEV-8134~~.

Patch: https://github.com/MariaDB/server/commit/ffbb7348f67fdb78bb6c19cbf3d8d890e9ea29d7

BuildBot: bb-10.2-sujatha

Regarding usage of 'binlog_id' for file name comparison, this variable is dedicated for binary logs. Its usage is tightly coupled with binlog checkpoint. This variable monotonically increases during server runtime, starting from 1. If binary logs are cleared by using RESET MASTER the binlog_id is not reset. Only upon server restart it is reset to 1. Hence we cannot use this variable for file_name comparison.

Sujatha Sivakumar (Inactive) added a comment - 2021-01-12 10:37 Hello Andrei, Can you please review fix for MDEV-8134 . Patch: https://github.com/MariaDB/server/commit/ffbb7348f67fdb78bb6c19cbf3d8d890e9ea29d7 BuildBot: bb-10.2-sujatha Regarding usage of 'binlog_id' for file name comparison, this variable is dedicated for binary logs. Its usage is tightly coupled with binlog checkpoint. This variable monotonically increases during server runtime, starting from 1. If binary logs are cleared by using RESET MASTER the binlog_id is not reset. Only upon server restart it is reset to 1. Hence we cannot use this variable for file_name comparison.

Sachin Setiya (Inactive) added a comment - 2021-01-20 10:47

Okay to push

Sachin Setiya (Inactive) added a comment - 2021-01-20 10:47 Okay to push

Sujatha Sivakumar (Inactive) added a comment - 2021-01-21 07:38

Fix is implemented in 10.2.37.

Patch has been cherry-picked to higher versions and tested.
No major merge conflicts were observed.

10.3: https://github.com/MariaDB/server/commit/811dac176c10973d90f110480d2d353c452d78b9
10.4: https://github.com/MariaDB/server/commit/0f2e60f07763de230baa057818c161ce3d59b994
10.5: https://github.com/MariaDB/server/commit/3aeb78968f662629abd467773f7924fd66772022
10.6: https://github.com/MariaDB/server/commit/822e224170f99478b09ddfe03d316fee8b36a3ec

Sujatha Sivakumar (Inactive) added a comment - 2021-01-21 07:38 Fix is implemented in 10.2.37. Patch has been cherry-picked to higher versions and tested. No major merge conflicts were observed. 10.3: https://github.com/MariaDB/server/commit/811dac176c10973d90f110480d2d353c452d78b9 10.4: https://github.com/MariaDB/server/commit/0f2e60f07763de230baa057818c161ce3d59b994 10.5: https://github.com/MariaDB/server/commit/3aeb78968f662629abd467773f7924fd66772022 10.6: https://github.com/MariaDB/server/commit/822e224170f99478b09ddfe03d316fee8b36a3ec

People

Assignee:: Sujatha Sivakumar (Inactive)

Reporter:: ilhwan Kim

Votes:: 5 Vote for this issue

Watchers:: 12 Start watching this issue

Dates

Created:: 2015-05-11 06:47

Updated:: 2024-09-06 10:01

Resolved:: 2021-01-21 07:38

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration