|
From which version did you upgrade to 10.2? Could it have been 10.1 or earlier?
To get more information, run SHOW SLAVE STATUS when you see it happen, several times with some time interval. The first thing to pay attention to is whether the slave position is changing at all. If it does not, then the slave indeed hangs on this particular event, or at least executes it for a very long time. If the position does change, only very slowly, that's a different matter.
Then, search the binary log file for the position that SHOW SLAVE STATUS shows (be it always the same position or a slowly changed one), and see what kind of event is being executed. If you can, attach it to the issue along with your cnf files and the output of SHOW SLAVE STATUS that you've collected; or, if the event contains confidential information, you can upload it to ftp.askmonty.org/private.
Additionally, if the slave indeed hangs and doesn't change the position at all, you could try to get the stack trace from the running server.
|
|
I'm upgrading as new releases are available. From 10.2.6 I have used all. It was 10.2.12.
I'll reply tomorrow (since the offending replication event occurs at night).
|
|
SHOW SLAVE STATUS
| Slave_IO_State |
Master_Host |
Master_User |
Master_Port |
Connect_Retry |
Master_Log_File |
Read_Master_Log_Pos |
Relay_Log_File |
Relay_Log_Pos |
Relay_Master_Log_File |
Slave_IO_Running |
Slave_SQL_Running |
Replicate_Do_DB |
Replicate_Ignore_DB |
Replicate_Do_Table |
Replicate_Ignore_Table |
Replicate_Wild_Do_Table |
Replicate_Wild_Ignore_Table |
Last_Errno |
Last_Error |
Skip_Counter |
Exec_Master_Log_Pos |
Relay_Log_Space |
Until_Condition |
Until_Log_File |
Until_Log_Pos |
Master_SSL_Allowed |
Master_SSL_CA_File |
Master_SSL_CA_Path |
Master_SSL_Cert |
Master_SSL_Cipher |
Master_SSL_Key |
Seconds_Behind_Master |
Master_SSL_Verify_Server_Cert |
Last_IO_Errno |
Last_IO_Error |
Last_SQL_Errno |
Last_SQL_Error |
Replicate_Ignore_Server_Ids |
Master_Server_Id |
Master_SSL_Crl |
Master_SSL_Crlpath |
Using_Gtid |
Gtid_IO_Pos |
Replicate_Do_Domain_Ids |
Replicate_Ignore_Domain_Ids |
Parallel_Mode |
SQL_Delay |
SQL_Remaining_Delay |
Slave_SQL_Running_State |
| Waiting for master to send event |
192.168.30.46 |
repllpvs2 |
3306 |
60 |
lpvs1-mysql-bin.000023 |
108508087 |
lpvs2-relay-bin.000078 |
505807544 |
lpvs1-mysql-bin.000022 |
Yes |
Yes |
|
|
|
|
|
|
0 |
|
0 |
641922625 |
1104002383 |
None |
|
0 |
No |
|
|
|
|
|
23624 |
No |
0 |
|
0 |
|
|
1 |
|
|
No |
|
|
|
none |
0 |
|
Delete_rows_log_event::find_row(-1) |
|
|
Another run
| Slave_IO_State |
Master_Host |
Master_User |
Master_Port |
Connect_Retry |
Master_Log_File |
Read_Master_Log_Pos |
Relay_Log_File |
Relay_Log_Pos |
Relay_Master_Log_File |
Slave_IO_Running |
Slave_SQL_Running |
Replicate_Do_DB |
Replicate_Ignore_DB |
Replicate_Do_Table |
Replicate_Ignore_Table |
Replicate_Wild_Do_Table |
Replicate_Wild_Ignore_Table |
Last_Errno |
Last_Error |
Skip_Counter |
Exec_Master_Log_Pos |
Relay_Log_Space |
Until_Condition |
Until_Log_File |
Until_Log_Pos |
Master_SSL_Allowed |
Master_SSL_CA_File |
Master_SSL_CA_Path |
Master_SSL_Cert |
Master_SSL_Cipher |
Master_SSL_Key |
Seconds_Behind_Master |
Master_SSL_Verify_Server_Cert |
Last_IO_Errno |
Last_IO_Error |
Last_SQL_Errno |
Last_SQL_Error |
Replicate_Ignore_Server_Ids |
Master_Server_Id |
Master_SSL_Crl |
Master_SSL_Crlpath |
Using_Gtid |
Gtid_IO_Pos |
Replicate_Do_Domain_Ids |
Replicate_Ignore_Domain_Ids |
Parallel_Mode |
SQL_Delay |
SQL_Remaining_Delay |
Slave_SQL_Running_State |
| Waiting for master to send event |
192.168.30.46 |
repllpvs2 |
3306 |
60 |
lpvs1-mysql-bin.000023 |
120126020 |
lpvs2-relay-bin.000078 |
505807544 |
lpvs1-mysql-bin.000022 |
Yes |
Yes |
|
|
|
|
|
|
0 |
|
0 |
641922625 |
1114819549 |
None |
|
0 |
No |
|
|
|
|
|
26940 |
No |
0 |
|
0 |
|
|
1 |
|
|
No |
|
|
|
none |
0 |
|
Delete_rows_log_event::find_row(-1) |
|
|
So, the event in question is 505807544 in lpvs1-mysql-bin.000022. Can you find out what it is?
|
|
Alternatively, you can upload the whole binlog to ftp.askmonty.org/private.
|
|
Hi Elena,
I've, since, upgraded the server to have more RAM and change the config accordingly. Today, I'll be re-syncing the slave. Let's see how it goes.
Best regards,
António
|
|
It keeps happening 
Schema:
CREATE TABLE `tbl_facturas_detalhe` (
`Armazem` varchar(3) DEFAULT '71',
`Urgente` varchar(1) DEFAULT NULL,
`Guia` varchar(10) DEFAULT NULL,
`Factura` varchar(15) DEFAULT NULL,
`CodCliente` varchar(7) DEFAULT NULL,
`Data` date DEFAULT NULL,
`Tipo` varchar(5) DEFAULT NULL,
`Atendedor` varchar(50) DEFAULT NULL,
`Vendedor` varchar(6) DEFAULT NULL,
`NumeroPedidoCliente` varchar(15) DEFAULT NULL,
`NumeroQuiterPedidoOrigem` varchar(20) DEFAULT NULL,
`Marca` varchar(3) DEFAULT NULL,
`Referencia` varchar(50) DEFAULT NULL,
`Designacao` varchar(50) DEFAULT NULL,
`CodigoDesconto` varchar(10) DEFAULT NULL,
`TabelaDescontoVenda` varchar(5) DEFAULT NULL,
`DescontoStock` decimal(6,2) DEFAULT NULL,
`DescontoUrgente` decimal(6,2) DEFAULT NULL,
`Familia` varchar(10) DEFAULT NULL,
`FamiliaMCoutinho` varchar(10) DEFAULT NULL,
`Qtd` int(11) DEFAULT NULL,
`PVenda` decimal(10,2) DEFAULT NULL,
`Desconto` decimal(10,2) DEFAULT NULL,
`VVenda` decimal(10,2) DEFAULT NULL,
`VPCMP` decimal(10,2) DEFAULT NULL,
`Margem` decimal(10,2) DEFAULT NULL,
`Percentagem` decimal(6,2) DEFAULT NULL,
KEY `Referencia` (`Referencia`),
KEY `Factura` (`Factura`),
KEY `Guia` (`Guia`),
KEY `CodCliente` (`CodCliente`),
KEY `Data` (`Data`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
PARTITION BY RANGE COLUMNS(`Data`)
(PARTITION `p01` VALUES LESS THAN ('2006-01-01') ENGINE = MyISAM,
PARTITION `p02` VALUES LESS THAN ('2007-01-01') ENGINE = MyISAM,
PARTITION `p03` VALUES LESS THAN ('2008-01-01') ENGINE = MyISAM,
PARTITION `p04` VALUES LESS THAN ('2009-01-01') ENGINE = MyISAM,
PARTITION `p05` VALUES LESS THAN ('2010-01-01') ENGINE = MyISAM,
PARTITION `p06` VALUES LESS THAN ('2011-01-01') ENGINE = MyISAM,
PARTITION `p07` VALUES LESS THAN ('2012-01-01') ENGINE = MyISAM,
PARTITION `p08` VALUES LESS THAN ('2013-01-01') ENGINE = MyISAM,
PARTITION `p09` VALUES LESS THAN ('2014-01-01') ENGINE = MyISAM,
PARTITION `p10` VALUES LESS THAN ('2015-01-01') ENGINE = MyISAM,
PARTITION `p11` VALUES LESS THAN ('2016-01-01') ENGINE = MyISAM,
PARTITION `p12` VALUES LESS THAN ('2017-01-01') ENGINE = MyISAM,
PARTITION `p13` VALUES LESS THAN ('2018-01-01') ENGINE = MyISAM,
PARTITION `p14` VALUES LESS THAN ('2019-01-01') ENGINE = MyISAM,
PARTITION `p15` VALUES LESS THAN ('2020-01-01') ENGINE = MyISAM,
PARTITION `p99` VALUES LESS THAN (MAXVALUE) ENGINE = MyISAM)
|
| Slave_IO_State |
Waiting for master to send event |
| Master_Host |
192.168.30.221 |
| Master_User |
repllpvs2 |
| Master_Port |
3306 |
| Connect_Retry |
60 |
| Master_Log_File |
lpvs1-mysql-bin.000064 |
| Read_Master_Log_Pos |
65557889 |
| Relay_Log_File |
lpvs2-relay-bin.000140 |
| Relay_Log_Pos |
405820055 |
| Relay_Master_Log_File |
lpvs1-mysql-bin.000063 |
| Slave_IO_Running |
Yes |
| Slave_SQL_Running |
Yes |
| Replicate_Do_DB |
| Replicate_Ignore_DB |
| Replicate_Do_Table |
| Replicate_Ignore_Table |
| Replicate_Wild_Do_Table |
| Replicate_Wild_Ignore_Table |
| Last_Errno |
0 |
| Last_Error |
| Skip_Counter |
0 |
| Exec_Master_Log_Pos |
484742991 |
| Relay_Log_Space |
1059071280 |
| Until_Condition |
None |
| Until_Log_File |
| Until_Log_Pos |
0 |
| Master_SSL_Allowed |
No |
| Master_SSL_CA_File |
| Master_SSL_CA_Path |
| Master_SSL_Cert |
| Master_SSL_Cipher |
| Master_SSL_Key |
| Seconds_Behind_Master |
29261 |
| Master_SSL_Verify_Server_Cert |
No |
| Last_IO_Errno |
0 |
| Last_IO_Error |
| Last_SQL_Errno |
0 |
| Last_SQL_Error |
| Replicate_Ignore_Server_Ids |
| Master_Server_Id |
1 |
| Master_SSL_Crl |
| Master_SSL_Crlpath |
| Using_Gtid |
No |
| Gtid_IO_Pos |
| Replicate_Do_Domain_Ids |
| Replicate_Ignore_Domain_Ids |
| Parallel_Mode |
none |
| SQL_Delay |
0 |
| SQL_Remaining_Delay |
| Slave_SQL_Running_State |
Delete_rows_log_event::find_row(-1) |
|
|
Uploaded MDEV-15350.tgz
|
|
Upgraded the slave to 10.2.14 and the problem persists.
|
|
Problem does not exist for InnoDB. Closing.
|
|
Thanks for letting us know.
So, it was still happening for you with MyISAM on 10.2.14?
|
|
Yes, it was. I believe it's related to the fact that the table does not have a primary key.
With InnoDB, at least, it has (not the schema but within InnoDB).
But that's just my suspicion.
Thanks
|
|
ampf, "Delete_rows_log_event::find_row(-1)" status of a slave applier thread can be explained by lack
of unique key on the table. I am closing the issue, since you suspected the same and as I gather basing on
your table properties.
|
|
Rows log event applying on unique-key-less tables may be slow.
|