|
Please paste or attach your cnf file(s) and the error log covering at least the last session before upgrade and the first session after upgrade.
Thanks.
|
|
The master server config file (the slave config file is nearly identical). The server IP address and database names have been changed.
[mysqld]
bind-address=1.2.3.4
port=3306
#socket=/tmp/mysql.sock
#key_buffer_size=16M
max_allowed_packet=1G
server-id=1
log-error=/var/db/mysql/error.log
log-bin=mysql-bin
expire_logs_days=14
relay-log=relay-bin
character-set-server=utf8
collation-server=utf8_unicode_ci
innodb_log_file_size=64M
innodb_flush_log_at_trx_commit=1
sync_binlog=1
auto_increment_increment=2
auto_increment_offset=1
log-slave-updates
#skip-slave-start
replicate-do-db=db1
replicate-do-db=db2
ssl-ca=/usr/local/etc/mysql/mysql.ca.crt.pem
ssl-cert=/usr/local/etc/mysql/server.crt.pem
ssl-key=/usr/local/etc/mysql/server.key.pem
[mysql]
default-character-set=utf8
[mysqldump]
events
ignore-table=mysql.events
[client]
default-character-set=utf8
ssl-ca=/usr/local/etc/mysql/mysql.ca.crt.pem
ssl-cert=/usr/local/etc/mysql/mysql.client.crt.pem
ssl-key=/usr/local/etc/mysql/mysql.client.key.pem
|
|
Here is the slave server error log--there were no events logged before this for a 12-hour period. The master and slave servers in this bug report are actually in a master-master configuration. When I say master or slave in this bug report I am referring to servers 1 and 2 respectively. This is why you see the "Start binlog_dump to slave_server(1)" message in the slave server error log below.
2017-12-12 0:00:05 34770706432 [Note] Start binlog_dump to slave_server(1), pos(mysql-bin.000476, 4)
2017-12-12 0:00:07 34770312448 [ERROR] Error reading packet from server: could not find next log; the first event 'mysql-bin.000470' at 69513478, the last event read from 'mysql-bin.000471' at 58629163, the last byte read from 'mysql-bin.000471' at 58629210. (server_errno=1236)
2017-12-12 0:00:07 34770312448 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'could not find next log; the first event 'mysql-bin.000470' at 69513478, the last event read from 'mysql-bin.000471' at 58629163, the last byte read from 'mysql-bin.000471' at 58629210.', Internal MariaDB error code: 1236
2017-12-12 0:00:07 34770312448 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000472', position 4
|
|
Additional information:
This bug occurred on two different clusters after performing the same upgrade. I have since fixed one of the clusters by dumping the data, reinstalling MariaDB, re-importing everything and re-configuring replication. No changes were made to the configuration files.
mysqldump -u root -p -A > all.sql
sudo service mysql-server stop
sudo pkg delete mariadb102-client mariadb102-server
sudo rm -r /var/db/mysql
sudo pkg install mariadb102-client mariadb102-server
sudo service mysql-server start
/usr/local/bin/mysql_secure_installation
#Answer 'y' to each y/n prompt
mysql -u root -p < all.sql
sudo shutdown -r now
|
|
Thanks for the report, Jason!
In case the last pre-upgrade mysql-bin.000610 is still available, could you please check out (mysqlbinlog -v) its tail to see what are the concluding events. I suspect there's no Rotate-log-event to 611th file which would be the 1st one after the upgrade.
Cheers,
Andrei
|
|
Andrei,
I'm afraid that file is no longer available. I have more information about this issue though. Replication for the master-master cluster that I rebuilt yesterday afternoon failed this morning at 0000. Due to the same issue occurring from a fresh install, I updated the ticket title as I think we can safely rule out the upgrade from 10.2.10 to 10.2.11 as the problem. The problem seems to be 10.2.11 itself.
Show slave status output:
server1> show slave status\G
|
*************************** 1. row ***************************
|
Slave_IO_State:
|
Master_Host: 1.1.1.2
|
Master_User: rep_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mysql-bin.000006
|
Read_Master_Log_Pos: 4
|
Relay_Log_File: relay-bin.000006
|
Relay_Log_Pos: 303
|
Relay_Master_Log_File: mysql-bin.000006
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB: db1,db2
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4
|
Relay_Log_Space: 1818
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: Yes
|
Master_SSL_CA_File: /usr/local/etc/mysql/mysql.ca.crt.pem
|
Master_SSL_CA_Path: /usr/local/etc/mysql
|
Master_SSL_Cert: /usr/local/etc/mysql/mysql.client.crt.pem
|
Master_SSL_Cipher:
|
Master_SSL_Key: /usr/local/etc/mysql/mysql.client.key.pem
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'could not find next log; the first event 'mysql-bin.000004' at 338186520, the last event read from 'mysql-bin.000005' at 13143162, the last byte read from 'mysql-bin.000005' at 13143209.'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 2
|
Master_SSL_Crl: /usr/local/etc/mysql/mysql.ca.crt.pem
|
Master_SSL_Crlpath: /usr/local/etc/mysql
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
SQL_Delay: 0
|
SQL_Remaining_Delay: NULL
|
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
server2> show slave status\G
|
*************************** 1. row ***************************
|
Slave_IO_State:
|
Master_Host: 1.1.1.1
|
Master_User: rep_user
|
Master_Port: 3306
|
Connect_Retry: 60
|
Master_Log_File: mysql-bin.000006
|
Read_Master_Log_Pos: 4
|
Relay_Log_File: relay-bin.000007
|
Relay_Log_Pos: 303
|
Relay_Master_Log_File: mysql-bin.000006
|
Slave_IO_Running: No
|
Slave_SQL_Running: Yes
|
Replicate_Do_DB: db1,db2
|
Replicate_Ignore_DB:
|
Replicate_Do_Table:
|
Replicate_Ignore_Table:
|
Replicate_Wild_Do_Table:
|
Replicate_Wild_Ignore_Table:
|
Last_Errno: 0
|
Last_Error:
|
Skip_Counter: 0
|
Exec_Master_Log_Pos: 4
|
Relay_Log_Space: 35531
|
Until_Condition: None
|
Until_Log_File:
|
Until_Log_Pos: 0
|
Master_SSL_Allowed: Yes
|
Master_SSL_CA_File: /usr/local/etc/mysql/mysql.ca.crt.pem
|
Master_SSL_CA_Path: /usr/local/etc/mysql
|
Master_SSL_Cert: /usr/local/etc/mysql/mysql.client.crt.pem
|
Master_SSL_Cipher:
|
Master_SSL_Key: /usr/local/etc/mysql/mysql.client.key.pem
|
Seconds_Behind_Master: NULL
|
Master_SSL_Verify_Server_Cert: No
|
Last_IO_Errno: 1236
|
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'could not find next log; the first event 'mysql-bin.000004' at 556969, the last event read from 'mysql-bin.000005' at 35577620, the last byte read from 'mysql-bin.000005' at 35577667.'
|
Last_SQL_Errno: 0
|
Last_SQL_Error:
|
Replicate_Ignore_Server_Ids:
|
Master_Server_Id: 1
|
Master_SSL_Crl: /usr/local/etc/mysql/mysql.ca.crt.pem
|
Master_SSL_Crlpath: /usr/local/etc/mysql
|
Using_Gtid: No
|
Gtid_IO_Pos:
|
Replicate_Do_Domain_Ids:
|
Replicate_Ignore_Domain_Ids:
|
Parallel_Mode: conservative
|
SQL_Delay: 0
|
SQL_Remaining_Delay: NULL
|
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
|
1 row in set (0.00 sec)
|
Show binary logs output:
server1> show binary logs;
|
+------------------+-----------+
|
| Log_name | File_size |
|
+------------------+-----------+
|
| mysql-bin.000001 | 29297 |
|
| mysql-bin.000002 | 684676057 |
|
| mysql-bin.000003 | 7029858 |
|
| mysql-bin.000004 | 556992 |
|
| mysql-bin.000005 | 35577667 |
|
+------------------+-----------+
|
5 rows in set (0.03 sec)
|
server2> show binary logs;
|
+------------------+-----------+
|
| Log_name | File_size |
|
+------------------+-----------+
|
| mysql-bin.000001 | 29297 |
|
| mysql-bin.000002 | 340293575 |
|
| mysql-bin.000003 | 365 |
|
| mysql-bin.000004 | 338186543 |
|
| mysql-bin.000005 | 13143209 |
|
+------------------+-----------+
|
5 rows in set (0.01 sec)
|
Filesystem binary logs:
[user@server1 ~]$ ls -la /var/db/mysql/mysql-*
|
-rw-rw---- 1 mysql mysql 29297 Dec 13 16:29 /var/db/mysql/mysql-bin.000001
|
-rw-rw---- 1 mysql mysql 684676057 Dec 13 16:39 /var/db/mysql/mysql-bin.000002
|
-rw-rw---- 1 mysql mysql 7029858 Dec 13 17:32 /var/db/mysql/mysql-bin.000003
|
-rw-rw---- 1 mysql mysql 556992 Dec 13 18:08 /var/db/mysql/mysql-bin.000004
|
-rw-rw---- 1 mysql mysql 35577667 Dec 14 00:00 /var/db/mysql/mysql-bin.000005
|
-rw-rw---- 1 mysql mysql 139311216 Dec 15 00:00 /var/db/mysql/mysql-bin.000006
|
-rw-rw---- 1 mysql mysql 139352826 Dec 16 00:00 /var/db/mysql/mysql-bin.000007
|
-rw-rw---- 1 mysql mysql 139308220 Dec 17 00:00 /var/db/mysql/mysql-bin.000008
|
-rw-rw---- 1 mysql mysql 139332003 Dec 18 00:00 /var/db/mysql/mysql-bin.000009
|
-rw-rw---- 1 mysql mysql 75873886 Dec 18 12:36 /var/db/mysql/mysql-bin.000010
|
-rw-rw---- 1 mysql mysql 6512 Dec 18 00:00 /var/db/mysql/mysql-bin.index
|
[user@server2 ~]$ ls -la /var/db/mysql/mysql-*
|
-rw-rw---- 1 mysql mysql 29297 Dec 13 17:13 /var/db/mysql/mysql-bin.000001
|
-rw-rw---- 1 mysql mysql 340293575 Dec 13 17:21 /var/db/mysql/mysql-bin.000002
|
-rw-rw---- 1 mysql mysql 365 Dec 13 17:32 /var/db/mysql/mysql-bin.000003
|
-rw-rw---- 1 mysql mysql 338186543 Dec 13 18:08 /var/db/mysql/mysql-bin.000004
|
-rw-rw---- 1 mysql mysql 13143209 Dec 14 00:00 /var/db/mysql/mysql-bin.000005
|
-rw-rw---- 1 mysql mysql 16508 Dec 15 00:00 /var/db/mysql/mysql-bin.000006
|
-rw-rw---- 1 mysql mysql 448 Dec 16 00:00 /var/db/mysql/mysql-bin.000007
|
-rw-rw---- 1 mysql mysql 448 Dec 17 00:00 /var/db/mysql/mysql-bin.000008
|
-rw-rw---- 1 mysql mysql 448 Dec 18 00:00 /var/db/mysql/mysql-bin.000009
|
-rw-rw---- 1 mysql mysql 401 Dec 18 00:00 /var/db/mysql/mysql-bin.000010
|
-rw-rw---- 1 mysql mysql 6505 Dec 18 00:00 /var/db/mysql/mysql-bin.index
|
mysql-bin.index file tail output:
[user@server1 ~]$ sudo tail /var/db/mysql/mysql-bin.index
|
Event queue status:
|
Element count : 0
|
Data locked : NO
|
Attempting lock : NO
|
LLA : init_queue:140
|
LUA : init_queue:150
|
WOC : NO
|
Next activation : never
|
./mysql-bin.000010
|
[user@server2 ~]$ sudo tail /var/db/mysql/mysql-bin.index
|
Event queue status:
|
Element count : 0
|
Data locked : NO
|
Attempting lock : NO
|
LLA : init_queue:140
|
LUA : init_queue:150
|
WOC : NO
|
Next activation : never
|
./mysql-bin.000010
|
Mysqlbinlog output:
[user@server1 /var/db/mysql]$ sudo mysqlbinlog --no-defaults --start-datetime='2017-12-14 00:00:00' -v /var/db/mysql/mysql-bin.000005 | tail
|
# at 35577589
|
#171214 0:00:08 server id 1 end_log_pos 35577620 CRC32 0x36b33079 Xid = 116378
|
COMMIT/*!*/;
|
# at 35577620
|
#171214 0:00:08 server id 1 end_log_pos 35577667 CRC32 0xdbfe3c5b Rotate to mysql-bin.000006 pos: 4
|
DELIMITER ;
|
# End of log file
|
ROLLBACK /* added by mysqlbinlog */;
|
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
|
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
|
[user@server2 /var/db/mysql]$ sudo mysqlbinlog --no-defaults --start-datetime='2017-12-14 00:00:00' -v /var/db/mysql/mysql-bin.000005 | tail
|
# at 13143131
|
#171214 0:00:04 server id 1 end_log_pos 13143162 CRC32 0xa5937b47 Xid = 160346
|
COMMIT/*!*/;
|
# at 13143162
|
#171214 0:00:05 server id 2 end_log_pos 13143209 CRC32 0xea11e8dd Rotate to mysql-bin.000006 pos: 4
|
DELIMITER ;
|
# End of log file
|
ROLLBACK /* added by mysqlbinlog */;
|
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
|
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
|
Configuration files:
[user@server1 ~]$ more /usr/local/etc/mysql/my.cnf
|
[mysqld]
|
bind-address=1.1.1.1
|
port=3306
|
#socket=/tmp/mysql.sock
|
#key_buffer_size=16M
|
max_allowed_packet=1G
|
server-id=1
|
log-error=/var/db/mysql/error.log
|
log-bin=mysql-bin
|
expire_logs_days=14
|
relay-log=relay-bin
|
slow_query_log=1
|
slow_query_log_file=slow.log
|
character-set-server=utf8
|
collation-server=utf8_unicode_ci
|
innodb_log_file_size=64M
|
innodb_flush_log_at_trx_commit=1
|
sync_binlog=1
|
auto_increment_increment=2
|
auto_increment_offset=1
|
log-slave-updates
|
#skip-slave-start
|
replicate-do-db=db1
|
replicate-do-db=db2
|
ssl-ca=/usr/local/etc/mysql/mysql.ca.crt.pem
|
ssl-cert=/usr/local/etc/mysql/mysql.server.crt.pem
|
ssl-key=/usr/local/etc/mysql/mysql.server.key.pem
|
|
[mysql]
|
default-character-set=utf8
|
|
[mysqldump]
|
events
|
ignore-table=mysql.events
|
|
[client]
|
default-character-set=utf8
|
ssl-ca=/usr/local/etc/mysql/mysql.ca.crt.pem
|
ssl-cert=/usr/local/etc/mysql/mysql.client.crt.pem
|
ssl-key=/usr/local/etc/mysql/mysql.client.key.pem
|
[user@server2 ~]$ more /usr/local/etc/mysql/my.cnf
|
[mysqld]
|
bind-address=1.1.1.2
|
port=3306
|
#socket=/tmp/mysql.sock
|
#key_buffer_size=16M
|
max_allowed_packet=1G
|
server-id=2
|
log-error=/var/db/mysql/error.log
|
log-bin=mysql-bin
|
expire_logs_days=14
|
relay-log=relay-bin
|
slow_query_log=1
|
slow_query_log_file=slow.log
|
character-set-server=utf8
|
collation-server=utf8_unicode_ci
|
innodb_log_file_size=64M
|
innodb_flush_log_at_trx_commit=1
|
sync_binlog=1
|
auto_increment_increment=2
|
auto_increment_offset=2
|
log-slave-updates
|
#skip-slave-start
|
replicate-do-db=db1
|
replicate-do-db=db2
|
ssl-ca=/usr/local/etc/mysql/mysql.ca.crt.pem
|
ssl-cert=/usr/local/etc/mysql/mysql.server.crt.pem
|
ssl-key=/usr/local/etc/mysql/mysql.server.key.pem
|
|
[mysql]
|
default-character-set=utf8
|
|
[mysqldump]
|
events
|
ignore-table=mysql.events
|
|
[client]
|
default-character-set=utf8
|
ssl-ca=/usr/local/etc/mysql/mysql.ca.crt.pem
|
ssl-cert=/usr/local/etc/mysql/mysql.client.crt.pem
|
ssl-key=/usr/local/etc/mysql/mysql.client.key.pem
|
Error log output:
Server1
|
2017-12-14 0:00:05 34932894976 [ERROR] Error reading packet from server: could not find next log; the first event 'mysql-bin.000004' at 338186520, the last event read from 'mysql-bin.000005' at 13143162, the last byte read from 'mysql-bin.000005' at 13143209. (server_errno=1236)
|
2017-12-14 0:00:05 34932894976 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'could not find next log; the first event 'mysql-bin.000004' at 338186520, the last event read from 'mysql-bin.000005' at 13143162, the last byte read from 'mysql-bin.000005' at 13143209.', Internal MariaDB error code: 1236
|
2017-12-14 0:00:05 34932894976 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000006', position 4
|
Server2
|
2017-12-14 0:00:08 34932903168 [ERROR] Error reading packet from server: could not find next log; the first event 'mysql-bin.000004' at 556969, the last event read from 'mysql-bin.000005' at 35577620, the last byte read from 'mysql-bin.000005' at 35577667. (server_errno=1236)
|
2017-12-14 0:00:08 34932903168 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'could not find next log; the first event 'mysql-bin.000004' at 556969, the last event read from 'mysql-bin.000005' at 35577620, the last byte read from 'mysql-bin.000005' at 35577667.', Internal MariaDB error code: 1236
|
2017-12-14 0:00:08 34932903168 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000006', position 4
|
System log output. This just shows me upgrading, then uninstalling and reinstalling MariaDB.
[user@server1 /var/log]$ sudo more messages | grep maria
|
Dec 11 12:08:49 server1 pkg: mariadb102-client upgraded: 10.2.10_1 -> 10.2.11
|
Dec 11 12:09:06 server1 pkg: mariadb102-server upgraded: 10.2.10 -> 10.2.11
|
Dec 13 16:21:57 server1 pkg: mariadb102-server-10.2.11 deinstalled
|
Dec 13 16:22:19 server1 pkg: mariadb102-client-10.2.11 deinstalled
|
Dec 13 16:26:10 server1 pkg: mariadb102-client-10.2.11 installed
|
Dec 13 16:26:15 server1 pkg: mariadb102-server-10.2.11 installed
|
[user@server2 /var/log]$ sudo more messages | grep maria
|
Dec 11 12:05:23 server2 pkg: mariadb102-client upgraded: 10.2.10_1 -> 10.2.11
|
Dec 11 12:05:28 server2 pkg: mariadb102-server upgraded: 10.2.10 -> 10.2.11
|
Dec 13 17:12:48 server2 pkg: mariadb102-server-10.2.11 deinstalled
|
Dec 13 17:12:49 server2 pkg: mariadb102-client-10.2.11 deinstalled
|
Dec 13 17:13:28 server2 pkg: mariadb102-client-10.2.11 installed
|
Dec 13 17:13:32 server2 pkg: mariadb102-server-10.2.11 installed
|
Ideas?
|
|
Hi, Jason.
It's very good of you that you managed to refine the problem description!
If I understood it correctly your replication setup is actually circular between two servers.
I conclude that seeing 'Slave I/O' errors in either server error logs.
Presence of 'replicate-do-db' complicates the matter additionally.
If I am right about the circular you can help us to help you in providing either server's:
my.cnf, show-slave-status, full error log and the query when available; also
the server messages to the syslog.
As the bad things strike always at midnight I would think of the server event system that may trigger something. Could you please clear this out as well.
Thanks and good luck!
Andrei
|
|
Andrei,
I have added additional information to my previous post. Again, these exact configurations have worked without issue through quite a few 10.2.x upgrades in the past. Only the 10.2.11 upgrade has had any issue.
|
|
A new version of the mariadb102-server package, mariadb102-server-10.2.11_1 was released on FreeBSD and includes a fix for MDEV-14337.
Change log notes can be found here: https://svnweb.freebsd.org/ports/head/databases/mariadb102-server/?view=log
I set up the development environment for another round of testing and so far, everything looks good. Manually running newsyslog, which sends a SIGHUP message to the mariadb process, does not break replication. A 'SHOW BINARY LOGS' command includes all the binary logs that I can see on the file system and at the bottom of the binary log index file. I'm going to let everything run over the weekend for good measure but the MDEV-14337 fix seems to have done the trick.
Note to self: Link related bugs before closing.
|
|
Thanks! 10.2.12 is built and ready (but not reached all mirrors yet) and it includes the fix for MDEV-14337.
If you confirm that the fix is working, we can close this bug as fixed in 10.2.12
|
|
Issue resolution confirmed. Please close this bug.
|