[MDEV-11121] Master - Slave Replication - Duplicate key error Created: 2016-10-24  Updated: 2017-01-23  Resolved: 2017-01-23

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.0.20
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: jisun bae Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: need_feedback
Environment:

CentOS release 6.7 / MariaDB 10.0.20


Attachments: Text File error query.TXT     File hostname1.cnf     File hostname2.cnf     Text File table dictionary.TXT    

 Description   

Our architecture is MariaDB Replication(Master - Slave) and Slave DB is read only mode.

2016 10/20 , We got a error. The error message is below.

161020 16:18:05 [ERROR] Slave SQL: Could not execute Write_rows_v1 event on table dbname.system_sd_sync; Duplicate entry '200001986744' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mariadb-bin.000179, end_log_pos 318838554, Gtid 1-1-110570931, Internal MariaDB error code: 1062
161020 16:18:05 [Warning] Slave: Duplicate entry '200001986744' for key 'PRIMARY' Error_code: 1062
161020 16:18:05 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mariadb-bin.000179' position 318837719

We have not had super privileges except root and replication user.

The primary key of table(system_sd_sync) is a sequence using a funtion & innodb engine table.
Two tables and one function create script, Master-Slave my.cnf files, error query file attach.

Master's data is valid but Slave's data is not valid.
And there is no logging in master's binary log about slave's data.

I don't know what is mean slave's data.



 Comments   
Comment by Elena Stepanova [ 2016-10-24 ]

1) The error simply says that there is a record with cmd_id='200001986744' on the slave already. Please check whether it's so. If it is, please search all master's binary logs for this ID to find where the 2nd occurrence came from.
2) If you haven't found any, it's possible that you've encountered a bug which was fixed in one of recent 10.0 versions (10.0.20 is somewhat old); e.g. MDEV-8302
3) Alternatively, make sure that the slave was not reconfigured on the way, e.g. stopped and then told to start from a previous position which had already been executed, and check the slave's error log for previous errors or warnings.

Comment by jisun bae [ 2016-10-26 ]

Our system is operating after June.
1) There was no master's binary log using next sequence number before error time. And the sequence table is only used user function.
3) If Slave DB had had some transaction (not Master DB transaction), Slave DB have written in slave's binary log.
But slave's binary log was empty file.
I deleted cmd_id='200001986744' record in Slave DB and "stop slave, start slave" commands implemented.
But I received same error message.
##################################################################
161020 23:16:01 [ERROR] Slave SQL: Error 'Duplicate entry '200001986745' for key 'PRIMARY'' on query. Default database: 'dbname'. Query: 'INSERT INTO SYSTEM_SD_SYNC^M
(^M
CMD_ID, CMD, SYNC_PROCESS, SYNC_STATUS, SYNC_DATE,^M
SCHEDULE_MADE, SCHEDULE_ID, START_DATE, END_DATE, WHOLEDAY,^M
`REPEAT`, SCHEDULE_PUBLIC, TITLE, PLACE, CONTENTS,^M
REGISTER_ID, PARTICIPANT, OWA_ID, SYNC_REGISTER, SYNC_REGIST_DATE,^M
SYNC_UPDATER, SYNC_UPDATE_DATE, SCHEDULE_GROUP_ID, SYNC_ID, ICAL_UID,^M
ORIGINAL_START^M
)^M
SELECT FN_NEXT_VAL('SEQ_SYSTEM_OUTLOOK_CMD_ID') , 'D', NAME_CONST('p_SYNC_PROCESS',0), NAME_CONST('p_SYNC_STATUS',0), NAME_CONST('p_SYNC_DATE',NULL),^M
NAME_CONST('p_SCHEDULE_MADE',2), SCHEDULE_ID, START_DATE, END_DATE, WHOLEDAY,^M
`REPEAT`, SCHEDULE_PUBLIC, TITLE, PLACE,
161020 23:16:01 [Warning] Slave: Duplicate entry '200001986745' for key 'PRIMARY' Error_code: 1062
161020 23:16:01 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mariadb-bin.000179' position 318837719
###################################################################

So, I restored backup file on Slave DB.

I want to know the cause.

Comment by Elena Stepanova [ 2016-12-24 ]

Sorry, I don't understand what you mean by "There was no master's binary log using next sequence number before error time. And the sequence table is only used user function.".

For the possible causes, it's impossible to say without analyzing all the binary logs, and as also mentioned before, you might have run onto a bug fixed in a later version (MDEV-8302, fixed in 10.0.21). Have it ever happened again after you restored the slave?

If you keep getting the same problem, it definitely makes sense to upgrade to a recent version to pick up all the fixes that have been made over the last 1.5 years. If you have never had the problem again after restoring from the backup, the more likely reason is an operational error of some sort.

Comment by Sergei Golubchik [ 2017-01-23 ]

No feedback for a month, closing.
Feel free to reopen if you have more info to add.

Generated at Thu Feb 08 07:47:28 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.