[MDEV-20892] AUTO_INCREMENT is set lower than the max value of the primary_key Created: 2019-10-24  Updated: 2024-01-19  Resolved: 2020-06-22

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.1.38, 10.2.27, 10.2.28, 10.3.20
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Brian Sok Assignee: Marko Mäkelä
Resolution: Incomplete Votes: 2
Labels: auto_increment, innodb, mariadb, need_feedback
Environment:

It's running in a 2 nodes galera cluster on mariadb 10.2.27 and debian 9. Both nodes have 16 cores (32 threads) and 196GB ram with 4TB of disk space. One node is the primary node that handles all the connections and the other is the hot standby.

UPDATE: This problem is also happening on 10.3.20 and various server setups


Attachments: File my.cnf    
Issue Links:
Relates
relates to MDEV-12123 Page contains nonzero PAGE_MAX_TRX_ID Closed
relates to MDEV-13094 SHOW CREATE TABLE can report non-pers... Confirmed
relates to MDEV-6076 Persistent AUTO_INCREMENT for InnoDB Closed
relates to MDEV-15400 Upgrading to 10.2 breaks auto increment Closed
relates to MDEV-33277 In-place migration from MySQL 5.7 cau... Stalled

 Description   

In our situation with galera we have a problem that on random moments the `AUTO_INCREMENT` gets set to a lower value than the maximum value of the primary_key of that table. We have a total of ~700 databases with a total size of ~180GB and in about 24 hours of handling requests 20 tables change to having this problem.

For example:
Database testdb1 has two module tables for car parts: module_cars and module_car_parts. The table module_cars has 100 records and has a AUTO_INCREMENT of 102 (+2 cause of the offset in a node 2 cluster). When I check this table 24 hours later the table got changed to still having 100 records but having a AUTO_INCREMENT of 89. This value can vary from 90% to 10% lower than the max value of the primary_key.

I can't find why this is happening. I found https://jira.mariadb.org/browse/MDEV-6076 where there was a problem with handling AUTO_INCREMENT's but that should've be fixed a long time ago.

See attachment for the `my.cnf`

UPDATE: I have upgraded mariadb to 10.3.20 and removed galera from our setup but without any change. Even our other servers without replication and with master-slave replication keep getting this issue where the AUTO_INCREMENT gets set lower then the max value of the primary_key with AUTO_INCREMENT. This causes major data corruption if this is not contained. Galera seems not to be the only issue here.



 Comments   
Comment by Felix Meier [ 2019-11-12 ]

It seems to be the same bug that affects us and Plesk as well

https://support.plesk.com/hc/en-us/articles/115003292185-An-operation-fails-in-Plesk-SQLSTATE-23000-Integrity-constraint-violation-1062-Duplicate-entry-X-for-key-PRIMARY-

Version: 10.2.27-1.el7.centos

It looks like it occurs mainly on well loaded database servers. If we test an unloaded server, AUTO_INCREMENT is OK after a restart. If a server is heavily used, it usually goes wrong.

Currently it looks like, that the value is also sometimes broken without a restart of the database server.

Comment by Brian Sok [ 2019-11-27 ]

Update: I have removed galera from our setup and i also see this problem happening on our other servers with master-slave replication and even without replication. I hope that someone has the time to look at this soon since this causes major data corruption.

Comment by Andrei Elkin [ 2019-11-28 ]

scriptkidi To, this problem happening on our other servers with master-slave replication and even without replication, we need some more detailed description of how to reproduce.

You must be aware of the auto-inc counter can be lowered through "offending" records deletion finalized with ALTER installing a smaller value.

Comment by Brian Sok [ 2019-11-28 ]

@Andrei Elkin I have not succeeded yet in creating a scenario where this can be manually reproduced. This problem where the AUTO_INCREMENT gets set lower then the max PRIMARY_KEY value is happening on random moments and is not specificly occuring on replication servers. So it seems to be a problem in mariadb/innodb itself. For further information: I was able to capture one of those moments where the AUTO_INCREMENT value was set lower. In this case there were 7 INSERT ON DUPLICATE KEY UPDATE queries executed on a table with 20 rows. These 7 queries were executed on the row with primary key 10 with the max value of the primary_key being 20. After these 7 queries the AUTO_INCREMENT was set to 10. But on our servers we execute around 50000 of these queries in a day but the AUTO_INCREMENT is not set lower in every case.

You must be aware of the auto-inc counter can be lowered through "offending" records deletion finalized with ALTER installing a smaller value.

Can you explain this further? When does this "offending" happen and in what scenario can the AUTO_INCREMENT be altered to a lower value?

Comment by Andrei Elkin [ 2019-11-30 ]

scriptkidi I meant a possibility to gain the auto-inc column a lesser value. Something like
DELETE from t; ALTER table t SET auto_increment= 1.

To the other details (thanks for providing!), it feels concurrent IODKU are necessary to reproduce. Before we can attempt catching that but please tell us more to _ I was able to capture one of those moments where the AUTO_INCREMENT value was set lower. In this case there were 7 INSERT ON DUPLICATE KEY UPDATE queries executed on a table with 20 rows. These 7 queries were executed on the row with primary key 10 with the max value of the primary_key being 20_:
1. The exact table def
2. Exact queries
3. auto-inc offset and increment

Comment by Andrei Elkin [ 2019-11-30 ]

alice I've set you to own the ticket while it is not about replication as the user makes it clear.
However you can try simulating its "7-threaded" client once he provides more details I've just requested. I looked around auto-inc bugs to see there are few asserts on {{next_key >= ...}, specifically in the partition engine. So I suggest we just wait for 1-3, construct and run a quick test trying our luck. With no luck feel free to pass the ticket to the server engineering or QA.

Comment by Leon Koens [ 2019-12-27 ]

I ran into this issue as well last week. We are running mariadb 10.4.11 in a Galera set-up with 3 nodes. In stead of lots of databases we run 1 database of about 350 GB with 600 tables. The table in question was a small table which doesn't get changed a lot. Its auto_increment was set to 28 while its maximum id was 36.

As far as I can see in the codebase the only way changes are made to the data in the table are via REPLACE INTO queries, but those hardly ever happen. When I added a 'new' record with the REPLACE INTO query after I noticed the issue (NULL for the id) , the auto_increment on the table increased, but was still under the maximum id. Setting the auto_increment with an ALTER TABLE table AUTO_INCREMENT= fixed the auto_increment value.

I did notice there was a migration the previous day that dropped 5 columns from the table. I haven't been able to reproduce this issue and I haven't seen it happen since.

The mysqldump I did when the auto_increment was wrong also showed it wrong in the produced create table statement:

CREATE TABLE `contentProductAccessoryTree` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `shortName` varchar(255) NOT NULL,
  `image` varchar(255) DEFAULT NULL,
  `isVisible` tinyint(1) NOT NULL DEFAULT 1,
  `hardwareSubType` int(255) DEFAULT NULL,
  `parent` int(11) DEFAULT NULL,
  `lft` int(11) NOT NULL,
  `rgt` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `shortName` (`shortName`),
  KEY `fk_parent` (`parent`)
) ENGINE=InnoDB AUTO_INCREMENT=28 DEFAULT CHARSET=latin1;

Comment by Jon Capcik [ 2020-01-30 ]

I'm not sure if this helps in any way, but I just experienced an issue like this after upgrading from MariaDB 10.0 to 10.3. Due to a poor decision by me in the past, we "reserved" a set of IDs below 10,000 in 4 different tables to allow for "standardizing" that data across hundreds of databases, but allowing the creation of new records above 10,000 that would be treated as unique to that database.

Because of AUTO_INCREMENT not being persistent in 10.0, we "standardized" a record by inserting it with an ID of 9999 so that upon database restarts, it wouldn't reset AUTO_INCREMENT back down to the highest value, so 1 through 9998 was available for us to manage without the application creating records with those IDs. This actually worked just fine for us.

I noticed this week however that quite a few of our databases have an AUTO_INCREMENT value below the max(id). Although there were certainly some oddities mainly on one of our database servers, the consistent theme was that the AUTO_INCREMENT value was set below the max(id) whenever the first record inserted into that table post-upgrade specified an ID below 10,000. So for example we inserted a "standardized" record with ID 800, then next time a record was inserted without specifying an ID it was set to 801. If we inserted a record utilizing the AUTO_INCREMENT, the ID was above the highest value and the AI was correct.

There were a lot of databases on one of our database servers were no records had been created since the upgrade, but the IDs were still below the max(id), yet the AUTO_INCREMENT value did not make sense. An example was an AI value of 9333, yet the highest value below 9999 was in the 8000's, and it had IDs above 10,000.

In the end I just ran alter table [table] auto_increment=1 on all databases to have it recalculate the correct AI value, and I hope I can just chalk this up to an upgrade issue that won't happen in the future. Next step is to get away from "reserving" IDs so that this can't bite us again.

Comment by Brian Sok [ 2020-03-03 ]

Is there any update on this?

Comment by Andrei Elkin [ 2020-03-20 ]

sanja: I had to bounce it to as it's clear auto-increment behaviour in the server. To speed up grasping the problem consider to read through https://jira.mariadb.org/browse/MDEV-20892?focusedCommentId=143383&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-143383.

Comment by Marko Mäkelä [ 2020-04-23 ]

We have had similar reports in the past, for example, MDEV-15400, but sadly without enough details to reproduce the problem.

scriptkidi, jcapcik can you please provide more information:

  1. Does CHECK TABLE report any errors before or after the upgrade?
  2. Can you repeat this problem when the table is created in 10.2 in the first place? MDEV-6076 repurposed a data field that I thought should always have been written as 0.
  3. Show od -Ax -t x1 -N 8 -j 0xc038 tablename.ibd both before and after the upgrade, if possible. That should be the persistent value of the AUTO_INCREMENT field.

If you have some nonzero garbage (unfortunately, old versions of InnoDB wrote uninitialized data to unused fields), it would be interpreted as the AUTO_INCREMENT value.

We have some claims that the AUTO_INCREMENT would be slightly off, but no test case to back those claims. In MDEV-20357 the user seems to have caused corruption by questionable actions ("renamed innodb_logfiles"). In MDEV-6076, the PAGE_MAX_AUTO_INC value in the clustered index root page is being updated atomically in the same mini-transaction that will update the leaf page. It should only be able to get out of sync if data files are being copied in an unsafe way while the server is running, or if there is a bug in Mariabackup, or proper log-based recovery is being prevented.

Comment by Marko Mäkelä [ 2020-04-23 ]

One more idea: Could everything be working as designed, but you are confused due to MDEV-13094?

Comment by Marko Mäkelä [ 2024-01-19 ]

A possible explanation of this bug could be that the adjustment MDEV-12123 for MDEV-6076 was not present when the data files were originally written. The commit message mentions ALTER TABLE…IMPORT TABLESPACE, but similar issues would occur when upgrading.

I think that a possible work-around could be to execute

ALTER TABLE t AUTO_INCREMENT=1;

on every InnoDB table that contains AUTO_INCREMENT columns.

Generated at Thu Feb 08 09:02:59 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.