[MDEV-30046] wrong row targeted with "insert ... on duplicate" and "replace", leading to data corruption - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.5.15, 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL)
Fix Version/s: 10.5.25, 10.11.8, 10.6.18, 11.0.6, 11.1.5, 11.2.4, 11.4.2
Component/s: Data Manipulation - Insert
Labels:
None
Environment:
Linux 5.10.0-14-amd64, Debian 11.5

Description

(Also reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293)

Using the MySQL interface, these statements:

DROP TABLE IF EXISTS t;

CREATE TABLE t (s BLOB, n INT, UNIQUE (s));

INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);

INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES (n);

SELECT * FROM t;

produce this output:

s n
Hrecvx_0004ln-00 2
Hrecvx_0004mm-00 1

So the latter "INSERT" updates the wrong row.

This happens whether the first column is "BLOB" or "TEXT", but only
with specific values. (In my actual use case with ~1 million rows,
it happened a few dozen times, which might be consistent e.g. with
collisions of a 32 bit hash or so.)

Likewise, these statements:

DROP TABLE IF EXISTS t;

CREATE TABLE t (s BLOB, n INT, UNIQUE (s));

INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);

REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);

SELECT * FROM t;

give the error:

ERROR 1062 (23000) at line 4: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

In my understanding, this error should actually be impossible with
"REPLACE INTO".

It might be the same issue, i.e. it tries to delete the wrong row
before inserting the new one, so it's still duplicate.

Attachments

Issue Links

is duplicated by

MDEV-30588 Failed to update duplicate data when running insert... on duplicate key update.

Closed

relates to

MDEV-371 Unique indexes for blobs

Closed

MDEV-17395 REPLACE/INSERT ODKU: support WITHOUT OVERLAPS

Stalled

MDEV-18748 REPLACE doesn't work with unique blobs on MyISAM table

Closed

MDEV-31093 "ON DUPLICATE KEY UPDATE" saves wrong data to the database

Open

split to

MDEV-34091 Refactor write_record and generalize the code with replication

Stalled

(1 split to)

Activity

Ascending order - Click to sort in descending order

Frank Heckenbach created issue - 2022-11-21 05:19

Alice Sherepa made changes - 2022-11-21 11:02

Field	Original Value	New Value
Link		This issue relates to ~~MDEV-371~~ [ ~~MDEV-371~~ ]

Alice Sherepa made changes - 2022-11-21 11:07

Link

This issue relates to ~~MDEV-23264~~ [ ~~MDEV-23264~~ ]

Alice Sherepa made changes - 2022-11-21 11:12

Link

This issue relates to ~~MDEV-23264~~ [ ~~MDEV-23264~~ ]

Alice Sherepa made changes - 2022-11-21 11:12

Affects Version/s		10.4 [ 22408 ]
Affects Version/s		10.5 [ 23123 ]
Affects Version/s		10.6 [ 24028 ]
Affects Version/s		10.7 [ 24805 ]
Affects Version/s		10.8 [ 26121 ]
Affects Version/s		10.9 [ 26905 ]
Affects Version/s		10.10 [ 27530 ]

Alice Sherepa made changes - 2022-11-21 11:33

Fix Version/s		10.4 [ 22408 ]
Fix Version/s		10.5 [ 23123 ]
Fix Version/s		10.6 [ 24028 ]
Fix Version/s		10.7 [ 24805 ]
Fix Version/s		10.8 [ 26121 ]
Fix Version/s		10.9 [ 26905 ]

Alice Sherepa added a comment - 2022-11-21 11:43 - edited

Thanks for the report! I repeated as described on 10.4-10.11, InnoDB. Probably caused by ~~MDEV-371~~

--source include/have_innodb.inc

CREATE TABLE t (s BLOB, n INT, UNIQUE (s)) engine=innodb;

INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);

REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);

10.4 0d586d62e5326053383
CURRENT_TEST: main.1_my
mysqltest: At line 9: query 'REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2)' failed: 1062: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

CREATE TABLE t (s BLOB, n INT, UNIQUE (s)) engine=innodb;

INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);

INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES(n) ;

SELECT * FROM t;

SELECT * FROM t;

s	n

Hrecvx_0004ln-00	2

Hrecvx_0004mm-00	1

Alice Sherepa added a comment - 2022-11-21 11:43 - edited Thanks for the report! I repeated as described on 10.4-10.11, InnoDB. Probably caused by MDEV-371 --source include/have_innodb.inc CREATE TABLE t (s BLOB, n INT , UNIQUE (s)) engine=innodb; INSERT INTO t VALUES ( 'Hrecvx_0004ln-00' ,1), ( 'Hrecvx_0004mm-00' ,1); REPLACE INTO t VALUES ( 'Hrecvx_0004mm-00' ,2); 10.4 0d586d62e5326053383 CURRENT_TEST: main.1_my mysqltest: At line 9: query 'REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2)' failed: 1062: Duplicate entry 'Hrecvx_0004mm-00' for key 's' CREATE TABLE t (s BLOB, n INT , UNIQUE (s)) engine=innodb; INSERT INTO t VALUES ( 'Hrecvx_0004ln-00' ,1), ( 'Hrecvx_0004mm-00' ,1); INSERT INTO t VALUES ( 'Hrecvx_0004mm-00' ,2) ON DUPLICATE KEY UPDATE n = VALUES (n) ; SELECT * FROM t; SELECT * FROM t; s n Hrecvx_0004ln-00 2 Hrecvx_0004mm-00 1

Alice Sherepa made changes - 2022-11-21 11:43

Assignee

Oleksandr Byelkin [ sanja ]

Alice Sherepa made changes - 2022-11-21 11:43

Status

Open [ 1 ]

Confirmed [ 10101 ]

Alice Sherepa made changes - 2022-11-21 11:47

Priority

Major [ 3 ]

Critical [ 2 ]

Sergei Golubchik made changes - 2022-11-21 11:49

Description

(Also reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293)

Using the MySQL interface, these statements:

DROP TABLE IF EXISTS t;
CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES (n);
SELECT * FROM t;

produce this output:

s n
Hrecvx_0004ln-00 2
Hrecvx_0004mm-00 1

So the latter "INSERT" updates the wrong row.

This happens whether the first column is "BLOB" or "TEXT", but only
with specific values. (In my actual use case with ~1 million rows,
it happened a few dozen times, which might be consistent e.g. with
collisions of a 32 bit hash or so.)

Likewise, these statements:

DROP TABLE IF EXISTS t;
CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);
SELECT * FROM t;

give the error:

ERROR 1062 (23000) at line 4: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

In my understanding, this error should actually be impossible with
"REPLACE INTO".

It might be the same issue, i.e. it tries to delete the wrong row
before inserting the new one, so it's still duplicate.

(Also reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293)

Using the MySQL interface, these statements:
{code:sql}
DROP TABLE IF EXISTS t;
CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES (n);
SELECT * FROM t;
{code}
produce this output:

s n
Hrecvx_0004ln-00 2
Hrecvx_0004mm-00 1

So the latter "INSERT" updates the wrong row.

This happens whether the first column is "BLOB" or "TEXT", but only
with specific values. (In my actual use case with ~1 million rows,
it happened a few dozen times, which might be consistent e.g. with
collisions of a 32 bit hash or so.)

Likewise, these statements:

DROP TABLE IF EXISTS t;
CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);
SELECT * FROM t;

give the error:

ERROR 1062 (23000) at line 4: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

In my understanding, this error should actually be impossible with
"REPLACE INTO".

It might be the same issue, i.e. it tries to delete the wrong row
before inserting the new one, so it's still duplicate.

Sergei Golubchik made changes - 2022-11-21 11:50

Description

(Also reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293)

Using the MySQL interface, these statements:
{code:sql}
DROP TABLE IF EXISTS t;
CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES (n);
SELECT * FROM t;
{code}
produce this output:

s n
Hrecvx_0004ln-00 2
Hrecvx_0004mm-00 1

So the latter "INSERT" updates the wrong row.

This happens whether the first column is "BLOB" or "TEXT", but only
with specific values. (In my actual use case with ~1 million rows,
it happened a few dozen times, which might be consistent e.g. with
collisions of a 32 bit hash or so.)

Likewise, these statements:
{code:sql}
DROP TABLE IF EXISTS t;
CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);
SELECT * FROM t;
{code}
give the error:

ERROR 1062 (23000) at line 4: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

In my understanding, this error should actually be impossible with
"REPLACE INTO".

It might be the same issue, i.e. it tries to delete the wrong row
before inserting the new one, so it's still duplicate.

Sergei Golubchik made changes - 2022-11-21 11:56

Assignee

Oleksandr Byelkin [ sanja ]

Alexander Barkov [ bar ]

Alexander Barkov added a comment - 2022-12-16 07:17

Works fine with MyISAM:

CREATE OR REPLACE TABLE t1 (s BLOB, n INT, UNIQUE (s)) engine=MyISAM;

INSERT INTO t1 VALUES  ('Hrecvx_0004ln-00',1),

                       ('Hrecvx_0004mm-00',1);

REPLACE INTO t1 VALUES ('Hrecvx_0004mm-00',2);

SELECT * FROM t1;

+------------------+------+

| s                | n    |

+------------------+------+

| Hrecvx_0004ln-00 |    1 |

| Hrecvx_0004mm-00 |    2 |

+------------------+------+

CREATE OR REPLACE TABLE t1 (s BLOB, n INT, UNIQUE (s)) engine=MyISAM;

INSERT INTO t1 VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);

INSERT INTO t1 VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUE(n) ;

SELECT * FROM t1;

+------------------+------+

| s                | n    |

+------------------+------+

| Hrecvx_0004ln-00 |    1 |

| Hrecvx_0004mm-00 |    2 |

+------------------+------+

Alexander Barkov added a comment - 2022-12-16 07:17 Works fine with MyISAM: CREATE OR REPLACE TABLE t1 (s BLOB, n INT , UNIQUE (s)) engine=MyISAM; INSERT INTO t1 VALUES ( 'Hrecvx_0004ln-00' ,1), ( 'Hrecvx_0004mm-00' ,1); REPLACE INTO t1 VALUES ( 'Hrecvx_0004mm-00' ,2); SELECT * FROM t1; +------------------+------+ | s | n | +------------------+------+ | Hrecvx_0004ln-00 | 1 | | Hrecvx_0004mm-00 | 2 | +------------------+------+ CREATE OR REPLACE TABLE t1 (s BLOB, n INT , UNIQUE (s)) engine=MyISAM; INSERT INTO t1 VALUES ( 'Hrecvx_0004ln-00' ,1), ( 'Hrecvx_0004mm-00' ,1); INSERT INTO t1 VALUES ( 'Hrecvx_0004mm-00' ,2) ON DUPLICATE KEY UPDATE n = VALUE(n) ; SELECT * FROM t1; +------------------+------+ | s | n | +------------------+------+ | Hrecvx_0004ln-00 | 1 | | Hrecvx_0004mm-00 | 2 | +------------------+------+

Alexander Barkov made changes - 2022-12-16 08:08

Link

This issue relates to ~~MDEV-18748~~ [ ~~MDEV-18748~~ ]

Alexander Barkov made changes - 2022-12-20 11:25

Assignee

Alexander Barkov [ bar ]

Nikita Malyavin [ nikitamalyavin ]

Nikita Malyavin made changes - 2022-12-26 15:00

Status

Confirmed [ 10101 ]

In Progress [ 3 ]

Nikita Malyavin made changes - 2022-12-26 21:27

Link

This issue relates to MDEV-17395 [ MDEV-17395 ]

Nikita Malyavin added a comment - 2023-01-09 18:29

Thanks bar for the pointers, the bug with IDEMPOTENT replication was found with long uniques, which uses the same logic as REPLACE, and even contains the copy-paste from there. I guess it was made so to allocate and reuse the key buffer memory on the stack, saving from extra malloc. I think our priority now is to minimize stack usage vs extra mallocs, so it's not the point anymore.

I made some refactoring and extracted common code. Also moved handler's RND search initialization to Write_rows_log_event::do_before_row_operations. This fixed many long unique bugs in relplication and optimized the use a little bit.

https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93

One more refactoring should be done to generalize handler initialization (and de-initialization) across REPLACE, LOAD DATA, IDEMPOTENT replication, and improve memory usage.

Nikita Malyavin added a comment - 2023-01-09 18:29 Thanks bar for the pointers, the bug with IDEMPOTENT replication was found with long uniques, which uses the same logic as REPLACE, and even contains the copy-paste from there. I guess it was made so to allocate and reuse the key buffer memory on the stack, saving from extra malloc. I think our priority now is to minimize stack usage vs extra mallocs, so it's not the point anymore. I made some refactoring and extracted common code. Also moved handler's RND search initialization to Write_rows_log_event::do_before_row_operations . This fixed many long unique bugs in relplication and optimized the use a little bit. https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93 One more refactoring should be done to generalize handler initialization (and de-initialization) across REPLACE, LOAD DATA, IDEMPOTENT replication, and improve memory usage.

Nikita Malyavin made changes - 2023-01-09 18:29

Status

In Progress [ 3 ]

Stalled [ 10000 ]

Nikita Malyavin made changes - 2023-01-12 22:18

Comment

[ Thanks [~bar] for the pointers, the bug with IDEMPOTENT replication was found with long uniques, which uses the same logic as REPLACE, and even contains the copy-paste from there. I guess it was made so to allocate and reuse the key buffer memory on the stack, saving from extra malloc. I think our priority now is to minimize stack usage vs extra mallocs, so it's not the point anymore.

I made some refactoring and extracted common code. Also moved handler's RND search initialization to {{Write_rows_log_event::do_before_row_operations}}. This fixed many long unique bugs in relplication and optimized the use a little bit.

https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93

One more refactoring should be done to generalize handler initialization (and de-initialization) across REPLACE, LOAD DATA, IDEMPOTENT replication, and improve memory usage. ]

Alice Sherepa made changes - 2023-02-07 14:44

Link

This issue is duplicated by ~~MDEV-30588~~ [ ~~MDEV-30588~~ ]

Julien Fritsch made changes - 2023-03-03 17:35

Fix Version/s

10.7 [ 24805 ]

Alice Sherepa made changes - 2023-04-20 09:27

Link

This issue relates to MDEV-31093 [ MDEV-31093 ]

Julien Fritsch made changes - 2023-04-27 14:46

Fix Version/s

10.8 [ 26121 ]

Julien Fritsch made changes - 2023-11-28 10:39

Fix Version/s

10.9 [ 26905 ]

Otto Kekäläinen added a comment - 2023-12-01 04:58

The commit https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93 was never in a pull request nor included on any branch, and `git log -S ~~MDEV-30046~~` does not yield any results from 10.5 branch, so I assume this issue is still open (and thus will also continue to keep https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293 open).

Otto Kekäläinen added a comment - 2023-12-01 04:58 The commit https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93 was never in a pull request nor included on any branch, and `git log -S MDEV-30046 ` does not yield any results from 10.5 branch, so I assume this issue is still open (and thus will also continue to keep https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293 open).

Nikita Malyavin made changes - 2024-01-02 23:11

Status

Stalled [ 10000 ]

In Progress [ 3 ]

Nikita Malyavin made changes - 2024-01-03 12:27

Status

In Progress [ 3 ]

Stalled [ 10000 ]

Nikita Malyavin made changes - 2024-03-20 18:08

Status

Stalled [ 10000 ]

In Progress [ 3 ]

Nikita Malyavin made changes - 2024-04-26 16:27

Status

In Progress [ 3 ]

In Testing [ 10301 ]

Nikita Malyavin made changes - 2024-04-26 16:27

Status

In Testing [ 10301 ]

Stalled [ 10000 ]

Nikita Malyavin added a comment - 2024-04-26 18:05

The refactoring took a significant amount of time, but I think it will be very beneficial.

We had two implementations of the replace behavior, one is usual, and another one is an idempotent insert. The latter was many fixes behind the primary one.

First, the implementations should have been joined. I didn't stop only on that, though.
I also reworked the implementation to a better readability, as well as, as I believe, maintainability.
Also I have optimized it in several places.

The commit message will bring the better detail.

serg, please review commits the following commits

013fbe19 MDEV-30046 Refactor write_record and fix idempotent replication

aae554b1 MDEV-30046 wrong row targeted with "insert ... on duplicate" and "replace"

Or take a look at the same-titled commits at branch bb-10.5-nikita-MDEV-30046.

They are best to view with commit

9f2ef5d4 MDEV-15990 REPLACE on a precise-versioned table returns ER_DUP_ENTRY

as it further simplifies the REPLACE implementation, however it goes for review to midenok. Anyway, comments are welcome.

Nikita Malyavin added a comment - 2024-04-26 18:05 The refactoring took a significant amount of time, but I think it will be very beneficial. We had two implementations of the replace behavior, one is usual, and another one is an idempotent insert. The latter was many fixes behind the primary one. First, the implementations should have been joined. I didn't stop only on that, though. I also reworked the implementation to a better readability, as well as, as I believe, maintainability. Also I have optimized it in several places. The commit message will bring the better detail. serg , please review commits the following commits 013fbe19 MDEV-30046 Refactor write_record and fix idempotent replication aae554b1 MDEV-30046 wrong row targeted with "insert ... on duplicate" and "replace" Or take a look at the same-titled commits at branch bb-10.5-nikita- MDEV-30046 . They are best to view with commit 9f2ef5d4 MDEV-15990 REPLACE on a precise-versioned table returns ER_DUP_ENTRY as it further simplifies the REPLACE implementation, however it goes for review to midenok . Anyway, comments are welcome.

Nikita Malyavin made changes - 2024-04-26 18:05

Assignee	Nikita Malyavin [ nikitamalyavin ]	Sergei Golubchik [ serg ]
Status	Stalled [ 10000 ]	In Review [ 10002 ]

Sergei Golubchik made changes - 2024-05-01 20:30

Assignee	Sergei Golubchik [ serg ]	Nikita Malyavin [ nikitamalyavin ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Sergei Golubchik added a comment - 2024-05-04 09:43

https://github.com/MariaDB/server/commit/972f6941df8b09b523f7f14d2465cee39a98fecb is ok to push, preferably with a clearer comment in sql_insert.cc.

The refactoring commit is generally moving in the right direction, but it won't be ready before the release, so I don't think the actual bug fix should wait for it.

Sergei Golubchik added a comment - 2024-05-04 09:43 https://github.com/MariaDB/server/commit/972f6941df8b09b523f7f14d2465cee39a98fecb is ok to push, preferably with a clearer comment in sql_insert.cc . The refactoring commit is generally moving in the right direction, but it won't be ready before the release, so I don't think the actual bug fix should wait for it.

Nikita Malyavin made changes - 2024-05-05 17:12

Link

This issue split to MDEV-34091 [ MDEV-34091 ]

Nikita Malyavin made changes - 2024-05-05 17:18

Fix Version/s		10.5.25 [ 29626 ]
Fix Version/s	10.4 [ 22408 ]
Fix Version/s	10.5 [ 23123 ]
Fix Version/s	10.6 [ 24028 ]
Resolution		Fixed [ 1 ]
Status	Stalled [ 10000 ]	Closed [ 6 ]

JiraAutomate made changes - 2024-05-05 17:18

Fix Version/s		10.6.18 [ 29627 ]
Fix Version/s		10.11.8 [ 29630 ]
Fix Version/s		11.0.6 [ 29628 ]
Fix Version/s		11.1.5 [ 29629 ]
Fix Version/s		11.2.4 [ 29631 ]
Fix Version/s		11.4.2 [ 29633 ]

Otto Kekäläinen added a comment - 2025-02-18 06:22

For the record, this was merged in as https://github.com/MariaDB/server/commit/72429cad7f5448ad680d8df7b55aa055462d5964 (not visible in this Jira)

Otto Kekäläinen added a comment - 2025-02-18 06:22 For the record, this was merged in as https://github.com/MariaDB/server/commit/72429cad7f5448ad680d8df7b55aa055462d5964 (not visible in this Jira)

People

Assignee:: Nikita Malyavin

Reporter:: Frank Heckenbach

Votes:: 0 Vote for this issue

Watchers:: 6 Start watching this issue

Dates

Created:: 2022-11-21 05:19

Updated:: 2025-02-18 06:22

Resolved:: 2024-05-05 17:18

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Issue Links

Activity

People

Dates

Git Integration