Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30046

wrong row targeted with "insert ... on duplicate" and "replace", leading to data corruption

Details

    Description

      (Also reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293)

      Using the MySQL interface, these statements:

      DROP TABLE IF EXISTS t;
      CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
      INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
      INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES (n);
      SELECT * FROM t;
      

      produce this output:

      s n
      Hrecvx_0004ln-00 2
      Hrecvx_0004mm-00 1

      So the latter "INSERT" updates the wrong row.

      This happens whether the first column is "BLOB" or "TEXT", but only
      with specific values. (In my actual use case with ~1 million rows,
      it happened a few dozen times, which might be consistent e.g. with
      collisions of a 32 bit hash or so.)

      Likewise, these statements:

      DROP TABLE IF EXISTS t;
      CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
      INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
      REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);
      SELECT * FROM t;
      

      give the error:

      ERROR 1062 (23000) at line 4: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

      In my understanding, this error should actually be impossible with
      "REPLACE INTO".

      It might be the same issue, i.e. it tries to delete the wrong row
      before inserting the new one, so it's still duplicate.

      Attachments

        Issue Links

          Activity

            fh Frank Heckenbach created issue -
            alice Alice Sherepa made changes -
            Field Original Value New Value
            alice Alice Sherepa made changes -
            alice Alice Sherepa made changes -
            alice Alice Sherepa made changes -
            Affects Version/s 10.4 [ 22408 ]
            Affects Version/s 10.5 [ 23123 ]
            Affects Version/s 10.6 [ 24028 ]
            Affects Version/s 10.7 [ 24805 ]
            Affects Version/s 10.8 [ 26121 ]
            Affects Version/s 10.9 [ 26905 ]
            Affects Version/s 10.10 [ 27530 ]
            alice Alice Sherepa made changes -
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]
            Fix Version/s 10.7 [ 24805 ]
            Fix Version/s 10.8 [ 26121 ]
            Fix Version/s 10.9 [ 26905 ]
            alice Alice Sherepa added a comment - - edited

            Thanks for the report! I repeated as described on 10.4-10.11, InnoDB. Probably caused by MDEV-371

            --source include/have_innodb.inc 
            CREATE TABLE t (s BLOB, n INT, UNIQUE (s)) engine=innodb;
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);
            

            10.4 0d586d62e5326053383

            CURRENT_TEST: main.1_my
            mysqltest: At line 9: query 'REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2)' failed: 1062: Duplicate entry 'Hrecvx_0004mm-00' for key 's'
            

            CREATE TABLE t (s BLOB, n INT, UNIQUE (s)) engine=innodb;
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES(n) ;
            SELECT * FROM t;
            

            SELECT * FROM t;
            s	n
            Hrecvx_0004ln-00	2
            Hrecvx_0004mm-00	1
            

            alice Alice Sherepa added a comment - - edited Thanks for the report! I repeated as described on 10.4-10.11, InnoDB. Probably caused by MDEV-371 --source include/have_innodb.inc CREATE TABLE t (s BLOB, n INT , UNIQUE (s)) engine=innodb; INSERT INTO t VALUES ( 'Hrecvx_0004ln-00' ,1), ( 'Hrecvx_0004mm-00' ,1); REPLACE INTO t VALUES ( 'Hrecvx_0004mm-00' ,2); 10.4 0d586d62e5326053383 CURRENT_TEST: main.1_my mysqltest: At line 9: query 'REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2)' failed: 1062: Duplicate entry 'Hrecvx_0004mm-00' for key 's' CREATE TABLE t (s BLOB, n INT , UNIQUE (s)) engine=innodb; INSERT INTO t VALUES ( 'Hrecvx_0004ln-00' ,1), ( 'Hrecvx_0004mm-00' ,1); INSERT INTO t VALUES ( 'Hrecvx_0004mm-00' ,2) ON DUPLICATE KEY UPDATE n = VALUES (n) ; SELECT * FROM t; SELECT * FROM t; s n Hrecvx_0004ln-00 2 Hrecvx_0004mm-00 1
            alice Alice Sherepa made changes -
            Assignee Oleksandr Byelkin [ sanja ]
            alice Alice Sherepa made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            alice Alice Sherepa made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            serg Sergei Golubchik made changes -
            Description (Also reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293)

            Using the MySQL interface, these statements:

            DROP TABLE IF EXISTS t;
            CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES (n);
            SELECT * FROM t;

            produce this output:

            s n
            Hrecvx_0004ln-00 2
            Hrecvx_0004mm-00 1

            So the latter "INSERT" updates the wrong row.

            This happens whether the first column is "BLOB" or "TEXT", but only
            with specific values. (In my actual use case with ~1 million rows,
            it happened a few dozen times, which might be consistent e.g. with
            collisions of a 32 bit hash or so.)

            Likewise, these statements:

            DROP TABLE IF EXISTS t;
            CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);
            SELECT * FROM t;

            give the error:

            ERROR 1062 (23000) at line 4: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

            In my understanding, this error should actually be impossible with
            "REPLACE INTO".

            It might be the same issue, i.e. it tries to delete the wrong row
            before inserting the new one, so it's still duplicate.
            (Also reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293)

            Using the MySQL interface, these statements:
            {code:sql}
            DROP TABLE IF EXISTS t;
            CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES (n);
            SELECT * FROM t;
            {code}
            produce this output:

            s n
            Hrecvx_0004ln-00 2
            Hrecvx_0004mm-00 1

            So the latter "INSERT" updates the wrong row.

            This happens whether the first column is "BLOB" or "TEXT", but only
            with specific values. (In my actual use case with ~1 million rows,
            it happened a few dozen times, which might be consistent e.g. with
            collisions of a 32 bit hash or so.)

            Likewise, these statements:

            DROP TABLE IF EXISTS t;
            CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);
            SELECT * FROM t;

            give the error:

            ERROR 1062 (23000) at line 4: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

            In my understanding, this error should actually be impossible with
            "REPLACE INTO".

            It might be the same issue, i.e. it tries to delete the wrong row
            before inserting the new one, so it's still duplicate.
            serg Sergei Golubchik made changes -
            Description (Also reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293)

            Using the MySQL interface, these statements:
            {code:sql}
            DROP TABLE IF EXISTS t;
            CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES (n);
            SELECT * FROM t;
            {code}
            produce this output:

            s n
            Hrecvx_0004ln-00 2
            Hrecvx_0004mm-00 1

            So the latter "INSERT" updates the wrong row.

            This happens whether the first column is "BLOB" or "TEXT", but only
            with specific values. (In my actual use case with ~1 million rows,
            it happened a few dozen times, which might be consistent e.g. with
            collisions of a 32 bit hash or so.)

            Likewise, these statements:

            DROP TABLE IF EXISTS t;
            CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);
            SELECT * FROM t;

            give the error:

            ERROR 1062 (23000) at line 4: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

            In my understanding, this error should actually be impossible with
            "REPLACE INTO".

            It might be the same issue, i.e. it tries to delete the wrong row
            before inserting the new one, so it's still duplicate.
            (Also reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293)

            Using the MySQL interface, these statements:
            {code:sql}
            DROP TABLE IF EXISTS t;
            CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            INSERT INTO t VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUES (n);
            SELECT * FROM t;
            {code}
            produce this output:

            s n
            Hrecvx_0004ln-00 2
            Hrecvx_0004mm-00 1

            So the latter "INSERT" updates the wrong row.

            This happens whether the first column is "BLOB" or "TEXT", but only
            with specific values. (In my actual use case with ~1 million rows,
            it happened a few dozen times, which might be consistent e.g. with
            collisions of a 32 bit hash or so.)

            Likewise, these statements:
            {code:sql}
            DROP TABLE IF EXISTS t;
            CREATE TABLE t (s BLOB, n INT, UNIQUE (s));
            INSERT INTO t VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            REPLACE INTO t VALUES ('Hrecvx_0004mm-00',2);
            SELECT * FROM t;
            {code}
            give the error:

            ERROR 1062 (23000) at line 4: Duplicate entry 'Hrecvx_0004mm-00' for key 's'

            In my understanding, this error should actually be impossible with
            "REPLACE INTO".

            It might be the same issue, i.e. it tries to delete the wrong row
            before inserting the new one, so it's still duplicate.
            serg Sergei Golubchik made changes -
            Assignee Oleksandr Byelkin [ sanja ] Alexander Barkov [ bar ]

            Works fine with MyISAM:

            CREATE OR REPLACE TABLE t1 (s BLOB, n INT, UNIQUE (s)) engine=MyISAM;
            INSERT INTO t1 VALUES  ('Hrecvx_0004ln-00',1),
                                   ('Hrecvx_0004mm-00',1);
            REPLACE INTO t1 VALUES ('Hrecvx_0004mm-00',2);
            SELECT * FROM t1;
            

            +------------------+------+
            | s                | n    |
            +------------------+------+
            | Hrecvx_0004ln-00 |    1 |
            | Hrecvx_0004mm-00 |    2 |
            +------------------+------+
            

            CREATE OR REPLACE TABLE t1 (s BLOB, n INT, UNIQUE (s)) engine=MyISAM;
            INSERT INTO t1 VALUES ('Hrecvx_0004ln-00',1), ('Hrecvx_0004mm-00',1);
            INSERT INTO t1 VALUES ('Hrecvx_0004mm-00',2) ON DUPLICATE KEY UPDATE n = VALUE(n) ;
            SELECT * FROM t1;
            

            +------------------+------+
            | s                | n    |
            +------------------+------+
            | Hrecvx_0004ln-00 |    1 |
            | Hrecvx_0004mm-00 |    2 |
            +------------------+------+
            

            bar Alexander Barkov added a comment - Works fine with MyISAM: CREATE OR REPLACE TABLE t1 (s BLOB, n INT , UNIQUE (s)) engine=MyISAM; INSERT INTO t1 VALUES ( 'Hrecvx_0004ln-00' ,1), ( 'Hrecvx_0004mm-00' ,1); REPLACE INTO t1 VALUES ( 'Hrecvx_0004mm-00' ,2); SELECT * FROM t1; +------------------+------+ | s | n | +------------------+------+ | Hrecvx_0004ln-00 | 1 | | Hrecvx_0004mm-00 | 2 | +------------------+------+ CREATE OR REPLACE TABLE t1 (s BLOB, n INT , UNIQUE (s)) engine=MyISAM; INSERT INTO t1 VALUES ( 'Hrecvx_0004ln-00' ,1), ( 'Hrecvx_0004mm-00' ,1); INSERT INTO t1 VALUES ( 'Hrecvx_0004mm-00' ,2) ON DUPLICATE KEY UPDATE n = VALUE(n) ; SELECT * FROM t1; +------------------+------+ | s | n | +------------------+------+ | Hrecvx_0004ln-00 | 1 | | Hrecvx_0004mm-00 | 2 | +------------------+------+
            bar Alexander Barkov made changes -
            bar Alexander Barkov made changes -
            Assignee Alexander Barkov [ bar ] Nikita Malyavin [ nikitamalyavin ]
            nikitamalyavin Nikita Malyavin made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]
            nikitamalyavin Nikita Malyavin made changes -

            Thanks bar for the pointers, the bug with IDEMPOTENT replication was found with long uniques, which uses the same logic as REPLACE, and even contains the copy-paste from there. I guess it was made so to allocate and reuse the key buffer memory on the stack, saving from extra malloc. I think our priority now is to minimize stack usage vs extra mallocs, so it's not the point anymore.

            I made some refactoring and extracted common code. Also moved handler's RND search initialization to Write_rows_log_event::do_before_row_operations. This fixed many long unique bugs in relplication and optimized the use a little bit.

            https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93

            One more refactoring should be done to generalize handler initialization (and de-initialization) across REPLACE, LOAD DATA, IDEMPOTENT replication, and improve memory usage.

            nikitamalyavin Nikita Malyavin added a comment - Thanks bar for the pointers, the bug with IDEMPOTENT replication was found with long uniques, which uses the same logic as REPLACE, and even contains the copy-paste from there. I guess it was made so to allocate and reuse the key buffer memory on the stack, saving from extra malloc. I think our priority now is to minimize stack usage vs extra mallocs, so it's not the point anymore. I made some refactoring and extracted common code. Also moved handler's RND search initialization to Write_rows_log_event::do_before_row_operations . This fixed many long unique bugs in relplication and optimized the use a little bit. https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93 One more refactoring should be done to generalize handler initialization (and de-initialization) across REPLACE, LOAD DATA, IDEMPOTENT replication, and improve memory usage.
            nikitamalyavin Nikita Malyavin made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]
            nikitamalyavin Nikita Malyavin made changes -
            Comment [ Thanks [~bar] for the pointers, the bug with IDEMPOTENT replication was found with long uniques, which uses the same logic as REPLACE, and even contains the copy-paste from there. I guess it was made so to allocate and reuse the key buffer memory on the stack, saving from extra malloc. I think our priority now is to minimize stack usage vs extra mallocs, so it's not the point anymore.

            I made some refactoring and extracted common code. Also moved handler's RND search initialization to {{Write_rows_log_event::do_before_row_operations}}. This fixed many long unique bugs in relplication and optimized the use a little bit.

            https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93

            One more refactoring should be done to generalize handler initialization (and de-initialization) across REPLACE, LOAD DATA, IDEMPOTENT replication, and improve memory usage. ]
            alice Alice Sherepa made changes -
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.7 [ 24805 ]
            alice Alice Sherepa made changes -
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.8 [ 26121 ]
            julien.fritsch Julien Fritsch made changes -
            Fix Version/s 10.9 [ 26905 ]

            The commit https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93 was never in a pull request nor included on any branch, and `git log -S MDEV-30046` does not yield any results from 10.5 branch, so I assume this issue is still open (and thus will also continue to keep https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293 open).

            otto Otto Kekäläinen added a comment - The commit https://github.com/MariaDB/server/commit/703e73e221a42638f2f05379124b35c57482da93 was never in a pull request nor included on any branch, and `git log -S MDEV-30046 ` does not yield any results from 10.5 branch, so I assume this issue is still open (and thus will also continue to keep https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015293 open).
            nikitamalyavin Nikita Malyavin made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            nikitamalyavin Nikita Malyavin made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]
            nikitamalyavin Nikita Malyavin made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            nikitamalyavin Nikita Malyavin made changes -
            Status In Progress [ 3 ] In Testing [ 10301 ]
            nikitamalyavin Nikita Malyavin made changes -
            Status In Testing [ 10301 ] Stalled [ 10000 ]

            The refactoring took a significant amount of time, but I think it will be very beneficial.

            We had two implementations of the replace behavior, one is usual, and another one is an idempotent insert. The latter was many fixes behind the primary one.

            First, the implementations should have been joined. I didn't stop only on that, though.
            I also reworked the implementation to a better readability, as well as, as I believe, maintainability.
            Also I have optimized it in several places.

            The commit message will bring the better detail.

            serg, please review commits the following commits

            013fbe19 MDEV-30046 Refactor write_record and fix idempotent replication
            aae554b1 MDEV-30046 wrong row targeted with "insert ... on duplicate" and "replace"
            

            Or take a look at the same-titled commits at branch bb-10.5-nikita-MDEV-30046.

            They are best to view with commit

            9f2ef5d4 MDEV-15990 REPLACE on a precise-versioned table returns ER_DUP_ENTRY
            

            as it further simplifies the REPLACE implementation, however it goes for review to midenok. Anyway, comments are welcome.

            nikitamalyavin Nikita Malyavin added a comment - The refactoring took a significant amount of time, but I think it will be very beneficial. We had two implementations of the replace behavior, one is usual, and another one is an idempotent insert. The latter was many fixes behind the primary one. First, the implementations should have been joined. I didn't stop only on that, though. I also reworked the implementation to a better readability, as well as, as I believe, maintainability. Also I have optimized it in several places. The commit message will bring the better detail. serg , please review commits the following commits 013fbe19 MDEV-30046 Refactor write_record and fix idempotent replication aae554b1 MDEV-30046 wrong row targeted with "insert ... on duplicate" and "replace" Or take a look at the same-titled commits at branch bb-10.5-nikita- MDEV-30046 . They are best to view with commit 9f2ef5d4 MDEV-15990 REPLACE on a precise-versioned table returns ER_DUP_ENTRY as it further simplifies the REPLACE implementation, however it goes for review to midenok . Anyway, comments are welcome.
            nikitamalyavin Nikita Malyavin made changes -
            Assignee Nikita Malyavin [ nikitamalyavin ] Sergei Golubchik [ serg ]
            Status Stalled [ 10000 ] In Review [ 10002 ]
            serg Sergei Golubchik made changes -
            Assignee Sergei Golubchik [ serg ] Nikita Malyavin [ nikitamalyavin ]
            Status In Review [ 10002 ] Stalled [ 10000 ]

            https://github.com/MariaDB/server/commit/972f6941df8b09b523f7f14d2465cee39a98fecb is ok to push, preferably with a clearer comment in sql_insert.cc.

            The refactoring commit is generally moving in the right direction, but it won't be ready before the release, so I don't think the actual bug fix should wait for it.

            serg Sergei Golubchik added a comment - https://github.com/MariaDB/server/commit/972f6941df8b09b523f7f14d2465cee39a98fecb is ok to push, preferably with a clearer comment in sql_insert.cc . The refactoring commit is generally moving in the right direction, but it won't be ready before the release, so I don't think the actual bug fix should wait for it.
            nikitamalyavin Nikita Malyavin made changes -
            nikitamalyavin Nikita Malyavin made changes -
            Fix Version/s 10.5.25 [ 29626 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            Fix Version/s 10.6 [ 24028 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            JIraAutomate JiraAutomate made changes -
            Fix Version/s 10.6.18 [ 29627 ]
            Fix Version/s 10.11.8 [ 29630 ]
            Fix Version/s 11.0.6 [ 29628 ]
            Fix Version/s 11.1.5 [ 29629 ]
            Fix Version/s 11.2.4 [ 29631 ]
            Fix Version/s 11.4.2 [ 29633 ]

            For the record, this was merged in as https://github.com/MariaDB/server/commit/72429cad7f5448ad680d8df7b55aa055462d5964 (not visible in this Jira)

            otto Otto Kekäläinen added a comment - For the record, this was merged in as https://github.com/MariaDB/server/commit/72429cad7f5448ad680d8df7b55aa055462d5964 (not visible in this Jira)

            People

              nikitamalyavin Nikita Malyavin
              fh Frank Heckenbach
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.