Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-23446

UPDATE does not insert history row if the row is not changed

Details

    Description

      UPDATE queries that do not change any values in the rows they match on versioned tables do not insert historical rows.

      Example:

      CREATE TABLE test (a INT) WITH SYSTEM VERSIONING;
      INSERT INTO test (a) VALUES (1);
      UPDATE test SET a = 1 WHERE a = 1;
      SELECT *, row_start, row_end FROM test FOR SYSTEM_TIME ALL;
      

      Disclaimer: I don't know whether this is a bug or was purposefully designed like this. In the latter case, I want to make a case to reconsider it:

      • This behavior contrasts with triggers: UPDATE queries that do not change any values in the rows they match still carry out registered triggers for each row.
      • Now some UPDATEs cannot be logged historically. Versioned table fail to capture that they happened. Yet logging such queries and the timestamps at which they were issued can still be valuable information.

      Attachments

        Issue Links

          Activity

            Looks like a bug to me

            serg Sergei Golubchik added a comment - Looks like a bug to me

            We had a similar bug in Clustrix caused by a similar optimization, though it manifest as a MVCC problem rather than a system versioning problem. (Bug 31880 for those with access to the Clustrix bug database.) If it helps, here was the scenario from that bug:

            (All transactions use repeatable-read isolation.) First, create a table with some data, and open a transaction:

            > CREATE TABLE t (a INT, b INT);
            Query OK, 0 rows affected (0.02 sec)
             
            > INSERT INTO t VALUES (1, 10), (3, 30);
            Query OK, 2 rows affected (0.03 sec)
             
            > BEGIN;
            Query OK, 0 rows affected (0.00 sec)
             
            > SELECT * FROM t;
            +------+------+
            | a    | b    |
            +------+------+
            |    3 |   30 |
            |    1 |   10 |
            +------+------+
            2 rows in set (0.00 sec)
            

            Then, while keeping that transaction open, in another session update the table:

            > BEGIN;
            Query OK, 0 rows affected (0.00 sec)
             
            > INSERT INTO t VALUES (5, 50);
            Query OK, 1 row affected (0.01 sec)
             
            > UPDATE t SET b = 100 WHERE a = 1;
            Query OK, 1 row affected (0.01 sec)
             
            > SELECT * FROM t;
            +------+------+
            | a    | b    |
            +------+------+
            |    3 |   30 |
            |    5 |   50 |
            |    1 |  100 |
            +------+------+
            3 rows in set (0.00 sec)
             
            > COMMIT;
            Query OK, 0 rows affected (0.00 sec)
            

            (The insert is just to prove we're not crazy.) Back in the first transaction, we can't see those modifications:

            > SELECT * FROM t;
            +------+------+
            | a    | b    |
            +------+------+
            |    3 |   30 |
            |    1 |   10 |
            +------+------+
            2 rows in set (0.00 sec)
            

            But we should be able to see the results of our own modifications. However, if we update to the future value of the row, we cannot see it!

            > UPDATE t SET b = 100 WHERE a = 1;
            Query OK, 1 row affected (7.09 sec)
             
            > SELECT * FROM t;
            +------+------+
            | a    | b    |
            +------+------+
            |    3 |   30 |
            |    1 |   10 |
            +------+------+
            2 rows in set (0.00 sec)
            

            This seems to only happen if we use the future value of the row. We can see it if we update to a different value:

            > UPDATE t SET b = 1000 WHERE a = 1;
            Query OK, 1 row affected (0.00 sec)
             
            > SELECT * FROM t;
            +------+------+
            |  a   | b    |
            +------+------+
            |    3 |   30 |
            |    1 | 1000 |
            +------+------+
            2 rows in set (0.00 sec)
            

            Perhaps this can serve as inspiration for additional testcases.

            michae2 Michael Erickson (Inactive) added a comment - We had a similar bug in Clustrix caused by a similar optimization, though it manifest as a MVCC problem rather than a system versioning problem. (Bug 31880 for those with access to the Clustrix bug database.) If it helps, here was the scenario from that bug: (All transactions use repeatable-read isolation.) First, create a table with some data, and open a transaction: > CREATE TABLE t (a INT, b INT); Query OK, 0 rows affected (0.02 sec)   > INSERT INTO t VALUES (1, 10), (3, 30); Query OK, 2 rows affected (0.03 sec)   > BEGIN; Query OK, 0 rows affected (0.00 sec)   > SELECT * FROM t; +------+------+ | a | b | +------+------+ | 3 | 30 | | 1 | 10 | +------+------+ 2 rows in set (0.00 sec) Then, while keeping that transaction open, in another session update the table: > BEGIN; Query OK, 0 rows affected (0.00 sec)   > INSERT INTO t VALUES (5, 50); Query OK, 1 row affected (0.01 sec)   > UPDATE t SET b = 100 WHERE a = 1; Query OK, 1 row affected (0.01 sec)   > SELECT * FROM t; +------+------+ | a | b | +------+------+ | 3 | 30 | | 5 | 50 | | 1 | 100 | +------+------+ 3 rows in set (0.00 sec)   > COMMIT; Query OK, 0 rows affected (0.00 sec) (The insert is just to prove we're not crazy.) Back in the first transaction, we can't see those modifications: > SELECT * FROM t; +------+------+ | a | b | +------+------+ | 3 | 30 | | 1 | 10 | +------+------+ 2 rows in set (0.00 sec) But we should be able to see the results of our own modifications. However, if we update to the future value of the row, we cannot see it! > UPDATE t SET b = 100 WHERE a = 1; Query OK, 1 row affected (7.09 sec)   > SELECT * FROM t; +------+------+ | a | b | +------+------+ | 3 | 30 | | 1 | 10 | +------+------+ 2 rows in set (0.00 sec) This seems to only happen if we use the future value of the row. We can see it if we update to a different value: > UPDATE t SET b = 1000 WHERE a = 1; Query OK, 1 row affected (0.00 sec)   > SELECT * FROM t; +------+------+ | a | b | +------+------+ | 3 | 30 | | 1 | 1000 | +------+------+ 2 rows in set (0.00 sec) Perhaps this can serve as inspiration for additional testcases.

            Changes are required: IGNORE behavior should be handled for `vers_insert_history_row`; gotos to the label in the middle of the function body should be removed

            nikitamalyavin Nikita Malyavin added a comment - Changes are required: IGNORE behavior should be handled for `vers_insert_history_row`; gotos to the label in the middle of the function body should be removed

            After the discussion we decided not to argue too much on IGNORE behavior, and leave it out of this ticket's scope. So this will be left as is

            nikitamalyavin Nikita Malyavin added a comment - After the discussion we decided not to argue too much on IGNORE behavior, and leave it out of this ticket's scope. So this will be left as is
            midenok Aleksey Midenkov added a comment - - edited

            versioning.foreign fails still. Might be different bug of execution foreign ON constraint. Fixed by MDEV-23644

            midenok Aleksey Midenkov added a comment - - edited versioning.foreign fails still. Might be different bug of execution foreign ON constraint. Fixed by MDEV-23644
            sanja Oleksandr Byelkin added a comment - - edited

            Common people, first goto backward, then using "error" variable from other scope by place of goto.

            sanja Oleksandr Byelkin added a comment - - edited Common people, first goto backward, then using "error" variable from other scope by place of goto.

            I will make urgent fix with the variable, plase fix it properly.

            sanja Oleksandr Byelkin added a comment - I will make urgent fix with the variable, plase fix it properly.

            midenok, I think that we must absolutely cover this code path in multi_update::send_data() with a test case:

                  if (has_vers_fields && table->versioned(VERS_TIMESTAMP))
                  {
                    store_record(table, record[2]);
                    if (vers_insert_history_row(table))
                    {
                      restore_record(table, record[2]);
                      goto error;
                    }
            

            This is causing an uninitialized variable to be used in another scope:

                  if (!can_compare_record || compare_record(table))
                  {
            	int error;
            …
                      {
            error:
                        /*
                          If (ignore && error == is ignorable) we don't have to
                          do anything; otherwise...
                        */
                        myf flags= 0;
             
                        if (table->file->is_fatal_error(error, HA_CHECK_ALL))
                          flags|= ME_FATAL; /* Other handler errors are fatal */
             
                        prepare_record_for_error_message(error, table);
                        table->file->print_error(error,MYF(flags));
                        DBUG_RETURN(1);
            

            If I understood it correctly, the fix that sanja is considering would only silence the GCC -Wmaybe-uninitialized in optimized builds, but we would use error=0 (no error) in that error handling code path, which feels wrong to me. Hence, we should make sure that this code path is covered by a test, and an appropriate error will be reported.

            marko Marko Mäkelä added a comment - midenok , I think that we must absolutely cover this code path in multi_update::send_data() with a test case: if (has_vers_fields && table->versioned(VERS_TIMESTAMP)) { store_record(table, record[2]); if (vers_insert_history_row(table)) { restore_record(table, record[2]); goto error; } This is causing an uninitialized variable to be used in another scope: if (!can_compare_record || compare_record(table)) { int error; … { error: /* If (ignore && error == is ignorable) we don't have to do anything; otherwise... */ myf flags= 0;   if (table->file->is_fatal_error(error, HA_CHECK_ALL)) flags|= ME_FATAL; /* Other handler errors are fatal */   prepare_record_for_error_message(error, table); table->file->print_error(error,MYF(flags)); DBUG_RETURN(1); If I understood it correctly, the fix that sanja is considering would only silence the GCC -Wmaybe-uninitialized in optimized builds, but we would use error=0 (no error) in that error handling code path, which feels wrong to me. Hence, we should make sure that this code path is covered by a test, and an appropriate error will be reported.

            188b328335d5c2a61d21f528fad19a685f9511ef

            sanja Oleksandr Byelkin added a comment - 188b328335d5c2a61d21f528fad19a685f9511ef

            marko Agree and already started that in MDEV-24451. Some errors are hard to trigger without DBUG_EXECUTE_IF() (like failure of insert history row). Do you have an idea how to fail row insert in release build?

            midenok Aleksey Midenkov added a comment - marko Agree and already started that in MDEV-24451 . Some errors are hard to trigger without DBUG_EXECUTE_IF() (like failure of insert history row). Do you have an idea how to fail row insert in release build?

            midenok, an insert should be able to fail on lock wait timeout. Maybe you could issue a locking read (SELECT…LOCK IN SHARE MODE) from another connection?

            marko Marko Mäkelä added a comment - midenok , an insert should be able to fail on lock wait timeout. Maybe you could issue a locking read ( SELECT…LOCK IN SHARE MODE ) from another connection?

            People

              midenok Aleksey Midenkov
              user2180613 Remy Fox
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.