[MDEV-26994] Transaction-precise tables ignore most recent row versions when queried with "FOR SYSTEM_TIME FROM ts TO ts" Created: 2021-10-27  Updated: 2023-12-15

Status: Stalled
Project: MariaDB Server
Component/s: Versioned Tables
Affects Version/s: None
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Geoff Montee (Inactive) Assignee: Aleksey Midenkov
Resolution: Unresolved Votes: 0
Labels: trx-versioning

Attachments: PNG File Screenshot from 2021-10-29 15-59-42.png    
Issue Links:
Problem/Incident
is caused by MDEV-12894 System-versioned tables Closed

 Description   

Transaction-precise tables seem to ignore the most recent row versions when queried with FOR SYSTEM_TIME FROM ts TO ts.

Let's say that we start with the following DDL and DML:

CREATE TABLE accounts (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255),
    amount INT,
    start_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW START INVISIBLE,
    end_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW END INVISIBLE,
    PERIOD FOR SYSTEM_TIME(start_trxid, end_trxid)
) WITH SYSTEM VERSIONING;
 
INSERT INTO accounts (name, amount)
VALUES ("Smith", 400),
       ("Orson", 300),
       ("Serio", 500),
       ("Wallace", 200),
       ("March", 600),
       ("Douglas", 100);
 
UPDATE accounts
SET amount = 1000
WHERE id = 1
   OR id = 3;

This results in the following current and historical row versions:

MariaDB [test]> SELECT *, start_trxid, end_trxid
    -> FROM accounts
    -> FOR SYSTEM_TIME ALL;
+----+---------+--------+-------------+----------------------+
| id | name    | amount | start_trxid | end_trxid            |
+----+---------+--------+-------------+----------------------+
|  1 | Smith   |    400 |         102 |                  107 |
|  1 | Smith   |   1000 |         107 | 18446744073709551615 |
|  2 | Orson   |    300 |         102 | 18446744073709551615 |
|  3 | Serio   |    500 |         102 |                  107 |
|  3 | Serio   |   1000 |         107 | 18446744073709551615 |
|  4 | Wallace |    200 |         102 | 18446744073709551615 |
|  5 | March   |    600 |         102 | 18446744073709551615 |
|  6 | Douglas |    100 |         102 | 18446744073709551615 |
+----+---------+--------+-------------+----------------------+

And the following transaction metadata:

MariaDB [test]> SELECT * FROM mysql.transaction_registry;
+----------------+-----------+----------------------------+----------------------------+-----------------+
| transaction_id | commit_id | begin_timestamp            | commit_timestamp           | isolation_level |
+----------------+-----------+----------------------------+----------------------------+-----------------+
|            102 |       103 | 2021-10-27 20:12:56.367871 | 2021-10-27 20:12:56.368546 | REPEATABLE-READ |
|            107 |       108 | 2021-10-27 20:13:02.423844 | 2021-10-27 20:13:02.431790 | REPEATABLE-READ |
+----------------+-----------+----------------------------+----------------------------+-----------------+
2 rows in set (0.000 sec)

Given the timestamps of these transactions, I would expect the following queries to return all current and historical row versions:

SELECT *
FROM accounts
FOR SYSTEM_TIME
   BETWEEN '2021-10-27 20:10'
   AND '2038-01-19 03:14:07.999999';
 
SELECT *
FROM accounts
FOR SYSTEM_TIME
   FROM '2021-10-27 20:10'
   TO '2038-01-19 03:14:07.999999';

The query that uses the BETWEEN .. AND .. clause seems to work properly:

MariaDB [test]> SELECT *
    -> FROM accounts
    -> FOR SYSTEM_TIME
    ->    BETWEEN '2021-10-27 20:10'
    ->    AND '2038-01-19 03:14:07.999999';
+----+---------+--------+
| id | name    | amount |
+----+---------+--------+
|  1 | Smith   |    400 |
|  1 | Smith   |   1000 |
|  2 | Orson   |    300 |
|  3 | Serio   |    500 |
|  3 | Serio   |   1000 |
|  4 | Wallace |    200 |
|  5 | March   |    600 |
|  6 | Douglas |    100 |
+----+---------+--------+
8 rows in set (0.001 sec)

But the query that uses the FROM .. TO .. clause seems to ignore the most recent row versions if there are historical row versions:

MariaDB [test]> SELECT *
    -> FROM accounts
    -> FOR SYSTEM_TIME
    ->    FROM '2021-10-27 20:10'
    ->    TO '2038-01-19 03:14:07.999999';
+----+---------+--------+
| id | name    | amount |
+----+---------+--------+
|  1 | Smith   |    400 |
|  2 | Orson   |    300 |
|  3 | Serio   |    500 |
|  4 | Wallace |    200 |
|  5 | March   |    600 |
|  6 | Douglas |    100 |
+----+---------+--------+
6 rows in set (0.001 sec)



 Comments   
Comment by Geoff Montee (Inactive) [ 2021-11-01 ]

I believe this issue can be simplified to the following test case:

CREATE TABLE accounts_svt (
       id SERIAL PRIMARY KEY,
       name VARCHAR(255),
       amount INT
) WITH SYSTEM VERSIONING;
 
CREATE TABLE accounts_tp (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255),
    amount INT,
    start_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW START INVISIBLE,
    end_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW END INVISIBLE,
    PERIOD FOR SYSTEM_TIME(start_trxid, end_trxid)
) WITH SYSTEM VERSIONING;
 
BEGIN; 
INSERT INTO accounts_svt (name, amount)
VALUES ("Smith", 400),
       ("Orson", 300),
       ("Serio", 500),
       ("Wallace", 200),
       ("March", 600),
       ("Douglas", 100);
INSERT INTO accounts_tp (name, amount)
VALUES ("Smith", 400),
       ("Orson", 300),
       ("Serio", 500),
       ("Wallace", 200),
       ("March", 600),
       ("Douglas", 100);
COMMIT;
 
BEGIN;
UPDATE accounts_svt
SET amount = 1000
WHERE id = 1
   OR id = 3;
UPDATE accounts_tp
SET amount = 1000
WHERE id = 1
   OR id = 3;
COMMIT;
 
SELECT * FROM mysql.transaction_registry;
 
SELECT *
FROM accounts_svt
FOR SYSTEM_TIME
   FROM '2021-10-27 20:10'
   TO '2038-01-19 03:14:07.999999';
   
SELECT *
FROM accounts_tp
FOR SYSTEM_TIME
   FROM '2021-10-27 20:10'
   TO '2038-01-19 03:14:07.999999';

The final queries using FROM .. TO .. return different results for system-versioned tables and transaction-precise tables:

MariaDB [test]> SELECT *
    -> FROM accounts_svt
    -> FOR SYSTEM_TIME
    ->    FROM '2021-10-27 20:10'
    ->    TO '2038-01-19 03:14:07.999999';
+----+---------+--------+
| id | name    | amount |
+----+---------+--------+
|  1 | Smith   |    400 |
|  1 | Smith   |   1000 |
|  2 | Orson   |    300 |
|  3 | Serio   |    500 |
|  3 | Serio   |   1000 |
|  4 | Wallace |    200 |
|  5 | March   |    600 |
|  6 | Douglas |    100 |
+----+---------+--------+
8 rows in set (0.000 sec)
 
MariaDB [test]> SELECT *
    -> FROM accounts_tp
    -> FOR SYSTEM_TIME
    ->    FROM '2021-10-27 20:10'
    ->    TO '2038-01-19 03:14:07.999999';
+----+---------+--------+
| id | name    | amount |
+----+---------+--------+
|  1 | Smith   |    400 |
|  2 | Orson   |    300 |
|  3 | Serio   |    500 |
|  4 | Wallace |    200 |
|  5 | March   |    600 |
|  6 | Douglas |    100 |
+----+---------+--------+
6 rows in set (0.001 sec)

Comment by Nikita Malyavin [ 2021-11-01 ]

Yes, it seems that transactional and timestamp-based versioning produce different results.

I have created the test to show the result differences

source suite/versioning/engines.inc;
source suite/versioning/common.inc;
 
SELECT TIMESTAMP '9999-12-31 23:59:59';
CREATE TABLE t(
  x INT,
  start_timestamp TIMESTAMP(6) NOT NULL,
  end_timestamp TIMESTAMP(6) NOT NULL DEFAULT now()
);
insert into t values (1, TIMESTAMP '2021-10-27 20:10', TIMESTAMP '2038-01-19 03:14:07.999999');
select * from t;
select * from t WHERE start_timestamp >= TIMESTAMP '0001-01-01 00:00:00' and end_timestamp < TIMESTAMP '9999-12-31 23:59:59';
drop table t;
 
replace_result $sys_datatype_expl SYS_DATATYPE;
eval create or replace table t(
  id SERIAL PRIMARY KEY,
  x INT,
  start_timestamp $sys_datatype_expl AS ROW START,
  end_timestamp $sys_datatype_expl AS ROW END,
  PERIOD FOR SYSTEM_TIME(start_timestamp, end_timestamp)
) WITH SYSTEM VERSIONING;
INSERT INTO t(x)
  VALUES (400),
         (300),
         (500),
         (200),
         (600),
         (100);
 
UPDATE t
SET x = 999
  WHERE id = 1
     OR id = 3;
 
select id,x from t;
 
--sorted_result
SELECT id, x
  FROM t
    FOR SYSTEM_TIME
  FROM '2021-10-27 20:10'
    TO '2038-01-19 03:14:07.999999';
 
--sorted_result
SELECT id,x
  FROM t
    FOR SYSTEM_TIME
    BETWEEN '2021-10-27 20:10'
    AND '2038-01-19 03:14:07.999999';
#select * from mysql.transaction_registry;
 
drop table t;
 
--source suite/versioning/common_finish.inc

produces:

versioning.bug 'innodb,timestamp'        [ pass ]     16
versioning.bug 'innodb,trx_id'           [ fail ]
        Test ended at 2021-11-02 00:16:31
 
CURRENT_TEST: versioning.bug
--- /home/nik/mariadb/mysql-test/suite/versioning/r/bug.result	2021-11-02 00:16:24.799553838 +0300
+++ /home/nik/mariadb/mysql-test/suite/versioning/r/bug.reject	2021-11-02 00:16:31.282928508 +0300
@@ -47,10 +47,8 @@
     TO '2038-01-19 03:14:07.999999';
 id	x
 1	400
-1	999
 2	300
 3	500
-3	999
 4	200
 5	600
 6	100
 
mysqltest: Result length mismatch

Comment by Nikita Malyavin [ 2021-11-02 ]

GeoffMontee I have just got familiar with MDEV-16226. If transaction-based versioning will be redesigned like explained there, the problem should go

Generated at Thu Feb 08 09:49:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.