Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16226

TRX_ID-based System Versioning refactoring

Details

    Description

      As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: we store timestamps as usual, but update row_end of changed rows at commit time.

      Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

      Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).

      Redo log must updated as well, otherwise crash recovery wil end up with wrong values of row_end.

      Upgrade process converts TRX_ID-based tables to timestamp-based. As such kind of tables cannot be backed up, transaction_registry table is safe to drop as long as all TRX_ID-based tables converted.

      Attachments

        Issue Links

          Activity

            midenok Aleksey Midenkov created issue -
            midenok Aleksey Midenkov made changes -
            Field Original Value New Value
            Description Current performance of {{SELECT}} by timestamp from TRX_ID-based tables is very low due to linear scan of TRANSACTION_REGISTRY table when doing timestamp -> TRX_ID translation. Utilize join optimizer for querying TRANSACTION_REGISTRY:

            1. For each timestamped {{Vers_history_point}} selector forge {{TRANSACTION_REGISTRY}} subquery and add it to {{SELECT}}.

            h4. Example
            Query
            {code:sql}
            select *, row_start, row_end from i1 for system_time as of timestamp @t1;
            {code}
            is transformed into
            {code:sql}
            select i1.x as x,
                   i1.row_start as row_start,
                   i1.row_end as row_end
            from i1
            for SYSTEM_TIME all
            join (
               select transaction_id
               from mysql.transaction_registry
               where commit_timestamp <= @t1
               order by commit_timestamp desc
               limit 1
            ) __trt_0
            where trt_trx_sees(i1.row_end, __trt_0.transaction_id)
              and trt_trx_sees_eq(__trt_0.transaction_id, i1.row_start)
            {code}

            2. subquery must be excluded from wildcard expansion, i.e.
            {code:sql}
            select transaction_id from i1 for system_time as of timestamp @t1;
            {code}
            must be resolved to {{i1.transaction_id}}.
            Current performance of {{SELECT}} by timestamp from TRX_ID-based tables is very low due to linear scan of TRANSACTION_REGISTRY table when doing timestamp -> TRX_ID translation. Utilize join optimizer for querying TRANSACTION_REGISTRY:

            h3. 1. For each timestamped {{Vers_history_point}} selector forge {{TRANSACTION_REGISTRY}} subquery and add it to {{SELECT}}.

            h4. Example
            Query
            {code:sql}
            select *, row_start, row_end from i1 for system_time as of timestamp @t1;
            {code}
            is transformed into
            {code:sql}
            select i1.x as x,
                   i1.row_start as row_start,
                   i1.row_end as row_end
            from i1
            for SYSTEM_TIME all
            join (
               select transaction_id
               from mysql.transaction_registry
               where commit_timestamp <= @t1
               order by commit_timestamp desc
               limit 1
            ) __trt_0
            where trt_trx_sees(i1.row_end, __trt_0.transaction_id)
              and trt_trx_sees_eq(__trt_0.transaction_id, i1.row_start)
            {code}

            h3. 2. subquery must be excluded from wildcard expansion, i.e.
            {code:sql}
            select transaction_id from i1 for system_time as of timestamp @t1;
            {code}
            must be resolved to {{i1.transaction_id}}.

            [Detailed analysis and work progress|https://github.com/tempesta-tech/mariadb/issues/314]
            midenok Aleksey Midenkov made changes -
            Description Current performance of {{SELECT}} by timestamp from TRX_ID-based tables is very low due to linear scan of TRANSACTION_REGISTRY table when doing timestamp -> TRX_ID translation. Utilize join optimizer for querying TRANSACTION_REGISTRY:

            h3. 1. For each timestamped {{Vers_history_point}} selector forge {{TRANSACTION_REGISTRY}} subquery and add it to {{SELECT}}.

            h4. Example
            Query
            {code:sql}
            select *, row_start, row_end from i1 for system_time as of timestamp @t1;
            {code}
            is transformed into
            {code:sql}
            select i1.x as x,
                   i1.row_start as row_start,
                   i1.row_end as row_end
            from i1
            for SYSTEM_TIME all
            join (
               select transaction_id
               from mysql.transaction_registry
               where commit_timestamp <= @t1
               order by commit_timestamp desc
               limit 1
            ) __trt_0
            where trt_trx_sees(i1.row_end, __trt_0.transaction_id)
              and trt_trx_sees_eq(__trt_0.transaction_id, i1.row_start)
            {code}

            h3. 2. subquery must be excluded from wildcard expansion, i.e.
            {code:sql}
            select transaction_id from i1 for system_time as of timestamp @t1;
            {code}
            must be resolved to {{i1.transaction_id}}.

            [Detailed analysis and work progress|https://github.com/tempesta-tech/mariadb/issues/314]
            Current performance of {{SELECT}} by timestamp from TRX_ID-based tables is very low due to linear scan of TRANSACTION_REGISTRY table when doing timestamp -> TRX_ID translation. Utilize join optimizer for querying TRANSACTION_REGISTRY:

            h3. 1. For each timestamped {{Vers_history_point}} selector forge {{TRANSACTION_REGISTRY}} subquery and add it to {{SELECT}}.

            h4. Example
            Query
            {code:sql}
            select *, row_start, row_end from i1 for system_time as of timestamp @t1;
            {code}
            is transformed into
            {code:sql}
            select i1.x as x,
                   i1.row_start as row_start,
                   i1.row_end as row_end
            from i1
            for SYSTEM_TIME all
            join (
               select transaction_id
               from mysql.transaction_registry
               where commit_timestamp <= @t1
               order by commit_timestamp desc
               limit 1
            ) __trt_0
            where trt_trx_sees(i1.row_end, __trt_0.transaction_id)
              and trt_trx_sees_eq(__trt_0.transaction_id, i1.row_start)
            {code}

            h3. 2. subquery must be excluded from wildcard expansion, i.e.
            {code:sql}
            select transaction_id from i1 for system_time as of timestamp @t1;
            {code}
            must be resolved to {{i1.transaction_id}}.

            h2. [Detailed analysis and work progress|https://github.com/tempesta-tech/mariadb/issues/314]
            midenok Aleksey Midenkov made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            midenok Aleksey Midenkov made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]
            midenok Aleksey Midenkov made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            Comment [ 4h ]
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.4 [ 22408 ]
            serg Sergei Golubchik made changes -
            NRE Projects RM_105_CANDIDATE
            midenok Aleksey Midenkov made changes -
            Priority Major [ 3 ] Minor [ 4 ]
            midenok Aleksey Midenkov made changes -
            Labels trx-versioning
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.6 [ 24028 ]
            serg Sergei Golubchik made changes -
            Rank Ranked lower
            serg Sergei Golubchik made changes -
            Priority Minor [ 4 ] Major [ 3 ]
            midenok Aleksey Midenkov made changes -
            Summary TRX_ID-based versioned tables performance improvement TRX_ID-based System Versioning refactoring
            midenok Aleksey Midenkov made changes -
            Description Current performance of {{SELECT}} by timestamp from TRX_ID-based tables is very low due to linear scan of TRANSACTION_REGISTRY table when doing timestamp -> TRX_ID translation. Utilize join optimizer for querying TRANSACTION_REGISTRY:

            h3. 1. For each timestamped {{Vers_history_point}} selector forge {{TRANSACTION_REGISTRY}} subquery and add it to {{SELECT}}.

            h4. Example
            Query
            {code:sql}
            select *, row_start, row_end from i1 for system_time as of timestamp @t1;
            {code}
            is transformed into
            {code:sql}
            select i1.x as x,
                   i1.row_start as row_start,
                   i1.row_end as row_end
            from i1
            for SYSTEM_TIME all
            join (
               select transaction_id
               from mysql.transaction_registry
               where commit_timestamp <= @t1
               order by commit_timestamp desc
               limit 1
            ) __trt_0
            where trt_trx_sees(i1.row_end, __trt_0.transaction_id)
              and trt_trx_sees_eq(__trt_0.transaction_id, i1.row_start)
            {code}

            h3. 2. subquery must be excluded from wildcard expansion, i.e.
            {code:sql}
            select transaction_id from i1 for system_time as of timestamp @t1;
            {code}
            must be resolved to {{i1.transaction_id}}.

            h2. [Detailed analysis and work progress|https://github.com/tempesta-tech/mariadb/issues/314]
            As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: we store timestamps as usual, but update row_end of changed rows at commit time.
            midenok Aleksey Midenkov made changes -
            Attachment trx_id_versioning_talk.txt [ 55155 ]
            midenok Aleksey Midenkov made changes -
            Fix Version/s 10.7 [ 24805 ]
            Fix Version/s 10.6 [ 24028 ]
            midenok Aleksey Midenkov made changes -
            Comment [ Transaction registry subquery is added at early stage right after query tables are opened. Whether the subquery is added is considered from the following conditions:

            1. query table is TRX_ID-based;
            2. {{SYSTEM_TIME}} specifier is TRX_ID.

            The condition 2. puts the requirement of explicit TRX_ID resolution, because after the subquery is added it is impossible to resolute the expression to timestamp.
            ]
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            midenok Aleksey Midenkov made changes -
            Description As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: we store timestamps as usual, but update row_end of changed rows at commit time.
            As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: we store timestamps as usual, but update row_end of changed rows at commit time.

            Clustered index recoreds contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.
            midenok Aleksey Midenkov made changes -
            Description As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: we store timestamps as usual, but update row_end of changed rows at commit time.

            Clustered index recoreds contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.
            As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: we store timestamps as usual, but update row_end of changed rows at commit time.

            Clustered index recoreds contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).
            midenok Aleksey Midenkov made changes -
            Description As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: we store timestamps as usual, but update row_end of changed rows at commit time.

            Clustered index recoreds contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).
            As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: we store timestamps as usual, but update row_end of changed rows at commit time.

            Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).
            midenok Aleksey Midenkov made changes -
            Status In Progress [ 3 ] Stalled [ 10000 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.8 [ 26121 ]
            Fix Version/s 10.7 [ 24805 ]
            midenok Aleksey Midenkov made changes -
            Description As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: we store timestamps as usual, but update row_end of changed rows at commit time.

            Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).
            As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: *we store timestamps as usual*, but update row_end of changed rows at commit time.

            Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 87354 ] MariaDB v4 [ 131688 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.9 [ 26905 ]
            Fix Version/s 10.8 [ 26121 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.10 [ 27530 ]
            Fix Version/s 10.9 [ 26905 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.10 [ 27530 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 10.11 [ 27614 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 10.12 [ 28320 ]
            Fix Version/s 10.11 [ 27614 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 11.3 [ 28565 ]
            Fix Version/s 11.0 [ 28320 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 11.4 [ 29301 ]
            Fix Version/s 11.3 [ 28565 ]
            midenok Aleksey Midenkov made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            Description As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: *we store timestamps as usual*, but update row_end of changed rows at commit time.

            Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).
            As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: *we store timestamps as usual*, but update row_end of changed rows at commit time.

            Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).

            Upgrade process converts TRX_ID-based tables to timestamp-based. As such kind of tables cannot be backed up, transaction_registry table is safe to drop as long as all TRX_ID-based tables converted.
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            midenok Aleksey Midenkov made changes -
            Description As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: *we store timestamps as usual*, but update row_end of changed rows at commit time.

            Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).

            Upgrade process converts TRX_ID-based tables to timestamp-based. As such kind of tables cannot be backed up, transaction_registry table is safe to drop as long as all TRX_ID-based tables converted.
            As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: *we store timestamps as usual*, but update row_end of changed rows at commit time.

            Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).

            Redo log must updated as well, otherwise crash recovery wil end up with wrong values or row_end.

            Upgrade process converts TRX_ID-based tables to timestamp-based. As such kind of tables cannot be backed up, transaction_registry table is safe to drop as long as all TRX_ID-based tables converted.
            midenok Aleksey Midenkov made changes -
            Description As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: *we store timestamps as usual*, but update row_end of changed rows at commit time.

            Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).

            Redo log must updated as well, otherwise crash recovery wil end up with wrong values or row_end.

            Upgrade process converts TRX_ID-based tables to timestamp-based. As such kind of tables cannot be backed up, transaction_registry table is safe to drop as long as all TRX_ID-based tables converted.
            As was discussed in https://mariadb.slack.com/archives/CHTLSLQEP/p1575543442067300 (attached here) transaction_registry table has many drawbacks: performance, hard to backup, not node-portable. The feature of transaction-precise versioning as described in https://mariadb.com/kb/en/library/temporal-data-tables/ actually does not require transaction_registry translation: *we store timestamps as usual*, but update row_end of changed rows at commit time.

            Clustered index records contain DB_TRX_ID, DB_ROLL_PTR fields with non-null values in case they was changed by some open transaction. Clustered index is ordered by PK, explicit or implicit. In case there is no explicit PK in table, first non-null UK is then used or if no such UK exists then auto-generated DB_ROW_ID is added to clustered index. So there are 3 variants of ordering: explicit PK, non-null UK, DB_ROW_ID. We cannot search quickly by DB_TRX_ID.

            Every open transaction has 2 undo logs attached per each changed table. Undo log cannot be attached to more than 1 transaction. One of undo logs is used for non-temporary tables, another one is for temporary ones. Non-temporary undo log must be scanned and corresponding clustered index rows selected by PK (one of 3 variants above).

            Redo log must updated as well, otherwise crash recovery wil end up with wrong values of row_end.

            Upgrade process converts TRX_ID-based tables to timestamp-based. As such kind of tables cannot be backed up, transaction_registry table is safe to drop as long as all TRX_ID-based tables converted.
            serg Sergei Golubchik made changes -
            Fix Version/s 11.5 [ 29506 ]
            Fix Version/s 11.4 [ 29301 ]
            serg Sergei Golubchik made changes -
            Priority Critical [ 2 ] Major [ 3 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 11.6 [ 29515 ]
            Fix Version/s 11.5 [ 29506 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Fix Version/s 11.6 [ 29515 ]
            Labels trx-versioning tech_debt trx-versioning

            People

              midenok Aleksey Midenkov
              midenok Aleksey Midenkov
              Votes:
              5 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.