[MDEV-33067] SCN(Sequence Commit Number) based MVCC Created: 2023-12-19 Updated: 2024-01-08 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major |
| Reporter: | zhangyuan | Assignee: | Marko Mäkelä |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Description |
|
SCN is a sequence number which was generated while committing transaction, and it should always be increased. With SCN, the snapshot for MVCC can only use a version number to compare with SCN. origin commit from zhai weixiang: https://bugs.mysql.com/bug.php?id=109599 |
| Comments |
| Comment by Marko Mäkelä [ 2023-12-19 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
In MariaDB Server 10.3, the MVCC implementation as well as the data structures for transactions were changed in Which types of operations is this expected to improve and how? I think that currently the largest problem with MVCC are secondary indexes. That would be fixed by MDEV-17598. We already have a monotonically increasing sequence number that is stored in each data page at FIL_PAGE_LSN. I have previously encountered the SCN (System Change Number) in Oracle documentation, and I was under the impression that it corresponds to the InnoDB Log Sequence Number (LSN). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yin Peng [ 2023-12-20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I have created a pull request on github: https://github.com/MariaDB/server/pull/2955 The purpose of this patch is to reduce the overhead of creating a snapshot(readview) by avoiding the traversal of active transaction IDs. In our testing scenario, we observed an approximately 5% improvement in QPS for read-write workload. However, there was a decrease of approximately 3% in the write workload. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-02 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
zhangyuan, thank you. I posted some initial review comments. I would like to know more about the benchmarks that you ran. The read performance bug could be what we already know as MDEV-20630 and MDEV-21423. I have not studied the lock-free hash table code in deep enough detail to be able to do something to it. There could be some low-hanging fruit in that code, such as replacing some of the sequential consistent memory ordering with release-acquire. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yin Peng [ 2024-01-03 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hello Marko, thanks for your reply. Our test configurations are: sysbench tables=250 mariadb server max_connections=2048 The result of oltp_read_only:
The result of oltp_read_write:
The result of oltp_write_only:
The result of oltp_update_index:
We also conducted some mixed tests using two instances of sysbench.
oltp_update_non_index(256 threads) + oltp_point_select(256 threads):
oltp_write_only(128 threads) + oltp_read_only(128 threads):
oltp_write_only(128 threads) + oltp_read_only(128 threads):
Note: All tests were conducted with the patch in MDEV-19749 applied. I will make the modifications according to your review comments and then do the performance testing again. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2024-01-04 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thank you, zhangyuan. For the oltp_read_only test, if the history list was empty at the start of the test, all DB_TRX_ID should be 0 and any transaction ID lookups should be avoided. In this type of scenario I would expect no big difference in throughput. I have filed https://github.com/akopytov/sysbench/pull/419 for enabling the use of It would be good to specify the exact commit hash when not testing a tagged release. The pull request appears to be based on 11.4 as of this commit, which does not include a merge of | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yin Peng [ 2024-01-08 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hello Marko, I have made some modifications according to your review comments. Here are some answers to the questions mentioned in the pull request. 1 How older versions would react to undo logs that exist in this format? 2. Did all test suites pass when running ./mtr - @@ -29,7 +29,7 @@ 3. What is the test suite execution time with both values of the parameter? 4. Has this been tested with Galera or with replication? Do all nodes have to use the same value of the parameter? How does the SCN affect checks for implicit locks? 5. All builders (including ASAN, MSAN, UBSAN) need to pass all tests in both modes. ==6377==ERROR: AddressSanitizer: use-after-poison on address 0x629000034e60 at pc 0x0000016cbcff bp 0x7ffe8a43a9a0 sp 0x7ffe8a43a990 0x629000034e60 is located 11360 bytes inside of 16516-byte region [0x629000032200,0x629000036284) 6. Could we have a cmake option for enabling this feature, similar to PLUGIN_PERFSCHEMA, WITH_WSREP, or WITH_INNODB_AHI? I have rebased the code to current 11.4(commit id: 7ee16b1e2978d6b73377d9a5bf7d9fd83c841f92), but following tests are done based on commit 3fad2b115569864d8c1b7ea90ce92aa895cfef08. The test result of oltp_point_select:
The test result of oltp_read_only:
The test result of oltp_read_write:
The test result of oltp_write_only:
The test result of oltp_update_index:
oltp_update_non_index(128 threads) + oltp_point_select(128 threads):
oltp_update_non_index(256 threads) + oltp_point_select(256 threads):
oltp_write_only(128 threads) + oltp_read_only(128 threads):
oltp_write_only(128 threads) + oltp_read_only(128 threads):
According to these tests for mariadb, SCN doesn't make much sense for most sysbench scenarios. However, it does show promising improvements for the oltp_point_select workload when combined with other write operations. |