Details

    Description

      MySQL WL#10310 optimizes the redo log (lock-free, concurrent writes to the log buffer). Does MariaDB plan to similarly optimize the redo log?

      Reference material - https://dev.mysql.com/blog-archive/mysql-8-0-new-lock-free-scalable-wal-design/


          Activity

            I think that it should be feasible to remove log_sys.buf_free (which we currently update in the same atomic critical section with log_sys.lsn) and introduce a new field log_sys.buf_start_lsn, which would reflect the log sequence number corresponding to the start of log_sys.buf.

            In this way, instead of having a "memory transaction" consisting of at least 4 atomic operations, we would have only one log_sys.lsn.fetch_add(size) in the "fast path". This should benefit all systems. I’d like to point out to wlad that it is increasingly common to have multiple DRAM buses in modern CPUs. For years there have been CPUs that feature multiple "chiplets" per package and up to 8 NUMA nodes per socket. I know of such implementations of x86-64 and ARMv8, and I would expect them to exist for other ISAs as well.

            The new field log_sys.buf_start_lsn that I am proposing would only be updated when the log_sys.buf is being "shifted" or replaced during a write to a file. Such operations are covered by an exclusive log_sys.latch. In this way, we should be able to allocate LSN and log buffer for each mtr_t::commit() thread by invoking a rather simple log_sys.lsn.fetch_add(size) (80486 lock xadd) while holding log_sys.latch in shared or exclusive mode.

            If the write position that we derive from log_sys.lsn and log_sys.buf_start_lsn would reside outside the bounds of log_sys.buf, then some back-off logic would release log_sys.latch, trigger a log write or checkpoint, reacquire the latch, and finally use the already allocated LSN and "shifted" buffer for the write. We may need one more field to ensure that log_sys.write_lsn will be advanced exactly once while any threads are inside such a back-off wait. That field would only be accessed under exclusive log_sys.latch or inside the back-off code path; it would not be part of the "fast path".
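            The idea above can be sketched as follows (a simplified model with hypothetical names; the actual log_sys fields and sizes differ):

```cpp
#include <atomic>
#include <cstdint>
#include <cstddef>

// Simplified model of the proposed fast path: one fetch_add on the LSN
// both allocates log space and, combined with buf_start_lsn, yields the
// write position inside the buffer. buf_start_lsn only changes while the
// buffer is shifted under an exclusive latch (not modeled here).
struct log_model
{
  std::atomic<uint64_t> lsn{0};   // next LSN to be allocated
  uint64_t buf_start_lsn{0};      // LSN corresponding to &buf[0]
  static constexpr size_t BUF_SIZE= 4096;
  unsigned char buf[BUF_SIZE];

  // Return the buffer offset for `size` bytes, or SIZE_MAX when the
  // caller must enter the back-off path (buffer would overflow).
  size_t append_prepare(size_t size)
  {
    const uint64_t start= lsn.fetch_add(size, std::memory_order_relaxed);
    const size_t offset= size_t(start - buf_start_lsn);
    if (offset + size > BUF_SIZE)
      return SIZE_MAX; // back off: write out the log, shift the buffer
    return offset;
  }
};
```

            For example, successive callers receive disjoint regions of buf: the first append_prepare(100) returns offset 0, the next append_prepare(200) returns offset 100. Note that on overflow the LSN has still been incremented; undoing that is exactly what the back-off logic has to handle.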

            marko Marko Mäkelä added a comment -
            wlad Vladislav Vaintroub added a comment - - edited

            Now, the "Description" field is a bit out of place, especially since the title changed. To answer yaojiapeng: yes, there have been multiple improvements to writing and flushing the log, back in 10.5 already. Once the locking around the file write and file flush (especially the InnoDB group commit) was fixed in MDEV-21534, the bottleneck shifted elsewhere: first to copying into the redo log buffer in parallel, and then, when that was fixed, to reserving space in the redo log, a.k.a. LSN allocation, which is a tiny function.

            This matters on big machines, with NUMA and very many cores. In a benchmark that emphasizes TPS numbers over durability (innodb_flush_log_at_trx_commit=0, no doublewrite, and all other tricks to avoid fsync), this function is claimed to be a bottleneck, although the flamegraphs that would prove it are still missing.

            marko's current work is to get rid of the locks in this tiny function, in the common fast-path case.

            wlad Vladislav Vaintroub added a comment - - edited

            marko, yes, I do not blame NUMA specifically. I do recall soft NUMA on AMDs (I think ever since the Opterons), where memory accesses on a foreign node were cheap. I blame Intel NUMA, but especially Linux/libc: it is only because Linux can't come up with a mutex/futex/anything that works well on those machines that software engineers elsewhere are forced to create their NIH spinning mutexes, and those do not really work well.


            I think I may have figured out a solution. The fast path of log_t::append_prepare() would:

            1. atomically increment the performance counter log_sys.write_to_buf
              • atomically, because normally it is only protected by a shared log_sys.latch
              • I wish we did not have so many counters
            2. If lsn.fetch_add(size, std::memory_order_relaxed) would cause a buffer overflow or indicate that a back-off is in progress, then we would keep retrying after invoking back-off logic that would do the following:
              1. acquire a new log_sys.wrap_mutex
              2. increment log_sys.waits (another performance counter, which could actually be useful)
              3. prepare to set the back-off flag in log_sys.lsn
                • MySQL 8.0 seems to reserve 1 bit for something, seemingly limiting LSN from 64 to 63 bits, which could break compatibility.
                • We could use some clever logic that inverts the most significant bit as part of the fetch_sub() below, so that all 64 bits will remain available for payload; this will work as long as innodb_log_file_size fits in 63 bits.
                • We could read the current value of the flag with log_sys.lsn.load(std::memory_order_relaxed) and declare that its changes are protected by log_sys.wrap_mutex.
              4. log_sys.lsn.fetch_sub(size + flag /* see above */, std::memory_order_relaxed)
              5. release the log_sys.wrap_mutex
              6. poll log_sys.lsn until the overflow condition no longer holds (wait for other concurrent threads to complete their back-off)
              7. temporarily release log_sys.latch and invoke log_write_up_to(), which would clear the flag while holding exclusive log_sys.latch

            The back-off flag will prevent the successful execution of the fast path while back-off is in progress. I believe that such an execution could otherwise result in acquiring an invalid LSN. Invalid LSNs would not be visible to other subsystems , because the back-off would always be completed before releasing log_sys.latch. Only the back-off flag would remain visible to some subsystems until it is reset.
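            The bit-inversion trick mentioned in step 3 can be shown in isolation (a sketch, not the actual InnoDB code): since unsigned 64-bit arithmetic wraps modulo 2^64, subtracting 2^63 flips the most significant bit regardless of its current value, so a single fetch_sub can undo the failed reservation and toggle the back-off flag at the same time.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical back-off flag in the most significant bit of the LSN word.
constexpr uint64_t BACKOFF_FLAG= uint64_t{1} << 63;

// Undo a failed lsn.fetch_add(size) and toggle the back-off flag in one
// atomic step. Subtracting 2^63 modulo 2^64 inverts the MSB, so all
// 64 bits of the LSN remain usable as payload as long as real LSNs
// stay below 2^63. Returns the previous value.
uint64_t undo_and_toggle(std::atomic<uint64_t> &lsn, uint64_t size)
{
  return lsn.fetch_sub(size + BACKOFF_FLAG, std::memory_order_relaxed);
}
```

            For instance, if lsn is 1000 and a thread's fetch_add(50) overflows the buffer, undo_and_toggle(lsn, 50) leaves the value at 1000 | BACKOFF_FLAG; toggling again with size 0 (as the flag-clearing step under exclusive latch would) restores plain 1000.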

            marko Marko Mäkelä added a comment -
            marko Marko Mäkelä added a comment -

            I believe that I got the basic idea to work for the memory-mapped log writes (mount -o dax or /dev/shm). I should have it working for the regular pwrite(2) based log writes soon too. It turns out that we do not need any new field log_sys.buf_start_lsn; the removed field log_sys.buf_free was basically redundant. For memory-mapped writes we can determine the offset relative to log_sys.first_lsn and log_sys.capacity(). For regular log file writes, we should be able to refer to log_sys.write_lsn.
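            For the memory-mapped case, the offset derivation might look roughly like this (a sketch; log_sys.first_lsn and log_sys.capacity() are named above, but the exact arithmetic here is my assumption and ignores details such as the file header):

```cpp
#include <cstdint>

// Hypothetical sketch: with a memory-mapped circular log, the write
// offset can be derived from the LSN alone, given the LSN at which the
// file began (first_lsn) and the usable capacity. No separate buf_free
// field is needed.
uint64_t mmap_offset(uint64_t lsn, uint64_t first_lsn, uint64_t capacity)
{
  return (lsn - first_lsn) % capacity;
}
```

            For example, with first_lsn = 100 and capacity = 1000, LSN 150 maps to offset 50, and so does LSN 1150 after one wrap-around.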

            Once I have sorted this out, I will start implementing the back-off logic. That logic will likely not be sufficiently exercised by our regression test suite; as always, some additional stress testing will be needed.


            People

              marko Marko Mäkelä
              yaojiapeng peng