[MDEV-21724] Optimize page_cur_insert_rec_low() redo logging Created: 2020-02-13 Updated: 2024-01-31 Resolved: 2020-02-27 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.5 |
| Fix Version/s: | 10.5.2 |
| Type: | Bug | Priority: | Major |
| Reporter: | Marko Mäkelä | Assignee: | Marko Mäkelä |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | performance | ||
| Issue Links: |
|
| Description |
|
This test case, which used to fail on the 83rd round when running with the ,4k combination (because it would wrongly apply a mini-transaction whose LSN matches the FIL_PAGE_LSN on the page 0:13, or the only page of SYS_FIELDS), shows that we are emitting suboptimal redo log records for a particular mini-transaction:
The decoded form is:
|
| Comments |
| Comment by Marko Mäkelä [ 2020-02-14 ] |
|
The following test case, derived from main.sum_distinct-big, is showing a significant regression due to
Because the perf report output was ‘polluted’ by PERFORMANCE_SCHEMA functions, I decided to rebuild with cmake -DPLUGIN_PERFSCHEMA=NO. I also disabled the concurrent purge activity (
The execution times include the perf record overhead. Here is the perf report output (evaluate_join_record() and anything that consumed more time):
As expected, rec_get_offsets_func() was removed from the top, because we no longer call it when writing log for an insert. With a single-row INSERT, there is no regression:
|
| Comment by Marko Mäkelä [ 2020-02-17 ] |
|
I think that to highlight the impact of undo log and B-tree operations, it is better to use a variation of the test, so that updates of the DICT_HDR page will be avoided (they will be done even for TEMPORARY TABLE, until
Even if I disable the redo logging in btr_page_reorganize_low(), the log volume will not be reduced much. Similarly, there is very little impact of disabling logging in page_copy_rec_list_end_no_locks(), page_copy_rec_list_end(), and page_copy_rec_list_start(). Once I disabled the logging for page_cur_insert_rec_low(), the redo log volume was reduced by 72%. The rest is almost entirely attributable to trx_undo_page_set_next_prev_and_add() (writing an undo log record). |
| Comment by Marko Mäkelä [ 2020-02-17 ] |
|
I disabled all other redo log sources and set a breakpoint on mtr_t::finish_write(). Most inserts are logged as 47‥55 bytes, and every time the page directory is split, the length can be as much as 80 bytes. The record payload is 5+4+6+7=22 bytes. We must typically update 4+2+4=10 page header bytes when the page directory does not grow. In about 20% of the cases, we must write at least 2+2+1 more bytes to grow the page directory, but this should be doable in much less than 80 bytes.
One more idea, in addition to those mentioned in the Description, is that we could avoid writing leading zero bytes for DB_TRX_ID when the page already contains zeroes in that area. That would save space when the transaction ID fits in 32 bits or less.
In the old format, MLOG_COMP_REC_INSERT is 30 bytes at the start of this test case (with small page numbers and transaction identifiers). We may have to introduce a special redo log record for inserting an index page record. That should fit in less than 30 bytes for 80% of the cases (not adjusting the page directory). Currently, we seem to be writing 60 bytes on average, which is a 100% increase over the 30 bytes. |
| Comment by Marko Mäkelä [ 2020-02-27 ] |
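The byte accounting in this comment can be reproduced with a small sketch. It is illustrative only and not InnoDB code: all function and constant names are hypothetical, and the totals count only the page bytes that change, not the size of any actual redo log record (which also needs a record type, a page identifier and offsets). The 6-byte width assumed for DB_TRX_ID corresponds to the "6" in the payload sum.

```python
# Illustrative byte accounting for one insert. All names here are
# hypothetical; the numbers are the ones quoted in the comment.

RECORD_PAYLOAD = 5 + 4 + 6 + 7   # record bytes: 22
PAGE_HEADER_UPDATE = 4 + 2 + 4   # page header bytes, directory not growing: 10
DIRECTORY_GROWTH = 2 + 2 + 1     # minimum extra bytes when the directory grows: 5

DB_TRX_ID_LEN = 6                # assumption: DB_TRX_ID is a 6-byte big-endian field


def changed_bytes(directory_grows: bool) -> int:
    """Page bytes that change for one insert (a lower bound, not a log record size)."""
    total = RECORD_PAYLOAD + PAGE_HEADER_UPDATE
    if directory_grows:
        total += DIRECTORY_GROWTH
    return total


def leading_zero_bytes(trx_id: int) -> int:
    """Leading zero bytes of DB_TRX_ID that could be skipped if the page
    already contains zeroes in that area."""
    encoded = trx_id.to_bytes(DB_TRX_ID_LEN, "big")
    return len(encoded) - len(encoded.lstrip(b"\x00"))


def percent_increase(new: float, old: float) -> float:
    """Relative growth of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    print("changed bytes, directory unchanged:", changed_bytes(False))
    print("changed bytes, directory grows:", changed_bytes(True))
    print("skippable bytes for a 32-bit trx id:", leading_zero_bytes(0xFFFFFFFF))
    print("increase of 60 over 30 bytes (%):", percent_increase(60, 30))
```

Under these assumptions, an insert that does not grow the page directory changes 32 bytes on the page and one that does changes at least 37, both well below the observed 47‥55 and 80 bytes. A transaction ID that fits in 32 bits has at least 2 leading zero bytes that could be omitted.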
|
I posted some detailed log size statistics for a simple 2-row INSERT transaction in a comment in
In our example, we are writing 29% less log for the 2,097,152-row INSERT and spending 52% less total time than before. |
| Comment by Marko Mäkelä [ 2024-01-31 ] |
|
For the record, in |