[MDEV-24621] In bulk insert, pre-sort and build indexes one page at a time Created: 2021-01-19 Updated: 2024-01-15 Resolved: 2021-10-26 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Fix Version/s: | 10.7.1 |
| Type: | Task | Priority: | Critical |
| Reporter: | Marko Mäkelä | Assignee: | Thirunarayanan Balathandayuthapani |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | Preview_10.7, performance | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
In
The undo logging change turned out to be rather challenging on its own, because it affects MVCC and locking as well. Preventing the useless undo logging should speed up ROLLBACK and recovery, and also the purge of history, compensating for The purpose of this task is to make INSERT into an empty table or partition even more efficient by making use of sorting. We expect data loads into huge tables with many secondary indexes to become faster. The change buffer ( After implementation comments:
Insert into an empty table without unique_checks=0,foreign_key_checks=0 will NOT be optimized. |
| Comments |
| Comment by Marko Mäkelä [ 2021-08-17 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Before MDEV-25036 has been implemented, I think that we must disable this optimization for a subsequent INSERT statements into an initially empty table. That is, for a multi-INSERT transaction into an initially empty table, we will retain the table-level undo logging as per After MDEV-25036, we could improve things in subsequent task(s) as follows:
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Thirunarayanan Balathandayuthapani [ 2021-08-30 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Patch is in bb-10.7- | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-01 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I posted the first batch of review comments. I think that the interfaces are roughly correct. I think that we must be careful about overhead during DML operation. Error handling needs to be improved too. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2021-09-07 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Does it really have to add a new handlerton method and extend the server's commit code ? | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-08 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
serg, until MDEV-25036 has been implemented, we must finalize bulk insert at the end of each statement that was the first one to insert into an empty table. Even after MDEV-25036 is implemented, the finalization must take place before either handlerton::prepare_commit_versioned or handlerton::prepare are invoked. It would be cleaner to have a "pre-prepare" step. That could be handy not only for handlerton::prepare_commit_versioned calls, but also for a new one that we might need in MDEV-24608. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-16 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
An incomplete work-around for | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-17 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I wanted to perform a small performance test, similar to what I conducted for
Without that first SET statement, we will be using normal row-logged INSERT. For me, that did consume roughly the same amount of time and produced the same amount of redo log, when comparing to the 10.7 base revision that was last merged with the branch.
I repeated the same with rr, and I determined that the DB_IO_ERROR was set here:
I did not debug this deeper. One idea would be to use normal file I/O instead of the InnoDB page I/O wrapper. We do not want O_DIRECT for these writes, since we will be writing arbitrary-sized blocks without any alignment guarantees. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The issue was that we were incorrectly attempting to write NULL values to the BLOB file. With that fixed, I am seeing the following amount of redo log writes reported by the test. (Note that these numbers are not entirely stable.)
For the last step, the numbers should be identical, and they are close to identical. Somewhat surprising is that for the ROLLBACK and TRUNCATE we see more log written, even though those were supposed to be roughly identical as well. To remove randomness caused by the purge of transaction history, I repeated the test with innodb_force_recovery=2 and disabled the wait_all_purge.inc lines. Every second column is from that run. As we can see, the nopurge columns for TRUNCATE are virtually identical, but for ROLLBACK they are not. I think that the reason is different index tree layout. After the first INSERT of the 1M rows, I am seeing the following t.ibd file size:
thiru, I think that such a huge change of file size is something that we must investigate. With the same MERGE_THRESHOLD and other parameters that affect the B-tree page fill factor, we would want to have a similar file layout. I would expect the file size change to explain the different amount of redo log written for ROLLBACK as well. Note: Performance-wise we seem to be doing fine. Because I had some concurrent load running on the machine, I cannot quote reliable numbers. The test completed in about 31 seconds on the baseline 10.7 revision and in 25 seconds with the patch. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-09-20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I must say that it is encouraging to see such time savings for a simple benchmark that did not even include secondary indexes. With secondary indexes, I would expect the performance difference to be even bigger, between row-by-row inserts into each index, and the merge sort and bulk insert of one index at a time. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Thirunarayanan Balathandayuthapani [ 2021-10-01 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Inplace alter table code increases the file size of the table by 4MB. Bulk insert code uses the same code path as inplace alter code. This file size | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-10-08 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
thiru, I got some data from axel regarding a performance regression, but it did not have good stack traces. I reread the code changes, and I suspect that the performance regression is due to the SQL layer change that also serg asked about. Could we replace innodb_bulk_insert_write() and its caller with a small addition to ha_innobase::reset()? | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-10-15 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
This is still causing some performance regression for normal DML workload in a case when bulk loading was not used at all. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2021-10-22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Performance seems to be acceptable now. |