Details
- Type: Task
- Status: Stalled
- Priority: Critical
- Resolution: Unresolved
Description
Since MariaDB 10.2.2, InnoDB never holds any mutexes or RW-locks across handler API calls. (Until that version, btr_search_latch for the adaptive hash index could be held, and there was a special call handlerton::release_temporary_latches.)
During UPDATE operations, and possibly also during reads that perform range scans, it could help considerably to reuse the same InnoDB mini-transaction and to keep the current page protected by the page latch (buf_block_t::lock) across handler API calls:
- Introduce row_prebuilt_t::mtr and keep it open.
- Avoid mtr_t::commit() between row reads.
- Avoid storing and restoring the btr_pcur_t position.
- If there is any possibility of a delay (such as waiting for a row read from another table, or waiting for client connection I/O), then btr_pcur_store_position(); mtr.commit(); will have to be called before the wait, and mtr.start(); btr_pcur_restore_position(); after it.
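The difference between committing after every row and keeping the mini-transaction open can be illustrated with a toy model. This is only a sketch: `MiniTransaction`, `Cursor`, `scan_commit_per_row` and `scan_keep_open` are hypothetical stand-ins invented for illustration, not InnoDB code, and the "commit per row" variant is an assumption about the behaviour that the list above proposes to avoid.

```python
# Toy model only: hypothetical stand-ins, not InnoDB code.

class MiniTransaction:
    """Counts how often a mini-transaction is started and committed."""

    def __init__(self):
        self.active = False
        self.starts = 0
        self.commits = 0

    def start(self):
        assert not self.active, "mini-transaction already open"
        self.active = True
        self.starts += 1

    def commit(self):
        assert self.active, "no open mini-transaction"
        self.active = False
        self.commits += 1


class Cursor:
    """Stand-in for a persistent cursor that can save its position."""

    def __init__(self):
        self.position = 0   # index of the next row to read
        self.stored = None  # position saved before the latch is released
        self.stores = 0     # how many times the position was stored

    def store_position(self):
        self.stored = self.position
        self.stores += 1

    def restore_position(self):
        self.position = self.stored


def scan_commit_per_row(rows):
    """Assumed baseline: one mini-transaction and one store/restore per row."""
    mtr, cur, out = MiniTransaction(), Cursor(), []
    cur.store_position()
    for _ in rows:
        mtr.start()
        cur.restore_position()
        out.append(rows[cur.position])
        cur.position += 1
        cur.store_position()
        mtr.commit()
    return out, mtr.starts, cur.stores


def scan_keep_open(rows, may_wait_after):
    """Proposed: keep the mini-transaction open, releasing it only around waits.

    may_wait_after is the set of row indexes after which the caller may block.
    """
    mtr, cur, out = MiniTransaction(), Cursor(), []
    mtr.start()
    for i in range(len(rows)):
        out.append(rows[cur.position])
        cur.position += 1
        if i in may_wait_after:
            cur.store_position()
            mtr.commit()
            # The potentially slow wait would happen here, with no latch held.
            mtr.start()
            cur.restore_position()
    mtr.commit()
    return out, mtr.starts, cur.stores
```

In this model, scanning six rows with a commit per row starts six mini-transactions, while the keep-open variant with a single possible wait starts two and stores the cursor position once.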
This change could remove any benefit of row_prebuilt_t::fetch_cache (which prefetches 8 rows after 4 consecutive row reads). Removing this cache would greatly reduce the InnoDB memory usage for partitioned tables.
Mini-transactions for single-row UPDATE/DELETE
- Search and S-latch the PRIMARY KEY leaf page (get explicit transactional lock)
- X-latch the PRIMARY KEY leaf page, update the transaction directory page (rollback segment header page), allocate and initialize the first undo log page of the transaction
- Write undo log record
- Modify the PRIMARY KEY index
- (For each off-page column, use 1 mini-transaction per page written.)
- (For each secondary index, modify the index.)
- Commit the user transaction
That is 1 read-only mini-transaction and 4 read-write mini-transactions for a 1-row user transaction! (With MariaDB 10.3.5, there are only 3 read-write mini-transactions, because the first two writes were merged.)
We can actually use a single mini-transaction for all this. Multiple mini-transactions will be needed only if there are secondary indexes or off-page columns:
- Search and X-latch the PRIMARY KEY leaf page, update the transaction directory page, allocate and initialize the first undo log page, write the undo log record, modify the PRIMARY KEY index (with implicit transactional locking)
- (For each off-page column, use 1 mini-transaction per page written.)
- (For each secondary index, modify the index.)
- Commit the user transaction
If there are no off-page columns or secondary indexes, the user transaction commit can be merged into the same mini-transaction. (This is a special case for a single-row user transaction.)
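The counts from the two step lists above can be tallied with a small sketch. The function names and parameters are hypothetical, and the model assumes exactly one mini-transaction per secondary index and one per off-page page written, which the lists suggest but do not state precisely.

```python
def mini_transactions_current(off_page_pages=0, secondary_indexes=0,
                              first_two_writes_merged=False):
    """Return (read_only, read_write) counts for the existing step list.

    first_two_writes_merged models the MariaDB 10.3.5 behaviour mentioned
    in the description, where the first two writes share a mini-transaction.
    """
    read_only = 1  # search and S-latch the PRIMARY KEY leaf page
    # X-latch + undo page setup, undo record, PRIMARY KEY change, commit
    read_write = 3 if first_two_writes_merged else 4
    # Assumption: one mini-transaction per off-page page and per index.
    read_write += off_page_pages + secondary_indexes
    return read_only, read_write


def mini_transactions_proposed(off_page_pages=0, secondary_indexes=0,
                               single_row_transaction=True):
    """Return (read_only, read_write) counts for the proposed step list."""
    extra = off_page_pages + secondary_indexes
    if extra == 0 and single_row_transaction:
        # Special case: the commit is merged into the same mini-transaction.
        return 0, 1
    # One combined PRIMARY KEY step, the extra steps, and a separate commit.
    return 0, 1 + extra + 1
```

For a single-row transaction without secondary indexes or off-page columns, this gives (1, 4) for the existing scheme, (1, 3) with the 10.3.5 merge, and (0, 1) for the proposal.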
The merging of the 'read' and 'write' steps under a single page lock would implement implicit locking for UPDATE and DELETE. When there is no locking conflict, this should greatly reduce the contention on lock_sys.mutex.
Using fewer mini-transactions for writes also means less communication with the redo log buffer, which should reduce contention in log_sys.mutex or whatever MDEV-14425 will be replacing it with.
Note: For any record modifications, we must always commit and restart the mini-transaction between rows, because we cannot move to another B-tree page after acquiring an undo page lock. Reads can reuse the same mini-transaction.
Issue Links
blocks
- MDEV-16402 Support Index Condition Pushdown for clustered PK scans (Confirmed)
- MDEV-21452 Use condition variables and normal mutexes instead of InnoDB os_event and mutex (Closed)
- MDEV-30078 SQL Layer support for: Use fewer InnoDB mini-transactions (Stalled)
relates to
- MDEV-17603 Allow statement-based replication for REPLACE and INSERT…ON DUPLICATE KEY UPDATE (Closed)
- MDEV-21974 InnoDB DML under backup locks make buffer pool usage grow permanently (Open)
- MDEV-24813 Locking full table scan fails to use table-level locking (In Review)
- MDEV-33251 Redundant check on prebuilt::n_rows_fetched overflow (Closed)
- MDEV-34791 Redundant page lookups hurt performance (Closed)
- MDEV-10962 Deadlock with 3 concurrent DELETEs by unique key (Closed)
- MDEV-11215 Several locks taken to same record inside a transaction. (Stalled)
- MDEV-14425 Change the InnoDB redo log format to reduce write amplification (Closed)
- MDEV-16168 Performance regression on sysbench write benchmarks from 10.2 to 10.3 (Closed)
- MDEV-16675 Unnecessary explicit lock acquisition during UPDATE or DELETE (Closed)
- MDEV-18746 Reduce the amount of mem_heap_create() or malloc() (Open)
- MDEV-22413 Server hangs upon UPDATE/DELETE on a view reading from versioned partitioned table (Closed)
- MDEV-24224 Gap lock on delete in 10.5 using READ COMMITTED (Closed)
- MDEV-26779 reduce lock_sys.wait_mutex contention by using spinloop construct (Closed)
- MDEV-30835 Inconsistent blocking of UPDATE and DELETE with the same WHERE clause (Open)