

      Since MariaDB 10.2.2, InnoDB never holds any mutexes or RW-locks across handler API calls. (Until that version, btr_search_latch for the adaptive hash index could be held, and there was a special call handlerton::release_temporary_latches.)

      During UPDATE operations, and also possibly during reads that perform range scans, it could help a lot to reuse the same InnoDB mini-transaction and to protect the current page with the page latch (buf_block_t::lock) across calls:

      1. Introduce row_prebuilt_t::mtr and keep it open.
      2. Avoid mtr_t::commit() between row reads.
      3. Avoid storing and restoring the btr_pcur_t position.
      4. If a delay is possible (for example, waiting for a row read from another table, or waiting for client connection I/O), then btr_pcur_store_position(); mtr.commit(); will have to be called before the wait, and mtr.start(); btr_pcur_restore_position(); after it.
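
      The four steps above can be sketched with a toy model that only counts operations. ToyMtr, ToyCursor, ToyPrebuilt and commits_for_scan are hypothetical names invented for this illustration; they are not the InnoDB classes and carry none of their real arguments or latching behaviour.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-ins that only count operations; NOT the real InnoDB classes.
struct ToyMtr {
    bool active = false;
    std::size_t commits = 0;
    void start() { assert(!active); active = true; }
    void commit() { assert(active); active = false; ++commits; }
};

struct ToyCursor {
    bool stored = false;  // is there a saved position to restore?
    std::size_t stores = 0;
    std::size_t restores = 0;
    void store_position() { stored = true; ++stores; }
    void restore_position() { assert(stored); stored = false; ++restores; }
};

struct ToyPrebuilt {
    ToyMtr mtr;      // models the proposed row_prebuilt_t::mtr
    ToyCursor pcur;  // models the persistent cursor

    // Steps 1-3: read a row, opening the mini-transaction only if needed.
    void read_row() {
        if (!mtr.active) {
            mtr.start();
            if (pcur.stored) pcur.restore_position();
        }
        // ... the row would be read here, under the page latch ...
    }

    // Step 4: give up the latch before anything that may block.
    void before_wait() {
        if (mtr.active) {
            pcur.store_position();
            mtr.commit();
        }
    }

    void end_scan() {
        if (mtr.active) mtr.commit();
    }
};

// Scan `rows` rows with a possible wait after every `wait_every` rows
// (0 means no waits). Returns the number of mini-transaction commits.
std::size_t commits_for_scan(std::size_t rows, std::size_t wait_every) {
    ToyPrebuilt prebuilt;
    for (std::size_t i = 1; i <= rows; ++i) {
        prebuilt.read_row();
        if (wait_every != 0 && i % wait_every == 0) prebuilt.before_wait();
    }
    prebuilt.end_scan();
    return prebuilt.mtr.commits;
}
```

      In this model, commits_for_scan(100, 1) stands for a commit after every row, while commits_for_scan(100, 0) keeps one mini-transaction open for the whole scan.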

      This change could remove any benefit of the row_prebuilt_t::fetch_cache (after 4 consecutive row reads, it’d prefetch 8 rows). Removing this cache would greatly reduce the InnoDB memory usage for partitioned tables.
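
      To get a feeling for the memory involved, here is a rough estimate. It assumes that every partition keeps its own fetch cache and it ignores any per-row overhead; both are assumptions of this sketch, and the function names are hypothetical. Only the figures 4 and 8 come from the description above.

```cpp
#include <cassert>
#include <cstddef>

// Figures quoted in the description: prefetch 8 rows after 4 consecutive reads.
constexpr std::size_t kPrefetchRows = 8;
constexpr std::size_t kPrefetchThreshold = 4;

// True once enough consecutive reads have happened for prefetching to start.
constexpr bool prefetch_starts(std::size_t consecutive_reads) {
    return consecutive_reads >= kPrefetchThreshold;
}

// Rough row-buffer footprint, ASSUMING one cache per partition and
// ignoring any per-row bookkeeping overhead.
constexpr std::size_t fetch_cache_bytes(std::size_t partitions,
                                        std::size_t row_bytes) {
    return partitions * kPrefetchRows * row_bytes;
}
```

      For example, under these assumptions a table with 1000 partitions and 1 KiB rows would hold fetch_cache_bytes(1000, 1024) bytes of prefetch buffers once every partition has been scanned.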

      Mini-transactions for single-row UPDATE/DELETE

      1. Search and S-latch the PRIMARY KEY leaf page (get explicit transactional lock)
      2. X-latch the PRIMARY KEY leaf page, update the transaction directory page (rollback segment header page), and allocate and initialize the first undo log page of the transaction
      3. Write undo log record
      4. Modify the PRIMARY KEY index
      5. (For each off-page column, use 1 mini-transaction per page written.)
      6. (For each secondary index, modify the index.)
      7. Commit the user transaction

      That is 1 read-only mini-transaction and 4 read-write mini-transactions for a 1-row user transaction! (Since MariaDB 10.3.5, there are only 3 read-write mini-transactions, because the first two writes were merged.)

      We can actually use a single mini-transaction for all this. Multiple mini-transactions will be needed only if there are secondary indexes or off-page columns:

      1. Search and X-latch the PRIMARY KEY leaf page, update the transaction directory page, allocate and initialize the first undo log page, write the undo log record, and modify the PRIMARY KEY index (with implicit transactional locking)
      2. (For each off-page column, use 1 mini-transaction per page written.)
      3. (For each secondary index, modify the index.)
      4. Commit the user transaction

      If there are no off-page columns or secondary indexes, the user transaction commit can be merged into the same mini-transaction. (This is a special case for a single-row user transaction.)
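
      The two step lists can be compared by counting mini-transactions. The sketch below assumes one mini-transaction per secondary index modification, which the lists suggest but do not state outright; MtrCount, current_plan and proposed_plan are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>

struct MtrCount {
    std::size_t read_only;
    std::size_t read_write;
    std::size_t total() const { return read_only + read_write; }
};

// Current sequence: 1 read-only search, then 4 read-write steps (latch and
// allocate undo page, write undo record, modify PRIMARY KEY, commit), or 3
// when the first two writes are merged, plus one mini-transaction per
// off-page page written and (ASSUMED) one per secondary index.
MtrCount current_plan(std::size_t off_page_pages,
                      std::size_t secondary_indexes,
                      bool first_two_writes_merged) {
    const std::size_t base_writes = first_two_writes_merged ? 3 : 4;
    return {1, base_writes + off_page_pages + secondary_indexes};
}

// Proposed sequence: one combined mini-transaction; the commit is merged
// into it only when there are no off-page columns or secondary indexes.
MtrCount proposed_plan(std::size_t off_page_pages,
                       std::size_t secondary_indexes) {
    if (off_page_pages == 0 && secondary_indexes == 0) return {0, 1};
    return {0, 1 + off_page_pages + secondary_indexes + 1};
}
```

      For a row with no off-page columns and no secondary indexes, current_plan(0, 0, false) counts 5 mini-transactions in total and proposed_plan(0, 0) counts 1.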

      Merging the 'read' and 'write' steps under a single page latch would implement implicit locking for UPDATE and DELETE. When there is no locking conflict, this should greatly reduce the contention on lock_sys.mutex.

      Using fewer mini-transactions for writes also means less communication with the redo log buffer, which should reduce contention in log_sys.mutex or whatever MDEV-14425 will be replacing it with.

      Note: For any record modifications, we must always commit and restart the mini-transaction between rows, because we cannot move to another B-tree page after acquiring an undo page latch. Reads can reuse the same mini-transaction.
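
      The rule in this note can be illustrated with a toy model in which writing an undo record marks the mini-transaction as holding an undo page latch, and the scan restarts the mini-transaction before advancing whenever that flag is set. ToyWriteMtr and restarts_for_scan are hypothetical names; nothing here is InnoDB code.

```cpp
#include <cassert>
#include <cstddef>

// Toy mini-transaction that remembers whether it latched an undo page.
struct ToyWriteMtr {
    bool holds_undo_latch = false;
    std::size_t restarts = 0;

    void write_undo_record() {
        assert(!holds_undo_latch);  // at most one modified row per mtr here
        holds_undo_latch = true;
    }

    void commit_and_restart() {
        holds_undo_latch = false;
        ++restarts;
    }
};

// Visit `rows` rows; return how often the mini-transaction had to be
// committed and restarted between rows.
std::size_t restarts_for_scan(std::size_t rows, bool modify) {
    ToyWriteMtr mtr;
    for (std::size_t i = 0; i < rows; ++i) {
        // Moving on while holding the undo page latch is not allowed,
        // so commit and restart before touching the next row.
        if (mtr.holds_undo_latch) mtr.commit_and_restart();
        if (modify) mtr.write_undo_record();
    }
    return mtr.restarts;
}
```

      In this model a 5-row modifying scan restarts 4 times between rows, while a read-only scan of any length never restarts.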





              debarun Debarun Banerjee
              marko Marko Mäkelä


