[MDEV-23484] Rollback unnecessarily acquires dict_operation_lock for every row
Created: 2020-08-14  Updated: 2021-07-30  Resolved: 2021-07-29

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Storage Engine - XtraDB
Affects Version/s: 10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6
Fix Version/s: 10.6.4

Type: Bug
Priority: Critical
Reporter: Marko Mäkelä
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: performance, rr-profile-analyzed

Issue Links:
Blocks
blocks MDEV-24258 Merge dict_sys.mutex into dict_sys.latch Closed
is blocked by MDEV-515 innodb bulk insert Closed
is blocked by MDEV-23721 Assertion ‘node->table->is_temporary(... Closed
Problem/Incident
causes MDEV-23514 SEGV storage/innobase/row/row0log.cc:... Closed
Relates
relates to MDEV-21175 Remove dict_table_t::n_foreign_key_ch... Closed
relates to MDEV-21602 CREATE TABLE…PRIMARY KEY…SELECT worka... Closed
relates to MDEV-24324 Having an active XA transaction in TH... Open
relates to MDEV-25506 Atomic DDL: .frm file is removed and ... Closed
relates to MDEV-25919 InnoDB reports misleading lock wait t... Closed

 Description   

InnoDB transaction rollback includes an unnecessary work-around for a data corruption bug that was fixed by me in MySQL 5.6.12. By acquiring and releasing dict_operation_lock in shared mode, row_undo() hopes to prevent the table from being dropped while the undo log record is being rolled back. But, thanks to the fix, the rollback is guaranteed to be protected by transactional locks (table IX lock, in addition to implicit or explicit exclusive locks on the records that had been modified).
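
To make the cost concrete, here is a minimal sketch of the two approaches. It is not InnoDB code: `CountingLatch`, `UndoRecord`, `Transaction`, `rollback_with_latch()` and `rollback_with_table_lock()` are hypothetical names, and a `std::shared_mutex` stands in for dict_operation_lock. It only shows that the work-around costs one shared latch acquisition per undo log record, while relying on the table lock that the transaction already holds costs none.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-in for dict_operation_lock that counts shared acquisitions.
struct CountingLatch {
  std::shared_mutex mutex;
  std::atomic<std::size_t> shared_acquisitions{0};

  void lock_shared() {
    mutex.lock_shared();
    shared_acquisitions.fetch_add(1, std::memory_order_relaxed);
  }
  void unlock_shared() { mutex.unlock_shared(); }
};

// Hypothetical undo log record; only the row identifier matters here.
struct UndoRecord { int row_id; };

// Hypothetical transaction; the flag models an already granted table IX lock.
struct Transaction { bool holds_table_ix_lock; };

// Work-around style: acquire and release the global latch for every record.
std::size_t rollback_with_latch(CountingLatch& latch,
                                const std::vector<UndoRecord>& undo_log) {
  std::size_t undone = 0;
  // Undo log records are processed in reverse order of creation.
  for (auto it = undo_log.rbegin(); it != undo_log.rend(); ++it) {
    latch.lock_shared();
    ++undone;  // placeholder for undoing one record
    latch.unlock_shared();
  }
  return undone;
}

// Assumed alternative: the table IX lock keeps the table from being dropped,
// so the global latch is never touched.
std::size_t rollback_with_table_lock(const Transaction& trx,
                                     const std::vector<UndoRecord>& undo_log) {
  assert(trx.holds_table_ix_lock);
  std::size_t undone = 0;
  for (auto it = undo_log.rbegin(); it != undo_log.rend(); ++it) {
    ++undone;  // placeholder for undoing one record
  }
  return undone;
}
```

In this model, any thread that requests the latch in exclusive mode would stall every per-record acquisition in the first function, while the second function would be unaffected.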

In one test case for which I analyzed rr replay, this unnecessary contention on dict_operation_lock seems to slow down the rollback of a CREATE…SELECT statement. As mentioned in MDEV-21602, the error handling of CREATE…SELECT would internally invoke DROP TABLE before rolling back the changes, and that DROP TABLE would invoke another work-around in InnoDB, the background DROP TABLE queue. Because that background operation would periodically acquire dict_operation_lock in exclusive mode, it would seriously slow down any rollback operation.

While the root cause of the problem is to be addressed in MDEV-21602, unnecessary acquisitions of the known contention point dict_operation_lock should be removed. Another reason for such acquisitions is FOREIGN KEY operations, to be addressed in MDEV-21175.



 Comments   
Comment by Matthias Leich [ 2020-08-14 ]

RQG testing on the current 10.5 plus a patch for MDEV-23484 did not reveal any new problems.

Comment by Marko Mäkelä [ 2020-10-20 ]

I think that we must revert this until MDEV-23721 has been re-analyzed, and until we have a solution for online ALTER TABLE:

10.5 6dc037a9d1af3ab56db7bf1dc69f6c46278a9224 with some changes

#0  0x0000560d02717fe6 in row_log_allocate (trx=<optimized out>, 
    index=index@entry=0x61700000cc20, table=<optimized out>, 
    same_pk=<optimized out>, defaults=<optimized out>, 
    col_map=<optimized out>, path=<optimized out>, old_table=<optimized out>, 
    allow_not_null=<optimized out>)
    at /home/mleich/Server/bb-10.5-MDEV-23855A/storage/innobase/row/row0log.cc:3240
#1  0x0000560d02337fff in prepare_inplace_alter_table_dict (
    ha_alter_info=ha_alter_info@entry=0x2a0b74a257b0, 
    altered_table=altered_table@entry=0x2a0b74a25d70, 
    old_table=<optimized out>, table_name=<optimized out>, 
    flags=<optimized out>, flags2=<optimized out>, 
    fts_doc_id_col=<optimized out>, add_fts_doc_id=<optimized out>, 
    add_fts_doc_id_idx=<optimized out>)
    at /home/mleich/Server/bb-10.5-MDEV-23855A/storage/innobase/handler/handler0alter.cc:6867
#2  0x0000560d02341e1d in ha_innobase::prepare_inplace_alter_table (this=
    0x61d0007a76b8, altered_table=<optimized out>, 
    ha_alter_info=0x2a0b74a257b0)
    at /home/mleich/Server/bb-10.5-MDEV-23855A/storage/innobase/handler/ha_innodb.h:706
mysqld: /home/mleich/Server/bb-10.5-MDEV-23855A/storage/innobase/row/row0uins.cc:91: dberr_t row_undo_ins_remove_clust_rec(undo_node_t*): Assertion `!node->trx->dict_operation_lock_mode' failed.

Here, we are starting an online ALTER TABLE operation, even though a rollback of a previous ALTER TABLE is in progress. This would be prevented by the dict_operation_lock that we removed.

I think that the only way to fix this is to acquire an InnoDB table S lock at the start of ALTER TABLE. MDEV-515 will also require that. Hence, this is blocked by MDEV-515.
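
The reasoning behind the table S lock can be sketched with the standard multi-granularity lock compatibility matrix. This is an illustration, not InnoDB source: `LockMode`, `kCompatible` and `compatible()` are hypothetical names, and InnoDB's additional AUTO_INC table lock mode is left out. The point is that a table S lock request conflicts with a table IX lock, so an ALTER TABLE that requests S would have to wait for a transaction that is still rolling back under IX.

```cpp
#include <cassert>

// Table lock modes: intention shared, intention exclusive, shared, exclusive.
enum class LockMode { IS = 0, IX = 1, S = 2, X = 3 };

// kCompatible[held][requested] is true when both locks can coexist.
constexpr bool kCompatible[4][4] = {
    //          IS     IX     S      X
    /* IS */ {true,  true,  true,  false},
    /* IX */ {true,  true,  false, false},
    /* S  */ {true,  false, true,  false},
    /* X  */ {false, false, false, false},
};

// Returns true if `requested` can be granted while `held` is held
// by a different transaction.
constexpr bool compatible(LockMode held, LockMode requested) {
  return kCompatible[static_cast<int>(held)][static_cast<int>(requested)];
}

// A rollback holding IX blocks an S request from ALTER TABLE,
// but not IX requests from other writers.
static_assert(!compatible(LockMode::IX, LockMode::S), "S must wait for IX");
static_assert(compatible(LockMode::IX, LockMode::IX), "IX coexists with IX");
```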

Comment by Marko Mäkelä [ 2021-07-27 ]

Now that CREATE…SELECT was fixed in MDEV-25506 part 3 in MariaDB 10.6.3, it should be possible to fix this bug in 10.6.
