[MDEV-13980] InnoDB fails to discard record lock when discarding an index page
Created: 2017-10-02  Updated: 2017-12-15  Resolved: 2017-10-02

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Storage Engine - XtraDB
Affects Version/s: 5.5, 10.0, 10.1, 10.2, 10.3
Fix Version/s: 5.5.58, 10.0.33, 10.1.29, 10.2.10, 10.3.2

Type: Bug
Priority: Major
Reporter: Marko Mäkelä
Assignee: Marko Mäkelä
Resolution: Fixed
Votes: 0
Labels: corruption, lock, transactions, upstream

Issue Links:
Relates
relates to MDEV-9663 InnoDB assertion failure: *cursor->in... Closed
relates to MDEV-14643 InnoDB: Failing assertion: !cursor->... Closed

Description

In InnoDB, the function btr_cur_pessimistic_delete() should invoke the function lock_update_delete() before deleting the record.

InnoDB fails to do this when the record being deleted is the only one in the page, because the page is then emptied and discarded by btr_discard_page() instead. As a result, the transaction may keep holding explicit locks on a freed page. The scenario could involve a ROLLBACK TO SAVEPOINT of an INSERT or UPDATE.

If this freed page is soon reused by another transaction, the transaction that performed the btr_discard_page() could wrongly hold explicit locks on the records that should be owned by the other transaction, violating the Isolation property of ACID.

This bug is present in all InnoDB versions. Here is the code from MySQL 3.23.49:

	if ((page_get_n_recs(page) < 2)
	    && (dict_tree_get_page(btr_cur_get_tree(cursor))
					!= buf_frame_get_page_no(page))) {
 
		/* If there is only one record, drop the whole page in
		btr_discard_page, if this is not the root page */
	
		btr_discard_page(cursor, mtr);
 
		*err = DB_SUCCESS;
		ret = TRUE;
 
		goto return_after_reservations;	
	}
 
	rec = btr_cur_get_rec(cursor);
	
	lock_update_delete(rec);
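
In the excerpt above, the early `goto return_after_reservations` is taken before lock_update_delete() is reached. The following is a minimal toy model of that ordering problem as described in this report, not InnoDB code: every `toy_` name is hypothetical, each page holds a single record, and the stand-in for lock_update_delete() only drops the lock, whereas the real function does more.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are InnoDB APIs. */
enum { TOY_MAX_PAGES = 8, TOY_NO_TRX = 0 };

/* toy_lock_owner[p]: id of the transaction holding an explicit lock on
the single record of page p, or TOY_NO_TRX if the record is unlocked. */
static int  toy_lock_owner[TOY_MAX_PAGES];
static bool toy_page_is_free[TOY_MAX_PAGES];

/* Clear all locks and mark every page as allocated. */
void toy_reset(void)
{
	for (int p = 0; p < TOY_MAX_PAGES; p++) {
		toy_lock_owner[p] = TOY_NO_TRX;
		toy_page_is_free[p] = false;
	}
}

/* Transaction trx takes an explicit lock on the only record of page. */
void toy_lock_rec(int trx, int page)
{
	assert(page >= 0 && page < TOY_MAX_PAGES);
	assert(!toy_page_is_free[page]);
	toy_lock_owner[page] = trx;
}

/* Stand-in for lock_update_delete(): here it only drops the lock. */
void toy_lock_update_delete(int page)
{
	toy_lock_owner[page] = TOY_NO_TRX;
}

/* Stand-in for the "only one record left" branch. With
release_lock_first == false it mimics the early return that skips
lock_update_delete(); with true it mimics the corrected order. */
void toy_delete_last_rec(int page, bool release_lock_first)
{
	if (release_lock_first) {
		toy_lock_update_delete(page);
	}
	toy_page_is_free[page] = true;	/* stand-in for btr_discard_page() */
}

/* Another transaction reuses the freed page. Returns the id of the
transaction that the lock table still associates with the page. */
int toy_reuse_page(int page)
{
	assert(toy_page_is_free[page]);
	toy_page_is_free[page] = false;
	return toy_lock_owner[page];
}
```

In this model, calling `toy_lock_rec(1, 3)`, then `toy_delete_last_rec(3, false)`, then `toy_reuse_page(3)` still reports transaction 1 as the lock owner of the reused page. With `release_lock_first` set to `true`, the reused page starts without a stale lock.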

Theoretically, this bug could explain the corruption reported in MDEV-9663, especially when it occurs on a secondary index (rolling back the update of an indexed column). My comment of 2017-08-24 12:46 in MDEV-9663 notes that there was corruption on unique secondary indexes. InnoDB handles a duplicate key error by rolling back the latest row operation. It is possible that we have now found an explanation for that analyzed case of MDEV-9663 corruption.



Comments
Comment by Marko Mäkelä [ 2017-10-03 ]

There are some calls to lock_update_delete() in btr_discard_only_page_on_level() and btr_discard_page(). It is difficult to say whether calling lock_update_delete() before btr_discard_page() actually fixes anything, but it should make the logic easier to follow.

In InnoDB B-tree indexes, record locks only exist on leaf pages, but btr_discard_only_page_on_level() and btr_discard_page() can be invoked on non-leaf pages. Therefore those calls may be unnecessary.

We might want to add assertions that whenever a page is marked free, there are no locks on it. Related to this, the function lock_rec_free_all_from_discard_page() is invoked from btr_compress(). Maybe that function should be changed to an assertion that no locks exist. Locks are normally moved between pages during splits and merges.
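
The kind of assertion suggested above could look roughly like the following sketch. It is a toy model with hypothetical `chk_` names and a simple per-page lock counter, not the InnoDB lock system or its debug assertion macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are InnoDB APIs. */
enum { CHK_MAX_PAGES = 8 };

/* Number of record locks currently registered on each page. */
static int  chk_n_locks[CHK_MAX_PAGES];
static bool chk_is_free[CHK_MAX_PAGES];

/* Register one record lock on the page. */
void chk_lock_add(int page)
{
	assert(page >= 0 && page < CHK_MAX_PAGES);
	chk_n_locks[page]++;
}

/* Release (or move away) every record lock on the page. */
void chk_lock_release_all(int page)
{
	assert(page >= 0 && page < CHK_MAX_PAGES);
	chk_n_locks[page] = 0;
}

/* Does the lock table still reference this page? */
bool chk_page_has_locks(int page)
{
	assert(page >= 0 && page < CHK_MAX_PAGES);
	return chk_n_locks[page] > 0;
}

/* Mark the page free. Instead of silently discarding leftover locks,
assert that the caller has already released or moved them. */
void chk_page_free(int page)
{
	assert(!chk_page_has_locks(page));
	chk_is_free[page] = true;
}

/* Has the page been marked free? */
bool chk_page_is_free(int page)
{
	assert(page >= 0 && page < CHK_MAX_PAGES);
	return chk_is_free[page];
}
```

In this sketch, calling `chk_page_free()` on a page that still has locks aborts when assertions are enabled, which would surface a missed lock release at the point where the page is freed rather than later, when the page is reused.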
