Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
5.5(EOL), 10.0(EOL), 10.1(EOL), 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5
Description
The InnoDB insert buffer was upgraded in MySQL 5.5 into a change buffer that also covers delete-mark and delete (purge) operations.
There is an important constraint for delete operations: a B-tree leaf page must not become empty unless the entire tree becomes empty, consisting of an empty root page. Because change buffer merges only occur on a single leaf page at a time (until MDEV-11634 might improve the situation), delete operations must not be buffered if it is possible that the last record of the page could be deleted. (In that case, we would refuse to use the change buffer, and if we really delete the last record, we would shrink the index tree.)
The function ibuf_get_volume_buffered_hash() is part of our insurance that the page would not become empty. It is supposed to map each buffered INSERT or DELETE_MARK record payload into a hash value. We will only count each such record as a distinct key if there is no hash collision. DELETE operations will always decrement the predicted number fo records in the page.
Due to a bug in the function, we would actually compute the hash value not only on the record payload, but also on some following bytes, in case the record contains NULL values. In MySQL Bug #61104, we have some examples of this dating back to 2012:
mysql-5.5.24 |
120709 22:26:32 [Note] /usr/sbin/mysqld: ready for connections.
|
Version: '5.5.24-55-log' socket: '/var/run/mysql/mysqld.sock' port: 3306 Percona Server (GPL), Release rel26.0, Revision 256
|
120709 22:27:55 InnoDB: unable to purge a record
|
InnoDB: tuple DATA TUPLE: 3 fields;
|
0: len 32; hex 3237333838313830383038303830383038303830383038303830383046344232; asc 2738818080808080808080808080F4B2;;
|
1: SQL NULL;
|
2: len 8; hex 80000000f90ea91f; asc ;;
|
Note: the SQL NULL in the output is part of the DELETE operation that was not supposed to be buffered. It proves that the secondary index in question allows NULL values, and hence it is possible that there were preceding buffered INSERT or DELETE_MARK records for the page that might have contained NULL values for some of the secondary key columns.
With a corrected and simplified hash function (also, using CRC-32C in the hash value calculation instead of the slower and worse ut_fold_binary()), the problem is no longer repeatable:
/** Determine if a change buffer record has been encountered already.
|
@param rec change buffer record in the MySQL 5.5 format
|
@param hash hash table of encountered records
|
@param size number of elements in hash
|
@retval true if a distinct record
|
@retval false if this may be duplicating an earlier record */
|
static bool ibuf_get_volume_buffered_hash(const rec_t *rec, ulint *hash, |
ulint size)
|
{
|
ut_ad(rec_get_n_fields_old(rec) > IBUF_REC_FIELD_USER);
|
const ulint start= rec_get_field_start_offs(rec, IBUF_REC_FIELD_USER); |
const ulint len= rec_get_data_size_old(rec) - start; |
const uint32_t fold= ut_crc32(rec + start, len); |
hash+= (fold / (CHAR_BIT * sizeof *hash)) % size; |
ulint bitmask= static_cast<ulint>(1) << (fold % (CHAR_BIT * sizeof(*hash))); |
|
if (*hash & bitmask) |
return false; |
|
/* We have not seen this record yet. Remember it. */ |
*hash|= bitmask;
|
return true; |
}
|
See MDEV-19344 for more details from an rr replay analysis.
Attachments
Issue Links
- relates to
-
MDEV-19344 innodb.innodb-change-buffer-recovery-innodb fails with incorrect results or crash
- Closed
- links to