[MDEV-15983] Reduce fil_system.mutex contention further Created: 2018-04-23  Updated: 2020-09-22  Resolved: 2018-04-23

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.3
Fix Version/s: 10.3.7

Type: Bug Priority: Major
Reporter: Marko Mäkelä Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: encryption, performance
Environment:

Debian GNU/Linux unstable AMD64 (gcc 8.0.1)


Issue Links:
Blocks
is blocked by MDEV-12266 Reduce the number of InnoDB tablespac... Closed
Problem/Incident
causes MDEV-16169 InnoDB: Failing assertion: !space->re... Closed
causes MDEV-23651 InnoDB: Failing assertion: !space->re... Closed
Relates
relates to MDEV-9359 encryption.create_or_replace fails sp... Closed

Description

The test encryption.innodb-missing-key occasionally fails due to an apparent race condition when fil_space_release() is being invoked:

==5551==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7f0a0658ca10 at pc 0x557a04108229 bp 0x7f0a0658c970 sp 0x7f0a0658c968
WRITE of size 8 at 0x7f0a0658ca10 thread T21
    #0 0x557a04108228 in latch_t::latch_t(latch_id_t) /mariadb/10.3/storage/innobase/include/sync0types.h:995
    #1 0x557a0411f328 in MutexDebug<TTASEventMutex<GenericPolicy> >::Context::Context(latch_id_t) /mariadb/10.3/storage/innobase/include/sync0policy.h:62
    #2 0x557a0411b35d in MutexDebug<TTASEventMutex<GenericPolicy> >::enter(TTASEventMutex<GenericPolicy> const*, char const*, unsigned int) /mariadb/10.3/storage/innobase/include/sync0policy.ic:66
    #3 0x557a0411800f in GenericPolicy<TTASEventMutex<GenericPolicy> >::enter(TTASEventMutex<GenericPolicy> const&, char const*, unsigned int) /mariadb/10.3/storage/innobase/include/sync0policy.h:348
    #4 0x557a0411402d in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter(unsigned int, unsigned int, char const*, unsigned int) /mariadb/10.3/storage/innobase/include/ib0mutex.h:635
    #5 0x557a048879a7 in fil_space_release(fil_space_t*) /mariadb/10.3/storage/innobase/fil/fil0fil.cc:2102
    #6 0x557a047507e9 in buf_load /mariadb/10.3/storage/innobase/buf/buf0dump.cc:701
    #7 0x557a04750e4e in buf_dump_thread /mariadb/10.3/storage/innobase/buf/buf0dump.cc:829

This looks like a false alarm: the write that AddressSanitizer flags lands inside the stack memory that was allocated for the constructor call Context::Context(). The class Context derives from latch_t, so the latch_t constructor in frame #0 is initializing that same object.

The failure looks similar to the one that was reported in MDEV-9359:

CURRENT_TEST: encryption.innodb-missing-key
mysqltest: At line 44: query 'SELECT SLEEP(5)' failed: 2013: Lost connection to MySQL server during query

In any case, there is no need to acquire fil_system.mutex for decrementing a reference count. Let us use atomic memory access for fil_space_t::n_pending_ops and fil_space_t::n_pending_ios.
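
As a rough illustration of the idea (not the actual patch), a reference count kept in a std::atomic can be incremented and decremented without holding any mutex. The type space_sketch and its member functions acquire(), release() and pending() are hypothetical names invented for this sketch; only the counter name n_pending_ops is taken from the description above. The sketch does not model how acquiring a reference interacts with tablespace deletion.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for fil_space_t; only the counter is modelled.
struct space_sketch {
    // Reference count of pending operations, updated atomically.
    std::atomic<uint32_t> n_pending_ops{0};

    // Take a reference. No mutex is acquired.
    void acquire() { n_pending_ops.fetch_add(1); }

    // Drop a reference without a mutex and return the new count.
    // fetch_sub() returns the value held before the decrement.
    uint32_t release() {
        uint32_t previous = n_pending_ops.fetch_sub(1);
        assert(previous > 0);  // releasing more often than acquiring is a bug
        return previous - 1;
    }

    // Read the current count.
    uint32_t pending() const { return n_pending_ops.load(); }
};
```

Because fetch_add() and fetch_sub() are single atomic read-modify-write operations, several threads may call acquire() and release() on the same object concurrently without losing an update. The default sequentially consistent memory ordering is used here for simplicity; whether a weaker ordering would suffice is left open.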

