MDEV-19749

MDL scalability regression after backup locks

    Description

      Background

      FLUSH TABLES WITH READ LOCK (FTWRL) is a predecessor of the BACKUP STAGE facility (MDEV-5336). Upon FTWRL completion, connections are still able to issue DQL, but DDL and DML are blocked. In other words, no connection is able to modify tables or commit active transactions.

      This was achieved by taking two MDL_lock-s: the global read lock (GLOBAL X-lock) and the global commit lock (COMMIT X-lock). Correspondingly, statements that intend to modify data have to take protection against these locks; a GLOBAL S-lock and a COMMIT S-lock were acquired for this purpose.

      These two locks were separate entities: they didn't share data structures or locking primitives, and thus they were separate contention points.

      With BACKUP STAGE, introduced by commit 7a9dfdd, connections have to take protection against an ongoing FTWRL or BACKUP STAGE. This is in many ways similar to how it used to work before with the GLOBAL S-lock and COMMIT S-lock. The culprit of this regression is that the GLOBAL and COMMIT namespaces were combined into a single BACKUP namespace to make the code simpler. Now we have a single contention point with doubled load on the BACKUP lock internals; in other words, system throughput is halved.

      MDL_lock internals

      For the purpose of protection against an ongoing FTWRL or BACKUP STAGE, MDL_BACKUP_DML/MDL_BACKUP_TRANS_DML and MDL_BACKUP_COMMIT have to be acquired in the BACKUP namespace. These locks are mutually compatible; multiple connections are allowed to hold them concurrently.
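
      For illustration, this is roughly how a statement requests such protection through the MDL API (simplified; exact call sites, lock types and durations vary between versions, so treat the snippet as a sketch rather than the actual code):

      MDL_request protection_request;

      /* Protection lives in the BACKUP namespace; the key carries no db/table name. */
      protection_request.init(MDL_key::BACKUP, "", "", MDL_BACKUP_DML,
                              MDL_STATEMENT);

      /* Blocks only if FTWRL or a conflicting BACKUP STAGE is granted or pending. */
      if (thd->mdl_context.acquire_lock(&protection_request,
                                        thd->variables.lock_wait_timeout))
        return true;                                  /* error or timeout */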

      When there is no FTWRL or BACKUP STAGE ongoing, the critical section is fairly simple, roughly speaking:

      wrlock(&backup->m_rwlock);          /* protects the shared BACKUP MDL_lock state */
      if (!(backup->granted_bitmap & ticket->incompatible_granted_bitmap) &&
          !(backup->waiting_bitmap & ticket->incompatible_waiting_bitmap))
      {
        /* no conflicting granted or pending lock: grant immediately */
        backup->granted_list.add(ticket);
        backup->granted_bitmap|= ticket->type_bit;
      }
      unlock(&backup->m_rwlock);
      

      In other words: make sure there is no ongoing or pending FTWRL or BACKUP STAGE, and add the current connection to the lock holders.

      Proposed solution

      Multi-instance MDL_lock, which gives multiple contention points. Compatible locks (like MDL_BACKUP_COMMIT) will go into one specific instance, whereas heavyweight locks (like COMMIT X-lock aka MDL_BACKUP_WAIT_COMMIT) will have to expose themselves via all instances.

      MDL_BACKUP_COMMIT example:

      backup= backup_instances[connection_id % num_instances]; /* each connection maps to one instance */
      wrlock(&backup->m_rwlock);
      if (!(backup->granted_bitmap & ticket->incompatible_granted_bitmap) &&
          !(backup->waiting_bitmap & ticket->incompatible_waiting_bitmap))
      {
        backup->granted_list.add(ticket);
        backup->granted_bitmap|= ticket->type_bit;
      }
      unlock(&backup->m_rwlock);
      

      MDL_BACKUP_WAIT_COMMIT example (rough sketch; more complex in reality):

      /* heavyweight locks have to be visible in every instance */
      for (i= 0; i < num_instances; i++)
      {
        backup= backup_instances[i];
        wrlock(&backup->m_rwlock);
        if (!(backup->granted_bitmap & ticket->incompatible_granted_bitmap) &&
            !(backup->waiting_bitmap & ticket->incompatible_waiting_bitmap))
        {
          backup->granted_list.add(ticket);
          backup->granted_bitmap|= ticket->type_bit;
        }
        unlock(&backup->m_rwlock);
      }
      

      Alternative solutions

      This can also be fixed by implementing something similar to MySQL WL#7306 "Improve MDL performance and scalability by implementing lock-free lock acquisition for DML". It adds an atomic variable in front of the critical section; based on that atomic variable, the critical section can be skipped when there are no concurrent heavyweight locks (a rough sketch follows the cons below).
      Cons:
      1. overcomplicated heavyweight lock handling: such locks have to materialise tickets that were not added to granted_list for the purpose of deadlock detection
      2. although it is much faster compared to the original critical section, it is still a single contention point
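
      Below is a minimal, self-contained sketch of this fast-path idea in generic C++ (hypothetical class and member names, not the actual WL#7306 or MariaDB code): DML protection is granted by touching an atomic counter, and the mutex-protected critical section is entered only while FTWRL or BACKUP STAGE is granted or pending.

      #include <atomic>
      #include <condition_variable>
      #include <cstdint>
      #include <mutex>

      class backup_lock_sketch
      {
        std::atomic<uint64_t> dml_holders{0};  /* granted MDL_BACKUP_DML-style locks */
        std::atomic<bool> heavyweight{false};  /* FTWRL/BACKUP STAGE granted or pending */
        std::mutex m;
        std::condition_variable cond;

        void notify_if_drained()
        {
          std::lock_guard<std::mutex> lk(m);
          cond.notify_all();
        }

      public:
        void acquire_dml_protection()
        {
          /* Fast path: optimistically register, then verify nothing heavy is around. */
          dml_holders.fetch_add(1, std::memory_order_seq_cst);
          if (!heavyweight.load(std::memory_order_seq_cst))
            return;                            /* common case: critical section skipped */

          /* Back off and take the ordinary slow path under the mutex. */
          release_dml_protection();
          std::unique_lock<std::mutex> lk(m);
          cond.wait(lk, [this] { return !heavyweight.load(); });
          /* heavyweight changes only under the mutex, so this cannot race. */
          dml_holders.fetch_add(1, std::memory_order_relaxed);
        }

        void release_dml_protection()
        {
          /* If FTWRL/BACKUP STAGE is waiting and we were the last holder, wake it. */
          if (dml_holders.fetch_sub(1, std::memory_order_seq_cst) == 1 &&
              heavyweight.load(std::memory_order_seq_cst))
            notify_if_drained();
        }

        void acquire_backup_stage()            /* the heavyweight side, e.g. FTWRL */
        {
          std::unique_lock<std::mutex> lk(m);
          cond.wait(lk, [this] { return !heavyweight.load(); });   /* one at a time */
          heavyweight.store(true, std::memory_order_seq_cst);
          /* New DML protections now divert to the slow path; wait for old ones. */
          cond.wait(lk, [this] { return dml_holders.load() == 0; });
        }

        void release_backup_stage()
        {
          std::lock_guard<std::mutex> lk(m);
          heavyweight.store(false, std::memory_order_seq_cst);
          cond.notify_all();
        }
      };

      Note how con #1 shows up even in this toy version: a ticket granted on the fast path never enters granted_list, so the heavyweight side would have to materialise such locks for the deadlock detector.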

      Complications

      • Galera code in MDL (most probably shouldn't be there)
      • Replication code in MDL (most probably shouldn't be there)
      • MDL deadlock detector

      Extra stuff that should be moved out of the critical section (a minimal generic illustration follows this list)

      • ticket->m_time
      • performance schema handling
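
      A minimal, generic C++ illustration of the idea (hypothetical names, not the actual MariaDB code): per-ticket bookkeeping is done before the lock is taken, so only shared state is touched while serialised.

      #include <chrono>
      #include <mutex>

      struct ticket_sketch
      {
        std::chrono::steady_clock::time_point m_time;   /* creation timestamp */
      };

      void grant_sketch(ticket_sketch *ticket, std::mutex &lock_mutex)
      {
        /* Cheap per-ticket work (timestamps, instrumentation) stays outside... */
        ticket->m_time= std::chrono::steady_clock::now();

        std::lock_guard<std::mutex> guard(lock_mutex);
        /* ...so the critical section only updates the shared lock state. */
      }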

      One can argue that these consume just one CPU tick and are nothing compared to the rest of the critical section. However, the hot path is pretty straightforward too, and here is a story about how villagers lost their happy holiday...

      Once upon a time, 101 bright Villa Riba software developers were standing in a line for a bathroom. For brushing, of course. They spent 3 minutes each on average to complete their things. A full round completes in roughly 300 minutes, with a total wait time of 15150 developer minutes.

      At the same time, 101 bright Villa Baggio software developers were also standing in a line for a bathroom. In addition to 3 minutes of brushing, they did 1 minute of shaving. A full round completes in roughly 400 minutes, with a total wait time of 20200 developer minutes.

      Developers are expensive and they like their pay, so Villa Baggio had to pay for an extra 84 hours. The happy bearded Villa Riba developers are celebrating a fiesta, while the Villa Baggio developers are still queueing for the bathroom, and they don't have money for a fiesta anyway.

      This is the cost of adding small things to critical sections: 1 extra minute in a critical section becomes 84 hours idling for the whole system.
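
      For reference, the arithmetic behind the story: with a per-visit time of t minutes, the i-th developer in line (i = 0..100) waits i*t minutes, so the total wait is t*(0 + 1 + ... + 100) = 5050*t developer minutes. That gives 15150 minutes for t = 3 and 20200 minutes for t = 4; the difference is 5050 minutes, roughly 84 hours.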

            People

              monty Michael Widenius
              svoj Sergey Vojtovich
