[MDEV-10796] tokudb+mariadb server stalled Created: 2016-09-12  Updated: 2018-01-01  Resolved: 2018-01-01

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - TokuDB
Affects Version/s: 10.0.22
Fix Version/s: 10.1.20, 10.2.4

Type: Bug Priority: Major
Reporter: Manuel Arostegui Assignee: Lixun Peng
Resolution: Fixed Votes: 0
Labels: mariadb, tokudb, upstream
Environment:

Ubuntu, tokudb-7.5.7



 Description   

We have a server running several instances of MariaDB (10.0.22) with tables using the TokuDB engine (tokudb-7.5.7), replicating from a master in a normal master-slave setup.

We realised that one of the processes was lagging a lot and found a query that was stuck. The process was completely stuck (STOP SLAVE didn't even work; it just hung trying to kill the replication thread). The only solution was to kill -9 mysqld.

After migrating the table to InnoDB the problem appears to be gone.
The full bug report is here: https://bugs.launchpad.net/percona-server/+bug/1621852 but Percona also suggested opening an issue here, as it might be something between MariaDB and the TokuDB plugin.



 Comments   
Comment by Elena Stepanova [ 2016-09-19 ]

There isn't much information to go on at the moment. If it happens again, please

  • do what Percona requested in the upstream report;
  • since it's a TokuDB table, it makes more sense to run SHOW ENGINE TokuDB STATUS than SHOW ENGINE InnoDB STATUS;
  • attach your cnf file(s);
  • attach your error logs;
  • check the disk status;
  • run and paste the output of SHOW TABLE STATUS LIKE 'user_groups';
  • check if you have triggers on the table, and if there are any, paste their definitions.
Comment by Manuel Arostegui [ 2016-09-20 ]

Hello Elena,

This happened again and I have attached all the information to the percona bug report: https://bugs.launchpad.net/percona-server/+bug/1621852

Answering some of your specific questions:

  • We do not have triggers on that table.
  • The error log shows nothing.
  • The disk and RAID are both fine.

Again, once it happened, the only solution was to kill -9 the process, alter the table to InnoDB and let replication flow again.

Comment by Elena Stepanova [ 2016-09-20 ]

Thanks.
Let's wait and see what the Percona folks come up with; TokuDB is their area of expertise. If they determine it's the server's fault, then it will be MariaDB's turn.

Comment by Manuel Arostegui [ 2016-09-30 ]

Hello Elena,

Can you review what Percona said? https://bugs.launchpad.net/percona-server/+bug/1621852

Comment by Manuel Arostegui [ 2016-10-14 ]

Any update on this? Did you have time to look at what Percona said?

Thanks

Comment by Elena Stepanova [ 2016-10-14 ]

plinux,
Percona suspects MariaDB parallel replication to be the cause of the problem. Could you please review their assessment?

Comment by Manuel Arostegui [ 2016-10-14 ]

Note that we do have parallel replication disabled:

MariaDB SANITARIUM localhost (none) > show global variables like 'slave_parallel_mode';
Empty set (0.00 sec)

Comment by Xie Yongmei [ 2017-01-11 ]

Hi, I will try to explain my understanding of this issue.
This is Xie Yongmei, from the Alibaba RDS team.

The root cause of this issue might be the way the rangelock's waiting list is signalled.

The current rangelock design is:
1) Each write transaction must acquire a rangelock before modifying the index tree (actually an FT, fractal tree, in TokuDB) to prevent concurrent read/write operations on the index rows.
2) A read-only query acquires a rangelock in the cursor-get callback, for snapshot reads.

3) The process of acquiring a rangelock (in toku_db_get_range_lock):
I. Call toku_db_start_range_lock to get the rangelock (in fact, it has trylock semantics);

  • on conflict, it notifies the locktree to track the request in the pending list.

II. On grant or deadlock, toku_db_start_range_lock just returns.
III. On conflict, toku_db_get_range_lock calls toku_db_wait_range_lock, which waits on the condition variable defined in the request's own context.

4) The process of releasing a rangelock when a transaction commits or aborts:
I. Release the rangelock it held.
II. Retry all the rangelock requests waiting on the same locktree;

  • on success, signal the condition variable in the request's context.
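
The acquire/release flow above can be sketched in a tiny, self-contained model (all names here are illustrative, not the real TokuDB API): one centralized mutex and a single-slot "pending list", but a per-request condition variable, and, crucially, no re-check of the conflict before sleeping.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical, much-simplified model of the flow described above.
struct lock_request {
    std::condition_variable cv;  // per-request CV, defined in the request's own context
    bool granted = false;
};

struct locktree_model {
    std::mutex m;                     // centralized locktree mutex
    lock_request* pending = nullptr;  // pending list (one slot, for brevity)
    bool held = false;

    // step 3.I: trylock semantics; on conflict, track the request in the pending list
    bool start_range_lock(lock_request* req) {
        std::lock_guard<std::mutex> g(m);
        if (!held) { held = true; req->granted = true; return true; }
        pending = req;
        return false;
    }

    // step 3.III: sleep on the request's own CV. Note there is NO re-check
    // of the conflict before sleeping -- this is the hazard.
    bool wait_range_lock(lock_request* req, std::chrono::milliseconds to) {
        std::unique_lock<std::mutex> ul(m);
        return req->cv.wait_for(ul, to) == std::cv_status::no_timeout;
    }

    // step 4: on commit/abort, retry the waiter and signal its CV
    void release_range_lock() {
        std::lock_guard<std::mutex> g(m);
        if (pending) {
            pending->granted = true;   // lock is handed over to the waiter
            pending->cv.notify_one();  // signal fires whether or not anyone waits yet
            pending = nullptr;
        } else {
            held = false;
        }
    }
};
```

If the releasing transaction runs between steps I and III, the notify_one fires before the waiter is actually on the condition variable, so the wakeup is lost and the waiter blocks until its timeout expires.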

The following scenario can happen:
t1: txn1 calls toku_db_start_range_lock, finds a conflict, and asks the locktree's pending list to track its request.
t2: txn2 commits; it releases the rangelock txn1 was waiting for, retries the lock acquisition on txn1's behalf (possible because the request was tracked in the pending list at t1), and signals txn1 to proceed.
t3: txn1 calls toku_db_wait_range_lock to sleep on its own condition variable, but unfortunately it has already missed the signal, so it won't wake up until the timeout expires.

The above example shows that even when no rangelock conflict remains, transaction txn1 can wait for a long time.

The implementation of the TokuDB rangelock is unusual:
it uses a centralized waiting list (the locktree's pending list) and a centralized mutex, but each rangelock request has its own condition variable, defined in its own context, and sleeps on that.

So the wakeup process is tricky: the transaction releasing the rangelock is responsible for acquiring the rangelock on behalf of the blocked transaction and signalling it to proceed.

A rough workaround is as follows:
before sleeping, the request should verify, with m_info->mutex held, whether the rangelock conflict still exists.
If the conflict has disappeared, remove the request from the locktree's pending list and return grant; otherwise sleep on its condition variable.
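
In a simplified model of the design above (hypothetical names, not the real TokuDB code), the workaround amounts to re-checking the grant state under the centralized mutex before sleeping; the predicate overload of std::condition_variable::wait_for does exactly that atomically.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical sketch of the proposed workaround: verify under the
// centralized mutex whether the conflict still exists before sleeping.
struct lock_request {
    std::condition_variable cv;
    bool granted = false;
};

struct locktree_model {
    std::mutex m;
    lock_request* pending = nullptr;  // pending list (one slot, for brevity)
    bool held = false;

    bool start_range_lock(lock_request* req) {
        std::lock_guard<std::mutex> g(m);
        if (!held) { held = true; req->granted = true; return true; }
        pending = req;
        return false;
    }

    bool wait_range_lock(lock_request* req, std::chrono::milliseconds to) {
        std::unique_lock<std::mutex> ul(m);
        // the fix: the predicate is checked before the first sleep and after
        // every wakeup, so a signal delivered between the trylock and the
        // wait can no longer be lost
        return req->cv.wait_for(ul, to, [&] { return req->granted; });
    }

    void release_range_lock() {
        std::lock_guard<std::mutex> g(m);
        if (pending) { pending->granted = true; pending->cv.notify_one(); pending = nullptr; }
        else held = false;
    }
};
```

With this change, a request whose conflict disappeared before it went to sleep returns grant immediately instead of stalling until the lock-wait timeout.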

Comment by Manuel Arostegui [ 2017-05-26 ]

This is an update from Percona at: https://bugs.launchpad.net/bugs/1621852

So far we believe this may be related to lock tree stalls. MariaDB has
imported a patch a while ago to address this and we have been reviewing
and improving the patch for Percona Server. This work is being tracked
here https://jira.percona.com/browse/TDB-3, please login there to
Percona JIRA and 'watch' for future updates. Marking it as opinion here
as there is no other accurate matching state.

Comment by Daniel Black [ 2018-01-01 ]

TDB-3 is fixed. Merged into MariaDB as https://github.com/MariaDB/server/commit/d145d1b6

Closing this might have been overly keen; please check, but I did follow the patches through to the above commit.

Generated at Thu Feb 08 07:44:59 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.