Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.0.24
-
None
-
Debian Linux
Description
We have a MariaDB server that sometimes seems to get an indefinite table lock. The only way to solve it is to kill MariaDB and restart it. It "randomly happens", so I suspect it may be a race condition. The replication thread get's stuck in state "After opening tables". The lock seems to be caused by a query on an unrelated database with 3 million MyISAM tables (not receiving data via replication) in state "Opening tables". These databases are completely separated, so they shouldn't cause a lock between each other.
Processlist shows:
root@server ~ # mysql -e "show full processlist" |
+----------+----------------------+-----------------------+----------------------+---------+---------+----------------------------------+-------------------------------------------------------+----------+ |
| Id | User | Host | db | Command | Time | State | Info | Progress | |
+----------+----------------------+-----------------------+----------------------+---------+---------+----------------------------------+-------------------------------------------------------+----------+ |
| 2 | system user | | NULL | Connect | 1116538 | Waiting for master to send event | NULL | 0.000 | |
| 3 | system user | | replication_database | Connect | 264121 | After opening tables | COMMIT | 0.000 | |
| 37332419 | user1 | localhost | large_database | Killed | 264121 | Opening tables | DELETE FROM `my_domain` WHERE date<'2016-04-29' | 0.000 | |
| 48692447 | root | localhost | NULL | Query | 0 | init | show full processlist | 0.000 | |
+----------+----------------------+-----------------------+----------------------+---------+---------+----------------------------------+-------------------------------------------------------+----------+ |
root@server ~ #
|
New backtrace attached. Unfortunately mysqld started "hanging" shortly afterwards and became unresponsive to "kill -6". In the end I had to force kill it with "kill -9" which didn't produce a core dump. I did get a core dump from another slave that had the exact same issue (same position), however not running with the special binary. Let me know if that should be uploaded, hopefully the new backtrace provides sufficient information.