[MDEV-3891] Deadlock with statistics tables and ANALYZE Created: 2012-11-28  Updated: 2017-05-08  Resolved: 2014-04-08

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Attachments: File t45.test     File threads45    
Issue Links:
Duplicate
is duplicated by MDEV-10804 main.stat_tables_par fails sporadical... Closed
Relates
relates to MDEV-3806 Engine independent statistics Closed

 Description   

(Initially found by Elena)

I'll attach a testcase, which eventually gets three threads to be stuck in "Waiting for table level lock" state.

The testcase must be run like this: ./mysql-test-run t/t45.test --repeat=100 (can also add --testcase-timeout=600 to ease debugging). Execution will get stuck after a minute. On my machine (pylon-fedora15), only release build gets stuck, and only when the server is started by MTR.

Being stuck looks like this:

+----+------+-----------------+------+---------+------+------------------------------+------------------------------------------------+----------+
| Id | User | Host            | db   | Command | Time | State                        | Info                                           | Progress |
+----+------+-----------------+------+---------+------+------------------------------+------------------------------------------------+----------+
| 23 | root | localhost       | test | Sleep   |   17 |                              | NULL                                           |    0.000 |
| 24 | root | localhost       | test | Query   |   17 | Waiting for table level lock | ANALYZE TABLE E                                |    0.000 |
| 25 | root | localhost       | test | Query   |   17 | Waiting for table level lock | ANALYZE TABLE A                                |    0.000 |
| 26 | root | localhost       | test | Query   |   17 | Waiting for table level lock | SELECT * FROM mysql.index_stat, test.E LIMIT 1 |    0.000 |
| 27 | root | localhost:52086 | NULL | Query   |    0 | NULL                         | show processlist                               |    0.000 |
+----+------+-----------------+------+---------+------+------------------------------+------------------------------------------------+----------+



 Comments   
Comment by Sergei Petrunia [ 2012-11-28 ]

Debugging results #1

*****************************************************************************

    • Thread 4
      *****************************************************************************
      "ANALYZE TABLE A"
  • Has a table lock on table A << How do we kow that??
    Tries to get three locks: { table_stat, column_stat, index_stat }, all for
    TL_WRITE..

    We're hung on getting the 3rd one.. it's a TL_WRITE lock on index_stat.

    #2 wait_for_lock (wait=0x7f868c038580, data=0x7f86800371a0, in_wait_list=0 '\000', lock_wait_timeout=<optimized out>) at /home/psergey/dev2/maria-5.5-mwl248/mysys/thr_lock.c:569
    (gdb) p data->lock->name
    $374 = 0x7f868c02a580 "index_stat"

    We are:
    (gdb) p thd->lock_info
    $405 = {thread = 140216257513216, thread_id = 25}

    We're waiting on: index_stat,:

    (gdb) p * data->lock->read->data->owner
    $415 = {thread = 140216257214208, thread_id = 26}

    *****************************************************************************
    ** Thread 3
    *****************************************************************************
    "SELECT * FROM mysql.index_stat, test.E LIMIT 1"

    #3 in thr_lock ()
    #4 thr_multi_lock ()
    #5 in mysql_lock_tables ()
    #6 in mysql_lock_tables ()
    #7 in lock_tables ()
    #8 in open_and_lock_tables ()
    #9 in open_and_lock_tables ()
    #10 open_system_tables_for_read ()
    #11 in read_statistics_for_tables_if_needed ()

    Here:
    lock_count=3, { table_stat, column_stat, index_stat }


    type=TL_READ on every one.

We're stuck when locking table #0, table_stat.
#2 wait_for_lock (wait=0x7f868c02d620, data=0x7f868002d510, in_wait_list=0 '\000', lock_wait_timeout=<optimized out>) at /home/psergey/dev2/maria-5.5-mwl248/mysys/thr_lock.c:569
(gdb) p data->lock->name
$376 = 0x7f868c02a510 "table_stat"

We are:
(gdb) p thd->lock_info
$421 =

{thread = 140216257214208, thread_id = 26}

We're waiting on: table_stat, READ lock.

(gdb) p * data->lock->write->data->owner
$435 =

{thread = 140216257513216, thread_id = 25}


*****************************************************************************
** Thread 5
*****************************************************************************
"ANALYZE TABLE E"

lock_count=3, all TL_WRITE, { table_stat, column_stat, index_stat },
we're locking 0th table.

(gdb) up
#2 wait_for_lock (wait=0x7f868c02d640, data=0x7f868c023ce0, in_wait_list=0 '\000', lock_wait_timeout=<optimized out>) at /home/psergey/dev2/maria-5.5-mwl248/mysys/thr_lock.c:569
(gdb) p data->lock->name
$378 = 0x7f868c02a510 "table_stat"

(gdb) p thd->lock_info
$400 = {thread = 140216257812224, thread_id = 24}

(gdb) p *data->lock->write->data->owner
$397 = {thread = 140216257513216, thread_id = 25}

Ok, we're thread 24 and we're waiting for "table_stat" because thread 25 has it.

Comment by Sergei Petrunia [ 2012-11-28 ]

A hypothesis:

Deadlock is caused by this sequence of events:

SELECT takes a read lock on

{mysql.index_stat, E}

ANALYZE TABLE A tries to take write locks on

{table_stat, column_stat, index_stat}

it gets locks on table_stat and column_stat
it doesn't get lock on index_stat, because it is locked.
It ends up waiting.

SELECT tries to load statistics data for the tables it is using.
table_stat is locked by the ANALYZE command, so it is waiting.

(The other ANALYZE command is also waiting, but it not actually pariticipating
in the deadlock. Perhaps, it is necessary so that SELECT will try to load
statistics data)

Comment by Elena Stepanova [ 2012-12-12 ]

Info requested by Igor: I observed the problem on maria-5.5-mwl248 revno 3569 (and AFAIR also double-checked later on 3570).

Comment by Elena Stepanova [ 2014-04-08 ]

The bug was fixed in maria-5.5-mwl248 by the following revision:

revno: 3577
revision-id: igor@askmonty.org-20121213071654-k1ghi7u5s10nvnno
parent: igor@askmonty.org-20121210053308-o0qpwt4upw7e9740
committer: Igor Babaev <igor@askmonty.org>
branch nick: maria-5.5-mwl248
timestamp: Wed 2012-12-12 23:16:54 -0800
message:
  Fixed bug mdev-3891.
  If a query referenced some system statistical tables, but not all of them,
  then executing an ANALYZE command simultaneously with this query could
  lead to a deadlock.
  The fix prohibited reading statistics from system statistical tables
  for such queries.
  
  Removed the function unlock_tables_n_open_system_tables_for_write()
  as not used anymore.
  Performed some minor refactoring of the code in sql_statistics.cc. 

Generated at Thu Feb 08 06:52:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.