Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-3891

Deadlock with statistics tables and ANALYZE

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • None
    • None
    • None
    • None

    Description

      (Initially found by Elena)

      I'll attach a testcase, which eventually gets three threads to be stuck in "Waiting for table level lock" state.

      The testcase must be run like this: ./mysql-test-run t/t45.test --repeat=100 (can also add --testcase-timeout=600 to ease debugging). Execution will get stuck after a minute. On my machine (pylon-fedora15), only release build gets stuck, and only when the server is started by MTR.

      Being stuck looks like this:

      +----+------+-----------------+------+---------+------+------------------------------+------------------------------------------------+----------+
      | Id | User | Host            | db   | Command | Time | State                        | Info                                           | Progress |
      +----+------+-----------------+------+---------+------+------------------------------+------------------------------------------------+----------+
      | 23 | root | localhost       | test | Sleep   |   17 |                              | NULL                                           |    0.000 |
      | 24 | root | localhost       | test | Query   |   17 | Waiting for table level lock | ANALYZE TABLE E                                |    0.000 |
      | 25 | root | localhost       | test | Query   |   17 | Waiting for table level lock | ANALYZE TABLE A                                |    0.000 |
      | 26 | root | localhost       | test | Query   |   17 | Waiting for table level lock | SELECT * FROM mysql.index_stat, test.E LIMIT 1 |    0.000 |
      | 27 | root | localhost:52086 | NULL | Query   |    0 | NULL                         | show processlist                               |    0.000 |
      +----+------+-----------------+------+---------+------+------------------------------+------------------------------------------------+----------+

      Attachments

        1. t45.test
          65 kB
        2. threads45
          17 kB

        Issue Links

          Activity

            Debugging results #1

            *****************************************************************************

              • Thread 4
                *****************************************************************************
                "ANALYZE TABLE A"
            • Has a table lock on table A << How do we kow that??
              Tries to get three locks: { table_stat, column_stat, index_stat }, all for
              TL_WRITE..

              We're hung on getting the 3rd one.. it's a TL_WRITE lock on index_stat.

              #2 wait_for_lock (wait=0x7f868c038580, data=0x7f86800371a0, in_wait_list=0 '\000', lock_wait_timeout=<optimized out>) at /home/psergey/dev2/maria-5.5-mwl248/mysys/thr_lock.c:569
              (gdb) p data->lock->name
              $374 = 0x7f868c02a580 "index_stat"

              We are:
              (gdb) p thd->lock_info
              $405 = {thread = 140216257513216, thread_id = 25}

              We're waiting on: index_stat,:

              (gdb) p * data->lock->read->data->owner
              $415 = {thread = 140216257214208, thread_id = 26}

              *****************************************************************************
              ** Thread 3
              *****************************************************************************
              "SELECT * FROM mysql.index_stat, test.E LIMIT 1"

              #3 in thr_lock ()
              #4 thr_multi_lock ()
              #5 in mysql_lock_tables ()
              #6 in mysql_lock_tables ()
              #7 in lock_tables ()
              #8 in open_and_lock_tables ()
              #9 in open_and_lock_tables ()
              #10 open_system_tables_for_read ()
              #11 in read_statistics_for_tables_if_needed ()

              Here:
              lock_count=3, { table_stat, column_stat, index_stat }


              type=TL_READ on every one.

            We're stuck when locking table #0, table_stat.
            #2 wait_for_lock (wait=0x7f868c02d620, data=0x7f868002d510, in_wait_list=0 '\000', lock_wait_timeout=<optimized out>) at /home/psergey/dev2/maria-5.5-mwl248/mysys/thr_lock.c:569
            (gdb) p data->lock->name
            $376 = 0x7f868c02a510 "table_stat"

            We are:
            (gdb) p thd->lock_info
            $421 =

            {thread = 140216257214208, thread_id = 26}

            We're waiting on: table_stat, READ lock.

            (gdb) p * data->lock->write->data->owner
            $435 =

            {thread = 140216257513216, thread_id = 25}


            *****************************************************************************
            ** Thread 5
            *****************************************************************************
            "ANALYZE TABLE E"

            lock_count=3, all TL_WRITE, { table_stat, column_stat, index_stat },
            we're locking 0th table.

            (gdb) up
            #2 wait_for_lock (wait=0x7f868c02d640, data=0x7f868c023ce0, in_wait_list=0 '\000', lock_wait_timeout=<optimized out>) at /home/psergey/dev2/maria-5.5-mwl248/mysys/thr_lock.c:569
            (gdb) p data->lock->name
            $378 = 0x7f868c02a510 "table_stat"

            (gdb) p thd->lock_info
            $400 = {thread = 140216257812224, thread_id = 24}

            (gdb) p *data->lock->write->data->owner
            $397 = {thread = 140216257513216, thread_id = 25}

            Ok, we're thread 24 and we're waiting for "table_stat" because thread 25 has it.

            psergei Sergei Petrunia added a comment - Debugging results #1 ***************************************************************************** Thread 4 ***************************************************************************** "ANALYZE TABLE A" Has a table lock on table A << How do we kow that?? Tries to get three locks: { table_stat, column_stat, index_stat }, all for TL_WRITE.. We're hung on getting the 3rd one.. it's a TL_WRITE lock on index_stat. #2 wait_for_lock (wait=0x7f868c038580, data=0x7f86800371a0, in_wait_list=0 '\000', lock_wait_timeout=<optimized out>) at /home/psergey/dev2/maria-5.5-mwl248/mysys/thr_lock.c:569 (gdb) p data->lock->name $374 = 0x7f868c02a580 "index_stat" We are: (gdb) p thd->lock_info $405 = {thread = 140216257513216, thread_id = 25} We're waiting on: index_stat,: (gdb) p * data->lock->read->data->owner $415 = {thread = 140216257214208, thread_id = 26} ***************************************************************************** ** Thread 3 ***************************************************************************** "SELECT * FROM mysql.index_stat, test.E LIMIT 1" #3 in thr_lock () #4 thr_multi_lock () #5 in mysql_lock_tables () #6 in mysql_lock_tables () #7 in lock_tables () #8 in open_and_lock_tables () #9 in open_and_lock_tables () #10 open_system_tables_for_read () #11 in read_statistics_for_tables_if_needed () Here: lock_count=3, { table_stat, column_stat, index_stat } type=TL_READ on every one. We're stuck when locking table #0, table_stat. #2 wait_for_lock (wait=0x7f868c02d620, data=0x7f868002d510, in_wait_list=0 '\000', lock_wait_timeout=<optimized out>) at /home/psergey/dev2/maria-5.5-mwl248/mysys/thr_lock.c:569 (gdb) p data->lock->name $376 = 0x7f868c02a510 "table_stat" We are: (gdb) p thd->lock_info $421 = {thread = 140216257214208, thread_id = 26} We're waiting on: table_stat, READ lock. (gdb) p * data->lock->write->data->owner $435 = {thread = 140216257513216, thread_id = 25} ***************************************************************************** ** Thread 5 ***************************************************************************** "ANALYZE TABLE E" lock_count=3, all TL_WRITE, { table_stat, column_stat, index_stat }, we're locking 0th table. (gdb) up #2 wait_for_lock (wait=0x7f868c02d640, data=0x7f868c023ce0, in_wait_list=0 '\000', lock_wait_timeout=<optimized out>) at /home/psergey/dev2/maria-5.5-mwl248/mysys/thr_lock.c:569 (gdb) p data->lock->name $378 = 0x7f868c02a510 "table_stat" (gdb) p thd->lock_info $400 = {thread = 140216257812224, thread_id = 24} (gdb) p *data->lock->write->data->owner $397 = {thread = 140216257513216, thread_id = 25} Ok, we're thread 24 and we're waiting for "table_stat" because thread 25 has it.

            A hypothesis:

            Deadlock is caused by this sequence of events:

            SELECT takes a read lock on

            {mysql.index_stat, E}

            ANALYZE TABLE A tries to take write locks on

            {table_stat, column_stat, index_stat}

            it gets locks on table_stat and column_stat
            it doesn't get lock on index_stat, because it is locked.
            It ends up waiting.

            SELECT tries to load statistics data for the tables it is using.
            table_stat is locked by the ANALYZE command, so it is waiting.

            (The other ANALYZE command is also waiting, but it not actually pariticipating
            in the deadlock. Perhaps, it is necessary so that SELECT will try to load
            statistics data)

            psergei Sergei Petrunia added a comment - A hypothesis: Deadlock is caused by this sequence of events: SELECT takes a read lock on {mysql.index_stat, E} ANALYZE TABLE A tries to take write locks on {table_stat, column_stat, index_stat} it gets locks on table_stat and column_stat it doesn't get lock on index_stat, because it is locked. It ends up waiting. SELECT tries to load statistics data for the tables it is using. table_stat is locked by the ANALYZE command, so it is waiting. (The other ANALYZE command is also waiting, but it not actually pariticipating in the deadlock. Perhaps, it is necessary so that SELECT will try to load statistics data)

            Info requested by Igor: I observed the problem on maria-5.5-mwl248 revno 3569 (and AFAIR also double-checked later on 3570).

            elenst Elena Stepanova added a comment - Info requested by Igor: I observed the problem on maria-5.5-mwl248 revno 3569 (and AFAIR also double-checked later on 3570).

            The bug was fixed in maria-5.5-mwl248 by the following revision:

            revno: 3577
            revision-id: igor@askmonty.org-20121213071654-k1ghi7u5s10nvnno
            parent: igor@askmonty.org-20121210053308-o0qpwt4upw7e9740
            committer: Igor Babaev <igor@askmonty.org>
            branch nick: maria-5.5-mwl248
            timestamp: Wed 2012-12-12 23:16:54 -0800
            message:
              Fixed bug mdev-3891.
              If a query referenced some system statistical tables, but not all of them,
              then executing an ANALYZE command simultaneously with this query could
              lead to a deadlock.
              The fix prohibited reading statistics from system statistical tables
              for such queries.
              
              Removed the function unlock_tables_n_open_system_tables_for_write()
              as not used anymore.
              Performed some minor refactoring of the code in sql_statistics.cc. 

            elenst Elena Stepanova added a comment - The bug was fixed in maria-5.5-mwl248 by the following revision: revno: 3577 revision-id: igor@askmonty.org-20121213071654-k1ghi7u5s10nvnno parent: igor@askmonty.org-20121210053308-o0qpwt4upw7e9740 committer: Igor Babaev <igor@askmonty.org> branch nick: maria-5.5-mwl248 timestamp: Wed 2012-12-12 23:16:54 -0800 message: Fixed bug mdev-3891. If a query referenced some system statistical tables, but not all of them, then executing an ANALYZE command simultaneously with this query could lead to a deadlock. The fix prohibited reading statistics from system statistical tables for such queries. Removed the function unlock_tables_n_open_system_tables_for_write() as not used anymore. Performed some minor refactoring of the code in sql_statistics.cc.

            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.