Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4009

main.delayed sporadically fails with "query 'REPLACE DELAYED t1 VALUES (5)' failed: 1317: Query execution was interrupted"

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 5.5.29
    • 5.5.29
    • None
    • None

    Description

      Here is an extract that helps to reproduce the problem quite easily:

      drop table if exists t1;
       
      --disable_query_log
      --disable_result_log
       
      --let $run=1000
       
      while ($run)
      {
        --echo # $run attempts left...
        CREATE TABLE t1 ( f1 INTEGER AUTO_INCREMENT, PRIMARY KEY (f1)) ENGINE=MyISAM;
        INSERT DELAYED t1 VALUES (4);
        --error ER_TABLE_EXISTS_ERROR
        CREATE TABLE t1 AS SELECT 1 AS f1;
       
        REPLACE DELAYED t1 VALUES (5);
        DROP TABLE t1;
        --dec $run
      }
       
      --echo # All done

      Normally it would go to the end and print "All done". With the bug, it stops in the middle with "Query execution was interrupted". Tried Linux and Windows, release and debug builds (particularly, BUILD/compile-pentium-debug-max-no-ndb).

      It started happening after MDEV-3941, with this revision:

      revno: 3593
      revision-id: monty@askmonty.org-20121216141317-77vdudlfczp9sd84
      parent: monty@askmonty.org-20121216100426-23v67jvuacc9rtk1
      committer: Michael Widenius <monty@askmonty.org>
      branch nick: maria-5.5
      timestamp: Sun 2012-12-16 16:13:17 +0200
      message:
        Implemented MDEV-3941: CREATE TABLE xxx IF NOT EXISTS should not block if table exists.
        - Added option to check_if_table_exists() to quickly check if table exists (either SHARE or .FRM)
        - Extended lock_table_names() to not wait for meta data locks if CREATE IF NOT EXISTS is used.

      Attachments

        Activity

          I don't think the problem is critical as such, but Monty asked to mark as 'Critical' those that need to be taken care of, and that's one of them – it's a recent regression in GA, and besides it breaks buildbot tests.

          elenst Elena Stepanova added a comment - I don't think the problem is critical as such, but Monty asked to mark as 'Critical' those that need to be taken care of, and that's one of them – it's a recent regression in GA, and besides it breaks buildbot tests.

          Could not reproduce on MySQL 5.5 (release or debug), or on MariaDB 5.5.28 release and on maria/5.5 revno 3592, or on maria/10.0 and 10.0-base (MDEV-3941 is not there yet).

          However, I observed a different problem on release binaries of 5.5.28, both MariaDB and MySQL. The same test sometimes gets stuck, the process list shows

          --------------------------------------------------------------------------------------------------

          Id User Host db Command Time State Info Progress

          --------------------------------------------------------------------------------------------------

          2 root localhost test Query 118 Waiting for table metadata lock DROP TABLE t1 0.000
          1584 DELAYED localhost test Killed 118 Waiting for INSERT   0.000
          1585 root localhost:52815 test Query 0 NULL show processlist 0.000

          --------------------------------------------------------------------------------------------------

          And stack trace is like

          Thread 3 (Thread 0x7fca79c57700 (LWP 19711)):
          #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
          #1 0x00000000006572d3 in inline_mysql_cond_timedwait (src_line=<optimized out>, src_file=<optimized out>, abstime=<optimized out>, mutex=<optimized out>, that=<optimized out>) at /homeinclude/mysql/psi/mysql_thread.h:1012
          #2 MDL_wait::timed_wait (this=0x23e3800, thd=0x23e3710, abs_timeout=0x7fca79c54ab0, set_status_on_timeout=false, wait_state_name=<optimized out>) at /homesql/mdl.cc:1191
          #3 0x0000000000659682 in MDL_context::acquire_lock (this=0x23e3800, mdl_request=0x247e240, lock_wait_timeout=140507761248984) at /homesql/mdl.cc:2103
          #4 0x0000000000659ce4 in MDL_context::acquire_locks (this=0x23e3800, mdl_requests=<optimized out>, lock_wait_timeout=31536000) at /homesql/mdl.cc:2214
          #5 0x0000000000549c83 in lock_table_names (thd=0x23e3710, tables_start=<optimized out>, tables_end=0x247de30, lock_wait_timeout=31536000, flags=1) at /homesql/sql_base.cc:4739
          #6 0x0000000000609bc9 in mysql_rm_table (thd=0x23e3710, tables=0x247de30, if_exists=0 '\000', drop_temporary=0 '\000') at /homesql/sql_table.cc:1895
          #7 0x0000000000590d53 in mysql_execute_command (thd=0x23e3710) at /homesql/sql_parse.cc:3116
          #8 0x0000000000595400 in mysql_parse (thd=0x23e3710, rawbuf=0x247dd68 "DROP TABLE t1", length=<optimized out>, parser_state=0x7fca79c56620) at /homesql/sql_parse.cc:5730
          #9 0x00000000005969ba in dispatch_command (command=<optimized out>, thd=0x23e3710, packet=0x24759f1 "DROP TABLE t1", packet_length=13) at /homesql/sql_parse.cc:1055
          #10 0x0000000000596d5b in do_command (thd=0x23e3710) at /homesql/sql_parse.cc:794
          #11 0x000000000064e274 in do_handle_one_connection (thd_arg=0x23e3710) at /homesql/sql_connect.cc:1253
          #12 0x000000000064e3ac in handle_one_connection (arg=<optimized out>) at /homesql/sql_connect.cc:1168
          #13 0x0000000000a6be88 in pfs_spawn_thread (arg=<optimized out>) at /homestorage/perfschema/pfs.cc:1015
          #14 0x00007fca854beefc in start_thread (arg=0x7fca79c57700) at pthread_create.c:304
          #15 0x00007fca84628f4d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
          #16 0x0000000000000000 in ?? ()

          But I'm not sure it's related to the failure described in this report.

          elenst Elena Stepanova added a comment - Could not reproduce on MySQL 5.5 (release or debug), or on MariaDB 5.5.28 release and on maria/5.5 revno 3592, or on maria/10.0 and 10.0-base ( MDEV-3941 is not there yet). However, I observed a different problem on release binaries of 5.5.28, both MariaDB and MySQL. The same test sometimes gets stuck, the process list shows ----- ------- --------------- ---- ------- ---- ------------------------------- ---------------- --------- Id User Host db Command Time State Info Progress ----- ------- --------------- ---- ------- ---- ------------------------------- ---------------- --------- 2 root localhost test Query 118 Waiting for table metadata lock DROP TABLE t1 0.000 1584 DELAYED localhost test Killed 118 Waiting for INSERT   0.000 1585 root localhost:52815 test Query 0 NULL show processlist 0.000 ----- ------- --------------- ---- ------- ---- ------------------------------- ---------------- --------- And stack trace is like Thread 3 (Thread 0x7fca79c57700 (LWP 19711)): #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216 #1 0x00000000006572d3 in inline_mysql_cond_timedwait (src_line=<optimized out>, src_file=<optimized out>, abstime=<optimized out>, mutex=<optimized out>, that=<optimized out>) at /homeinclude/mysql/psi/mysql_thread.h:1012 #2 MDL_wait::timed_wait (this=0x23e3800, thd=0x23e3710, abs_timeout=0x7fca79c54ab0, set_status_on_timeout=false, wait_state_name=<optimized out>) at /homesql/mdl.cc:1191 #3 0x0000000000659682 in MDL_context::acquire_lock (this=0x23e3800, mdl_request=0x247e240, lock_wait_timeout=140507761248984) at /homesql/mdl.cc:2103 #4 0x0000000000659ce4 in MDL_context::acquire_locks (this=0x23e3800, mdl_requests=<optimized out>, lock_wait_timeout=31536000) at /homesql/mdl.cc:2214 #5 0x0000000000549c83 in lock_table_names (thd=0x23e3710, tables_start=<optimized out>, tables_end=0x247de30, lock_wait_timeout=31536000, flags=1) at /homesql/sql_base.cc:4739 #6 0x0000000000609bc9 in mysql_rm_table (thd=0x23e3710, tables=0x247de30, if_exists=0 '\000', drop_temporary=0 '\000') at /homesql/sql_table.cc:1895 #7 0x0000000000590d53 in mysql_execute_command (thd=0x23e3710) at /homesql/sql_parse.cc:3116 #8 0x0000000000595400 in mysql_parse (thd=0x23e3710, rawbuf=0x247dd68 "DROP TABLE t1", length=<optimized out>, parser_state=0x7fca79c56620) at /homesql/sql_parse.cc:5730 #9 0x00000000005969ba in dispatch_command (command=<optimized out>, thd=0x23e3710, packet=0x24759f1 "DROP TABLE t1", packet_length=13) at /homesql/sql_parse.cc:1055 #10 0x0000000000596d5b in do_command (thd=0x23e3710) at /homesql/sql_parse.cc:794 #11 0x000000000064e274 in do_handle_one_connection (thd_arg=0x23e3710) at /homesql/sql_connect.cc:1253 #12 0x000000000064e3ac in handle_one_connection (arg=<optimized out>) at /homesql/sql_connect.cc:1168 #13 0x0000000000a6be88 in pfs_spawn_thread (arg=<optimized out>) at /homestorage/perfschema/pfs.cc:1015 #14 0x00007fca854beefc in start_thread (arg=0x7fca79c57700) at pthread_create.c:304 #15 0x00007fca84628f4d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #16 0x0000000000000000 in ?? () But I'm not sure it's related to the failure described in this report.
          pomyk Patryk Pomykalski added a comment - - edited

          A bit modified two threaded test that fails before rev 3593 in the same way (on 3593 it fails even faster):

          --disable_query_log
          --disable_result_log

          --let $run=1000
          --connect (con1, localhost, root,,)

          CREATE TABLE t1 ( f1 INTEGER AUTO_INCREMENT, PRIMARY KEY (f1)) ENGINE=MyISAM;

          while ($run)
          {
          --echo # $run attempts left...
          --send INSERT DELAYED t1 VALUES (NULL);
          --connection default
          --error ER_TABLE_EXISTS_ERROR
          CREATE TABLE t1 AS SELECT 1 AS f1;
          --connection con1
          --reap
          --dec $run
          }
          DROP TABLE t1;

          --echo # All done

          pomyk Patryk Pomykalski added a comment - - edited A bit modified two threaded test that fails before rev 3593 in the same way (on 3593 it fails even faster): --disable_query_log --disable_result_log --let $run=1000 --connect (con1, localhost, root,,) CREATE TABLE t1 ( f1 INTEGER AUTO_INCREMENT, PRIMARY KEY (f1)) ENGINE=MyISAM; while ($run) { --echo # $run attempts left... --send INSERT DELAYED t1 VALUES (NULL); --connection default --error ER_TABLE_EXISTS_ERROR CREATE TABLE t1 AS SELECT 1 AS f1; --connection con1 --reap --dec $run } DROP TABLE t1; --echo # All done

          Actually it can fail in two ways:
          1. mysqltest: At line 19: query 'reap' failed: 1317: Query execution was interrupted

          or:
          2. mysqltest: At line 19: query 'reap' failed: 1927: Connection was killed

          pomyk Patryk Pomykalski added a comment - Actually it can fail in two ways: 1. mysqltest: At line 19: query 'reap' failed: 1317: Query execution was interrupted or: 2. mysqltest: At line 19: query 'reap' failed: 1927: Connection was killed

          My analysis:
          Connection creating table sets killed= KILL_SYSTEM_THREAD state on the "delayed" thread (in function mysql_notify_thread_having_shared_lock). If there comes another delayed insert before the thread is really killed it could abort with 'Query execution was interrupted'. Before rev 3593 it is impossible (or at least a lot harder) to get into that state with only one thread issuing create table and insert delayed because create table command waits for lock that "delayed" thread has to abort.

          pomyk Patryk Pomykalski added a comment - My analysis: Connection creating table sets killed= KILL_SYSTEM_THREAD state on the "delayed" thread (in function mysql_notify_thread_having_shared_lock). If there comes another delayed insert before the thread is really killed it could abort with 'Query execution was interrupted'. Before rev 3593 it is impossible (or at least a lot harder) to get into that state with only one thread issuing create table and insert delayed because create table command waits for lock that "delayed" thread has to abort.

          Found and fixed the issues
          It was two different bugs:
          a) INSERT DELAYED + CREATE TABLE

          • This was a new bug introduced by the CREATE TABLE optimization
            b) INSERT DELAYED + DROP TABLE.
          • This was an old bug in 5.5 introduced with MDL locks.
          monty Michael Widenius added a comment - Found and fixed the issues It was two different bugs: a) INSERT DELAYED + CREATE TABLE This was a new bug introduced by the CREATE TABLE optimization b) INSERT DELAYED + DROP TABLE. This was an old bug in 5.5 introduced with MDL locks.

          Fixed both issues

          monty Michael Widenius added a comment - Fixed both issues

          People

            monty Michael Widenius
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.