Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-82

thd_wait_end() callback without matching thd_wait_begin()

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • None
    • None
    • None

    Description

      In wait_for_lock() in thr_lock, it is possible that after_wait_callback() is called without matching before_wait_callback(). This happens e.g if thread_var->abort is set.
      This causes assertion in sql/threadpool_unix.cc that checks inside tp_wait_end that begin callback was called.

      Attachments

        Activity

          Fixed in 5.5-threadpool branch

          wlad Vladislav Vaintroub added a comment - Fixed in 5.5-threadpool branch

          With a moderate probability, the following RQG test triggers the problem and causes assertion `connection->waiting' failure at threadpool_unix.cc:1091: void tp_wait_end(THD*) (as of revno 3194):

          perl runall.pl -queries=100000000 --no-mask --seed=time --duration=300 --queries=100M --mysqld=innodb-lock-wait-timeout=3 --mysqld=lock-wait-timeout=5 --reporters=QueryTimeout,Backtrace,ErrorLog,Deadlock --gendata=conf/runtime/WL5004_data.zz --engine=InnoDB --mysqld=max-connections=500 --mysqld=-log-output=TABLE --threads=128 --grammar=conf/runtime/WL5004_sql.yy --basedir1=<your debug basedir> --vardir1=<your vardir>

          Usually it requires several attempts to get the failure.

          Stack trace:

          #6 0x00007fefc3331235 in __assert_fail () from /lib64/libc.so.6
          #7 0x00000000007c0e74 in tp_wait_end (thd=0x35688c0)
          at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1091
          #8 0x0000000000745e2d in scheduler_wait_lock_end () at /home/elenst/lp-5.5-tp/sql/scheduler.cc:58
          #9 0x0000000000cda417 in wait_for_lock (wait=0x7fef94013d38, data=0x7fef9c3a5998,
          in_wait_list=0 '\000', lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:576
          #10 0x0000000000cdabde in thr_lock (data=0x7fef9c3a5998, owner=0x3569e40,
          lock_type=TL_WRITE_CONCURRENT_INSERT, lock_wait_timeout=31536000)
          at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:865
          #11 0x0000000000cdb48e in thr_multi_lock (data=0x7fef9c00a1b0, count=1, owner=0x3569e40,
          lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:1142
          #12 0x00000000008c4910 in mysql_lock_tables (thd=0x35688c0, sql_lock=0x7fef9c00a190, flags=2075)
          at /home/elenst/lp-5.5-tp/sql/lock.cc:320
          #13 0x00000000008c4763 in mysql_lock_tables (thd=0x35688c0, tables=0x7fefa35dd9d8, count=1, flags=2075)
          at /home/elenst/lp-5.5-tp/sql/lock.cc:275
          #14 0x00000000005fb4d3 in open_ltable (thd=0x35688c0, table_list=0x7fefa35dd910,
          lock_type=TL_WRITE_CONCURRENT_INSERT, lock_flags=2075)
          at /home/elenst/lp-5.5-tp/sql/sql_base.cc:5487
          #15 0x0000000000604631 in open_log_table (thd=0x35688c0, one_table=0x7fefa35dd910,
          backup=0x7fefa35de0d0) at /home/elenst/lp-5.5-tp/sql/sql_base.cc:9542
          #16 0x00000000008b0a83 in Log_to_csv_event_handler::log_general (this=0x7fefa4012250, thd=0x35688c0,
          event_time=..., user_host=0x7fefa35de1f0 "root[root] @ [127.0.0.1]", user_host_len=25,
          thread_id=86, command_type=0xd43062 "Query", command_type_len=5,
          sql_text=0x7fef9c002388 "LOCK TABLES testdb_S . t1_merge1_N AS o READ", sql_text_len=45,
          client_cs=0x1419840) at /home/elenst/lp-5.5-tp/sql/log.cc:631
          #17 0x00000000008b2857 in LOGGER::general_log_write (this=0x1515a80, thd=0x35688c0, command=COM_QUERY,
          query=0x7fef9c002388 "LOCK TABLES testdb_S . t1_merge1_N AS o READ", query_length=45)
          at /home/elenst/lp-5.5-tp/sql/log.cc:1304
          #18 0x00000000008bcd5c in general_log_write (thd=0x35688c0, command=COM_QUERY,
          query=0x7fef9c002388 "LOCK TABLES testdb_S . t1_merge1_N AS o READ", query_length=45)
          at /home/elenst/lp-5.5-tp/sql/log.cc:5268
          #19 0x000000000064e8ec in dispatch_command (command=COM_QUERY, thd=0x35688c0,
          packet=0x356d741 "LOCK TABLES testdb_S . t1_merge1_N AS o READ ", packet_length=46)
          at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:1043
          #20 0x000000000064dcc8 in do_command (thd=0x35688c0) at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:791
          #21 0x00000000007918f5 in threadpool_process_request (thd=0x35688c0)
          at /home/elenst/lp-5.5-tp/sql/threadpool_common.cc:183
          #22 0x00000000007c1263 in handle_event (ev=0x3218130)
          at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1223
          #23 0x00000000007c1543 in worker_main (param=0x1505000)
          at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1277
          #24 0x00007fefc4641a4f in start_thread () from /lib64/libpthread.so.0
          #25 0x00007fefc33d682d in clone () from /lib64/libc.so.6

          elenst Elena Stepanova added a comment - With a moderate probability, the following RQG test triggers the problem and causes assertion `connection->waiting' failure at threadpool_unix.cc:1091: void tp_wait_end(THD*) (as of revno 3194): perl runall.pl - queries=100000000 --no-mask --seed=time --duration=300 --queries=100M --mysqld= innodb-lock-wait-timeout=3 --mysqld= lock-wait-timeout=5 --reporters=QueryTimeout,Backtrace,ErrorLog,Deadlock --gendata=conf/runtime/WL5004_data.zz --engine=InnoDB --mysqld= max-connections=500 --mysqld= -log-output=TABLE --threads=128 --grammar=conf/runtime/WL5004_sql.yy --basedir1=<your debug basedir> --vardir1=<your vardir> Usually it requires several attempts to get the failure. Stack trace: #6 0x00007fefc3331235 in __assert_fail () from /lib64/libc.so.6 #7 0x00000000007c0e74 in tp_wait_end (thd=0x35688c0) at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1091 #8 0x0000000000745e2d in scheduler_wait_lock_end () at /home/elenst/lp-5.5-tp/sql/scheduler.cc:58 #9 0x0000000000cda417 in wait_for_lock (wait=0x7fef94013d38, data=0x7fef9c3a5998, in_wait_list=0 '\000', lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:576 #10 0x0000000000cdabde in thr_lock (data=0x7fef9c3a5998, owner=0x3569e40, lock_type=TL_WRITE_CONCURRENT_INSERT, lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:865 #11 0x0000000000cdb48e in thr_multi_lock (data=0x7fef9c00a1b0, count=1, owner=0x3569e40, lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:1142 #12 0x00000000008c4910 in mysql_lock_tables (thd=0x35688c0, sql_lock=0x7fef9c00a190, flags=2075) at /home/elenst/lp-5.5-tp/sql/lock.cc:320 #13 0x00000000008c4763 in mysql_lock_tables (thd=0x35688c0, tables=0x7fefa35dd9d8, count=1, flags=2075) at /home/elenst/lp-5.5-tp/sql/lock.cc:275 #14 0x00000000005fb4d3 in open_ltable (thd=0x35688c0, table_list=0x7fefa35dd910, lock_type=TL_WRITE_CONCURRENT_INSERT, lock_flags=2075) at /home/elenst/lp-5.5-tp/sql/sql_base.cc:5487 #15 0x0000000000604631 in open_log_table (thd=0x35688c0, one_table=0x7fefa35dd910, backup=0x7fefa35de0d0) at /home/elenst/lp-5.5-tp/sql/sql_base.cc:9542 #16 0x00000000008b0a83 in Log_to_csv_event_handler::log_general (this=0x7fefa4012250, thd=0x35688c0, event_time=..., user_host=0x7fefa35de1f0 "root [root] @ [127.0.0.1] ", user_host_len=25, thread_id=86, command_type=0xd43062 "Query", command_type_len=5, sql_text=0x7fef9c002388 "LOCK TABLES testdb_S . t1_merge1_N AS o READ", sql_text_len=45, client_cs=0x1419840) at /home/elenst/lp-5.5-tp/sql/log.cc:631 #17 0x00000000008b2857 in LOGGER::general_log_write (this=0x1515a80, thd=0x35688c0, command=COM_QUERY, query=0x7fef9c002388 "LOCK TABLES testdb_S . t1_merge1_N AS o READ", query_length=45) at /home/elenst/lp-5.5-tp/sql/log.cc:1304 #18 0x00000000008bcd5c in general_log_write (thd=0x35688c0, command=COM_QUERY, query=0x7fef9c002388 "LOCK TABLES testdb_S . t1_merge1_N AS o READ", query_length=45) at /home/elenst/lp-5.5-tp/sql/log.cc:5268 #19 0x000000000064e8ec in dispatch_command (command=COM_QUERY, thd=0x35688c0, packet=0x356d741 "LOCK TABLES testdb_S . t1_merge1_N AS o READ ", packet_length=46) at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:1043 #20 0x000000000064dcc8 in do_command (thd=0x35688c0) at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:791 #21 0x00000000007918f5 in threadpool_process_request (thd=0x35688c0) at /home/elenst/lp-5.5-tp/sql/threadpool_common.cc:183 #22 0x00000000007c1263 in handle_event (ev=0x3218130) at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1223 #23 0x00000000007c1543 in worker_main (param=0x1505000) at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1277 #24 0x00007fefc4641a4f in start_thread () from /lib64/libpthread.so.0 #25 0x00007fefc33d682d in clone () from /lib64/libc.so.6

          Here is a different stack trace, now from a test that does not require log-output=TABLE, but so far it was a one-time occurrence:

          #6 0x00007f5ca9610235 in __assert_fail () from /lib64/libc.so.6
          #7 0x00000000007c0e74 in tp_wait_end (thd=0x7f5c8c2ce8d0)
          at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1091
          #8 0x0000000000745e2d in scheduler_wait_lock_end () at /home/elenst/lp-5.5-tp/sql/scheduler.cc:58
          #9 0x0000000000cda417 in wait_for_lock (wait=0x3905e88, data=0x372c6c8, in_wait_list=0 '\000',
          lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:576
          #10 0x0000000000cdabde in thr_lock (data=0x372c6c8, owner=0x7f5c8c2cfe50, lock_type=TL_WRITE,
          lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:865
          #11 0x0000000000cdb48e in thr_multi_lock (data=0x7f5c84167450, count=1, owner=0x7f5c8c2cfe50,
          lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:1142
          #12 0x00000000008c4910 in mysql_lock_tables (thd=0x7f5c8c2ce8d0, sql_lock=0x7f5c84167430, flags=2048)
          at /home/elenst/lp-5.5-tp/sql/lock.cc:320
          #13 0x00000000008c4763 in mysql_lock_tables (thd=0x7f5c8c2ce8d0, tables=0x7f5c89d6b768, count=1,
          flags=2048) at /home/elenst/lp-5.5-tp/sql/lock.cc:275
          #14 0x00000000005fb4d3 in open_ltable (thd=0x7f5c8c2ce8d0, table_list=0x7f5c89d6b6a0,
          lock_type=TL_WRITE, lock_flags=2048) at /home/elenst/lp-5.5-tp/sql/sql_base.cc:5487
          #15 0x000000000060454f in open_system_table_for_update (thd=0x7f5c8c2ce8d0, one_table=0x7f5c89d6b6a0)
          at /home/elenst/lp-5.5-tp/sql/sql_base.cc:9506
          #16 0x00000000009126c2 in open_proc_table_for_update (thd=0x7f5c8c2ce8d0)
          at /home/elenst/lp-5.5-tp/sql/sp.cc:451
          #17 0x0000000000913b9c in sp_create_routine (thd=0x7f5c8c2ce8d0, type=TYPE_ENUM_FUNCTION, sp=0x359e958)
          at /home/elenst/lp-5.5-tp/sql/sp.cc:1018
          #18 0x000000000065613c in mysql_execute_command (thd=0x7f5c8c2ce8d0)
          at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:3858
          #19 0x000000000065aab8 in mysql_parse (thd=0x7f5c8c2ce8d0,
          rawbuf=0x3112c68 "CREATE FUNCTION testdb_S . f1_2_N () RETURNS INTEGER BEGIN SELECT MAX( `col_int` ) FROM testdb_N . t1_part1_N INTO @my_var ; INSERT INTO testdb_N . t1_view2_N ( `col_int` ) VALUES ( 0 ) ; RETURN"..., length=208, parser_state=0x7f5c89d6c580)
          at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:5694
          #20 0x000000000064e9f5 in dispatch_command (command=COM_QUERY, thd=0x7f5c8c2ce8d0,
          packet=0x7f5c8c2de051 "CREATE FUNCTION testdb_S . f1_2_N () RETURNS INTEGER BEGIN SELECT MAX( `col_int` ) FROM testdb_N . t1_part1_N INTO @my_var ; INSERT INTO testdb_N . t1_view2_N ( `col_int` ) VALUES ( 0 ) ; RETURN"..., packet_length=208) at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:1052
          #21 0x000000000064dcc8 in do_command (thd=0x7f5c8c2ce8d0) at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:791
          #22 0x00000000007918f5 in threadpool_process_request (thd=0x7f5c8c2ce8d0)
          at /home/elenst/lp-5.5-tp/sql/threadpool_common.cc:183
          #23 0x00000000007c1263 in handle_event (ev=0x7f5c8c0bf920)
          at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1223
          #24 0x00000000007c1543 in worker_main (param=0x1509a00)
          at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1277
          #25 0x00007f5caa920a4f in start_thread () from /lib64/libpthread.so.0

          RQG command for this one (some parameters might be irrelevant):

          perl runall.pl -queries=100000000 --debug --no-mask --seed=time --duration=300 --queries=100M --mysqld=innodb-lock-wait-timeout=3 --mysqld=lock-wait-timeout=5 --reporters=QueryTimeout,Backtrace,ErrorLog,Deadlock --mysqld=log-output=FILE --grammar=conf/runtime/WL5004_sql.yy --gendata=conf/runtime/WL5004_data.zz --engine=InnoDB --mysqld=thread_pool_stall_limit=10 --mysqld=thread_pool_oversubscribe=4 --mysqld=thread_pool_size=128 --mysqld=-thread_pool_idle_timeout=60 --threads=128 --mtr-build-thread=300 --mask=59978 --basedir1=<your debug server basedir> --vardir1=<your vardir>

          elenst Elena Stepanova added a comment - Here is a different stack trace, now from a test that does not require log-output=TABLE, but so far it was a one-time occurrence: #6 0x00007f5ca9610235 in __assert_fail () from /lib64/libc.so.6 #7 0x00000000007c0e74 in tp_wait_end (thd=0x7f5c8c2ce8d0) at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1091 #8 0x0000000000745e2d in scheduler_wait_lock_end () at /home/elenst/lp-5.5-tp/sql/scheduler.cc:58 #9 0x0000000000cda417 in wait_for_lock (wait=0x3905e88, data=0x372c6c8, in_wait_list=0 '\000', lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:576 #10 0x0000000000cdabde in thr_lock (data=0x372c6c8, owner=0x7f5c8c2cfe50, lock_type=TL_WRITE, lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:865 #11 0x0000000000cdb48e in thr_multi_lock (data=0x7f5c84167450, count=1, owner=0x7f5c8c2cfe50, lock_wait_timeout=31536000) at /home/elenst/lp-5.5-tp/mysys/thr_lock.c:1142 #12 0x00000000008c4910 in mysql_lock_tables (thd=0x7f5c8c2ce8d0, sql_lock=0x7f5c84167430, flags=2048) at /home/elenst/lp-5.5-tp/sql/lock.cc:320 #13 0x00000000008c4763 in mysql_lock_tables (thd=0x7f5c8c2ce8d0, tables=0x7f5c89d6b768, count=1, flags=2048) at /home/elenst/lp-5.5-tp/sql/lock.cc:275 #14 0x00000000005fb4d3 in open_ltable (thd=0x7f5c8c2ce8d0, table_list=0x7f5c89d6b6a0, lock_type=TL_WRITE, lock_flags=2048) at /home/elenst/lp-5.5-tp/sql/sql_base.cc:5487 #15 0x000000000060454f in open_system_table_for_update (thd=0x7f5c8c2ce8d0, one_table=0x7f5c89d6b6a0) at /home/elenst/lp-5.5-tp/sql/sql_base.cc:9506 #16 0x00000000009126c2 in open_proc_table_for_update (thd=0x7f5c8c2ce8d0) at /home/elenst/lp-5.5-tp/sql/sp.cc:451 #17 0x0000000000913b9c in sp_create_routine (thd=0x7f5c8c2ce8d0, type=TYPE_ENUM_FUNCTION, sp=0x359e958) at /home/elenst/lp-5.5-tp/sql/sp.cc:1018 #18 0x000000000065613c in mysql_execute_command (thd=0x7f5c8c2ce8d0) at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:3858 #19 0x000000000065aab8 in mysql_parse (thd=0x7f5c8c2ce8d0, rawbuf=0x3112c68 "CREATE FUNCTION testdb_S . f1_2_N () RETURNS INTEGER BEGIN SELECT MAX( `col_int` ) FROM testdb_N . t1_part1_N INTO @my_var ; INSERT INTO testdb_N . t1_view2_N ( `col_int` ) VALUES ( 0 ) ; RETURN"..., length=208, parser_state=0x7f5c89d6c580) at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:5694 #20 0x000000000064e9f5 in dispatch_command (command=COM_QUERY, thd=0x7f5c8c2ce8d0, packet=0x7f5c8c2de051 "CREATE FUNCTION testdb_S . f1_2_N () RETURNS INTEGER BEGIN SELECT MAX( `col_int` ) FROM testdb_N . t1_part1_N INTO @my_var ; INSERT INTO testdb_N . t1_view2_N ( `col_int` ) VALUES ( 0 ) ; RETURN"..., packet_length=208) at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:1052 #21 0x000000000064dcc8 in do_command (thd=0x7f5c8c2ce8d0) at /home/elenst/lp-5.5-tp/sql/sql_parse.cc:791 #22 0x00000000007918f5 in threadpool_process_request (thd=0x7f5c8c2ce8d0) at /home/elenst/lp-5.5-tp/sql/threadpool_common.cc:183 #23 0x00000000007c1263 in handle_event (ev=0x7f5c8c0bf920) at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1223 #24 0x00000000007c1543 in worker_main (param=0x1509a00) at /home/elenst/lp-5.5-tp/sql/threadpool_unix.cc:1277 #25 0x00007f5caa920a4f in start_thread () from /lib64/libpthread.so.0 RQG command for this one (some parameters might be irrelevant): perl runall.pl - queries=100000000 --debug --no-mask --seed=time --duration=300 --queries=100M --mysqld= innodb-lock-wait-timeout=3 --mysqld= lock-wait-timeout=5 --reporters=QueryTimeout,Backtrace,ErrorLog,Deadlock --mysqld= log-output=FILE --grammar=conf/runtime/WL5004_sql.yy --gendata=conf/runtime/WL5004_data.zz --engine=InnoDB --mysqld= thread_pool_stall_limit=10 --mysqld= thread_pool_oversubscribe=4 --mysqld= thread_pool_size=128 --mysqld= -thread_pool_idle_timeout=60 --threads=128 --mtr-build-thread=300 --mask=59978 --basedir1=<your debug server basedir> --vardir1=<your vardir>

          People

            wlad Vladislav Vaintroub
            wlad Vladislav Vaintroub
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.