Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-10550

Parallel replication can't sync with master in InnoDB 5.7 (branch bb-10.2-jan)

Details

    Description

      This with InnoDB 5.7 with refactored lock0lock.cc in branch bb-10.2-jan commit 3f1b8c9474e351e50a6556675108864d5aac5528

      Affected tests:

      • rpl.rpl_parallel
      • rpl_parallel_optimistic

      Attachments

        Issue Links

          Activity

            jplindst Jan Lindström (Inactive) created issue -
            jplindst Jan Lindström (Inactive) made changes -
            Field Original Value New Value
            jplindst Jan Lindström (Inactive) made changes -
            Description This with InnoDB 5.7 with refactored lock0lock.cc in branch bb-10.2-jan (please wait for email about status).

            Affected tests:
            * rpl.rpl_parallel
            * rpl_parallel_optimistic
            * rpl_parallel_retry
            * rpl_parallel_optimistic_nobinlog

            We see assertion like:

            {noformat}
            2016-08-12 09:48:25 0x7fb708381300 InnoDB: Assertion failure in thread 140424093635328 in file sync0policy.ic line 63
            InnoDB: Failing assertion: !is_owned()
            {noformat}

            This is because current thread tries to do lock_mutex_enter(); but this same thread has already locked same mutex. This is because in RecLock::add_to_waitq where we are holding e.g. lock_sys and trx_mutex we call thd_report_lock_wait (this will call thd::awake() and later kill_query()). Either we try to kill thread that we try to add waiting queue or victim thread
            also enters to same mutexes before we kill it.

            Stack:

            {noformat}
            Thread 1 (Thread 0x7fb708381300 (LWP 6564)):
            #0 __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
            #1 0x000055c3fe810d6c in my_write_core (sig=6) at /home/jan/mysql/10.2/mysys/stacktrace.c:456
            #2 0x000055c3fe0dd7ab in handle_fatal_signal (sig=6) at /home/jan/mysql/10.2/sql/signal_handler.cc:272
            #3 <signal handler called>
            #4 0x00007fb725aff418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
            #5 0x00007fb725b0101a in __GI_abort () at abort.c:89
            #6 0x000055c3fe63fd5f in ut_dbg_assertion_failed (expr=0x55c3fe9cf78d "!is_owned()", file=0x55c3fe9cf750 "/home/jan/mysql/10.2/storage/innobase/include/sync0policy.ic", line=63) at /home/jan/mysql/10.2/storage/innobase/ut/ut0dbg.cc:67
            #7 0x000055c3fe44b93d in MutexDebug<TTASEventMutex<GenericPolicy> >::enter (this=0x55c40068c678, mutex=0x55c40068c660, name=0x55c3fe9decf8 "/home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc", line=6857) at /home/jan/mysql/10.2/storage/innobase/include/sync0policy.ic:63
            #8 0x000055c3fe44a252 in GenericPolicy<TTASEventMutex<GenericPolicy> >::enter (this=0x55c40068c678, mutex=..., filename=0x55c3fe9decf8 "/home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc", line=6857) at /home/jan/mysql/10.2/storage/innobase/include/sync0policy.h:347
            #9 0x000055c3fe448abb in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x55c40068c660, n_spins=30, n_delay=6, name=0x55c3fe9decf8 "/home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc", line=6857) at /home/jan/mysql/10.2/storage/innobase/include/ib0mutex.h:985
            #10 0x000055c3fe4a9aa9 in lock_trx_handle_wait (trx=0x7fb720d6cae8) at /home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc:6857
            #11 0x000055c3fe4269b2 in innobase_kill_query (hton=0x55c4004ec760, thd=0x7fb68c000b00, level=THD_ABORT_SOFTLY) at /home/jan/mysql/10.2/storage/innobase/handler/ha_innodb.cc:5518
            #12 0x000055c3fe0dfc67 in kill_handlerton (thd=0x7fb68c000b00, plugin=0x7fb70837c828, level=0x7fb70837c894) at /home/jan/mysql/10.2/sql/handler.cc:792
            #13 0x000055c3fdec0d74 in plugin_foreach_with_mask (thd=0x7fb68c000b00, func=0x55c3fe0dfbe9 <kill_handlerton(THD*, plugin_ref, void*)>, type=1, state_mask=4294967287, arg=0x7fb70837c894) at /home/jan/mysql/10.2/sql/sql_plugin.cc:2376
            #14 0x000055c3fe0dfcc1 in ha_kill_query (thd=0x7fb68c000b00, level=THD_ABORT_SOFTLY) at /home/jan/mysql/10.2/sql/handler.cc:799
            #15 0x000055c3fde64d89 in THD::awake (this=0x7fb68c000b00, state_to_set=KILL_CONNECTION) at /home/jan/mysql/10.2/sql/sql_class.cc:1926
            #16 0x000055c3fde6c435 in thd_report_wait_for (thd=0x7fb6a4000b00, other_thd=0x7fb68c000b00) at /home/jan/mysql/10.2/sql/sql_class.cc:4670
            #17 0x000055c3fe49eaaa in RecLock::add_to_waitq (this=0x7fb70837cff0, wait_for=0x55c400549590, prdt=0x0) at /home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc:2071
            #18 0x000055c3fe4a771e in lock_rec_insert_check_and_lock (flags=0, rec=0x7fb70f5980ae "\200", block=0x7fb70f15d800, index=0x7fb69401e1e8, thr=0x7fb6bc00f1a0, mtr=0x7fb70837e270, inherit=0x7fb70837d430) at /home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc:5900
            #19 0x000055c3fe6777c3 in btr_cur_ins_lock_and_undo (flags=0, cursor=0x7fb70837d750, entry=0x7fb6a40466e8, thr=0x7fb6bc00f1a0, mtr=0x7fb70837e270, inherit=0x7fb70837d430) at /home/jan/mysql/10.2/storage/innobase/btr/btr0cur.cc:3075
            #20 0x000055c3fe67821a in btr_cur_optimistic_insert (flags=0, cursor=0x7fb70837d750, offsets=0x7fb70837d6e8, heap=0x7fb70837d6b0, entry=0x7fb6a40466e8, rec=0x7fb70837d6f0, big_rec=0x7fb70837d6f8, n_ext=0, thr=0x7fb6bc00f1a0, mtr=0x7fb70837e270) at /home/jan/mysql/10.2/storage/innobase/btr/btr0cur.cc:3348
            #21 0x000055c3fe54a76e in row_ins_sec_index_entry_low (flags=0, mode=2, index=0x7fb69401e1e8, offsets_heap=0x7fb6a40071a0, heap=0x7fb6a402be10, entry=0x7fb6a40466e8, trx_id=0, thr=0x7fb6bc00f1a0, dup_chk_only=false) at /home/jan/mysql/10.2/storage/innobase/row/row0ins.cc:3057
            #22 0x000055c3fe54b2c5 in row_ins_sec_index_entry (index=0x7fb69401e1e8, entry=0x7fb6a40466e8, thr=0x7fb6bc00f1a0, dup_chk_only=false) at /home/jan/mysql/10.2/storage/innobase/row/row0ins.cc:3302
            #23 0x000055c3fe5c6a8d in row_upd_sec_index_entry (node=0x7fb6bc00ee68, thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:2444
            #24 0x000055c3fe5c6bfd in row_upd_sec_step (node=0x7fb6bc00ee68, thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:2471
            #25 0x000055c3fe5c87a7 in row_upd (node=0x7fb6bc00ee68, thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:3248
            #26 0x000055c3fe5c8af2 in row_upd_step (thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:3365
            #27 0x000055c3fe56c38e in row_update_for_mysql_using_upd_graph (mysql_rec=0x7fb6bc016f78 "\375\006", prebuilt=0x7fb6bc00e338) at /home/jan/mysql/10.2/storage/innobase/row/row0mysql.cc:2550
            #28 0x000055c3fe56cecc in row_update_for_mysql (mysql_rec=0x7fb6bc016f78 "\375\006", prebuilt=0x7fb6bc00e338) at /home/jan/mysql/10.2/storage/innobase/row/row0mysql.cc:2762
            #29 0x000055c3fe42d9ad in ha_innobase::update_row (this=0x7fb6bc00db98, old_row=0x7fb6bc016f78 "\375\006", new_row=0x7fb6bc016f68 "\377\006") at /home/jan/mysql/10.2/storage/innobase/handler/ha_innodb.cc:9696
            #30 0x000055c3fe0ec088 in handler::ha_update_row (this=0x7fb6bc00db98, old_data=0x7fb6bc016f78 "\375\006", new_data=0x7fb6bc016f68 "\377\006") at /home/jan/mysql/10.2/sql/handler.cc:5924
            #31 0x000055c3fdf878b9 in mysql_update (thd=0x7fb6a4000b00, table_list=0x7fb6a40092a0, fields=..., values=..., conds=0x7fb6a4009bf0, order_num=0, order=0x0, limit=18446744073709551615, handle_duplicates=DUP_ERROR, ignore=false, found_return=0x7fb70837fe30, updated_return=0x7fb70837fee0) at /home/jan/mysql/10.2/sql/sql_update.cc:810
            #32 0x000055c3fdea9a02 in mysql_execute_command (thd=0x7fb6a4000b00) at /home/jan/mysql/10.2/sql/sql_parse.cc:4140
            #33 0x000055c3fdeb311e in mysql_parse (thd=0x7fb6a4000b00, rawbuf=0x7fb6a800c04b "UPDATE t4 SET b=NULL WHERE a=6", length=30, parser_state=0x7fb7083806f0, is_com_multi=false, is_next_command=false) at /home/jan/mysql/10.2/sql/sql_parse.cc:7742
            #34 0x000055c3fe1e14ce in Query_log_event::do_apply_event (this=0x7fb6a800f590, rgi=0x7fb6a8034780, query_arg=0x7fb6a800c04b "UPDATE t4 SET b=NULL WHERE a=6", q_len_arg=30) at /home/jan/mysql/10.2/sql/log_event.cc:4459
            #35 0x000055c3fe1e063f in Query_log_event::do_apply_event (this=0x7fb6a800f590, rgi=0x7fb6a8034780) at /home/jan/mysql/10.2/sql/log_event.cc:4170
            #36 0x000055c3fde10991 in Log_event::apply_event (this=0x7fb6a800f590, rgi=0x7fb6a8034780) at /home/jan/mysql/10.2/sql/log_event.h:1343
            #37 0x000055c3fde068a4 in apply_event_and_update_pos (ev=0x7fb6a800f590, thd=0x7fb6a4000b00, rgi=0x7fb6a8034780, rpt=0x7fb6c40f9350) at /home/jan/mysql/10.2/sql/slave.cc:3366
            #38 0x000055c3fe0491b4 in rpt_handle_event (qev=0x7fb6a8035230, rpt=0x7fb6c40f9350) at /home/jan/mysql/10.2/sql/rpl_parallel.cc:52
            #39 0x000055c3fe04be64 in handle_rpl_parallel_thread (arg=0x7fb6c40f9350) at /home/jan/mysql/10.2/sql/rpl_parallel.cc:1237
            #40 0x000055c3fe2d8c38 in pfs_spawn_thread (arg=0x7fb6c40feb10) at /home/jan/mysql/10.2/storage/perfschema/pfs.cc:1862
            #41 0x00007fb7270036fa in start_thread (arg=0x7fb708381300) at pthread_create.c:333
            #42 0x00007fb725bd0b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
            {noformat}

            Problem is that when we enter innodb_kill_query victim_thd is holding incorrect mutexes or we try to kill incorrect victim. There are new handler functions like thd_trx_arbitrate() and thd_trx_priority() not sure if I implemented them correctly.
            This with InnoDB 5.7 with refactored lock0lock.cc in branch bb-10.2-jan commit 3f1b8c9474e351e50a6556675108864d5aac5528

            Affected tests:
            * rpl.rpl_parallel
            * rpl_parallel_optimistic
            * rpl_parallel_retry
            * rpl_parallel_optimistic_nobinlog

            We see assertion like:

            {noformat}
            2016-08-12 09:48:25 0x7fb708381300 InnoDB: Assertion failure in thread 140424093635328 in file sync0policy.ic line 63
            InnoDB: Failing assertion: !is_owned()
            {noformat}

            This is because current thread tries to do lock_mutex_enter(); but this same thread has already locked same mutex. This is because in RecLock::add_to_waitq where we are holding e.g. lock_sys and trx_mutex we call thd_report_lock_wait (this will call thd::awake() and later kill_query()). Either we try to kill thread that we try to add waiting queue or victim thread
            also enters to same mutexes before we kill it.

            Stack:

            {noformat}
            Thread 1 (Thread 0x7fb708381300 (LWP 6564)):
            #0 __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
            #1 0x000055c3fe810d6c in my_write_core (sig=6) at /home/jan/mysql/10.2/mysys/stacktrace.c:456
            #2 0x000055c3fe0dd7ab in handle_fatal_signal (sig=6) at /home/jan/mysql/10.2/sql/signal_handler.cc:272
            #3 <signal handler called>
            #4 0x00007fb725aff418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
            #5 0x00007fb725b0101a in __GI_abort () at abort.c:89
            #6 0x000055c3fe63fd5f in ut_dbg_assertion_failed (expr=0x55c3fe9cf78d "!is_owned()", file=0x55c3fe9cf750 "/home/jan/mysql/10.2/storage/innobase/include/sync0policy.ic", line=63) at /home/jan/mysql/10.2/storage/innobase/ut/ut0dbg.cc:67
            #7 0x000055c3fe44b93d in MutexDebug<TTASEventMutex<GenericPolicy> >::enter (this=0x55c40068c678, mutex=0x55c40068c660, name=0x55c3fe9decf8 "/home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc", line=6857) at /home/jan/mysql/10.2/storage/innobase/include/sync0policy.ic:63
            #8 0x000055c3fe44a252 in GenericPolicy<TTASEventMutex<GenericPolicy> >::enter (this=0x55c40068c678, mutex=..., filename=0x55c3fe9decf8 "/home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc", line=6857) at /home/jan/mysql/10.2/storage/innobase/include/sync0policy.h:347
            #9 0x000055c3fe448abb in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x55c40068c660, n_spins=30, n_delay=6, name=0x55c3fe9decf8 "/home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc", line=6857) at /home/jan/mysql/10.2/storage/innobase/include/ib0mutex.h:985
            #10 0x000055c3fe4a9aa9 in lock_trx_handle_wait (trx=0x7fb720d6cae8) at /home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc:6857
            #11 0x000055c3fe4269b2 in innobase_kill_query (hton=0x55c4004ec760, thd=0x7fb68c000b00, level=THD_ABORT_SOFTLY) at /home/jan/mysql/10.2/storage/innobase/handler/ha_innodb.cc:5518
            #12 0x000055c3fe0dfc67 in kill_handlerton (thd=0x7fb68c000b00, plugin=0x7fb70837c828, level=0x7fb70837c894) at /home/jan/mysql/10.2/sql/handler.cc:792
            #13 0x000055c3fdec0d74 in plugin_foreach_with_mask (thd=0x7fb68c000b00, func=0x55c3fe0dfbe9 <kill_handlerton(THD*, plugin_ref, void*)>, type=1, state_mask=4294967287, arg=0x7fb70837c894) at /home/jan/mysql/10.2/sql/sql_plugin.cc:2376
            #14 0x000055c3fe0dfcc1 in ha_kill_query (thd=0x7fb68c000b00, level=THD_ABORT_SOFTLY) at /home/jan/mysql/10.2/sql/handler.cc:799
            #15 0x000055c3fde64d89 in THD::awake (this=0x7fb68c000b00, state_to_set=KILL_CONNECTION) at /home/jan/mysql/10.2/sql/sql_class.cc:1926
            #16 0x000055c3fde6c435 in thd_report_wait_for (thd=0x7fb6a4000b00, other_thd=0x7fb68c000b00) at /home/jan/mysql/10.2/sql/sql_class.cc:4670
            #17 0x000055c3fe49eaaa in RecLock::add_to_waitq (this=0x7fb70837cff0, wait_for=0x55c400549590, prdt=0x0) at /home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc:2071
            #18 0x000055c3fe4a771e in lock_rec_insert_check_and_lock (flags=0, rec=0x7fb70f5980ae "\200", block=0x7fb70f15d800, index=0x7fb69401e1e8, thr=0x7fb6bc00f1a0, mtr=0x7fb70837e270, inherit=0x7fb70837d430) at /home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc:5900
            #19 0x000055c3fe6777c3 in btr_cur_ins_lock_and_undo (flags=0, cursor=0x7fb70837d750, entry=0x7fb6a40466e8, thr=0x7fb6bc00f1a0, mtr=0x7fb70837e270, inherit=0x7fb70837d430) at /home/jan/mysql/10.2/storage/innobase/btr/btr0cur.cc:3075
            #20 0x000055c3fe67821a in btr_cur_optimistic_insert (flags=0, cursor=0x7fb70837d750, offsets=0x7fb70837d6e8, heap=0x7fb70837d6b0, entry=0x7fb6a40466e8, rec=0x7fb70837d6f0, big_rec=0x7fb70837d6f8, n_ext=0, thr=0x7fb6bc00f1a0, mtr=0x7fb70837e270) at /home/jan/mysql/10.2/storage/innobase/btr/btr0cur.cc:3348
            #21 0x000055c3fe54a76e in row_ins_sec_index_entry_low (flags=0, mode=2, index=0x7fb69401e1e8, offsets_heap=0x7fb6a40071a0, heap=0x7fb6a402be10, entry=0x7fb6a40466e8, trx_id=0, thr=0x7fb6bc00f1a0, dup_chk_only=false) at /home/jan/mysql/10.2/storage/innobase/row/row0ins.cc:3057
            #22 0x000055c3fe54b2c5 in row_ins_sec_index_entry (index=0x7fb69401e1e8, entry=0x7fb6a40466e8, thr=0x7fb6bc00f1a0, dup_chk_only=false) at /home/jan/mysql/10.2/storage/innobase/row/row0ins.cc:3302
            #23 0x000055c3fe5c6a8d in row_upd_sec_index_entry (node=0x7fb6bc00ee68, thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:2444
            #24 0x000055c3fe5c6bfd in row_upd_sec_step (node=0x7fb6bc00ee68, thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:2471
            #25 0x000055c3fe5c87a7 in row_upd (node=0x7fb6bc00ee68, thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:3248
            #26 0x000055c3fe5c8af2 in row_upd_step (thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:3365
            #27 0x000055c3fe56c38e in row_update_for_mysql_using_upd_graph (mysql_rec=0x7fb6bc016f78 "\375\006", prebuilt=0x7fb6bc00e338) at /home/jan/mysql/10.2/storage/innobase/row/row0mysql.cc:2550
            #28 0x000055c3fe56cecc in row_update_for_mysql (mysql_rec=0x7fb6bc016f78 "\375\006", prebuilt=0x7fb6bc00e338) at /home/jan/mysql/10.2/storage/innobase/row/row0mysql.cc:2762
            #29 0x000055c3fe42d9ad in ha_innobase::update_row (this=0x7fb6bc00db98, old_row=0x7fb6bc016f78 "\375\006", new_row=0x7fb6bc016f68 "\377\006") at /home/jan/mysql/10.2/storage/innobase/handler/ha_innodb.cc:9696
            #30 0x000055c3fe0ec088 in handler::ha_update_row (this=0x7fb6bc00db98, old_data=0x7fb6bc016f78 "\375\006", new_data=0x7fb6bc016f68 "\377\006") at /home/jan/mysql/10.2/sql/handler.cc:5924
            #31 0x000055c3fdf878b9 in mysql_update (thd=0x7fb6a4000b00, table_list=0x7fb6a40092a0, fields=..., values=..., conds=0x7fb6a4009bf0, order_num=0, order=0x0, limit=18446744073709551615, handle_duplicates=DUP_ERROR, ignore=false, found_return=0x7fb70837fe30, updated_return=0x7fb70837fee0) at /home/jan/mysql/10.2/sql/sql_update.cc:810
            #32 0x000055c3fdea9a02 in mysql_execute_command (thd=0x7fb6a4000b00) at /home/jan/mysql/10.2/sql/sql_parse.cc:4140
            #33 0x000055c3fdeb311e in mysql_parse (thd=0x7fb6a4000b00, rawbuf=0x7fb6a800c04b "UPDATE t4 SET b=NULL WHERE a=6", length=30, parser_state=0x7fb7083806f0, is_com_multi=false, is_next_command=false) at /home/jan/mysql/10.2/sql/sql_parse.cc:7742
            #34 0x000055c3fe1e14ce in Query_log_event::do_apply_event (this=0x7fb6a800f590, rgi=0x7fb6a8034780, query_arg=0x7fb6a800c04b "UPDATE t4 SET b=NULL WHERE a=6", q_len_arg=30) at /home/jan/mysql/10.2/sql/log_event.cc:4459
            #35 0x000055c3fe1e063f in Query_log_event::do_apply_event (this=0x7fb6a800f590, rgi=0x7fb6a8034780) at /home/jan/mysql/10.2/sql/log_event.cc:4170
            #36 0x000055c3fde10991 in Log_event::apply_event (this=0x7fb6a800f590, rgi=0x7fb6a8034780) at /home/jan/mysql/10.2/sql/log_event.h:1343
            #37 0x000055c3fde068a4 in apply_event_and_update_pos (ev=0x7fb6a800f590, thd=0x7fb6a4000b00, rgi=0x7fb6a8034780, rpt=0x7fb6c40f9350) at /home/jan/mysql/10.2/sql/slave.cc:3366
            #38 0x000055c3fe0491b4 in rpt_handle_event (qev=0x7fb6a8035230, rpt=0x7fb6c40f9350) at /home/jan/mysql/10.2/sql/rpl_parallel.cc:52
            #39 0x000055c3fe04be64 in handle_rpl_parallel_thread (arg=0x7fb6c40f9350) at /home/jan/mysql/10.2/sql/rpl_parallel.cc:1237
            #40 0x000055c3fe2d8c38 in pfs_spawn_thread (arg=0x7fb6c40feb10) at /home/jan/mysql/10.2/storage/perfschema/pfs.cc:1862
            #41 0x00007fb7270036fa in start_thread (arg=0x7fb708381300) at pthread_create.c:333
            #42 0x00007fb725bd0b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
            {noformat}

            Problem is that when we enter innodb_kill_query victim_thd is holding incorrect mutexes or we try to kill incorrect victim. There are new handler functions like thd_trx_arbitrate() and thd_trx_priority() not sure if I implemented them correctly.
            jplindst Jan Lindström (Inactive) made changes -
            Description This with InnoDB 5.7 with refactored lock0lock.cc in branch bb-10.2-jan commit 3f1b8c9474e351e50a6556675108864d5aac5528

            Affected tests:
            * rpl.rpl_parallel
            * rpl_parallel_optimistic
            * rpl_parallel_retry
            * rpl_parallel_optimistic_nobinlog

            We see assertion like:

            {noformat}
            2016-08-12 09:48:25 0x7fb708381300 InnoDB: Assertion failure in thread 140424093635328 in file sync0policy.ic line 63
            InnoDB: Failing assertion: !is_owned()
            {noformat}

            This is because current thread tries to do lock_mutex_enter(); but this same thread has already locked same mutex. This is because in RecLock::add_to_waitq where we are holding e.g. lock_sys and trx_mutex we call thd_report_lock_wait (this will call thd::awake() and later kill_query()). Either we try to kill thread that we try to add waiting queue or victim thread
            also enters to same mutexes before we kill it.

            Stack:

            {noformat}
            Thread 1 (Thread 0x7fb708381300 (LWP 6564)):
            #0 __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
            #1 0x000055c3fe810d6c in my_write_core (sig=6) at /home/jan/mysql/10.2/mysys/stacktrace.c:456
            #2 0x000055c3fe0dd7ab in handle_fatal_signal (sig=6) at /home/jan/mysql/10.2/sql/signal_handler.cc:272
            #3 <signal handler called>
            #4 0x00007fb725aff418 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
            #5 0x00007fb725b0101a in __GI_abort () at abort.c:89
            #6 0x000055c3fe63fd5f in ut_dbg_assertion_failed (expr=0x55c3fe9cf78d "!is_owned()", file=0x55c3fe9cf750 "/home/jan/mysql/10.2/storage/innobase/include/sync0policy.ic", line=63) at /home/jan/mysql/10.2/storage/innobase/ut/ut0dbg.cc:67
            #7 0x000055c3fe44b93d in MutexDebug<TTASEventMutex<GenericPolicy> >::enter (this=0x55c40068c678, mutex=0x55c40068c660, name=0x55c3fe9decf8 "/home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc", line=6857) at /home/jan/mysql/10.2/storage/innobase/include/sync0policy.ic:63
            #8 0x000055c3fe44a252 in GenericPolicy<TTASEventMutex<GenericPolicy> >::enter (this=0x55c40068c678, mutex=..., filename=0x55c3fe9decf8 "/home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc", line=6857) at /home/jan/mysql/10.2/storage/innobase/include/sync0policy.h:347
            #9 0x000055c3fe448abb in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x55c40068c660, n_spins=30, n_delay=6, name=0x55c3fe9decf8 "/home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc", line=6857) at /home/jan/mysql/10.2/storage/innobase/include/ib0mutex.h:985
            #10 0x000055c3fe4a9aa9 in lock_trx_handle_wait (trx=0x7fb720d6cae8) at /home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc:6857
            #11 0x000055c3fe4269b2 in innobase_kill_query (hton=0x55c4004ec760, thd=0x7fb68c000b00, level=THD_ABORT_SOFTLY) at /home/jan/mysql/10.2/storage/innobase/handler/ha_innodb.cc:5518
            #12 0x000055c3fe0dfc67 in kill_handlerton (thd=0x7fb68c000b00, plugin=0x7fb70837c828, level=0x7fb70837c894) at /home/jan/mysql/10.2/sql/handler.cc:792
            #13 0x000055c3fdec0d74 in plugin_foreach_with_mask (thd=0x7fb68c000b00, func=0x55c3fe0dfbe9 <kill_handlerton(THD*, plugin_ref, void*)>, type=1, state_mask=4294967287, arg=0x7fb70837c894) at /home/jan/mysql/10.2/sql/sql_plugin.cc:2376
            #14 0x000055c3fe0dfcc1 in ha_kill_query (thd=0x7fb68c000b00, level=THD_ABORT_SOFTLY) at /home/jan/mysql/10.2/sql/handler.cc:799
            #15 0x000055c3fde64d89 in THD::awake (this=0x7fb68c000b00, state_to_set=KILL_CONNECTION) at /home/jan/mysql/10.2/sql/sql_class.cc:1926
            #16 0x000055c3fde6c435 in thd_report_wait_for (thd=0x7fb6a4000b00, other_thd=0x7fb68c000b00) at /home/jan/mysql/10.2/sql/sql_class.cc:4670
            #17 0x000055c3fe49eaaa in RecLock::add_to_waitq (this=0x7fb70837cff0, wait_for=0x55c400549590, prdt=0x0) at /home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc:2071
            #18 0x000055c3fe4a771e in lock_rec_insert_check_and_lock (flags=0, rec=0x7fb70f5980ae "\200", block=0x7fb70f15d800, index=0x7fb69401e1e8, thr=0x7fb6bc00f1a0, mtr=0x7fb70837e270, inherit=0x7fb70837d430) at /home/jan/mysql/10.2/storage/innobase/lock/lock0lock.cc:5900
            #19 0x000055c3fe6777c3 in btr_cur_ins_lock_and_undo (flags=0, cursor=0x7fb70837d750, entry=0x7fb6a40466e8, thr=0x7fb6bc00f1a0, mtr=0x7fb70837e270, inherit=0x7fb70837d430) at /home/jan/mysql/10.2/storage/innobase/btr/btr0cur.cc:3075
            #20 0x000055c3fe67821a in btr_cur_optimistic_insert (flags=0, cursor=0x7fb70837d750, offsets=0x7fb70837d6e8, heap=0x7fb70837d6b0, entry=0x7fb6a40466e8, rec=0x7fb70837d6f0, big_rec=0x7fb70837d6f8, n_ext=0, thr=0x7fb6bc00f1a0, mtr=0x7fb70837e270) at /home/jan/mysql/10.2/storage/innobase/btr/btr0cur.cc:3348
            #21 0x000055c3fe54a76e in row_ins_sec_index_entry_low (flags=0, mode=2, index=0x7fb69401e1e8, offsets_heap=0x7fb6a40071a0, heap=0x7fb6a402be10, entry=0x7fb6a40466e8, trx_id=0, thr=0x7fb6bc00f1a0, dup_chk_only=false) at /home/jan/mysql/10.2/storage/innobase/row/row0ins.cc:3057
            #22 0x000055c3fe54b2c5 in row_ins_sec_index_entry (index=0x7fb69401e1e8, entry=0x7fb6a40466e8, thr=0x7fb6bc00f1a0, dup_chk_only=false) at /home/jan/mysql/10.2/storage/innobase/row/row0ins.cc:3302
            #23 0x000055c3fe5c6a8d in row_upd_sec_index_entry (node=0x7fb6bc00ee68, thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:2444
            #24 0x000055c3fe5c6bfd in row_upd_sec_step (node=0x7fb6bc00ee68, thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:2471
            #25 0x000055c3fe5c87a7 in row_upd (node=0x7fb6bc00ee68, thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:3248
            #26 0x000055c3fe5c8af2 in row_upd_step (thr=0x7fb6bc00f1a0) at /home/jan/mysql/10.2/storage/innobase/row/row0upd.cc:3365
            #27 0x000055c3fe56c38e in row_update_for_mysql_using_upd_graph (mysql_rec=0x7fb6bc016f78 "\375\006", prebuilt=0x7fb6bc00e338) at /home/jan/mysql/10.2/storage/innobase/row/row0mysql.cc:2550
            #28 0x000055c3fe56cecc in row_update_for_mysql (mysql_rec=0x7fb6bc016f78 "\375\006", prebuilt=0x7fb6bc00e338) at /home/jan/mysql/10.2/storage/innobase/row/row0mysql.cc:2762
            #29 0x000055c3fe42d9ad in ha_innobase::update_row (this=0x7fb6bc00db98, old_row=0x7fb6bc016f78 "\375\006", new_row=0x7fb6bc016f68 "\377\006") at /home/jan/mysql/10.2/storage/innobase/handler/ha_innodb.cc:9696
            #30 0x000055c3fe0ec088 in handler::ha_update_row (this=0x7fb6bc00db98, old_data=0x7fb6bc016f78 "\375\006", new_data=0x7fb6bc016f68 "\377\006") at /home/jan/mysql/10.2/sql/handler.cc:5924
            #31 0x000055c3fdf878b9 in mysql_update (thd=0x7fb6a4000b00, table_list=0x7fb6a40092a0, fields=..., values=..., conds=0x7fb6a4009bf0, order_num=0, order=0x0, limit=18446744073709551615, handle_duplicates=DUP_ERROR, ignore=false, found_return=0x7fb70837fe30, updated_return=0x7fb70837fee0) at /home/jan/mysql/10.2/sql/sql_update.cc:810
            #32 0x000055c3fdea9a02 in mysql_execute_command (thd=0x7fb6a4000b00) at /home/jan/mysql/10.2/sql/sql_parse.cc:4140
            #33 0x000055c3fdeb311e in mysql_parse (thd=0x7fb6a4000b00, rawbuf=0x7fb6a800c04b "UPDATE t4 SET b=NULL WHERE a=6", length=30, parser_state=0x7fb7083806f0, is_com_multi=false, is_next_command=false) at /home/jan/mysql/10.2/sql/sql_parse.cc:7742
            #34 0x000055c3fe1e14ce in Query_log_event::do_apply_event (this=0x7fb6a800f590, rgi=0x7fb6a8034780, query_arg=0x7fb6a800c04b "UPDATE t4 SET b=NULL WHERE a=6", q_len_arg=30) at /home/jan/mysql/10.2/sql/log_event.cc:4459
            #35 0x000055c3fe1e063f in Query_log_event::do_apply_event (this=0x7fb6a800f590, rgi=0x7fb6a8034780) at /home/jan/mysql/10.2/sql/log_event.cc:4170
            #36 0x000055c3fde10991 in Log_event::apply_event (this=0x7fb6a800f590, rgi=0x7fb6a8034780) at /home/jan/mysql/10.2/sql/log_event.h:1343
            #37 0x000055c3fde068a4 in apply_event_and_update_pos (ev=0x7fb6a800f590, thd=0x7fb6a4000b00, rgi=0x7fb6a8034780, rpt=0x7fb6c40f9350) at /home/jan/mysql/10.2/sql/slave.cc:3366
            #38 0x000055c3fe0491b4 in rpt_handle_event (qev=0x7fb6a8035230, rpt=0x7fb6c40f9350) at /home/jan/mysql/10.2/sql/rpl_parallel.cc:52
            #39 0x000055c3fe04be64 in handle_rpl_parallel_thread (arg=0x7fb6c40f9350) at /home/jan/mysql/10.2/sql/rpl_parallel.cc:1237
            #40 0x000055c3fe2d8c38 in pfs_spawn_thread (arg=0x7fb6c40feb10) at /home/jan/mysql/10.2/storage/perfschema/pfs.cc:1862
            #41 0x00007fb7270036fa in start_thread (arg=0x7fb708381300) at pthread_create.c:333
            #42 0x00007fb725bd0b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
            {noformat}

            Problem is that when we enter innodb_kill_query victim_thd is holding incorrect mutexes or we try to kill incorrect victim. There are new handler functions like thd_trx_arbitrate() and thd_trx_priority() not sure if I implemented them correctly.
            This with InnoDB 5.7 with refactored lock0lock.cc in branch bb-10.2-jan commit 3f1b8c9474e351e50a6556675108864d5aac5528

            Affected tests:
            * rpl.rpl_parallel
            * rpl_parallel_optimistic
            jplindst Jan Lindström (Inactive) made changes -
            Summary Parallel replication lock waits/deadlock handling does not work with InnoDB 5.7 Parallel replication can't sync with master in InnoDB 5.7 (10.2)
            jplindst Jan Lindström (Inactive) made changes -
            Summary Parallel replication can't sync with master in InnoDB 5.7 (10.2) Parallel replication can't sync with master in InnoDB 5.7 (branch bb-10.2-jan)
            elenst Elena Stepanova made changes -
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s N/A [ 14700 ]
            Fix Version/s 10.2 [ 14601 ]
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Closed [ 6 ]
            alice Alice Sherepa made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 76644 ] MariaDB v4 [ 150753 ]

            People

              monty Michael Widenius
              jplindst Jan Lindström (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.