Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-22494

Galera assertion lock_sys.mutex.is_owned() at lock_trx_handle_wait_low

Details

    Description

      Tests: galera.galera_bf_abort_shutdown galera_sr.galera_sr_kill_connection

      Thread 1 (Thread 0x7f0a058aa700 (LWP 290963)):
      #0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      #1  0x00005610f3f597cd in my_write_core (sig=6) at /home/jan/mysql/10.4/mysys/stacktrace.c:481
      #2  0x00005610f36aaa47 in handle_fatal_signal (sig=6) at /home/jan/mysql/10.4/sql/signal_handler.cc:343
      #3  <signal handler called>
      #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #5  0x00007f0a1475a859 in __GI_abort () at abort.c:79
      #6  0x00007f0a1475a729 in __assert_fail_base (fmt=0x7f0a148f0588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5610f425fcb3 "(lock_sys.mutex.is_owned())", file=0x5610f4260648 "/home/jan/mysql/10.4/storage/innobase/lock/lock0lock.cc", line=6233, function=<optimized out>) at assert.c:92
      #7  0x00007f0a1476bf36 in __GI___assert_fail (assertion=0x5610f425fcb3 "(lock_sys.mutex.is_owned())", file=0x5610f4260648 "/home/jan/mysql/10.4/storage/innobase/lock/lock0lock.cc", line=6233, function=0x5610f4264998 "dberr_t lock_trx_handle_wait_low(trx_t*)") at assert.c:101
      #8  0x00005610f39c37f9 in lock_trx_handle_wait_low (trx=0x7f0a063d9268) at /home/jan/mysql/10.4/storage/innobase/lock/lock0lock.cc:6233
      #9  0x00005610f39c38ed in lock_trx_handle_wait (trx=0x7f0a063d9268) at /home/jan/mysql/10.4/storage/innobase/lock/lock0lock.cc:6261
      #10 0x00005610f3903459 in innobase_kill_query (thd=0x7f0994000d90) at /home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc:5134
      #11 0x00005610f36aeed7 in kill_handlerton (thd=0x7f0994000d90, plugin=0x7f0990012df0, level=0x7f0a058a88f4) at /home/jan/mysql/10.4/sql/handler.cc:847
      #12 0x00005610f336a090 in plugin_foreach_with_mask (thd=0x7f0994000d90, func=0x5610f36aee51 <kill_handlerton(THD*, plugin_ref, void*)>, type=1, state_mask=8, arg=0x7f0a058a88f4) at /home/jan/mysql/10.4/sql/sql_plugin.cc:2438
      #13 0x00005610f36aef35 in ha_kill_query (thd=0x7f0994000d90, level=THD_ABORT_ASAP) at /home/jan/mysql/10.4/sql/handler.cc:854
      #14 0x00005610f32d4eae in THD::awake_no_mutex (this=0x7f0994000d90, state_to_set=KILL_CONNECTION_HARD) at /home/jan/mysql/10.4/sql/sql_class.cc:1902
      #15 0x00005610f335d78e in kill_one_thread (thd=0x7f0990001fe0, id=65, kill_signal=KILL_CONNECTION_HARD, type=KILL_TYPE_ID) at /home/jan/mysql/10.4/sql/sql_parse.cc:9083
      #16 0x00005610f335dcd3 in sql_kill (thd=0x7f0990001fe0, id=65, state=KILL_CONNECTION_HARD, type=KILL_TYPE_ID) at /home/jan/mysql/10.4/sql/sql_parse.cc:9202
      #17 0x00005610f335335a in mysql_execute_command (thd=0x7f0990001fe0) at /home/jan/mysql/10.4/sql/sql_parse.cc:5531
      #18 0x00005610f335acb1 in mysql_parse (thd=0x7f0990001fe0, rawbuf=0x7f09900134b0 "KILL CONNECTION 65", length=18, parser_state=0x7f0a058a94d0, is_com_multi=false, is_next_command=false) at /home/jan/mysql/10.4/sql/sql_parse.cc:7900
      #19 0x00005610f335a3a3 in wsrep_mysql_parse (thd=0x7f0990001fe0, rawbuf=0x7f09900134b0 "KILL CONNECTION 65", length=18, parser_state=0x7f0a058a94d0, is_com_multi=false, is_next_command=false) at /home/jan/mysql/10.4/sql/sql_parse.cc:7704
      #20 0x00005610f3347137 in dispatch_command (command=COM_QUERY, thd=0x7f0990001fe0, packet=0x7f09900099e1 "KILL CONNECTION 65", packet_length=18, is_com_multi=false, is_next_command=false) at /home/jan/mysql/10.4/sql/sql_parse.cc:1827
      #21 0x00005610f3345aa4 in do_command (thd=0x7f0990001fe0) at /home/jan/mysql/10.4/sql/sql_parse.cc:1359
      #22 0x00005610f34d0ed3 in do_handle_one_connection (connect=0x5610f66d3300) at /home/jan/mysql/10.4/sql/sql_connect.cc:1412
      #23 0x00005610f34d0c1c in handle_one_connection (arg=0x5610f66d3300) at /home/jan/mysql/10.4/sql/sql_connect.cc:1316
      #24 0x00005610f3ee5812 in pfs_spawn_thread (arg=0x5610f66e9cb0) at /home/jan/mysql/10.4/storage/perfschema/pfs.cc:1869
      #25 0x00007f0a15099609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #26 0x00007f0a14857103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      Attachments

        Activity

          Note bug is not the same as in MDEV-21910.

          jplindst Jan Lindström (Inactive) added a comment - Note bug is not the same as in MDEV-21910 .
          jplindst Jan Lindström (Inactive) added a comment - https://github.com/MariaDB/server/commit/514ede82c7e853ca0a7f734ecfcaff9fd8b7153c

          As far as I understand, the flag trx->lock.was_chosen_as_wsrep_victim must be reset at the end of the transaction’s life cycle, similar to the flag trx_t::wsrep that was recently introduced in MDEV-7962. Please refer to the follow-up fix in 10.4. Your proposed fix fails to extend assert_trx_is_free(), and it is resetting the flag in too many places. The debug macro should be replaced with an inline member function and some code adjusted accordingly. I would propose the following changes to the latest 10.4 as a starting point for the fix. It passed ./mtr --big-test of the default test suites (which do not include Galera):

          diff --git a/storage/innobase/include/trx0trx.h b/storage/innobase/include/trx0trx.h
          index 87bf97f00cf..70df62d0d03 100644
          --- a/storage/innobase/include/trx0trx.h
          +++ b/storage/innobase/include/trx0trx.h
          @@ -436,32 +436,6 @@ Check transaction state */
           	ut_error;							\
           } while (0)
           
          -/** Check if transaction is free so that it can be re-initialized.
          -@param t transaction handle */
          -#define	assert_trx_is_free(t)	do {					\
          -	ut_ad(trx_state_eq((t), TRX_STATE_NOT_STARTED));		\
          -	ut_ad(!(t)->id);						\
          -	ut_ad(!(t)->has_logged());					\
          -	ut_ad(!(t)->is_referenced());					\
          -	ut_ad(!(t)->is_wsrep());					\
          -	ut_ad(!(t)->read_view.is_open());				\
          -	ut_ad((t)->lock.wait_thr == NULL);				\
          -	ut_ad(UT_LIST_GET_LEN((t)->lock.trx_locks) == 0);		\
          -	ut_ad((t)->lock.table_locks.empty());				\
          -	ut_ad(!(t)->autoinc_locks					\
          -	      || ib_vector_is_empty((t)->autoinc_locks));		\
          -	ut_ad(UT_LIST_GET_LEN((t)->lock.evicted_tables) == 0);		\
          -	ut_ad((t)->dict_operation == TRX_DICT_OP_NONE);			\
          -} while(0)
          -
          -/** Check if transaction is in-active so that it can be freed and put back to
          -transaction pool.
          -@param t transaction handle */
          -#define assert_trx_is_inactive(t) do {					\
          -	assert_trx_is_free((t));					\
          -	ut_ad((t)->dict_operation_lock_mode == 0);			\
          -} while(0)
          -
           #ifdef UNIV_DEBUG
           /*******************************************************************//**
           Assert that an autocommit non-locking select cannot be in the
          @@ -1158,6 +1132,26 @@ struct trx_t {
             }
           
           
          +  void assert_freed() const
          +  {
          +    ut_ad(state == TRX_STATE_NOT_STARTED);
          +    ut_ad(!id);
          +    ut_ad(!has_logged());
          +    ut_ad(!is_referenced());
          +    ut_ad(!is_wsrep());
          +#ifdef WITH_WSREP
          +    ut_ad(!lock.was_chosen_as_wsrep_victim);
          +#endif
          +    ut_ad(!read_view.is_open());
          +    ut_ad(!lock.wait_thr);
          +    ut_ad(UT_LIST_GET_LEN(lock.trx_locks) == 0);
          +    ut_ad(lock.table_locks.empty());
          +    ut_ad(!autoinc_locks || ib_vector_is_empty(autoinc_locks));
          +    ut_ad(UT_LIST_GET_LEN(lock.evicted_tables) == 0);
          +    ut_ad(dict_operation == TRX_DICT_OP_NONE);
          +  }
          +
          +
           private:
           	/** Assign a rollback segment for modifying temporary tables.
           	@return the assigned rollback segment */
          diff --git a/storage/innobase/trx/trx0trx.cc b/storage/innobase/trx/trx0trx.cc
          index b1f88b1226f..efbedea3ea1 100644
          --- a/storage/innobase/trx/trx0trx.cc
          +++ b/storage/innobase/trx/trx0trx.cc
          @@ -353,14 +353,13 @@ trx_t *trx_create()
           {
           	trx_t*	trx = trx_pools->get();
           
          -	assert_trx_is_free(trx);
          +	trx->assert_freed();
           
           	mem_heap_t*	heap;
           	ib_alloc_t*	alloc;
           
           	/* We just got trx from pool, it should be non locking */
           	ut_ad(trx->will_lock == 0);
          -	ut_ad(trx->state == TRX_STATE_NOT_STARTED);
           	ut_ad(!trx->rw_trx_hash_pins);
           
           	DBUG_LOG("trx", "Create: " << trx);
          @@ -431,11 +430,11 @@ void trx_free(trx_t*& trx)
           	}
           
           	trx->dict_operation = TRX_DICT_OP_NONE;
          -	assert_trx_is_inactive(trx);
          +	ut_ad(!trx->dict_operation_lock_mode);
           
           	trx_sys.deregister_trx(trx);
           
          -	assert_trx_is_free(trx);
          +	trx->assert_freed();
           
           	trx_sys.rw_trx_hash.put_pins(trx);
           	trx->mysql_thd = 0;
          @@ -1509,10 +1508,11 @@ inline void trx_t::commit_in_memory(const mtr_t *mtr)
             {
               wsrep= false;
               wsrep_commit_ordered(mysql_thd);
          +    lock.was_chosen_as_wsrep_victim= false;
             }
           #endif /* WITH_WSREP */
           
          -  assert_trx_is_free(this);
          +  assert_freed();
             trx_init(this);
             trx_mutex_exit(this);
           
          diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
          index 4f11c50da9c..63f990aaffe 100644
          --- a/storage/innobase/handler/ha_innodb.cc
          +++ b/storage/innobase/handler/ha_innodb.cc
          @@ -18693,6 +18693,7 @@ wsrep_innobase_kill_one_trx(
                   ut_ad(trx_mutex_own(victim_trx));
                   ut_ad(bf_thd_ptr);
                   ut_ad(victim_trx);
          +	DBUG_ASSERT(victim_trx->is_wsrep());
           
           	DBUG_ENTER("wsrep_innobase_kill_one_trx");
           	THD *bf_thd       = bf_thd_ptr ? (THD*) bf_thd_ptr : NULL;
          

          The last hunk is enforcing the invariant trx->is_wsrep() || !trx->lock.was_chosen_as_wsrep_victim(). You should also try to add a similar debug assertion for the transaction that is associated with bf_thd, something like this:

          DBUG_ASSERT(!bf_thd || thd_to_trx(bf_thd)->is_wsrep());
          

          Before the introduction of trx_t::is_wsrep() it could only have been written as follows:

          DBUG_ASSERT(!bf_thd || wsrep_on(bf_thd));
          

          Please submit a revised patch for final review.

          marko Marko Mäkelä added a comment - As far as I understand, the flag trx->lock.was_chosen_as_wsrep_victim must be reset at the end of the transaction’s life cycle, similar to the flag trx_t::wsrep that was recently introduced in MDEV-7962 . Please refer to the follow-up fix in 10.4 . Your proposed fix fails to extend assert_trx_is_free() , and it is resetting the flag in too many places. The debug macro should be replaced with an inline member function and some code adjusted accordingly. I would propose the following changes to the latest 10.4 as a starting point for the fix. It passed ./mtr --big-test of the default test suites (which do not include Galera): diff --git a/storage/innobase/include/trx0trx.h b/storage/innobase/include/trx0trx.h index 87bf97f00cf..70df62d0d03 100644 --- a/storage/innobase/include/trx0trx.h +++ b/storage/innobase/include/trx0trx.h @@ -436,32 +436,6 @@ Check transaction state */ ut_error; \ } while (0) -/** Check if transaction is free so that it can be re-initialized. -@param t transaction handle */ -#define assert_trx_is_free(t) do { \ - ut_ad(trx_state_eq((t), TRX_STATE_NOT_STARTED)); \ - ut_ad(!(t)->id); \ - ut_ad(!(t)->has_logged()); \ - ut_ad(!(t)->is_referenced()); \ - ut_ad(!(t)->is_wsrep()); \ - ut_ad(!(t)->read_view.is_open()); \ - ut_ad((t)->lock.wait_thr == NULL); \ - ut_ad(UT_LIST_GET_LEN((t)->lock.trx_locks) == 0); \ - ut_ad((t)->lock.table_locks.empty()); \ - ut_ad(!(t)->autoinc_locks \ - || ib_vector_is_empty((t)->autoinc_locks)); \ - ut_ad(UT_LIST_GET_LEN((t)->lock.evicted_tables) == 0); \ - ut_ad((t)->dict_operation == TRX_DICT_OP_NONE); \ -} while(0) - -/** Check if transaction is in-active so that it can be freed and put back to -transaction pool. -@param t transaction handle */ -#define assert_trx_is_inactive(t) do { \ - assert_trx_is_free((t)); \ - ut_ad((t)->dict_operation_lock_mode == 0); \ -} while(0) - #ifdef UNIV_DEBUG /*******************************************************************//** Assert that an autocommit non-locking select cannot be in the @@ -1158,6 +1132,26 @@ struct trx_t { } + void assert_freed() const + { + ut_ad(state == TRX_STATE_NOT_STARTED); + ut_ad(!id); + ut_ad(!has_logged()); + ut_ad(!is_referenced()); + ut_ad(!is_wsrep()); +#ifdef WITH_WSREP + ut_ad(!lock.was_chosen_as_wsrep_victim); +#endif + ut_ad(!read_view.is_open()); + ut_ad(!lock.wait_thr); + ut_ad(UT_LIST_GET_LEN(lock.trx_locks) == 0); + ut_ad(lock.table_locks.empty()); + ut_ad(!autoinc_locks || ib_vector_is_empty(autoinc_locks)); + ut_ad(UT_LIST_GET_LEN(lock.evicted_tables) == 0); + ut_ad(dict_operation == TRX_DICT_OP_NONE); + } + + private: /** Assign a rollback segment for modifying temporary tables. @return the assigned rollback segment */ diff --git a/storage/innobase/trx/trx0trx.cc b/storage/innobase/trx/trx0trx.cc index b1f88b1226f..efbedea3ea1 100644 --- a/storage/innobase/trx/trx0trx.cc +++ b/storage/innobase/trx/trx0trx.cc @@ -353,14 +353,13 @@ trx_t *trx_create() { trx_t* trx = trx_pools->get(); - assert_trx_is_free(trx); + trx->assert_freed(); mem_heap_t* heap; ib_alloc_t* alloc; /* We just got trx from pool, it should be non locking */ ut_ad(trx->will_lock == 0); - ut_ad(trx->state == TRX_STATE_NOT_STARTED); ut_ad(!trx->rw_trx_hash_pins); DBUG_LOG("trx", "Create: " << trx); @@ -431,11 +430,11 @@ void trx_free(trx_t*& trx) } trx->dict_operation = TRX_DICT_OP_NONE; - assert_trx_is_inactive(trx); + ut_ad(!trx->dict_operation_lock_mode); trx_sys.deregister_trx(trx); - assert_trx_is_free(trx); + trx->assert_freed(); trx_sys.rw_trx_hash.put_pins(trx); trx->mysql_thd = 0; @@ -1509,10 +1508,11 @@ inline void trx_t::commit_in_memory(const mtr_t *mtr) { wsrep= false; wsrep_commit_ordered(mysql_thd); + lock.was_chosen_as_wsrep_victim= false; } #endif /* WITH_WSREP */ - assert_trx_is_free(this); + assert_freed(); trx_init(this); trx_mutex_exit(this); diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc index 4f11c50da9c..63f990aaffe 100644 --- a/storage/innobase/handler/ha_innodb.cc +++ b/storage/innobase/handler/ha_innodb.cc @@ -18693,6 +18693,7 @@ wsrep_innobase_kill_one_trx( ut_ad(trx_mutex_own(victim_trx)); ut_ad(bf_thd_ptr); ut_ad(victim_trx); + DBUG_ASSERT(victim_trx->is_wsrep()); DBUG_ENTER("wsrep_innobase_kill_one_trx"); THD *bf_thd = bf_thd_ptr ? (THD*) bf_thd_ptr : NULL; The last hunk is enforcing the invariant trx->is_wsrep() || !trx->lock.was_chosen_as_wsrep_victim() . You should also try to add a similar debug assertion for the transaction that is associated with bf_thd , something like this: DBUG_ASSERT(!bf_thd || thd_to_trx(bf_thd)->is_wsrep()); Before the introduction of trx_t::is_wsrep() it could only have been written as follows: DBUG_ASSERT(!bf_thd || wsrep_on(bf_thd)); Please submit a revised patch for final review.

          Last hunk is not possible for Galera, consider following:

          Thread 1 (Thread 0x7f16cc4dd700 (LWP 1933260)):
          #0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
          #1  0x00005613b420954b in my_write_core (sig=6) at /home/jan/mysql/10.4/mysys/stacktrace.c:481
          #2  0x00005613b395ba47 in handle_fatal_signal (sig=6) at /home/jan/mysql/10.4/sql/signal_handler.cc:343
          #3  <signal handler called>
          #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
          #5  0x00007f16d3099859 in __GI_abort () at abort.c:79
          #6  0x00007f16d3099729 in __assert_fail_base (fmt=0x7f16d322f588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5613b44ef247 "victim_trx->is_wsrep()", file=0x5613b44e60b0 "/home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc", line=18696, function=<optimized out>) at assert.c:92
          #7  0x00007f16d30aaf36 in __GI___assert_fail (assertion=0x5613b44ef247 "victim_trx->is_wsrep()", file=0x5613b44e60b0 "/home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc", line=18696, function=0x5613b44ef1c8 "int wsrep_innobase_kill_one_trx(void*, const trx_t*, trx_t*, ulint)") at assert.c:101
          #8  0x00005613b3bd1af5 in wsrep_innobase_kill_one_trx (bf_thd_ptr=0x7f16b4000d90, bf_trx=0x7f16c5457140, victim_trx=0x7f16c5459390, signal=1) at /home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc:18696
          #9  0x00005613b3bd2148 in wsrep_abort_transaction (bf_thd=0x7f16b4000d90, victim_thd=0x7f1650000d90, signal=1 '\001') at /home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc:18777
          #10 0x00005613b397277a in ha_abort_transaction (bf_thd=0x7f16b4000d90, victim_thd=0x7f1650000d90, signal=1 '\001') at /home/jan/mysql/10.4/sql/handler.cc:7020
          #11 0x00005613b387b9e8 in wsrep_abort_thd (bf_thd_ptr=0x7f16b4000d90, victim_thd_ptr=0x7f1650000d90, signal=1 '\001') at /home/jan/mysql/10.4/sql/wsrep_thd.cc:338
          #12 0x00005613b386a6f5 in wsrep_handle_mdl_conflict (requestor_ctx=0x7f16b4000eb0, ticket=0x7f16500c1ea0, key=0x7f16600de110) at /home/jan/mysql/10.4/sql/wsrep_mysqld.cc:2245
          #13 0x00005613b37900e1 in MDL_lock::can_grant_lock (this=0x7f16600de110, type_arg=MDL_EXCLUSIVE, requestor_ctx=0x7f16b4000eb0, ignore_lock_priority=false) at /home/jan/mysql/10.4/sql/mdl.cc:1769
          #14 0x00005613b3790a56 in MDL_context::try_acquire_lock_impl (this=0x7f16b4000eb0, mdl_request=0x7f16cc4da150, out_ticket=0x7f16cc4da068) at /home/jan/mysql/10.4/sql/mdl.cc:2100
          #15 0x00005613b3790e66 in MDL_context::acquire_lock (this=0x7f16b4000eb0, mdl_request=0x7f16cc4da150, lock_wait_timeout=86400) at /home/jan/mysql/10.4/sql/mdl.cc:2247
          #16 0x00005613b3791ad8 in MDL_context::upgrade_shared_lock (this=0x7f16b4000eb0, mdl_ticket=0x7f1650067a40, new_type=MDL_EXCLUSIVE, lock_wait_timeout=86400) at /home/jan/mysql/10.4/sql/mdl.cc:2523
          #17 0x00005613b355ba9d in upgrade_lock_if_not_exists (thd=0x7f16b4000d90, create_info=..., create_table=0x7f16b400f858, lock_wait_timeout=86400) at /home/jan/mysql/10.4/sql/sql_base.cc:3982
          #18 0x00005613b355bfb5 in lock_table_names (thd=0x7f16b4000d90, options=..., tables_start=0x7f16b400f858, tables_end=0x0, lock_wait_timeout=86400, flags=0) at /home/jan/mysql/10.4/sql/sql_base.cc:4087
          #19 0x00005613b355c5a8 in open_tables (thd=0x7f16b4000d90, options=..., start=0x7f16cc4da718, counter=0x7f16cc4da734, flags=0, prelocking_strategy=0x7f16cc4da7b8) at /home/jan/mysql/10.4/sql/sql_base.cc:4292
          #20 0x00005613b355e84b in open_and_lock_tables (thd=0x7f16b4000d90, options=..., tables=0x7f16b400f858, derived=false, flags=0, prelocking_strategy=0x7f16cc4da7b8) at /home/jan/mysql/10.4/sql/sql_base.cc:5217
          #21 0x00005613b36f808a in open_and_lock_tables (thd=0x7f16b4000d90, options=..., tables=0x7f16b400f858, derived=false, flags=0) at /home/jan/mysql/10.4/sql/sql_base.h:495
          #22 0x00005613b36e411b in mysql_create_table (thd=0x7f16b4000d90, create_table=0x7f16b400f858, create_info=0x7f16cc4da9f0, alter_info=0x7f16cc4da930) at /home/jan/mysql/10.4/sql/sql_table.cc:5219
          #23 0x00005613b36f6a0c in Sql_cmd_create_table_like::execute (this=0x7f16b400f7f0, thd=0x7f16b4000d90) at /home/jan/mysql/10.4/sql/sql_table.cc:11464
          #24 0x00005613b3606737 in mysql_execute_command (thd=0x7f16b4000d90) at /home/jan/mysql/10.4/sql/sql_parse.cc:6101
          #25 0x00005613b360bcb1 in mysql_parse (thd=0x7f16b4000d90, rawbuf=0x7f16b40476ab "CREATE TABLE `t1` (\n  `SLEEP(0.1)` int(1) NOT NULL\n)", length=52, parser_state=0x7f16cc4db4e0, is_com_multi=false, is_next_command=false) at /home/jan/mysql/10.4/sql/sql_parse.cc:7900
          #26 0x00005613b3ab8fb4 in Query_log_event::do_apply_event (this=0x7f16b4044940, rgi=0x7f16b4020080, query_arg=0x7f16b40476ab "CREATE TABLE `t1` (\n  `SLEEP(0.1)` int(1) NOT NULL\n)", q_len_arg=52) at /home/jan/mysql/10.4/sql/log_event.cc:5738
          #27 0x00005613b3ab7e55 in Query_log_event::do_apply_event (this=0x7f16b4044940, rgi=0x7f16b4020080) at /home/jan/mysql/10.4/sql/log_event.cc:5416
          #28 0x00005613b34ff917 in Log_event::apply_event (this=0x7f16b4044940, rgi=0x7f16b4020080) at /home/jan/mysql/10.4/sql/log_event.h:1482
          #29 0x00005613b3879ce8 in wsrep_apply_events (thd=0x7f16b4000d90, rli=0x7f16b4014960, events_buf=0x7f16cd9f1788, buf_len=130) at /home/jan/mysql/10.4/sql/wsrep_applier.cc:200
          #30 0x00005613b385ca8a in Wsrep_applier_service::apply_write_set (this=0x7f16cc4dcd40, ws_meta=..., data=...) at /home/jan/mysql/10.4/sql/wsrep_high_priority_service.cc:501
          #31 0x00005613b4286815 in apply_write_set (server_state=..., high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /home/jan/mysql/10.4/wsrep-lib/src/server_state.cpp:327
          #32 0x00005613b428a4e4 in wsrep::server_state::on_apply (this=0x5613b6098ed0, high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /home/jan/mysql/10.4/wsrep-lib/src/server_state.cpp:1124
          #33 0x00005613b42a30eb in wsrep::high_priority_service::apply (this=0x7f16cc4dcd40, ws_handle=..., ws_meta=..., data=...) at /home/jan/mysql/10.4/wsrep-lib/include/wsrep/high_priority_service.hpp:47
          #34 0x00005613b42a01d3 in (anonymous namespace)::apply_cb (ctx=0x7f16cc4dcd40, wsh=0x7f16cc4dbdb0, flags=65, buf=0x7f16cc4dbdc0, meta=0x7f16cc4dc070, exit_loop=0x7f16cc4dc02f) at /home/jan/mysql/10.4/wsrep-lib/src/wsrep_provider_v26.cpp:496
          #35 0x00007f16ce226585 in galera::TrxHandleSlave::apply (this=this@entry=0x7f164c055530, recv_ctx=recv_ctx@entry=0x7f16cc4dcd40, apply_cb=0x5613b429ffa0 <(anonymous namespace)::apply_cb(void*, wsrep_ws_handle_t const*, uint32_t, wsrep_buf_t const*, wsrep_trx_meta_t const*, wsrep_bool_t*)>, meta=..., exit_loop=exit_loop@entry=@0x7f16cc4dc02f: false) at galera/src/trx_handle.cpp:391
          #36 0x00007f16ce26b5c2 in galera::ReplicatorSMM::apply_trx (this=0x5613b60cc060, recv_ctx=0x7f16cc4dcd40, ts=...) at galera/src/replicator_smm.cpp:504
          #37 0x00007f16ce2735b8 in galera::ReplicatorSMM::process_trx (this=0x5613b60cc060, recv_ctx=0x7f16cc4dcd40, ts_ptr=...) at galera/src/replicator_smm.cpp:2127
          #38 0x00007f16ce2455d3 in galera::GcsActionSource::process_writeset (this=0x5613b60d8dd0, recv_ctx=0x7f16cc4dcd40, act=..., exit_loop=@0x7f16cc4dc91f: false) at galera/src/gcs_action_source.cpp:62
          #39 0x00007f16ce245f9b in galera::GcsActionSource::dispatch (this=<optimized out>, recv_ctx=<optimized out>, act=..., exit_loop=<optimized out>) at galera/src/gcs_action_source.cpp:110
          #40 0x00007f16ce2461a2 in galera::GcsActionSource::process (this=0x5613b60d8dd0, recv_ctx=0x7f16cc4dcd40, exit_loop=@0x7f16cc4dc91f: false) at galera/src/gcs_action_source.cpp:183
          #41 0x00007f16ce270610 in galera::ReplicatorSMM::async_recv (this=0x5613b60cc060, recv_ctx=0x7f16cc4dcd40) at galera/src/replicator_smm.cpp:390
          #42 0x00007f16ce28cb41 in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at galera/src/wsrep_provider.cpp:236
          #43 0x00005613b42a11ac in wsrep::wsrep_provider_v26::run_applier (this=0x5613b6069670, applier_ctx=0x7f16cc4dcd40) at /home/jan/mysql/10.4/wsrep-lib/src/wsrep_provider_v26.cpp:731
          #44 0x00005613b387a5b9 in wsrep_replication_process (thd=0x7f16b4000d90, arg=0x5613b611e0b0) at /home/jan/mysql/10.4/sql/wsrep_thd.cc:58
          #45 0x00005613b386c466 in start_wsrep_THD (arg=0x5613b611e0b0) at /home/jan/mysql/10.4/sql/wsrep_mysqld.cc:2789
          #46 0x00005613b4195590 in pfs_spawn_thread (arg=0x5613b60f19f0) at /home/jan/mysql/10.4/storage/perfschema/pfs.cc:1869
          #47 0x00007f16d39d8609 in start_thread (arg=<optimized out>) at pthread_create.c:477
          #48 0x00007f16d3196103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
          

          jplindst Jan Lindström (Inactive) added a comment - Last hunk is not possible for Galera, consider following: Thread 1 (Thread 0x7f16cc4dd700 (LWP 1933260)): #0 __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56 #1 0x00005613b420954b in my_write_core (sig=6) at /home/jan/mysql/10.4/mysys/stacktrace.c:481 #2 0x00005613b395ba47 in handle_fatal_signal (sig=6) at /home/jan/mysql/10.4/sql/signal_handler.cc:343 #3 <signal handler called> #4 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #5 0x00007f16d3099859 in __GI_abort () at abort.c:79 #6 0x00007f16d3099729 in __assert_fail_base (fmt=0x7f16d322f588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5613b44ef247 "victim_trx->is_wsrep()", file=0x5613b44e60b0 "/home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc", line=18696, function=<optimized out>) at assert.c:92 #7 0x00007f16d30aaf36 in __GI___assert_fail (assertion=0x5613b44ef247 "victim_trx->is_wsrep()", file=0x5613b44e60b0 "/home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc", line=18696, function=0x5613b44ef1c8 "int wsrep_innobase_kill_one_trx(void*, const trx_t*, trx_t*, ulint)") at assert.c:101 #8 0x00005613b3bd1af5 in wsrep_innobase_kill_one_trx (bf_thd_ptr=0x7f16b4000d90, bf_trx=0x7f16c5457140, victim_trx=0x7f16c5459390, signal=1) at /home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc:18696 #9 0x00005613b3bd2148 in wsrep_abort_transaction (bf_thd=0x7f16b4000d90, victim_thd=0x7f1650000d90, signal=1 '\001') at /home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc:18777 #10 0x00005613b397277a in ha_abort_transaction (bf_thd=0x7f16b4000d90, victim_thd=0x7f1650000d90, signal=1 '\001') at /home/jan/mysql/10.4/sql/handler.cc:7020 #11 0x00005613b387b9e8 in wsrep_abort_thd (bf_thd_ptr=0x7f16b4000d90, victim_thd_ptr=0x7f1650000d90, signal=1 '\001') at /home/jan/mysql/10.4/sql/wsrep_thd.cc:338 #12 0x00005613b386a6f5 in wsrep_handle_mdl_conflict (requestor_ctx=0x7f16b4000eb0, ticket=0x7f16500c1ea0, key=0x7f16600de110) at /home/jan/mysql/10.4/sql/wsrep_mysqld.cc:2245 #13 0x00005613b37900e1 in MDL_lock::can_grant_lock (this=0x7f16600de110, type_arg=MDL_EXCLUSIVE, requestor_ctx=0x7f16b4000eb0, ignore_lock_priority=false) at /home/jan/mysql/10.4/sql/mdl.cc:1769 #14 0x00005613b3790a56 in MDL_context::try_acquire_lock_impl (this=0x7f16b4000eb0, mdl_request=0x7f16cc4da150, out_ticket=0x7f16cc4da068) at /home/jan/mysql/10.4/sql/mdl.cc:2100 #15 0x00005613b3790e66 in MDL_context::acquire_lock (this=0x7f16b4000eb0, mdl_request=0x7f16cc4da150, lock_wait_timeout=86400) at /home/jan/mysql/10.4/sql/mdl.cc:2247 #16 0x00005613b3791ad8 in MDL_context::upgrade_shared_lock (this=0x7f16b4000eb0, mdl_ticket=0x7f1650067a40, new_type=MDL_EXCLUSIVE, lock_wait_timeout=86400) at /home/jan/mysql/10.4/sql/mdl.cc:2523 #17 0x00005613b355ba9d in upgrade_lock_if_not_exists (thd=0x7f16b4000d90, create_info=..., create_table=0x7f16b400f858, lock_wait_timeout=86400) at /home/jan/mysql/10.4/sql/sql_base.cc:3982 #18 0x00005613b355bfb5 in lock_table_names (thd=0x7f16b4000d90, options=..., tables_start=0x7f16b400f858, tables_end=0x0, lock_wait_timeout=86400, flags=0) at /home/jan/mysql/10.4/sql/sql_base.cc:4087 #19 0x00005613b355c5a8 in open_tables (thd=0x7f16b4000d90, options=..., start=0x7f16cc4da718, counter=0x7f16cc4da734, flags=0, prelocking_strategy=0x7f16cc4da7b8) at /home/jan/mysql/10.4/sql/sql_base.cc:4292 #20 0x00005613b355e84b in open_and_lock_tables (thd=0x7f16b4000d90, options=..., tables=0x7f16b400f858, derived=false, flags=0, prelocking_strategy=0x7f16cc4da7b8) at /home/jan/mysql/10.4/sql/sql_base.cc:5217 #21 0x00005613b36f808a in open_and_lock_tables (thd=0x7f16b4000d90, options=..., tables=0x7f16b400f858, derived=false, flags=0) at /home/jan/mysql/10.4/sql/sql_base.h:495 #22 0x00005613b36e411b in mysql_create_table (thd=0x7f16b4000d90, create_table=0x7f16b400f858, create_info=0x7f16cc4da9f0, alter_info=0x7f16cc4da930) at /home/jan/mysql/10.4/sql/sql_table.cc:5219 #23 0x00005613b36f6a0c in Sql_cmd_create_table_like::execute (this=0x7f16b400f7f0, thd=0x7f16b4000d90) at /home/jan/mysql/10.4/sql/sql_table.cc:11464 #24 0x00005613b3606737 in mysql_execute_command (thd=0x7f16b4000d90) at /home/jan/mysql/10.4/sql/sql_parse.cc:6101 #25 0x00005613b360bcb1 in mysql_parse (thd=0x7f16b4000d90, rawbuf=0x7f16b40476ab "CREATE TABLE `t1` (\n `SLEEP(0.1)` int(1) NOT NULL\n)", length=52, parser_state=0x7f16cc4db4e0, is_com_multi=false, is_next_command=false) at /home/jan/mysql/10.4/sql/sql_parse.cc:7900 #26 0x00005613b3ab8fb4 in Query_log_event::do_apply_event (this=0x7f16b4044940, rgi=0x7f16b4020080, query_arg=0x7f16b40476ab "CREATE TABLE `t1` (\n `SLEEP(0.1)` int(1) NOT NULL\n)", q_len_arg=52) at /home/jan/mysql/10.4/sql/log_event.cc:5738 #27 0x00005613b3ab7e55 in Query_log_event::do_apply_event (this=0x7f16b4044940, rgi=0x7f16b4020080) at /home/jan/mysql/10.4/sql/log_event.cc:5416 #28 0x00005613b34ff917 in Log_event::apply_event (this=0x7f16b4044940, rgi=0x7f16b4020080) at /home/jan/mysql/10.4/sql/log_event.h:1482 #29 0x00005613b3879ce8 in wsrep_apply_events (thd=0x7f16b4000d90, rli=0x7f16b4014960, events_buf=0x7f16cd9f1788, buf_len=130) at /home/jan/mysql/10.4/sql/wsrep_applier.cc:200 #30 0x00005613b385ca8a in Wsrep_applier_service::apply_write_set (this=0x7f16cc4dcd40, ws_meta=..., data=...) at /home/jan/mysql/10.4/sql/wsrep_high_priority_service.cc:501 #31 0x00005613b4286815 in apply_write_set (server_state=..., high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /home/jan/mysql/10.4/wsrep-lib/src/server_state.cpp:327 #32 0x00005613b428a4e4 in wsrep::server_state::on_apply (this=0x5613b6098ed0, high_priority_service=..., ws_handle=..., ws_meta=..., data=...) at /home/jan/mysql/10.4/wsrep-lib/src/server_state.cpp:1124 #33 0x00005613b42a30eb in wsrep::high_priority_service::apply (this=0x7f16cc4dcd40, ws_handle=..., ws_meta=..., data=...) at /home/jan/mysql/10.4/wsrep-lib/include/wsrep/high_priority_service.hpp:47 #34 0x00005613b42a01d3 in (anonymous namespace)::apply_cb (ctx=0x7f16cc4dcd40, wsh=0x7f16cc4dbdb0, flags=65, buf=0x7f16cc4dbdc0, meta=0x7f16cc4dc070, exit_loop=0x7f16cc4dc02f) at /home/jan/mysql/10.4/wsrep-lib/src/wsrep_provider_v26.cpp:496 #35 0x00007f16ce226585 in galera::TrxHandleSlave::apply (this=this@entry=0x7f164c055530, recv_ctx=recv_ctx@entry=0x7f16cc4dcd40, apply_cb=0x5613b429ffa0 <(anonymous namespace)::apply_cb(void*, wsrep_ws_handle_t const*, uint32_t, wsrep_buf_t const*, wsrep_trx_meta_t const*, wsrep_bool_t*)>, meta=..., exit_loop=exit_loop@entry=@0x7f16cc4dc02f: false) at galera/src/trx_handle.cpp:391 #36 0x00007f16ce26b5c2 in galera::ReplicatorSMM::apply_trx (this=0x5613b60cc060, recv_ctx=0x7f16cc4dcd40, ts=...) at galera/src/replicator_smm.cpp:504 #37 0x00007f16ce2735b8 in galera::ReplicatorSMM::process_trx (this=0x5613b60cc060, recv_ctx=0x7f16cc4dcd40, ts_ptr=...) at galera/src/replicator_smm.cpp:2127 #38 0x00007f16ce2455d3 in galera::GcsActionSource::process_writeset (this=0x5613b60d8dd0, recv_ctx=0x7f16cc4dcd40, act=..., exit_loop=@0x7f16cc4dc91f: false) at galera/src/gcs_action_source.cpp:62 #39 0x00007f16ce245f9b in galera::GcsActionSource::dispatch (this=<optimized out>, recv_ctx=<optimized out>, act=..., exit_loop=<optimized out>) at galera/src/gcs_action_source.cpp:110 #40 0x00007f16ce2461a2 in galera::GcsActionSource::process (this=0x5613b60d8dd0, recv_ctx=0x7f16cc4dcd40, exit_loop=@0x7f16cc4dc91f: false) at galera/src/gcs_action_source.cpp:183 #41 0x00007f16ce270610 in galera::ReplicatorSMM::async_recv (this=0x5613b60cc060, recv_ctx=0x7f16cc4dcd40) at galera/src/replicator_smm.cpp:390 #42 0x00007f16ce28cb41 in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at galera/src/wsrep_provider.cpp:236 #43 0x00005613b42a11ac in wsrep::wsrep_provider_v26::run_applier (this=0x5613b6069670, applier_ctx=0x7f16cc4dcd40) at /home/jan/mysql/10.4/wsrep-lib/src/wsrep_provider_v26.cpp:731 #44 0x00005613b387a5b9 in wsrep_replication_process (thd=0x7f16b4000d90, arg=0x5613b611e0b0) at /home/jan/mysql/10.4/sql/wsrep_thd.cc:58 #45 0x00005613b386c466 in start_wsrep_THD (arg=0x5613b611e0b0) at /home/jan/mysql/10.4/sql/wsrep_mysqld.cc:2789 #46 0x00005613b4195590 in pfs_spawn_thread (arg=0x5613b60f19f0) at /home/jan/mysql/10.4/storage/perfschema/pfs.cc:1869 #47 0x00007f16d39d8609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #48 0x00007f16d3196103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

          https://github.com/MariaDB/server/commit/0fc77320a133447183f2cfaccb037be878959ad6

          Meanwhile I will run 10 full mtr big tests on galera suites, it just takes few hours machine time.

          jplindst Jan Lindström (Inactive) added a comment - - edited https://github.com/MariaDB/server/commit/0fc77320a133447183f2cfaccb037be878959ad6 Meanwhile I will run 10 full mtr big tests on galera suites, it just takes few hours machine time.

          In trx_init() and trx_create(), there is no need to reset trx->lock.was_chosen_as_wsrep_victim, because it must already have been reset. Likewise, in TrxFactory it is unnecessary to add an assertion about it. We already cover it by trx_t::assert_freed().

          I would expect that your change in trx_t::commit_in_memory() would break cmake -DWITH_WSREP=OFF. But, it did not. It turns out that an #ifdef WITH_WSREP is missing around the declaration in trx0trx.h. Please add that. And please reset the flag at the end of trx_t::commit_in_memory(), as follows:

          #ifdef WITH_WSREP
            /* Serialization history has been written and the transaction is
            committed in memory, which makes this commit ordered. Release commit
            order critical section. */
            if (wsrep)
            {
              wsrep= false;
              wsrep_commit_ordered(mysql_thd);
            }
            lock.was_chosen_as_wsrep_victim= false;
          #endif /* WITH_WSREP */
          

          I assume that Galera can kill an internal transaction, and hence we want to reset the flag for all transactions.

          Please submit a revised patch for review.

          marko Marko Mäkelä added a comment - In trx_init() and trx_create() , there is no need to reset trx->lock.was_chosen_as_wsrep_victim , because it must already have been reset. Likewise, in TrxFactory it is unnecessary to add an assertion about it. We already cover it by trx_t::assert_freed() . I would expect that your change in trx_t::commit_in_memory() would break cmake -DWITH_WSREP=OFF . But, it did not. It turns out that an #ifdef WITH_WSREP is missing around the declaration in trx0trx.h . Please add that. And please reset the flag at the end of trx_t::commit_in_memory() , as follows: #ifdef WITH_WSREP /* Serialization history has been written and the transaction is committed in memory, which makes this commit ordered. Release commit order critical section. */ if (wsrep) { wsrep= false ; wsrep_commit_ordered(mysql_thd); } lock.was_chosen_as_wsrep_victim= false ; #endif /* WITH_WSREP */ I assume that Galera can kill an internal transaction, and hence we want to reset the flag for all transactions. Please submit a revised patch for review.
          jplindst Jan Lindström (Inactive) added a comment - https://github.com/MariaDB/server/commit/3e4a804bdaf0a1ece7a067902dc732fceca3160b

          Thanks! This looks good. I did not review the test cases or wsrep_innobase_kill_one_trx(), but the other changes in InnoDB, including the lifetime of the was_chosen_as_wsrep_victim, look OK, and all tests pass on buildbot.

          marko Marko Mäkelä added a comment - Thanks! This looks good. I did not review the test cases or wsrep_innobase_kill_one_trx() , but the other changes in InnoDB, including the lifetime of the was_chosen_as_wsrep_victim , look OK, and all tests pass on buildbot.

          People

            jplindst Jan Lindström (Inactive)
            jplindst Jan Lindström (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.