Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15785

Test case galera_as_slave_replication_bundle caused debug assertion

Details

    Description

      2018-04-05 10:48:04 140629028526848 [Note] WSREP: ready state reached
      2018-04-05 10:48:04 140629028526848 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log './mysqld-relay-bin.000001' position: 4
      2018-04-05 10:48:04 140629028834048 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:16000',replication started in log 'FIRST' at position 4
      mysqld: /home/jan/mysql/10.1-galera/sql/log_event.cc:6784: virtual int Gtid_log_event::do_apply_event(rpl_group_info*): Assertion `(bits & (1ULL << 25)) == 0' failed.
      180405 10:48:04 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.1.33-MariaDB-debug
      key_buffer_size=1048576
      read_buffer_size=131072
      max_used_connections=2
      max_threads=153
      thread_count=4
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63044 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x7fe6ab06a070
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7fe6bf4a48f0 thread_stack 0x48400
      mysys/stacktrace.c:267(my_print_stacktrace)[0x559f11653ab0]
      sql/signal_handler.cc:168(handle_fatal_signal)[0x559f10fed82b]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x11f50)[0x7fe6c2a45f50]
      linux/raise.c:51(__GI_raise)[0x7fe6bfcb0e7b]
      stdlib/abort.c:81(__GI_abort)[0x7fe6bfcb2231]
      assert/assert.c:89(__assert_fail_base)[0x7fe6bfca99da]
      /lib/x86_64-linux-gnu/libc.so.6(+0x2da52)[0x7fe6bfca9a52]
      sql/log_event.cc:6785(Gtid_log_event::do_apply_event(rpl_group_info*))[0x559f110eb487]
      sql/log_event.h:1343(Log_event::apply_event(rpl_group_info*))[0x559f10d1afcb]
      sql/slave.cc:3479(apply_event_and_update_pos_apply(Log_event*, THD*, rpl_group_info*, int))[0x559f10d10729]
      sql/slave.cc:3601(apply_event_and_update_pos(Log_event*, THD*, rpl_group_info*))[0x559f10d10bb2]
      sql/slave.cc:3885(exec_relay_log_event(THD*, Relay_log_info*, rpl_group_info*))[0x559f10d113b6]
      sql/slave.cc:4981(handle_slave_sql)[0x559f10d145ff]
      perfschema/pfs.cc:1863(pfs_spawn_thread)[0x559f11314c3b]
      nptl/pthread_create.c:463(start_thread)[0x7fe6c2a3b5aa]
      x86_64/clone.S:97(clone)[0x7fe6bfd72cbf]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x0): 
      Connection ID (thread ID): 9
      Status: NOT_KILLED
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
       
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      Writing a core file
      ----------SERVER LOG END-------------
       
       
       - saving '/home/jan/mysql/10.1-galera/mysql-test/var/log/galera.galera_as_slave_replication_bundle-innodb_plugin/' to '/home/jan/mysql/10.1-galera/mysql-test/var/log/galera.galera_as_slave_replication_bundle-innodb_plugin/'
       
      Only  1  of 2 completed.
      --------------------------------------------------------------------------
      The servers were restarted 0 times
      Spent 0.000 of 12 seconds executing testcases
       
      Failure: Failed 1/1 tests, 0.00% were successful.
       
      Failing test(s): galera.galera_as_slave_replication_bundle
       
      The log files in var/log may give you some hint of what went wrong.
       
      If you want to report this error, please read first the documentation
      at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
       
      mysql-test-run: *** ERROR: there were failing test cases
      

      Attachments

        1. bundle.diff
          6 kB
          Jan Lindström

        Activity

          Elkin Can you have a look this assertion happens on middle of replication code that is not familiar to me.

          jplindst Jan Lindström (Inactive) added a comment - Elkin Can you have a look this assertion happens on middle of replication code that is not familiar to me.
          Elkin Andrei Elkin added a comment -

          This looks like caused by skipped resetting of the flag by a preceding transaction. You can find its end position in the master binlog
          in file: rli->group_master_log_name at pos:rli->group_master_log_pos.

          It should have clear the flag at execution
          of Xid_log_event::do_apply_event (or Query_log_event's method for Commit query). As you have a test that reproduces the issue
          it must be straightforward to locate that miss.
          I would do this:

          set a gdb br for this SQL thread before it takes on a preceding Gtid group and watch
          how the flag is set, and then how the group of events terminates (if it does). E.g absence of Xid_log_event at its end
          would trigger the assert. You can

          If you will need any help to process in gdb I am always there.

          Elkin Andrei Elkin added a comment - This looks like caused by skipped resetting of the flag by a preceding transaction. You can find its end position in the master binlog in file: rli->group_master_log_name at pos:rli->group_master_log_pos. It should have clear the flag at execution of Xid_log_event::do_apply_event (or Query_log_event's method for Commit query). As you have a test that reproduces the issue it must be straightforward to locate that miss. I would do this: set a gdb br for this SQL thread before it takes on a preceding Gtid group and watch how the flag is set, and then how the group of events terminates (if it does). E.g absence of Xid_log_event at its end would trigger the assert. You can If you will need any help to process in gdb I am always there.

          After fixing original assertion a new one surfaced from wsrep InnoDB checkpoint. First checkpoint is written as follows (using instrumented code to get assertion where we want):

          2018-10-02 13:03:33 140472161303296 [Note] WSREP: Xid_log_event: Skipping wsrep commit replicated 1 binlog ./mysqld-relay-bin.000002 pos 871
          2018-10-02 13:03:33 140472161303296 [Note] WSREP: apply_event_and_update_pos() result: 0
          2018-10-02 13:03:33 140472161303296 [Note] WSREP: Wsrep checkpoint writing for uuid 66f554e0-c62a-11e8-97f5-9262cfb8a273 xid_seqno 2 trx_sys_cur_xid_seqno 1 trx_state 2
          2018-10-02 13:03:33 140472161303296 [Note] WSREP: apply_event_and_update_pos() result: 0
          2018-10-02 13:03:33 140472161303296 [Note] WSREP: apply_event_and_update_pos() result: 0
          2018-10-02 13:03:33 140472161303296 [Note] WSREP: apply_event_and_update_pos() result: 0
          2018-10-02 13:03:33 7fc239467b00  InnoDB: Assertion failure in thread 140472161303296 in file trx0sys.cc line 391
          InnoDB: Failing assertion: 0
          InnoDB: We intentionally generate a memory trap.
          InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
          InnoDB: If you get repeated assertion failures or crashes, even
          InnoDB: immediately after the mysqld startup, there may be
          InnoDB: corruption in the InnoDB tablespace. Please refer to
          InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
          InnoDB: about forcing recovery.
          181002 13:03:33 [ERROR] mysqld got signal 6 ;
          This could be because you hit a bug. It is also possible that this binary
          or one of the libraries it was linked against is corrupt, improperly built,
          or misconfigured. This error can also be caused by malfunctioning hardware.
           
          To report this bug, see https://mariadb.com/kb/en/reporting-bugs
           
          We will try our best to scrape up some info that will hopefully help
          diagnose the problem, but since we have already crashed, 
          something is definitely wrong and this may fail.
           
          Server version: 10.1.37-MariaDB-debug
          key_buffer_size=1048576
          read_buffer_size=131072
          max_used_connections=2
          max_threads=153
          thread_count=4
          It is possible that mysqld could use up to 
          key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63026 K  bytes of memory
          Hope that's ok; if not, decrease some variables in the equation.
           
          Thread pointer: 0x7fc221c70070
          Attempting backtrace. You can use the following information to find out
          where mysqld died. If you see no messages after this, something went
          terribly wrong...
          stack_bottom = 0x7fc2394668f0 thread_stack 0x48400
          2018-10-02 13:03:33 140472161917696 [Note] WSREP: wsrep_sync_wait: thd->variables.wsrep_sync_wait = 15, mask = 1
          /home/jan/mysql/10.1-galera/sql/mysqld(my_print_stacktrace+0x38)[0x55fd4be68365]
          mysys/stacktrace.c:267(my_print_stacktrace)[0x55fd4b803e39]
          /lib/x86_64-linux-gnu/libpthread.so.0(+0x128e0)[0x7fc23b6858e0]
          /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7fc239ad5f3b]
          linux/raise.c:51(__GI_raise)[0x7fc239ad72f1]
          /home/jan/mysql/10.1-galera/sql/mysqld(+0xbe82e2)[0x55fd4bcdd2e2]
          trx/trx0sys.cc:397(trx_sys_update_wsrep_checkpoint(xid_t const*, unsigned char*, mtr_t*, trx_t const*))[0x55fd4bce49a2]
          trx/trx0trx.cc:1243(trx_write_serialisation_history(trx_t*, mtr_t*))[0x55fd4bce59c1]
          trx/trx0trx.cc:1639(trx_commit_low(trx_t*, mtr_t*))[0x55fd4bce5a86]
          trx/trx0trx.cc:1668(trx_commit(trx_t*))[0x55fd4bce6441]
          trx/trx0trx.cc:1932(trx_commit_for_mysql(trx_t*))[0x55fd4bb40d0d]
          handler/ha_innodb.cc:4704(innobase_commit_low(trx_t*))[0x55fd4bb40feb]
          handler/ha_innodb.cc:4846(innobase_commit_ordered_2(trx_t*, THD*))[0x55fd4bb41191]
          handler/ha_innodb.cc:4906(innobase_commit_ordered(handlerton*, THD*, bool))[0x55fd4b8ea1b4]
          sql/log.cc:8794(TC_LOG::run_commit_ordered(THD*, bool))[0x55fd4b8e8369]
          sql/log.cc:8024(MYSQL_BIN_LOG::trx_group_commit_leader(MYSQL_BIN_LOG::group_commit_entry*))[0x55fd4b8e7345]
          sql/log.cc:7616(MYSQL_BIN_LOG::write_transaction_to_binlog_events(MYSQL_BIN_LOG::group_commit_entry*))[0x55fd4b8e6a58]
          sql/log.cc:7290(MYSQL_BIN_LOG::write_transaction_to_binlog(THD*, binlog_cache_mngr*, Log_event*, bool, bool, bool))[0x55fd4b8d8a29]
          sql/log.cc:1751(binlog_flush_cache(THD*, binlog_cache_mngr*, Log_event*, bool, bool, bool))[0x55fd4b8d8ee8]
          sql/log.cc:1859(binlog_commit_flush_xid_caches(THD*, binlog_cache_mngr*, bool, unsigned long long))[0x55fd4b8ec2c6]
          sql/log.cc:9576(MYSQL_BIN_LOG::log_and_order(THD*, unsigned long long, bool, bool, bool))[0x55fd4b807474]
          sql/handler.cc:1463(ha_commit_trans(THD*, bool))[0x55fd4b71093b]
          sql/transaction.cc:235(trans_commit(THD*))[0x55fd4b902ea5]
          sql/log_event.cc:7675(Xid_log_event::do_apply_event(rpl_group_info*))[0x55fd4b530f3f]
          sql/log_event.h:1343(Log_event::apply_event(rpl_group_info*))[0x55fd4b5267d9]
          sql/slave.cc:3479(apply_event_and_update_pos_apply(Log_event*, THD*, rpl_group_info*, int))[0x55fd4b526c62]
          sql/slave.cc:3601(apply_event_and_update_pos(Log_event*, THD*, rpl_group_info*))[0x55fd4b527466]
          sql/slave.cc:3885(exec_relay_log_event(THD*, Relay_log_info*, rpl_group_info*))[0x55fd4b52a6bb]
          sql/slave.cc:4981(handle_slave_sql)[0x55fd4bb29ace]
          nptl/pthread_create.c:463(start_thread)[0x7fc23b67af2a]
          /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fc239b97edf]
          

          This does not yet break any existing rules. However, we have a second commit for the same xid_seqno as follows (again using instrumented code) that does break current expectations i.e. the fact that xid_seqno should be increasing:

          2018-10-02 13:14:57 140216945548032 [Note] WSREP: Xid_log_event: Skipping wsrep commit replicated 1 binlog ./mysqld-relay-bin.000002 pos 871
          2018-10-02 13:14:57 140216945548032 [Note] WSREP: apply_event_and_update_pos() result: 0
          2018-10-02 13:14:57 140216945548032 [Note] WSREP: Wsrep checkpoint writing for uuid fe20fa80-c62b-11e8-b401-2e57249066f2 xid_seqno 2 trx_sys_cur_xid_seqno 1 trx_state 2
          2018-10-02 13:14:57 140216945548032 [Note] WSREP: apply_event_and_update_pos() result: 0
          2018-10-02 13:14:57 140216945548032 [Note] WSREP: apply_event_and_update_pos() result: 0
          2018-10-02 13:14:57 140216945548032 [Note] WSREP: apply_event_and_update_pos() result: 0
          2018-10-02 13:14:57 140216945548032 [Note] WSREP: WRONG: wsrep checkpoint writing for uuid fe20fa80-c62b-11e8-b401-2e57249066f2 xid_seqno 2 trx_sys_cur_xid_seqno 2 trx_state 2
          2018-10-02 13:14:57 7f86cd3b1b00  InnoDB: Assertion failure in thread 140216945548032 in file trx0sys.cc line 395
          InnoDB: Failing assertion: xid_seqno > trx_sys_cur_xid_seqno
          InnoDB: We intentionally generate a memory trap.
          InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
          InnoDB: If you get repeated assertion failures or crashes, even
          InnoDB: immediately after the mysqld startup, there may be
          InnoDB: corruption in the InnoDB tablespace. Please refer to
          InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
          InnoDB: about forcing recovery.
          181002 13:14:57 [ERROR] mysqld got signal 6 ;
          This could be because you hit a bug. It is also possible that this binary
          or one of the libraries it was linked against is corrupt, improperly built,
          or misconfigured. This error can also be caused by malfunctioning hardware.
           
          To report this bug, see https://mariadb.com/kb/en/reporting-bugs
           
          We will try our best to scrape up some info that will hopefully help
          diagnose the problem, but since we have already crashed, 
          something is definitely wrong and this may fail.
           
          Server version: 10.1.37-MariaDB-debug
          key_buffer_size=1048576
          read_buffer_size=131072
          max_used_connections=2
          max_threads=153
          thread_count=4
          It is possible that mysqld could use up to 
          key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63026 K  bytes of memory
          Hope that's ok; if not, decrease some variables in the equation.
           
          Thread pointer: 0x7f86c1474070
          Attempting backtrace. You can use the following information to find out
          where mysqld died. If you see no messages after this, something went
          terribly wrong...
          stack_bottom = 0x7f86cd3b08f0 thread_stack 0x48400
          2018-10-02 13:14:57 140217000942336 [Note] WSREP: wsrep_sync_wait: thd->variables.wsrep_sync_wait = 15, mask = 1
          /home/jan/mysql/10.1-galera/sql/mysqld(my_print_stacktrace+0x38)[0x55b9c460c409]
          mysys/stacktrace.c:267(my_print_stacktrace)[0x55b9c3fa7e39]
          /lib/x86_64-linux-gnu/libpthread.so.0(+0x128e0)[0x7f86da3c68e0]
          /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7f86d8816f3b]
          linux/raise.c:51(__GI_raise)[0x7f86d88182f1]
          /home/jan/mysql/10.1-galera/sql/mysqld(+0xbe83a9)[0x55b9c44813a9]
          trx/trx0sys.cc:398(trx_sys_update_wsrep_checkpoint(xid_t const*, unsigned char*, mtr_t*, trx_t const*))[0x55b9c4488a45]
          trx/trx0trx.cc:1243(trx_write_serialisation_history(trx_t*, mtr_t*))[0x55b9c4489a64]
          trx/trx0trx.cc:1639(trx_commit_low(trx_t*, mtr_t*))[0x55b9c4489b29]
          trx/trx0trx.cc:1668(trx_commit(trx_t*))[0x55b9c448a4e4]
          trx/trx0trx.cc:1932(trx_commit_for_mysql(trx_t*))[0x55b9c42e4d0d]
          handler/ha_innodb.cc:4704(innobase_commit_low(trx_t*))[0x55b9c42e4feb]
          handler/ha_innodb.cc:4846(innobase_commit_ordered_2(trx_t*, THD*))[0x55b9c42e5191]
          handler/ha_innodb.cc:4906(innobase_commit_ordered(handlerton*, THD*, bool))[0x55b9c408e1b4]
          sql/log.cc:8794(TC_LOG::run_commit_ordered(THD*, bool))[0x55b9c408c369]
          sql/log.cc:8024(MYSQL_BIN_LOG::trx_group_commit_leader(MYSQL_BIN_LOG::group_commit_entry*))[0x55b9c408b345]
          sql/log.cc:7616(MYSQL_BIN_LOG::write_transaction_to_binlog_events(MYSQL_BIN_LOG::group_commit_entry*))[0x55b9c408aa58]
          sql/log.cc:7290(MYSQL_BIN_LOG::write_transaction_to_binlog(THD*, binlog_cache_mngr*, Log_event*, bool, bool, bool))[0x55b9c407ca29]
          sql/log.cc:1751(binlog_flush_cache(THD*, binlog_cache_mngr*, Log_event*, bool, bool, bool))[0x55b9c407cee8]
          sql/log.cc:1859(binlog_commit_flush_xid_caches(THD*, binlog_cache_mngr*, bool, unsigned long long))[0x55b9c40902c6]
          sql/log.cc:9576(MYSQL_BIN_LOG::log_and_order(THD*, unsigned long long, bool, bool, bool))[0x55b9c3fab474]
          sql/handler.cc:1463(ha_commit_trans(THD*, bool))[0x55b9c3eb493b]
          sql/transaction.cc:235(trans_commit(THD*))[0x55b9c40a6ea5]
          sql/log_event.cc:7675(Xid_log_event::do_apply_event(rpl_group_info*))[0x55b9c3cd4f3f]
          sql/log_event.h:1343(Log_event::apply_event(rpl_group_info*))[0x55b9c3cca7d9]
          sql/slave.cc:3479(apply_event_and_update_pos_apply(Log_event*, THD*, rpl_group_info*, int))[0x55b9c3ccac62]
          sql/slave.cc:3601(apply_event_and_update_pos(Log_event*, THD*, rpl_group_info*))[0x55b9c3ccb466]
          sql/slave.cc:3885(exec_relay_log_event(THD*, Relay_log_info*, rpl_group_info*))[0x55b9c3cce6bb]
          sql/slave.cc:4981(handle_slave_sql)[0x55b9c42cdace]
          nptl/pthread_create.c:463(start_thread)[0x7f86da3bbf2a]
          /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f86d88d8edf]
           
          Trying to get some variables.
          Some pointers may be invalid and cause the dump to abort.
          Query (0x0): 
          Connection ID (thread ID): 9
          Status: NOT_KILLED
          

          Patch for instrumented code attached on this ticket. We could just ignore these same xid_seqno writes on InnoDB, but in my opinion it is not correct way. How to repeat:

          ./mtr galera_as_slave_replication_bundle,xtradb --enable-disabled --mysqld=--wsrep-debug=ON
          

          jplindst Jan Lindström (Inactive) added a comment - - edited After fixing original assertion a new one surfaced from wsrep InnoDB checkpoint. First checkpoint is written as follows (using instrumented code to get assertion where we want): 2018-10-02 13:03:33 140472161303296 [Note] WSREP: Xid_log_event: Skipping wsrep commit replicated 1 binlog ./mysqld-relay-bin.000002 pos 871 2018-10-02 13:03:33 140472161303296 [Note] WSREP: apply_event_and_update_pos() result: 0 2018-10-02 13:03:33 140472161303296 [Note] WSREP: Wsrep checkpoint writing for uuid 66f554e0-c62a-11e8-97f5-9262cfb8a273 xid_seqno 2 trx_sys_cur_xid_seqno 1 trx_state 2 2018-10-02 13:03:33 140472161303296 [Note] WSREP: apply_event_and_update_pos() result: 0 2018-10-02 13:03:33 140472161303296 [Note] WSREP: apply_event_and_update_pos() result: 0 2018-10-02 13:03:33 140472161303296 [Note] WSREP: apply_event_and_update_pos() result: 0 2018-10-02 13:03:33 7fc239467b00 InnoDB: Assertion failure in thread 140472161303296 in file trx0sys.cc line 391 InnoDB: Failing assertion: 0 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 181002 13:03:33 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware.   To report this bug, see https://mariadb.com/kb/en/reporting-bugs   We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.   Server version: 10.1.37-MariaDB-debug key_buffer_size=1048576 read_buffer_size=131072 max_used_connections=2 max_threads=153 thread_count=4 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63026 K bytes of memory Hope that's ok; if not, decrease some variables in the equation.   Thread pointer: 0x7fc221c70070 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7fc2394668f0 thread_stack 0x48400 2018-10-02 13:03:33 140472161917696 [Note] WSREP: wsrep_sync_wait: thd->variables.wsrep_sync_wait = 15, mask = 1 /home/jan/mysql/10.1-galera/sql/mysqld(my_print_stacktrace+0x38)[0x55fd4be68365] mysys/stacktrace.c:267(my_print_stacktrace)[0x55fd4b803e39] /lib/x86_64-linux-gnu/libpthread.so.0(+0x128e0)[0x7fc23b6858e0] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7fc239ad5f3b] linux/raise.c:51(__GI_raise)[0x7fc239ad72f1] /home/jan/mysql/10.1-galera/sql/mysqld(+0xbe82e2)[0x55fd4bcdd2e2] trx/trx0sys.cc:397(trx_sys_update_wsrep_checkpoint(xid_t const*, unsigned char*, mtr_t*, trx_t const*))[0x55fd4bce49a2] trx/trx0trx.cc:1243(trx_write_serialisation_history(trx_t*, mtr_t*))[0x55fd4bce59c1] trx/trx0trx.cc:1639(trx_commit_low(trx_t*, mtr_t*))[0x55fd4bce5a86] trx/trx0trx.cc:1668(trx_commit(trx_t*))[0x55fd4bce6441] trx/trx0trx.cc:1932(trx_commit_for_mysql(trx_t*))[0x55fd4bb40d0d] handler/ha_innodb.cc:4704(innobase_commit_low(trx_t*))[0x55fd4bb40feb] handler/ha_innodb.cc:4846(innobase_commit_ordered_2(trx_t*, THD*))[0x55fd4bb41191] handler/ha_innodb.cc:4906(innobase_commit_ordered(handlerton*, THD*, bool))[0x55fd4b8ea1b4] sql/log.cc:8794(TC_LOG::run_commit_ordered(THD*, bool))[0x55fd4b8e8369] sql/log.cc:8024(MYSQL_BIN_LOG::trx_group_commit_leader(MYSQL_BIN_LOG::group_commit_entry*))[0x55fd4b8e7345] sql/log.cc:7616(MYSQL_BIN_LOG::write_transaction_to_binlog_events(MYSQL_BIN_LOG::group_commit_entry*))[0x55fd4b8e6a58] sql/log.cc:7290(MYSQL_BIN_LOG::write_transaction_to_binlog(THD*, binlog_cache_mngr*, Log_event*, bool, bool, bool))[0x55fd4b8d8a29] sql/log.cc:1751(binlog_flush_cache(THD*, binlog_cache_mngr*, Log_event*, bool, bool, bool))[0x55fd4b8d8ee8] sql/log.cc:1859(binlog_commit_flush_xid_caches(THD*, binlog_cache_mngr*, bool, unsigned long long))[0x55fd4b8ec2c6] sql/log.cc:9576(MYSQL_BIN_LOG::log_and_order(THD*, unsigned long long, bool, bool, bool))[0x55fd4b807474] sql/handler.cc:1463(ha_commit_trans(THD*, bool))[0x55fd4b71093b] sql/transaction.cc:235(trans_commit(THD*))[0x55fd4b902ea5] sql/log_event.cc:7675(Xid_log_event::do_apply_event(rpl_group_info*))[0x55fd4b530f3f] sql/log_event.h:1343(Log_event::apply_event(rpl_group_info*))[0x55fd4b5267d9] sql/slave.cc:3479(apply_event_and_update_pos_apply(Log_event*, THD*, rpl_group_info*, int))[0x55fd4b526c62] sql/slave.cc:3601(apply_event_and_update_pos(Log_event*, THD*, rpl_group_info*))[0x55fd4b527466] sql/slave.cc:3885(exec_relay_log_event(THD*, Relay_log_info*, rpl_group_info*))[0x55fd4b52a6bb] sql/slave.cc:4981(handle_slave_sql)[0x55fd4bb29ace] nptl/pthread_create.c:463(start_thread)[0x7fc23b67af2a] /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fc239b97edf] This does not yet break any existing rules. However, we have a second commit for the same xid_seqno as follows (again using instrumented code) that does break current expectations i.e. the fact that xid_seqno should be increasing: 2018-10-02 13:14:57 140216945548032 [Note] WSREP: Xid_log_event: Skipping wsrep commit replicated 1 binlog ./mysqld-relay-bin.000002 pos 871 2018-10-02 13:14:57 140216945548032 [Note] WSREP: apply_event_and_update_pos() result: 0 2018-10-02 13:14:57 140216945548032 [Note] WSREP: Wsrep checkpoint writing for uuid fe20fa80-c62b-11e8-b401-2e57249066f2 xid_seqno 2 trx_sys_cur_xid_seqno 1 trx_state 2 2018-10-02 13:14:57 140216945548032 [Note] WSREP: apply_event_and_update_pos() result: 0 2018-10-02 13:14:57 140216945548032 [Note] WSREP: apply_event_and_update_pos() result: 0 2018-10-02 13:14:57 140216945548032 [Note] WSREP: apply_event_and_update_pos() result: 0 2018-10-02 13:14:57 140216945548032 [Note] WSREP: WRONG: wsrep checkpoint writing for uuid fe20fa80-c62b-11e8-b401-2e57249066f2 xid_seqno 2 trx_sys_cur_xid_seqno 2 trx_state 2 2018-10-02 13:14:57 7f86cd3b1b00 InnoDB: Assertion failure in thread 140216945548032 in file trx0sys.cc line 395 InnoDB: Failing assertion: xid_seqno > trx_sys_cur_xid_seqno InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 181002 13:14:57 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware.   To report this bug, see https://mariadb.com/kb/en/reporting-bugs   We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.   Server version: 10.1.37-MariaDB-debug key_buffer_size=1048576 read_buffer_size=131072 max_used_connections=2 max_threads=153 thread_count=4 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63026 K bytes of memory Hope that's ok; if not, decrease some variables in the equation.   Thread pointer: 0x7f86c1474070 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7f86cd3b08f0 thread_stack 0x48400 2018-10-02 13:14:57 140217000942336 [Note] WSREP: wsrep_sync_wait: thd->variables.wsrep_sync_wait = 15, mask = 1 /home/jan/mysql/10.1-galera/sql/mysqld(my_print_stacktrace+0x38)[0x55b9c460c409] mysys/stacktrace.c:267(my_print_stacktrace)[0x55b9c3fa7e39] /lib/x86_64-linux-gnu/libpthread.so.0(+0x128e0)[0x7f86da3c68e0] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7f86d8816f3b] linux/raise.c:51(__GI_raise)[0x7f86d88182f1] /home/jan/mysql/10.1-galera/sql/mysqld(+0xbe83a9)[0x55b9c44813a9] trx/trx0sys.cc:398(trx_sys_update_wsrep_checkpoint(xid_t const*, unsigned char*, mtr_t*, trx_t const*))[0x55b9c4488a45] trx/trx0trx.cc:1243(trx_write_serialisation_history(trx_t*, mtr_t*))[0x55b9c4489a64] trx/trx0trx.cc:1639(trx_commit_low(trx_t*, mtr_t*))[0x55b9c4489b29] trx/trx0trx.cc:1668(trx_commit(trx_t*))[0x55b9c448a4e4] trx/trx0trx.cc:1932(trx_commit_for_mysql(trx_t*))[0x55b9c42e4d0d] handler/ha_innodb.cc:4704(innobase_commit_low(trx_t*))[0x55b9c42e4feb] handler/ha_innodb.cc:4846(innobase_commit_ordered_2(trx_t*, THD*))[0x55b9c42e5191] handler/ha_innodb.cc:4906(innobase_commit_ordered(handlerton*, THD*, bool))[0x55b9c408e1b4] sql/log.cc:8794(TC_LOG::run_commit_ordered(THD*, bool))[0x55b9c408c369] sql/log.cc:8024(MYSQL_BIN_LOG::trx_group_commit_leader(MYSQL_BIN_LOG::group_commit_entry*))[0x55b9c408b345] sql/log.cc:7616(MYSQL_BIN_LOG::write_transaction_to_binlog_events(MYSQL_BIN_LOG::group_commit_entry*))[0x55b9c408aa58] sql/log.cc:7290(MYSQL_BIN_LOG::write_transaction_to_binlog(THD*, binlog_cache_mngr*, Log_event*, bool, bool, bool))[0x55b9c407ca29] sql/log.cc:1751(binlog_flush_cache(THD*, binlog_cache_mngr*, Log_event*, bool, bool, bool))[0x55b9c407cee8] sql/log.cc:1859(binlog_commit_flush_xid_caches(THD*, binlog_cache_mngr*, bool, unsigned long long))[0x55b9c40902c6] sql/log.cc:9576(MYSQL_BIN_LOG::log_and_order(THD*, unsigned long long, bool, bool, bool))[0x55b9c3fab474] sql/handler.cc:1463(ha_commit_trans(THD*, bool))[0x55b9c3eb493b] sql/transaction.cc:235(trans_commit(THD*))[0x55b9c40a6ea5] sql/log_event.cc:7675(Xid_log_event::do_apply_event(rpl_group_info*))[0x55b9c3cd4f3f] sql/log_event.h:1343(Log_event::apply_event(rpl_group_info*))[0x55b9c3cca7d9] sql/slave.cc:3479(apply_event_and_update_pos_apply(Log_event*, THD*, rpl_group_info*, int))[0x55b9c3ccac62] sql/slave.cc:3601(apply_event_and_update_pos(Log_event*, THD*, rpl_group_info*))[0x55b9c3ccb466] sql/slave.cc:3885(exec_relay_log_event(THD*, Relay_log_info*, rpl_group_info*))[0x55b9c3cce6bb] sql/slave.cc:4981(handle_slave_sql)[0x55b9c42cdace] nptl/pthread_create.c:463(start_thread)[0x7f86da3bbf2a] /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f86d88d8edf]   Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x0): Connection ID (thread ID): 9 Status: NOT_KILLED Patch for instrumented code attached on this ticket. We could just ignore these same xid_seqno writes on InnoDB, but in my opinion it is not correct way. How to repeat: ./mtr galera_as_slave_replication_bundle,xtradb --enable-disabled --mysqld=--wsrep-debug=ON

          seppo I do not know how to proceed from this state. To me it looks like there is two participants on xa group commit on the same node.

          jplindst Jan Lindström (Inactive) added a comment - - edited seppo I do not know how to proceed from this state. To me it looks like there is two participants on xa group commit on the same node.

          People

            seppo Seppo Jaakola
            jplindst Jan Lindström (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.