  1. MariaDB Server
  2. MDEV-26177

WSREP assertion failure in bf_abort when replicating certain DDL against in-memory tables

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 10.5.11
    • Fix Version/s: N/A
    • Component/s: Galera
    • Labels: None
    • Environment: CentOS7 / RH7

    Description

      The bug is triggered by truncating ENGINE=MEMORY tables. The TRUNCATE TABLE statement is (erroneously?) replicated, but it fails to apply on the receiving node because in-memory tables are node-local. The transaction is then brute-force aborted, the "local or streaming tx" assertion fails in WSREP, and the node crashes with an error similar to the one below. This may eventually lock up the whole cluster, because the same transaction is replicated again and again and WSREP goes completely haywire under the repeated node crashes. Perhaps mode_ == m_local should evaluate to true here?

      2021-07-17 10:26:43 10 [ERROR] Slave SQL: Error 'Table 'lsa.ttm$etl_dimension_discovery_etl06_lsa_001' doesn't exist' on query. Default database: 'lsa'. Query: '/* DATABASE_EXECUTE_DDL*/ TRUNCATE TABLE `ttm$etl_dimension_discovery_etl06_lsa_001` /* PROCESS(LSA [ETLs] <lsa@etl06>, metadata_background|abandoned[1h]|time_to_live[4h]|maximum_reuse[15m])*/', Internal MariaDB error code: 1146
      2021-07-17 10:26:43 10 [Warning] WSREP: Ignoring error 'Table 'lsa.ttm$etl_dimension_discovery_etl06_lsa_001' doesn't exist' on query. Default database: 'lsa'. Query: '/* DATABASE_EXECUTE_DDL*/ TRUNCATE TABLE `ttm$etl_dimension_discovery_etl06_lsa_001` /* PROCESS(LSA [ETLs] <lsa@etl06>, metadata_background|abandoned[1h]|time_to_live[4h]|maximum_reuse[15m])*/', Error_code: 1146
      mariadbd: /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.5.11/wsrep-lib/include/wsrep/client_state.hpp:668: int wsrep::client_state::bf_abort(wsrep::seqno): Assertion `mode_ == m_local || transaction_.is_streaming()' failed.

      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.

      Server version: 10.5.11-MariaDB
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=165
      max_threads=3010
      thread_count=182
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6732055 K bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.

      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      2021-07-17 10:27:18 12 [ERROR] Slave SQL: Could not execute Update_rows_v1 event on table lsa.configuration_group; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log FIRST, end_log_pos 713, Internal MariaDB error code: 1213
      ??:0(my_print_stacktrace)[0x55c7d751179e]
      ??:0(handle_fatal_signal)[0x55c7d6f16457]
      sigaction.c:0(__restore_rt)[0x7f6ad94c1630]
      :0(__GI_raise)[0x7f6ad890c387]
      :0(__GI_abort)[0x7f6ad890da78]
      :0(__assert_fail_base)[0x7f6ad89051a6]
      :0(_GI__assert_fail)[0x7f6ad8905252]
      ??:0(wsrep_bf_abort(THD const*, THD*))[0x55c7d71d9dc6]
      ??:0(wsrep_thd_bf_abort)[0x55c7d71e000f]
      ??:0(wsrep_notify_status(wsrep::server_state::state, wsrep::view const*))[0x55c7d72076cd]
      ??:0(handle_manager)[0x55c7d6d05fde]
      ??:0(MyCTX_nopad::finish(unsigned char*, unsigned int*))[0x55c7d716356d]
      pthread_create.c:0(start_thread)[0x7f6ad94b9ea5]
      ??:0(__clone)[0x7f6ad89d49fd]
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
      information that should help you find out what is causing the crash.
      Writing a core file...
      Working directory at /data/mysql
      Resource Limits:
      Limit                     Soft Limit           Hard Limit           Units
      Max cpu time              unlimited            unlimited            seconds
      Max file size             unlimited            unlimited            bytes
      Max data size             unlimited            unlimited            bytes
      Max stack size            8388608              unlimited            bytes
      Max core file size        0                    unlimited            bytes
      Max resident set          unlimited            unlimited            bytes
      Max processes             127953               127953               processes
      Max open files            32768                32768                files
      Max locked memory         65536                65536                bytes
      Max address space         unlimited            unlimited            bytes
      Max file locks            unlimited            unlimited            locks
      Max pending signals       127953               127953               signals
      Max msgqueue size         819200               819200               bytes
      Max nice priority         0                    0
      Max realtime priority     0                    0
      Max realtime timeout      unlimited            unlimited            us
      Core pattern: core
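
To make the failed precondition easier to reason about, here is a minimal C++ sketch of the condition quoted in the assertion message, `mode_ == m_local || transaction_.is_streaming()`. It is a toy model only: `ClientModel`, `Mode`, its enumerators and `bf_abort_outcome` are hypothetical names introduced for illustration and are not wsrep-lib's actual types. It also assumes, without confirmation from the log, that the victim of the brute-force abort was a non-local, non-streaming client state such as an applier.

```cpp
#include <string>

// Hypothetical, simplified model; NOT the real wsrep::client_state class.
// The enumerator names are illustrative stand-ins for wsrep-lib's modes.
enum class Mode { local, high_priority, toi };

struct ClientModel {
    Mode mode;       // stand-in for mode_
    bool streaming;  // stand-in for transaction_.is_streaming()

    // Same shape as the condition in the failed assertion:
    //   mode_ == m_local || transaction_.is_streaming()
    bool bf_abort_precondition_holds() const {
        return mode == Mode::local || streaming;
    }
};

// Reports what an assertion-enabled build would do for a given victim state.
inline std::string bf_abort_outcome(const ClientModel& victim) {
    return victim.bf_abort_precondition_holds()
        ? "abort proceeds"
        : "assertion fails, server aborts";
}
```

Under these assumptions, `bf_abort_outcome({Mode::high_priority, false})` returns "assertion fails, server aborts", while a local or streaming victim passes the check. Whether relaxing the condition would be safe is a separate question that this model does not answer.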

          People

            Assignee: jplindst Jan Lindström (Inactive)
            Reporter: khoov Kent Hoover