  1. MariaDB Server
  2. MDEV-24502

MariaDB 10.2.35 crashes after conflicting lock during DELETE

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 10.2.35
    • Fix Version/s: N/A
    • Component/s: Galera
    • Environment: CentOS 7.9

    Description

      Hello,

      Edit: This may be related to MDEV-23851, but we would like confirmation from your side.

      We upgraded MariaDB from 10.2.24 to 10.2.35, and the cluster nodes started crashing one day after the upgrade. The crashes seem to happen when there is a conflicting lock during a DELETE.

      It is a 3-node cluster. Any single node may crash a couple of times a day. The whole cluster has also gone down a few times during the last two weeks.

      log-node1

      2020-12-29  8:40:08 140172980565760 [ERROR] InnoDB: Conflicting lock on table: `$DB`.`$TABLE1` index: GEN_CLUST_INDEX that has lock
      RECORD LOCKS space id 945 page no 3 n bits 168 index GEN_CLUST_INDEX of table `$DB`.`$TABLE1` trx id 275420837 lock_mode X locks rec but not gap
      Record lock, heap no 2
      Record lock, heap no 98
      2020-12-29  8:40:08 140172980565760 [ERROR] InnoDB: WSREP state:
      2020-12-29  8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420838 thread: 2 seqno: 87923500 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
              WHERE process_name = 'my-process'
              AND process_host = 'my-app.example.com'XÝê_
      2020-12-29  8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420837 thread: 10 seqno: 87923499 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
              WHERE process_name = 'my-process-2'
              AND process_host = 'my-app.example.com'XÝê_
      2020-12-29 08:40:08 0x7f7c90b6b700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
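
Reading the excerpt above: the record lock is held by trx id 275420837, which the second `Thread BF` line identifies as applier thread 10 with seqno 87923499, and the conflicting request comes from applier thread 2 with seqno 87923500. Below is a minimal Python sketch for pulling those fields out of the two `Thread BF` lines so they can be compared side by side. The regular expression and the function name `parse_bf_threads` are illustrative assumptions, not part of any MariaDB tooling.

```python
import re

# The two "Thread BF" lines from the log-node1 excerpt (continuation lines of the query omitted).
LOG = """\
2020-12-29  8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420838 thread: 2 seqno: 87923500 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
2020-12-29  8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420837 thread: 10 seqno: 87923499 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
"""

# Assumed line layout, derived only from the excerpt above.
BF_LINE = re.compile(
    r"WSREP: Thread BF trx_id: (?P<trx_id>\d+) thread: (?P<thread>\d+) "
    r"seqno: (?P<seqno>\d+) .*?exec_mode: (?P<exec_mode>\w+)"
)


def parse_bf_threads(text):
    """Return one dict per 'Thread BF' line, with numeric fields converted to int."""
    threads = []
    for match in BF_LINE.finditer(text):
        fields = match.groupdict()
        threads.append({
            "trx_id": int(fields["trx_id"]),
            "thread": int(fields["thread"]),
            "seqno": int(fields["seqno"]),
            "exec_mode": fields["exec_mode"],
        })
    return threads


threads = parse_bf_threads(LOG)
both_appliers = all(t["exec_mode"] == "applier" for t in threads)
seqno_gap = abs(threads[0]["seqno"] - threads[1]["seqno"])

for t in threads:
    print(t)
print("both appliers:", both_appliers, "| seqno gap:", seqno_gap)
```

For this excerpt both transactions are appliers and their seqnos are adjacent, which matches the "Conflicting lock" message being raised between two replicated DELETEs on the same table.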
      

      Stack:

      Server version: 10.2.35-MariaDB-log
      key_buffer_size=268435456
      read_buffer_size=2097152
      max_used_connections=15
      max_threads=502
      thread_count=26
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2329025 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x7f7c780009a8
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f7c90b6ad20 thread_stack 0x49000
      /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55587dc621ee]
      /usr/sbin/mysqld(handle_fatal_signal+0x30d)[0x55587d6ff04d]
      /lib64/libpthread.so.0(+0xf630)[0x7f7c9b2c0630]
      :0(__GI_raise)[0x7f7c99590387]
      :0(__GI_abort)[0x7f7c99591a78]
      /usr/sbin/mysqld(+0x44918e)[0x55587d4a418e]
      /usr/sbin/mysqld(+0x87cd6d)[0x55587d8d7d6d]
      /usr/sbin/mysqld(+0x87d9b4)[0x55587d8d89b4]
      /usr/sbin/mysqld(+0x884145)[0x55587d8df145]
      /usr/sbin/mysqld(+0x884b2a)[0x55587d8dfb2a]
      /usr/sbin/mysqld(+0x91a1ba)[0x55587d9751ba]
      /usr/sbin/mysqld(+0x91d48f)[0x55587d97848f]
      /usr/sbin/mysqld(+0x849855)[0x55587d8a4855]
      /usr/sbin/mysqld(+0x82cbb7)[0x55587d887bb7]
      /usr/sbin/mysqld(+0x841cc9)[0x55587d89ccc9]
      /usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x1c7)[0x55587d703c37]
      /usr/sbin/mysqld(_ZN14Rows_log_event8find_rowEP14rpl_group_info+0x50e)[0x55587d800efe]
      /usr/sbin/mysqld(_ZN21Delete_rows_log_event11do_exec_rowEP14rpl_group_info+0x8e)[0x55587d80100e]
      /usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEP14rpl_group_info+0x2fd)[0x55587d7f3e8d]
      /usr/sbin/mysqld(wsrep_apply_cb+0x482)[0x55587d6a48c2]
      src/trx_handle.cpp:312(galera::TrxHandle::apply(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_trx_meta const&) const)[0x7f7c93d47ef8]
      src/replicator_smm.cpp:92(apply_trx_ws(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_cb_status (*)(void*, unsigned int, wsrep_trx_meta const*, bool*, bool), galera::TrxHandle const&, wsrep_trx_meta const&))[0x7f7c93d856f3]
      src/replicator_smm.cpp:458(galera::ReplicatorSMM::apply_trx(void*, galera::TrxHandle*))[0x7f7c93d8877c]
      src/replicator_smm.cpp:1258(galera::ReplicatorSMM::process_trx(void*, galera::TrxHandle*))[0x7f7c93d8b99e]
      src/gcs_action_source.cpp:116(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&))[0x7f7c93d67078]
      src/gcs_action_source.cpp:28(~Release)[0x7f7c93d6876c]
      src/replicator_smm.cpp:362(galera::ReplicatorSMM::async_recv(void*))[0x7f7c93d8bf7b]
      src/wsrep_provider.cpp:271(galera_recv)[0x7f7c93d99f38]
      /usr/sbin/mysqld(+0x64a976)[0x55587d6a5976]
      /usr/sbin/mysqld(start_wsrep_THD+0x3eb)[0x55587d698c5b]
      pthread_create.c:0(start_thread)[0x7f7c9b2b8ea5]
      /lib64/libc.so.6(clone+0x6d)[0x7f7c9965896d]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f7c85a75fcb): DELETE FROM process_id
              WHERE process_name = 'my-process'
              AND process_host = 'my-app.example.com'
       
      Connection ID (thread ID): 2
      Status: NOT_KILLED
      

      The following packages are installed on the servers:

      galera-25.3.31-1.el7.centos.x86_64
      MariaDB-client-10.2.36-1.el7.centos.x86_64
      MariaDB-compat-10.2.36-1.el7.centos.x86_64
      MariaDB-common-10.2.36-1.el7.centos.x86_64
      MariaDB-server-10.2.35-1.el7.centos.x86_64
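
Note that the list above shows MariaDB-server at 10.2.35 while the client, compat and common packages are at 10.2.36. Whether that mismatch plays any role in the crash is not established in this report. The following is a minimal Python sketch that groups the `MariaDB-*` packages by version to make such a mix visible; the regular expression and the helper name `mariadb_versions` are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Package list as given above.
PACKAGES = [
    "galera-25.3.31-1.el7.centos.x86_64",
    "MariaDB-client-10.2.36-1.el7.centos.x86_64",
    "MariaDB-compat-10.2.36-1.el7.centos.x86_64",
    "MariaDB-common-10.2.36-1.el7.centos.x86_64",
    "MariaDB-server-10.2.35-1.el7.centos.x86_64",
]

# name-version-release.arch; version and release contain no hyphen.
NVRA = re.compile(r"^(?P<name>.+)-(?P<version>[^-]+)-(?P<release>[^-]+)\.(?P<arch>[^.]+)$")


def mariadb_versions(packages):
    """Map each version string to the MariaDB-* package names installed at that version."""
    by_version = defaultdict(list)
    for package in packages:
        match = NVRA.match(package)
        if match and match["name"].startswith("MariaDB-"):
            by_version[match["version"]].append(match["name"])
    return dict(by_version)


versions = mariadb_versions(PACKAGES)
for version, names in sorted(versions.items()):
    print(version, names)
if len(versions) > 1:
    print("mixed MariaDB package versions:", sorted(versions))
```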
      

      Taking 29.12.2020 as an example, the monitoring system raised several alarms for node1 and node2 with the following message:

      wsrep_cluster_status: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock'
      

      On December 29th, it happened on node1 at:

      • 08:40am
      • 09:40am
      • 03:50pm

      Also on December 29th, on node2 at:

      • 12:40am
      • 12:50am

      MariaDB crashed more often than the alarms indicate, though. The error logs from that day show:

      20201229-node1-mariadb.err

      2020-12-29 08:40:08 0x7f7c90b6b700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 08:40:19 0x7fb11796e700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 09:40:12 0x7f937c61b700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 15:50:19 0x7f06184cf700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 16:30:02 0x7f80c7722700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 21:50:07 0x7f7d9062e700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      

      20201229-node2-mariadb.err

      2020-12-29 00:40:12 0x7f19f84c6700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:50:13 0x7f77084b1700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:50:26 0x7febec0bc700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:50:38 0x7f43207d5700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:50:52 0x7fcc8c395700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:51:03 0x7faa98c43700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 09:00:09 0x7f80983b7700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 14:10:11 0x7f2890261700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 22:40:10 0x7f296120d700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      

      20201229-node3-mariadb.err

      2020-12-29 08:10:06 0x7faa6473d700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 16:40:03 0x7f3d78084700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 19:50:18 0x7f8944225700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
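
All of the lines listed above point at the same assertion site, lock0lock.cc line 694. The following is a minimal Python sketch for tallying such lines from an error log, shown here on the three node3 entries. The regular expression and the function name `assertion_failures` are assumptions for illustration, and the sample text is rebuilt from the node3 lines above rather than read from the attached file.

```python
import re
from collections import Counter

# Assumed line layout, derived only from the log lines quoted above.
ASSERT_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) 0x[0-9a-f]+\s+"
    r"InnoDB: Assertion failure in file (?P<file>\S+) line (?P<line>\d+)",
    re.MULTILINE,
)


def assertion_failures(log_text):
    """Return (date, time, source file basename, line number) per InnoDB assertion failure."""
    return [
        (m["date"], m["time"], m["file"].rsplit("/", 1)[-1], int(m["line"]))
        for m in ASSERT_RE.finditer(log_text)
    ]


# Source path exactly as it appears in the log lines.
SRC = ("/home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/"
       "mariadb-10.2.35/storage/innobase/lock/lock0lock.cc")

# The three node3 entries, reassembled from timestamp and thread id.
NODE3_LOG = "\n".join(
    f"2020-12-29 {time} {thread}  InnoDB: Assertion failure in file {SRC} line 694"
    for time, thread in [
        ("08:10:06", "0x7faa6473d700"),
        ("16:40:03", "0x7f3d78084700"),
        ("19:50:18", "0x7f8944225700"),
    ]
)

failures = assertion_failures(NODE3_LOG)
sites = {(source, line) for _, _, source, line in failures}
per_minute_of_hour = Counter(time[3:5] for _, time, _, _ in failures)

print("crashes:", len(failures))
print("assertion sites:", sites)
print("by minute of hour:", dict(per_minute_of_hour))
```

To tally a whole day, the same function could be applied to the text of each attached error log, for example `assertion_failures(open("20201229-node1-mariadb.err", errors="replace").read())`; the file name is taken from the headings above, and its location on disk is an assumption. Grouping by minute of the hour is included because the timestamps above all fall shortly after a ten-minute mark, which may or may not point to a scheduled job.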
      

      I have attached the obfuscated logs of all three nodes from 29.12.2020.

      Is there any known workaround to avoid further crashes? I couldn't find any.

      Many thanks in advance.

            People

              Assignee: Jan Lindström (jplindst, Inactive)
              Reporter: jmox