MariaDB Server / MDEV-24502

MariaDB 10.2.35 crashes after conflicting lock during DELETE

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 10.2.35
    • Fix Version/s: N/A
    • Component/s: Galera
    • Environment: CentOS 7.9

    Description

      Hello,

      Edit: This may be related to MDEV-23851 but we would like to have confirmation from your side.

      We upgraded MariaDB from 10.2.24 to 10.2.35, and the nodes in the cluster started crashing one day after the update. It seems to happen when there is a conflicting lock during a DELETE.

      It's a 3-node cluster. Any single node may crash a couple of times a day. The whole cluster has also gone down a few times during the last two weeks.

      log-node1

      2020-12-29  8:40:08 140172980565760 [ERROR] InnoDB: Conflicting lock on table: `$DB`.`$TABLE1` index: GEN_CLUST_INDEX that has lock
      RECORD LOCKS space id 945 page no 3 n bits 168 index GEN_CLUST_INDEX of table `$DB`.`$TABLE1` trx id 275420837 lock_mode X locks rec but not gap
      Record lock, heap no 2
      Record lock, heap no 98
      2020-12-29  8:40:08 140172980565760 [ERROR] InnoDB: WSREP state:
      2020-12-29  8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420838 thread: 2 seqno: 87923500 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
              WHERE process_name = 'my-process'
              AND process_host = 'my-app.example.com'XÝê_
      2020-12-29  8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420837 thread: 10 seqno: 87923499 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
              WHERE process_name = 'my-process-2'
              AND process_host = 'my-app.example.com'XÝê_
      2020-12-29 08:40:08 0x7f7c90b6b700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
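
As a reading aid, the two `WSREP: Thread BF` lines above can be pulled apart with a small script. This is a minimal sketch in Python, not part of the original report: the regular expression and the helper name `parse_bf_threads` are hypothetical and assume only the line layout visible in this excerpt, not a documented log format. The query text is shortened to its first line.

```python
import re

# The two WSREP lines from the log-node1 excerpt (query text shortened to its first line).
LOG = """\
2020-12-29  8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420838 thread: 2 seqno: 87923500 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
2020-12-29  8:40:08 140172980565760 [ERROR] WSREP: Thread BF trx_id: 275420837 thread: 10 seqno: 87923499 query_state: executing conf_state: no conflict exec_mode: applier applier: 1 query: DELETE FROM process_id
"""

# Assumed layout, taken from the excerpt only.
BF_LINE = re.compile(
    r"WSREP: Thread BF trx_id: (?P<trx_id>\d+) thread: (?P<thread>\d+) "
    r"seqno: (?P<seqno>\d+) .*?exec_mode: (?P<exec_mode>\w+)"
)


def parse_bf_threads(text):
    """Return one dict per 'WSREP: Thread BF' line, numeric fields converted to int."""
    threads = []
    for match in BF_LINE.finditer(text):
        row = match.groupdict()
        for key in ("trx_id", "thread", "seqno"):
            row[key] = int(row[key])
        threads.append(row)
    return threads


threads = parse_bf_threads(LOG)
for t in threads:
    print(t)
```

Read this way, both transactions run in applier mode with adjacent sequence numbers (87923499 and 87923500), and the record lock reported above is held by trx id 275420837, the one with the lower seqno.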
      

      Stack:

      Server version: 10.2.35-MariaDB-log
      key_buffer_size=268435456
      read_buffer_size=2097152
      max_used_connections=15
      max_threads=502
      thread_count=26
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2329025 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x7f7c780009a8
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f7c90b6ad20 thread_stack 0x49000
      /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55587dc621ee]
      /usr/sbin/mysqld(handle_fatal_signal+0x30d)[0x55587d6ff04d]
      /lib64/libpthread.so.0(+0xf630)[0x7f7c9b2c0630]
      :0(__GI_raise)[0x7f7c99590387]
      :0(__GI_abort)[0x7f7c99591a78]
      /usr/sbin/mysqld(+0x44918e)[0x55587d4a418e]
      /usr/sbin/mysqld(+0x87cd6d)[0x55587d8d7d6d]
      /usr/sbin/mysqld(+0x87d9b4)[0x55587d8d89b4]
      /usr/sbin/mysqld(+0x884145)[0x55587d8df145]
      /usr/sbin/mysqld(+0x884b2a)[0x55587d8dfb2a]
      /usr/sbin/mysqld(+0x91a1ba)[0x55587d9751ba]
      /usr/sbin/mysqld(+0x91d48f)[0x55587d97848f]
      /usr/sbin/mysqld(+0x849855)[0x55587d8a4855]
      /usr/sbin/mysqld(+0x82cbb7)[0x55587d887bb7]
      /usr/sbin/mysqld(+0x841cc9)[0x55587d89ccc9]
      /usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x1c7)[0x55587d703c37]
      /usr/sbin/mysqld(_ZN14Rows_log_event8find_rowEP14rpl_group_info+0x50e)[0x55587d800efe]
      /usr/sbin/mysqld(_ZN21Delete_rows_log_event11do_exec_rowEP14rpl_group_info+0x8e)[0x55587d80100e]
      /usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEP14rpl_group_info+0x2fd)[0x55587d7f3e8d]
      /usr/sbin/mysqld(wsrep_apply_cb+0x482)[0x55587d6a48c2]
      src/trx_handle.cpp:312(galera::TrxHandle::apply(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_trx_meta const&) const)[0x7f7c93d47ef8]
      src/replicator_smm.cpp:92(apply_trx_ws(void*, wsrep_cb_status (*)(void*, void const*, unsigned long, unsigned int, wsrep_trx_meta const*), wsrep_cb_status (*)(void*, unsigned int, wsrep_trx_meta const*, bool*, bool), galera::TrxHandle const&, wsrep_trx_meta const&))[0x7f7c93d856f3]
      src/replicator_smm.cpp:458(galera::ReplicatorSMM::apply_trx(void*, galera::TrxHandle*))[0x7f7c93d8877c]
      src/replicator_smm.cpp:1258(galera::ReplicatorSMM::process_trx(void*, galera::TrxHandle*))[0x7f7c93d8b99e]
      src/gcs_action_source.cpp:116(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&))[0x7f7c93d67078]
      src/gcs_action_source.cpp:28(~Release)[0x7f7c93d6876c]
      src/replicator_smm.cpp:362(galera::ReplicatorSMM::async_recv(void*))[0x7f7c93d8bf7b]
      src/wsrep_provider.cpp:271(galera_recv)[0x7f7c93d99f38]
      /usr/sbin/mysqld(+0x64a976)[0x55587d6a5976]
      /usr/sbin/mysqld(start_wsrep_THD+0x3eb)[0x55587d698c5b]
      pthread_create.c:0(start_thread)[0x7f7c9b2b8ea5]
      /lib64/libc.so.6(clone+0x6d)[0x7f7c9965896d]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f7c85a75fcb): DELETE FROM process_id
              WHERE process_name = 'my-process'
              AND process_host = 'my-app.example.com'
       
      Connection ID (thread ID): 2
      Status: NOT_KILLED
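
The named frames in the backtrace above are easier to follow once the C++ symbols are decoded. The following is a rough sketch, not a real demangler: the hypothetical helper `nested_name` decodes only the length-prefixed `_ZN...E` name part of an Itanium-style mangled symbol and ignores argument types. For anything beyond these frames a proper tool such as `c++filt` is the better choice.

```python
import re

# Named frames copied from the backtrace above, innermost first.
BACKTRACE = """\
/usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x1c7)[0x55587d703c37]
/usr/sbin/mysqld(_ZN14Rows_log_event8find_rowEP14rpl_group_info+0x50e)[0x55587d800efe]
/usr/sbin/mysqld(_ZN21Delete_rows_log_event11do_exec_rowEP14rpl_group_info+0x8e)[0x55587d80100e]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEP14rpl_group_info+0x2fd)[0x55587d7f3e8d]
/usr/sbin/mysqld(wsrep_apply_cb+0x482)[0x55587d6a48c2]
"""

# Symbol name between "(" and "+0x<offset>)".
FRAME = re.compile(r"\((\w+)\+0x[0-9a-f]+\)")


def nested_name(symbol):
    """Decode the '_ZN<len><id>...' prefix into 'A::b'; return other symbols unchanged."""
    if not symbol.startswith("_ZN"):
        return symbol
    parts, pos = [], 3
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        parts.append(symbol[end:end + length])
        pos = end + length
    return "::".join(parts)


call_path = [nested_name(s) for s in FRAME.findall(BACKTRACE)]
for name in call_path:
    print(name)
```

The decoded path runs from `wsrep_apply_cb` through `Rows_log_event::do_apply_event` and `Delete_rows_log_event::do_exec_row` into `Rows_log_event::find_row` and `handler::ha_rnd_next`, so the assertion was hit while a Galera applier thread was locating the row for a replicated DELETE.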
      

      The following packages are installed on the servers:

      galera-25.3.31-1.el7.centos.x86_64
      MariaDB-client-10.2.36-1.el7.centos.x86_64
      MariaDB-compat-10.2.36-1.el7.centos.x86_64
      MariaDB-common-10.2.36-1.el7.centos.x86_64
      MariaDB-server-10.2.35-1.el7.centos.x86_64
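
Note that the list above mixes two MariaDB versions: the server package is 10.2.35, while client, compat and common are 10.2.36. Whether that mix has anything to do with the crash is not established in this report. A minimal Python sketch for spotting such a mix in an `rpm -qa` style listing follows; the helper name `mariadb_versions` and the name-version-release pattern are assumptions made for this illustration.

```python
import re
from collections import defaultdict

# Package list exactly as given above.
PACKAGES = [
    "galera-25.3.31-1.el7.centos.x86_64",
    "MariaDB-client-10.2.36-1.el7.centos.x86_64",
    "MariaDB-compat-10.2.36-1.el7.centos.x86_64",
    "MariaDB-common-10.2.36-1.el7.centos.x86_64",
    "MariaDB-server-10.2.35-1.el7.centos.x86_64",
]

# Assumed shape: <name>-<dotted version>-<release.arch>, release without hyphens.
PKG = re.compile(r"^(?P<name>.+)-(?P<version>\d+(?:\.\d+)+)-(?P<release>[^-]+)$")


def mariadb_versions(packages):
    """Map each version string to the MariaDB-* package names that carry it."""
    by_version = defaultdict(list)
    for pkg in packages:
        match = PKG.match(pkg)
        if match and match.group("name").startswith("MariaDB-"):
            by_version[match.group("version")].append(match.group("name"))
    return dict(by_version)


versions = mariadb_versions(PACKAGES)
for version, names in sorted(versions.items()):
    print(version, ", ".join(names))
if len(versions) > 1:
    print("MariaDB packages are not all at the same version")
```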
      

      Take 29.12.2020 as an example: the monitoring system raised several alarms about node1 and node2 with the following message:

      wsrep_cluster_status: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock'
      

      On December 29th it happened on node1 at:

      • 08:40am
      • 09:40am
      • 03:50pm

      On the same day it happened on node2 at:

      • 12:40am
      • 12:50am

      MariaDB crashed more often than that, though, as the error logs show.

      20201229-node1-mariadb.err

      2020-12-29 08:40:08 0x7f7c90b6b700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 08:40:19 0x7fb11796e700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 09:40:12 0x7f937c61b700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 15:50:19 0x7f06184cf700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 16:30:02 0x7f80c7722700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 21:50:07 0x7f7d9062e700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      

      20201229-node2-mariadb.err

      2020-12-29 00:40:12 0x7f19f84c6700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:50:13 0x7f77084b1700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:50:26 0x7febec0bc700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:50:38 0x7f43207d5700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:50:52 0x7fcc8c395700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 00:51:03 0x7faa98c43700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 09:00:09 0x7f80983b7700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 14:10:11 0x7f2890261700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 22:40:10 0x7f296120d700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      

      20201229-node3-mariadb.err

      2020-12-29 08:10:06 0x7faa6473d700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 16:40:03 0x7f3d78084700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
      2020-12-29 19:50:18 0x7f8944225700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/storage/innobase/lock/lock0lock.cc line 694
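
The assertion lines above can be turned into a per-hour crash count and compared with the alarm times. This is a minimal sketch under stated assumptions: the helper name `assertion_times` and the regular expression are hypothetical and rely only on the line layout shown in these excerpts. The sample uses the six node1 lines.

```python
import re
from collections import Counter
from datetime import datetime

# Common tail of every assertion line in the excerpts above.
SUFFIX = (
    "  InnoDB: Assertion failure in file /home/buildbot/buildbot/"
    "padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.35/"
    "storage/innobase/lock/lock0lock.cc line 694"
)

# Timestamps and thread ids of the six node1 lines.
NODE1_LOG = "\n".join(
    stamp + SUFFIX
    for stamp in [
        "2020-12-29 08:40:08 0x7f7c90b6b700",
        "2020-12-29 08:40:19 0x7fb11796e700",
        "2020-12-29 09:40:12 0x7f937c61b700",
        "2020-12-29 15:50:19 0x7f06184cf700",
        "2020-12-29 16:30:02 0x7f80c7722700",
        "2020-12-29 21:50:07 0x7f7d9062e700",
    ]
)

# Assumed layout: "<date> <time> 0x<thread>  InnoDB: Assertion failure in file <path> line <n>".
ASSERTION = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) 0x[0-9a-f]+ +"
    r"InnoDB: Assertion failure in file \S+ line \d+",
    re.MULTILINE,
)


def assertion_times(log_text):
    """Return the timestamp of every InnoDB assertion failure line as a datetime."""
    return [
        datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        for stamp in ASSERTION.findall(log_text)
    ]


times = assertion_times(NODE1_LOG)
per_hour = Counter(t.strftime("%H:00") for t in times)
for hour, count in sorted(per_hour.items()):
    print(hour, count)
```

For node1 this yields six assertion failures spread over five different hours, while the monitoring alarms listed earlier cover only three times (08:40, 09:40 and 15:50); the failures at 16:30 and 21:50 did not show up as alarms.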
      

      I have attached the obfuscated logs of all three nodes from 29.12.2020.

      Is there any known workaround to avoid further crashes? I couldn't find any.

      Many thanks in advance.

      Attachments

        Issue Links

          Activity

            jmox jmox created issue -
            jmox jmox made changes -
            Description: edited to add the note that this may be related to MDEV-23851 (full text as shown above)
            jmox jmox made changes -
            Attachment 20201229-node1-mariadb.err [ 55474 ]
            jmox jmox made changes -
            Attachment 20201229-node2-mariadb.err [ 55473 ]
            jmox jmox made changes -
            Attachment 20201229-node3-mariadb.err [ 55472 ]
            jmox jmox made changes -
            Attachment 20201229-node1-mariadb.err [ 55475 ]
            Attachment 20201229-node2-mariadb.err [ 55476 ]
            Attachment 20201229-node3-mariadb.err [ 55477 ]
            alice Alice Sherepa made changes -
            elenst Elena Stepanova made changes -
            Fix Version/s 10.2 [ 14601 ]
            Assignee Jan Lindström [ jplindst ]
            jplindst Jan Lindström (Inactive) made changes -
            issue.field.resolutiondate 2021-01-14 06:34:47.0 2021-01-14 06:34:47.31
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s N/A [ 14700 ]
            Fix Version/s 10.2 [ 14601 ]
            Resolution Duplicate [ 3 ]
            Status Open [ 1 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 117738 ] MariaDB v4 [ 158738 ]

            People

              Assignee: jplindst Jan Lindström (Inactive)
              Reporter: jmox jmox
              Votes: 0
              Watchers: 2

              Dates

                Resolved: 2021-01-14 06:34:47
