Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32851

Assertion failure crash

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Cannot Reproduce
    • 10.6.5
    • N/A
    • Server
    • None
    • Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-97-generic x86_64)

    Description

      We have a MariaDB cluster with two nodes. MaxScale sits in front.

      For some reason, master node's MariaDB crashed and restarted. Slave node became the master which was good, but master node didn't come back to normal. I've logged into the node via SSH and realised two MariaDB instances were running, which indicates the old service didn't stop when restart happened. I've killed all MariaDB instances and restarted the service, which recovered the cluster.

      I am doing some RCA duty and I'd like to know what made the database service crash on the node. I've obained the following log messages, but I don't know how to interpret.

      2023-11-18 06:37:47 0x7fd5c37fe700  InnoDB: Assertion failure in file /home/buildbot/buildbot/build/mariadb-10.6.5/storage/innobase/fil/fil0fil.cc line 482
      InnoDB: Failing assertion: space->is_ready_to_close() || space->purpose == FIL_TYPE_TEMPORARY || srv_fast_shutdown == 2 || !srv_was_started
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mariadbd startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
      InnoDB: about forcing recovery.
      231118  6:37:47 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.6.5-MariaDB-1:10.6.5+maria~focal-log
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=516
      max_threads=702
      thread_count=426
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3114573 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x557ef60cc748
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7fd5c37fdcd8 thread_stack 0x49000
      /usr/sbin/mariadbd(my_print_stacktrace+0x32)[0x557ef2ec3442]
      /usr/sbin/mariadbd(handle_fatal_signal+0x485)[0x557ef29766a5]
      2023-11-18  6:42:06 0 [Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)
      2023-11-18  6:42:07 0 [Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)
      2023-11-18  6:42:07 0 [Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)
      2023-11-18  6:42:07 0 [Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)
      2023-11-18  6:42:07 0 [Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)
      2023-11-18  6:42:08 0 [Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)
      2023-11-18  6:42:09 0 [Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)
      2023-11-18  6:42:09 0 [Warning] Aborted connection 0 to db: 'unconnected' user: 'unauthenticated' host: 'connecting host' (Too many connections)
      2023-11-18 6:47:30 247590222 [Warning] InnoDB: A long wait (580 seconds) was observed for dict_sys.latch
      

      After waiting for about 5 mins, I killed all mariadb and restarted it.

      The next startup after demoting this master to a replica:

      2023-11-18  6:56:52 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
      2023-11-18  6:56:52 0 [Note] InnoDB: Number of pools: 1
      2023-11-18  6:56:52 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
      2023-11-18  6:56:52 0 [Note] InnoDB: Using Linux native AIO
      2023-11-18  6:56:52 0 [Note] InnoDB: Initializing buffer pool, total size = 50063212544, chunk size = 134217728
      2023-11-18  6:56:52 0 [Note] InnoDB: Completed initialization of buffer pool
      2023-11-18  6:56:53 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=34833922131011,34834075130209
      2023-11-18  6:57:02 0 [Note] InnoDB: Transaction 1741415686 was in the XA prepared state.
      2023-11-18  6:57:02 0 [Note] InnoDB: 24 transaction(s) which must be rolled back or cleaned up in total 24 row operations to undo
      2023-11-18  6:57:02 0 [Note] InnoDB: Trx id counter is 1741415687
      2023-11-18  6:57:02 0 [Note] InnoDB: Starting final batch to recover 80462 pages from redo log.
      2023-11-18  6:57:10 0 [Note] InnoDB: Last binlog file '/data/mysql/mysql-bin.115677', position 929052422
      2023-11-18  6:57:10 0 [Note] InnoDB: 128 rollback segments are active.
      2023-11-18  6:57:11 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
      2023-11-18  6:57:11 0 [Note] InnoDB: To roll back: 23 transactions, 24 rows
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415288
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415421
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415383
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415160
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415170
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415265
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415564
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415321
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415032
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415645
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415465
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415630
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415138
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415451
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415057
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415109
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415559
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415434
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415157
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415006
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415102
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415045
      2023-11-18  6:57:11 0 [Note] InnoDB: Rolled back recovered transaction 1741415001
      2023-11-18  6:57:11 0 [Note] InnoDB: Rollback of non-prepared transactions completed
      2023-11-18  6:57:11 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
      2023-11-18  6:57:11 0 [Note] InnoDB: Creating shared tablespace for temporary tables
      2023-11-18  6:57:11 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
      2023-11-18  6:57:11 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
      2023-11-18  6:57:11 0 [Note] InnoDB: 10.6.5 started; log sequence number 34834749712353; transaction id 1741415711
      2023-11-18  6:57:11 0 [Note] InnoDB: Loading buffer pool(s) from /data/mysql/ib_buffer_pool
      2023-11-18  6:57:11 0 [Note] Plugin 'FEEDBACK' is disabled.
      2023-11-18  6:57:11 0 [Note] Recovering after a crash using /data/mysql/mysql-bin
      2023-11-18  6:57:11 0 [Note] Starting table crash recovery...
      2023-11-18  6:57:11 0 [Note] InnoDB: Starting recovery for XA transactions...
      2023-11-18  6:57:11 0 [Note] InnoDB: Transaction 1741415686 in prepared state after recovery
      2023-11-18  6:57:11 0 [Note] InnoDB: Transaction contains changes to 1 rows
      2023-11-18  6:57:11 0 [Note] InnoDB: 1 transactions in prepared state after recovery
      2023-11-18  6:57:11 0 [Note] Found 1 prepared transaction(s) in InnoDB
      2023-11-18  6:57:11 0 [Note] Crash table recovery finished.
      2023-11-18  6:57:12 0 [Note] Server socket created on IP: '0.0.0.0'.
      2023-11-18  6:57:12 0 [Note] /usr/sbin/mariadbd: ready for connections.
      Version: '10.6.5-MariaDB-1:10.6.5+maria~focal-log'  socket: '/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
      2023-11-18  6:57:29 0 [Note] InnoDB: Buffer pool(s) load completed at 231118  6:57:29
      2023-11-18  6:58:01 7 [Note] Master connection name: ''  Master_info_file: 'master.info'  Relay_info_file: 'relay-log.info'
      2023-11-18  6:58:01 7 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='', master_port='3306', master_log_file='', master_log_pos='4'. New state master_host='192.168.152.22', master_port='3306', master_log_file='', master_log_pos='4'.
      2023-11-18  6:58:01 7 [Note] Previous Using_Gtid=No. New Using_Gtid=Current_Pos
      2023-11-18  6:58:01 33 [Note] Slave I/O thread: Start asynchronous replication to master 'repl@192.168.152.22:3306' in log '' at position 4
      2023-11-18  6:58:01 33 [Note] Slave I/O thread: connected to master 'repl@192.168.152.22:3306',replication starts at GTID position '0-16-510364683'
      2023-11-18  6:58:01 34 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 4, relay log '/data/mysql/mysqld-relay-bin.000001' position: 4; GTID position '0-16-510364683'
      

      Unfortunately no core was saved.

      Binary logs uploaded to https://mariadb.com/kb/en/meta/mariadb-ftp-server/ if something can be salvaged on the execution there.

      from new master

      2023-11-18  6:47:51 8642 [Note] Error reading relay log event: slave SQL thread was killed
      2023-11-18  6:47:51 8642 [Note] Slave SQL thread exiting, replication stopped in log 'mysql-bin.115677' at position 929053385; GTID position '0-16-510364683', master: 192.168.152.21:3306
      2023-11-18  6:47:51 8641 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.115677', position 929053385; GTID position 0-16-510364683, master 192.168.152.21:3306
      2023-11-18  6:47:51 241938254 [Note] Deleted Master_info file '/data/mysql/master.info'.
      2023-11-18  6:47:51 241938254 [Note] Deleted Master_info file '/data/mysql/relay-log.info'.
      2023-11-18  6:58:01 251514015 [Note] Start binlog_dump to slave_server(16), pos(, 4), using_gtid(1), gtid('0-16-510364683')
      2023-11-18  7:09:23 251519693 [Warning] Aborted connection 251519693 to db: 'unconnected' user: 'admin' host: '10.0.12.2' (Got timeout reading communication packets)
      

      I'd like to know what this means and what should be prepared to avoid the same event if possible.

      Hope I can get some help.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              eoin Eoin Kim
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.