[MDEV-25410] Assertion `state_ == s_exec' failed - mysqld got signal 6 Created: 2021-04-14  Updated: 2021-07-28  Resolved: 2021-07-28

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.5.9, 10.5
Fix Version/s: 10.4.21, 10.5.12, 10.6.4

Type: Bug
Priority: Critical
Reporter: Jaroslav
Assignee: Jan Lindström (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Attachments: crash.log.gz, logs.log
Issue Links:
Relates
relates to MDEV-22227 Assertion `state_ == s_exec' failed i... Closed
relates to MDEV-25114 Crash: WSREP: invalid state ROLLED_BA... Closed

 Description   

Hi,
One of the nodes in our database cluster crashed today with the errors shown below.
Any idea what might be wrong and why it went down?

This happened in a multi-master environment (3 nodes); one of the three nodes crashed.

mysqld: /home/buildbot/buildbot/build/mariadb-10.5.9/wsrep-lib/include/wsrep/client_state.hpp:320: int wsrep::client_state::start_transaction(const wsrep::transaction_id&): Assertion `state_ == s_exec' failed.
210414  7:14:29 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.5.9-MariaDB-1:10.5.9+maria~focal
key_buffer_size=629145600
read_buffer_size=131072
max_used_connections=85
max_threads=352
thread_count=94
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1389275 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
mysqld(my_print_stacktrace+0x32)[0x5595a3430af2]
Printing to addr2line failed
mysqld(handle_fatal_signal+0x485)[0x5595a2e866f5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7f96a922f3c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f96a8d3618b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f96a8d15859]
/lib/x86_64-linux-gnu/libc.so.6(+0x25729)[0x7f96a8d15729]
/lib/x86_64-linux-gnu/libc.so.6(+0x36f36)[0x7f96a8d26f36]
mysqld(_Z14wsrep_bf_abortPK3THDPS_+0x563)[0x5595a315bb33]
mysqld(wsrep_thd_bf_abort+0x23)[0x5595a3163713]
mysqld(+0xc70e3c)[0x5595a318be3c]
mysqld(handle_manager+0x15e)[0x5595a2c68e9e]
mysqld(+0xbc1c66)[0x5595a30dcc66]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x9609)[0x7f96a9223609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f96a8e12293]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             unlimited            unlimited            processes
Max open files            1048576              1048576              files
Max locked memory         16777216             16777216             bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       59986                59986                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: core.%e.%p.%t
 
Fatal signal 11 while backtracing
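
The log reports "Printing to addr2line failed", so the frames above were not resolved to source locations. As a rough aid for reading such a trace, the following is a minimal Python sketch that splits each frame line into module, symbol, offset, and address. The names `FRAME_RE` and `parse_frames` are hypothetical helpers introduced for illustration, and the pattern only assumes the `module(symbol+0xOFFSET)[0xADDRESS]` layout visible in this log.

```python
import re

# Matches backtrace lines of the form seen in the log above, e.g.
#   mysqld(wsrep_thd_bf_abort+0x23)[0x5595a3163713]
# The symbol part may be empty, e.g. mysqld(+0xc70e3c)[0x5595a318be3c].
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(]+)\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(log_text):
    """Return one dict per backtrace frame found in log_text (hypothetical helper)."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append({
                "module": match["module"],
                "symbol": match["symbol"] or None,  # None when no symbol was printed
                "offset": int(match["offset"], 16),
                "address": int(match["address"], 16),
            })
    return frames


# Four frames copied from the crash log above.
sample = """\
mysqld(_Z14wsrep_bf_abortPK3THDPS_+0x563)[0x5595a315bb33]
mysqld(wsrep_thd_bf_abort+0x23)[0x5595a3163713]
mysqld(+0xc70e3c)[0x5595a318be3c]
mysqld(handle_manager+0x15e)[0x5595a2c68e9e]
"""

frames = parse_frames(sample)
for frame in frames:
    print(frame["module"], frame["symbol"] or "<no symbol>", hex(frame["offset"]))
```

Frames without a symbol, such as `mysqld(+0xc70e3c)`, carry only an offset into the binary. Assuming a matching `mysqld` binary with debug symbols is available, that offset could be passed to `addr2line -e` to look up the source location. The mangled name `_Z14wsrep_bf_abortPK3THDPS_` demangles (for example with `c++filt`) to `wsrep_bf_abort(THD const*, THD*)`.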

Our configuration:

data:
  galera.cnf: |
    [galera]
    user = mysql
    bind-address = 0.0.0.0
    default_storage_engine = InnoDB
    binlog_format = ROW
    innodb_autoinc_lock_mode = 2
    innodb_flush_log_at_trx_commit = 2
 
    # MariaDB Galera settings
    wsrep_on=ON
    wsrep_provider=/usr/lib/galera/libgalera_smm.so
    wsrep_sst_method=rsync
    wsrep_slave_threads=8
    wsrep_sync_wait=7
 
    # Cluster settings (automatically updated)
    wsrep_cluster_address=gcomm://
    wsrep_cluster_name=mysql
    wsrep_node_address=127.0.0.1
  mariadb.cnf: |
    [client]
    default-character-set = utf8
    [mysqld]
    unix_socket = OFF
    performance_schema = ON
    character-set-server = utf8
    collation-server = utf8_general_ci
    ignore-db-dirs = lost+found
    max_connections = 350
    interactive_timeout = 450
    wait_timeout = 450
    join_buffer_size = 524288
    key_buffer_size = 600MB
    # InnoDB tuning
    innodb_log_file_size = 600MB
    innodb_buffer_pool_size = 8400MB



 Comments   
Comment by Jaroslav [ 2021-04-14 ]

I also tried to upload the core (crash) file (19 MB), but was unable to do so via FTP:

➜  Downloads lftp -u anonymous -e 'put MDEV-25410_sql_dump.tar.gz' ftp://ftp.mariadb.org/private/
Password:
cd ok, cwd=/private
put: Access failed: 553 Could not create file. (MDEV-25410_sql_dump.tar.gz)

Comment by Mario Karuza (Inactive) [ 2021-04-22 ]

Hi Jaroslav,

Was there any locking before this crash was observed?

Comment by Jaroslav [ 2021-04-22 ]

Mario, it's hard to tell. It still happens from time to time, and we usually only find out a few hours after the crash. I can provide a core file if it helps, but I am unable to upload it here directly because of its size, and the FTP server does not accept the file from me.

Comment by Mario Karuza (Inactive) [ 2021-04-22 ]

Can you upload it somewhere and provide a link, and/or provide the log file?

Comment by Jaroslav [ 2021-04-22 ]

Here is the core file: https://file.io/t0WXFQUs3ZZd (from the time reported in this ticket; the link allows one download only).
I have also attached logs from a recent occurrence, although there is not much in them.

Comment by Matthias Bethke [ 2021-07-23 ]

The same crash has been happening frequently here since we upgraded a 5-node Galera cluster to MariaDB 10.5. @Mario, I can confirm it always happens after some locking conflict; I'll upload the corresponding log.

Generated at Thu Feb 08 09:37:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.