[MDEV-23025] int wsrep::transaction::before_rollback(): Assertion `state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay' failed signal 6
Created: 2020-06-26  Updated: 2020-11-10  Resolved: 2020-11-10

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4.13
Fix Version/s: 10.4.14

Type: Bug
Priority: Major
Reporter: Richard Stracke
Assignee: Julius Goryavsky
Resolution: Cannot Reproduce
Votes: 1
Labels: need_feedback

Issue Links:
Relates
relates to MDEV-23057 wsrep::transaction::before_rollback()... Closed
relates to MDEV-18935 Galera test mysql-wsrep#198 sporaric ... Closed
relates to MDEV-22222 Assertion `state() == s_executing || ... Closed
relates to MDEV-22223 Server crashes in gu::Mutex::lock / g... Closed

 Description   

Occasionally a node crashes with:

transaction.cpp:632: int wsrep::transaction::before_rollback(): Assertion `state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay' failed.

Possibly related to MDEV-22222 / MDEV-22223 / MDEV-18935.

Some stack traces:
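For readers unfamiliar with the check, the failing assertion amounts to a whitelist of transaction states from which a rollback may begin. The sketch below is a simplified, standalone model and not the wsrep-lib code: the enum layout, `rollback_allowed`, and `before_rollback_sketch` are hypothetical names, and the two extra states (`s_committed`, `s_aborted`) are assumed here only to show a rejected case.

```cpp
#include <cassert>

// Simplified model of the transaction states. The first seven names are
// taken from the assertion text in this report; the last two are
// assumptions added purely to illustrate a state the check rejects.
enum State {
    s_executing,
    s_preparing,
    s_prepared,
    s_must_abort,
    s_aborting,
    s_cert_failed,
    s_must_replay,
    s_committed,  // assumed, for illustration only
    s_aborted     // assumed, for illustration only
};

// Returns true when the state is one of those listed in the assertion.
bool rollback_allowed(State s)
{
    switch (s) {
    case s_executing:
    case s_preparing:
    case s_prepared:
    case s_must_abort:
    case s_aborting:
    case s_cert_failed:
    case s_must_replay:
        return true;
    default:
        return false;
    }
}

// Hypothetical stand-in for the entry check: in a build with assertions
// enabled, a state outside the list aborts the process (SIGABRT, signal 6).
void before_rollback_sketch(State s)
{
    assert(rollback_allowed(s));
    // The actual rollback work would follow here.
}
```

In a build with assertions enabled, entering the check in any state outside this list aborts the process with signal 6, which matches the `mysqld got signal 6` lines in the logs below.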

2020-06-23  8:15:02 37433 [Warning] Aborted connection 37433 to db: 'information_schema' user: 'pam_siha' host: '10.109.113.79' (Got an error reading communication packets)
mysqld: /home/buildbot/buildbot/build/mariadb-10.4.13/wsrep-lib/src/transaction.cpp:632: int wsrep::transaction::before_rollback(): Assertion `state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay' failed.
200623  8:50:13 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.4.13-MariaDB-1:10.4.13+maria~bionic-log
key_buffer_size=134217728
read_buffer_size=16777216
max_used_connections=257
max_threads=6002
thread_count=394
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 123200322 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f27a8060dc8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fb26eeceda8 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5610cdfa2e8e]

mysqld: /home/buildbot/buildbot/build/mariadb-10.4.13/wsrep-lib/src/transaction.cpp:632: int wsrep::transaction::before_rollback(): Assertion `state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay' failed.
200622  9:40:16 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.4.13-MariaDB-1:10.4.13+maria~bionic-log
key_buffer_size=134217728
read_buffer_size=16777216
max_used_connections=306
max_threads=6002
thread_count=446
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 123200322 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f65a4375ec8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f6ba398cda8 thread_stack 0x30000
[... some aborted connections]
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55f56b16ce8e]
/usr/sbin/mysqld(handle_fatal_signal+0x515)[0x55f56abe8915]

2020-06-22  9:49:37 0 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.4.13-MariaDB-1:10.4.13+maria~bionic-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
2020-06-22  9:49:37 1 [Note] WSREP: Lowest cert indnex boundary for CC from group: 1095767230
2020-06-22  9:49:37 1 [Note] WSREP: Min available from gcache for CC from group: 1093637757
2020-06-22  9:49:37 1 [Note] WSREP: Server S21006 synced with group
2020-06-22  9:49:37 1 [Note] WSREP: Server status change joined -> synced
2020-06-22  9:49:37 1 [Note] WSREP: Synchronized with group, ready for connections
2020-06-22  9:49:48 0 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000000 of size 1073741824 bytes
mysqld: /home/buildbot/buildbot/build/mariadb-10.4.13/wsrep-lib/src/transaction.cpp:632: int wsrep::transaction::before_rollback(): Assertion `state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay' failed.
200622  9:49:53 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.4.13-MariaDB-1:10.4.13+maria~bionic-log
key_buffer_size=134217728
read_buffer_size=16777216
max_used_connections=1017
max_threads=6002
thread_count=1157
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 123200322 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7ebecc01d0b8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7ebc6ad3fda8 thread_stack 0x30000
*** buffer overflow detected ***: /usr/sbin/mysqld terminated
2020-06-22  9:49:54 2098 [Warning] Aborted connection 2098 to db: 'unconnected' user: 'unauthenticated' host: '10.109.113.79' (This connection closed normally without authentication)
2020-06-22  9:52:22 0 [Note] WSREP: Loading provider /usr/lib/libgalera_smm.so initial position: 7f40ce9f-916a-11ea-a326-97aa42b5043e:1095767404
wsrep loader: [INFO] wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
wsrep loader: [INFO] wsrep_load(): Galera 26.4.4(r4599) by Codership Oy <info@codership.com> loaded successfully.
2020-06-22  9:52:22 0 [Note] WSREP: CRC-32C: using hardware acceleration.
2020-06-22  9:52:22 0 [Note] WSREP: Found saved state: 7f40ce9f-916a-11ea-a326-97aa42b5043e:-1, safe_to_bootstrap: 1
2020-06-22  9:52:22 0 [Note] WSREP: GCache DEBUG: opened preamble:
Version: 2
UUID: 7f40ce9f-916a-11ea-a326-97aa42b5043e
Seqno: -1 - -1
Offset: -1
Synced: 0
2020-06-22  9:52:22 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 7f40ce9f-916a-11ea-a326-97aa42b5043e, offset: -1
2020-06-22  9:52:22 0 [Note] WSREP: GCache::RingBuffer initial scan...  0.0% (         0/3221225496 bytes) complete.
2020-06-22  9:52:23 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (3221225496/3221225496 bytes) complete.



 Comments   
Comment by Richard Stracke [ 2020-06-26 ]

Found some entries in syslog.

Jun 22 09:28:19 S21005 -innobackupex-apply: 2020-06-22 11:28:19 0 [ERROR] InnoDB: Data file './ibdata1' uses page size 8192, but the innodb_page_size start-up parameter is 16384
 
 
Your server is configured with 8192.

Maybe some "injected" pages with the wrong size cause the crash when a rollback thread tries to access them.

Comment by Jan Lindström (Inactive) [ 2020-07-31 ]

To get a real idea of the cause, we would need further information on how to reproduce this. I wonder whether this issue was fixed by MDEV-22222?

Comment by Julius Goryavsky [ 2020-11-10 ]

It is no longer reproducible and is almost certainly a duplicate of the already closed MDEV-22222.
