[MDEV-7217] MariaDB Node crashes with WSREP: SQL statement was ineffective Created: 2014-11-26  Updated: 2023-06-06  Resolved: 2023-06-06

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 5.5.37-galera, 10.0.19-galera
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: Andrew
Assignee: Ramesh Sivaraman
Resolution: Won't Fix
Votes: 2
Labels: galera, need_verification
Environment: CentOS 6.5

Attachments: crash_rollback_rollback.txt, my.cnf, my.cnf

 Description   

Hi there.

I've encountered an issue with my Galera Cluster.

I have 4 Nodes clustered across 2 different physical locations with a substantial link between them.

I've recently had 2 node crashes (different nodes with the same error message)

I'm getting the following in the server log:

141124 7:26:54 [Warning] WSREP: SQL statement was ineffective, THD: 832, buf: 175
QUERY: commit
=> Skipping replication
141124 7:26:54 [ERROR] WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK
141124 7:26:54 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see http://kb.askmonty.org/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 5.5.37-MariaDB-wsrep-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=178
max_threads=501
thread_count=76
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1225846 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x7f1d92e94000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f30a38d9c38 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xa9226b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x6ea9d8]
/lib64/libpthread.so.0[0x349440f710]
/lib64/libc.so.6(gsignal+0x35)[0x3494032925]
/lib64/libc.so.6(abort+0x175)[0x3494034105]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x2d9)[0x7f31e956b4d9]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM13post_rollbackEPNS_9TrxHandleE+0x2e)[0x7f31e958628e]
/usr/lib64/galera/libgalera_smm.so(galera_post_rollback+0x68)[0x7f31e959fd88]
/usr/sbin/mysqld[0x691375]
/usr/sbin/mysqld[0x691488]
/usr/sbin/mysqld(_Z17ha_rollback_transP3THDb+0xe6)[0x6ed4c6]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x1e2)[0x6ed7d2]
/usr/sbin/mysqld(_Z12trans_commitP3THD+0x45)[0x66ce65]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x2c80)[0x5a7df0]
/usr/sbin/mysqld[0x5abf94]
/usr/sbin/mysqld[0x5ac37b]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1889)[0x5adfd9]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x11c)[0x5ae73c]
/usr/sbin/mysqld(_Z26threadpool_process_requestP3THD+0xa7)[0x690457]
/usr/sbin/mysqld[0x6c2115]
/lib64/libpthread.so.0[0x34944079d1]
/lib64/libc.so.6(clone+0x6d)[0x34940e8b5d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f1d9558b018): is an invalid pointer
Connection ID (thread ID): 832
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
141124 07:26:55 mysqld_safe Number of processes running now: 0
141124 07:26:55 mysqld_safe WSREP: not restarting wsrep node automatically
141124 07:26:55 mysqld_safe mysqld from pid file /var/lib/mysql/mysql.pid ended

The following Warning appears multiple times in the logs for both servers

[Warning] WSREP: SQL statement was ineffective, THD: 832, buf: 175
QUERY: commit
=> Skipping replication
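
To check how often this warning appears in an error log, and whether an FSM transition error follows it, a small script can tally the relevant lines. The following is a minimal Python sketch, not an official tool: `summarize_wsrep_events` is a hypothetical helper name, and the two patterns are copied from the log lines quoted in this report, so they may need adjusting for other versions.

```python
import re
from collections import Counter

# Patterns taken from the log lines quoted in this report (assumption:
# other MariaDB/Galera versions may format these messages differently).
INEFFECTIVE_RE = re.compile(
    r"\[Warning\] WSREP: SQL statement was ineffective, THD: (\d+), buf: (\d+)"
)
FSM_RE = re.compile(
    r"\[ERROR\] WSREP: FSM: no such a transition (\w+) -> (\w+)"
)


def summarize_wsrep_events(log_text):
    """Count 'ineffective' warnings per THD and collect FSM transition errors.

    Returns a (Counter, list) pair: warnings keyed by THD id, and a list of
    (from_state, to_state) tuples for each FSM error found.
    """
    warnings_per_thd = Counter()
    fsm_errors = []
    for line in log_text.splitlines():
        match = INEFFECTIVE_RE.search(line)
        if match:
            warnings_per_thd[int(match.group(1))] += 1
            continue
        match = FSM_RE.search(line)
        if match:
            fsm_errors.append((match.group(1), match.group(2)))
    return warnings_per_thd, fsm_errors
```

Passing the contents of a node's error log to `summarize_wsrep_events` gives a per-connection count of the warning, which can then be compared against the connection ID printed in the crash dump.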

I've definitely got all 4 nodes configured with binlog_format=ROW and no auto_increment settings, so I don't know what's causing this to occur during normal operations.



 Comments   
Comment by Andrew [ 2014-12-02 ]

Is there any additional information I can provide that will help decipher the cause of this issue?

I'd really appreciate some feedback.

Comment by Joerg [ 2015-03-03 ]

We have a similar problem. One node of our 3-node MariaDB cluster crashed.

CentOS 6.6 64Bit
MariaDB 10.0.16-1 galera 25.3.5-1

150303 12:06:02 [Warning] WSREP: SQL statement was ineffective, THD: 938807, buf: 390
QUERY: COMMIT
 => Skipping replication
150303 12:06:02 [ERROR] WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK
150303 12:06:02 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.0.16-MariaDB-wsrep-log
key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=44
max_threads=252
thread_count=8
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1118402 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7ff5507f5008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7ffe0f063d00 thread_stack 0x48000
(my_addr_resolve failure: fork)
/usr/sbin/mysqld(my_print_stacktrace+0x2b) [0xb995bb]
/usr/sbin/mysqld(handle_fatal_signal+0x398) [0x743ed8]
/lib64/libpthread.so.0() [0x32bb60f710]
/lib64/libc.so.6(gsignal+0x35) [0x32bb232625]
/lib64/libc.so.6(abort+0x175) [0x32bb233e05]
/usr/lib64/galera/libgalera_smm.so(galera::FSM<galera::TrxHandle::State, galera::TrxHandle::Transition, galera::EmptyGuard, galera::EmptyAction>::shift_to(galera::TrxHandle::State)+0x2d9) [0x7ffe17f5d4d9]
/usr/lib64/galera/libgalera_smm.so(galera::ReplicatorSMM::post_rollback(galera::TrxHandle*)+0x2e) [0x7ffe17f7828e]
/usr/lib64/galera/libgalera_smm.so(galera_post_rollback+0x68) [0x7ffe17f91d88]
/usr/sbin/mysqld() [0x6e4265]
/usr/sbin/mysqld() [0x6e42d8]
/usr/sbin/mysqld(ha_rollback_trans(THD*, bool)+0xde) [0x746dee]
/usr/sbin/mysqld(ha_commit_trans(THD*, bool)+0x1e2) [0x747142]
/usr/sbin/mysqld(trans_commit(THD*)+0x4c) [0x6b11cc]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0x1020) [0x5dc110]
/usr/sbin/mysqld() [0x5e3452]
/usr/sbin/mysqld() [0x5e3d4b]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x17f0) [0x5e5a80]

What can we do to avoid the error? Is there a fix, or is something misconfigured?

Comment by Nirbhay Choubey (Inactive) [ 2015-04-25 ]

Fos, joergw: Can you share the transaction that triggered this failure? Can you also try upgrading to the latest MariaDB/Galera version to see if this still occurs?

Comment by Joerg [ 2015-04-27 ]

Hi,

I am sorry that I cannot provide the requested query/transaction.
We upgraded our cluster 4 weeks ago, and the problem has not occurred again since the upgrade.

Comment by Andrew [ 2015-04-27 ]

I do not have the query that was run. I've had similar crashes since then, but never on the same node.

Upgrading is currently not an option.

I'll paste the queries on the cluster across all 5 nodes if I get another crash.

Comment by Nirbhay Choubey (Inactive) [ 2015-04-28 ]

joergw: Thanks for confirming that.
Fos: Since the bug does not seem to affect the latest versions, I am closing it.
But do post the queries/transaction in case you hit a crash, so that I can confirm
whether it's really fixed.

Comment by Andrew [ 2015-07-02 ]

@Nirbhay I've upgraded all nodes to 10.0.19 and I just got the exact same error on a node in preproduction.

150701 23:10:45 [ERROR] WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK
150701 23:10:45 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see http://kb.askmonty.org/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.0.19-MariaDB-wsrep-log
key_buffer_size=134217728
read_buffer_size=31457280
max_used_connections=139
max_threads=501
thread_count=24
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 16552907 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x7f3a844ba008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f3a6affec78 thread_stack 0x48000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xba1e4b]
/usr/sbin/mysqld(handle_fatal_signal+0x398)[0x748948]
/lib64/libpthread.so.0(+0xf710)[0x7f3b6e594710]
/lib64/libc.so.6(gsignal+0x35)[0x7f3b6cbf3925]
/lib64/libc.so.6(abort+0x175)[0x7f3b6cbf5105]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5StateENS1_10TransitionENS_10EmptyGuardENS_11EmptyActionEE8shift_toES2_+0x2d9)[0x7f3b6656b4d9]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM13post_rollbackEPNS_9TrxHandleE+0x2e)[0x7f3b6658628e]
/usr/lib64/galera/libgalera_smm.so(galera_post_rollback+0x68)[0x7f3b6659fd88]
/usr/sbin/mysqld[0x6e9e75]
/usr/sbin/mysqld[0x6e9ee8]
/usr/sbin/mysqld(_Z17ha_rollback_transP3THDb+0xde)[0x74b76e]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x200)[0x74bae0]
/usr/sbin/mysqld(_Z12trans_commitP3THD+0x4c)[0x6b4f4c]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1020)[0x5de1e0]
/usr/sbin/mysqld[0x5e54f7]
/usr/sbin/mysqld[0x5e5ebb]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x19fb)[0x5e7dfb]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x1e2)[0x5e86c2]
/usr/sbin/mysqld(_Z26threadpool_process_requestP3THD+0xa7)[0x6dd3b7]
/usr/sbin/mysqld[0x71d99d]
/lib64/libpthread.so.0(+0x79d1)[0x7f3b6e58c9d1]
/lib64/libc.so.6(clone+0x6d)[0x7f3b6cca9b5d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f3a6a9a5020): is an invalid pointer
Connection ID (thread ID): 2363
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
150701 23:10:48 mysqld_safe Number of processes running now: 0
150701 23:10:48 mysqld_safe WSREP: not restarting wsrep node automatically
150701 23:10:48 mysqld_safe mysqld from pid file /var/lib/mysql/mysql.pid ended

Comment by Nirbhay Choubey (Inactive) [ 2015-07-02 ]

Fos: OK, thanks. I have reopened this issue for further investigation.

Comment by Andrew [ 2015-07-02 ]

@Nirbhay thanks. I've enabled wsrep_debug and general logging so I can try to get additional information.

Comment by Nirbhay Choubey (Inactive) [ 2015-07-08 ]

Fos Thanks! Please update when you find something of interest.

Comment by Rich Theobald [ 2016-09-23 ]

Same issue on MariaDB Galera 10.0.22; see the attached crash_rollback_rollback.txt.

Comment by Rich Theobald [ 2017-07-25 ]

In my case, I worked around the issue by using a procedure that checks whether the session is in a transaction before attempting a rollback:

  IF @@IN_TRANSACTION THEN
    ROLLBACK;
  END IF;

Comment by Ralf Gebhardt [ 2020-06-02 ]

stepan.patryshev, I will remove the unsupported FixVersion from this task. Please check if the task is still valid for 10.1+ and add the appropriate FixVersion

Comment by Jan Lindström [ 2023-06-06 ]

Both versions are EOL.
