[MDEV-28228] WSREP: Internal library error: unexpected trx release state: ... state: EXECUTING:0->REPLICATING:661 Created: 2022-04-04  Updated: 2022-05-30  Resolved: 2022-05-30

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4.14
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Valerii Kravchuk Assignee: Jan Lindström (Inactive)
Resolution: Incomplete Votes: 0
Labels: assertion, crash


 Description   

Galera crashes after error related to unexpected state, like this:

2022-04-03 1:02:56 88 [ERROR] WSREP: Internal library error: unexpected trx release state: source: 36d0ce3a-b2cf-11ec-a206-274c87e76795 version: 5 local: 1 flags: 65 conn_id: 73 trx_id: 43252 tstamp: 1648936976268162208; state: EXECUTING:0->REPLICATING:661 (FATAL)
at galera/src/wsrep_provider.cpp:galera_release():930
2022-04-03 1:02:56 73 [ERROR] WSREP: Mutex unlock failed: 22 (Invalid argument), Aborting.
220403 1:02:56 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.4.14-MariaDB
key_buffer_size=134217728
read_buffer_size=262144
max_used_connections=21
max_threads=1002
thread_count=90
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2464400 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f78500009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f7a20629cf0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x563b0d21e29e]
/usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x563b0ccb3abf]
/lib64/libpthread.so.0(+0xf630)[0x7fc53acfc630]
/lib64/libc.so.6(gsignal+0x37)[0x7fc538fcc387]
/lib64/libc.so.6(abort+0x148)[0x7fc538fcda78]
/usr/lib64/galera-4/libgalera_smm.so(+0x6021f)[0x7fc53565f21f]
/usr/lib64/galera-4/libgalera_smm.so(+0x1be00b)[0x7fc5357bd00b]
/usr/sbin/mysqld(_ZN5wsrep18wsrep_provider_v267certifyENS_9client_idERNS_9ws_handleEiRNS_7ws_metaE+0xe5)[0x563b0d2cc615]
/usr/sbin/mysqld(_ZN5wsrep11transaction14certify_commitERNS_11unique_lockINS_5mutexEEE+0x13a)[0x563b0d2c5c7a]
/usr/sbin/mysqld(_ZN5wsrep11transaction14before_prepareERNS_11unique_lockINS_5mutexEEE+0xcb)[0x563b0d2c637b]
/usr/sbin/mysqld(_ZN5wsrep11transaction13before_commitEv+0x1b7)[0x563b0d2c6677]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x5f3)[0x563b0ccc1fc3]
/usr/sbin/mysqld(_Z17trans_commit_stmtP3THD+0x3f)[0x563b0cba21cf]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0xfb2)[0x563b0caa4992]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x36d)[0x563b0caace1d]
/usr/sbin/mysqld(+0x5ab0ac)[0x563b0c9980ac]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x32a0)[0x563b0cab0890]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x109)[0x563b0cab0cb9]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x1fa)[0x563b0cb93a0a]
/usr/sbin/mysqld(handle_one_connection+0x3d)[0x563b0cb93aed]
2022-04-03  1:03:10 0 [Note] InnoDB: Buffer pool(s) load completed at 220403  1:03:10
/lib64/libpthread.so.0(+0x7ea5)[0x7fc53acf4ea5]
/lib64/libc.so.6(clone+0x6d)[0x7fc539094b0d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f78509e72f0): UPDATE ...
Connection ID (thread ID): 73
Status: NOT_KILLED

or like this:

2022-04-03  1:14:36 123 [ERROR] WSREP: Mutex lock failed: 22 (Invalid argument)
         at galerautils/src/gu_mutex.hpp:lock():61
2022-04-03  1:14:36 123 [ERROR] WSREP: Mutex lock failed: 22 (Invalid argument)
         at galerautils/src/gu_mutex.hpp:lock():61
2022-04-03  1:14:36 119 [ERROR] WSREP: Mutex lock failed: 22 (Invalid argument)
         at galerautils/src/gu_mutex.hpp:lock():61
2022-04-03  1:14:36 119 [ERROR] WSREP: Mutex lock failed: 22 (Invalid argument)
         at galerautils/src/gu_mutex.hpp:lock():61
2022-04-03  1:14:36 123 [ERROR] WSREP: Internal library error: unexpected trx release state: source: 36d0ce3a-b2cf-11ec-a208-274c87e76795 version: 5 local: 1 flags: 65 conn_id: 119 trx_id: 1462743 tstamp: 1648937676903510819; state: EXECUTING:0->REPLICATING:661 (FATAL)
         at galera/src/wsrep_provider.cpp:galera_release():930
terminate called after throwing an instance of 'gu::Exception'
  what():  gu_mutex_destroy(): 16 (Device or resource busy)
         at galerautils/src/gu_mutex.hpp:~Mutex():44
220403  1:14:36 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.4.14-MariaDB
key_buffer_size=134217728
read_buffer_size=262144
max_used_connections=39
max_threads=1002
thread_count=106
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2464400 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f97f40009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f9a787f7cf0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x560dc850a29e]
/usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x560dc7f9fabf]
/lib64/libpthread.so.0(+0xf630)[0x7fe5848be630]
/lib64/libc.so.6(gsignal+0x37)[0x7fe582b8e387]
:0(__GI_raise)[0x7fe582b8fa78]
/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x165)[0x7fe583288a95]
/lib64/libstdc++.so.6(+0x5ea06)[0x7fe583286a06]
/lib64/libstdc++.so.6(+0x5d9b9)[0x7fe5832859b9]
/lib64/libstdc++.so.6(__gxx_personality_v0+0x564)[0x7fe583286624]
/lib64/libgcc_s.so.1(+0xf8e3)[0x7fe58161a8e3]
/lib64/libgcc_s.so.1(_Unwind_Resume+0x57)[0x7fe58161ae17]
/usr/lib64/galera-4/libgalera_smm.so(+0x25ab0)[0x7fe57f1e6ab0]
src/gu_exception.hpp:23(Exception)[0x7fe57f2380e9]
:0(gu::Mutex::~Mutex())[0x7fe57f332000]
src/trx_handle.hpp:210(~TrxHandle)[0x7fe57f332aa0]
src/gu_lock.hpp:31(Lock)[0x7fe57f32f2e2]
detail/sp_counted_base_gcc_x86.hpp:50(atomic_exchange_and_add)[0x7fe57f37fa03]
/usr/sbin/mysqld(_ZN5wsrep18wsrep_provider_v267releaseERNS_9ws_handleE+0x29)[0x560dc85b6b39]
/usr/sbin/mysqld(_ZN5wsrep11transaction15after_statementEv+0x127)[0x560dc85b3cd7]
/usr/sbin/mysqld(_ZN5wsrep12client_state15after_statementEv+0x4d)[0x560dc858fa8d]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0xef6)[0x560dc7cf9866]
/usr/sbin/mysqld(_ZN7sp_head15execute_triggerEP3THDPK25st_mysql_const_lex_stringS4_P13st_grant_info+0x1d7)[0x560dc7cf9e87]
/usr/sbin/mysqld(_ZN19Table_triggers_list16process_triggersEP3THD14trg_event_type20trg_action_time_typeb+0x105)[0x560dc7e31ee5]
/usr/sbin/mysqld(_Z36fill_record_n_invoke_before_triggersP3THDP5TABLER4ListI4ItemES6_b14trg_event_type+0x7d)[0x560dc7d36fdd]
/usr/sbin/mysqld(_Z12mysql_insertP3THDP10TABLE_LISTR4ListI4ItemERS3_IS5_ES6_S6_15enum_duplicatesb+0x11bd)[0x560dc7d63fad]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3824)[0x560dc7d93204]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x36d)[0x560dc7d98e1d]
/usr/sbin/mysqld(+0x5ab0ac)[0x560dc7c840ac]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x320f)[0x560dc7d9c7ff]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x109)[0x560dc7d9ccb9]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x1fa)[0x560dc7e7fa0a]
/usr/sbin/mysqld(handle_one_connection+0x3d)[0x560dc7e7faed]
/lib64/libpthread.so.0(+0x7ea5)[0x7fe5848b6ea5]
/lib64/libc.so.6(clone+0x6d)[0x7fe582c56b0d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f97f406efe0): INSERT IGNORE INTO ...

I can not find the exact match for the problematic state change, EXECUTING:0->REPLICATING:661, hence this new report.



 Comments   
Comment by Jan Lindström (Inactive) [ 2022-04-20 ]

valerii I would need some instructions how to reproduce this issue. Customer could also migrate to more recent version of MariaDB server and Galera library.

Comment by Valerii Kravchuk [ 2022-04-20 ]

Upgrade was suggested.

Comment by Jan Lindström (Inactive) [ 2022-04-20 ]

ramesh Can you try to reproduce using rgq/pquery with grammar containing INSERT IGNORE clauses.

Comment by Ramesh Sivaraman [ 2022-04-26 ]

jplindst Could not reproduce the issue using multi-threaded pquery/RQG run.

Generated at Thu Feb 08 09:59:05 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.