[MDEV-21025] Galera: Server 10.4 crashes with signal 6 on attempt to use RQG
Created: 2019-11-11  Updated: 2020-08-25  Resolved: 2020-04-18

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.4.9
Fix Version/s: 10.4.13

Type: Bug
Priority: Critical
Reporter: Stepan Patryshev (Inactive)
Assignee: Teemu Ollakka
Resolution: Fixed
Votes: 2
Labels: None
Environment: CentOS Linux release 7.6.1810 (Core)
Attachments: RQG_191106_crash_10.4.9.txt, RQG_191106_crash_10.4.9_logs.zip, galera_stress.yy, galera_stress.zz
Issue Links:
Relates
relates to MDEV-21026 Galera: Assertion `xid_seqno > wsrep_... Closed
relates to MDEV-21597 MariaDB standalone server replication... Closed

 Description   

MariaDB Version 10.4.9-MariaDB-debug: Repository: MariaDB/server; branch 10.4; Revision 2b5f4b3ed68585b310b7ebede474928ff90d9aa2; debug built from sources.

Galera lib 26.4.3(r4535): Repository: MariaDB/galera; branch mariadb-4.x; Revision 752664dc3c7065d8e0c73ac99d0028a5f84eb250; debug built from sources.

RQG: Repository: MariaDB/randgen; master branch; Revision bea5b6b07d08ba6349d8a6ff4356b01678822727.

Run:

perl ./runall-new.pl \
  --grammar=conf/galera/galera_stress.yy \
  --gendata=conf/galera/galera_stress.zz \
  --duration=4000 --queries=200M --threads=1 --galera=mss \
  --basedir=/home/stepan/mariadb/10.4/git \
  --vardir=/home/stepan/rqg/github/var \
  --sqltrace=MarkErrors \
  "--mysqld=--wsrep-provider=/usr/lib/libgalera_smm_4.so" \
  "--mysqld=--wsrep_sst_method=rsync" \
  "--mysqld=--core" \
  "--mysqld=--general-log" \
  "--mysqld=--general-log-file=queries.log" \
  "--mysqld=--log-output=file" \
  "--mysqld=--wsrep-debug=server" \
  "--mysqld=--wsrep-sync-wait=15" \
  "--mysqld=--wsrep_retry_autocommit=0" \
  "--mysqld=--wsrep_log_conflicts=1" \
  "--mysqld=--wsrep_on=ON"

Output:
191106 15:41:05 [ERROR] mysqld got signal 6 ;

2019-11-06 15:41:05 17 [Note] WSREP: wsrep_sync_wait: thd->variables.wsrep_sync_wait= 15, mask= 1, thd->variables.wsrep_on= 1
mysqld: /home/stepan/mariadb/10.4/git/wsrep-lib/src/transaction.cpp:123: int wsrep::transaction::start_transaction(const wsrep::transaction_id&): Assertion `active() == false' failed.
191106 15:41:05 [ERROR] mysqld got signal 6 ;
Server version: 10.4.9-MariaDB-debug-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=2
max_threads=153
thread_count=10
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467842 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7f52ac000b00
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f5328088d80 thread_stack 0x49000
mysys/stacktrace.c:269(my_print_stacktrace)[0x55ed5fccf008]
sql/signal_handler.cc:209(handle_fatal_signal)[0x55ed5f43b5dd]
sigaction.c:0(__restore_rt)[0x7f533d6ba5d0]
:0(__GI_raise)[0x7f533b9a8207]
:0(__GI_abort)[0x7f533b9a98f8]
:0(__assert_fail_base)[0x7f533b9a1026]
:0(__GI___assert_fail)[0x7f533b9a10d2]
src/transaction.cpp:124(wsrep::transaction::start_transaction(wsrep::transaction_id const&))[0x55ed5fd7bf32]
wsrep/client_state.hpp:287(wsrep::client_state::start_transaction(wsrep::transaction_id const&))[0x55ed5f09c4d8]
sql/wsrep_trans_observer.h:138(wsrep_start_transaction(THD*, unsigned long))[0x55ed5f275d1e]
sql/transaction.cc:185(trans_begin(THD*, unsigned int))[0x55ed5f276441]
sql/sql_parse.cc:5593(mysql_execute_command(THD*))[0x55ed5f0e30ef]
sql/sql_parse.cc:7912(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55ed5f0ea7b5]
sql/sql_parse.cc:7727(wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55ed5f0e9d68]
sql/sql_parse.cc:1826(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55ed5f0d5a77]
sql/sql_parse.cc:1359(do_command(THD*))[0x55ed5f0d418a]
sql/sql_connect.cc:1412(do_handle_one_connection(CONNECT*))[0x55ed5f25dd51]
sql/sql_connect.cc:1317(handle_one_connection)[0x55ed5f25da80]
pthread_create.c:0(start_thread)[0x7f533d6b2dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f533ba6fead]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f52ac013e18): START TRANSACTION /* QNO 39441 CON_ID 17 */
Connection ID (thread ID): 17
Status: NOT_KILLED
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /home/stepan/rqg/github/var/node0/data
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             4096                 23005                processes
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       23005                23005                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h
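
The failed assertion at wsrep-lib/src/transaction.cpp:123 is a precondition check: `start_transaction` expects `active() == false`, yet the backtrace shows it was reached from `trans_begin` while handling `START TRANSACTION`. The Python sketch below only models that precondition. The class `ToyTransaction` and its methods are hypothetical stand-ins, not the wsrep-lib code, and the sketch makes no claim about why the transaction was still active in this run.

```python
class ToyTransaction:
    """Hypothetical model of a transaction object with an 'active' state."""

    def __init__(self):
        self._id = None  # None means no transaction is currently active

    def active(self):
        return self._id is not None

    def start_transaction(self, transaction_id):
        # Models the precondition named in the log: active() == false.
        if self.active():
            raise AssertionError("Assertion `active() == false' failed.")
        self._id = transaction_id
        return 0

    def finish(self):
        # Assumed cleanup step that ends the transaction (illustrative only).
        self._id = None


def second_start_fails():
    """Return True if starting twice without cleanup trips the check."""
    trx = ToyTransaction()
    trx.start_transaction(1)
    try:
        trx.start_transaction(2)
    except AssertionError:
        return True
    return False
```

In this toy model, a second `start_transaction` succeeds only after `finish()` has cleared the previous transaction; skipping that step reproduces the shape of the failure seen in the log.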

Server logs: see the attached RQG_191106_crash_10.4.9_logs.zip.



 Comments   
Comment by Stepan Patryshev (Inactive) [ 2019-11-13 ]

It is also reproducible on ES 10.4.10-4:

MariaDB Version 10.4.10-4-MariaDB-debug: Repository: mariadb-corporation/MariaDBEnterprise; branch 10.4-enterprise; Revision c6be931c93c2c8511f046960b6a28bf684175867; debug built from sources.

Galera lib 26.4.4(r4648): Repository: mariadb-corporation/es-galera.git; branch es-mariadb-4.x; Revision bacbdc0b8ac002acfa041de25d32297eb3c63bcc; debug built from sources.

Comment by Stepan Patryshev (Inactive) [ 2019-11-27 ]

I have attached the corresponding RQG files: galera_stress.yy (grammar) and galera_stress.zz (gendata).
