[MDEV-20898] Galera test failure on galera_sr.galera_sr_kill_all_pcrecovery: mysqld got signal 6 Created: 2019-10-25  Updated: 2020-03-09  Resolved: 2020-02-10

Status: Closed
Project: MariaDB Server
Component/s: Galera, Tests
Affects Version/s: 10.4.9
Fix Version/s: N/A

Type: Bug Priority: Minor
Reporter: Stepan Patryshev (Inactive) Assignee: Stepan Patryshev (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: galera_4
Environment:

CentOS: 7.6.1810.


Attachments: Zip Archive 191025_SSD_galera_sr_kill_all_pcrecovery.zip     Text File 191025_SSD_stdout.log    

 Description   

It sporadically fails only on the local suite runs, not on the bb.

MariaDB Version 10.4.9-MariaDB-debug: Repository: MariaDB/server; branch 10.4; Revision 2b5f4b3ed68585b310b7ebede474928ff90d9aa2; debug built from sources.

Galera 26.4.3(r4535): Repository: MariaDB/galera; branch mariadb-4.x; Revision 752664dc3c7065d8e0c73ac99d0028a5f84eb250; debug built from sources.

Run:

./mtr --suite galera_sr --force --big-test --max-test-fail=0 --timer --timestamp --timediff

stdout.log (191025_SSD_stdout.log):

2019-10-25 18:21:44 2 [Note] WSREP: GCache history reset: old(1ad5e5a2-f73b-11e9-b01c-0e518872a648:32010 -> 25ca339b-f73b-11e9-9669-cffc50bdea26:2
mysqld: gcache/src/gcache_rb_store.cpp:462: void gcache::RingBuffer::seqno_reset(): Assertion `(reinterpret_cast<BufferHeader*>(first_))->size > 0' failed.
191025 18:21:44 [ERROR] mysqld got signal 6 ;

191025 18:21:45 +15.985 galera_sr.galera_sr_kill_all_pcrecovery 'innodb' [ fail ]
        Test ended at 2019-10-25 18:21:45
 
CURRENT_TEST: galera_sr.galera_sr_kill_all_pcrecovery
 
 
Server [mysqld.2 - pid: 13177, winpid: 13177, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
2019-10-25 18:21:43 0 [Warning] WSREP: option --wsrep-causal-reads is deprecated
2019-10-25 18:21:43 0 [Note] /home/stepan/mariadb/10.4/git/sql/mysqld (mysqld 10.4.9-MariaDB-debug-log) starting as process 13178 ...
2019-10-25 18:21:43 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 4186)
2019-10-25 18:21:43 0 [Warning] Changed limits: max_open_files: 1024  max_connections: 151 (was 151)  table_cache: 421 (was 2000)
2019-10-25 18:21:43 0 [Note] WSREP: Loading provider /usr/lib/libgalera_smm_4.so initial position: 00000000-0000-0000-0000-000000000000:-1
wsrep loader: [INFO] wsrep_load(): loading provider library '/usr/lib/libgalera_smm_4.so'
wsrep loader: [INFO] wsrep_load(): Galera 26.4.3(r4535) by Codership Oy <info@codership.com> loaded successfully.
2019-10-25 18:21:43 0 [Note] WSREP: CRC-32C: using hardware acceleration.
2019-10-25 18:21:43 0 [Warning] WSREP: Could not open state file for reading: '/home/stepan/mariadb/10.4/git/mysql-test/var/mysqld.2/data//grastate.dat'
2019-10-25 18:21:43 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
2019-10-25 18:21:43 0 [Note] WSREP: GCache DEBUG: opened preamble:
Version: 2
UUID: 1ad5e5a2-f73b-11e9-b01c-0e518872a648
Seqno: -1 - -1
Offset: -1
Synced: 0
2019-10-25 18:21:43 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 1ad5e5a2-f73b-11e9-b01c-0e518872a648, offset: -1
2019-10-25 18:21:43 0 [Note] WSREP: GCache::RingBuffer initial scan...  0.0% (        0/134217752 bytes) complete.
2019-10-25 18:21:43 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (134217752/134217752 bytes) complete.
2019-10-25 18:21:43 0 [Note] WSREP: Recovering GCache ring buffer: found gapless sequence 2-32010
2019-10-25 18:21:43 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...  0.0% (      0/7677944 bytes) complete.
2019-10-25 18:21:43 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...100.0% (7677944/7677944 bytes) complete.
2019-10-25 18:21:43 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): found 2/32011 locked buffers
2019-10-25 18:21:43 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): used space: 7677944/134217728
2019-10-25 18:21:43 0 [Note] WSREP: Passing config to GCS: base_dir = /home/stepan/mariadb/10.4/git/mysql-test/var/mysqld.2/data/; base_host = 127.0.0.1; base_port = 16005; cert.log_conflicts = no; cert.optimistic_pa = yes; dbug = ; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT30S; evs.install_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT10S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.debug = 0; gcache.dir = /home/stepan/mariadb/10.4/git/mysql-test/var/mysqld.2/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 92233720368
2019-10-25 18:21:43 0 [Note] WSREP: Start replication
2019-10-25 18:21:43 0 [Note] WSREP: Connecting with bootstrap option: 0
2019-10-25 18:21:43 0 [Note] WSREP: Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2019-10-25 18:21:43 0 [Note] WSREP: protonet asio version 0
2019-10-25 18:21:43 0 [Note] WSREP: Using CRC-32C for message checksums.
2019-10-25 18:21:43 0 [Note] WSREP: backend: asio
2019-10-25 18:21:43 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 
2019-10-25 18:21:43 0 [Note] WSREP: restore pc from disk successfully
2019-10-25 18:21:43 0 [Note] WSREP: GMCast version 0
2019-10-25 18:21:43 0 [Note] WSREP: (9d3582a4, 'tcp://0.0.0.0:16005') listening at tcp://0.0.0.0:16005
2019-10-25 18:21:43 0 [Note] WSREP: (9d3582a4, 'tcp://0.0.0.0:16005') multicast: , ttl: 1
2019-10-25 18:21:43 0 [Note] WSREP: EVS version 1
2019-10-25 18:21:43 0 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '127.0.0.1:16002'
2019-10-25 18:21:43 0 [Note] WSREP: (9d3582a4, 'tcp://0.0.0.0:16005') connection established to 9ce019b5 tcp://127.0.0.1:16002
2019-10-25 18:21:43 0 [Note] WSREP: (9d3582a4, 'tcp://0.0.0.0:16005') turning message relay requesting on, nonlive peers: 
2019-10-25 18:21:43 0 [Note] WSREP: EVS version upgrade 0 -> 1
2019-10-25 18:21:43 0 [Note] WSREP: declaring 9ce019b5 at tcp://127.0.0.1:16002 stable
2019-10-25 18:21:43 0 [Note] WSREP: PC protocol upgrade 0 -> 1
2019-10-25 18:21:43 0 [Note] WSREP: Node 9ce019b5 state prim
2019-10-25 18:21:43 0 [Note] WSREP: view(view_id(PRIM,9ce019b5,8) memb {
	9ce019b5,0
	9d3582a4,0
} joined {
} left {
} partitioned {
})
2019-10-25 18:21:43 0 [Note] WSREP: save pc into disk
2019-10-25 18:21:43 0 [Note] WSREP: clear restored view
2019-10-25 18:21:44 0 [Note] WSREP: gcomm: connected
2019-10-25 18:21:44 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2019-10-25 18:21:44 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2019-10-25 18:21:44 0 [Note] WSREP: Opened channel 'my_wsrep_cluster'
2019-10-25 18:21:44 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2019-10-25 18:21:44 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2019-10-25 18:21:44 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 26cfd148-f73b-11e9-a3d0-ee9e3a656acb
2019-10-25 18:21:44 0 [Note] WSREP: STATE EXCHANGE: got state msg: 26cfd148-f73b-11e9-a3d0-ee9e3a656acb from 0 (cnt7glr11.localdomain)
2019-10-25 18:21:44 0 [Note] WSREP: STATE EXCHANGE: got state msg: 26cfd148-f73b-11e9-a3d0-ee9e3a656acb from 1 (cnt7glr11.localdomain)
2019-10-25 18:21:44 0 [Note] WSREP: Quorum results:
	version    = 6,
	component  = PRIMARY,
	conf_id    = 1,
	members    = 1/2 (joined/total),
	act_id     = 1,
	last_appl. = 0,
	protocols  = 2/10/4 (gcs/repl/appl),
	vote policy= 0,
	group UUID = 25ca339b-f73b-11e9-9669-cffc50bdea26
2019-10-25 18:21:44 0 [Note] WSREP: Flow-control interval: [23, 23]
2019-10-25 18:21:44 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 2)
2019-10-25 18:21:44 1 [Note] WSREP: Starting rollbacker thread 1
2019-10-25 18:21:44 2 [Note] WSREP: Starting applier thread 2
2019-10-25 18:21:44 2 [Note] WSREP: ####### processing CC 2, local, ordered
2019-10-25 18:21:44 2 [Note] WSREP: REPL Protocols: 10 (5, 3)
2019-10-25 18:21:44 2 [Note] WSREP: ####### My UUID: 9d3582a4-f73a-11e9-b44a-12b990b7901a
2019-10-25 18:21:44 2 [Note] WSREP: Server cnt7glr11.localdomain connected to cluster at position 25ca339b-f73b-11e9-9669-cffc50bdea26:2 with ID 9d3582a4-f73a-11e9-b44a-12b990b7901a
2019-10-25 18:21:44 2 [Note] WSREP: Server status change disconnected -> connected
2019-10-25 18:21:44 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2019-10-25 18:21:44 2 [Note] WSREP: State transfer required: 
	Group state: 25ca339b-f73b-11e9-9669-cffc50bdea26:2
	Local state: 00000000-0000-0000-0000-000000000000:-1
2019-10-25 18:21:44 2 [Note] WSREP: Server status change connected -> joiner
2019-10-25 18:21:44 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2019-10-25 18:21:44 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '127.0.0.2:16007' --datadir '/home/stepan/mariadb/10.4/git/mysql-test/var/mysqld.2/data/' --defaults-file '/home/stepan/mariadb/10.4/git/mysql-test/var/my.cnf' --defaults-group-suffix '.2' --parent '13178' --binlog-index 'mysqld-bin.index' --mysqld-args --defaults-group-suffix=.2 --defaults-file=/home/stepan/mariadb/10.4/git/mysql-test/var/my.cnf --log-output=file --innodb --innodb-cmpmem --innodb-cmp-per-index --innodb-trx --innodb-locks --innodb-lock-waits --innodb-metrics --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --innodb-sys-columns --innodb-sys-fields --innodb-sys-foreign --innodb-sys-foreign-cols --innodb-sys-indexes --innodb-sys-tables --innodb-sys-virtual --core-file --loose-debug-sync-timeout=300'
2019-10-25 18:21:44 2 [Note] WSREP: Prepared SST request: rsync|127.0.0.2:16007/rsync_sst
2019-10-25 18:21:44 2 [Note] WSREP: Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 10), state transfer needed: yes
2019-10-25 18:21:44 0 [Note] WSREP: Service thread queue flushed.
2019-10-25 18:21:44 2 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: 5
2019-10-25 18:21:44 2 [Note] WSREP: ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 2, STRv: 3
2019-10-25 18:21:44 2 [Note] WSREP: IST receiver addr using tcp://127.0.0.1:16006
2019-10-25 18:21:44 2 [Note] WSREP: Prepared IST receiver for 0-2, listening at: tcp://127.0.0.1:16006
2019-10-25 18:21:44 0 [Note] WSREP: Member 1.0 (cnt7glr11.localdomain) requested state transfer from '*any*'. Selected 0.0 (cnt7glr11.localdomain)(SYNCED) as donor.
2019-10-25 18:21:44 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 2)
2019-10-25 18:21:44 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2019-10-25 18:21:44 2 [Note] WSREP: Resetting GCache seqno map due to different histories.
2019-10-25 18:21:44 2 [Note] WSREP: GCache history reset: old(1ad5e5a2-f73b-11e9-b01c-0e518872a648:32010 -> 25ca339b-f73b-11e9-9669-cffc50bdea26:2
mysqld: gcache/src/gcache_rb_store.cpp:462: void gcache::RingBuffer::seqno_reset(): Assertion `(reinterpret_cast<BufferHeader*>(first_))->size > 0' failed.
191025 18:21:44 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.4.9-MariaDB-debug-log
key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_threads=153
thread_count=2
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 62594 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f77f4000b00
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f78185b0950 thread_stack 0x49000
mysys/stacktrace.c:269(my_print_stacktrace)[0x557553f61008]
sql/signal_handler.cc:209(handle_fatal_signal)[0x5575536cd5dd]
sigaction.c:0(__restore_rt)[0x7f78212fc5d0]
:0(__GI_raise)[0x7f781f5ea207]
:0(__GI_abort)[0x7f781f5eb8f8]
:0(__assert_fail_base)[0x7f781f5e3026]
:0(__GI___assert_fail)[0x7f781f5e30d2]
src/gcache_rb_store.cpp:462(gcache::RingBuffer::seqno_reset())[0x7f781916e1e5]
src/GCache_seqno.cpp:45(gcache::GCache::seqno_reset(gu::GTID const&))[0x7f7819167264]
src/replicator_str.cpp:824(galera::ReplicatorSMM::request_state_transfer(void*, wsrep_uuid const&, long, void const*, long))[0x7f78192874e5]
src/replicator_smm.cpp:2759(galera::ReplicatorSMM::process_conf_change(void*, gcs_action const&))[0x7f7819276d68]
src/gcs_action_source.cpp:154(galera::GcsActionSource::dispatch(void*, gcs_action const&, bool&))[0x7f7819256cc9]
src/gcs_action_source.cpp:188(galera::GcsActionSource::process(void*, bool&))[0x7f781925701d]
src/replicator_smm.cpp:391(galera::ReplicatorSMM::async_recv(void*))[0x7f7819275f24]
/home/stepan/mariadb/10.4/git/sql/mysqld(_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0x30)[0x55755401ce7a]
src/wsrep_provider_v26.cpp:727(wsrep::wsrep_provider_v26::run_applier(wsrep::high_priority_service*))[0x5575535ee787]
sql/wsrep_thd.cc:62(wsrep_replication_process(THD*, void*))[0x5575535dee70]
sql/wsrep_mysqld.cc:2710(start_wsrep_THD(void*))[0x557553eef4a5]
pthread_create.c:0(start_thread)[0x7f78212f4dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f781f6b1ead]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): 
Connection ID (thread ID): 2
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /home/stepan/mariadb/10.4/git/mysql-test/var/mysqld.2/data
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        unlimited            unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             4096                 23005                processes 
Max open files            1024                 1024                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       23005                23005                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
Core pattern: |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h
 
WSREP_SST: [ERROR] Parent mysqld process (PID:13178) terminated unexpectedly. (20191025 18:21:45.458)
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 13287 (20191025 18:21:45.460)
----------SERVER LOG END-------------
mysqltest failed but provided no output
The result from queries just before the failure was:
< snip >
connection node_2;
connection node_1;
CREATE TABLE t1 (f1 INTEGER PRIMARY KEY) ENGINE=InnoDB;
SET SESSION wsrep_trx_fragment_size = 1;
SET AUTOCOMMIT=OFF;
START TRANSACTION;
INSERT INTO t1 VALUES (1);
INSERT INTO t1 VALUES (2);
INSERT INTO t1 VALUES (3);
INSERT INTO t1 VALUES (4);
INSERT INTO t1 VALUES (5);
connection node_2;
Killing server ...
connection node_1;
Killing server ...
connection node_1;
# restart
connection node_2;
# restart
 
 
 
 - found 'core.13178' (0/5)
 
Trying 'dbx' to get a backtrace
 
Trying 'gdb' to get a backtrace from coredump /home/stepan/mariadb/10.4/git/mysql-test/var/log/galera_sr.galera_sr_kill_all_pcrecovery-innodb/mysqld.2/data/core.13178
Core generated by '/home/stepan/mariadb/10.4/git/sql/mysqld'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
[New LWP 13234]
[New LWP 13178]
[New LWP 13179]
[New LWP 13181]
[New LWP 13180]
[New LWP 13232]
[New LWP 13235]
[New LWP 13233]
[New LWP 13306]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/home/stepan/mariadb/10.4/git/sql/mysqld --defaults-group-suffix=.2 --defaults-'.
Program terminated with signal 6, Aborted.
#0  0x00007f78212f99d1 in pthread_kill () from /lib64/libpthread.so.0
#0  0x00007f78212f99d1 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000557553f610f6 in my_write_core (sig=6) at /home/stepan/mariadb/10.4/git/mysys/stacktrace.c:481
#2  0x00005575536cd964 in handle_fatal_signal (sig=6) at /home/stepan/mariadb/10.4/git/sql/signal_handler.cc:343
#3  <signal handler called>
#4  0x00007f781f5ea207 in raise () from /lib64/libc.so.6
#5  0x00007f781f5eb8f8 in abort () from /lib64/libc.so.6
#6  0x00007f781f5e3026 in __assert_fail_base () from /lib64/libc.so.6
#7  0x00007f781f5e30d2 in __assert_fail () from /lib64/libc.so.6
#8  0x00007f781916e1e5 in gcache::RingBuffer::seqno_reset (this=this@entry=0x55755747e3f8) at gcache/src/gcache_rb_store.cpp:462
#9  0x00007f7819167264 in gcache::GCache::seqno_reset (this=this@entry=0x55755747e2a8, gtid=...) at gcache/src/GCache_seqno.cpp:44
#10 0x00007f78192874e5 in galera::ReplicatorSMM::request_state_transfer (this=this@entry=0x55755747ded0, recv_ctx=recv_ctx@entry=0x7f78185b0480, group_uuid=..., cc_seqno=cc_seqno@entry=2, sst_req=0x7f77f4027180, sst_req_len=32) at galera/src/replicator_str.cpp:821
#11 0x00007f7819276d68 in galera::ReplicatorSMM::process_conf_change (this=0x55755747ded0, recv_ctx=0x7f78185b0480, cc=...) at galera/src/replicator_smm.cpp:2688
#12 0x00007f7819256cc9 in galera::GcsActionSource::dispatch (this=this@entry=0x5575576b60e0, recv_ctx=recv_ctx@entry=0x7f78185b0480, act=..., exit_loop=@0x7f78185b024e: false) at galera/src/gcs_action_source.cpp:124
#13 0x00007f781925701d in galera::GcsActionSource::process (this=0x5575576b60e0, recv_ctx=0x7f78185b0480, exit_loop=@0x7f78185b024e: false) at galera/src/gcs_action_source.cpp:182
#14 0x00007f7819275f24 in galera::ReplicatorSMM::async_recv (this=0x55755747ded0, recv_ctx=0x7f78185b0480) at galera/src/replicator_smm.cpp:391
#15 0x000055755401ce7a in wsrep::wsrep_provider_v26::run_applier (this=0x5575573167e0, applier_ctx=0x7f78185b0480) at /home/stepan/mariadb/10.4/git/wsrep-lib/src/wsrep_provider_v26.cpp:727
#16 0x00005575535ee787 in wsrep_replication_process (thd=0x7f77f4000b00, arg=0x5575576db900) at /home/stepan/mariadb/10.4/git/sql/wsrep_thd.cc:62
#17 0x00005575535dee70 in start_wsrep_THD (arg=0x5575576db900) at /home/stepan/mariadb/10.4/git/sql/wsrep_mysqld.cc:2708
#18 0x0000557553eef4a5 in pfs_spawn_thread (arg=0x5575576962d0) at /home/stepan/mariadb/10.4/git/storage/perfschema/pfs.cc:1862
#19 0x00007f78212f4dd5 in start_thread () from /lib64/libpthread.so.0
#20 0x00007f781f6b1ead in clone () from /lib64/libc.so.6
 
Thread 9 (Thread 0x7f7808b78700 (LWP 13306)):
#0  0x00007f78212f8965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f7819259aa7 in gu::Lock::wait (cond=..., this=0x7f7808b77c10) at galerautils/src/gu_lock.hpp:45
#2  0x00007f781925c9eb in galera::ist::Receiver::run (this=0x55755747e750) at galera/src/ist.cpp:422
#3  0x00007f781925d8e9 in run_receiver_thread (arg=<optimized out>) at galera/src/ist.cpp:156
#4  0x00007f78212f4dd5 in start_thread () from /lib64/libpthread.so.0
#5  0x00007f781f6b1ead in clone () from /lib64/libc.so.6
 
Thread 8 (Thread 0x7f78185fc700 (LWP 13233)):
#0  0x00007f78212f8965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000557553f66769 in safe_cond_wait (cond=0x7f7800015a68, mp=0x7f78000159b8, file=0x5575540e2a20 "/home/stepan/mariadb/10.4/git/include/mysql/psi/mysql_thread.h", line=1178) at /home/stepan/mariadb/10.4/git/mysys/thr_mutex.c:492
#2  0x00005575535ee1b4 in inline_mysql_cond_wait (that=0x7f7800015a68, mutex=0x7f78000159b8, src_file=0x5575540e2c30 "/home/stepan/mariadb/10.4/git/sql/wsrep_thd.h", src_line=69) at /home/stepan/mariadb/10.4/git/include/mysql/psi/mysql_thread.h:1178
#3  0x00005575535f1416 in Wsrep_thd_queue::pop_front (this=0x7f7800015960) at /home/stepan/mariadb/10.4/git/sql/wsrep_thd.h:69
#4  0x00005575535efe23 in wsrep_rollback_process (rollbacker=0x7f7800001d50, arg=0x5575576dcb00) at /home/stepan/mariadb/10.4/git/sql/wsrep_thd.cc:266
#5  0x00005575535dee70 in start_wsrep_THD (arg=0x5575576dcb00) at /home/stepan/mariadb/10.4/git/sql/wsrep_mysqld.cc:2708
#6  0x0000557553eef4a5 in pfs_spawn_thread (arg=0x557557695de0) at /home/stepan/mariadb/10.4/git/storage/perfschema/pfs.cc:1862
#7  0x00007f78212f4dd5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f781f6b1ead in clone () from /lib64/libc.so.6
 
Thread 7 (Thread 0x7f7809379700 (LWP 13235)):
#0  0x00007f781f6a2f8d in read () from /lib64/libc.so.6
#1  0x00007f781f62fb14 in __GI__IO_file_underflow () from /lib64/libc.so.6
#2  0x00007f781f630ce2 in __GI__IO_default_uflow () from /lib64/libc.so.6
#3  0x00007f781f623ac4 in __GI__IO_getline_info () from /lib64/libc.so.6
#4  0x00007f781f6228c6 in fgets () from /lib64/libc.so.6
#5  0x00005575535e28d4 in my_fgets (buf=0x7f7809378750 "ready 127.0.0.2:16007/rsync_sst", buf_len=512, stream=0x7f77f8001050) at /home/stepan/mariadb/10.4/git/sql/wsrep_sst.cc:301
#6  0x00005575535e3070 in sst_joiner_thread (a=0x7f78185ad4f0) at /home/stepan/mariadb/10.4/git/sql/wsrep_sst.cc:446
#7  0x0000557553eef4a5 in pfs_spawn_thread (arg=0x7f77f4024a30) at /home/stepan/mariadb/10.4/git/storage/perfschema/pfs.cc:1862
#8  0x00007f78212f4dd5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f781f6b1ead in clone () from /lib64/libc.so.6
 
Thread 6 (Thread 0x7f7809b7a700 (LWP 13232)):
#0  0x00007f78212f8965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f781922e4b7 in gu::Lock::wait (cond=..., this=0x7f7809b79850) at galerautils/src/gu_lock.hpp:45
#2  0x00007f7819230ffb in RecvBuf::front (this=this@entry=0x5575576ba778, timeout=...) at gcs/src/gcs_gcomm.cpp:124
#3  0x00007f781922eb96 in gcomm_recv (backend=<optimized out>, msg=0x55755767d990, timeout=9223372035999999999) at gcs/src/gcs_gcomm.cpp:673
#4  0x00007f7819222b2c in core_msg_recv (backend=backend@entry=0x55755767daf0, recv_msg=recv_msg@entry=0x55755767d990, timeout=timeout@entry=9223372035999999999) at gcs/src/gcs_core.cpp:480
#5  0x00007f7819225da4 in gcs_core_recv (conn=0x55755767d930, recv_act=recv_act@entry=0x7f7809b79d70, timeout=9223372035999999999) at gcs/src/gcs_core.cpp:1184
#6  0x00007f781922d028 in gcs_recv_thread (arg=0x55755767d6b0) at gcs/src/gcs.cpp:1398
#7  0x00007f78212f4dd5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f781f6b1ead in clone () from /lib64/libc.so.6
 
Thread 5 (Thread 0x7f7818dfd700 (LWP 13180)):
#0  0x00007f78212f8965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f7819253dd7 in gu::Lock::wait (cond=..., this=0x7f7818dfcbf0) at galerautils/src/gu_lock.hpp:45
#2  0x00007f7819253e69 in galera::ServiceThd::thd_func (arg=0x55755747e5e8) at galera/src/galera_service_thd.cpp:30
#3  0x00007f78212f4dd5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f781f6b1ead in clone () from /lib64/libc.so.6
 
Thread 4 (Thread 0x7f780a37b700 (LWP 13181)):
#0  0x00007f781f6b2483 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f78191f4548 in asio::detail::epoll_reactor::run (this=0x5575576baf10, block=block@entry=true, ops=...) at asio/asio/detail/impl/epoll_reactor.ipp:391
#2  0x00007f78191f8c40 in asio::detail::task_io_service::do_run_one (this=this@entry=0x5575576badb0, lock=..., this_thread=..., ec=...) at asio/asio/detail/impl/task_io_service.ipp:355
#3  0x00007f78191f8ef2 in asio::detail::task_io_service::run (this=0x5575576badb0, ec=...) at asio/asio/detail/impl/task_io_service.ipp:148
#4  0x00007f781921097d in asio::io_service::run (this=0x5575576b9af0) at asio/asio/impl/io_service.ipp:58
#5  0x00007f78192110d1 in gcomm::AsioProtonet::event_loop (this=0x5575576b9a50, period=...) at gcomm/src/asio_protonet.cpp:116
#6  0x00007f781922ff3d in GCommConn::run (this=0x5575576ba590) at gcs/src/gcs_gcomm.cpp:498
#7  0x00007f7819232fe9 in GCommConn::run_fn (arg=<optimized out>) at gcs/src/gcs_gcomm.cpp:209
#8  0x00007f78212f4dd5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f781f6b1ead in clone () from /lib64/libc.so.6
 
Thread 3 (Thread 0x7f7819d8d700 (LWP 13179)):
#0  0x00007f78212f8d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000557553f66a6b in safe_cond_timedwait (cond=0x557555595ce0 <COND_timer>, mp=0x557555595c20 <LOCK_timer>, abstime=0x7f7819d8cd70, file=0x557554416d80 "/home/stepan/mariadb/10.4/git/include/mysql/psi/mysql_thread.h", line=1204) at /home/stepan/mariadb/10.4/git/mysys/thr_mutex.c:546
#2  0x0000557553f67aae in inline_mysql_cond_timedwait (that=0x557555595ce0 <COND_timer>, mutex=0x557555595c20 <LOCK_timer>, abstime=0x7f7819d8cd70, src_file=0x557554416dc0 "/home/stepan/mariadb/10.4/git/mysys/thr_timer.c", src_line=292) at /home/stepan/mariadb/10.4/git/include/mysql/psi/mysql_thread.h:1204
#3  0x0000557553f68671 in timer_handler (arg=0x0) at /home/stepan/mariadb/10.4/git/mysys/thr_timer.c:292
#4  0x0000557553eef4a5 in pfs_spawn_thread (arg=0x55755744a250) at /home/stepan/mariadb/10.4/git/storage/perfschema/pfs.cc:1862
#5  0x00007f78212f4dd5 in start_thread () from /lib64/libpthread.so.0
#6  0x00007f781f6b1ead in clone () from /lib64/libc.so.6
 
Thread 2 (Thread 0x7f7821705880 (LWP 13178)):
#0  0x00007f78212f8965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000557553f66769 in safe_cond_wait (cond=0x557554d2e980 <COND_wsrep_server_state>, mp=0x557554d2e8c0 <LOCK_wsrep_server_state>, file=0x55755404ef88 "/home/stepan/mariadb/10.4/git/include/mysql/psi/mysql_thread.h", line=1178) at /home/stepan/mariadb/10.4/git/mysys/thr_mutex.c:492
#2  0x00005575532f000c in inline_mysql_cond_wait (that=0x557554d2e980 <COND_wsrep_server_state>, mutex=0x557554d2e8c0 <LOCK_wsrep_server_state>, src_file=0x55755404f3f0 "/home/stepan/mariadb/10.4/git/sql/wsrep_condition_variable.h", src_line=48) at /home/stepan/mariadb/10.4/git/include/mysql/psi/mysql_thread.h:1178
#3  0x0000557553306edb in Wsrep_condition_variable::wait (this=0x55755744aba0, lock=...) at /home/stepan/mariadb/10.4/git/sql/wsrep_condition_variable.h:48
#4  0x000055755400917b in wsrep::server_state::wait_until_state (this=0x55755744a940, lock=..., state=wsrep::server_state::s_initializing) at /home/stepan/mariadb/10.4/git/wsrep-lib/src/server_state.cpp:1365
#5  0x00005575535dfa01 in wsrep::server_state::wait_until_state (this=0x55755744a940, state=wsrep::server_state::s_initializing) at /home/stepan/mariadb/10.4/git/wsrep-lib/include/wsrep/server_state.hpp:364
#6  0x00005575535d684e in wsrep_init_startup (sst_first=true) at /home/stepan/mariadb/10.4/git/sql/wsrep_mysqld.cc:841
#7  0x000055755322fa44 in init_server_components () at /home/stepan/mariadb/10.4/git/sql/mysqld.cc:5135
#8  0x0000557553230f35 in mysqld_main (argc=163, argv=0x557557304130) at /home/stepan/mariadb/10.4/git/sql/mysqld.cc:5721
#9  0x0000557553225095 in main (argc=23, argv=0x7fffa54805c8) at /home/stepan/mariadb/10.4/git/sql/main.cc:25
 
Thread 1 (Thread 0x7f78185b1700 (LWP 13234)):
#0  0x00007f78212f99d1 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000557553f610f6 in my_write_core (sig=6) at /home/stepan/mariadb/10.4/git/mysys/stacktrace.c:481
#2  0x00005575536cd964 in handle_fatal_signal (sig=6) at /home/stepan/mariadb/10.4/git/sql/signal_handler.cc:343
#3  <signal handler called>
#4  0x00007f781f5ea207 in raise () from /lib64/libc.so.6
#5  0x00007f781f5eb8f8 in abort () from /lib64/libc.so.6
#6  0x00007f781f5e3026 in __assert_fail_base () from /lib64/libc.so.6
#7  0x00007f781f5e30d2 in __assert_fail () from /lib64/libc.so.6
#8  0x00007f781916e1e5 in gcache::RingBuffer::seqno_reset (this=this@entry=0x55755747e3f8) at gcache/src/gcache_rb_store.cpp:462
#9  0x00007f7819167264 in gcache::GCache::seqno_reset (this=this@entry=0x55755747e2a8, gtid=...) at gcache/src/GCache_seqno.cpp:44
#10 0x00007f78192874e5 in galera::ReplicatorSMM::request_state_transfer (this=this@entry=0x55755747ded0, recv_ctx=recv_ctx@entry=0x7f78185b0480, group_uuid=..., cc_seqno=cc_seqno@entry=2, sst_req=0x7f77f4027180, sst_req_len=32) at galera/src/replicator_str.cpp:821
#11 0x00007f7819276d68 in galera::ReplicatorSMM::process_conf_change (this=0x55755747ded0, recv_ctx=0x7f78185b0480, cc=...) at galera/src/replicator_smm.cpp:2688
#12 0x00007f7819256cc9 in galera::GcsActionSource::dispatch (this=this@entry=0x5575576b60e0, recv_ctx=recv_ctx@entry=0x7f78185b0480, act=..., exit_loop=@0x7f78185b024e: false) at galera/src/gcs_action_source.cpp:124
#13 0x00007f781925701d in galera::GcsActionSource::process (this=0x5575576b60e0, recv_ctx=0x7f78185b0480, exit_loop=@0x7f78185b024e: false) at galera/src/gcs_action_source.cpp:182
#14 0x00007f7819275f24 in galera::ReplicatorSMM::async_recv (this=0x55755747ded0, recv_ctx=0x7f78185b0480) at galera/src/replicator_smm.cpp:391
#15 0x000055755401ce7a in wsrep::wsrep_provider_v26::run_applier (this=0x5575573167e0, applier_ctx=0x7f78185b0480) at /home/stepan/mariadb/10.4/git/wsrep-lib/src/wsrep_provider_v26.cpp:727
#16 0x00005575535ee787 in wsrep_replication_process (thd=0x7f77f4000b00, arg=0x5575576db900) at /home/stepan/mariadb/10.4/git/sql/wsrep_thd.cc:62
#17 0x00005575535dee70 in start_wsrep_THD (arg=0x5575576db900) at /home/stepan/mariadb/10.4/git/sql/wsrep_mysqld.cc:2708
#18 0x0000557553eef4a5 in pfs_spawn_thread (arg=0x5575576962d0) at /home/stepan/mariadb/10.4/git/storage/perfschema/pfs.cc:1862
#19 0x00007f78212f4dd5 in start_thread () from /lib64/libpthread.so.0
#20 0x00007f781f6b1ead in clone () from /lib64/libc.so.6
 
Compressed file /home/stepan/mariadb/10.4/git/mysql-test/var/log/galera_sr.galera_sr_kill_all_pcrecovery-innodb/mysqld.2/data/core.13178
 - saving '/home/stepan/mariadb/10.4/git/mysql-test/var/log/galera_sr.galera_sr_kill_all_pcrecovery-innodb/' to '/home/stepan/mariadb/10.4/git/mysql-test/var/log/galera_sr.galera_sr_kill_all_pcrecovery-innodb/'
***Warnings generated in error logs during shutdown after running tests: galera_sr.galera_sr_kill_all_pcrecovery
 
mysqld: gcache/src/gcache_rb_store.cpp:462: void gcache::RingBuffer::seqno_reset(): Assertion `(reinterpret_cast<BufferHeader*>(first_))->size > 0' failed.
Attempting backtrace. You can use the following information to find out
WSREP_SST: [ERROR] Parent mysqld process (PID:13178) terminated unexpectedly. (20191025 18:21:45.458)

All logs: 191025_SSD_galera_sr_kill_all_pcrecovery.zip.



 Comments   
Comment by Stepan Patryshev (Inactive) [ 2019-11-21 ]

It is also reproduced on ES 10.4.

MariaDB Version 10.4.10-4-MariaDB-debug: repository MariaDBEnterprise; branch 10.4-enterprise; commit 25b895a61f020feba0342a36f688a7a99e97d9f8; debug built from sources.

Galera lib 26.4.4(r4702): repository es-galera; branch es-mariadb-4.x; commit 2a4742f4f9b713d61c99bfbe348d01804e35d698; debug built from sources.

Comment by Stepan Patryshev (Inactive) [ 2019-12-23 ]

It's still reproduced on MariaDB Version 10.4.12-MariaDB-debug (5e0c80c).

Comment by Stepan Patryshev (Inactive) [ 2020-01-22 ]

Decreased the priority since it failed sporadically only on local runs.

Comment by Stepan Patryshev (Inactive) [ 2020-02-10 ]

Closing since it does not fail on CI.

Generated at Thu Feb 08 09:03:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.