[MDEV-16527] Spider crash in background thread Created: 2018-06-19  Updated: 2020-08-25  Resolved: 2019-03-03

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: 10.3
Fix Version/s: 10.3.7

Type: Bug Priority: Major
Reporter: Mattias Jonsson Assignee: Kentoku Shiba (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment: Linux

 Description   

During testing (only a simple PK query on a large Spider table partitioned across 4 data nodes, plus some monitoring queries such as: show table status from `db` like 't1'), the server crashed in a Spider background thread:

Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/local/mysql/bin/mysqld --defaults-file=/etc/my.cnf --user=mysql --core-fil'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fbe9a044a01 in pthread_kill () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install glibc-2.17-222.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libgcc-4.8.5-28.el7.x86_64 libstdc++-4.8.5-28.el7.x86_64 nss-softokn-freebl-3.34.0-2.el7.x86_64 sssd-client-1.16.0-19.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64
(gdb) bt
#0  0x00007fbe9a044a01 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00007fbe9abfbfc2 in handle_fatal_signal (sig=11)
    at /MariaDB/server/sql/signal_handler.cc:305
#2  <signal handler called>
#3  movelink (newlink=1, next_link=4294967295, find=0, array=0x7fbe9c55b068)
    at /MariaDB/server/mysys/hash.c:337
#4  my_hash_insert (info=0x7fbe9c553710, record=0x7fbdc4011598 "\001")
    at /MariaDB/server/mysys/hash.c:517
#5  0x00007fbe21727399 in spider_get_conn (share=share@entry=0x7fbcd407ba28, link_idx=<optimized out>, 
    link_idx@entry=0, conn_key=<optimized out>, trx=0x7fbe9c553648, spider=spider@entry=0x7fbcd4085b40, 
    another=another@entry=false, thd_chg=thd_chg@entry=false, conn_kind=conn_kind@entry=1, 
    error_num=error_num@entry=0x7fbde77fde60) at /MariaDB/server/storage/spider/spd_conn.cc:1126
#6  0x00007fbe2173c5c8 in spider_table_bg_crd_action (arg=arg@entry=0x7fbe9c54fb78)
    at /MariaDB/server/storage/spider/spd_table.cc:10169
#7  0x00007fbe9b12360d in pfs_spawn_thread (arg=0x7fbe9c550ae8)
    at /MariaDB/server/storage/perfschema/pfs.cc:1862
#8  0x00007fbe9a03fe25 in start_thread () from /lib64/libpthread.so.0
#9  0x00007fbe98efcbad in clone () from /lib64/libc.so.6
(gdb) up
#1  0x00007fbe9abfbfc2 in handle_fatal_signal (sig=11)
    at /MariaDB/server/sql/signal_handler.cc:305
warning: Source file is more recent than executable.
305	    my_write_core(sig);
(gdb) up
#2  <signal handler called>
(gdb) up
#3  movelink (newlink=1, next_link=4294967295, find=0, array=0x7fbe9c55b068)
    at /MariaDB/server/mysys/hash.c:337
warning: Source file is more recent than executable.
337	  while ((next_link=old_link->next) != find);
(gdb) p old_link
$1 = (HASH_LINK *) 0x7fce9c55b058
(gdb) p find
$2 = 0
(gdb) p next_link
$3 = 4294967295
(gdb) p old_link->next
Cannot access memory at address 0x7fce9c55b058
(gdb) 
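
The unreadable old_link address is consistent with movelink() indexing the link array by the next_link value 4294967295. As a rough check, the C sketch below reproduces that arithmetic. It assumes that sizeof(HASH_LINK) is 16 bytes on this x86_64 build and that 4294967295 is the hash's "no next record" sentinel; neither is confirmed by the trace. The names link_address, ASSUMED_HASH_LINK_SIZE and ASSUMED_NO_RECORD are illustrative only and do not come from the server source.

```c
#include <stdint.h>

/* Assumption: a HASH_LINK occupies 16 bytes on x86_64 (index and hash fields
   plus an 8-byte data pointer). Not taken from the trace. */
#define ASSUMED_HASH_LINK_SIZE 16u

/* Assumption: (uint) -1 marks "no next record"; it equals the next_link
   value that gdb printed (4294967295). */
#define ASSUMED_NO_RECORD 0xFFFFFFFFu

/* Address of element idx in a HASH_LINK array that starts at array,
   i.e. the pointer arithmetic behind "array + idx". */
static uint64_t link_address(uint64_t array, uint32_t idx)
{
  return array + (uint64_t) idx * ASSUMED_HASH_LINK_SIZE;
}
```

Under these assumptions, link_address(0x7fbe9c55b068, ASSUMED_NO_RECORD) evaluates to 0x7fce9c55b058, which is exactly the old_link value that gdb could not dereference. That would suggest the chain walk stepped onto the sentinel without ever meeting find, as could happen if the chain was modified while it was being walked.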

The mysqld.log contains further hints:

20180619 18:16:08 [SEND SPIDER SQL] from 2 to [spider-datanode3.example.com] 6120658:  sql: show table status from `db` like 't3'
180619 18:16:08 20180619 18:16:08 [SEND SPIDER SQL] from 9 to [spider-datanode2.example.com] 5459221:  sql: show table status from `db` like 't3'
[ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.3.7-MariaDB-log
key_buffer_size=805306368
read_buffer_size=131072
max_used_connections=9
20180619 18:16:08 [SEND SPIDER SQL] from 7 to [spider-datanode1.example.com] 3978791:  sql: show table status from `db` like 't3'
max_threads=3002
thread_count=36
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4311648 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7fbdc40009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fbde77fdd90 thread_stack 0x49000
20180619 18:16:08 [SEND SPIDER SQL] from 7 to [spider-datanode4.example.com] 23775456:  sql: show table status from `db` like 't3'
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x2e)[0x7fbe9b1700ee]
mysys/stacktrace.c:270(my_print_stacktrace)[0x7fbe9abfbf07]
sigaction.c:0(__restore_rt)[0x7fbe9a0476d0]
/usr/local/mysql/bin/mysqld(my_hash_insert+0x151)[0x7fbe9b14d4e1]
spider/spd_conn.cc:1126(spider_get_conn(st_spider_share*, int, char*, st_spider_transaction*, ha_spider*, bool, bool, unsigned int, int*))[0x7fbe21727399]
spider/spd_table.cc:10170(spider_table_bg_crd_action(void*))[0x7fbe2173c5c8]
/usr/local/mysql/bin/mysqld(+0xcab60d)[0x7fbe9b12360d]
pthread_create.c:0(start_thread)[0x7fbe9a03fe25]
/lib64/libc.so.6(clone+0x6d)[0x7fbe98efcbad]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): 
Connection ID (thread ID): 18
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file at /mysql/db/data/

(Also note two minor bugs in the crash message: it points to dev.mysql.com, which seems wrong when a MariaDB server crashes, and 'Writing a core file at /mysql/db/data/' is wrong, since it does not match the actual configuration.)



 Comments   
Comment by Kentoku Shiba (Inactive) [ 2019-03-03 ]

This looks like a hash being accessed from multiple threads at the same time, which was fixed in MDEV-12900.
If the issue can still be reproduced, please reopen it.
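
For readers unfamiliar with this failure class, here is a minimal sketch of the general remedy: every access to a shared table is serialized behind one mutex. It is not the MDEV-12900 patch and does not use Spider or mysys types. toy_conn_table and its functions are hypothetical names invented for illustration, and a fixed-size array stands in for the hash.

```c
#include <pthread.h>
#include <stddef.h>

#define TOY_CAPACITY 64

/* Hypothetical stand-in for a shared connection table. */
typedef struct toy_conn_table
{
  pthread_mutex_t lock;               /* serializes every access to the table */
  const void *records[TOY_CAPACITY];  /* stored record pointers */
  size_t count;                       /* number of used slots */
} toy_conn_table;

/* Returns 0 on success, otherwise the pthread_mutex_init() error code. */
static int toy_table_init(toy_conn_table *t)
{
  t->count = 0;
  return pthread_mutex_init(&t->lock, NULL);
}

/* Inserts a record while holding the lock. Returns 0 on success, 1 if full. */
static int toy_table_insert(toy_conn_table *t, const void *record)
{
  int full;
  pthread_mutex_lock(&t->lock);
  full = (t->count == TOY_CAPACITY);
  if (!full)
    t->records[t->count++] = record;
  pthread_mutex_unlock(&t->lock);
  return full;
}

/* Reads the record count while holding the same lock as the writers. */
static size_t toy_table_count(toy_conn_table *t)
{
  size_t n;
  pthread_mutex_lock(&t->lock);
  n = t->count;
  pthread_mutex_unlock(&t->lock);
  return n;
}
```

The point of the design is that readers and writers take the same lock, so no thread can observe the table halfway through another thread's update. Without that, a concurrent insert can leave a chain in a state where a walk never terminates cleanly, which is the kind of symptom seen in the backtrace above.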
