[MDEV-10434] segfault in mariadb Created: 2016-07-25  Updated: 2020-10-20  Resolved: 2020-10-20

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 5.5.50-galera
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Evan Jardine-Skinner Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: None
Environment:

Ubuntu 14.04 mariadb-5.5.50 and galera-3.25.16


Attachments: File mysql.log.gz    

 Description   

The following appeared in syslog. I am attaching the general log, which was running at the time of the failure.

Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: 160724 19:28:22 [ERROR] mysqld got signal 11 ;
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: This could be because you hit a bug. It is also possible that this binary
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: 
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: To report this bug, see http://kb.askmonty.org/en/reporting-bugs
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: 
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: We will try our best to scrape up some info that will hopefully help
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: diagnose the problem, but since we have already crashed, 
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: something is definitely wrong and this may fail.
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: 
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: Server version: 5.5.50-MariaDB-1~trusty-wsrep
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: key_buffer_size=134217728
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: read_buffer_size=2097152
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: max_used_connections=13
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: max_threads=502
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: thread_count=19
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: It is possible that mysqld could use up to 
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3224507 K  bytes of memory
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: Hope that's ok; if not, decrease some variables in the equation.
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: 
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: Thread pointer: 0x0x0
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: Attempting backtrace. You can use the following information to find out
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: where mysqld died. If you see no messages after this, something went
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: terribly wrong...
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: stack_bottom = 0x0 thread_stack 0x48000
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7fe8ea49688e]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x457)[0x7fe8ea07b4a7]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7fe8e8ac8340]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x926b66)[0x7fe8ea4b2b66]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x92511b)[0x7fe8ea4b111b]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x72a52a)[0x7fe8ea2b652a]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x8707f3)[0x7fe8ea3fc7f3]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x85d7c8)[0x7fe8ea3e97c8]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x7af2f5)[0x7fe8ea33b2f5]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x759962)[0x7fe8ea2e5962]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x877064)[0x7fe8ea403064]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x878847)[0x7fe8ea404847]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x86f965)[0x7fe8ea3fb965]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x77cbd9)[0x7fe8ea308bd9]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /usr/sbin/mysqld(+0x770e36)[0x7fe8ea2fce36]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7fe8e8ac0182]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fe8e81e347d]
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Jul 24 19:28:22 advancedportalnode-test-20160720-10-137-29-116 mysqld: information that should help you find out what is causing the crash.
Jul 24 19:28:23 advancedportalnode-test-20160720-10-137-29-116 mysqld_safe: Number of processes running now: 0
Jul 24 19:28:23 advancedportalnode-test-20160720-10-137-29-116 mysqld_safe: WSREP: not restarting wsrep node automatically
Jul 24 19:28:23 advancedportalnode-test-20160720-10-137-29-116 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended



 Comments   
Comment by Evan Jardine-Skinner [ 2016-07-25 ]

mysql.log.gz uploaded (only the last 200000 lines, due to the 10 MB upload limit).

The last query in there is nothing special, though, and does not normally cause the server to crash.

Comment by Elena Stepanova [ 2016-07-30 ]

Was it a one-time crash, or does it keep happening?

Unfortunately, there is literally nothing to work with – the stack trace is unreadable, and according to the general log (thanks for providing it), there were no queries at all at the time of the crash; all connections that had recently performed queries were already closed or closing. My best guess is that the crash is somehow related to the Galera node activity, but that is only speculation.

For the record, large files can be uploaded to ftp.askmonty.org/private (it is better to give them specific names, ideally mentioning the JIRA issue number).
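
A note on the unreadable frames: entries such as /usr/sbin/mysqld(+0x926b66) carry only a module-relative offset, which can sometimes be resolved offline with addr2line. The following is a minimal Python sketch, not a supported procedure. It assumes that debug symbols matching the exact 5.5.50 wsrep build are installed, and the helper names (unresolved_frames, addr2line_commands) are hypothetical. It only builds the command lines; it does not run them.

```python
import re

# Matches glibc backtrace frames that carry only a module-relative offset,
# e.g. "/usr/sbin/mysqld(+0x926b66)[0x7fe8ea4b2b66]".
# Frames that already name a symbol, e.g. "(handle_fatal_signal+0x457)", are skipped.
FRAME_RE = re.compile(
    r"(?P<module>/\S+?)\(\+(?P<offset>0x[0-9a-fA-F]+)\)\[0x[0-9a-fA-F]+\]"
)


def unresolved_frames(log_text):
    """Return (module, offset) pairs for frames that have no symbol name."""
    return [(m.group("module"), m.group("offset"))
            for m in FRAME_RE.finditer(log_text)]


def addr2line_commands(frames):
    """Group offsets by module and build one addr2line command per module."""
    by_module = {}
    for module, offset in frames:
        by_module.setdefault(module, []).append(offset)
    # -e selects the binary, -f prints function names, -C demangles C++ names.
    return ["addr2line -f -C -e {} {}".format(module, " ".join(offsets))
            for module, offsets in by_module.items()]


if __name__ == "__main__":
    # Sample lines copied from the description above (hostname prefix shortened).
    sample = (
        "mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x457)[0x7fe8ea07b4a7]\n"
        "mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7fe8e8ac8340]\n"
        "mysqld: /usr/sbin/mysqld(+0x926b66)[0x7fe8ea4b2b66]\n"
        "mysqld: /usr/sbin/mysqld(+0x92511b)[0x7fe8ea4b111b]\n"
    )
    for command in addr2line_commands(unresolved_frames(sample)):
        print(command)
```

Whether the resolved names are meaningful depends on the symbols matching the crashed binary exactly; with a mismatched or stripped build, addr2line typically prints "??".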

Comment by Evan Jardine-Skinner [ 2016-08-01 ]

I've seen segfault crashes every now and then (not all the time, but not a one-time thing either). The stack trace is always different. Since adding the code to restart a hung mysql I've not seen a segfault, though, so it's possible the two are related. The bug concerning the hung mysql, https://jira.mariadb.org/browse/MDEV-10400, is really the biggest problem for me at the moment.

Comment by Evan Jardine-Skinner [ 2016-08-01 ]

Closing this one because, as you say, it is difficult to make any progress if I can't reproduce it. Thanks for looking into it.

Comment by Evan Jardine-Skinner [ 2016-08-01 ]

Hmm, I must be being dumb, but I can't find a way to close this. Please feel free to close it.

Comment by Elena Stepanova [ 2016-08-01 ]

Ev, you said "the stack trace is always different" – does it mean that sometimes it looks more meaningful than the one in the description? If so, could you maybe paste it?

Comment by Evan Jardine-Skinner [ 2016-08-01 ]

The nodes have been re-installed since I saw more of the stack traces but I did find one interesting one:

Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: InnoDB: Error: trying to free a corrupt
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: InnoDB: table handle. Magic n 6107117756612768768, magic n2 5200480683369135530, table name 160729 9:30:18 [ERROR] mysqld got signal 11 ;
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: This could be because you hit a bug. It is also possible that this binary
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld:
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: To report this bug, see http://kb.askmonty.org/en/reporting-bugs
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld:
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: We will try our best to scrape up some info that will hopefully help
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: diagnose the problem, but since we have already crashed,
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: something is definitely wrong and this may fail.
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld:
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: Server version: 5.5.50-MariaDB-1~trusty-wsrep
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: key_buffer_size=134217728
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: read_buffer_size=2097152
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: max_used_connections=15
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: max_threads=502
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: thread_count=24
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: It is possible that mysqld could use up to
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3224507 K bytes of memory
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: Hope that's ok; if not, decrease some variables in the equation.
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld:
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: Thread pointer: 0x0x7fc0ed0f9000
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: Attempting backtrace. You can use the following information to find out
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: where mysqld died. If you see no messages after this, something went
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: terribly wrong...
Jul 29 09:30:18 advancedportalnode-test-20160725-10-137-105-48 mysqld: stack_bottom = 0x7fc1fc071dd0 thread_stack 0x48000
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7fc20194988e]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x457)[0x7fc20152e4a7]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7fc1fff7b340]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(+0x74dfd1)[0x7fc20178cfd1]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(+0x722c6d)[0x7fc201761c6d]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(_Z8closefrmP5TABLEb+0x38)[0x7fc20146bbd8]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(+0x33ae43)[0x7fc201379e43]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(_Z18close_thread_tableP3THDPP5TABLE+0x2f5)[0x7fc20137c3f5]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(_Z19close_thread_tablesP3THD+0x193)[0x7fc20137c5b3]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3ba)[0x7fc2013cd59a]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(+0x396ff3)[0x7fc2013d5ff3]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1f78)[0x7fc2013d8648]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x22f)[0x7fc2013d917f]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x33e)[0x7fc201498f7e]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /usr/sbin/mysqld(handle_one_connection+0x4a)[0x7fc20149906a]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7fc1fff73182]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc1ff69647d]
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld:
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: Trying to get some variables.
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: Some pointers may be invalid and cause the dump to abort.
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: Query (0x7fc109114018): is an invalid pointer
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: Connection ID (thread ID): 1589002
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: Status: NOT_KILLED
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld:
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld:
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld: information that should help you find out what is causing the crash.
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld_safe: Number of processes running now: 0
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld_safe: WSREP: not restarting wsrep node automatically
Jul 29 09:30:19 advancedportalnode-test-20160725-10-137-105-48 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended
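
The symbolized frames in this second trace use mangled C++ names (for example _Z8closefrmP5TABLEb). As a rough sketch, the Python below pulls the mangled names out of the log text and passes them through c++filt when that tool is on the PATH. The function names (mangled_symbols, demangle) are hypothetical, and the fallback of returning the names unchanged is an assumption made so that the sketch runs on hosts without binutils.

```python
import re
import shutil
import subprocess

# Itanium C++ ABI mangled names start with "_Z"; capture them from frames such as
# "/usr/sbin/mysqld(_Z8closefrmP5TABLEb+0x38)[0x7fc20146bbd8]".
MANGLED_RE = re.compile(r"\((_Z\w+)\+0x[0-9a-fA-F]+\)")


def mangled_symbols(log_text):
    """Return the mangled symbol names in the order the frames appear."""
    return MANGLED_RE.findall(log_text)


def demangle(symbols):
    """Demangle with c++filt if available; otherwise return the names unchanged."""
    if not symbols or shutil.which("c++filt") is None:
        return list(symbols)
    # With no arguments, c++filt reads names from stdin, one per line.
    result = subprocess.run(
        ["c++filt"],
        input="\n".join(symbols) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Sample lines copied from the trace above (hostname prefix shortened).
    sample = (
        "mysqld: /usr/sbin/mysqld(_Z8closefrmP5TABLEb+0x38)[0x7fc20146bbd8]\n"
        "mysqld: /usr/sbin/mysqld(+0x33ae43)[0x7fc201379e43]\n"
        "mysqld: /usr/sbin/mysqld(_Z19close_thread_tablesP3THD+0x193)[0x7fc20137c5b3]\n"
    )
    for name in demangle(mangled_symbols(sample)):
        print(name)
```

Read top to bottom, the named frames show the crash occurring while a connection thread was closing its tables at the end of a statement, which is consistent with the InnoDB "trying to free a corrupt table handle" message that precedes the signal.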

Comment by Elena Stepanova [ 2020-10-20 ]

We have never been able to reproduce it, the stack trace is not distinctive enough to match it with other open or fixed bugs, and 5.5-galera is long gone now; so I'm closing it as incomplete. The issue can be re-opened if it gets fresh information related to active versions.

Generated at Thu Feb 08 07:42:12 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.