Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
10.6.20
-
None
-
Operating System: Ubuntu Jammy running on a VM.
RAM: 32 GB.
Galera 26.4.12 (Codership Galera, not the MariaDB fork of Galera).
Description
We have a Galera cluster consisting of 3 MariaDB nodes. This month, the segmentation fault described in this report has occurred twice.
The first occurrence
The cluster had 3 nodes, but one of them (mariadb/2) was in a broken state and kept trying to join the cluster, to no avail.
During one of these attempts, mariadb/0 was executing a complex SELECT statement when the process crashed with signal 11. The crash happened at the same moment that Galera was updating the cluster membership to forget the IP address of mariadb/2.
Logs from mariadb/0:
2025-04-08 15:40:54 0 [Note] WSREP: declaring mariadb/1 at ssl://xxx.xxx.xx.26:4567 stable
2025-04-08 15:40:54 0 [Note] WSREP: forgetting mariadb/2 (ssl://xxx.xxx.xx.27:4567)
250408 15:40:54 [ERROR] mysqld got signal 11 ;
Sorry, we probably made a mistake, and this is a bug.

Your assistance in bug reporting will enable us to fix this for the next release.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.6.20-MariaDB-log source revision: f00711bba2cd383825d0be1867f7d7d7f641c9e4
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=111
max_threads=1502
thread_count=114
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3439059 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Stack trace:
mysys/stacktrace.c:216(my_print_stacktrace)[0x55cf7f6152fe]
sql/signal_handler.cc:247(handle_fatal_signal)[0x55cf7f03cfe7]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f0d07c1c520]
/lib/x86_64-linux-gnu/libc.so.6(+0x1a0741)[0x7f0d07d7a741]
bits/string3.h:51(memcpy)[0x55cf7f25dddc]
maria/ma_blockrec.c:3577(allocate_and_write_block_record)[0x55cf7f25fe73]
maria/ma_write.c:157(maria_write)[0x55cf7f26bb84]
sql/sql_class.h:7707(handler::ha_write_tmp_row(unsigned char*))[0x55cf7ee75a9f]
sql/sql_select.cc:23940(end_write(JOIN*, st_join_table*, bool))[0x55cf7ee6ae47]
sql/sql_class.h:4563(THD::get_stmt_da())[0x55cf7ee3b81b]
sql/sql_select.cc:22392(sub_select(JOIN*, st_join_table*, bool))[0x55cf7ee41ae7]
sql/sql_class.h:4563(THD::get_stmt_da())[0x55cf7ee3b81b]
sql/sql_select.cc:22392(sub_select(JOIN*, st_join_table*, bool))[0x55cf7ee41ae7]
sql/sql_class.h:4563(THD::get_stmt_da())[0x55cf7ee3b81b]
sql/sql_select.cc:22392(sub_select(JOIN*, st_join_table*, bool))[0x55cf7ee41ae7]
sql/sql_class.h:4563(THD::get_stmt_da())[0x55cf7ee3b81b]
sql/sql_select.cc:22392(sub_select(JOIN*, st_join_table*, bool))[0x55cf7ee41ae7]
sql/sql_class.h:4563(THD::get_stmt_da())[0x55cf7ee3b81b]
sql/sql_select.cc:22392(sub_select(JOIN*, st_join_table*, bool))[0x55cf7ee41ae7]
sql/sql_class.h:4563(THD::get_stmt_da())[0x55cf7ee3b81b]
sql/sql_select.cc:22392(sub_select(JOIN*, st_join_table*, bool))[0x55cf7ee41b45]
sql/sql_select.cc:21908(JOIN::exec_inner())[0x55cf7ee73336]
sql/sql_select.cc:4715(JOIN::exec())[0x55cf7ee73679]
sql/sql_select.cc:5195(mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x55cf7ee718c6]
sql/sql_select.cc:585(handle_select(THD*, LEX*, select_result*, unsigned long))[0x55cf7ee72124]
sql/sql_parse.cc:6408(execute_sqlcom_select(THD*, TABLE_LIST*))[0x55cf7ecbfa00]
sql/sql_parse.cc:3999(mysql_execute_command(THD*, bool))[0x55cf7ee121b1]
sql/sql_parse.cc:8195(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x55cf7ee1468b]
sql/sql_class.h:4563(THD::get_stmt_da())[0x55cf7ee14e21]
sql/sql_parse.cc:1895(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool))[0x55cf7ee1740a]
sql/sql_parse.cc:1423(do_command(THD*, bool))[0x55cf7ee1804e]
sql/sql_connect.cc:1407(do_handle_one_connection(CONNECT*, bool))[0x55cf7ef12a2f]
sql/sql_connect.cc:1325(handle_one_connection)[0x55cf7ef12cc4]
perfschema/pfs.cc:2204(pfs_spawn_thread)[0x55cf7f2b686c]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x7f0d07c6eac3]
/lib/x86_64-linux-gnu/libc.so.6(+0x126850)[0x7f0d07d00850]
Query (0x7f06e4010c00): select <redacted>
The second occurrence
This time the cluster was healthy, and we triggered a rolling restart, taking the nodes down one at a time so that no downtime occurs.
- mariadb/0 was restarted successfully.
- When mariadb/1 shut down for its restart, mariadb/2 crashed with the same error while running the same SELECT query as in the first occurrence. At the same moment, Galera was forgetting the IP address of the node that had just left.
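The coincidence described in both occurrences (a WSREP "forgetting" message in the same second as the signal 11) can be checked mechanically across a longer error log. The following is a minimal sketch, assuming only the two timestamp formats visible in the log excerpts of this report; the function name `crashes_near_forget` and the one-second window are illustrative choices, not part of any MariaDB or Galera tooling.

```python
import re
from datetime import datetime

# Two timestamp styles appear in the excerpts of this report:
#   "2025-04-08 15:40:54 0 [Note] WSREP: forgetting mariadb/2 (...)"
#   "250408 15:40:54 [ERROR] mysqld got signal 11 ;"
FORGET_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2})\s+(\d{1,2}:\d{2}:\d{2}) \d+ \[Note\] WSREP: forgetting (\S+)"
)
CRASH_RE = re.compile(
    r"^(\d{6})\s+(\d{1,2}:\d{2}:\d{2}) \[ERROR\] mysqld got signal (\d+)"
)


def crashes_near_forget(lines, window_seconds=1):
    """Return (crash_time, signal, forgotten_node) for each crash that is
    logged within window_seconds of an earlier 'forgetting' message."""
    forgets = []  # (timestamp, node) for every 'forgetting' line seen so far
    hits = []
    for line in lines:
        m = FORGET_RE.match(line)
        if m:
            ts = datetime.strptime(f"{m.group(1)} {m.group(2)}", "%Y-%m-%d %H:%M:%S")
            forgets.append((ts, m.group(3)))
            continue
        m = CRASH_RE.match(line)
        if m:
            ts = datetime.strptime(f"{m.group(1)} {m.group(2)}", "%y%m%d %H:%M:%S")
            for forget_ts, node in forgets:
                if abs((ts - forget_ts).total_seconds()) <= window_seconds:
                    hits.append((ts, int(m.group(3)), node))
    return hits
```

Feeding it the lines of an error log (for example `crashes_near_forget(open(path))`, where `path` is wherever the node writes its log) lists every crash that coincided with a membership change.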
Logs from mariadb/2:
2025-04-23 9:05:23 0 [Note] WSREP: declaring mariadb/0 at ssl://xxx.xxx.xx.28:4567 stable
2025-04-23 9:05:23 0 [Note] WSREP: forgetting mariadb/1 (ssl://xxx.xxx.xx.26:4567)
250423 9:05:23 [ERROR] mysqld got signal 11 ;
Sorry, we probably made a mistake, and this is a bug.
Stack trace:
mysys/stacktrace.c:216(my_print_stacktrace)[0x55d093c152fe]
sql/signal_handler.cc:247(handle_fatal_signal)[0x55d09363cfe7]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f257c783520]
/lib/x86_64-linux-gnu/libc.so.6(+0x1a0741)[0x7f257c8e1741]
bits/string3.h:51(memcpy)[0x55d093861d78]
maria/ma_blockrec.c:5513(_ma_scan_block_record)[0x55d09386284a]
sql/handler.cc:3532(handler::ha_rnd_next(unsigned char*))[0x55d093643a07]
sql/filesort.cc:914(filesort(THD*, TABLE*, Filesort*, Filesort_tracker*, JOIN*, unsigned long long))[0x55d09363b83b]
sql/sql_select.cc:25929(create_sort_index(THD*, JOIN*, st_join_table*, Filesort*))[0x55d09344c613]
sql/sql_select.cc:23436(st_join_table::sort_table())[0x55d09344c93e]
sql/sql_select.cc:23373(join_init_read_record(st_join_table*))[0x55d09344ca00]
sql/sql_select.cc:31573(AGGR_OP::end_send())[0x55d0934529f3]
sql/sql_select.cc:22073(sub_select_postjoin_aggr(JOIN*, st_join_table*, bool))[0x55d093452bb1]
sql/sql_select.cc:21909(JOIN::exec_inner())[0x55d093473231]
sql/sql_select.cc:4715(JOIN::exec())[0x55d093473679]
sql/sql_select.cc:5195(mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x55d0934718c6]
sql/sql_select.cc:585(handle_select(THD*, LEX*, select_result*, unsigned long))[0x55d093472124]
sql/sql_parse.cc:6408(execute_sqlcom_select(THD*, TABLE_LIST*))[0x55d0932bfa00]
sql/sql_parse.cc:3999(mysql_execute_command(THD*, bool))[0x55d0934121b1]
sql/sql_parse.cc:8195(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x55d09341468b]
sql/sql_class.h:4563(THD::get_stmt_da())[0x55d093414e21]
sql/sql_parse.cc:1895(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool))[0x55d09341740a]
sql/sql_parse.cc:1423(do_command(THD*, bool))[0x55d09341804e]
sql/sql_connect.cc:1407(do_handle_one_connection(CONNECT*, bool))[0x55d093512a2f]
sql/sql_connect.cc:1325(handle_one_connection)[0x55d093512cc4]
perfschema/pfs.cc:2204(pfs_spawn_thread)[0x55d0938b686c]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x7f257c7d5ac3]
/lib/x86_64-linux-gnu/libc.so.6(+0x126850)[0x7f257c867850]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f2110030040): select
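Both crashes happen in a memcpy called from the Aria block-record code (maria/ma_blockrec.c) while running the same SELECT, and the circumstances are the same, so we believe it is the same bug. The immediate callers differ: allocate_and_write_block_record in the first trace and _ma_scan_block_record in the second.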
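To compare the two traces frame by frame rather than by eye, the resolved frames can be extracted and matched from the crash site outward. This is a minimal sketch, assuming the `file:line(function)[address]` frame format shown in this report; `parse_frames` and `common_prefix` are hypothetical helper names, not part of any MariaDB tooling. Unresolved libc frames (those without a `file:line` prefix) are skipped, and load addresses are ignored because they differ between processes.

```python
import re

# Matches resolved frames such as
#   "maria/ma_write.c:157(maria_write)[0x55cf7f26bb84]"
# and skips unresolved ones such as
#   "/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f0d07c1c520]".
FRAME_RE = re.compile(
    r"^(?P<file>[^\s:()]+):(?P<line>\d+)\((?P<func>.+)\)\[0x[0-9a-f]+\]$"
)


def parse_frames(trace_text):
    """Return (source_file, function_name) pairs for every resolved frame,
    in the order they appear (crash site first)."""
    frames = []
    for raw in trace_text.splitlines():
        m = FRAME_RE.match(raw.strip())
        if m:
            # Drop the argument list: "f(THD*, bool)" becomes "f".
            func = m.group("func").split("(", 1)[0]
            frames.append((m.group("file"), func))
    return frames


def common_prefix(frames_a, frames_b):
    """Return the frames both traces share, starting at the crash site."""
    shared = []
    for a, b in zip(frames_a, frames_b):
        if a != b:
            break
        shared.append(a)
    return shared
```

Passing the text of each trace through `parse_frames` and the two results to `common_prefix` gives the frames shared from the crash site outward; the first pair after that prefix is where the call paths diverge.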
*Important note*: We run upstream Galera from Codership, not the MariaDB fork of Galera.
Issue Links
- relates to MDEV-28589 Mariadb signal11 crash and restart after select (Open)