Details
Bug
Status: Closed
Major
Resolution: Fixed
10.3.22, 10.3.24, 10.2(EOL)
None
Ubuntu 16.04
Description
Consider this .cnf file:
openxs@ao756:~/dbs/maria10.3$ cat /home/openxs/galera/mynode1.cnf
[mysqld]
datadir=/home/openxs/galera/node1
port=3306
socket=/tmp/mysql-node1.sock
pid-file=/tmp/mysql-node1.pid
log-error=/tmp/mysql-node1.err
binlog_format=ROW
innodb_autoinc_lock_mode=2

wsrep_on=ON
wsrep_provider=/home/openxs/galera/libgalera_smm.so
wsrep_log_conflicts=ON
wsrep_provider_options="cert.log_conflicts=YES"
wsrep_cluster_name = singlebox
wsrep_node_name = node1
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020,127.0.0.1:6020?pc.wait_prim=no

log_bin
log_slave_updates=1
binlog_row_image=FULL
gtid_domain_id=4100
wsrep_gtid_domain_id=4100
wsrep_gtid_mode=ON
wsrep_restart_slave=1
server_id=2
slave-skip-errors=1062
innodb_flush_log_at_trx_commit=2
sync_binlog=0
openxs@ao756:~/dbs/maria10.3$
and a server started with it, but with wsrep_on overridden to OFF on the command line:
openxs@ao756:~/dbs/maria10.3$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode1.cnf --wsrep_on=OFF &
[2] 20161
openxs@ao756:~/dbs/maria10.3$ 200701 13:03:32 mysqld_safe Logging to '/tmp/mysql-node1.err'.
200701 13:03:32 mysqld_safe Starting mysqld daemon with databases from /home/openxs/galera/node1

openxs@ao756:~/dbs/maria10.3$ bin/mysql -uroot --socket=/tmp/mysql-node1.sock
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.3.24-MariaDB-log Source distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show global status like 'wsrep_cluster%';
+--------------------------+----------------------+
| Variable_name            | Value                |
+--------------------------+----------------------+
| wsrep_cluster_conf_id    | 18446744073709551615 |
| wsrep_cluster_size       | 0                    |
| wsrep_cluster_state_uuid |                      |
| wsrep_cluster_status     | Disconnected         |
+--------------------------+----------------------+
4 rows in set (0,002 sec)

MariaDB [(none)]> set global wsrep_on = ON;
Query OK, 0 rows affected (0,000 sec)

MariaDB [(none)]> select 1;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0,000 sec)

MariaDB [(none)]> \r
Connection id: 10
Current database: *** NONE ***

MariaDB [(none)]> select 1;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/tmp/mysql-node1.sock' (111)
ERROR: Can't connect to the server

unknown [(none)]> exit
Bye
As we can see, the server crashes after we set wsrep_on to ON dynamically and then reconnect so that a new connection thread picks up the new value. In the error log we see:
openxs@ao756:~/dbs/maria10.3$ tail -70 /tmp/mysql-node1.err
2020-07-01 13:03:33 0 [Note] Added new Master_info '' to hash table
2020-07-01 13:03:33 0 [Note] /home/openxs/dbs/maria10.3/bin/mysqld: ready for connections.
Version: '10.3.24-MariaDB-log' socket: '/tmp/mysql-node1.sock' port: 3306 Source distribution
200701 13:03:55 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.3.24-MariaDB-log
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=2
max_threads=153
thread_count=8
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467423 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fc678001098
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fc6c5949e98 thread_stack 0x49000
/home/openxs/dbs/maria10.3/bin/mysqld(my_print_stacktrace+0x29)[0x55760af4cd29]
mysys/stacktrace.c:270(my_print_stacktrace)[0x55760aa9c187]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fc6dd892390]
sql/sql_connect.cc:1184(end_connection(THD*))[0x55760a9a3d59]
sql/sql_connect.cc:1409(do_handle_one_connection(CONNECT*))[0x55760a9a45ca]
sql/sql_connect.cc:1310(handle_one_connection)[0x55760a9a4764]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fc6dd8886ba]
x86_64/clone.S:111(clone)[0x7fc6dcd1d41d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0):
Connection ID (thread ID): 9
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /home/openxs/galera/node1
Resource Limits:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             14895                14895                processes
Max open files            32184                32184                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       14895                14895                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Core pattern: |/usr/share/apport/apport %p %s %c %d %P %E

openxs@ao756:~/dbs/maria10.3$
So the server dies with signal 11 once we run any SQL after reconnecting; the backtrace points at end_connection() in sql/sql_connect.cc, for connection ID 9. To summarize:
1. We have to prevent the crash in this case.
2. We should document that setting wsrep_on dynamically makes sense only if the server was started with wsrep_on=ON.
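
Until the crash is fixed, a client-side guard along the following lines might help avoid issuing the dynamic SET against a server that was started with wsrep_on=OFF. This is only a sketch in Python: the helper name looks_started_without_wsrep is hypothetical, and the heuristic itself is an assumption drawn from the single session in this report (cluster size 0, empty state UUID, and conf id 18446744073709551615, which is -1 stored as an unsigned 64-bit integer). It is not documented server behavior, and it may also refuse on a node that was started with wsrep_on=ON but has not joined a cluster yet.

```python
# -1 stored as an unsigned 64-bit integer, as seen in the report's
# wsrep_cluster_conf_id value.
UNSET_CONF_ID = 2**64 - 1  # 18446744073709551615


def looks_started_without_wsrep(status):
    """Heuristic only (an assumption based on this one report).

    `status` maps status variable names to their string values, as
    returned by SHOW GLOBAL STATUS LIKE 'wsrep_cluster%'.
    """
    return (
        int(status.get("wsrep_cluster_conf_id", UNSET_CONF_ID)) == UNSET_CONF_ID
        and int(status.get("wsrep_cluster_size", 0)) == 0
        and status.get("wsrep_cluster_state_uuid", "") == ""
    )


# Values copied from the SHOW GLOBAL STATUS output in this report.
reported = {
    "wsrep_cluster_conf_id": "18446744073709551615",
    "wsrep_cluster_size": "0",
    "wsrep_cluster_state_uuid": "",
    "wsrep_cluster_status": "Disconnected",
}

if looks_started_without_wsrep(reported):
    print("not running SET GLOBAL wsrep_on = ON; restart with wsrep_on=ON instead")
```

A caller would fetch the status rows over its own connection, build the dictionary, and skip the SET statement when the helper returns True.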