|
Maxscale log just before crash happen :-
|
2022-06-21 23:19:32.499 warning: [mariadbmon] (monitor_server): Error during monitor update of server 'HOSTNAME_XXX': Query 'SHOW ALL SLAVES STATUS;' failed: 'Lost connection to server during query'.
|
2022-06-21 23:19:32.622 warning: [mariadbmon] (operator()): Lost the lock 'maxscale_mariadbmonitor' on server 'HOSTNAME_XXX' without releasing it. The lock is now owned by connection 561932.
|
2022-06-21 23:19:57.448 warning: (642070) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 30 seconds, which is above the configured limit 3s. 'HOSTNAME_XXX' is excluded from query routing.
|
2022-06-21 23:20:03.308 warning: (642076) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 0 seconds, which is below the configured limit 3s. 'HOSTNAME_XXX' is returned to query routing.
|
2022-06-21 23:30:18.195 warning: [mariadbmon] (monitor_server): Error during monitor update of server 'HOSTNAME_XXX': Query 'SHOW ALL SLAVES STATUS;' failed: 'Lost connection to server during query'.
|
2022-06-21 23:30:18.314 warning: [mariadbmon] (operator()): Lost the lock 'maxscale_mariadbmonitor' on server 'HOSTNAME_XXX' without releasing it. The lock is now owned by connection 562137.
|
2022-06-21 23:30:44.774 warning: (643050) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 28 seconds, which is above the configured limit 3s. 'HOSTNAME_XXX' is excluded from query routing.
|
2022-06-21 23:30:47.964 warning: (643055) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 0 seconds, which is below the configured limit 3s. 'HOSTNAME_XXX' is returned to query routing.
|
2022-06-21 23:41:26.547 warning: [mariadbmon] (monitor_server): Error during monitor update of server 'HOSTNAME_XXX': Query 'SHOW ALL SLAVES STATUS;' failed: 'Lost connection to server during query'.
|
2022-06-21 23:41:26.665 warning: [mariadbmon] (operator()): Lost the lock 'maxscale_mariadbmonitor' on server 'HOSTNAME_XXX' without releasing it. The lock is now owned by connection 562280.
|
2022-06-21 23:41:52.542 warning: (644051) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 28 seconds, which is above the configured limit 3s. 'HOSTNAME_XXX' is excluded from query routing.
|
2022-06-21 23:41:55.222 warning: (644055) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 0 seconds, which is below the configured limit 3s. 'HOSTNAME_XXX' is returned to query routing.
|
2022-06-21 23:49:34.197 warning: (644740) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 3 seconds, which is above the configured limit 3s. 'HOSTNAME_XXX' is excluded from query routing.
|
2022-06-21 23:50:04.872 warning: (644780) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 0 seconds, which is below the configured limit 3s. 'HOSTNAME_XXX' is returned to query routing.
|
2022-06-22 00:02:45.011 warning: [mariadbmon] (monitor_server): Error during monitor update of server 'HOSTNAME_XXX': Query 'SHOW ALL SLAVES STATUS;' failed: 'Lost connection to server during query'.
|
2022-06-22 00:02:45.134 warning: [mariadbmon] (operator()): Lost the lock 'maxscale_mariadbmonitor' on server 'HOSTNAME_XXX' without releasing it. The lock is now owned by connection 562438.
|
2022-06-22 00:04:42.218 warning: (646095) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 3 seconds, which is above the configured limit 3s. 'HOSTNAME_XXX' is excluded from query routing.
|
2022-06-22 00:05:11.073 warning: (646142) (Read-Write-Service); (set_rlag_state): Replication lag of 'HOSTNAME_XXX' is 0 seconds, which is below the configured limit 3s. 'HOSTNAME_XXX' is returned to query routing.
|
alert : MaxScale 6.4.0 received fatal signal 6. Commit ID: 6bbea45dd24f39ccf8effbfaec15adaf295d70a5 System name: Linux Release string: Red Hat Enterprise Linux release 8.1 (Ootpa)
|
|
|
|
markus makela below is the crash detail after adding debug=gdb-stacktrace with in maxscale configuration file [MaxScale 6.4.0]
|
Thread 1 (Thread 0x7f5bca8c93c0 (LWP 3622917)):
|
#0 0x00007f5bc70115d4 in read () from /lib64/libc.so.6
|
#1 0x00007f5bc6fa1418 in __GI__IO_file_underflow () from /lib64/libc.so.6
|
#2 0x00007f5bc6fa2755 in _IO_default_xsgetn () from /lib64/libc.so.6
|
#3 0x00007f5bc6f955eb in fread () from /lib64/libc.so.6
|
#4 0x000000000040f0e3 in (anonymous namespace)::get_command_output_cb (cb=cb@entry=0x40b5d0 <<lambda(char const*)>::_FUN(const char *)>, format=0x412b88 "gdb --pid=%d -batch -nx -iex 'set auto-load off' -iex 'set print thread-events off' -ex 'thr a a bt'", format=0x412b88 "gdb --pid=%d -batch -nx -iex 'set auto-load off' -iex 'set print thread-events off' -ex 'thr a a bt'") at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/stacktrace.cc:78
|
#5 0x000000000040f7e7 in maxbase::dump_gdb_stacktrace (handler=0x40b5d0 <<lambda(char const*)>::_FUN(const char *)>) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/stacktrace.cc:230
|
#6 0x000000000040b2d1 in sigfatal_handler (i=6) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/gateway.cc:485
|
#7 <signal handler called>
|
#8 0x00007f5bc7c9fcd6 in do_futex_wait.constprop () from /lib64/libpthread.so.0
|
#9 0x00007f5bc7c9fdc8 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
|
#10 0x00007f5bca09edb2 in maxbase::Semaphore::wait (signal_approach=<optimized out>, this=<optimized out>) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/include/maxbase/semaphore.hh:144
|
#11 maxbase::Semaphore::wait_n (signal_approach=<optimized out>, n_wait=<optimized out>, this=<optimized out>) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/include/maxbase/semaphore.hh:176
|
#12 maxscale::RoutingWorker::execute_concurrently(std::function<void ()> const&) (func=...) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/routingworker.cc:1190
|
#13 0x00007f5bca0ad9b3 in maxscale::WorkerGlobal<std::unordered_map<unsigned int, unsigned long, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned long> > > >::assign (t=..., this=0xec8348) at /usr/include/c++/8/new:169
|
#14 Server::<lambda()>::operator() (__closure=<optimized out>) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/server.cc:667
|
#15 std::_Function_handler<void(), Server::set_gtid_list(const std::vector<std::pair<unsigned int, long unsigned int> >&)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/8/bits/std_function.h:297
|
#16 0x00007f5bca1282b9 in std::function<void ()>::operator()() const (this=0x7f5b7c025398) at /usr/include/c++/8/bits/std_function.h:682
|
#17 maxbase::Worker::CustomTask::execute (worker=..., this=0x7f5b7c025390) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/worker.cc:488
|
#18 maxbase::Worker::handle_message (this=<optimized out>, queue=..., msg=...) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/worker.cc:651
|
#19 0x00007f5bca12b5d8 in maxbase::MessageQueue::handle_poll_events (pWorker=<optimized out>, events=<optimized out>, this=0xe83200) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/messagequeue.cc:307
|
#20 maxbase::MessageQueue::handle_poll_events (this=0xe83200, pWorker=<optimized out>, events=<optimized out>) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/messagequeue.cc:286
|
#21 0x00007f5bca1294a9 in maxbase::Worker::poll_waitevents (this=this@entry=0x7fffa8ba1190) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/worker.cc:848
|
#22 0x00007f5bca1296bf in maxbase::Worker::run (this=this@entry=0x7fffa8ba1190, pSem=pSem@entry=0x0) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/worker.cc:554
|
#23 0x0000000000409c0a in maxbase::Worker::run (this=0x7fffa8ba1190) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/include/maxbase/worker.hh:451
|
#24 main (argc=<optimized out>, argv=<optimized out>) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/gateway.cc:2239
|
[Inferior 1 (process 3622917) detached]
|
alert : Writing core dump.
|
|
Thread 21 (Thread 0x7f06c235a700 (LWP 4010870)):
#0 0x00007f06e6ea4cd6 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1 0x00007f06e6ea4dc8 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2 0x00007f06e932db72 in maxbase::Semaphore::wait (signal_approach=<optimized out>, this=<optimized out>) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/include/maxbase/semaphore.hh:144
#3 maxbase::Worker::call(std::function<void ()> const&, maxbase::Worker::execute_mode_t) (this=this@entry=0x7ffc8a4f9410, func=..., mode=mode@entry=maxbase::Worker::EXECUTE_AUTO) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/worker.cc:519
#4 0x00007f06e92972d7 in resource_handle_request (request=...) at /usr/include/c++/8/bits/std_function.h:87
#5 0x00007f06e91f7c54 in Client::process (this=0x7f06a802ffc0, url=..., method=..., upload_data=<optimized out>, upload_size=<optimized out>) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/admin.cc:729
#6 0x00007f06e91fa2af in Client::handle (this=0x7f06a802ffc0, url=..., method=..., upload_data=0x0, upload_data_size=0x7f06c2356518) at /usr/include/c++/8/bits/basic_string.h:940
#7 0x00007f06e91fa3cf in (anonymous namespace)::handle_client (cls=<optimized out>, connection=<optimized out>, url=0x7f06a80635c4 "/servers", method=<optimized out>, version=<optimized out>, upload_data=0x0, upload_data_size=0x7f06c2356518, con_cls=0x7f06a803b1a8) at /usr/include/c++/8/bits/basic_string.h:256
#8 0x00007f06e93e1d7f in call_connection_handler (connection=connection@entry=0x7f06a803b150) at connection.c:2221
#9 0x00007f06e93e3880 in MHD_connection_handle_idle (connection=connection@entry=0x7f06a803b150) at connection.c:3401
#10 0x00007f06e93e575e in call_handlers (con=con@entry=0x7f06a803b150, read_ready=<optimized out>, write_ready=<optimized out>, force_close=<optimized out>) at daemon.c:1207
#11 0x00007f06e93ea3d8 in MHD_epoll (daemon=daemon@entry=0x1ffe520, may_block=may_block@entry=1) at daemon.c:4581
#12 0x00007f06e93eb2d7 in MHD_polling_thread (cls=0x1ffe520) at daemon.c:4746
#13 0x00007f06e6e9c17a in start_thread () from /lib64/libpthread.so.0
#14 0x00007f06e6225dc3 in clone () from /lib64/libc.so.6
{ode}
|
|
The stacktraces seem to point to the causal_reads resultset code in readwritesplit.
Thread 17 (Thread 0x7f06d899e700 (LWP 4010866)):
|
#0 gwbuf_link_length (b=0x7f06bf257730, b=0x7f06bf257730) at /home/timofey_turenko_mariadb_com/MaxScale/include/maxscale/buffer.hh:157
|
#1 gwbuf_copy_data (buffer=0x7f06bf257730, buffer@entry=0x7f06bc119550, offset=9163541, offset@entry=57627426, bytes=bytes@entry=3, dest=dest@entry=0x7f06d8997f6d "\037") at /home/timofey_turenko_mariadb_com/MaxScale/server/core/buffer.cc:736
|
#2 0x00007f06c2de94ef in RWSplitSession::correct_packet_sequence (this=this@entry=0x7f06bc0fe960, buffer=buffer@entry=0x7f06bc119550) at /home/timofey_turenko_mariadb_com/MaxScale/server/modules/routing/readwritesplit/rwsplit_causal_reads.cc:77
|
#3 0x00007f06c2dea339 in RWSplitSession::handle_causal_read_reply (this=this@entry=0x7f06bc0fe960, writebuf=writebuf@entry=0x7f06bc119550, reply=..., backend=backend@entry=0x7f06bc029e30) at /home/timofey_turenko_mariadb_com/MaxScale/server/modules/routing/readwritesplit/rwsplit_causal_reads.cc:113
|
#4 0x00007f06c2ddc95e in RWSplitSession::clientReply (this=0x7f06bc0fe960, writebuf=0x7f06bc119550, down=..., reply=...) at /home/timofey_turenko_mariadb_com/MaxScale/server/modules/routing/readwritesplit/rwsplitsession.cc:495
|
#5 0x00007f06e92bda57 in ServiceEndpoint::clientReply (this=0x7f06bc100460, buffer=<optimized out>, down=..., reply=...) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/service.cc:1703
|
#6 0x00007f06e92b2ec6 in ServerEndpoint::clientReply (this=0x7f06bc0c1790, buffer=<optimized out>, down=..., reply=...) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/server.cc:1202
|
#7 0x00007f06e92edb1e in MariaDBBackendConnection::normal_read (this=0x7f06bc09c050) at /home/timofey_turenko_mariadb_com/MaxScale/server/modules/protocol/MariaDB/mariadb_backend.cc:762
|
#8 0x00007f06e9242194 in DCB::process_events (events=5, this=0x7f06bc0e4f70) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/dcb.cc:1311
|
#9 DCB::process_events (this=0x7f06bc0e4f70, events=5) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/dcb.cc:1245
|
#10 0x00007f06e9242291 in DCB::event_handler (dcb=0x7f06bc0e4f70, events=<optimized out>) at /home/timofey_turenko_mariadb_com/MaxScale/server/core/dcb.cc:1363
|
#11 0x00007f06e932e4a9 in maxbase::Worker::poll_waitevents (this=this@entry=0x1fa7b20) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/worker.cc:848
|
#12 0x00007f06e932e6bf in maxbase::Worker::run (this=0x1fa7b20, pSem=0x7ffc8a4f8b50) at /home/timofey_turenko_mariadb_com/MaxScale/maxutils/maxbase/src/worker.cc:554
|
#13 0x00007f06e81ebba3 in execute_native_thread_routine () from /lib64/libstdc++.so.6
|
#14 0x00007f06e6e9c17a in start_thread () from /lib64/libpthread.so.0
|
#15 0x00007f06e6225dc3 in clone () from /lib64/libc.so.6
|
This was true for both cases which might mean that there's some inefficiency in the result processing code that doesn't scale. However, this would be evident immediately when the query is executed.
|