[MXS-3815] maxscale crash Created: 2021-10-14  Updated: 2022-01-13  Resolved: 2021-11-17

Status: Closed
Project: MariaDB MaxScale
Component/s: pinloki
Affects Version/s: 2.5.15, 6.1.4
Fix Version/s: 2.5.17, 6.2.0

Type: Bug Priority: Major
Reporter: Muhammad Irfan Assignee: Niclas Antti
Resolution: Fixed Votes: 0
Labels: None

Sprint: MXS-SPRINT-143, MXS-SPRINT-144

 Description   

/lib64/libpthread.so.0(+0x11226): /usr/src/debug/glibc-2.26-lp152.26.6.1.x86_64/nptl/../sysdeps/unix/sysv/linux/futex-internal.h:205
/lib64/libpthread.so.0(+0x11318): /usr/src/debug/glibc-2.26-lp152.26.6.1.x86_64/nptl/sem_waitcommon.c:191
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN8maxscale13RoutingWorker20execute_concurrentlyESt8functionIFvvEE+0x62): maxutils/maxbase/include/maxbase/semaphore.hh:146
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(0x15dc3f): /usr/include/c++/9/bits/std_function.h:259
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker14handle_messageERNS_12MessageQueueERKNS_19MessageQueueMessageE+0x6a): maxutils/maxbase/src/worker.cc:490
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase12MessageQueue18handle_poll_eventsEPNS_6WorkerEj+0x98): maxutils/maxbase/src/messagequeue.cc:329
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker15poll_waiteventsEv+0x1c6): maxutils/maxbase/src/worker.cc:879
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker3runEPNS_9SemaphoreE+0x53): maxutils/maxbase/src/worker.cc:574
/usr/bin/maxscale(main+0x20dd): maxutils/maxbase/include/maxbase/log.h:168
/lib64/libc.so.6(__libc_start_main+0xea): /usr/src/debug/glibc-2.26-lp152.26.6.1.x86_64/csu/../csu/libc-start.c:342
/usr/bin/maxscale(_start+0x2a): /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:122



 Comments   
Comment by Niclas Antti [ 2021-11-09 ]

This is most likely caused by a watchdog signal, which in turn is raised when the binlog router spends too much time scanning for a requested GTID. The main reason for the lengthy scan has been identified, but the real question is why there is so much binlog data and how the replica fell so far behind (in the normal case only the first file is scanned).

Generated at Thu Feb 08 04:24:09 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.