[MXS-4101] Unexpected result with mixed 10.2 and 10.6 backends Created: 2022-04-20  Updated: 2022-05-31  Resolved: 2022-04-27

Status: Closed
Project: MariaDB MaxScale
Component/s: Protocol
Affects Version/s: 6.3.0
Fix Version/s: 6.3.1

Type: Bug Priority: Major
Reporter: markus makela Assignee: markus makela
Resolution: Fixed Votes: 1
Labels: None


 Description   

MaxScale crashes when it is used with a mix of MariaDB 10.2.7 and MariaDB 10.6 backends. The 10.2 backends send a resultset that does not use the DEPRECATE_EOF resultset format, and MaxScale crashes while processing it.

alert  :   /lib64/libc.so.6(+0x8f88c): /usr/src/debug/glibc-2.34-29.fc35.x86_64/nptl/pthread_kill.c:44
  /lib64/libc.so.6(raise+0x16): /usr/src/debug/glibc-2.34-29.fc35.x86_64/signal/../sysdeps/posix/raise.c:27
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN24MariaDBBackendConnection18process_one_packetEN8maxscale6Buffer8iteratorES2_j+0x65e): server/modules/protocol/MariaDB/mariadb_backend.cc:2190 (discriminator 3)
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN24MariaDBBackendConnection15process_packetsEPP5GWBUF+0x27e): server/modules/protocol/MariaDB/mariadb_backend.cc:2098
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN24MariaDBBackendConnection14track_responseEPP5GWBUF+0x24): server/modules/protocol/MariaDB/mariadb_backend.cc:1697
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN24MariaDBBackendConnection11normal_readEv+0x2e9): server/modules/protocol/MariaDB/mariadb_backend.cc:681
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN24MariaDBBackendConnection17ready_for_readingEP3DCB+0x392): server/modules/protocol/MariaDB/mariadb_backend.cc:544
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN3DCB14process_eventsEj+0x55f): server/core/dcb.cc:1311
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN3DCB13event_handlerEPS_j+0x6b): server/core/dcb.cc:1376
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN3DCB12poll_handlerEP13MXB_POLL_DATAP10MXB_WORKERj+0x54): server/core/dcb.cc:1402
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker15poll_waiteventsEv+0x47b): maxutils/maxbase/src/worker.cc:848
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker3runEPNS_9SemaphoreE+0x12c): maxutils/maxbase/src/worker.cc:556
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker11thread_mainEPS0_PNS_9SemaphoreE+0x23): maxutils/maxbase/src/worker.cc:682
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZSt13__invoke_implIvPFvPN7maxbase6WorkerEPNS0_9SemaphoreEEJS2_S4_EET_St14__invoke_otherOT0_DpOT1_+0x4d): /usr/include/c++/11/bits/invoke.h:61
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZSt8__invokeIPFvPN7maxbase6WorkerEPNS0_9SemaphoreEEJS2_S4_EENSt15__invoke_resultIT_JDpT0_EE4typeEOS8_DpOS9_+0x4f): /usr/include/c++/11/bits/invoke.h:97
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZNSt6thread8_InvokerISt5tupleIJPFvPN7maxbase6WorkerEPNS2_9SemaphoreEES4_S6_EEE9_M_invokeIJLm0ELm1ELm2EEEEvSt12_Index_tupleIJXspT_EEE+0x5f): /usr/include/c++/11/bits/std_thread.h:253
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZNSt6thread8_InvokerISt5tupleIJPFvPN7maxbase6WorkerEPNS2_9SemaphoreEES4_S6_EEEclEv+0x18): /usr/include/c++/11/bits/std_thread.h:260
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJPFvPN7maxbase6WorkerEPNS3_9SemaphoreEES5_S7_EEEEE6_M_runEv+0x1c): /usr/include/c++/11/bits/std_thread.h:211
  /lib64/libstdc++.so.6(+0xd95c4): /usr/src/debug/gcc-11.2.1-9.fc35.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/unique_ptr.h:85
  /lib64/libc.so.6(+0x8db1a): /usr/src/debug/glibc-2.34-29.fc35.x86_64/nptl/pthread_create.c:443
  /lib64/libc.so.6(+0x112660): /usr/src/debug/glibc-2.34-29.fc35.x86_64/misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:83



 Comments   
Comment by markus makela [ 2022-04-20 ]

This seems to happen only with older 10.2 versions: upgrading to the latest 10.2 version seems to make the problem go away. The root cause currently appears to be that the server does not use the DEPRECATE_EOF protocol even when it should send results with it.

One possible explanation is MDEV-13300, which was fixed in 10.2.8.
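
To illustrate why a format mismatch can confuse response tracking, here is a minimal sketch. It is not the MaxScale code: Packet, is_terminator, packets_until_complete and classic_result are hypothetical names, and packets are reduced to their first payload byte and payload length. The sketch assumes the classic resultset format sends an EOF packet after the column definitions and another after the rows, while the DEPRECATE_EOF format sends only a single OK packet with a 0xFE header after the rows.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Capability bit defined by the MariaDB/MySQL client-server protocol.
constexpr uint32_t CLIENT_DEPRECATE_EOF = 1u << 24;

// Hypothetical, simplified packet: first payload byte and payload length only.
struct Packet
{
    uint8_t first_byte;
    size_t  payload_len;
};

// A 0xFE header with a payload shorter than the maximum packet size marks an
// EOF packet (classic format) or the final OK packet (DEPRECATE_EOF format).
bool is_terminator(const Packet& p)
{
    return p.first_byte == 0xFE && p.payload_len < 0xFFFFFF;
}

// Returns how many packets are consumed before the tracker considers the
// resultset complete, or 0 if the resultset is still incomplete.
size_t packets_until_complete(const std::vector<Packet>& packets, uint32_t capabilities)
{
    const bool deprecate_eof = (capabilities & CLIENT_DEPRECATE_EOF) != 0;
    // Classic format: two terminators. DEPRECATE_EOF format: one terminator.
    size_t terminators_needed = deprecate_eof ? 1 : 2;

    // packets[0] is the column count packet, so scanning starts at index 1.
    for (size_t i = 1; i < packets.size(); i++)
    {
        if (is_terminator(packets[i]) && --terminators_needed == 0)
        {
            return i + 1;
        }
    }
    return 0;
}

// Sample classic-format resultset: one column, two rows (made-up lengths).
std::vector<Packet> classic_result()
{
    return {
        {0x01, 1},   // column count
        {0x03, 30},  // column definition
        {0xFE, 5},   // EOF after the column definitions
        {0x01, 2},   // row
        {0x01, 2},   // row
        {0xFE, 5},   // EOF after the rows
    };
}
```

With the sample above, a tracker that expects the classic format consumes all six packets, whereas one that assumes DEPRECATE_EOF stops after the third packet and leaves the rows and the final EOF to be misread as the start of a new response. Whether this is the exact failure path in mariadb_backend.cc is not confirmed by this report.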
