[MXS-4460] Crash during query replay with service-to-service configuration
Created: 2022-12-21
Updated: 2023-06-16
Resolved: 2023-01-03

Status: Closed
Project: MariaDB MaxScale
Component/s: Core
Affects Version/s: 2.5.23, 6.4.4, 22.08.3
Fix Version/s: 6.4.5, 22.08.4

Type: Bug
Priority: Major
Reporter: markus makela
Assignee: markus makela
Resolution: Fixed
Votes: 1
Labels: None

Issue Links:
Relates

 Description   

With a service-to-service configuration, where one service uses another service as its target, a failure in the lower-level service can cause a segmentation fault if a query replay takes place right before the connection is closed. This seems to occur most easily when an authentication failure on the subservice happens while the schemarouter is mapping the shards.

The crash happens in DelayedRoutingTask::execute() on line 588 in session.cc:

int rc = m_down->routeQuery(buffer);

The DelayedRoutingTask is constructed as follows:

    DelayedRoutingTask(MXS_SESSION* session, mxs::Routable* down, GWBUF* buffer)
        : m_session(session_get_ref(session))
        , m_down(down)
        , m_buffer(buffer)
    {
    }

As long as a session is open in MaxScale, the top-level mxs::Endpoint remains valid, which has prevented this problem from happening with single-level services. With a multi-level service, where one service routes to another service, a failure can occur in the subservice that is not fatal to the top-level service. The DelayedRoutingTask receives only a session reference, a raw mxs::Routable pointer and the buffer, so it is not given enough information to know whether this particular part of the routing chain is still valid.
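
The lifetime problem described above can be illustrated with a minimal, self-contained sketch. This is not the fix that went into 6.4.5 or 22.08.4 and it does not use MaxScale code: Routable, SubEndpoint and GuardedDelayedTask are hypothetical stand-ins, and the query is a plain string rather than a GWBUF. The sketch only assumes that the delayed task could hold a weak reference to its downstream component and check it before routing, instead of dereferencing a raw pointer that may already be gone.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for mxs::Routable; not the real MaxScale interface.
struct Routable
{
    virtual ~Routable() = default;
    virtual bool routeQuery(const std::string& query) = 0;
};

// Hypothetical subservice endpoint. It can be destroyed while the
// top-level session stays open, which is the situation in this report.
struct SubEndpoint : Routable
{
    int routed = 0;

    bool routeQuery(const std::string&) override
    {
        ++routed;
        return true;
    }
};

// Delayed task that keeps a weak reference to the downstream component
// instead of a raw pointer (an assumption made for this illustration).
class GuardedDelayedTask
{
public:
    GuardedDelayedTask(std::weak_ptr<Routable> down, std::string query)
        : m_down(std::move(down))
        , m_query(std::move(query))
    {
    }

    // Routes the query if the downstream component still exists.
    // Returns false, without touching freed memory, if it was closed.
    bool execute()
    {
        if (auto down = m_down.lock())
        {
            return down->routeQuery(m_query);
        }

        return false;
    }

private:
    std::weak_ptr<Routable> m_down;
    std::string             m_query;
};

// Replays a query after the subservice endpoint has been closed.
// Returns the result of execute(), which is false in this scenario.
inline bool replay_after_close()
{
    auto endpoint = std::make_shared<SubEndpoint>();
    GuardedDelayedTask task(endpoint, "SELECT 1");
    endpoint.reset();   // the subservice fails and its endpoint is destroyed
    return task.execute();
}
```

With a raw pointer in place of the std::weak_ptr, the same sequence would call routeQuery() on a destroyed object, which matches the kind of crash reported at the m_down->routeQuery(buffer) call.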

