[MXS-4317] Smartrouter interrupts the wrong query Created: 2022-09-26  Updated: 2022-11-18  Resolved: 2022-11-02

Status: Closed
Project: MariaDB MaxScale
Component/s: smartrouter
Affects Version/s: 2.5.22, 6.4.3, 22.08.2
Fix Version/s: 2.5.23, 6.4.4, 22.08.3

Type: Bug
Priority: Major
Reporter: markus makela
Assignee: markus makela
Resolution: Fixed
Votes: 0
Labels: None

Sprint: MXS-SPRINT-167, MXS-SPRINT-168, MXS-SPRINT-169

 Description   

If two queries with different canonical query forms (e.g. SELECT user FROM t1 and SELECT host FROM t1) are executed back to back and the first query triggers a latency measurement, the second query can be interrupted if it is executed on a server that the measurement found to be slower.

This happens because the KILL QUERY command that is used to interrupt the slower servers is executed asynchronously. Waiting for all of the results to complete is therefore not enough: the KILL commands must also complete before new queries are allowed.
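The race described above can be illustrated with a small toy model. This is only a sketch and not MaxScale code: the names Backend, ToyRouter, wait_for_kills and run_scenario are hypothetical, and the model assumes that the slower server's result completes on its own before the KILL QUERY reaches it. It also assumes that a gated query is simply queued until the KILL commands finish; the actual fix may be implemented differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Backend:
    """Hypothetical stand-in for one backend connection."""
    name: str
    running: Optional[str] = None  # query currently executing, if any
    interrupted: List[str] = field(default_factory=list)

    def deliver_kill(self) -> None:
        # KILL QUERY targets the connection, so it interrupts whatever is
        # running when it arrives, not the query it was issued for.
        if self.running is not None:
            self.interrupted.append(self.running)
            self.running = None


class ToyRouter:
    """Hypothetical model of the gating logic, not the smartrouter itself."""

    def __init__(self, backends: List[Backend], wait_for_kills: bool) -> None:
        self.backends = backends
        self.wait_for_kills = wait_for_kills
        self.kills_in_flight: List[Backend] = []
        self.queued: List[Tuple[str, Backend]] = []

    def measure(self, query: str, winner: Backend) -> None:
        # The measured query runs on every backend; the winner answers first.
        for b in self.backends:
            b.running = query
        winner.running = None
        for b in self.backends:
            if b is not winner:
                # A KILL is issued but has not executed yet (assumption:
                # the slower result completes before the KILL arrives).
                self.kills_in_flight.append(b)
                b.running = None

    def can_accept(self) -> bool:
        results_done = all(b.running is None for b in self.backends)
        kills_done = not self.kills_in_flight
        return results_done and (kills_done or not self.wait_for_kills)

    def route(self, query: str, target: Backend) -> None:
        if self.can_accept():
            target.running = query
        else:
            self.queued.append((query, target))

    def deliver_kills(self) -> None:
        # The in-flight KILL commands finally execute on their backends.
        for b in self.kills_in_flight:
            b.deliver_kill()
        self.kills_in_flight.clear()
        # Queries held back while KILLs were pending can now be routed.
        queued, self.queued = self.queued, []
        for query, target in queued:
            self.route(query, target)


def run_scenario(wait_for_kills: bool) -> List[str]:
    """Return the queries that were interrupted on the slower backend."""
    fast, slow = Backend("fast"), Backend("slow")
    router = ToyRouter([fast, slow], wait_for_kills)
    router.measure("SELECT user FROM t1", winner=fast)
    router.route("SELECT host FROM t1", target=slow)
    router.deliver_kills()
    return slow.interrupted
```

In this model, run_scenario(False) gates only on result completion and the second query is the one that gets interrupted, while run_scenario(True) also waits for the KILL commands and no query is interrupted.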

Original description:


The test appears to fail with:

214: 07:45:01   2.1s: TEST_FAILED! Thread 7 failed to SELECT: Query execution was interrupted

The query should never be interrupted, so this means that an error the client should never see leaks to it.



 Comments   
Comment by markus makela [ 2022-10-19 ]

This appears to be caused by the KILL command not completing before the smartrouter accepts new queries. If two commands are executed back to back and the first one triggers a query measurement, the KILL QUERY can end up interrupting the second query instead of the slower candidates of the first query.
