[MXS-4288] mxs1585 regression in 6.4 Created: 2022-09-10  Updated: 2022-11-02  Resolved: 2022-11-02

Status: Closed
Project: MariaDB MaxScale
Component/s: test
Affects Version/s: 6.4.2
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: markus makela Assignee: markus makela
Resolution: Cannot Reproduce Votes: 0
Labels: None

Sprint: MXS-SPRINT-166, MXS-SPRINT-167, MXS-SPRINT-168, MXS-SPRINT-169

 Description   

The mxs1585 test is unstable in 6.4 whereas in 2.5 it is much more stable and only fails due to VM related problems. The test appears to time out during a query which means this is possibly a real regression.



 Comments   
Comment by markus makela [ 2022-10-17 ]

I managed to catch this once and it didn't appear to be caused by anything in MaxScale itself locking up. This would suggest that the reason is that a query is not routed and no errors are generated or a part of a result is not fully returned.

Comment by markus makela [ 2022-10-19 ]

Now that the test framework collects the maxctrl create report output from tests that time out, I managed to catch the logs from two sessions for a failing test case. Both of them had been idle 308.6 seconds and were in the process of connecting to the galera cluster.

first session

2022-10-18 15:36:50   info   : (779) Found matching user 'skysql'@'%' for client 'skysql'@'::ffff:10.128.0.33' with sufficient privileges.
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); Target connection counts:
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); current operations : 0 in \tserver1 Maintenance, Running
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); current operations : 45 in \tserver2 Master, Synced, Running
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); current operations : 0 in \tserver3 Slave, Synced, Running
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); current operations : 0 in \tserver4 Slave, Synced, Running
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); Connected to 'server2'
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); Selected Master: server2
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); Connected to 'server4'
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); Selected Slave: server4
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); Connected to 'server3'
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); Selected Slave: server3
2022-10-18 15:36:50   info   : (779) Started RW-Split-Router client session [779] for 'skysql' from ::ffff:10.128.0.33
2022-10-18 15:36:50   info   : (779) (RW-Split-Router); > Autocommit: [enabled], trx is [not open], cmd: (0x03) COM_QUERY, plen: 40, type: QUERY_TYPE_WRITE, stmt: INSERT INTO test.mxs1585 VALUES (1) 
2022-10-18 15:36:50   info   : (779) [readwritesplit] (RW-Split-Router); Route query to master: server2 <
2022-10-18 15:36:50   info   : (779) (server2); Storing COM_QUERY while in state 'Handshaking': INSERT INTO test.mxs1585 VALUES (1)
2022-10-18 15:36:50   info   : (779) Connected to 'server4' with thread id 781
2022-10-18 15:36:50   info   : (779) Connected to 'server3' with thread id 784
2022-10-18 15:36:50   info   : (779) Authentication to 'server4' succeeded.
2022-10-18 15:36:50   info   : (779) Authentication to 'server3' succeeded.

second session

2022-10-18 15:36:50   info   : (748) Found matching user 'skysql'@'%' for client 'skysql'@'::ffff:10.128.0.33' with sufficient privileges.
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); Target connection counts:
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); current operations : 0 in \tserver1 Maintenance, Running
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); current operations : 43 in \tserver2 Master, Synced, Running
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); current operations : 0 in \tserver3 Slave, Synced, Running
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); current operations : 0 in \tserver4 Slave, Synced, Running
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); Connected to 'server2'
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); Selected Master: server2
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); Connected to 'server3'
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); Selected Slave: server3
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); Connected to 'server4'
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); Selected Slave: server4
2022-10-18 15:36:50   info   : (748) Started RW-Split-Router client session [748] for 'skysql' from ::ffff:10.128.0.33
2022-10-18 15:36:50   info   : (748) (RW-Split-Router); > Autocommit: [enabled], trx is [not open], cmd: (0x03) COM_QUERY, plen: 40, type: QUERY_TYPE_WRITE, stmt: INSERT INTO test.mxs1585 VALUES (1) 
2022-10-18 15:36:50   info   : (748) [readwritesplit] (RW-Split-Router); Route query to master: server2 <
2022-10-18 15:36:50   info   : (748) (server2); Storing COM_QUERY while in state 'Handshaking': INSERT INTO test.mxs1585 VALUES (1)
2022-10-18 15:36:50   info   : (748) Connected to 'server3' with thread id 788
2022-10-18 15:36:50   info   : (748) Connected to 'server4' with thread id 788
2022-10-18 15:36:50   info   : (748) Authentication to 'server3' succeeded.
2022-10-18 15:36:50   info   : (748) Authentication to 'server4' succeeded.

Based on the log output the handshake packet from server2 was either never sent or was not read by MaxScale.

Comment by markus makela [ 2022-11-02 ]

The test no longer fails and the last 100 test runs of it seem to only fail due to overall slowness of the test system. Mostly it fails due to the cluster simply not responding to anything anymore.

Generated at Thu Feb 08 04:27:33 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.