[MXS-20] bugzillaId-720: instability of MariaDB test cases: MaxScale + MariaDB 10 Created: 2015-01-04  Updated: 2019-12-19  Resolved: 2015-03-20

Status: Closed
Project: MariaDB MaxScale
Component/s: Core
Affects Version/s: 1.0.5
Fix Version/s: 1.1.0

Type: Bug Priority: Minor
Reporter: Timofey Turenko Assignee: markus makela
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

Linux



 Description   

This is an import of http://bugs.mariadb.com/show_bug.cgi?id=720

MariaDB test cases are unstable when run through MaxScale with a MariaDB 10 Master/Slave backend.

The test case executes a number of test cases from the MariaDB test suite in a loop.
The first execution of the test set usually passes, but the second and subsequent executions fail.

Tests are available at https://github.com/mariadb-corporation/maxscale-system-test/tree/master/Hartmut_tests/maxscale-mysqltest

(https://github.com/mariadb-corporation/maxscale-system-test/tree/master/Hartmut_tests/maxscale-mysqltest/t contains the queries and https://github.com/mariadb-corporation/maxscale-system-test/tree/master/Hartmut_tests/maxscale-mysqltest/r contains the expected results.)

Test console outputs:

http://jenkins.engskysql.com:8088/job/Execute_system_test/5222/console
http://jenkins.engskysql.com:8088/view/Test/job/Execute_system_test/5223/console



 Comments   
Comment by Dipti Joshi (Inactive) [ 2015-03-09 ]

This is the comment history from bugzilla

Comment 1 Markus Mäkelä 2015-02-18 16:51:23 UTC
The tests seem to fail because the slaves haven't caught up with the master yet. This causes the tests to fail randomly.

Modifying the tests to wait 10 seconds instead of 1 seems to make all tests pass.

Comment 2 Timofey Turenko 2015-02-27 17:30:00 UTC
The error log is full of:

2015-02-27 17:33:37 Backend hangup error handling.
2015-02-27 17:33:37 Backend hangup error handling.
2015-02-27 17:33:37 Backend hangup -> closing session.
2015-02-27 17:33:37 Backend hangup -> closing session.
2015-02-27 17:33:39 Hangup in session that is not ready for routing, Error reported is 'Broken pipe'.
2015-02-27 17:33:39 Backend hangup error handling.
2015-02-27 17:33:48 Backend hangup error handling.
2015-02-27 17:33:48 Backend hangup -> closing session.
2015-02-27 17:33:48 Hangup in session that is not ready for routing, Error reported is 'Broken pipe'.
2015-02-27 17:34:27 Backend hangup error handling.
2015-02-27 17:34:27 Backend hangup error handling.
2015-02-27 17:34:48 Backend hangup error handling.
2015-02-27 17:34:48 Backend hangup -> closing session.
2015-02-27 17:34:56 Backend hangup error handling.
2015-02-27 17:34:56 Backend hangup error handling.
2015-02-27 17:34:56 Backend hangup -> closing session.
2015-02-27 17:34:56 Backend hangup -> closing session.
2015-02-27 17:34:58 Backend hangup error handling.
2015-02-27 17:34:58 Backend hangup -> closing session.
2015-02-27 17:35:00 Backend hangup error handling.
2015-02-27 17:35:35 Hangup in session that is not ready for routing, Error reported is 'Broken pipe'.
2015-02-27 17:35:35 Backend hangup error handling.
2015-02-27 17:35:54 Hangup in session that is not ready for routing, Error reported is 'Broken pipe'.
2015-02-27 17:36:41 Backend hangup error handling.
2015-02-27 17:36:41 Backend hangup error handling.
2015-02-27 17:36:41 Backend hangup -> closing session.
2015-02-27 17:36:41 Backend hangup -> closing session.
2015-02-27 17:36:41 Backend hangup error handling.
2015-02-27 17:36:41 Backend hangup -> closing session.
2015-02-27 17:36:45 Backend hangup error handling.
2015-02-27 17:37:10 Backend hangup error handling.
2015-02-27 17:37:10 Backend hangup -> closing session.
2015-02-27 17:37:43 Backend hangup error handling.
2015-02-27 17:37:43 Error : Unable to write to backend due to authentication failure.
2015-02-27 17:38:16 Hangup in session that is not ready for routing, Error reported is 'Broken pipe'.
2015-02-27 17:38:16 Hangup in session that is not ready for routing, Error reported is 'Broken pipe'.
2015-02-27 17:38:16 Backend hangup error handling.
2015-02-27 17:38:16 Backend hangup error handling.

Comment 3 Timofey Turenko 2015-02-27 17:32:14 UTC

The test PASSED if the backend is a Galera cluster
The test PASSED if the backend is MariaDB 5.5 (both Master/Slave and Galera)

The test PASSED if a 10-second sleep is added to sleep-1.inc:

create table t1(id integer);
insert into t1 values(1); # in master
commit;
--source sleep-1.inc
select count(*) from t1; # in slave

Comment 4 Timofey Turenko 2015-02-27 18:20:31 UTC
After long-running testing, it failed even with the 10-second sleep.

Comment by Dipti Joshi (Inactive) [ 2015-03-19 ]

@Markus.Makela, @Timofey would setting appropriate values for max_slave_replication_lag and detect_replication_lag help in this case?

Comment by markus makela [ 2015-03-19 ]

MaxScale's internal settings don't affect replication, which is the cause of this bug. If MaxScale detects that all slaves are out of sync and no slave is close enough in replication, it will route the query to the master. This will cause the test to fail, as it expects the query to be routed to a slave.
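
The fallback described above can be illustrated with a small Python sketch. This is not MaxScale's actual code: the `Backend` type, the `pick_read_target` function, and the rule of preferring the least-lagged slave are assumptions made for illustration only. The one behavior taken from the comment is that a read goes to the master when no slave is within the allowed replication lag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical view of one backend server as a monitor might report it."""
    name: str
    is_master: bool
    lag_seconds: Optional[int]  # None means the lag is unknown


def pick_read_target(backends: List[Backend], max_slave_replication_lag: int) -> Backend:
    """Pick the server for a read query.

    Slaves whose measured lag is within max_slave_replication_lag are eligible.
    If no slave qualifies, the read falls back to the master.
    Assumptions: unknown lag makes a slave ineligible, and the least-lagged
    eligible slave is preferred.
    """
    master = next(b for b in backends if b.is_master)
    eligible = [
        b for b in backends
        if not b.is_master
        and b.lag_seconds is not None
        and b.lag_seconds <= max_slave_replication_lag
    ]
    if not eligible:
        return master
    return min(eligible, key=lambda b: b.lag_seconds)


if __name__ == "__main__":
    servers = [
        Backend("master", True, 0),
        Backend("slave1", False, 5),
        Backend("slave2", False, 12),
    ]
    # Both slaves lag by more than 1 second, so the read goes to the master.
    print(pick_read_target(servers, max_slave_replication_lag=1).name)
```

A test that expects the read to land on a slave would fail in the situation shown in the usage example, because the only server that satisfies the lag limit is the master.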

Comment by Timofey Turenko [ 2015-03-20 ]

5.5 Master/Slave and Galera work
10.0 Galera works

10.0 and 10.1 do not work

10.0 and 10.1 without MaxScale (test connected directly to the master) work

Adding more SLEEP to sleep-1.inc decreases the number of failures but does not solve the problem.

Comment by Timofey Turenko [ 2015-03-20 ]

One possible reason (though there is no evidence for it): https://mariadb.atlassian.net/browse/MDEV-7578
(waiting for 10.0.18 to retest)

Comment by Timofey Turenko [ 2015-03-20 ]

https://mariadb.atlassian.net/browse/MDEV-7578 is not the cause of the Hartmut test failures. I tried with the latest 10.0 (built it from source and checked that it includes f66fbe8ce0ff4ffcd6a6c185f9b3d25bd9f67f8d "MDEV-7578: Slave is ~10x slower to execute set of statements compared to master when using RBR").

Comment by Timofey Turenko [ 2015-03-20 ]

monitor settings:
monitor_interval=500
detect_replication_lag=1

and router settings:
max_slave_replication_lag=1

fix the problem.

Without them, MaxScale reads data from a slave that has not yet replicated it.
So, it is not a MaxScale issue.
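
As a rough sketch, the settings above might be placed in the MaxScale configuration file as follows. The section names and server names are hypothetical, and credentials, server definitions, and listeners are omitted; only the three parameters named in the comment come from this report.

```ini
# Hypothetical monitor section; only the last two lines are from this report.
[MySQL Monitor]
type=monitor
module=mysqlmon
servers=server1,server2,server3
monitor_interval=500
detect_replication_lag=1

# Hypothetical read/write split service; only the last line is from this report.
[RW Split Router]
type=service
router=readwritesplit
servers=server1,server2,server3
max_slave_replication_lag=1
```

With lag detection enabled in the monitor, the router has a lag measurement to compare against max_slave_replication_lag when it chooses a slave for a read.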

Generated at Thu Feb 08 03:56:11 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.