[MDEV-10621] parts.partition_float_myisam, parts.partition_int_myisam failed with timeout in buildbot Created: 2016-08-21  Updated: 2017-02-19  Resolved: 2017-02-19

Status: Closed
Project: MariaDB Server
Component/s: Tests
Affects Version/s: 10.0
Fix Version/s: 5.5.55, 10.0.30, 10.1.22, 10.2.5

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Elena Stepanova
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-7069 Fix buildbot failures in main server ... Stalled
Sprint: 5.5.55

 Description   

http://buildbot.askmonty.org/buildbot/builders/work-amd64-valgrind/builds/8972

parts.partition_float_myisam             w2 [ fail ]  timeout after 9000 seconds
        Test ended at 2016-06-23 18:59:05
 
Test case timeout after 9000 seconds
 
== /mnt/data/buildot/maria-slave/work-opensuse-amd64/build/mysql-test/var/2/log/partition_float_myisam.log == 
-3.40282e37
-123.456
0
1.5
123.456
1234550
select * from t2 where a=1.5;
a
1.5
delete from t2 where a=1.5;
select * from t2;
a
-3.40282e38
-3.40282e37
-123.456
0
123.456
1234550
delete from t2;
16384*3 inserts;
 
 == /mnt/data/buildot/maria-slave/work-opensuse-amd64/build/mysql-test/var/2/tmp/analyze-timeout-mysqld.1.err ==
mysqltest: Could not open connection 'default' after 500 attempts: 2002 Can't connect to local MySQL server through socket '/mnt/data/buildot/maria-slave/work-opensuse-amd64/build/mysql-test/var/tmp/2/mysqld.1.sock' (111 "Connection refused")

http://buildbot.askmonty.org/buildbot/builders/work-amd64-valgrind/builds/8963

parts.partition_int_myisam               w6 [ fail ]  timeout after 9000 seconds
        Test ended at 2016-06-22 02:41:23
 
Test case timeout after 9000 seconds
 
== /mnt/data/buildot/maria-slave/work-opensuse-amd64/build/mysql-test/var/6/log/partition_int_myisam.log == 
/*!50100 PARTITION BY KEY (a)
PARTITIONS 8 */
insert into t2 values (18446744073709551615), (0xFFFFFFFFFFFFFFFE), (18446744073709551613), (18446744073709551612);
select * from t2;
a
18446744073709551612
18446744073709551613
18446744073709551614
18446744073709551615
select * from t2 where a=18446744073709551615;
a
18446744073709551615
delete from t2 where a=18446744073709551615;
select * from t2;
a
18446744073709551612
18446744073709551613
18446744073709551614
delete from t2;
65535 inserts;
 
 == /mnt/data/buildot/maria-slave/work-opensuse-amd64/build/mysql-test/var/6/tmp/analyze-timeout-mysqld.1.err ==
mysqltest: Could not open connection 'default' after 500 attempts: 2002 Can't connect to local MySQL server through socket '/mnt/data/buildot/maria-slave/work-opensuse-amd64/build/mysql-test/var/tmp/6/mysqld.1.sock' (111 "Connection refused")



 Comments   
Comment by Sergei Golubchik [ 2017-01-05 ]

Looks ok. On xenial it doesn't fail, it just takes a lot of time:

parts.partition_float_myisam             w6 [ pass ]  2212982

It's more than half an hour. On a slower box (with an old valgrind, etc) it could time out, I suppose.

There's no bug here, but perhaps this test should be considered "big" or skipped for valgrind runs completely?

Comment by Elena Stepanova [ 2017-01-05 ]

It doesn't take long on a regular builder, so maybe not "big", but we can make it "no_valgrind_without_big" (include/no_valgrind_without_big.inc)

Comment by Elena Stepanova [ 2017-02-18 ]

UPD: see later comments and MDEV-12084 about this.

Here is a timeout without valgrind:
http://buildbot.askmonty.org/buildbot/builders/kvm-bintar-centos5-amd64/builds/4623/steps/test/logs/stdio

parts.partition_float_myisam             w2 [ fail ]  timeout after 900 seconds
        Test ended at 2017-02-08 22:14:47
 
Test case timeout after 900 seconds
 
== /usr/local/mariadb-10.2.4-linux-x86_64/mysql-test/var/2/log/partition_float_myisam.log == 
-2.2250738585072014e-208
0
1.5
1234.567
2.2250738585072016e208
select * from t2 where a=1234.567;
a
1234.567
delete from t2 where a=1234.567;
select * from t2;
a
-2.2250738585072016e208
-1.5
-1
-2.2250738585072014e-208
0
1.5
2.2250738585072016e208
delete from t2;
16384*3 inserts;

And retry took only 20 seconds:

parts.partition_float_myisam             w2 [ retry-pass ]  20540

So, there might be more to this than just a slow machine.

On the other hand, the timeout on parts.partition_float_myisam happened 5 times over the last year, twice on work-valgrind and 3 times on centos5-amd64.
For parts.partition_int_myisam (it's a longer test) – 11 times over the last year, 8 times on work-valgrind and 3 times on centos5-amd64.

So, maybe it is something specific to the machines after all.

Comment by Elena Stepanova [ 2017-02-18 ]

Assuming it's just a slow machine, I think a reasonable solution for these tests is splitting them into logically separate parts.
partition_float_myisam is in fact two tests:

--source suite/parts/inc/partition_float.inc
--source suite/parts/inc/partition_double.inc

partition_int_myisam is 5 tests:

--source suite/parts/inc/partition_tinyint.inc
--source suite/parts/inc/partition_smallint.inc
--source suite/parts/inc/partition_int.inc
--source suite/parts/inc/partition_mediumint.inc
--source suite/parts/inc/partition_bigint.inc

There is literally no gain in combining these parts under one mega-test: there is no preparation work, and not even test-specific server parameters which would cause a server restart. The test simply sets several variables and then runs the sequence of include files. From every point of view, it's more correct to split it. I'm going to do just that.

Comment by Elena Stepanova [ 2017-02-18 ]

Comparative timing for combined vs split tests (it shows no loss in either server restarts or test execution time):

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
parts.partition_float_innodb 'innodb_plugin' [ pass ]   1453
parts.partition_int_innodb 'innodb_plugin' [ pass ]   1232
worker[1] > Restart [mysqld.1 - pid: 445, winpid: 445] - running with different options '--innodb --innodb-cmpmem --innodb-trx --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --enable-partition' != '--ignore-builtin-innodb --plugin-load=ha_innodb.so --innodb --innodb-cmpmem --innodb-trx --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --enable-partition'
parts.partition_float_innodb 'xtradb'    [ pass ]   1497
parts.partition_int_innodb 'xtradb'      [ pass ]   1183
worker[1] > Restart [mysqld.1 - pid: 499, winpid: 499] - running with different options '--enable-partition' != '--innodb --innodb-cmpmem --innodb-trx --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --enable-partition'
parts.partition_float_myisam             [ pass ]  28321
parts.partition_int_myisam               [ pass ]  69986
--------------------------------------------------------------------------
The servers were restarted 2 times
Spent 103.672 of 117 seconds executing testcases

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
parts.partition_bigint_innodb 'innodb_plugin' [ pass ]    284
parts.partition_double_innodb 'innodb_plugin' [ pass ]    738
parts.partition_float_innodb 'innodb_plugin' [ pass ]    752
parts.partition_int_innodb 'innodb_plugin' [ pass ]    287
parts.partition_mediumint_innodb 'innodb_plugin' [ pass ]    268
parts.partition_smallint_innodb 'innodb_plugin' [ pass ]    416
parts.partition_tinyint_innodb 'innodb_plugin' [ pass ]    239
worker[1] > Restart [mysqld.1 - pid: 1339, winpid: 1339] - running with different options '--innodb --innodb-cmpmem --innodb-trx --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --enable-partition' != '--ignore-builtin-innodb --plugin-load=ha_innodb.so --innodb --innodb-cmpmem --innodb-trx --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --enable-partition'
parts.partition_bigint_innodb 'xtradb'   [ pass ]    408
parts.partition_double_innodb 'xtradb'   [ pass ]    649
parts.partition_float_innodb 'xtradb'    [ pass ]    652
parts.partition_int_innodb 'xtradb'      [ pass ]    359
parts.partition_mediumint_innodb 'xtradb' [ pass ]    448
parts.partition_smallint_innodb 'xtradb' [ pass ]    343
parts.partition_tinyint_innodb 'xtradb'  [ pass ]    186
worker[1] > Restart [mysqld.1 - pid: 1472, winpid: 1472] - running with different options '--enable-partition' != '--innodb --innodb-cmpmem --innodb-trx --innodb-buffer-pool-stats --innodb-buffer-page --innodb-buffer-page-lru --enable-partition'
parts.partition_bigint_myisam            [ pass ]  15853
parts.partition_double_myisam            [ pass ]  13525
parts.partition_float_myisam             [ pass ]  13687
parts.partition_int_myisam               [ pass ]  16200
parts.partition_mediumint_myisam         [ pass ]  16396
parts.partition_smallint_myisam          [ pass ]  16064
parts.partition_tinyint_myisam           [ pass ]    143
--------------------------------------------------------------------------
The servers were restarted 2 times
Spent 97.897 of 117 seconds executing testcases
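
As a cross-check of the two runs above, here is a minimal sketch that sums the per-test timings. The numbers are copied from the logs and treated as milliseconds, which is an assumption that fits the "Spent ... seconds" summary lines. The names `combined`, `split` and `total_ms` are made up for this illustration and are not part of MTR.

```python
# Per-test timings copied from the two MTR runs above (assumed to be milliseconds).
combined = {
    "innodb_plugin": [1453, 1232],
    "xtradb": [1497, 1183],
    "myisam": [28321, 69986],
}
split = {
    "innodb_plugin": [284, 738, 752, 287, 268, 416, 239],
    "xtradb": [408, 649, 652, 359, 448, 343, 186],
    "myisam": [15853, 13525, 13687, 16200, 16396, 16064, 143],
}


def total_ms(run):
    """Sum the timings of every test in a run, across all engines."""
    return sum(sum(times) for times in run.values())


combined_total = total_ms(combined)
split_total = total_ms(split)

# Per-engine comparison: split tests should not cost more than the combined ones.
for engine in combined:
    print(engine, sum(combined[engine]), sum(split[engine]))

print("combined total, s:", combined_total / 1000)
print("split total, s:", split_total / 1000)
```

The totals agree with the "Spent 103.672" and "Spent 97.897" lines, and the split MyISAM tests add up to slightly less than the two combined ones in this run.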

InnoDB tests are much smaller than MyISAM (they insert less data), but there has been one occasion of a timeout on partition_float_innodb on a P8 builder, so I'll split them as well.

Comment by Elena Stepanova [ 2017-02-18 ]

https://github.com/MariaDB/server/commit/6364adb199f8adbc5adfe0c276bdf2d3dd17454c

Comment by Elena Stepanova [ 2017-02-19 ]

Apparently the problems on valgrind builders and on CentOS 5 are completely different. Valgrind is just a slow builder, but CentOS has some issue which affects communication between the server and the client. In the scope of this JIRA entry, the slow builder problem is being solved; splitting the tests into logical parts should help with that. The CentOS problem is filed as MDEV-12084 and should be handled separately.

https://github.com/MariaDB/server/commit/6364adb199f8adbc5adfe0c276bdf2d3dd17454c

Generated at Thu Feb 08 07:43:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.