[MDEV-22592] Travis-CI broken for 10.5 in recent commit - Why does nobody care? - Jira

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Done
Affects Version/s: 10.5
Fix Version/s: N/A
Component/s: N/A
Labels: None

Description

I noticed that Travis-CI has stopped passing for the 10.5 branch:

https://travis-ci.org/github/MariaDB/server/branches

Last successful one:
https://travis-ci.org/github/MariaDB/server/builds/686095229

First failing one:
https://travis-ci.org/github/MariaDB/server/builds/686533438

(there are also a couple cancelled in between)

The failure is due to https://jira.mariadb.org/browse/MDEV-21976 (currently assigned to sanja but no work yet). Fixing that issue would solve it permanently.

However, since it is not fixed, it was disabled by me in https://github.com/MariaDB/server/commit/a135f0ab88d63b9a8976d6b3010f27766c38873d (when https://github.com/MariaDB/server/pull/1484 was merged by marko).

This change to mysql-test/unstable-tests was lost in a merge commit. The fix to this is trivial: add back the line in mysql-test/unstable-tests.
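To keep such an entry from silently disappearing in a later merge, a small check could be run before pushing. The following is only a minimal sketch, not an existing MariaDB tool. It assumes the file holds one "suite.test : reason" entry per line with '#' starting a comment. The function names and the test names in the sample are hypothetical placeholders.

```python
def listed_tests(text):
    """Return the set of test names found in unstable-tests style text.

    Assumed format (not verified against the real file): one
    "suite.test : reason" entry per line, '#' starts a comment.
    """
    names = set()
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # The test name is everything before the first colon.
        names.add(line.split(":", 1)[0].strip())
    return names


def missing_entries(text, required):
    """Return the required test names that are absent from the list, sorted."""
    return sorted(set(required) - listed_tests(text))


# Hypothetical sample content; real entries would come from
# mysql-test/unstable-tests.
sample = """\
# placeholder header comment
main.example_a : placeholder reason
main.example_b : placeholder reason
"""

print(missing_entries(sample, ["main.example_a", "main.example_c"]))
```

In a real setup the text would be read from mysql-test/unstable-tests and the required names kept next to the CI configuration, so that a lost line shows up as a non-empty result.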

However, the underlying issue here is that current MariaDB Server practices allow Travis-CI to stay broken, and once it is broken:

All new and updated pull requests at https://github.com/MariaDB/server/pulls will start to fail, communicating indirectly to both contributors and reviewers that the code is broken and not worth reviewing until at least the CI passes
Any new contributors branching off the latest development git branch will have a failing CI as the starting point, which most likely puts them off.
Quality deteriorates, since once the CI starts failing, people start to ignore all results from the CI and more and more failures start to creep in.

And so on. I hope you get the point: a failing CI is counter-productive and wastes a lot of human effort that could otherwise go into productive development.

Now what can we do about this?

Is there a need for more education? Travis-CI was added as the first and only CI system accessible to outside contributors in August 2016. Surely all developers have had a chance to learn about it? Or are there some obstacles? Should we maybe organize a webinar where we quickly go through what Travis-CI is, what the lines in .travis.yml mean and how to browse Travis-CI.org to look at build results or debug them?

Are the Travis-CI tests bad? There are no open bug reports in Jira complaining about Travis-CI.

I think the underlying problem here is the same reason why there are so many failures on buildbot.askmonty.org and buildbot.mariadb.org as well. Way too many people are taking the wrong tradeoff in the decision about "Just get it done and move on, don't wait for tests" vs "Work on something else, only merge once tests complete".

What do you think? What should be done about this to improve the situation, to improve the quality of MariaDB both by current developers and future contributors, and speed up the progress by having less breakage and steps backwards?

Attachments

image-2020-05-16-11-09-57-613.png
33 kB
2020-05-16 08:09
screenshot-1.png
125 kB
2020-05-16 10:14
screenshot-2.png
11 kB
2020-08-03 06:52
screenshot-3.png
12 kB
2020-08-03 06:53
screenshot-4.png
11 kB
2020-10-13 19:11
screenshot-5.png
12 kB
2020-10-13 19:11
screenshot-6.png
12 kB
2020-10-13 19:12
screenshot-7.png
12 kB
2020-10-13 19:12
screenshot-8.png
12 kB
2020-10-13 19:13

Issue Links

is blocked by

MDEV-22173 OSX built mariadbd cannot connections [ERROR] Error in accept: Bad file descriptor

Closed

relates to

MDEV-23378 Memory leak in sys_vars.thread_pool_size_high

Closed

Activity

Marko Mäkelä added a comment - 2020-08-03 07:26 - edited

otto, for a recent 10.3 build https://travis-ci.org/github/MariaDB/server/builds/714262380 I see two failures apparently due to bad connectivity when trying to download clang:
Could not connect to apt.llvm.org:80 (199.232.66.49), connection timed out
E: Unable to locate package clang-5.0
Maybe we should try to figure out a solution that allows such build-time dependencies to be cached? The commit was only disabling a test (no code changes).

For a 10.2 build, I see something else:
The job exceeded the maximum time limit for jobs, and has been terminated. (Do we really have to spend time on building TokuDB on Travis? It is not getting updates, and was finally removed in 10.5/10.6 by ~~MDEV-19780~~.)
Errors/warnings were found in logfiles during server shutdown that I have also seen on http://buildbot.askmonty.org from time to time:

10.2 dc716da4571465af3adadcd2c471f11fef3a2191
Warnings generated in error logs during shutdown after running tests: sys_vars.thread_pool_size_high

Warning: Memory not freed: 38408

There does not appear to be any bug report for this memory leak yet.

In my opinion, the proverb that I heard at the compulsory service of the Finnish defence force applies: "Valvomaton käsky on kasku." (An unenforced order is a joke.) If nobody spends effort on monitoring Travis test failures, they are going to be rather useless. Build failures probably do bring some more value (if someone notices them before the breakage reaches a release).

Otto Kekäläinen added a comment - 2020-08-03 07:32

If there are a lot of false positives, then I suggest we simply disable those tests. It will also make the suite run faster.

Once Ubuntu 20.04 is available on Travis-CI we can get rid of those extra dependencies and thus streamline the config. WIP at https://github.com/MariaDB/server/pull/1507

Marko Mäkelä added a comment - 2020-08-03 07:56 - edited

For the genuine 10.2 failure, I filed ~~MDEV-23378~~ Memory leak in sys_vars.thread_pool_size_high

Otto Kekäläinen added a comment - 2020-10-13 19:13

Nice to see that nowadays Travis-CI seems to be all green and people are not ignoring the results of it!

Thanks!

Screenshots from https://travis-ci.org/github/MariaDB/server/branches

Daniel Black added a comment - 2020-10-13 20:46

You're welcome.
