[MDEV-10615] archive.archive-big fails sporadically in buildbot, table marked as crashed Created: 2016-08-20 Updated: 2017-11-06 Resolved: 2017-11-05 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Tests |
| Affects Version/s: | 10.0, 10.1, 10.2 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Elena Stepanova | Assignee: | Elena Stepanova |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Sprint: | 5.5.55, 10.1.29 | ||||||||
| Description |
|
http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest/builds/6155/steps/test_2/logs/stdio
~15 occurrences over 1.5 years, so it's not a one-time coincidence. |
| Comments |
| Comment by Elena Stepanova [ 2017-02-18 ] | ||||||||||||||||||||||||||||
|
Another variation:
This one certainly happens due to disk space problems. | ||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2017-05-19 ] | ||||||||||||||||||||||||||||
|
We haven't had the problem (any failures with this test, actually) for almost a year, since 2016-06-26, so I think it's safe to assume for now that the problem has gone away. | ||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2017-09-22 ] | ||||||||||||||||||||||||||||
|
http://buildbot.askmonty.org/buildbot/builders/kvm-fulltest-big/builds/1198
Attached stdio and the error log, while we still have them. | ||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2017-11-05 ] | ||||||||||||||||||||||||||||
|
The initial corruption, which was happening at the beginning of the test, was most likely caused by previous tests that finished dirty. It stopped happening a while ago, possibly after we added more protection against dirty tests.
The second failure, 'Got error -1 "Internal error < 0"', is easily reproducible under limited disk/memory, e.g. on my machine it happens when I run the test in shm. Not much can be done about it; it's an environmental problem. We don't run big tests in shm in buildbot anymore, so the error stopped occurring there.
The failure from the last comment remains a mystery.
But the error log has no sign of the test being retried. Instead, it claims that another test was started right away:
All of this happened around the time when the suite was supposed to time out, while the other worker was running questionable rocksdb tests. We'll have to see if it ever occurs again. |
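Since the second failure is attributed to limited disk/memory, one way to guard against it locally is to check free space on the test's working directory before starting a big test. The following is a minimal sketch, not part of the test suite or of MTR: the function name `has_enough_space`, the `/dev/shm` path, and the 4 GiB threshold are assumptions chosen for illustration, and the space that archive.archive-big actually needs is not stated in this report.

```python
import os
import shutil


def has_enough_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least
    `required_bytes` of free space."""
    usage = shutil.disk_usage(path)
    return usage.free >= required_bytes


if __name__ == "__main__":
    # Hypothetical values: adjust the directory and the threshold
    # to the environment where the big test is going to run.
    vardir = "/dev/shm"
    required = 4 * 1024 ** 3  # 4 GiB, an assumed figure

    if not os.path.isdir(vardir):
        print(f"{vardir} does not exist, nothing to check")
    elif has_enough_space(vardir, required):
        print(f"{vardir} has enough free space for the big test")
    else:
        print(f"{vardir} is short on space, the big test may fail")
```

A check like this only rules out the disk-space cause; it says nothing about the corruption from dirty tests or about the unexplained failure from the 2017-09-22 comment.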