Details
Bug
Status: Closed
Major
Resolution: Fixed
BB V1.0
None
Sprint 2 (27.01.2025)
Description
Bug identified after migrating PROD to containers.
How to reproduce
Start a Docker latent builder in DEV, for example tarball-docker, and you get a container name conflict:
https://buildbot.dev.mariadb.org/#/builders/110/builds/99/steps/0/logs/err_html
The container name "/buildbot-hz-bbw1-docker-tarball-debian-12-15cc21" is already in use by container
This causes the worker to RETRY, and it sometimes blocks the Production builder, which stays stuck in "Preparing worker".
Why?
Because the master names are now the SAME in both DEV and PROD.
When the same builder is started on the SAME worker host at the same time from both environments, the above error occurs.
Details
This is how Buildbot gives "unique" names to containers on a worker HOST:
(f'buildbot-{self.workername}-{self.masterhash}').replace("_", "-")
Example of a container name on a worker host: buildbot-hz-bbw1-docker-tarball-debian-12-15cc21
The last part is the important one, i.e. self.masterhash (15cc21).
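The naming expression above can be reproduced outside Buildbot as a minimal sketch. The helper name container_name and the way the example name is split into a worker name and a hash are assumptions made for illustration; only the f-string and the replace call are taken from this ticket.

```python
def container_name(workername: str, masterhash: str) -> str:
    """Hypothetical helper mirroring the naming expression quoted in this ticket."""
    # Underscores are replaced by hyphens in the final container name.
    return f"buildbot-{workername}-{masterhash}".replace("_", "-")


# Worker name assumed from the example container name in this ticket.
name = container_name("hz-bbw1-docker-tarball-debian-12", "15cc21")
print(name)  # buildbot-hz-bbw1-docker-tarball-debian-12-15cc21
```

Since the worker name is identical in DEV and PROD, the masterhash suffix is the only part that can tell the two environments apart.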
How Buildbot calculates this hash
self.name = f"{self.hostname}:{os.path.abspath(self.basedir or '.')}"
masterName = unicode2bytes(self.master.name)
self.masterhash = hashlib.sha1(masterName).hexdigest()[:6]
For example:
self.hostname = autogen_aarch64-master-0 # name of the master container
os.path.abspath(self.basedir or '.') = /srv/buildbot/master/autogen/aarch64-master-0 # path to the master config in the container
Resulting in self.name = autogen_aarch64-master-0:/srv/buildbot/master/autogen/aarch64-master-0
The masterhash is the SAME in both DEV and PROD.
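The collision can be illustrated with a small standalone sketch of the hash calculation. It assumes that encoding the name as UTF-8 is equivalent to what unicode2bytes does here, and the helper names master_name and master_hash as well as the hostname dev_autogen_aarch64-master-0 are hypothetical, chosen only to show the effect of a different hostname.

```python
import hashlib
import os


def master_name(hostname: str, basedir: str) -> str:
    # Same shape as self.name in this ticket: "<hostname>:<absolute basedir>".
    return f"{hostname}:{os.path.abspath(basedir or '.')}"


def master_hash(name: str) -> str:
    # First 6 hex digits of the SHA-1 of the encoded master name.
    # UTF-8 encoding is an assumption standing in for unicode2bytes.
    return hashlib.sha1(name.encode("utf-8")).hexdigest()[:6]


BASEDIR = "/srv/buildbot/master/autogen/aarch64-master-0"

# PROD and DEV currently share hostname and basedir, so the inputs are identical.
prod = master_hash(master_name("autogen_aarch64-master-0", BASEDIR))
dev_same = master_hash(master_name("autogen_aarch64-master-0", BASEDIR))

# Made-up DEV hostname, for illustration only.
dev_renamed = master_hash(master_name("dev_autogen_aarch64-master-0", BASEDIR))

print(prod == dev_same)   # True: identical inputs always give the same hash
print(prod, dev_renamed)  # the renamed master yields a different input string
```

Because the hash depends only on the hostname and the base directory, any change to the DEV hostname changes the hashed string and, with it, the container name suffix.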
Proposed solution
Change the hostname of the master containers in DEV.
Definition of done
Take, for example, tarball-docker running on master-protected-branches.
It currently always renders:
buildbot-hz-bbw1-docker-tarball-debian-12-15cc21
or
buildbot-hz-bbw4-docker-tarball-debian-12-15cc21
which is the same on both DEV and PROD.
After changing the master hostname in DEV, the DEV container names should end in something other than -15cc21.