Details
-
Task
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
2025-6, 2025-9
Description
Please update how burza is run in buildbot, using the Ansible deployment description as a reference: https://github.com/mariadb-corporation/burza/blob/main/burza/deploy/
Most of the changes apply only to multinode mode (see multinode.yml).
Starting Ray before running burza
The only change common to both monoburza and multiburza is that we must start Ray before running burza and stop it afterwards. This should help us avoid the giant errors that occurred even after successful burza runs.
So in monoburza:
poetry run ray start --head --port=6379 --disable-usage-stats
<wait for ray status --address=127.0.0.1:6379 to return 0>
poetry run burza run-tests...
poetry run ray stop --force
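The monoburza sequence (start Ray, wait for it, run burza, always stop Ray) could be driven from Python, buildbot's configuration language, roughly as follows. This is only a minimal sketch, not the actual buildbot integration: `wait_for_ray`, `run_burza_with_ray`, the injectable `run` callable and the timeout values are hypothetical, and running `ray status` through `poetry run` is an assumption. Only the command lines themselves come from this ticket, and the burza arguments are left to the caller because the ticket elides them.

```python
import subprocess
import time

# Commands taken from the ticket; "poetry run" in RAY_STATUS is an assumption.
RAY_START = ["poetry", "run", "ray", "start", "--head", "--port=6379",
             "--disable-usage-stats"]
RAY_STATUS = ["poetry", "run", "ray", "status", "--address=127.0.0.1:6379"]
RAY_STOP = ["poetry", "run", "ray", "stop", "--force"]


def _exit_code(argv):
    """Default runner (hypothetical helper): run argv, return its exit code."""
    return subprocess.run(argv).returncode


def wait_for_ray(run=_exit_code, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll `ray status` until it exits with 0, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        if run(RAY_STATUS) == 0:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError("ray status did not return 0 in time")
        sleep(interval)


def run_burza_with_ray(burza_argv, run=_exit_code):
    """Start Ray, wait until it is ready, run burza, and always stop Ray.

    Returns burza's exit code.
    """
    try:
        if run(RAY_START) != 0:
            raise RuntimeError("ray start failed")
        wait_for_ray(run)
        return run(burza_argv)
    finally:
        # Stop Ray even if startup, the wait, or burza itself failed.
        run(RAY_STOP)
```

The `finally` block is the point of the sketch: Ray is stopped whether burza succeeds or fails, so no Ray processes are left behind on the worker.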
In multiburza we must also add this option to the ray start command on every node:
--resources="{\"node_name:<node_name>\": 1}"
RAY_NODE_NAME is now CLUSTER_NODE_NAME
It turns out that Ray assigns its own meaning to the RAY_NODE_NAME environment variable, so we must now pass CLUSTER_NODE_NAME instead (head, replica1, replica2, ...).
Options to set on head
CLUSTER_NODE_NAME: "head"
SECONDARY_NODE_NAMES: "replica1,replica2"
RESTART_WITH_MCS_CLUSTER: "true"
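To illustrate how the head settings follow from the cluster layout, the sketch below derives them from a list of replica names. The function name `head_env` is hypothetical and does not come from burza or buildbot. The keys and values are the ones listed above, and passing them as plain string environment variables is an assumption.

```python
def head_env(replica_names):
    """Return the head-node settings for the given replica names."""
    if "head" in replica_names:
        raise ValueError('"head" is reserved for the head node')
    return {
        # CLUSTER_NODE_NAME replaces RAY_NODE_NAME, which Ray interprets itself.
        "CLUSTER_NODE_NAME": "head",
        # Comma-separated, in the same order as the replicas are named.
        "SECONDARY_NODE_NAMES": ",".join(replica_names),
        "RESTART_WITH_MCS_CLUSTER": "true",
    }
```

For a two-replica cluster, `head_env(["replica1", "replica2"])` yields exactly the three values shown above.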
Options to set on replicas
CLUSTER_NODE_NAME: "replicaN"
TEST_RUNNER: "freeloader_test_runner"
DATA_POINT_GENERATORS: "cpu_load,memory_stats"
# All exporters/report generators are run on primary
REPORT_GENERATORS: []
EXPORTERS: []
# Replicas have a reduced set of MCS services
CPU_LOAD_PROCESS_NAMES: "PrimProc,StorageManager,WriteEngineServer,workernode"
MEM_USAGE_PROCESS_NAMES: "PrimProc,StorageManager,WriteEngineServer,workernode"
# Cluster is restarted by primary
RESTART_DB_BEFORE_TEST_CASE: "false"
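Because every replica gets the same settings apart from its name, they can be generated per node. The following is a sketch under assumptions: `replica_env` and `REPLICA_PROCESSES` are hypothetical names, and the empty lists are kept as Python lists mirroring the settings above, since the ticket does not say how they are serialized when passed to burza.

```python
# Reduced set of MCS services that run on replicas (from the ticket).
REPLICA_PROCESSES = "PrimProc,StorageManager,WriteEngineServer,workernode"


def replica_env(index):
    """Return the settings for replica number `index` (1-based)."""
    if index < 1:
        raise ValueError("replica index starts at 1")
    return {
        "CLUSTER_NODE_NAME": f"replica{index}",
        "TEST_RUNNER": "freeloader_test_runner",
        "DATA_POINT_GENERATORS": "cpu_load,memory_stats",
        # All exporters/report generators run on the primary only.
        "REPORT_GENERATORS": [],
        "EXPORTERS": [],
        "CPU_LOAD_PROCESS_NAMES": REPLICA_PROCESSES,
        "MEM_USAGE_PROCESS_NAMES": REPLICA_PROCESSES,
        # The cluster is restarted by the primary, not by replicas.
        "RESTART_DB_BEFORE_TEST_CASE": "false",
    }
```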
Starting Ray on replicas
poetry run ray start \
--address=<head_ip:6379> \
--resources="{\"node_name:<node_name>\": 1}"