Update gpu_testing_bot_details.md

- No longer sharing GCEs with Chromium in chromium.star
- tools/build repo restructure
- No more mb_config_buckets.pyl

Bug: 1123215
Change-Id: I3874ba6a63873dc560837c87372ee22a6a7a061a
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2444440
Reviewed-by: Brian Sheedy <bsheedy@chromium.org>
Commit-Queue: Yuly Novikov <ynovikov@chromium.org>
Cr-Commit-Position: refs/heads/master@{#813284}
Yuly Novikov
2020-10-02 18:23:43 +00:00
committed by Commit Bot
parent bfd85c7106
commit 55b23a6412
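
The `tools/build` restructure in this commit mostly replaces the `scripts/slave/` prefix with `recipes/`. As a rough illustration only, the sketch below rewrites old-style paths using the prefix pairs visible in the hunks that follow. The helper name `migrate_path` and the `PREFIX_MAP` table are hypothetical and are not part of `tools/build`.

```python
# Hypothetical helper, not part of tools/build. The prefix pairs are copied
# from the path changes shown in this commit's diff.
PREFIX_MAP = [
    ("scripts/slave/recipe_modules/", "recipes/recipe_modules/"),
    ("scripts/slave/recipes/", "recipes/recipes/"),
    ("scripts/slave/recipes.py", "recipes/recipes.py"),
]


def migrate_path(path: str) -> str:
    """Return the post-restructure path, or the input unchanged if no prefix matches."""
    for old, new in PREFIX_MAP:
        if path.startswith(old):
            return new + path[len(old):]
    return path


if __name__ == "__main__":
    for p in (
        "scripts/slave/recipes/chromium_trybot.py",
        "scripts/slave/recipe_modules/chromium_tests/trybots.py",
        "scripts/slave/recipes.py",
    ):
        print(p, "->", migrate_path(p))
```

A pure prefix rewrite does not cover every change: the diff also moves `chromium_gpu.py` and `chromium_gpu_fyi.py` into a `builders/` subdirectory, so those links need to be updated by hand.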

@ -66,7 +66,7 @@ differences in behavior between the tryservers and waterfall bots. Since the
tryservers mirror waterfall bots, if the waterfall bot is working, the
tryserver must almost inherently be working as well.
[chromium_trybot.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/recipes/chromium_trybot.py
[chromium_trybot.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/recipes/recipes/chromium_trybot.py
There are some GPU configurations on the waterfall backed by only one machine,
or a very small number of machines in the Swarming pool. A few examples are:
@ -137,7 +137,7 @@ See [Adding new steps to the GPU bots] for details on this process.
In the [`tools/build`][tools/build] workspace:
* `scripts/slave/recipe_modules/chromium_tests/`:
* `recipes/recipe_modules/chromium_tests/`:
* [`chromium_gpu.py`][chromium_gpu.py] and
[`chromium_gpu_fyi.py`][chromium_gpu_fyi.py] define the following for
each builder and tester:
@ -177,9 +177,9 @@ In the [`tools/build`][tools/build] workspace:
specific hardware configuration.
[tools/build]: https://chromium.googlesource.com/chromium/tools/build/
[chromium_gpu.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/recipe_modules/chromium_tests/chromium_gpu.py
[chromium_gpu_fyi.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/recipe_modules/chromium_tests/chromium_gpu_fyi.py
[trybots.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/recipe_modules/chromium_tests/trybots.py
[chromium_gpu.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/recipes/recipe_modules/chromium_tests/builders/chromium_gpu.py
[chromium_gpu_fyi.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/recipes/recipe_modules/chromium_tests/builders/chromium_gpu_fyi.py
[trybots.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/recipes/recipe_modules/chromium_tests/trybots.py
In the [`chromium/src`][chromium/src] workspace:
@ -208,9 +208,6 @@ In the [`chromium/src`][chromium/src] workspace:
behavior in the GN build.
* [`src/tools/mb/mb_config.pyl`][mb_config.pyl]
* Defines the GN arguments for all of the bots.
* [`src/tools/mb/mb_config_buckets.pyl`][mb_config_buckets.pyl]
* A new version of [`mb_config.pyl`][mb_config.pyl] that should supersede
it.
* [`src/infra/config`][src/infra/config]:
* Definitions of how bots are organized on the waterfall,
how builds are triggered, which VMs or machines are used for the
@ -226,7 +223,6 @@ In the [`chromium/src`][chromium/src] workspace:
[chromium.gpu.fyi.json]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/chromium.gpu.fyi.json
[gn_isolate_map.pyl]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/gn_isolate_map.pyl
[mb_config.pyl]: https://chromium.googlesource.com/chromium/src/+/master/tools/mb/mb_config.pyl
[mb_config_buckets.pyl]: https://chromium.googlesource.com/chromium/src/+/master/tools/mb/mb_config_buckets.pyl
[generate_buildbot_json.py]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/generate_buildbot_json.py
[mixins.pyl]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/mixins.pyl
[waterfalls.pyl]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/waterfalls.pyl
@ -245,9 +241,6 @@ sorry):
GPUs. New GPU hardware should be added to this pool.
* Also defines the GCEs, Mac VMs and Mac machines used for CI builders
on GPU and GPU.FYI waterfalls and trybots.
* [`chromium.star`][chromium.star]
* Defines Swarming pools of GCEs, shared with Chromium, which are used
for CI builders on GPU and GPU.FYI waterfalls and trybots.
* [`pools.cfg`][pools.cfg]
* Defines the Swarming pools for GCEs and Mac VMs used for manually
triggered trybots.
@ -302,17 +295,14 @@ The process is:
to be added to the right Swarming pools in a CL in the
[`infradata/config`][infradata/config] (Google internal) workspace.
1. GCEs for Windows CI builders and builder/testers should be added to
`luci-chromium-ci-win10-8` group in [`chromium.star`][chromium.star].
[Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/2077803).
`luci-chromium-gpu-ci-win10-8` group in [`gpu.star`][gpu.star].
1. GCEs for Linux and Android CI builders and builder/testers should be added to
one of `luci-chromium-ci-xenial-*-8` groups (but not `*ssd-8`) in
[`chromium.star`][chromium.star].
[Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/2077803).
`luci-chromium-gpu-ci-xenial-8` group in [`gpu.star`][gpu.star].
1. VMs for Mac CI builders and builder/testers should be added to
`gpu_ci_bots` group in [`gpu.star`][gpu.star].
`builderfull_gpu_ci_bots` group in [`gpu.star`][gpu.star].
[Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/1166889).
1. GCEs for CI testers for all OSes should be added to
`luci-chromium-ci-xenial-2` group in [`chromium.star`][chromium.star].
`luci-chromium-gpu-ci-xenial-2` group in [`gpu.star`][gpu.star].
[Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/2016410).
1. GCEs and VMs for CQ and optional CQ GPU trybots should be added to
a corresponding `gpu_try_bots` group in [`gpu.star`][gpu.star].
@ -347,7 +337,8 @@ The process is:
longer be necessary per [crbug.com/942301](http://crbug.com/942301),
but consult with the Chrome Infra team to find out which of the
[zones](https://cloud.google.com/compute/docs/regions-zones/) has
available capacity.
available capacity. This can also be checked on the Viceroy
[dashboard](https://viceroy.corp.google.com/chrome_infra/Quota/chrome?duration=7d).
1. Get this reviewed and landed. This step associates the VM or pool of VMs
with the bot's name on the waterfall for "builderful" bots or increases
swarmed pool capacity for "builderless" bots.
@ -440,8 +431,7 @@ Builder].
1. Run `main.star` in [`src/infra/config`][src/infra/config] to update the
generated files. Double-check your work there.
1. If you were adding a new builder, you would need to also add the new
machine to [`src/tools/mb/mb_config.pyl`][mb_config.pyl] and
[`src/tools/mb/mb_config_buckets.pyl`][mb_config_buckets.pyl].
machine to [`src/tools/mb/mb_config.pyl`][mb_config.pyl].
1. After the Chromium-side CL lands it will take some time for all of
the configuration changes to be picked up by the system. The bot
@ -454,7 +444,7 @@ Builder].
following. Here's an [example
CL](https://chromium-review.googlesource.com/1041145).
1. Adds the new bot to [`chromium_gpu_fyi.py`][chromium_gpu_fyi.py] in
`scripts/slave/recipe_modules/chromium_tests/`. Make sure to set the
`recipes/recipe_modules/chromium_tests/builders/`. Make sure to set the
`serialize_tests` property to `True`. This is specified for waterfall
bots, but not trybots, and helps avoid overloading the physical
hardware. Double-check the `BUILD_CONFIG` and `parent_buildername`
@ -464,7 +454,7 @@ Builder].
the newly-deployed waterfall bot, so it knows which JSON file to load
out of src/testing/buildbot and which entry to look at.
1. Sometimes it is necessary to retrain recipe expectations
(`scripts/slave/recipes.py test train`). This is usually needed only
(`recipes/recipes.py test train`). This is usually needed only
if the bot adds untested code flow in a recipe, but it's something
to watch out for if your CL fails presubmit for some reason.
@ -502,8 +492,8 @@ writing only NVIDIA). To do this:
before proceeding.
1. Create a CL in the [`tools/build`][tools/build] workspace, adding the new
Release tester to `win10_chromium_x64_rel_ng`'s `bot_ids` list
in `scripts/slave/recipe_modules/chromium_tests/trybots.py`. Rerun
`scripts/slave/recipes.py test train`.
in `recipes/recipe_modules/chromium_tests/trybots.py`. Rerun
`recipes/recipes.py test train`.
1. Once the above CL lands, the commit queue will **immediately** start
running tests on the CoolNewGPUType configuration. Be vigilant and make
sure that tryjobs are green. If they are red for any reason, revert the CL
@ -557,7 +547,7 @@ trybot for the Win7 NVIDIA GPUs in Release mode. We will call the new bot
CL](https://chromium-review.googlesource.com/c/chromium/tools/build/+/1979113).
1. Adds the new trybot to a "Manually-triggered GPU trybots" section in
`scripts/slave/recipe_modules/chromium_tests/trybots.py`. Create this
`recipes/recipe_modules/chromium_tests/tests/trybots.py`. Create this
section after the "Optional GPU bots" section for the appropriate
tryserver (`tryserver.chromium.win`, `tryserver.chromium.mac`,
`tryserver.chromium.linux`, `tryserver.chromium.android`). Have the bot
@ -570,7 +560,7 @@ trybot for the Win7 NVIDIA GPUs in Release mode. We will call the new bot
tests to run and on what physical hardware.
1. It may be necessary to retrain recipe expectations for
[`tools/build`][tools/build] workspace CLs
(`scripts/slave/recipes.py test train`). This shouldn't be necessary
(`recipes/recipes.py test train`). This shouldn't be necessary
for just adding a manually triggered trybot, but it's something to
watch out for if your CL fails presubmit for some reason.
@ -628,15 +618,14 @@ Win10 Release (CoolNewGPUType)".
[`luci-scheduler.cfg`][luci-scheduler.cfg],
[`cr-buildbucket.cfg`][cr-buildbucket.cfg]. Double-check your work
there.
1. Update [`src/tools/mb/mb_config.pyl`][mb_config.pyl] and
[`src/tools/mb/mb_config_buckets.pyl`][mb_config_buckets.pyl]
1. Update [`src/tools/mb/mb_config.pyl`][mb_config.pyl]
to include `win-myproject-rel`.
1. *After* the Chromium-side CL lands and the bot is on the console, create a CL
in the [`tools/build`][tools/build] workspace which does the
following. Here's an [example CL](https://crrev.com/c/1554272).
1. Adds "MyProject GPU Win10 Release
(CoolNewGPUType)" to [`chromium_gpu_fyi.py`][chromium_gpu_fyi.py] in
`scripts/slave/recipe_modules/chromium_tests/`. You can copy a similar
`recipes/recipe_modules/chromium_tests/builders/`. You can copy a similar
step.
1. Adds `win-myproject-rel` to [`trybots.py`][trybots.py] in the same folder.
This is where you associate "MyProject GPU Win10 Release