
Update gpu_testing_bot_details.md

- No longer sharing GCEs with Chromium in chromium.star
- tools/build repo restructure
- No more mb_config_buckets.pyl

Bug: 1123215
Change-Id: I3874ba6a63873dc560837c87372ee22a6a7a061a
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2444440
Reviewed-by: Brian Sheedy <bsheedy@chromium.org>
Commit-Queue: Yuly Novikov <ynovikov@chromium.org>
Cr-Commit-Position: refs/heads/master@{#813284}
This commit is contained in:
Yuly Novikov
2020-10-02 18:23:43 +00:00
committed by Commit Bot
parent bfd85c7106
commit 55b23a6412
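
The `tools/build` path changes in the hunks below can be summarized as a small prefix table. This is only an illustrative sketch read off this diff: `PATH_REWRITES` and `rewrite_path` are hypothetical names that do not exist in `tools/build`, and the table covers only the paths this commit touches. One hunk writes the new `trybots.py` location with an extra `tests/` component; the sketch follows the link definition and the other hunks instead.

```python
# Hypothetical helper, not part of tools/build: old-prefix -> new-prefix
# pairs as they appear in this commit. More specific prefixes come first.
PATH_REWRITES = [
    # chromium_gpu.py and chromium_gpu_fyi.py moved into builders/.
    ("scripts/slave/recipe_modules/chromium_tests/chromium_gpu",
     "recipes/recipe_modules/chromium_tests/builders/chromium_gpu"),
    ("scripts/slave/recipe_modules/", "recipes/recipe_modules/"),
    ("scripts/slave/recipes/", "recipes/recipes/"),
    ("scripts/slave/recipes.py", "recipes/recipes.py"),
]


def rewrite_path(path: str) -> str:
    """Return the post-restructure path, or the input if no prefix matches."""
    for old, new in PATH_REWRITES:
        if path.startswith(old):
            return new + path[len(old):]
    return path


if __name__ == "__main__":
    for p in (
        "scripts/slave/recipes/chromium_trybot.py",
        "scripts/slave/recipe_modules/chromium_tests/chromium_gpu_fyi.py",
        "scripts/slave/recipe_modules/chromium_tests/trybots.py",
        "scripts/slave/recipes.py",
    ):
        print(p, "->", rewrite_path(p))
```

Paths that match no prefix are returned unchanged, so the helper can be applied to a mixed list of links without special-casing `chromium/src` paths.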

@@ -66,7 +66,7 @@ differences in behavior between the tryservers and waterfall bots. Since the
 tryservers mirror waterfall bots, if the waterfall bot is working, the
 tryserver must almost inherently be working as well.
-[chromium_trybot.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/recipes/chromium_trybot.py
+[chromium_trybot.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/recipes/recipes/chromium_trybot.py
 There are some GPU configurations on the waterfall backed by only one machine,
 or a very small number of machines in the Swarming pool. A few examples are:
@@ -137,7 +137,7 @@ See [Adding new steps to the GPU bots] for details on this process.
 In the [`tools/build`][tools/build] workspace:
-* `scripts/slave/recipe_modules/chromium_tests/`:
+* `recipes/recipe_modules/chromium_tests/`:
 * [`chromium_gpu.py`][chromium_gpu.py] and
 [`chromium_gpu_fyi.py`][chromium_gpu_fyi.py] define the following for
 each builder and tester:
@@ -177,9 +177,9 @@ In the [`tools/build`][tools/build] workspace:
 specific hardware configuration.
 [tools/build]: https://chromium.googlesource.com/chromium/tools/build/
-[chromium_gpu.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/recipe_modules/chromium_tests/chromium_gpu.py
-[chromium_gpu_fyi.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/recipe_modules/chromium_tests/chromium_gpu_fyi.py
-[trybots.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/recipe_modules/chromium_tests/trybots.py
+[chromium_gpu.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/recipes/recipe_modules/chromium_tests/builders/chromium_gpu.py
+[chromium_gpu_fyi.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/recipes/recipe_modules/chromium_tests/builders/chromium_gpu_fyi.py
+[trybots.py]: https://chromium.googlesource.com/chromium/tools/build/+/master/recipes/recipe_modules/chromium_tests/trybots.py
 In the [`chromium/src`][chromium/src] workspace:
@@ -208,9 +208,6 @@ In the [`chromium/src`][chromium/src] workspace:
 behavior in the GN build.
 * [`src/tools/mb/mb_config.pyl`][mb_config.pyl]
 * Defines the GN arguments for all of the bots.
-* [`src/tools/mb/mb_config_buckets.pyl`][mb_config_buckets.pyl]
-* A new version of [`mb_config.pyl`][mb_config.pyl] that should supersede
-it.
 * [`src/infra/config`][src/infra/config]:
 * Definitions of how bots are organized on the waterfall,
 how builds are triggered, which VMs or machines are used for the
@@ -226,7 +223,6 @@ In the [`chromium/src`][chromium/src] workspace:
 [chromium.gpu.fyi.json]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/chromium.gpu.fyi.json
 [gn_isolate_map.pyl]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/gn_isolate_map.pyl
 [mb_config.pyl]: https://chromium.googlesource.com/chromium/src/+/master/tools/mb/mb_config.pyl
-[mb_config_buckets.pyl]: https://chromium.googlesource.com/chromium/src/+/master/tools/mb/mb_config_buckets.pyl
 [generate_buildbot_json.py]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/generate_buildbot_json.py
 [mixins.pyl]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/mixins.pyl
 [waterfalls.pyl]: https://chromium.googlesource.com/chromium/src/+/master/testing/buildbot/waterfalls.pyl
@@ -245,9 +241,6 @@ sorry):
 GPUs. New GPU hardware should be added to this pool.
 * Also defines the GCEs, Mac VMs and Mac machines used for CI builders
 on GPU and GPU.FYI waterfalls and trybots.
-* [`chromium.star`][chromium.star]
-* Defines Swarming pools of GCEs, shared with Chromium, which are used
-for CI builders on GPU and GPU.FYI waterfalls and trybots.
 * [`pools.cfg`][pools.cfg]
 * Defines the Swarming pools for GCEs and Mac VMs used for manually
 triggered trybots.
@@ -302,17 +295,14 @@ The process is:
 to be added to the right Swarming pools in a CL in the
 [`infradata/config`][infradata/config] (Google internal) workspace.
 1. GCEs for Windows CI builders and builder/testers should be added to
-`luci-chromium-ci-win10-8` group in [`chromium.star`][chromium.star].
-[Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/2077803).
+`luci-chromium-gpu-ci-win10-8` group in [`gpu.star`][gpu.star].
 1. GCEs for Linux and Android CI builders and builder/testers should be added to
-one of `luci-chromium-ci-xenial-*-8` groups (but not `*ssd-8`) in
-[`chromium.star`][chromium.star].
-[Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/2077803).
+`luci-chromium-gpu-ci-xenial-8` group in [`gpu.star`][gpu.star].
 1. VMs for Mac CI builders and builder/testers should be added to
-`gpu_ci_bots` group in [`gpu.star`][gpu.star].
+`builderfull_gpu_ci_bots` group in [`gpu.star`][gpu.star].
 [Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/1166889).
 1. GCEs for CI testers for all OSes should be added to
-`luci-chromium-ci-xenial-2` group in [`chromium.star`][chromium.star].
+`luci-chromium-gpu-ci-xenial-2` group in [`gpu.star`][gpu.star].
 [Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/2016410).
 1. GCEs and VMs for CQ and optional CQ GPU trybots for should be added to
 a corresponding `gpu_try_bots` group in [`gpu.star`][gpu.star].
@@ -347,7 +337,8 @@ The process is:
 longer be necessary per [crbug.com/942301](http://crbug.com/942301),
 but consult with the Chrome Infra team to find out which of the
 [zones](https://cloud.google.com/compute/docs/regions-zones/) has
-available capacity.
+available capacity. This also can be checked on viceroy
+[dashboard](https://viceroy.corp.google.com/chrome_infra/Quota/chrome?duration=7d).
 1. Get this reviewed and landed. This step associates the VM or pool of VMs
 with the bot's name on the waterfall for "builderful" bots or increases
 swarmed pool capacity for "builderless" bots.
@@ -440,8 +431,7 @@ Builder].
 1. Run `main.star` in [`src/infra/config`][src/infra/config] to update the
 generated files. Double-check your work there.
 1. If you were adding a new builder, you would need to also add the new
-machine to [`src/tools/mb/mb_config.pyl`][mb_config.pyl] and
-[`src/tools/mb/mb_config_buckets.pyl`][mb_config_buckets.pyl].
+machine to [`src/tools/mb/mb_config.pyl`][mb_config.pyl].
 1. After the Chromium-side CL lands it will take some time for all of
 the configuration changes to be picked up by the system. The bot
@@ -454,7 +444,7 @@ Builder].
 following. Here's an [example
 CL](https://chromium-review.googlesource.com/1041145).
 1. Adds the new bot to [`chromium_gpu_fyi.py`][chromium_gpu_fyi.py] in
-`scripts/slave/recipe_modules/chromium_tests/`. Make sure to set the
+`recipes/recipe_modules/chromium_tests/builders/`. Make sure to set the
 `serialize_tests` property to `True`. This is specified for waterfall
 bots, but not trybots, and helps avoid overloading the physical
 hardware. Double-check the `BUILD_CONFIG` and `parent_buildername`
@@ -464,7 +454,7 @@ Builder].
 the newly-deployed waterfall bot, so it knows which JSON file to load
 out of src/testing/buildbot and which entry to look at.
 1. Sometimes it is necessary to retrain recipe expectations
-(`scripts/slave/recipes.py test train`). This is usually needed only
+(`recipes/recipes.py test train`). This is usually needed only
 if the bot adds untested code flow in a recipe, but it's something
 to watch out for if your CL fails presubmit for some reason.
@@ -502,8 +492,8 @@ writing only NVIDIA). To do this:
 before proceeding.
 1. Create a CL in the [`tools/build`][tools/build] workspace, adding the new
 Release tester to `win10_chromium_x64_rel_ng`'s `bot_ids` list
-in `scripts/slave/recipe_modules/chromium_tests/trybots.py`. Rerun
-`scripts/slave/recipes.py test train`.
+in `recipes/recipe_modules/chromium_tests/trybots.py`. Rerun
+`recipes/recipes.py test train`.
 1. Once the above CL lands, the commit queue will **immediately** start
 running tests on the CoolNewGPUType configuration. Be vigilant and make
 sure that tryjobs are green. If they are red for any reason, revert the CL
@@ -557,7 +547,7 @@ trybot for the Win7 NVIDIA GPUs in Release mode. We will call the new bot
 CL](https://chromium-review.googlesource.com/c/chromium/tools/build/+/1979113).
 1. Adds the new trybot to a "Manually-triggered GPU trybots" section in
-`scripts/slave/recipe_modules/chromium_tests/trybots.py`. Create this
+`recipes/recipe_modules/chromium_tests/tests/trybots.py`. Create this
 section after the "Optional GPU bots" section for the appropriate
 tryserver (`tryserver.chromium.win`, `tryserver.chromium.mac`,
 `tryserver.chromium.linux`, `tryserver.chromium.android`). Have the bot
@@ -570,7 +560,7 @@ trybot for the Win7 NVIDIA GPUs in Release mode. We will call the new bot
 tests to run and on what physical hardware.
 1. It may be necessary to retrain recipe expectations for
 [`tools/build`][tools/build] workspace CLs
-(`scripts/slave/recipes.py test train`). This shouldn't be necessary
+(`recipes/recipes.py test train`). This shouldn't be necessary
 for just adding a manually triggered trybot, but it's something to
 watch out for if your CL fails presubmit for some reason.
@@ -628,15 +618,14 @@ Win10 Release (CoolNewGPUType)".
 [`luci-scheduler.cfg`][luci-scheduler.cfg],
 [`cr-buildbucket.cfg`][cr-buildbucket.cfg]. Double-check your work
 there.
-1. Update [`src/tools/mb/mb_config.pyl`][mb_config.pyl] and
-[`src/tools/mb/mb_config_buckets.pyl`][mb_config_buckets.pyl]
+1. Update [`src/tools/mb/mb_config.pyl`][mb_config.pyl]
 to include `win-myproject-rel`.
 1. *After* the Chromium-side CL lands and the bot is on the console, create a CL
 in the [`tools/build`][tools/build] workspace which does the
 following. Here's an [example CL](https://crrev.com/c/1554272).
 1. Adds "MyProject GPU Win10 Release
 (CoolNewGPUType)" to [`chromium_gpu_fyi.py`][chromium_gpu_fyi.py] in
-`scripts/slave/recipe_modules/chromium_tests/`. You can copy a similar
+`recipes/recipe_modules/chromium_tests/builders/`. You can copy a similar
 step.
 1. Adds `win-myproject-rel` to [`trybots.py`][trybots.py] in the same folder.
 This is where you associate "MyProject GPU Win10 Release