
[wptrunner][docs] Add side-by-side examples for WPT metadata

To help users understand WPT metadata, this change adds equivalent
examples in TestExpectations/*-expected.txt format.

Also:
* Update the finalized table of properties.
* General rewording.

Bug: 1442392
Change-Id: Ifcf856fc9fe2d48f7e658e19ce5db596d1a78d9b
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4649951
Reviewed-by: Weizhong Xia <weizhong@google.com>
Commit-Queue: Jonathan Lee <jonathanjlee@google.com>
Cr-Commit-Position: refs/heads/main@{#1163272}

@@ -107,11 +107,9 @@ Results for the bots use the existing layout test
## Expectations
[Similar to `run_web_tests.py`](web_test_expectations.md), `wptrunner` uses
[WPT metadata files] to specify which tests should run and what results to
expect.
Each metadata file is checked in with an `.ini` extension appended to its
corresponding test file's path:
```
@@ -120,164 +118,243 @@ external/wpt/folder/my-test-expected.txt <-- run_web_tests.py baseline
external/wpt/folder/my-test.html.ini <-- wptrunner metadata
```
A metadata file is roughly equivalent to a [`run_web_tests.py`
baseline](writing_web_tests.md#Text-Test-Baselines) and the
test's lines in [web test expectation files].
As the extension implies, metadata follow an INI-like structured text format:
[WPT metadata files]: https://web-platform-tests.org/tools/wptrunner/docs/expectation.html
[web test expectation files]: web_test_expectations.md#Kinds-of-expectations-files
|||---|||
###### `TestExpectations`
```
# Flakily slow
crbug.com/3 external/wpt/a/reftest.html [ Pass Timeout ]
```
(This `TestExpectations` line is equivalent to the metadata file on the right.)
###### `external/wpt/a/reftest.html.ini`
```
[reftest.html]
  bug: crbug.com/3
  # Flakily slow
  expected: [PASS, TIMEOUT]
```
|||---|||
* The brackets `[...]` start a (sub)test section whose contents follow in an
  indented block.
  * The section heading should contain either the subtest name or the test URL
    without the dirname (i.e., should contain basename and query parameters, if
    any).
* A section may contain `<key>: <value>` pairs. Important keys that `wptrunner`
  understands:
  * `expected`: A
    [status](https://firefox-source-docs.mozilla.org/mozbase/mozlog.html#data-format)
    (or list of possible statuses) to expect.
    * Common test statuses include `TIMEOUT`, `CRASH`, and either `OK/ERROR`
      for testharness tests to represent the overall harness status, or
      `PASS/FAIL` for non-testharness tests that only have a single result
      (e.g., reftests).
    * Common subtest statuses include `PASS`, `FAIL`, `TIMEOUT`, or
      `NOTRUN`.
    * For convenience, `wptrunner` expects `OK` or `PASS` when `expected` is
      omitted.
      Deleting the entire metadata file implies an all-`PASS` test.
  * `disabled`: Any nonempty value will skip the test or ignore the subtest
    result. By convention, `disabled` should contain the reason the (sub)test
    is disabled, with the literal `neverfix` for [`NeverFixTests`][13] (see the
    sketch after this list).
* `#` starts a comment that extends to the end of the line.
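For instance, skipping a never-fix test might look like the following sketch
(the test name and comment are illustrative, not taken from the tree):
```
[my-test.html]
  # Requires platform behavior Chromium will not implement.
  disabled: neverfix
```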
*** note
**Note**: For testharness tests, the harness statuses `OK/ERROR` are orthogonal
to `PASS/FAIL` and have different semantics:
* `OK` only means all subtests ran to completion normally; it does *not* imply
that every subtest `PASS`ed. A test may `FAIL` subtests while still reporting
the harness is `OK`.
* `ERROR` indicates some problem with the harness, such as a WebDriver remote
end disconnection.
* `PASS/FAIL` represent passing or failing assertions against the browser under
test.
***
`testharness.js` subtest expectations are represented by a section nested under
the relevant test:
|||---|||
###### `external/wpt/test-expected.txt`
```
This is a testharness.js-based test.
PASS passing subtest
FAIL failing subtest whose name needs an escape []
Harness: the test ran to completion.
```
###### `external/wpt/test.html.ini`
```
[test.html]
  [failing subtest whose name needs an escape [\]]
    expected: FAIL
```
|||---|||
### Conditional Values
`run_web_tests.py` reads platform- or flag-specific results from [platform
tags] in `TestExpectations`, [`FlagExpectations/*`][10], and [baseline
fallback](web_test_baseline_fallback.md).
WPT metadata uses a Python-like [conditional syntax] instead to store all
expectations in one file:
|||---|||
###### `TestExpectations`
```
[ Win Debug ] test.html [ Crash ] # DCHECK triggered
[ Mac11-arm64 ] test.html [ Pass Timeout ]
```
###### `external/wpt/test.html.ini`
```
[test.html]
  expected:
    if os == "win" and debug: CRASH  # DCHECK triggered
    if port == "mac11-arm64": [PASS, TIMEOUT]
    # Resolves to this default value when no conditions
    # match. An `OK/PASS` here can be omitted because
    # it's implied by an absent value.
    PASS
```
|||---|||
`wptrunner` resolves a conditional value to the right-hand side of the first
branch whose expression evaluates to a truthy value.
Conditions can contain arbitrary Python-like boolean expressions that will be
evaluated against **properties**, variables detected from the [test
environment].
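Because only the first truthy branch is taken, more specific conditions must
come before more general ones. In this hypothetical sketch, the second branch
is unreachable:
```
[test.html]
  expected:
    if os == "win": FAIL             # Matches every Windows bot first...
    if os == "win" and debug: CRASH  # ...so this branch is never taken.
```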
Properties available in Chromium are shown below:
| Property | Type | Description | Choices |
| - | - | - | - |
| `os` | `str` | OS family | `linux`, `mac`, `win`, `android`, `ios` |
| `port` | `str` | Port name (includes OS version and architecture) | See [`Port.ALL_SYSTEMS`][12] (e.g., `mac12-arm64`) |
| `product` | `str` | Browser or browser component | `chrome`, `content_shell`, `chrome_android`, `android_webview`, `chrome_ios` |
| `flag_specific` | `str` | Flag-specific suite name | See [`FlagSpecificConfig`][3] (the empty string `""` represents the generic suite) |
| `virtual_suite` | `str` | Virtual test suite name | See [`VirtualTestSuites`][2] (the empty string `""` represents the generic suite) |
| `debug` | `bool` | `is_debug` build? | N/A |
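As a rough illustration of how these properties compose (the statuses and the
combination below are made up), a single `expected` value may branch on any of
them:
```
[test.html]
  expected:
    if product == "android_webview": TIMEOUT
    if flag_specific == "highdpi": [OK, ERROR]
    if debug: CRASH
```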
[platform tags]: /third_party/blink/web_tests/TestExpectations#1
[test environment]: https://firefox-source-docs.mozilla.org/build/buildsystem/mozinfo.html
[conditional syntax]: https://web-platform-tests.org/tools/wptrunner/docs/expectation.html#conditional-values
### Parameterized Tests
In WPT, [multiglobal `.any.js` tests][11] and [test variants] are forms of
parameterization where a test file may generate more than one test ID.
The metadata for these parameterizations live in the same `.ini` file, but under
different top-level sections.
For example, a test `external/wpt/a/b.any.js` that generates `.any.html` and
`.any.worker.html` scopes with variants `?c` and `?d` can express its expectations as:
|||---|||
###### `TestExpectations`
```
a/b.any.html?c [ Crash ]
a/b.any.html?d [ Crash ]
a/b.any.worker.html?c [ Timeout ]
a/b.any.worker.html?d [ Timeout ]
```
###### `external/wpt/a/b.any.js.ini`
```
[b.any.html?c]
  expected: CRASH
[b.any.html?d]
  expected: CRASH
[b.any.worker.html?c]
  expected: TIMEOUT
[b.any.worker.html?d]
  expected: TIMEOUT
```
|||---|||
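For reference, the four test IDs above come from `META:` comments in the test
file itself; a minimal sketch of `external/wpt/a/b.any.js` (the subtest body is
made up):
```
// META: variant=?c
// META: variant=?d
// By default, a .any.js test runs in both a window scope (b.any.html) and a
// dedicated worker scope (b.any.worker.html).
test(() => assert_true(true), 'hypothetical subtest');
```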
[test variants]: https://web-platform-tests.org/writing-tests/testharness.html#variants
### Directory-Wide Expectations
To set expectations or disable tests under a directory without editing an `.ini`
file for every contained test, create a special `__dir__.ini` file under the
desired directory with top-level keys, which work identically to those for
per-test metadata:
|||---|||
###### `FlagExpectations/highdpi`
```
# Redundant coverage
external/wpt/a/* [ Skip ]
```
###### `external/wpt/a/__dir__.ini`
```
disabled:
  if flag_specific == "highdpi": redundant coverage
```
|||---|||
Metadata closer to affected test files take higher precedence.
For example, expectations set by `a/b/test.html.ini` override those of
`a/b/__dir__.ini`, which overrides `a/__dir__.ini`.
The special value `disabled: @False` can selectively reenable child tests or
directories that would have been disabled by a parent `__dir__.ini`.
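For instance, a hypothetical `external/wpt/a/b.html.ini` could opt one test
back into the `highdpi` suite that the `__dir__.ini` above disables:
```
[b.html]
  disabled:
    if flag_specific == "highdpi": @False  # Reenable despite a/__dir__.ini.
```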
### Update Tool
To update expectations in bulk for all tested configurations,
[`blink_tool.py`][5] has an [`update-metadata`][6] subcommand that can trigger
[try builds](#Builders) and update expectations from the results (similar to
[`rebaseline-cl`][7]).
The workflow is very similar to [rebaselining]:
```sh
# Create a CL, if one has not been created already.
git cl upload
# Trigger try builds against the current patchset.
./blink_tool.py update-metadata
# After the try builds complete, collect the results and update expectations
# for `external/wpt/css/CSS2/` (sub)tests that only failed unexpectedly. Any
# test section updated will be annotated with `bug: crbug.com/123`.
./blink_tool.py update-metadata --bug=123 css/CSS2/
# Commit and upload the staged `.ini` files.
git commit -m "Update WPT expectations" && git cl upload
```
The [WPT autoroller](web_platform_tests.md#Automatic-import-process) uses
`update-metadata` to automatically suppress imported tests with new failures.
`update-metadata` can also suppress failures occurring on trunk:
```sh
# Suppress tests that caused any `linux-wpt-fyi-rel` CI builds 3000-3005
# (inclusive) to fail.
./blink_tool.py update-metadata --build=ci/linux-wpt-fyi-rel:3000-3005
```
[rebaselining]: web_test_expectations.md#Rebaselining-using-try-jobs
## Debugging Support
@@ -287,16 +364,26 @@ You can interact with the paused test page afterwards, including with DevTools:
![wptrunner paused](images/web-tests/wptrunner-paused.png)
Closing the tab or window will unpause `wptrunner` and run the next test.
In the future, we intend to support hooking up [text-based debuggers] like `rr`
to test runs.
[text-based debuggers]: https://crbug.com/1440021
## Known Issues
Please [file bugs and feature requests](https://crbug.com/new) against
[`Blink>Infra` with the `wptrunner`
label](https://bugs.chromium.org/p/chromium/issues/list?q=component%3ABlink%3EInfra%20label%3Awptrunner&can=2).
[1]: https://source.chromium.org/search?q=run_wpt_tests.py%20lang:gn
[2]: /third_party/blink/web_tests/VirtualTestSuites
[3]: /third_party/blink/web_tests/FlagSpecificConfig
[5]: /third_party/blink/tools/blink_tool.py
[6]: /third_party/blink/tools/blinkpy/tool/commands/update_metadata.py
[7]: /third_party/blink/tools/blinkpy/tool/commands/rebaseline_cl.py
[10]: /third_party/blink/web_tests/FlagExpectations
[11]: https://web-platform-tests.org/writing-tests/testharness.html#tests-for-other-or-multiple-globals-any-js
[12]: https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/tools/blinkpy/web_tests/port/base.py;l=152-163;drc=b35e75299a6fda0eb51e9ba3139cce216f7f8db0;bpv=0;bpt=0
[13]: /third_party/blink/web_tests/NeverFixTests