
VaAPI
This page documents tracing and debugging the Video Acceleration API (VaAPI or VA-API) on ChromeOS. The VA-API is an open-source library and API specification, providing access to graphics hardware acceleration capabilities for video and image processing. The VaAPI is used on ChromeOS on both Intel and AMD platforms.
[TOC]
Overview
VaAPI code is developed upstream on the VaAPI GitHub repository, from which ChromeOS is a downstream client via the libva package, with packaged backends for e.g. both Intel and AMD.
Tracing VaAPI video decoding
A simplified diagram of the buffer circulation is provided below. The "client"
is always a Renderer process via a Mojo/IPC communication. Essentially the VaAPI
Video Decode Accelerator (VaVDA) receives encoded BitstreamBuffers from the
"client", and sends them to the "va internals", which eventually produces
decoded video in PictureBuffers. The VaVDA may or may not use the Vpp unit for
pixel format adaptation, depending on the codec used, silicon generation and
other specifics.
        K BitstreamBuffers  +-----+      +--------------------+
    C --------------------->| Va  | ----->                    |
    L <---------------------| VDA | <----     va internals    |
    I    (encoded stuff)    |     |      |                    |
    E                       |     |      | +-----+       +----+
    N <---------------------|     | <----| |     |<------| lib|
    T --------------------->|     | ---->| | Vpp |------>| va |
                N           +-----+      +-+-----+   M   +----+
          PictureBuffers                        VASurfaces
         (decoded stuff)
*** aside
PictureBuffers are created by the "client" but allocated and filled in by the
VaVDA. K is unrelated to both M and N.
***
Tracing memory consumption
Tracing memory consumption is done via the MemoryInfra system. Please take a
minute and read that document (in particular the difference between
effective_size and size). The VaAPI lives inside the GPU process (a.k.a.
Viz process), so please familiarize yourself with the GPU Memory Tracing
document. The VaVDA provides information by implementing the Memory Dump
Provider interface, but the information provided varies with the executing mode
as explained next.
Internal VASurfaces accountancy
The usage of the Vpp unit is controlled by the member variable
|decode_using_client_picture_buffers_| and is very advantageous in terms of
CPU, power and memory consumption (see crbug.com/822346).
- When |decode_using_client_picture_buffers_| is false, libva uses a set of internally allocated VASurfaces that are accounted for in the gpu/vaapi/decoder tracing category (see screenshot below). Each of these VASurfaces is backed by a Buffer Object large enough to hold, at least, the decoded image in YUV semiplanar format. In the diagram above, M varies: 4 for VP8, 9 for VP9, 4-12 for H264/AVC1 (see GetNumReferenceFrames()).
- When |decode_using_client_picture_buffers_| is true, libva can decode directly on the client's PictureBuffers, M = 0, and the gpu/vaapi/decoder category is not present in the GPU MemoryInfra.
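As a rough illustration of the "at least" bound described above, the sketch below estimates the memory held by the M internal VASurfaces. It assumes an 8-bit 4:2:0 semiplanar layout (NV12, 12 bits per pixel) with no alignment or tiling padding; that layout is an assumption of this example, not something the text or the driver guarantees, and real Buffer Objects are usually larger. The function and table names are hypothetical; the M values are the ones listed above.

```python
# Lower-bound estimate only. Assumption: NV12 (8-bit 4:2:0 semiplanar), no padding.
# Maps codec -> (min M, max M) as listed in the text.
SURFACE_COUNTS = {"VP8": (4, 4), "VP9": (9, 9), "H264": (4, 12)}


def nv12_bytes(width: int, height: int) -> int:
    """Bytes for one frame: a full-size Y plane plus an interleaved UV plane
    subsampled by two in both directions."""
    luma = width * height
    chroma = 2 * ((width + 1) // 2) * ((height + 1) // 2)
    return luma + chroma


def vasurface_lower_bound(codec: str, width: int, height: int) -> tuple:
    """Return (min_bytes, max_bytes) for the internal VASurfaces of `codec`."""
    low, high = SURFACE_COUNTS[codec]
    frame = nv12_bytes(width, height)
    return low * frame, high * frame


if __name__ == "__main__":
    for codec in SURFACE_COUNTS:
        low, high = vasurface_lower_bound(codec, 1920, 1080)
        print(f"{codec}: {low / 2**20:.1f} to {high / 2**20:.1f} MiB")
```

Under these assumptions a 1080p VP9 stream needs at least about 26.7 MiB of internal surfaces, all of which disappears when the client's PictureBuffers are decoded into directly.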
PictureBuffers accountancy
VaVDA allocates storage for the N PictureBuffers provided by the client by means of VaapiPicture{NativePixmapOzone}s, backed by NativePixmaps, themselves backed by DmaBufs (the client only knows about the client Texture IDs). The GPU's TextureManager accounts for these textures, but:
- They are not correctly identified as being backed by NativePixmaps (see crbug.com/514914).
- They are not correctly linked back to the Renderer or ARC++ client on behalf of whom the allocation took place, like e.g. the probe-gpu example (see crbug.com/721674).
See e.g. the following ToT example for 10 1920x1080p textures (32bpp); finding
the desired context_group can be tricky.
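For a sense of the size to look for in the trace, the example quoted above (ten 1920x1080 textures at 32 bits per pixel) can be worked out by hand. The sketch assumes tightly packed rows with no stride or tiling padding, so the value MemoryInfra reports may be somewhat larger; the function name is hypothetical.

```python
def texture_bytes(width: int, height: int, bits_per_pixel: int = 32) -> int:
    """Bytes for one texture, assuming tightly packed rows (no padding)."""
    return width * height * bits_per_pixel // 8


def total_texture_bytes(count: int, width: int, height: int) -> int:
    """Bytes for `count` identical 32bpp textures."""
    return count * texture_bytes(width, height)


if __name__ == "__main__":
    total = total_texture_bytes(10, 1920, 1080)
    print(f"{total} bytes, about {total / 2**20:.1f} MiB")
```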
Tracing power consumption
Power consumption is available on ChromeOS test/dev images via the command line
binary dump_intel_rapl_consumption; this tool averages the power
consumption of the four SoC domains over a configurable period of time, usually
a few seconds. These domains are, in the order presented by the tool:
- pkg: estimated power consumption of the whole SoC; in particular, this is a superset of pp0 and pp1, including all accessory silicon, e.g. video processing.
- pp0: CPU set.
- pp1/gfx: Integrated GPU or GPUs.
- dram: estimated power consumption of the DRAM, from the bus activity.
Googlers can read more about this topic under go/power-consumption-meas-in-intel.
dump_intel_rapl_consumption is usually run while a given workload is active
(e.g. a video playback) with an interval larger than a second to smooth out all
kinds of system services that would show up in smaller periods, e.g. WiFi.
dump_intel_rapl_consumption --interval_ms=2000 --repeat --verbose
E.g. on a nocturne main1, the average power consumptions in watts while playing back the first minute of a 1080p VP9 video are:
| pkg | pp0 | pp1/gfx | dram |
|---|---|---|---|
| 2.63 | 1.44 | 0.29 | 0.87 |
As can be seen, pkg ~= pp0 + pp1 + 1W; this extra watt (0.90 W with the figures
above) is the cost of all the associated silicon, e.g. bridges, bus controllers,
caches, and the media processing engine.
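The relationship can be checked directly against the figures in the table. The snippet below is only a worked subtraction; the dictionary and function names are chosen for this example and are not part of the tool.

```python
# Average readings in watts, copied from the table in the text.
READINGS_W = {"pkg": 2.63, "pp0": 1.44, "pp1": 0.29, "dram": 0.87}


def remainder_watts(readings: dict) -> float:
    """Package power not attributed to the CPU (pp0) or GPU (pp1) domains."""
    return readings["pkg"] - readings["pp0"] - readings["pp1"]


if __name__ == "__main__":
    print(f"pkg - pp0 - pp1 = {remainder_watts(READINGS_W):.2f} W")
```

Note that dram is not part of this subtraction: the text describes pkg as a superset of pp0 and pp1 only.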
Tracing CPU cycles and instantaneous buffer usage
TODO(mcasas): fill in this section.
Verifying VaAPI installation and usage
Verify the VaAPI is correctly installed and can be loaded
vainfo is a small command line utility used to enumerate the supported
operation modes; it's developed in the libva-utils repository, but more
concretely available on ChromeOS dev images (media-video/libva-utils package)
and under Debian systems (vainfo). vainfo will try to load the appropriate
backend driver for the system and/or GPUs and fail if it cannot find/load it.
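When this check needs to be scripted, a minimal sketch along the following lines may help. It assumes that vainfo signals a driver load failure through a non-zero exit status, which is an assumption of this example and should be confirmed on the target image; the helper name run_check is hypothetical.

```python
import shutil
import subprocess
import sys


def run_check(command, timeout_s=10.0):
    """Run `command` and return (succeeded, combined stdout and stderr text)."""
    if shutil.which(command[0]) is None:
        return False, f"{command[0]}: not found on PATH"
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired) as err:
        return False, str(err)
    # Assumption: a zero exit status means the backend driver was loaded.
    return result.returncode == 0, result.stdout


if __name__ == "__main__":
    ok, output = run_check(["vainfo"])
    print("vainfo succeeded" if ok else "vainfo failed")
    print(output, file=sys.stdout if ok else sys.stderr)
```

The captured output is worth keeping either way: on success it lists the supported operation modes, and on failure it usually names the driver that could not be loaded.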
Verify the VaAPI supports and/or uses a given codec
A few steps are customary to verify the support and use of a given codec.
To verify that the build and platform support video acceleration, launch
Chromium and navigate to chrome://gpu, then:
- Search for the "Video Acceleration Information" Section: this should enumerate the available accelerated codecs and resolutions.
- If this section is empty, oftentimes the "Log Messages" Section immediately below might indicate an associated error, e.g.:
  vaInitialize failed: unknown libva error
  that can usually be reproduced with vainfo, see the previous section.
To verify that a given video is being played back using the accelerated video decoding backend:
- Navigate to a url that causes a video to be played. Leave it playing.
- Navigate to the chrome://media-internals tab.
- Find the entry associated to the video-playing tab.
- Scroll down to "Player Properties" and check the "video_decoder" entry: it should say "GpuVideoDecoder".
VaAPI on Linux
This configuration is unsupported (see docs/linux_hw_video_decode.md); the following instructions are provided only as a reference for developers to test the code paths on a Linux machine.
- Follow the instructions under the Linux build setup document, adding the GN argument use_vaapi=true in the args.gn file (please refer to the "Setting up the build" Section).
- To support proprietary codecs such as H264/AVC1, add the options proprietary_codecs = true and ffmpeg_branding = "Chrome" to the GN args.
- Build Chromium as usual.
At this point you should make sure the appropriate VA driver backend is working
correctly; try running vainfo from the command line and verify no errors show
up.
To run Chromium using VaAPI two arguments are necessary:
- --ignore-gpu-blacklist
- --use-gl=desktop or --use-gl=egl
./out/gn/chrome --ignore-gpu-blacklist --use-gl=egl
Note that you can set the environment variable MESA_GLSL_CACHE_DISABLE=false
if you want the gpu process to run in sandboxed mode, see
crbug.com/264818. To check whether the running gpu process is sandboxed,
open chrome://gpu and search for Sandboxed in the driver information table.
In addition, passing --gpu-sandbox-failures-fatal=yes will prevent the gpu
process from running in non-sandboxed mode.
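The flags and the environment variable mentioned in this section can be combined in a small launcher. The sketch below only assembles the command line and environment from the values given in the text; the helper name build_launch and its parameters are hypothetical, and the browser path is the example path used above.

```python
import os


def build_launch(chrome_path="./out/gn/chrome", use_gl="egl", require_sandbox=True):
    """Return (argv, env) for launching Chromium with the VaAPI-related flags."""
    if use_gl not in ("desktop", "egl"):
        raise ValueError("use_gl must be 'desktop' or 'egl'")
    argv = [chrome_path, "--ignore-gpu-blacklist", f"--use-gl={use_gl}"]
    if require_sandbox:
        # Refuse to fall back to a non-sandboxed gpu process.
        argv.append("--gpu-sandbox-failures-fatal=yes")
    env = dict(os.environ)
    env["MESA_GLSL_CACHE_DISABLE"] = "false"
    return argv, env


if __name__ == "__main__":
    argv, env = build_launch()
    print("MESA_GLSL_CACHE_DISABLE=" + env["MESA_GLSL_CACHE_DISABLE"], " ".join(argv))
    # To actually start the browser: subprocess.run(argv, env=env, check=False)
```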
Refer to the previous section to verify support and use of the VaAPI.