
Now that all users of CacheStorageContext(Impl) are using mojo interfaces instead of calling direct functions, there is no longer any need to support cross-sequence manager access. All CacheStorageManager access is done from the cache storage task runner. Bug: 1147642 Change-Id: I2e759fdbda07e3a3000a8b64660030e52277b106 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2653623 Commit-Queue: enne <enne@chromium.org> Reviewed-by: Victor Costan <pwnall@chromium.org> Reviewed-by: Ben Kelly <wanderview@chromium.org> Cr-Commit-Position: refs/heads/master@{#852266}
Architecture (as of July 29th 2016)
This document describes the browser-process implementation of the Cache Storage specification.
As of June 2018, Chrome components can use the Cache Storage interface via
CacheStorageManager
to store Request/Response key-value pairs. The concept of
CacheStorageOwner
was added to distinguish and isolate the different
components.
Major Classes and Ownership
Ownership
Where '=>' represents ownership, '->' is a reference, and '~>' is a weak reference.
CacheStorageContextImpl
->CacheStorageManager
=>CacheStorage
=>CacheStorageCache
- A
CacheStorageManager
can own multipleCacheStorage
objects. - A
CacheStorage
can own multipleCacheStorageCache
objects.
StoragePartitionImpl
->CacheStorageContextImpl
StoragePartitionImpl
effectively owns theCacheStorageContextImpl
in the sense that it callsCacheStorageContextImpl::Shutdown()
on deletion which resets itsCacheStorageManager
.
RenderProcessHost
->CacheStorageDispatcherHost
->CacheStorageContextImpl
CacheStorageDispatcherHost
=>CacheStorageCacheHandle
~>CacheStorageCache
- The
CacheStorageDispatcherHost
holds onto handles for:- JavaScript references to cache objects
CacheStorageDispatcherHost
=>CacheStorageHandle
~>CacheStorage
- The
CacheStorageDispatcherHost
holds onto handles for:- JavaScript references to caches
CacheStorageCacheDataHandle
=>CacheStorageCacheHandle
~>CacheStorageCache
CacheStorageCacheDataHandle
is the blob data handle for a response body and it holds aCacheStorageCacheHandle
. It streams from thedisk_cache::Entry
response stream. It's necessary that thedisk_cache::Backend
(owned byCacheStorageCache
) stays open so long as one of itsdisk_cache::Entry
s is reachable. Otherwise, a new backend might open and clobber the entry.
CacheStorageCache
=>CacheStorageCacheHandle
~>CacheStorageCache
- The
CacheStorageCache
will hold a self-reference while executing an operation. This self-reference is dropped between subsequent operations, so shutdown is possible when there are no external references even if there are more operations in the scheduler queue.
CacheStorageDispatcherHost
- Receives IPC messages from a render process and creates the appropriate
CacheStorageManager
orCacheStorageCache
operation. - For each operation, holds a
CacheStorageCacheHandle
to keep the cache alive since the operation is asynchronous. - For each cache reference held by the render process, holds a
CacheStorageCacheHandle
. - For each CacheStorage reference held by the renderer process, holds a
CacheStorageHandle
. This is used to inform the CacheStorage about whether its externally used so it can keep warmed cache objects alive to mitigate rapid opening/closing/opening churn.
CacheStorageManager
- Forwards calls to the appropriate
CacheStorage
for a given origin-owner pair, loadingCacheStorage
s on demand. - Handles
QuotaManager
andBrowsingData
calls.
CacheStorage
- Manages the caches for a single origin-owner pair.
- Handles creation/deletion of caches and updates the index on disk accordingly.
- Manages operations that span multiple caches (e.g.,
CacheStorage::Match
). - Backend-specific information is handled by
CacheStorage::CacheLoader
CacheStorageCache
- Creates or opens a net::disk_cache (either
SimpleCache
orMemoryCache
) on initialization. - Handles add/put/delete/match/keys calls.
- Owned by
CacheStorage
and deleted either whenCacheStorage
deletes or when the lastCacheStorageCacheHandle
for the cache is gone.
CacheStorageIndex
- Manages an ordered collection of metadata (CacheStorageIndex::CacheStorageMetadata) for each CacheStorageCache owned by a given CacheStorage instance.
- Is serialized by CacheStorage::CacheLoader (WriteIndex/LoadIndex) as a Protobuf file.
CacheStorageCacheHandle
- Holds a weak reference to a
CacheStorageCache
. - When the last
CacheStorageCacheHandle
to aCacheStorageCache
is deleted, so to is theCacheStorageCache
. - The
CacheStorageCache
may be deleted before theCacheStorageCacheHandle
(onCacheStorage
destruction), so it must be checked for validity before use.
CacheStorageHandle
- Holds a weak reference to a
CacheStorage
. - When the last
CacheStorageHandle
to aCacheStorage
is deleted, internal state is cleaned up. TheCacheStorage
object is not deleted, however. - The
CacheStorage
may be deleted before theCacheStorageHandle
(on browser shutdown), so it must be checked for validity before use.
Directory Structure
$PROFILE/Service Worker/CacheStorage/origin
/cache
/
Where origin
is a hash of the origin and cache
is a GUID generated at the
cache's creation time.
The reason a random directory is used for a cache is so that a cache can be doomed and still used by old references while another cache with the same name is created.
Directory Contents
CacheStorage
creates its own index file (index.txt), which contains a
mapping of cache names to its path on disk. On CacheStorage
initialization,
directories not in the index are deleted.
Each CacheStorageCache
has a disk_cache::Backend
backend, which writes in
the CacheStorageCache
's directory.
Layout of the disk_cache::Backend
A cache is represented by a disk_cache::Backend
. The Request/Response pairs
referred to in the specification are stored as disk_cache::Entry
s. Each
disk_cache::Entry
has three streams: one for storing a protobuf with the
request/response metadata (e.g., the headers, the request URL, and opacity
information), another for storing the response body, and a final stream for
storing any additional data (e.g., compiled JavaScript).
The entries are keyed by full URL. This has a few ramifications:
- Multiple vary responses for a single request URL are not supported.
- Operations that may require scanning multiple URLs (e.g.,
ignoreSearch
) must scan every entry in the cache.
The above could be fixed by changes to the backend or by introducing indirect entries in the cache. The indirect entries would be for the query-stripped request URL. It would point to entries to each query request/response pair and for each vary request/response pair.
Threads
- CacheStorage classes live on the IO thread. Exceptions include:
CacheStorageContextImpl
which is created on UI but otherwise runs and is deleted on IO.CacheStorageDispatcherHost
which is created on UI but otherwise runs and is deleted on IO.
- Index file manipulation and directory creation/deletion occurs on a
SequencedTaskRunner
assigned atCacheStorageContextImpl
creation. - The
disk_cache::Backend
lives on the IO thread and uses its own worker pool to implement async operations.
Asynchronous Idioms in CacheStorage and CacheStorageCache
- All async methods should asynchronously run their callbacks.
- The async methods often include several asynchronous steps. Each step passes a continuation callback on to the next. The continuation includes all of the necessary state for the operation.
- Callbacks are guaranteed to run so long as the object
(
CacheStorageCacheCache
orCacheStorage
) is still alive. Once the object is deleted, the callbacks are dropped. We don't worry about dropped callbacks on shutdown. If deleting prior to shutdown, one shouldClose()
aCacheStorage
orCacheStorageCache
to ensure that all operations have completed before deleting it.
Scheduling Operations
Operations are scheduled in a sequential scheduler (CacheStorageScheduler
).
Each CacheStorage
and CacheStorageCache
has its own scheduler. If an
operation freezes, then the scheduler is frozen. If a CacheStorage
call winds
up calling something from every CacheStorageCache
(e.g.,
CacheStorage::Match
), then one frozen CacheStorageCache
can freeze the
CacheStorage
as well. This has happened in the past (Cache::Put
called
QuotaManager
to determine how much room was available, which in turn called
Cache::Size
). Be careful to avoid situations in which one operation triggers
a dependency on another operation from the same scheduler.
At the end of an operation, the scheduler needs to be kicked to start the next operation. The idiom for this in CacheStorage/ is to wrap the operation's callback with a function that will run the callback as well as advance the scheduler. So long as the operation runs its wrapped callback the scheduler will advance.
Opaque Resource Size Obfuscation
Applications can cache cross-origin resources as per Cross-Origin Resources and CORS. Opaque responses are also cached, but in order to prevent "leaking" the size of opaque responses their sizes are obfuscated. Random padding is added to the actual size making it difficult for an attacker to ascertain the actual resource size via quota APIs.
When Chromium starts, a new random padding key is generated and used for all new caches created. This key is used by each cache to calculate padding for opaque resources. Each cache's key is persisted to disk in the cache index file
Each cache maintains the total padding for all opaque resources within the cache. This padding is added to the actual resource size when reporting sizes to the quota manager.
The padding algorithm version is also written to each cache allowing for it to be changed at a future date. CacheStorage will use the persisted key and padding from the cache's index unless the padding algorithm has been changed, one of values is missing, or deemed to be incorrect. In this situation the cache is enumerated and the padding recalculated during open.