Web Packaging

Google
jyasskin@chromium.org

gen
Dispatch
Internet-Draft

Web Packages provide a way to bundle up groups of web resources to
transmit them together. These bundles can then be signed to establish
their authenticity.

People would like to use content offline and in other situations where
there isn’t a direct connection to the server where the content
originates. However, it’s difficult to distribute and verify the
authenticity of applications and content without a connection to the
network. The W3C has addressed running applications offline with
Service Workers, but not
the problem of distribution.

We’ve started work on this problem
in the WICG/webpackage repository, but we suspect that the
IETF may be the right place to standardize the overall format. More
details can be found in that repository.

People with expensive or intermittent internet connections are used
to sharing files via P2P links and shared SD cards. They should be
able to install web applications they received this way. Installing a
web application requires a TLS-type guarantee that it came from and
can use data owned by a particular origin.

Verification of the origin of the content isn’t always necessary.
For example, users currently share screenshots and MHTML documents
with their peers, with no guarantee that the shared content is
authentic. However, these formats have low fidelity (screenshots)
and/or aren’t interoperable (MHTML). We’d like an interoperable format
that lets both publishers and readers package such content for use in
an untrusted mode.

CDNs want to re-publish other origins’ content so readers can access
it more quickly or more privately. Currently, to attribute that
content to the original origin, they need the full ability to publish
arbitrary content under that origin’s name. There should be a way to
let them attribute only the exact content that the original origin
published.

WICG/webpackage#45

Publishers and readers should be able to generate a package once, and have it
usable by all browsers.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL
NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”,
“MAY”, and “OPTIONAL” in this document are to be interpreted as
described in BCP 14 (RFC 2119, RFC 8174) when, and only when, they
appear in all capitals, as shown here.

This specification defines how conformant web package parsers convert a sequence
of bytes into the semantics of a web package. It does not constrain how web
package encoders produce such a package: although there are some guidelines
for encoders later in this document, encoders MAY produce any sequence of bytes that a
conformant parser would parse into the intended semantics.

In places, this specification says the parser “MAY return” some data. This
indicates that the described data is complete enough that later parsing failures
do not need to discard it.

In places, this specification says the parser “MUST fail”. The parser MAY report
these failures to its caller in any way, but MUST NOT return any data it has
parsed so far that wasn’t mentioned in a “MAY return” statement.

This specification creates local variables with the phrase “Let variable-name
be …”. Use of a variable before it’s created is a defect in this
specification.

The package is roughly a CBOR item with the following CDDL schema, but package
parsers are required to successfully parse some byte strings that aren’t valid
CBOR. For example, sections may have padding between them, or even overlap, as
long as the embedded relative offsets cause the parsing algorithm in this
specification to return data.

The parser MAY begin parsing at either the beginning
or end of the byte string representing the package. Parsing from
the end is useful when the package is embedded in another format such as a
self-extracting executable, while parsing from the beginning is useful when
loading from a stream.

To parse from the end, the parser MUST load the last 18 bytes as the following
group in array context:

Note: CDDL doesn’t actually define how to use it as a schema
to load CBOR data.

If the bytes don’t match this group or these two CBOR items don’t occupy exactly
18 bytes, parsing MUST fail.

Otherwise, continue as if the byte located length bytes before the end of the string
(with length taken from the group just loaded) were the beginning of the package,
and the parser were parsing from the beginning.

If the first 10 bytes of the package are not “85 48 F0 9F 8C 90 F0 9F 93 A6”
(the CBOR encoding of the 5-item array header and 8-byte bytestring header,
followed by 🌐📦 in UTF-8), parsing MUST fail.

Parse one CBOR item starting at the 11th byte of the package. If this does not
match the CDDL, or it is not a Canonical CBOR item (Section 3.9 of RFC 7049), parsing MUST fail.

Let sections-start be the offset of the byte after the section-offsets item.
For example, if section-offsets were 52 bytes long, sections-start would be
63.

This specification defines two section names: “indexed-content” and “manifest”.

If section-offsets[“indexed-content”] is not present, parsing MUST fail.

The parser MUST ignore unknown keys in the section-offsets map because new
sections may be defined in future specifications.
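
The start-of-package checks above can be sketched in Python as follows. This is only an illustration under stated assumptions: it presumes the section-offsets item has already been decoded into a dictionary by some CBOR library (not shown), and the names ParseFailure, check_magic, and select_sections are hypothetical, not part of this specification.

```python
from typing import Dict

# First 10 bytes required of every package: a 5-item array header (0x85), an
# 8-byte byte-string header (0x48), then the UTF-8 encoding of the two emoji.
MAGIC_PREFIX = bytes.fromhex("8548F09F8C90F09F93A6")

# The two section names defined by this specification.
KNOWN_SECTIONS = ("indexed-content", "manifest")


class ParseFailure(Exception):
    """Hypothetical error type for the places where parsing MUST fail."""


def check_magic(package: bytes) -> None:
    """Fail unless the package starts with the required 10 bytes."""
    if package[:10] != MAGIC_PREFIX:
        raise ParseFailure("package does not start with the expected 10 bytes")


def select_sections(section_offsets: Dict[str, int]) -> Dict[str, int]:
    """Require indexed-content, keep known sections, ignore unknown keys."""
    if "indexed-content" not in section_offsets:
        raise ParseFailure("section-offsets has no indexed-content entry")
    return {
        name: section_offsets[name]
        for name in KNOWN_SECTIONS
        if name in section_offsets
    }
```

Unknown keys are dropped silently rather than rejected, which mirrors the requirement that parsers tolerate sections defined by future specifications.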
Open question: do we need to mark critical section names?

Let index be the result of parsing the bytes starting at offset
sections-start + section-offsets[“indexed-content”] using the instructions
for parsing the index given below.

If section-offsets[“manifest”] is present, let manifest be the
result of parsing the bytes starting at offset sections-start +
section-offsets[“manifest”] using the instructions for parsing a manifest given below.

The parser MAY return a semantic package consisting of index, and,
if initialized, manifest.

To parse each resource described within index, the parser MUST follow the
instructions for parsing a resource given below.

The main content of a package is an index of HTTP requests pointing to HTTP
responses. These request/response pairs hold the manifests of sub-packages and
the resources in the package and all of its sub-packages. Both the requests and
responses can appear in any order, usually chosen to optimize loading while the
package is streamed.

To parse the index, starting at offset index-start, the parser MUST do the
following:

If the byte at index-start is not 0x82 (the header for a 2-element
array), the parser MUST fail.

Load a CBOR item starting at index-start + 1 as the index array in the following CDDL:

If the item doesn’t match this CDDL, or it is not a Canonical CBOR item (Section
3.9 of RFC 7049), the parser MUST fail.

Let resources-start be the offset immediately after the index item. For
example, if index-start were 75 and the index item were 105 bytes long,
resources-start would be 75+1+105=181. (1 for the 0x82 array header.)

Decode all of the resource-keys using HPACK (RFC 7541), with an
initially-empty dynamic table for each one.
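
The bookkeeping around this point of the index parser can be sketched in Python as follows: the resources-start arithmetic from the example above, together with the header-name checks, the duplicate check, and the offset adjustment that this section goes on to require. It is a sketch under assumptions, not a definitive implementation: HPACK decoding is presumed to have already produced header lists as lists of (name, value) pairs, validation of pseudo-header values is left out, and all function and type names are hypothetical.

```python
from typing import List, Optional, Tuple

Header = Tuple[str, str]
HeaderList = List[Header]
# (decoded resource-key, offset, optional length), as read from the index.
IndexEntry = Tuple[HeaderList, int, Optional[int]]

REQUIRED_PREFIX = [":scheme", ":authority", ":path"]


class ParseFailure(Exception):
    """Hypothetical error type for the places where the parser MUST fail."""


def resources_start(index_start: int, index_item_length: int) -> int:
    # One byte for the 0x82 array header, then the index item itself.
    return index_start + 1 + index_item_length


def check_resource_key(key: HeaderList) -> None:
    names = [name for name, _ in key]
    if names[:3] != REQUIRED_PREFIX:
        raise ParseFailure("key must start with :scheme, :authority, :path")
    for name in names[3:]:
        # Later header names must be lower-case ASCII without ":".
        if ":" in name or not name.isascii() or name != name.lower():
            raise ParseFailure(f"invalid header name: {name!r}")


def finish_index(entries: List[IndexEntry], start: int) -> List[IndexEntry]:
    seen = set()
    adjusted = []
    for key, offset, length in entries:
        check_resource_key(key)
        identity = tuple(key)  # order-sensitive: reordered lists differ
        if identity in seen:
            raise ParseFailure("duplicate resource-key")
        seen.add(identity)
        adjusted.append((key, offset + start, length))
    return adjusted
```

Comparing keys as ordered tuples keeps two header lists with the same fields in a different order distinct, as the duplicate rule requires.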
Note: This spec has different security constraints from the
ones that drove HPACK, so we may be able to do better with another
compression format.

The decoded resource-keys are header lists
(RFC 7541, Section 1.3), ordered lists of name-value pairs.

The parser MUST fail if any of the following is true:

* HPACK decoding encountered an error.
* Any resource-key’s first three headers are not named “:scheme”,
  “:authority”, and “:path”, in that order. Note that “:method” is
  intentionally omitted because only the GET method is meaningful.
* Any of the pseudo-headers’ values violates a requirement in Section
  8.1.2.3 of RFC 7540.
* Any resource-key has a non-pseudo-header name that includes the
  “:” character or is not lower-case ASCII (RFC 7540, Section
  8.1.2).
* Any two decoded resource-keys are the same. Note that header
  lists with the same header fields in a different order are not the
  same.

Increment all offsets by resources-start.

Return the resulting index, an array of decoded-resource-key, adjusted-offset,
and optional-length triples.

The optional length field in the index entries is redundant with the length
prefixes on the response-headers and body in the content, but it can be used
to issue Range requests (RFC 7233) for responses that appear late in the
content.

A package’s manifest contains some metadata for the package; hashes, used when
parsing a resource, for all resources included in that package; and
validity information for any sub-packages the package depends
on. The manifest is signed, so that UAs can trust that it comes from its claimed
origin.
Note: This section doesn’t describe a manifest
(https://www.merriam-webster.com/dictionary/manifest#h3), so consider
renaming it to something like “authenticity”.

To parse a manifest starting at manifest-start, a parser MUST do the following:

Load one CBOR item starting at manifest-start as a signed-manifest from the
following CDDL:

If the item doesn’t match the CDDL or it’s not a Canonical CBOR item (Section
3.9 of RFC 7049), parsing MUST fail.

Parse the elements of certificates as X.509 certificates within the
RFC 5280 profile. If any certificate fails to parse, parsing MUST fail.

Let message be the concatenation of the following byte strings. This matches
the TLS 1.3 format to avoid cross-protocol attacks when
TLS certificates are used to sign manifests.

* A string that consists of octet 32 (0x20) repeated 64 times.
* A context string: the ASCII encoding of “Web Package Manifest”.
* A single 0 byte which serves as a separator.
* The bytes of the manifest CBOR item.

Let signing-certificates be an empty array.

For each element signature of signatures:

Let certificate be certificates[signature[“keyIndex”]].

The parser MUST define a partial function from public key types to signing
algorithms, with the following map as a subset:

* rsa_pss_sha256 as defined in Section 4.2.3 of the TLS 1.3 specification
* ecdsa_secp256r1_sha256 as defined in Section 4.2.3 of the TLS 1.3 specification
* ecdsa_secp384r1_sha384 as defined in Section 4.2.3 of the TLS 1.3 specification
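
The message construction and the minimum algorithm map described above can be sketched in Python as follows. This is an illustration only: the key-type labels used as dictionary keys ("rsa", "ec-secp256r1", "ec-secp384r1") are assumptions made for the example, since the text above names only the signing algorithms, and the function names are hypothetical. Actual signature verification is left to a cryptographic library and is not shown.

```python
from typing import Optional

# ASCII context string that precedes the separator byte.
CONTEXT_STRING = b"Web Package Manifest"


def signed_message(manifest_cbor: bytes) -> bytes:
    """Build the byte string that each signature is verified against."""
    return (
        b"\x20" * 64        # octet 32 (0x20) repeated 64 times
        + CONTEXT_STRING    # context string
        + b"\x00"           # single 0 byte acting as a separator
        + manifest_cbor     # the bytes of the manifest CBOR item
    )


# Assumed key-type labels mapped to the algorithms that every parser
# must support at a minimum; a parser may define more entries.
MINIMUM_ALGORITHMS = {
    "rsa": "rsa_pss_sha256",
    "ec-secp256r1": "ecdsa_secp256r1_sha256",
    "ec-secp384r1": "ecdsa_secp384r1_sha384",
}


def signing_algorithm(key_type: str) -> Optional[str]:
    """Partial function: None means it is undefined for this key type."""
    return MINIMUM_ALGORITHMS.get(key_type)
```

Returning None for an unknown key type models the partial function being undefined, in which case the parser moves on to the next signature rather than failing.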
Let signing-alg be the result of applying this function to the key type in
certificate’s Subject Public Key Info. If the function is undefined on this
input, the parser MUST continue to the next signature.

Use signing-alg to verify that signature[“signature”] is message’s
signature by certificate’s public key. If it’s not, the parser MUST
continue to the next signature.

Append certificate to signing-certificates. Note that failed signatures
simply cause their certificate to be ignored, so that packagers can give new
signature types to parsers that understand them.

Let origin be manifest[“metadata”][“origin”].

Try to find a certificate in signing-certificates that has an identity
(RFC 2818, Section 3.1) matching origin’s hostname, and that is trusted
for serverAuth (RFC 5280, Section 4.2.1.12) using paths built from elements
of certificates or any other certificates the parser is aware of. If no such
certificate is found, and the package is not already trusted as received from
origin’s hostname, for example because it was received over a TLS connection
to that host, then parsing MUST fail.

TODO: Process the subpackages item by fetching those manifests via the index,
and checking their signatures and dates/hashes, recursively.

The parsed manifest consists of the set of signing-certificates and the
manifest CBOR item. The items in manifest[“metadata”] SHOULD be interpreted
as described in the Web App Manifest specification.

A sub-package is represented by a file looked up as
a resource within the indexed-content section. The sub-package’s
resources are not otherwise distinguished from the rest of the resources in the
package. Sub-packages can form an arbitrarily-deep tree.

There are three possible forms of dependencies on sub-packages, of which we
allow two. Because a sub-package’s manifest is protected by its own signature,
if the main package trusts the sub-package’s server, it could avoid specifying a
version of the sub-package at all. However, this opens the main package up to
downgrade attacks, where the sub-package is replaced by an older, vulnerable
version, so we don’t allow this option.

If the main package wants to load either the sub-package it was built with or
any upgrade, it can specify the date of the original sub-package:

Constraining packages with their date makes it possible to link together
sub-packages with common dependencies, even if the sub-packages were built at
different times.

If the main package wants to be certain it’s loading the exact version
of a sub-package that it was built with, it can constrain the sub-package
with a hash of its manifest:

Note that because the sub-package may include sub-sub-packages by date, the top
package may need to explicitly list those sub-sub-packages’ hashes in order to
be completely constrained.

To parse the resource from a package corresponding to a header-list, a
parser MUST do the following:

Find the (resource-key, offset, length) triple in package’s index where
resource-key is the same as header-list. If no such triple exists, the
parser MUST fail.

Parse one CBOR item starting at offset as the following CDDL:

If the item doesn’t match the CDDL or it’s not a Canonical CBOR item (Section
3.9 of RFC 7049), parsing MUST fail.

Decode the response-headers field using HPACK (RFC 7541), with an
initially-empty dynamic table. The decoded response-headers is a
header list (RFC 7541, Section 1.3), an ordered list of name-value
pairs.

The parser MUST fail if any of the following is true:

* HPACK decoding encountered an error.
* The first header name within response-headers is not “:status”,
  or this pseudo-header’s value violates a requirement in Section
  8.1.2.4 of RFC 7540.
* Any other header name includes the “:” character or is not
  lower-case ASCII (RFC 7540, Section 8.1.2).
* The header-list contains any header names other than “:scheme”,
  “:authority”, “:path”, and either response-headers has no “vary”
  header (Section 7.1.4 of RFC 7231) or these header names aren’t
  listed in it.

Let origin be the Web Origin (RFC 6454) of header-list’s “:scheme” and
“:authority” headers.

Let resource-bytes be the result of encoding the array of
[header-list, response-headers, body] as Canonical CBOR in the
following CDDL schema:

Note: This step would be inside the manifest-only block, but
then the code block is rendered out-of-order.

Note that this uses the decoded header fields, not the bytes originally included
in the package.

The hashed data differs from Subresource Integrity (SRI), which only hashes the body. Including the
headers will usually prevent a package from relying on some of its contents
being transferred as normal network responses, unless its author can guarantee
the network won’t change or reorder the headers.

If the package contains a manifest:

TODO: Let origin-manifest be the signed manifest for origin, found
by searching through manifest’s subpackages for a matching origin.

Let alg be one of the hash-algorithms within origin-manifest. The
parser SHOULD select the most collision-resistant hash algorithm. If the
parser also implements SRI, it SHOULD use the same order as its
getPrioritizedHashFunction() implementation.

If the digest of resource-bytes using alg does not appear in the
origin-manifest’s resource-hashes[alg] array, the parser MUST fail.

Return the (decoded response-headers, body) pair.

Packages SHOULD consist of a single Canonical CBOR item matching the
webpackage CDDL rule given earlier in this document.

Every resource’s hash SHOULD appear in every array within
resource-hashes: otherwise the set of valid resources will depend on
the parser’s choice of preferred hash algorithm.

Signature validation is difficult.

Packages with a valid signature need to be invalidated when either

* the private key for any certificate in the signature’s validation
  chain is leaked, or
* a vulnerability is discovered in the package’s contents.

Because packages are intended to be used offline, it’s impossible to
inject a revocation check into the critical path of using the package,
and even in online scenarios, such revocation checks don’t actually work.
Instead, package consumers must check for a sufficiently recent set of
validation files, consisting of OCSP responses (RFC 6960) and signed
package version constraints, for example within the last 7-30 days.

TODO: These version constraints aren’t designed yet.

Relaxing the requirement to consult DNS when determining authority for an origin
means that an attacker who possesses a valid certificate no longer needs to be
on-path to redirect traffic to them; instead of modifying DNS, they need only
convince the user to visit another Web site, in order to serve packages signed
as the target.

All subpackages that mention a particular origin need to be validated
before loading resources from that origin. Otherwise, package A could
include package B and an old, vulnerable version of package C that B
also depends on. If B’s dependency isn’t checked before loading
resources from C, A could compromise B.

IANA maintains the registry of Internet Media Types (RFC 6838)
at https://www.iana.org/assignments/media-types.

Type name: application

Subtype name: package+cbor

Note: I suspect the mime type will need to be a bit longer:
application/webpackage+cbor or similar.

Required parameters: N/A

Optional parameters: N/A

Encoding considerations: binary

Security considerations: See the security considerations of this document.

Interoperability considerations: N/A

Published specification: This document

Applications that use this media type: None yet, but it is expected that web
browsers will use this format.

Fragment identifier considerations: N/A

Additional information:

  Deprecated alias names for this type: N/A
  Magic number(s): 85 48 F0 9F 8C 90 F0 9F 93 A6
  File extension(s): .wpk
  Macintosh file type code(s): N/A

Person & email address to contact for further information:
See the Author’s Address section of this specification.

Intended usage: COMMON

Restrictions on usage: N/A

Author:
See the Author’s Address section of this specification.

Change controller:
The IESG iesg@ietf.org

Provisional registration? (standards tree only): Not yet.

References:

* RFC 7049: Concise Binary Object Representation (CBOR)
* CBOR data definition language (CDDL): a notational convention to express CBOR data structures
* Web App Manifest
* HTML. WHATWG.
* Subresource Integrity
* RFC 2119: Key words for use in RFCs to Indicate Requirement Levels
* RFC 8174: Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words
* RFC 7541: HPACK: Header Compression for HTTP/2
* RFC 7540: Hypertext Transfer Protocol Version 2 (HTTP/2)
* RFC 5280: Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile
* The Transport Layer Security (TLS) Protocol Version 1.3
* RFC 2818: HTTP Over TLS
* RFC 7231: Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
* RFC 6454: The Web Origin Concept
* RFC 6960: X.509 Internet Public Key Infrastructure Online Certificate Status Protocol - OCSP
* Service Workers 1
* RFC 7233: Hypertext Transfer Protocol (HTTP/1.1): Range Requests
* RFC 6838: Media Type Specifications and Registration Procedures

Thanks to Adam Langley and Ryan Sleevi for in-depth feedback about the security
impact of this proposal.