dispatch                                                      J. Yasskin
Internet-Draft                                                    Google
Intended status: Informational                           August 30, 2017
Expires: March 3, 2018

              Use Cases and Requirements for Web Packages
                 draft-yasskin-webpackage-use-cases-00

Abstract

This document lists use cases for signing and/or bundling collections of web pages, and extracts a set of requirements from them.

Note to Readers

Discussion of this draft takes place on the ART area mailing list (art@ietf.org), which is archived at https://mailarchive.ietf.org/arch/search/?email_list=art.

The source code and issues list for this draft can be found in https://github.com/WICG/webpackage.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on March 3, 2018.

Copyright Notice

Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1.  Introduction
2.  Use cases
  2.1.  Essential
    2.1.1.  Offline installation
    2.1.2.  Offline browsing
    2.1.3.  Save and share a web page
  2.2.  Nice-to-have
    2.2.1.  Packaged Web Publications
    2.2.2.  Third-party security review
    2.2.3.  Building packages from multiple libraries
    2.2.4.  CDNs
    2.2.5.  Installation from a self-extracting executable
    2.2.6.  Ergonomic replacement for HTTP/2 PUSH
    2.2.7.  Packages in version control
3.  Requirements
  3.1.  Essential
    3.1.1.  Indexed by URL
    3.1.2.  Request headers
    3.1.3.  Response headers
    3.1.4.  Signing as an origin
    3.1.5.  Random access
    3.1.6.  Resources from multiple origins in a package
    3.1.7.  Cryptographic agility
    3.1.8.  Unsigned content
    3.1.9.  Certificate revocation
    3.1.10. Downgrade prevention
    3.1.11. Metadata
    3.1.12. Implementations are hard to get wrong
  3.2.  Nice to have
    3.2.1.  Streamed loading
    3.2.2.  Cross-signatures
    3.2.3.  Binary
    3.2.4.  Deduplication of diamond dependencies
    3.2.5.  Old crypto can be removed
    3.2.6.  Compress transfers
    3.2.7.  Compress stored packages
    3.2.8.  Subsetting and reordering
    3.2.9.  Packaged validity information
    3.2.10. Signing uses existing TLS certificates
    3.2.11. External dependencies
    3.2.12. Trailing length
4.  Non-goals
  4.1.  Store confidential data
  4.2.  Generate packages on the fly
  4.3.  Non-origin identity
  4.4.  DRM
5.  Security Considerations
6.  IANA Considerations
7.  References
  7.1.  Informative References
  7.2.  URIs
Appendix A.  Acknowledgements
Author's Address

1.  Introduction

People would like to use content offline and in other situations where there isn't a direct connection to the server where the content originates.
However, it's difficult to distribute and verify the authenticity of applications and content without a connection to the network. The W3C has addressed running applications offline with Service Workers ([ServiceWorkers]), but not the problem of distribution.

Previous attempts at packaging web resources (e.g. Resource Packages [3] and the W3C TAG's packaging proposal [4]) were motivated by speeding up the download of resources from a single server, which is probably better achieved through other mechanisms like HTTP/2 PUSH, possibly augmented with a simple manifest of URLs a page plans to use [5]. This attempt is instead motivated by avoiding a connection to the origin server at all. It may still be useful for the earlier use cases, so they're still listed, but they're not primary.

2.  Use cases

These use cases are in rough descending priority order. If use cases have conflicting requirements, the design should enable more important use cases.

2.1.  Essential

2.1.1.  Offline installation

Alex can download a file containing a website (a PWA [6]) including a Service Worker from origin "O", and transmit it to their peer Bailey, and then Bailey can install the Service Worker with a proof that it came from "O". This saves Bailey the bandwidth costs of transferring the website.

Associated requirements:

o  Indexed by URL: Resources on the web are addressed by URL.
o  Request headers: If Bailey's running a different browser from Alex or has a different language configured, the "accept*" headers are important for selecting which resource to use at each URL.
o  Response headers: The meaning of a resource is heavily influenced by its HTTP response headers.
o  Signing as an origin: To prove that the file came from "O".
o  Signing uses existing TLS certificates: So "O" doesn't have to spend lots of money buying a specialized certificate.
o  Resources from multiple origins in a package: So the site can be built from multiple components (Section 2.2.3).
o  Cryptographic agility: Today's algorithms will eventually be obsolete and will need to be replaced.
o  Certificate revocation: "O"'s certificate might be compromised or mis-issued, and the attacker shouldn't then get an infinite ability to mint packages.
o  Downgrade prevention: "O"'s site might have an XSS vulnerability, and attackers with an old signed package shouldn't be able to take advantage of the XSS forever.
o  Metadata: The browser needs to know which resource within a package file to treat as its Service Worker and/or initial HTML page.

2.1.1.1.  Online use

Bailey may have an internet connection through which they can, in real time, fetch updates to the package they received from Alex.

2.1.1.2.  Fully offline use

Or Bailey may not have any internet connection a significant fraction of the time, either because they have no internet at all, because they turn off internet except when intentionally downloading content, or because they use up their plan partway through each month.

Associated requirements beyond Offline installation:

o  Packaged validity information: Even without a direct internet connection, Bailey should be able to check that their package is still valid.

2.1.2.  Offline browsing

Alex can download a file containing a large website (e.g. Wikipedia) from its origin, save it to transferable storage (e.g. an SD card), and hand it to their peer Bailey. Then Bailey can browse the website with a proof that it came from "O". Bailey may not have the storage space to copy the website before browsing it.

Associated requirements beyond Offline installation:

o  Random access: To avoid needing a long linear scan before using the content.
o  Compress stored packages: So that more content can fit on the same storage device.

2.1.3.  Save and share a web page

Casey is viewing a web page and wants to save it either for offline use or to show it to their friend Dakota. Since Casey isn't the web page's author, they don't have the private key needed to sign the page. Browsers currently allow their users to save pages, but each browser uses a different format (MHTML, Web Archive, or files in a directory), so Dakota and Casey would need to be using the same browser. Casey could also take a screenshot, at the cost of losing links and accessibility.

Associated requirements:

o  Unsigned content: A client can't sign content as another origin.
o  Resources from multiple origins in a package: General web pages include resources from multiple origins.
o  Indexed by URL: Resources on the web are addressed by URL.
o  Response headers: The meaning of a resource is heavily influenced by its HTTP response headers.

2.2.  Nice-to-have

2.2.1.  Packaged Web Publications

The W3C's Publishing Working Group [7], merged from the International Digital Publishing Forum (IDPF) and in charge of EPUB maintenance, wants to be able to create publications on the web and then let them be copied to different servers or to other users via arbitrary protocols. See their Packaged Web Publications use cases [8] for more details.

Associated requirements:

o  Indexed by URL: Resources on the web are addressed by URL.
o  Signing as an origin: So that readers can be sure their copy is authentic and so that copying the package preserves the URLs of the content inside it.
o  Downgrade prevention: An early version of a publication might contain incorrect content, and a publisher should be able to update that without worrying that an attacker can still show the old content to users.
o  Metadata: A publication can have copyright and licensing concerns; a title, author, and cover image; an ISBN or DOI name; etc.; which should be included when that publication is packaged.

Other requirements are similar to those from Offline installation:

o  Random access: To avoid needing a long linear scan before using the content.
o  Compress stored packages: So that more content can fit on the same storage device.
o  Request headers: If different users' browsers have different capabilities or preferences, the "accept*" headers are important for selecting which resource to use at each URL.
o  Response headers: The meaning of a resource is heavily influenced by its HTTP response headers.
o  Signing uses existing TLS certificates: So a publisher doesn't have to spend lots of money buying a specialized certificate.
o  Cryptographic agility: Today's algorithms will eventually be obsolete and will need to be replaced.
o  Certificate revocation: The publisher's certificate might be compromised or mis-issued, and an attacker shouldn't then get an infinite ability to mint packages.

2.2.2.  Third-party security review

Some users may want to grant certain permissions only to applications that have been reviewed for security by a trusted third party. These third parties could provide guarantees similar to those provided by the iOS, Android, or ChromeOS app stores, which might allow browsers to offer more powerful capabilities than have been deemed safe for unaudited websites.

Binary transparency for websites is similar: as with Certificate Transparency [RFC6962], the transparency logs would sign the content of the package to provide assurance that experts had a chance to audit the exact package a client received.

Associated requirements:

o  Cross-signatures

2.2.3.  Building packages from multiple libraries

Large programs are built from smaller components.
In the case of the web, components can be included either as JavaScript files or as "