W3C

Reporting API 1

W3C Working Group Note

This version:
http://www.w3.org/TR/2016/NOTE-reporting-1-20160607/
Latest version:
http://www.w3.org/TR/reporting-1/
Latest Reporting API version:
http://www.w3.org/TR/reporting/
Editor's Draft:
https://w3c.github.io/reporting/
Previous version:
http://www.w3.org/TR/2016/WD-reporting-1-20160407/
Version History:
https://github.com/w3c/reporting/commits/master/index.src.html
Editors:
(Google Inc.)
(Google Inc.)
Participate:
File an issue (open issues)

Abstract

This document defines a generic reporting framework which allows web developers to associate a set of named reporting endpoints with an origin. Various platform features (like Content Security Policy, Network Error Reporting, and others) will use these endpoints to deliver feature-specific reports in a consistent manner.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a proposal and may change without notice. Interested parties should bring discussions to the Web Platform Incubator Community Group.

This document was published by the Web Performance Working Group as a Working Group Note for historical purposes. If you wish to make comments regarding this document, please direct them to the Web Platform Incubator Community Group instead. All comments are welcome.

You may find historical discussion in public-web-perf@w3.org (archives) with [Reporting] at the start of your email's subject.

Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 September 2015 W3C Process Document.

1. Introduction

1.1. Guarantees

This specification aims to provide a best-effort report delivery system that executes out-of-band with website activity. The user agent will be able to do a better job prioritizing and scheduling delivery of reports, as it has an overview of cross-origin activity that individual websites do not, and can deliver reports based on error conditions that would prevent a website from loading in the first place.

The delivery is not, however, guaranteed in a strict sense. We spell out a reasonable set of retry rules in the algorithms below, but it’s quite possible for a report to be dropped on the floor if things go badly.

Reporting can generate a good deal of traffic, so we allow developers to set up groups of endpoints in order to distribute load. Each of these endpoints will receive a subset of the generated reports which target that group. The user agent will do its best to deliver a particular report to at most one endpoint in a group. That is, reports will not fan out to all the endpoints in a group; instead, the user agent will attempt delivery to one endpoint, and fall back to another upon failure.

1.2. Examples

MegaCorp Inc. wants to collect Content Security Policy and Key Pinning violation reports. It can do so by delivering the following header to define a set of reporting endpoints named "endpoint-1":

Report-To: { "url": "https://example.com/reports",
             "group": "endpoint-1",
             "max-age": 10886400 },
           { "url": "https://backup.com/reports",
             "group": "endpoint-1",
             "max-age": 10886400 }

And the following headers, which direct CSP and HPKP reports to that group:

Content-Security-Policy: ...; report-to=endpoint-1
Public-Key-Pins: ...; report-to=endpoint-1

After processing reports for a little while, MegaCorp Inc. decides to split the processing of these two types of reports out into two distinct endpoints in order to make the processing scripts simpler. It can do so by delivering the following header to define two reporting endpoints:

Report-To: { "url": "https://example.com/csp-reports",
             "group": "csp-endpoint",
             "max-age": 10886400 },
           { "url": "https://example.com/hpkp-reports",
             "group": "hpkp-endpoint",
             "max-age": 10886400 }

And the following headers, which direct CSP and HPKP reports to those named endpoints:

Content-Security-Policy: ...; report-to=csp-endpoint
Public-Key-Pins: ...; report-to=hpkp-endpoint
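As a non-normative illustration, a server could generate the first Report-To value shown above as follows. This is a minimal sketch: the helper name build_report_to is hypothetical, and it assumes the header value is simply a comma-separated sequence of JSON objects, as the examples suggest.

```python
import json

# 126 days expressed in seconds, which equals the 10886400 used in the examples.
MAX_AGE = 126 * 24 * 60 * 60


def build_report_to(endpoints):
    """Serialize endpoint dictionaries into a Report-To header value.

    The value is a comma-separated sequence of JSON objects, that is, a
    JSON array without the enclosing brackets (an assumption based on the
    examples in this section).
    """
    return ", ".join(json.dumps(endpoint) for endpoint in endpoints)


header_value = build_report_to([
    {"url": "https://example.com/reports", "group": "endpoint-1", "max-age": MAX_AGE},
    {"url": "https://backup.com/reports", "group": "endpoint-1", "max-age": MAX_AGE},
])
print("Report-To: " + header_value)
```

Both objects share the group name "endpoint-1", so the second endpoint acts as a fallback for the first rather than receiving its own copy of each report.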

2. Concepts

2.1. Endpoints

An endpoint is a location to which reports may be sent.

Each endpoint has a url, which is a URL.

Each endpoint has a failures, which is a non-negative integer representing the number of consecutive times this endpoint has failed to respond to a request.

Each endpoint has a retry-after, which is either null, or a timestamp after which delivery should be retried.

Each endpoint has a clients list, which is a list of clients.

An endpoint is expired if every client in its list of clients is expired.

An endpoint is pending if its retry-after is not null, and represents a time in the future.

2.2. Clients

A client represents a particular origin’s relationship to an endpoint.

Each client has an origin, which is an origin.

Each client has a subdomains flag (which is either "include" or "exclude").

Each client has a group, which is an ASCII string.

Each client has a ttl, representing the number of seconds the client remains valid for an endpoint.

Each client has a creation, which is the timestamp at which the client was added to an endpoint.

A client is expired if its creation plus its ttl represents a time in the past.
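The endpoint and client concepts can be modelled as plain records. The following is a non-normative sketch, not a required data layout: the class and method names are illustrative, and timestamps are assumed to be seconds since the Unix epoch, passed in explicitly as now.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Client:
    origin: str       # serialized origin, e.g. "https://example.com"
    subdomains: str   # either "include" or "exclude"
    group: str        # ASCII group name
    ttl: int          # seconds the client remains valid
    creation: float   # time at which the client was added (seconds)

    def is_expired(self, now: float) -> bool:
        # Expired once creation + ttl lies in the past.
        return self.creation + self.ttl < now


@dataclass
class Endpoint:
    url: str
    failures: int = 0
    retry_after: Optional[float] = None
    clients: List[Client] = field(default_factory=list)

    def is_expired(self, now: float) -> bool:
        # Expired if every client is expired. Note that all() is True for
        # an empty list, so an endpoint with no clients counts as expired.
        return all(client.is_expired(now) for client in self.clients)

    def is_pending(self, now: float) -> bool:
        # Pending while retry-after is set and still in the future.
        return self.retry_after is not None and self.retry_after > now
```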

2.3. Reports

A report is a collection of arbitrary data which the user agent is expected to deliver to a specified endpoint.

Each report has a body, which is either null or an object which can be serialized into a JSON text.

Each report has a url, which is the address of the Document or Worker from which the report was generated.

Note: We strip the username, password, and fragment from this serialized URL. See §7.1 Capability URLs.

Each report has an origin, which is an origin representing the report’s initiator.

Each report has a group, which is a string representing the group to which this report will be sent.

Each report has a type, which is a non-empty string specifying the type of data the report contains.

Each report has a timestamp, which records the time at which the report was generated, in milliseconds since the unix epoch.

Each report has an attempts, which is a non-negative integer representing the number of times the user agent attempted to deliver the report.

2.4. Storage

A conformant user agent MUST provide a reporting cache, which is a storage mechanism that maintains a set of endpoints that websites have instructed the user agent to associate with their origins, and a set of reports which are queued for delivery.

This storage mechanism is opaque, vendor-specific, and not exposed to the web, but it MUST provide the following methods which will be used in the algorithms this document defines:

  1. Insert, update, and remove endpoints.

  2. Enqueue and dequeue reports for delivery.

  3. Retrieve a list of endpoint objects.

  4. Retrieve a list of queued report objects.

  5. Clear the cache.
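Since the storage mechanism is opaque and vendor-specific, any concrete shape is a guess. The sketch below shows one possible in-memory form covering the five methods listed above; the class name ReportingCache, its method names, and the choice to key endpoints by URL are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Report:
    body: Any
    url: str
    origin: str
    group: str
    type: str
    timestamp: int      # milliseconds since the unix epoch
    attempts: int = 0


class ReportingCache:
    """Illustrative in-memory cache; a real user agent would persist this."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, Any] = {}   # keyed by endpoint URL
        self._reports: List[Report] = []

    # 1. Insert, update, and remove endpoints.
    def upsert_endpoint(self, url: str, endpoint: Any) -> None:
        self._endpoints[url] = endpoint

    def remove_endpoint(self, url: str) -> None:
        self._endpoints.pop(url, None)

    # 2. Enqueue and dequeue reports for delivery.
    def enqueue(self, report: Report) -> None:
        self._reports.append(report)

    def dequeue(self, report: Report) -> None:
        self._reports.remove(report)

    # 3. Retrieve a list of endpoint objects.
    def endpoints(self) -> List[Any]:
        return list(self._endpoints.values())

    # 4. Retrieve a list of queued report objects.
    def reports(self) -> List[Report]:
        return list(self._reports)

    # 5. Clear the cache.
    def clear(self) -> None:
        self._endpoints.clear()
        self._reports.clear()
```

Returning copies from endpoints() and reports() lets the delivery algorithms iterate over a stable snapshot while the cache is modified.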

3. Endpoint Delivery

A server MAY define a set of reporting endpoints for an origin it controls via the Report-To HTTP response header field. This mechanism is defined in §3.1 The Report-To HTTP Response Header Field, and its processing in §3.2 Process reporting endpoints for response to request.

3.1. The Report-To HTTP Response Header Field

The Report-To HTTP response header field instructs the user agent to store reporting endpoints for an origin. The header is represented by the following ABNF grammar [RFC5234]:

Report-To = json-field-value
            ; See Section 2 of [HTTP-JFV], and Section 2 of [RFC7159]

The header’s value is interpreted as an array of JSON objects, as described in Section 4 of [HTTP-JFV].

Each object in the array defines an endpoint to which reports may be delivered, and will be parsed as defined in §3.3 Parse reporting endpoint tuples from value.

The following subsections define the initial set of known members in each JSON object the header’s value defines. Future versions of this document may define additional such members, and user agents MUST ignore unknown members when parsing the header.

3.1.1. The url member

The REQUIRED url member is a string that defines the location of a reporting endpoint. The member’s value MUST be a string; any other type will result in a parse error.

Moreover, the URL that the member’s value represents MUST be potentially trustworthy [SECURE-CONTEXTS]. Non-secure endpoints will be ignored.

3.1.2. The group member

The OPTIONAL group member is a string that associates a name with the reporting endpoint. The member’s value MUST be a string; any other type will result in a parse error. If no member named "group" is present in the object, the endpoint MUST be associated with a group named "default".

The group name need not be unique: multiple endpoints may use the same name to create a set of reporting endpoints that can be used for backup and failover purposes.

Note: If a group resolves to multiple endpoints, the user agent will deliver a particular report to at most one endpoint in that group on a best-effort basis.

3.1.3. The includeSubdomains member

The OPTIONAL includeSubdomains member is a boolean that enables an endpoint for all subdomains of the current origin’s host. If no member named "includeSubdomains" is present in the object, or its value is not true, the endpoint will not be enabled for subdomains.

3.1.4. The max-age member

The REQUIRED max-age member defines the reporting endpoint’s lifetime, as a non-negative integer number of seconds. The member’s value MUST be a number; any other type will result in a parse error.

A value of "0" will cause the endpoint to be removed from the user agent’s reporting cache.

3.2. Process reporting endpoints for response to request

Given a response (response) and a request (request), this algorithm extracts a list of endpoint objects, and updates the reporting cache accordingly.

Note: This algorithm is called from around step 13 of main fetch [FETCH], and only updates the reporting cache if the response has been delivered securely.

Issue: Fetch monkey patching. Talk to Anne.

  1. Abort these steps if any of the following conditions are true:

    1. response’s HTTPS state is not "modern", and the origin of response’s url is not potentially trustworthy.

    2. response’s header list does not contain a header whose name is "Report-To".

  2. Let header be the value of the header in response’s header list whose name is "Report-To".

  3. Let origin be the origin of response’s url.

  4. Let tuples be the result of executing §3.3 Parse reporting endpoint tuples from value on header.

  5. For each tuple in tuples:

    1. Let new be a new client whose properties are set as follows:

      origin: origin
      subdomains: tuple’s subdomains
      group: tuple’s group
      ttl: tuple’s ttl
      creation: The current timestamp

    2. If there exists an endpoint (endpoint) in the reporting cache whose url is tuple’s URL:

      1. For each client in endpoint’s clients, delete client if new’s origin is the same as client’s origin.

      2. If new’s ttl is greater than 0, append new to endpoint’s clients list.

      3. Skip to the next tuple.

    3. Otherwise, there is no existing endpoint, so let endpoint be a new endpoint whose properties are set as follows:

      url: tuple’s URL
      failures: 0
      retry-after: null
      clients: A new list, containing new

    4. Insert endpoint into the reporting cache for origin.
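Step 5 of the algorithm above can be sketched as follows. This is a non-normative illustration that represents the reporting cache as a dictionary keyed by endpoint URL and represents clients, endpoints, and tuples as plain dictionaries; those representations, and the function name process_tuples, are assumptions. It also assumes that a newly created endpoint's clients list holds the newly created client.

```python
import time


def process_tuples(cache, origin, tuples, now=None):
    """Update `cache` (endpoint URL -> endpoint dict) for `origin`.

    Each tuple is a dict with "url", "subdomains", "ttl" and "group"
    keys, as produced by the header parsing algorithm.
    """
    now = time.time() if now is None else now
    for t in tuples:
        new = {
            "origin": origin,
            "subdomains": t["subdomains"],
            "group": t["group"],
            "ttl": t["ttl"],
            "creation": now,
        }
        endpoint = cache.get(t["url"])
        if endpoint is not None:
            # Drop any client previously registered by this origin ...
            endpoint["clients"] = [
                c for c in endpoint["clients"] if c["origin"] != origin
            ]
            # ... and re-register it unless max-age was 0 (removal).
            if new["ttl"] > 0:
                endpoint["clients"].append(new)
            continue
        # No endpoint with this URL exists yet, so create one.
        cache[t["url"]] = {
            "url": t["url"],
            "failures": 0,
            "retry_after": None,
            "clients": [new],
        }
```

Because an origin's existing client is deleted before the new one is appended, a repeated header refreshes the client's creation time and ttl rather than accumulating duplicates.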

3.3. Parse reporting endpoint tuples from value

Given a string (value) this algorithm will return a list of tuples containing a URL, subdomains flag, ttl, and a group. This list will be empty if no valid endpoints could be parsed.

  1. Let list be the result of executing the algorithm defined in Section 4 of [HTTP-JFV] on value. If that algorithm results in an error, abort these steps.

  2. Let tuples be an empty list.

  3. For each item in list:

    1. If item has no member named "url", or that member’s value is not a string, skip to the next item.

    2. If item has no member named "max-age", or that member’s value is not a number, skip to the next item.

    3. Let tuple be a new tuple containing the following:

      "url": The result of executing the URL parser on item’s "url" member’s value.
      "subdomains": "include" if item has a member named "includeSubdomains" whose value is true, "exclude" otherwise.
      "ttl": item’s "max-age" member’s value.
      "group": item’s "group" member’s value if present, and "default" otherwise.

    4. If tuple’s "url" is not potentially trustworthy, or tuple’s "ttl" is a negative number, skip to the next item.

      Note: User agents SHOULD raise some sort of developer-visible parse error in this case.

    5. Append tuple to tuples.

  4. Return tuples.
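The parsing steps above might be approximated as shown below. This is a sketch under stated assumptions, not a conformant parser: it assumes the [HTTP-JFV] algorithm amounts to wrapping the field value in brackets and parsing it as a JSON array, it substitutes a simple "scheme is https" test for the potentially trustworthy check in [SECURE-CONTEXTS], and it keeps the URL as a string instead of running the URL parser. The function names are illustrative.

```python
import json
from urllib.parse import urlsplit


def is_https(url):
    """Simplified stand-in for the 'potentially trustworthy' check."""
    try:
        return urlsplit(url).scheme == "https"
    except ValueError:
        return False


def parse_report_to(value):
    """Return a list of endpoint tuples (as dicts) parsed from `value`."""
    try:
        # Assumed reading of [HTTP-JFV]: the value is a JSON array whose
        # enclosing brackets have been omitted.
        items = json.loads("[" + value + "]")
    except ValueError:
        return []

    tuples = []
    for item in items:
        if not isinstance(item, dict):
            continue
        url = item.get("url")
        max_age = item.get("max-age")
        if not isinstance(url, str):
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(max_age, bool) or not isinstance(max_age, (int, float)):
            continue
        t = {
            "url": url,
            "subdomains": "include" if item.get("includeSubdomains") is True else "exclude",
            "ttl": max_age,
            "group": item.get("group", "default"),
        }
        if not is_https(t["url"]) or t["ttl"] < 0:
            continue  # a user agent would surface a parse error here
        tuples.append(t)
    return tuples
```

Unknown members are ignored simply because the code never looks at them, which matches the requirement that user agents ignore unknown members.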

4. Report Delivery

Over time, various features will queue up a list of reports in the user agent’s reporting cache. The user agent will periodically grab the list of currently pending reports, and deliver them to the associated endpoints. This document does not define a schedule for the user agent to follow, and assumes that the user agent will have enough contextual information to deliver reports in a timely manner, balanced against impacting a user’s experience.

That said, a user agent SHOULD make an effort to deliver reports as soon as possible after queuing, as a report’s data might be significantly more useful in the period directly after its generation than it would be a day or a week later.

4.1. Queue data as type for endpoint group on settings

Given a serializable object (data), a string (type), another string (endpoint group), and an environment settings object (settings), the following algorithm will create a report, and add it to the reporting cache’s queue for future delivery.

  1. Let report be a new report object with its values initialized as follows:

    body: data
    origin: settings’s origin
    group: endpoint group
    type: type
    timestamp: The current timestamp.
    attempts: 0

  2. Let url be settings’s creation URL.

  3. Set url’s username to the empty string, and its password to null.

  4. Set report’s url to the result of executing the URL serializer on url with the exclude fragment flag set.

  5. Add report to the reporting cache.

Note: We strip the username, password, and fragment from the serialized URL in the report. See §7.1 Capability URLs.

Note: The user agent MAY reject reports for any reason. This API does not guarantee delivery of arbitrary amounts of data, for instance.
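The URL sanitization in steps 3 and 4 can be illustrated with Python's standard library. This is a rough sketch only: urllib.parse is not the URL parser and serializer this algorithm refers to [WHATWG-URL], and the function name strip_for_report is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_for_report(url):
    """Remove the username, password, and fragment from `url`."""
    parts = urlsplit(url)
    # Everything after the last "@" in the authority is host[:port];
    # if there is no "@", the authority is returned unchanged.
    netloc = parts.netloc.rpartition("@")[2]
    # Passing an empty fragment drops the "#..." portion.
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, ""))


print(strip_for_report("https://user:secret@example.com:8443/a/b?x=1#section"))
```

The query string is deliberately preserved, since the algorithm strips only credentials and the fragment.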

4.2. Does endpoint match report?

Given an endpoint (endpoint) and a report (report), this algorithm returns "Match" if report should be sent to endpoint, and "Does Not Match" otherwise:

  1. For each client in endpoint’s clients:

    1. If client is expired, skip to the next client.

      Note: In this case, the user agent MAY remove client from endpoint, or it may wait and collect garbage en masse at some point in the future as described in §5.2 Garbage Collection.

    2. Return "Match" if each of the following criteria is met:

      1. report’s group is an ASCII case-insensitive match for client’s group

      2. If client’s subdomains is "exclude", report’s origin is the same as client’s origin

        Otherwise, report’s origin's host is either a superdomain match or congruent match for client’s origin's host [RFC6797]

  2. Return "Does Not Match".
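A non-normative sketch of the matching steps follows. It represents clients and reports as dictionaries and origins as serialized strings, compares origins by string equality, and reads the subdomain rule as "the report's host equals the client's host or is a subdomain of it"; each of these is an assumption made for illustration.

```python
from urllib.parse import urlsplit


def client_is_expired(client, now):
    return client["creation"] + client["ttl"] < now


def host_of(origin):
    """Host component of a serialized origin such as 'https://example.com'."""
    return urlsplit(origin).hostname


def endpoint_matches(endpoint, report, now):
    for client in endpoint["clients"]:
        if client_is_expired(client, now):
            continue
        # Group names are ASCII strings, so lower() gives an ASCII
        # case-insensitive comparison here.
        if report["group"].lower() != client["group"].lower():
            continue
        if client["subdomains"] == "exclude":
            if report["origin"] == client["origin"]:
                return "Match"
        else:
            report_host = host_of(report["origin"])
            client_host = host_of(client["origin"])
            # Congruent match, or the report's host is a subdomain of
            # the client's host.
            if report_host == client_host or report_host.endswith("." + client_host):
                return "Match"
    return "Does Not Match"
```

Comparing against "." + client_host rather than the bare host avoids treating "notexample.com" as a subdomain of "example.com".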

4.3. Send reports

A user agent sends reports by executing the following steps:

  1. Let reports be a copy of the list of queued report objects in the reporting cache.

  2. Let endpoint map be an empty map of endpoint objects to lists of report objects.

  3. For each report in reports:

    1. For each endpoint in the reporting cache:

      1. If endpoint is expired, skip to the next endpoint.

        Note: In this case, the user agent MAY remove endpoint from the reporting cache, or it may wait and collect garbage en masse at some point in the future as described in §5.2 Garbage Collection.

      2. If endpoint is pending, skip to the next endpoint.

      3. If §4.2 Does endpoint match report? returns "Match" when executed upon endpoint and report:

        1. Append report to endpoint map’s list of reports for endpoint.

        2. Skip to the next report.

          Note: This ensures that each report is assigned to a single endpoint, even if it matches multiple. In order to ensure an even distribution across endpoints, the user agent SHOULD randomize the order in which it walks through endpoints.

    2. If we reach this step, the report did not match any endpoint and the user agent MAY remove report from the reporting cache directly. Depending on load, the user agent MAY instead wait for §5.2 Garbage Collection at some point in the future.

  4. For each (endpoint, reports) pair in endpoint map, execute the following steps asynchronously:

    1. Let result be the result of executing §4.4 Attempt to deliver reports to endpoint on endpoint and reports.

    2. If result is "Success":

      1. Set endpoint’s failures to 0, and its retry-after to null.

      2. Remove each report in reports from the reporting cache.

      Otherwise:

      1. Increment endpoint’s failures.

      2. Set endpoint’s retry-after to a point in the future which the user agent chooses.

        Note: We don’t specify a particular algorithm here, but user agents are encouraged to employ some sort of exponential backoff algorithm which increases the retry period with the number of failures, with the addition of some random jitter to ensure that temporary failures don’t lead to a crush of reports all being retried on the same schedule.

        Issue: Add in a reasonable reference describing a good algorithm. Wikipedia, if nothing else.

Note: User agents MAY decide to attempt delivery for only a subset of the collected reports or endpoints (because, for example, sending all the reports at once would consume an unreasonable amount of bandwidth, etc). As reports are only removed from the cache when they’re successfully delivered, skipped reports will simply be delivered later.
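The handling of a delivery result in step 4 might look like the following sketch. The retry schedule is left to the user agent, so the backoff shown here is only one possibility: it doubles a base delay per consecutive failure, caps it, and randomizes the upper half of the interval. The base of 60 seconds, the cap of one day, and the function names are arbitrary choices made for illustration.

```python
import random


def backoff_delay(failures, base=60.0, cap=86400.0, rng=random):
    """Seconds to wait before retrying after `failures` consecutive failures."""
    # Bound the exponent so very large failure counts cannot overflow.
    ceiling = min(cap, base * (2 ** min(failures, 20)))
    half = ceiling / 2
    # Keep half of the interval fixed and randomize the rest, so that
    # endpoints failing at the same moment do not all retry together.
    return half + rng.uniform(0, half)


def record_result(endpoint, result, now, rng=random):
    """Update an endpoint dict after a delivery attempt."""
    if result == "Success":
        endpoint["failures"] = 0
        endpoint["retry_after"] = None
    else:
        endpoint["failures"] += 1
        endpoint["retry_after"] = now + backoff_delay(endpoint["failures"], rng=rng)
```

With these defaults, the first failure schedules a retry between 60 and 120 seconds later, the second between 120 and 240 seconds, and so on up to the cap.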

4.4. Attempt to deliver reports to endpoint

Given a list of reports (reports) and an endpoint (endpoint), this algorithm will construct a request, and attempt to deliver it to endpoint. It returns "Success" if that delivery succeeds, "Remove Endpoint" if the endpoint explicitly removes itself as a reporting endpoint by sending a 410 response, and "Failure" otherwise.

  1. Let collection be a new ECMAScript Array object [ECMA-262].

  2. For each report in reports:

    1. Let data be a new ECMAScript Object with the following properties [ECMA-262]:

      age: The number of milliseconds between report’s timestamp and the current time.
      type: report’s type
      url: report’s url
      report: report’s body

      Note: Client clocks are unreliable and subject to skew. We therefore deliver an age attribute rather than an absolute timestamp. See also §8.2 Clock Skew.

    2. Increment report’s attempts.

    3. Append data to collection.

  3. Let request be a new request with the following properties [FETCH]:

    url: endpoint’s url
    header list: A new header list containing a header named "Content-Type" whose value is "application/report"
    client: null
    window: "no-window"
    skip-service-worker flag: Set.
    initiator: ""
    type: "report"
    destination: ""
    mode: "cors"
    credentials: "include"
    body: The string resulting from executing the JSON.stringify() algorithm on collection [ECMA-262]

    Issue: The "report" type does not exist in Fetch. Talk to Anne.

  4. Queue a task to fetch request.

  5. Wait for a response (response).

  6. If response’s status is an OK status (200-299), return "Success".

  7. If response’s status is 410 Gone [RFC7231], return "Remove Endpoint".

  8. Return "Failure".
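The payload construction (steps 1 and 2) and the status handling (steps 6 to 8) can be sketched without performing any network request. In this non-normative example, reports are dictionaries, json.dumps stands in for JSON.stringify(), and the function names are illustrative.

```python
import json


def build_report_body(reports, now_ms):
    """Serialize `reports` into the JSON text used as the request body."""
    collection = []
    for report in reports:
        collection.append({
            # Relative age instead of an absolute timestamp.
            "age": now_ms - report["timestamp"],
            "type": report["type"],
            "url": report["url"],
            "report": report["body"],
        })
        report["attempts"] += 1
    return json.dumps(collection)


def classify_status(status):
    """Map an HTTP response status to the algorithm's return value."""
    if 200 <= status <= 299:
        return "Success"
    if status == 410:
        return "Remove Endpoint"
    return "Failure"
```

The resulting string would be sent with a Content-Type of "application/report", as described in step 3.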

5. Implementation Considerations

5.1. Delivery

The user agent SHOULD attempt to deliver reports as soon as possible to provide feedback to developers as quickly as possible. However, when this desire is balanced against the impact on the user, the user wins. With that in mind, the user agent MAY delay delivery of reports based on its knowledge of the user’s activities and context.

For instance, the user agent SHOULD prioritize the transmission of reporting data lower than other network traffic. The user’s explicit activities on a website should preempt reporting traffic.

The user agent MAY choose to withhold report delivery entirely until the user is on a fast, cheap network in order to prevent unnecessary data cost.

The user agent MAY choose to prioritize reports from particular origins over others (perhaps those that the user visits most often).

5.2. Garbage Collection

Periodically, the user agent SHOULD walk through the cached reports and endpoints, and discard those that are no longer relevant. These include:

6. Sample Reports

POST / HTTP/1.1
Host: example.com
...
Content-Type: application/report

[{
  "type": "csp",
  "age": 10,
  "url": "https://example.com/vulnerable-page/",
  "report": {
    "blocked": "https://evil.com/evil.js",
    "directive": "script-src",
    "policy": "script-src 'self'; object-src 'none'",
    "status": 200,
    "referrer": "https://evil.com/"
  }
}, {
  "type": "hpkp",
  "age": 32,
  "url": "https://www.example.com/",
  "report": {
    "date-time": "2014-04-06T13:00:50Z",
    "hostname": "www.example.com",
    "port": 443,
    "effective-expiration-date": "2014-05-01T12:40:50Z",
    "include-subdomains": false,
    "served-certificate-chain": [
      "-----BEGIN CERTIFICATE-----\n
      MIIEBDCCAuygAwIBAgIDAjppMA0GCSqGSIb3DQEBBQUAMEIxCzAJBgNVBAYTAlVT\n
      ...
      HFa9llF7b1cq26KqltyMdMKVvvBulRP/F/A8rLIQjcxz++iPAsbw+zOzlTvjwsto\n
      WHPbqCRiOwY1nQ2pM714A5AuTHhdUDqB1O6gyHA43LL5Z/qHQF1hwFGPa4NrzQU6\n
      yuGnBXj8ytqU0CwIPX4WecigUCAkVDNx\n
      -----END CERTIFICATE-----",
      ...
    ]
  }
}, {
  "type": "nel",
  "age": 29,
  "url": "https://example.com/thing.js",
  "report": {
    "referrer": "https://www.example.com/",
    "server-ip": "234.233.232.231",
    "protocol": "",
    "status-code": 0,
    "elapsed-time": 143,
    "age": 0,
    "type": "http.dns.name_not_resolved"
  }
}]

7. Security Considerations

7.1. Capability URLs

Some URLs are valuable in and of themselves. To mitigate the possibility that such URLs will be leaked via this reporting mechanism, we strip out credential information and fragment data from the URL we store as a report’s originator. It is still possible, however, for a feature to unintentionally leak such data via a report’s body. Implementers SHOULD ensure that URLs contained in a report’s body are similarly stripped.

8. Privacy Considerations

8.1. Network Leakage

Because this reporting mechanism is out-of-band, and doesn’t rely on a page being open, it’s entirely possible for a report generated while a user is on one network to be sent while the user is on another network, even if they don’t explicitly open the page from which the report was sent.

Issue: Consider mitigations. For example, we could drop reports if we change from one network to another. <https://github.com/WICG/BackgroundSync/issues/107>

8.2. Clock Skew

Each report is delivered along with an age property, rather than the timestamp at which it was generated. We do this because each user’s local clock will be skewed from the clock on the server by an arbitrary amount. The difference between the time the report was generated and the time it was sent will be stable, regardless of clock skew, and we can avoid the fingerprinting risk of exposing the clock skew via this API.
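As a small worked example, a report collector can estimate when a report was generated using only its own clock, by subtracting the reported age from the time of receipt. The function name below is hypothetical, and the estimate ignores network transit time.

```python
def estimate_generation_time_ms(received_at_ms, age_ms):
    """Estimate a report's generation time on the server's clock.

    Both arguments are in milliseconds; `received_at_ms` comes from the
    server's clock, so the client's clock skew never enters the result.
    """
    return received_at_ms - age_ms


# A report with age 10 received at server time 1465300000000 ms.
print(estimate_generation_time_ms(1465300000000, 10))
```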

8.3. Cross-origin correlation

If multiple origins all use the same reporting endpoint, that endpoint may learn that a particular user has interacted with a certain set of websites, as it will receive origin-tagged reports from each. This doesn’t seem worse than the status quo ability to track the same information from cooperative origins, and doesn’t grant any new tracking ability above and beyond what’s possible with <img> today.

8.4. Subdomains

This specification allows any resource on a host to declare a set of reporting endpoints for that host and each of its subdomains. This doesn’t have privacy implications in and of itself (beyond those noted in §8.5 Clearing the reporting cache): the reporting endpoints themselves don’t take any real action, and features will need to opt into using these reporting endpoints explicitly. Those features certainly will have privacy implications, and should carefully consider whether they should be enabled across origin boundaries.

8.5. Clearing the reporting cache

A user agent’s reporting cache contains data about a user’s activity on the web, and user agents ought to handle this data carefully. In particular, if a user agent gives users the ability to clear their site data, browsing history, browsing cache, or similar, the user agent MUST also clear the reporting cache. Note that this includes both the pending reports themselves, as well as the endpoints to which they would be sent. Both MUST be cleared.

8.6. Disabling Reporting

Reporting is, to some extent, a question of commons. In the aggregate, it seems useful for everyone for reports to be delivered. There is direct benefit to developers, as they can fix bugs, which means there’s indirect benefit to users, as the sites they enjoy will be more stable and enjoyable. As a concrete example, Content Security Policy grants something like herd immunity to cross-site scripting attacks by alerting developers about potential holes in their sites' defenses. Fixing those bugs helps every user, even those whose user agents don’t support Content Security Policy.

The calculus, of course, depends on the nature of data that’s being delivered, and the relative maliciousness of the reporting endpoints, but that’s the value proposition in broad strokes.

That said, this general benefit cannot be allowed to take priority over the ability of a user to individually opt out of such a system. Sending reports costs bandwidth, and potentially could reveal some small amount of additional information above and beyond what a website can obtain in-band ([NETWORK-ERROR-LOGGING], for instance). User agents MUST allow users to disable reporting with some reasonable amount of granularity in order to maintain the priority of constituencies espoused in [HTML-DESIGN-PRINCIPLES].

9. IANA Considerations

The permanent message header field registry should be updated with the following registration: [RFC3864]

9.1. Report-To

Header field name: Report-To
Applicable protocol: http
Status: standard
Author/Change controller: W3C
Specification document: This specification (see §3.1 The Report-To HTTP Response Header Field)

References

Normative References

[ECMA-262]
ECMAScript Language Specification. URL: https://tc39.github.io/ecma262/
[FETCH]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[HTML]
Ian Hickson. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[HTTP-JFV]
Julian Reschke. A JSON Encoding for HTTP Header Field Values. URL: https://greenbytes.de/tech/webdav/draft-reschke-http-jfv-02.html
[RFC3864]
G. Klyne; M. Nottingham; J. Mogul. Registration Procedures for Message Header Fields. September 2004. Best Current Practice. URL: https://tools.ietf.org/html/rfc3864
[RFC5234]
D. Crocker, Ed.; P. Overell. Augmented BNF for Syntax Specifications: ABNF. January 2008. Internet Standard. URL: https://tools.ietf.org/html/rfc5234
[RFC6797]
J. Hodges; C. Jackson; A. Barth. HTTP Strict Transport Security (HSTS). November 2012. Proposed Standard. URL: https://tools.ietf.org/html/rfc6797
[RFC7159]
T. Bray, Ed.. The JavaScript Object Notation (JSON) Data Interchange Format. March 2014. Proposed Standard. URL: https://tools.ietf.org/html/rfc7159
[RFC7231]
R. Fielding, Ed.; J. Reschke, Ed.. Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. June 2014. Proposed Standard. URL: https://tools.ietf.org/html/rfc7231
[SECURE-CONTEXTS]
Mike West; Yan Zhu. Secure Contexts. URL: https://w3c.github.io/webappsec-secure-contexts/
[WHATWG-URL]
Anne van Kesteren; Sam Ruby. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/

Informative References

[CSP]
Brandon Sterne; Adam Barth. Content Security Policy 1.0. 19 February 2015. NOTE. URL: http://www.w3.org/TR/CSP1/
[HTML-DESIGN-PRINCIPLES]
Anne van Kesteren; Maciej Stachowiak. HTML Design Principles. 26 November 2007. WD. URL: http://www.w3.org/TR/html-design-principles/
[NETWORK-ERROR-LOGGING]
Ilya Grigorik; et al. Network Error Logging. 25 February 2016. WD. URL: http://www.w3.org/TR/network-error-logging/
[RFC7469]
C. Evans; C. Palmer; R. Sleevi. Public Key Pinning Extension for HTTP. April 2015. Proposed Standard. URL: https://tools.ietf.org/html/rfc7469
