20789 – "digest" (cryptographic hash) attribute for <script>

This is an archived snapshot of W3C's public Bugzilla bug tracker, which was decommissioned in April 2019.

Summary: "digest" (cryptographic hash) attribute for <script>

Status:	RESOLVED NEEDSINFO

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	HTML5 spec
Version:	unspecified
Hardware:	PC Linux

Importance:	P2 normal
Target Milestone:	---
Assignee:	Edward O'Connor
QA Contact:	HTML WG Bugzilla archive list

URL:	https://github.com/pwnall/script-dige...
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2013-01-28 02:19 UTC by Victor Costan
Modified:	2015-04-08 16:10 UTC
CC List:	15 users

See Also:

Description Victor Costan 2013-01-28 02:19:08 UTC

Please add a "signature" attribute to the <script> tag, which can be used to ensure that the script that will be executed matches the script that the page author believes will be executed.

* Example

The following example shows a <script> with a signature attribute that matches this proposal.

* Motivation

Many Web sites import popular scripts from CDNs (content distribution networks) to improve the user experience by increasing cache hit ratios. Unfortunately, this requires fully trusting the CDNs, which receive the power to execute arbitrary JavaScript with the credentials of the sites that use them.

If <script> supports signature checking, the CDNs can at most perform a denial of service attack by returning the wrong data. Note that using the https: scheme does not solve this problem, because it only protects the JavaScript while it is in transit between the server and the client.

* Proposed Syntax

This proposal introduces an optional 'signature' attribute to the <script> tag. 'signature' is silently ignored if the script does not have a 'src' attribute.

The proposed syntax of 'signature' is as follows:

signature-value := algorithm-id : hash-value
algorithm-id := one or more of the following characters: A-Z, a-z, 0-9, _
hash-value := one or more of the characters in RFC 2045 Section 6.8 (base64) [1]
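
The grammar above can be checked mechanically. The following is a minimal Python sketch, not part of the proposal; the name parse_signature is hypothetical, and it assumes that no whitespace is allowed around the colon and that "=" padding counts as part of the base64 alphabet.

```python
import re

# signature-value := algorithm-id : hash-value
# algorithm-id uses A-Z, a-z, 0-9 and _; hash-value uses the base64
# alphabet from RFC 2045 Section 6.8 (including "=" padding, an assumption).
_SIGNATURE_RE = re.compile(r"([A-Za-z0-9_]+):([A-Za-z0-9+/=]+)")


def parse_signature(value):
    """Return (algorithm_id, hash_value), or None if the value is malformed."""
    match = _SIGNATURE_RE.fullmatch(value)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A value such as "sha256" with no colon, or one whose algorithm-id contains a hyphen, yields None under this sketch.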

* Hashing Algorithms

All the hashing algorithms considered by this specification operate on binary data, which is called "script material" in this proposal. The script material is the exact binary representation of the script as it appears in the body of the HTTP response returned when the script is fetched. This is purposefully dependent on the script's character encoding.

This proposal introduces one algorithm with algorithm-id sha256. According to this algorithm, a script's hash value is obtained by first computing the SHA-256 hash of the script material according to FIPS 180-4, and then base64-encoding it according to Section 6.8 of RFC 2045.

On a system that has the curl and openssl command-line tools installed, the hash value for a script can be computed using the following command:

curl -s http://cdn.com/script.js | openssl dgst -sha256 -binary | openssl enc -base64
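
For readers without those tools, the same hash value can be computed with the Python standard library. This is a minimal sketch under the assumption that the script material has already been fetched as bytes; the function name script_hash_value is hypothetical.

```python
import base64
import hashlib


def script_hash_value(script_material: bytes) -> str:
    """SHA-256 the raw bytes, then base64-encode the 32-byte digest."""
    digest = hashlib.sha256(script_material).digest()
    return base64.b64encode(digest).decode("ascii")
```

The input must be the undecoded response body. Hashing a decoded and re-encoded string could give a different result if the character encoding changes along the way.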

* Fallback

No fallback is required. User agents that do not understand the 'signature' attribute will silently ignore it.

* References

[1] https://tools.ietf.org/html/rfc2045#section-6.8
[2] http://csrc.nist.gov/publications/fips/fips180-4/fips-180-4.pdf

Comment 1 Victor Costan 2013-01-28 02:42:04 UTC

I forgot to talk about failure modes.


* Failure Modes

If the signature attribute's value does not match the syntax in this specification, or if the algorithm-id is not supported by the user agent, the signature attribute should be ignored. The user agent can optionally log a warning to its console.

If the hash-value in the signature attribute value does not match the value computed for the fetched script, the user agent shall consider that the fetch resulted in a network error. Most importantly, the user agent shall not execute the script.
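
The two failure modes can be sketched as a single decision function. This is a minimal Python illustration, not the behavior of any real user agent; Outcome, SUPPORTED and check_script are hypothetical names, and only sha256 is assumed to be supported.

```python
import base64
import hashlib
import re
from enum import Enum


class Outcome(Enum):
    EXECUTE = "execute"
    NETWORK_ERROR = "network error"


# Algorithms this hypothetical user agent understands.
SUPPORTED = {"sha256": hashlib.sha256}


def check_script(signature: str, script_material: bytes) -> Outcome:
    match = re.fullmatch(r"([A-Za-z0-9_]+):([A-Za-z0-9+/=]+)", signature)
    if match is None or match.group(1) not in SUPPORTED:
        # Malformed value or unknown algorithm-id: the attribute is ignored,
        # so the script runs as if no signature had been given.
        return Outcome.EXECUTE
    algorithm_id, expected = match.groups()
    digest = SUPPORTED[algorithm_id](script_material).digest()
    actual = base64.b64encode(digest).decode("ascii")
    if actual != expected:
        # Mismatch: treat the fetch as a network error and do not execute.
        return Outcome.NETWORK_ERROR
    return Outcome.EXECUTE
```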

Comment 2 David Mays 2013-01-28 02:46:06 UTC

What will happen in the event that a user modifies the script after it loads, by using a JavaScript console in the user-agent/browser? Is this signature hash meant to be computed and compared at load-time of the script?

Comment 3 Xi Wang 2013-01-28 02:56:35 UTC

One minor comment. I'd prefer a hexadecimal/base16 signature, which is case-insensitive and easier to compute.

<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.9.0/jquery.min.js" signature="sha256:7fa0d5c3f538c76f878e012ac390597faecaabfe6fb9d459b919258e76c5df8e">
</script>
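
Both encodings carry the same 32-byte SHA-256 digest, so converting between them is mechanical. The following is a minimal Python sketch; the function names are hypothetical.

```python
import base64


def hex_to_base64(hex_digest: str) -> str:
    """Convert a hexadecimal digest (any letter case) to base64."""
    return base64.b64encode(bytes.fromhex(hex_digest)).decode("ascii")


def base64_to_hex(b64_digest: str) -> str:
    """Convert a base64 digest to lowercase hexadecimal."""
    return base64.b64decode(b64_digest, validate=True).hex()
```

For a SHA-256 digest the hexadecimal form is 64 characters long, while the padded base64 form is 44 characters long.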

Comment 4 Victor Costan 2013-01-28 03:03:29 UTC

@David: The intent of this proposal is to protect against compromised / malicious CDN servers. I do not want to restrict the user's ability to run code in their browser.

I'm not sure the HTML5 specification should cover the operation of developer tools, but if that is the case, signature checking should only occur when the script is fetched from a remote source (file://, http://, https:// etc.), or from a cache for a remote source.

Comment 5 nickolai 2013-01-28 03:33:59 UTC

I think it would be better to call this attribute hash=, since it is a hash and not a signature.

It might be worthwhile to propose such an attribute for other elements that can be loaded from a specified URL.  I'm primarily thinking of <style src=..> tags, which can be used to attack a page, but also <img src=..>, where I might want to ensure that the displayed contents of my page are not affected by a compromised server.

One security consideration is that such a tag may allow the parent page to learn something about the content of the specified resource -- namely, whether its content hashes to the specified value -- by observing whether the resource loads correctly or not.

Comment 6 Victor Costan 2013-01-28 05:38:16 UTC

@nickolai: these are two very good points.

I would prefer "hash" as an attribute name. I didn't propose it because I was afraid it might be confused with "window.location.hash".

The content-matching is a very good point! I see two avenues for solving this:

1) The presence of a "hash" / "signature" attribute with a valid value causes the script resource to be fetched according to the CORS specification [3] where withCredentials is false. This relies on proven existing standards, but requires infrastructure changes on the CDNs, which would have to add the HTTP header "access-control-allow-origin: *".

2) The hash check only succeeds if the script contains a magic comment "//@ allowHashing", along the lines of the source maps specification [4]. For inter-operability with source maps, the magic comment should be allowed to occur anywhere in the file. This is likely to be easier to implement in user agents and CDNs, assuming an appropriate magic comment can be figured out.
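
The two opt-in checks can be sketched roughly as follows. This is a Python illustration under stated assumptions, not a specification: the function names are hypothetical, response headers are assumed to arrive as a dict, and the magic comment is assumed to sit on a line of its own (the proposal only says it may occur anywhere in the file).

```python
MAGIC_COMMENT = b"//@ allowHashing"


def cors_allows(response_headers: dict, page_origin: str) -> bool:
    """Option 1: the server opted in through a CORS response header."""
    # Header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    return allowed == "*" or allowed == page_origin


def allows_hashing(script_material: bytes) -> bool:
    """Option 2: the script itself opted in through the magic comment."""
    return any(line.strip() == MAGIC_COMMENT for line in script_material.splitlines())
```

In either case the hash comparison would only be performed, and its result only made observable to the page, once the opt-in check has passed.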

[3] http://www.w3.org/TR/cors/
[4] https://github.com/mozilla/source-map

Comment 7 estark 2013-01-28 17:08:24 UTC

I agree that "signature" might be confusing as an attribute name and that hash would be better. To avoid confusion with window.location.hash, "digest" might be appropriate.

One advantage of requiring the CDN to serve CORS headers is that it would make this a more backwards-compatible proposal: if websites want to always verify their static content, they could use a script tag with a signature/hash/digest attribute when available, and otherwise fall back to fetching the script via XHR, hashing it in JS, and eval'ing it.

Comment 8 Victor Costan 2013-01-28 18:36:34 UTC

@estark: Thank you for the comments!

I really like "digest". It's both more accurate and shorter than "signature". I changed the bug title to reflect this.

Xi Wang just pointed out that HTTP also uses "digest" to refer to cryptographic hashes, in its Digest authentication method [5].

I think it's good to push CDNs to use CORS headers when serving JavaScript in general, and I think the first alternative should definitely become a part of the specification.

The second alternative can be the starting point for exploring a fallback mechanism for opting a script into "digest" if you don't have control over your server's HTTP headers. This would be consistent with allowing <meta charset> [6] as a fallback for not being able to set Content-Type and allowing a <meta http-equiv> as a fallback for not being able to set Content-Security-Policy [7] headers.

[5] https://tools.ietf.org/html/rfc2617#section-3
[6] http://www.w3.org/TR/html-markup/meta.charset.html
[7] https://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-specification.dev.html#html-meta-element--experimental

Comment 9 Victor Costan 2013-01-28 23:43:11 UTC

I updated the proposal to reflect the feedback in the previous comments and posted it on GitHub.

https://github.com/pwnall/script-digest/blob/master/README.md

I have also updated the bug's URL to point to the page above.

Comment 10 Edward O'Connor 2013-02-06 20:37:50 UTC

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the Editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the Tracker Issue; or you may create a Tracker Issue
yourself, if you are able to do so. For more details, see this document:

   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: No spec change.
Rationale: Duplicate of bug 11402.

*** This bug has been marked as a duplicate of bug 11402 ***

Comment 11 Victor Costan 2013-02-06 21:41:24 UTC

@Edward: thank you very much for reading through this proposal!

I respectfully disagree that this is a duplicate of the proposal in bug 11402, although they share some of the same mechanisms.

11402 proposes using hashes for bandwidth savings. In that proposal, a hash match short-circuits the download process. In the meantime, CDNs have emerged as an alternative method for achieving the same bandwidth savings without the need for a standard change.

This proposal introduces a hash verification step after the script is downloaded. It is not susceptible to the cache poisoning attack in bug 11402, because scripts are always downloaded from their origins. Even if an attacker can carry out a second pre-image attack against SHA2, they still have to compromise the CDN provider and cause the CDN to deliver the attacker's script. This is an improvement over the current situation, where an attacker that can compromise the CDN gets to execute arbitrary scripts in the context of the original site.

Also, while the bug 11402 proposal features a similar syntax for specifying cryptographic digests, it does not handle the information leak attack in #5. I believe that is a consequence of the fact that 11402 was put together with performance in mind, while this proposal is focused on improving security.

Given these concerns, I think it would be constructive to consider this proposal on its own, separately from bug 11402.

Comment 12 estark 2013-02-06 21:53:56 UTC

I agree with Victor's comments and also wanted to emphasize that neither of the two controversies in bug 11402 seems to apply here:

1.) The cache poisoning attack doesn't seem to be relevant even to bug 11402, since the attack can only be carried out successfully if the browser fails to verify the hash before caching the script, which would be a major implementation error in the browser.

2.) In this proposal, the digest attribute does not affect the browser's caching behavior, so the bitrot problem mentioned in bug 11402 would not apply to the proposed digest hash. If a developer updates a library and forgets to update some script tag's digest attribute, then the bug will show up for all users, and its manifestation won't depend on the state of a user's cache as in bug 11402. In practice, libraries hosted on CDNs often include version numbers in the filenames anyway (e.g. http://code.jquery.com/jquery-1.9.1.min.js) so script tags already have to be updated when new versions are pushed.

Comment 13 Robin Berjon 2013-02-18 13:37:52 UTC

My first reaction to this would be that if you can't trust your CDN, you probably shouldn't be using a CDN in the first place, no? Or at least not that one.

That being said, if you wish to push this further the next step for you would be to garner support from an implementer. A new feature like this won't make it into the spec unless we know someone plans to implement it. Based on that, we can see how to move it forward (or not).

Comment 14 Edward O'Connor 2013-02-22 18:11:22 UTC

Status: Additional Information Needed
Change Description: No spec change.
Rationale: Resolving NEEDSINFO per Robin's comment 13.

Comment 15 Victor Costan 2013-02-22 19:45:19 UTC

@Robin: I disagree that Web application authors have to trust the CDNs. If the CDN is down, there are ways to fall back to a resource on the author's server, using JavaScript. I hope that we'll eventually have a general method for doing that in HTML, perhaps along the lines of srcset. It would be nice to have the same failure mode if the CDN is compromised and starts serving content that differs from what authors intend it to serve.
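
The failure mode described above, where a compromised CDN is handled like an unavailable one, can be sketched as a loop over candidate sources. This is a minimal Python illustration; load_with_fallback and the fetch callable are hypothetical, and fetch is assumed to return the response body as bytes or raise OSError on a network failure.

```python
import base64
import hashlib
from typing import Callable, Iterable, Optional


def load_with_fallback(
    urls: Iterable[str],
    expected_b64: str,
    fetch: Callable[[str], bytes],
) -> Optional[bytes]:
    """Return the first body whose SHA-256 digest matches, else None."""
    for url in urls:
        try:
            body = fetch(url)
        except OSError:
            # Source is down: try the next one.
            continue
        actual = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
        if actual == expected_b64:
            return body
        # Digest mismatch: handled the same way as an outage.
    return None
```

Listing the CDN URL first and the author's own server second gives the behavior described in this comment: the author's copy is used only when the CDN is unreachable or serves unexpected content.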

Thank you very much for the advice to approach browser vendors. I will make one more pass over the specification and then I will start doing that. I would appreciate any feedback w.r.t. unresolved issues in the specification, especially if it comes before I start trying to implement it :)

Comment 16 Bnaya Peretz 2015-04-08 16:10:10 UTC

It would be nice to have some kind of callback or error handler for when signature validation fails.
This would give developers the option to fall back gracefully to the origin server's URL, or simply a way to receive reports about the failure, similar to CSP violation reports.