This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html
Multipage: http://www.whatwg.org/C#atob
Complete: http://www.whatwg.org/c#atob
Referrer: http://www.whatwg.org/specs/web-apps/current-work/multipage/workers.html

Comment: Based on my testing, step 3 (remove all space characters from input) does not match the current behavior of WebKit, Blink, Firefox and IE10. Should the specification be updated to match the behavior of all major browsers?

Posted from: 91.154.118.240
User agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.71 Safari/537.36
For example, the following throws an InvalidCharacterError in all browsers I could test: atob("abcd "); This case is part of the following test suite, where it is expected to succeed: http://www.w3c-test.org/html/tests/submission/AryehGregor/base64.html
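To make the two behaviours concrete, here is a minimal sketch of a decoder that can run either way. `decodeBase64` and its `stripWhitespace` option are hypothetical names for illustration, not a browser API, and the validation steps are my reading of the spec algorithm rather than any engine's actual code.

```javascript
// Illustrative only: decodeBase64 and stripWhitespace are made-up names.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function decodeBase64(input, { stripWhitespace }) {
  let data = String(input);

  // Spec behaviour: drop tab, LF, FF, CR and space before validating.
  if (stripWhitespace) {
    data = data.replace(/[\t\n\f\r ]+/g, "");
  }

  // Remove one or two trailing "=" when the length is a multiple of four.
  if (data.length % 4 === 0) {
    data = data.replace(/==?$/, "");
  }

  // Anything outside the alphabet, or an impossible length, is an error.
  if (data.length % 4 === 1 || /[^A-Za-z0-9+\/]/.test(data)) {
    throw new Error("InvalidCharacterError");
  }

  // Accumulate 6 bits per character and emit a byte whenever 8 are available.
  let bits = 0;
  let bitCount = 0;
  let out = "";
  for (const ch of data) {
    bits = (bits << 6) | ALPHABET.indexOf(ch);
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      out += String.fromCharCode((bits >> bitCount) & 0xff);
      bits &= (1 << bitCount) - 1; // keep only the bits not yet emitted
    }
  }
  return out; // leftover bits are discarded
}

// With stripping, "abcd " decodes to the same three bytes as "abcd".
const lenient = decodeBase64("abcd ", { stripWhitespace: true });

// Without stripping, the trailing space makes the same input an error.
let strictThrew = false;
try {
  decodeBase64("abcd ", { stripWhitespace: false });
} catch (e) {
  strictThrew = true;
}
```

With `stripWhitespace: false` the sketch should behave like the browsers tested above; with `stripWhitespace: true` it should behave like the test suite expects.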
Seems like a case where improving the implementations might be in order... unless anyone depends on the exception. It's common to have white space in base64 data.
Yeah, the space stripping was a deliberate change, to ease the burden on authors and to avoid wasting memory on creating a new string without the whitespace.
Breaking on non-alphabet characters is an explicit recommendation of RFC 4648, backed by security considerations (see <http://tools.ietf.org/html/rfc4648#section-12>). It seems like a bad idea to diverge from an authoritative spec and from all implementations at the same time.
The security issues seem pretty minor, but at the end of the day, it's up to the implementors whether they want to do this or not.
FYI, this is now implemented in Blink: https://code.google.com/p/chromium/issues/detail?id=314682
> at the end of the day, it's up to the implementors whether they want to do this or not.

This doesn't really add up. Can't exactly the same be said of every spec mistake, word for word? Having a 1:1 mapping between original and encoded forms is a really nice trait; adding randomness to the process is strange, to say the least.
> This doesn't really add up. Can't exactly the same be said of every spec
> mistake, word for word?

Yes. At the end of the day, the WHATWG HTML spec is just going to spec whatever browsers end up implementing; my job as editor is just to propose what I think (based on the data and arguments I can find and that people bring up) is the optimal behaviour. I don't file bugs on browsers to implement what I spec, or apply pressure in the form of patches, test cases, or even, except in rare cases, directed advocacy, because I want implementors to carefully consider the text and independently review it and decide whether it's sane or not before implementing.

On this particular feature, there are minor security concerns on the one side (people who compare base64 data before decoding it, I guess? I don't really understand how we end up with a security problem here), and there are minor performance wins on the other side (spaces in base64 data are common, and not requiring that scripts strip the spaces is a minor win). The back-compat issues don't seem major (it's unlikely that people are relying on this throwing an exception on spaces in a way that they'd act worse if it ignored spaces, as far as I can tell). Thus, on the whole, the current text seems like a win to me.

However, I could be wrong, and if implementors think I am then they shouldn't implement it, and if they don't, then we'll move the spec back to what they do implement.
Ian, was there any evidence of this being an actual measurable performance improvement on any real-life web sites? We already had interoperability across all browsers, supported by an RFC recommendation and security concerns, as minor as those may be. The behavior was conceptually cleaner; we don't want to be liberal in what we accept unless absolutely necessary. The bar was quite high, and I do not think that this change came anywhere close to meeting it.
I don't have a strong feeling about this. (As far as I recall, this wasn't even something I originally specced, it was Aryeh's good work.) As noted in comment 8, for me it's just a minor performance win vs a minor security risk; I don't really see this as a particularly important issue one way or the other. If the conclusion from implementors is that this is a mistake, then let's change it.
So, let's change it back?
FWIW: Firefox matches the spec as of Firefox 27 (currently on Aurora); see https://bugzilla.mozilla.org/show_bug.cgi?id=711180
Let's fix the spec quickly then, before the change is shipped.
Firefox and Blink already follow the specification, so it looks like there was interest from browser vendors. There is also a patch up for review to implement this in WebKit, so we are actually really close to cross-browser support.
Was there an analysis of these issues performed by Mozilla or Google? Again, this is a basic feature of Base64 encoding. Generally, it's a very desirable trait for any encoding that you can't add undetectable noise to it. Many applications of Base64 (notably JOSE, used in WebCrypto) additionally forbid padding with '=' characters. So for security applications, space and '=' padding is forbidden. Do we really want to re-evaluate all applications of Base64 just to make this small silly change?
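To illustrate the "undetectable noise" point: once a decoder ignores whitespace, several different encoded strings decode to the same bytes, so code that compares or validates the encoded form has to canonicalize first. A rough sketch of such a check follows, assuming Node.js (where `Buffer` is a global and its base64 decoder already skips whitespace); `isCanonicalBase64` is a hypothetical helper name, not an existing API.

```javascript
// Illustrative only: isCanonicalBase64 is a made-up helper.
// Assumes Node.js; Buffer's base64 decoder is lenient about whitespace.
function isCanonicalBase64(encoded) {
  // Decode leniently, re-encode, and require an exact round trip.
  const reencoded = Buffer.from(encoded, "base64").toString("base64");
  return reencoded === encoded;
}

// These two inputs differ as strings but decode to identical bytes.
const clean = Buffer.from("abcd", "base64");
const noisy = Buffer.from("ab cd ", "base64");
const sameBytes = clean.equals(noisy);

// Only the input without extra whitespace survives the round trip.
const cleanIsCanonical = isCanonicalBase64("abcd");
const noisyIsCanonical = isCanonicalBase64("ab cd ");
```

Note that this sketch treats '=' padding as part of the canonical form, so it would need adjusting for profiles that forbid padding, such as the JOSE usage mentioned above.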
cc'ing abarth who may have an opinion about the security implications, since Chrome does this.
I'm not sure I understand what the security issue is supposed to be. The RFC seems worried about covert channels, which aren't usually a threat model we worry about in the web platform.
Given the lay of the land, I think the logical thing to do is to leave the spec as-is instead of changing it to match the Safari behaviour. I understand that this is not something where everyone is on the same page, but I don't see how else to make progress.
It's not "Safari behavior", it's everyone's behavior before this essentially unmotivated spec change was made :-/