12296 – Rules for parsing an integer don't match ES parseInt()

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	CLOSED WONTFIX

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	LC1 HTML5 spec
Version:	unspecified
Hardware:	Other other

Importance:	P3 normal
Target Milestone:	---
Assignee:	Ian 'Hixie' Hickson
QA Contact:	HTML WG Bugzilla archive list

URL:	http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords:	Disagree

Depends on:
Blocks:

Reported:	2011-03-13 22:30 UTC by contributor
Modified:	2011-11-23 11:30 UTC
CC List:	8 users

See Also:	12220

Description contributor 2011-03-13 22:30:46 UTC

Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/common-microsyntaxes.html
Section: http://www.whatwg.org/specs/web-apps/current-work/#signed-integers

Comment:
Whitespace handling here doesn't match ES, or Gecko, or Opera. Is there some
reason to not just say it works the same as parseInt?  Test case alerts 2 in
Firefox and Opera (like parseInt()), 0 in Chrome:

data:text/html,<!doctype html><input tabindex="&nbsp;2"><script>alert(document.querySelector("input").tabIndex);</script>

Posted from: 68.175.61.233
User agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.20 (KHTML, like Gecko) Chrome/11.0.672.2 Safari/534.20

Comment 1 Aryeh Gregor 2011-03-13 22:32:30 UTC

More copy-pasteably:

data:text/html,<!doctype html>
<input tabindex="&nbsp;2">
<script>
alert(document.querySelector("input").tabIndex);
</script>

ES spec:

http://es5.github.com/#x15.1.2.2

I didn't spot any difference other than whitespace handling, but it would be simpler for everyone if we reused the same definition and said this worked like parseInt(x, 10).
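
As an illustration of the whitespace difference described above (an editorial sketch, not part of the original comment): ECMAScript's parseInt() skips a leading U+00A0, while the HTML spec's list of space characters does not contain it. The constant name HTML_SPACE_CHARACTERS is made up for this example.

```javascript
// U+00A0 (NBSP) is in ECMAScript's WhiteSpace set, so parseInt skips it.
const nbspValue = parseInt("\u00A02", 10);

// The HTML spec's "space characters" are only these five code points
// (U+0020, U+0009, U+000A, U+000C, U+000D); NBSP is not among them.
// HTML_SPACE_CHARACTERS is a hypothetical name used for illustration.
const HTML_SPACE_CHARACTERS = [" ", "\t", "\n", "\f", "\r"];
const nbspIsHtmlSpace = HTML_SPACE_CHARACTERS.includes("\u00A0");

console.log(nbspValue);       // 2
console.log(nbspIsHtmlSpace); // false
```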

Comment 2 Aryeh Gregor 2011-03-14 14:15:16 UTC


*** This bug has been marked as a duplicate of bug 12220 ***

Comment 3 Aryeh Gregor 2011-03-14 14:16:01 UTC

Oh, wait, bug 12220 is about floats, while this is about ints.  Related, but not the same.

Comment 4 Aryeh Gregor 2011-03-14 17:54:11 UTC

Actually, here's an even better test-case:

data:text/html,<!doctype html>
<script>
var ol = document.createElement("ol");
ol.setAttribute("start", "\x0b2");
alert(ol.start);
</script>

Per spec, this should alert "1".  It alerts "2" in all browsers (IE9 RC, Firefox 4 RC, Chrome 11 dev, and Opera 11).
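
The difference can be reproduced outside a browser with a rough sketch of the spec's integer-parsing rules (editorial illustration, not from the original thread). The helper name parseHtmlInteger is hypothetical, the optional "+" sign is an assumption, and null stands in for the spec's "error" result, which for ol.start would mean falling back to the default mentioned above.

```javascript
// Hypothetical helper sketching the HTML "rules for parsing integers".
// Returns null where the spec would return an error.
function parseHtmlInteger(input) {
  // Skip only the five HTML space characters; U+000B and U+00A0 are not
  // among them. Accepting a leading "+" is an assumption of this sketch.
  const match = /^[ \t\n\f\r]*([+-]?)([0-9]+)/.exec(input);
  if (match === null) return null;
  const value = parseInt(match[2], 10);
  return match[1] === "-" ? -value : value;
}

console.log(parseHtmlInteger("\x0b2")); // null (U+000B is not skipped)
console.log(parseInt("\x0b2", 10));     // 2 (ES treats U+000B as whitespace)
console.log(parseHtmlInteger(" 2"));    // 2
```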

Comment 5 Ian 'Hixie' Hickson 2011-05-05 20:57:56 UTC

This is intended to match legacy attribute parsing. It's apparently not perfect, and I'm happy to address specific compatibility problems, but referring to ES' parseInt() here is a non-starter. For example, consider:

<ol><li value="0x20">1

Firefox, Opera, and WebKit all get different results, but none of them match ES. (WebKit and IE agree on this case, but that's because they clamp to 1 rather than 0 like the other browsers. Note that the spec here has removed clamping to allow negative numbers, but that's another issue.)

Or consider:

<p tabindex="0x20"><script>w(document.getElementsByTagName('p')[0].tabIndex)</script>

Again, Firefox, Opera, and WebKit all get different results. The spec matches WebKit/IE for this one. ES doesn't match any of them.

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale:

Regarding U+00A0 (NBSP), it's one of a number of areas where browsers happen to coincide and not match the spec, but overall the interoperability is pretty poor. Unless there are specific compatibility problems, I would much rather we keep the spec at the current pretty simple level rather than adding additional complexity to handle specific non-conforming cases that happen to be interoperable today.

Here's a test case that shows some of the weird behaviour in how characters are handled before the number:
http://www.hixie.ch/tests/adhoc/html/attribute-parsing/001.html

Comment 6 Aryeh Gregor 2011-05-06 17:29:47 UTC

(In reply to comment #5)
> This is intended to match legacy attribute parsing. It's apparently not
> perfect, and I'm happy to address specific compatibility problems, but
> referring to ES' parseInt() here is a non-starter. For example, consider:
> 
>    <ol><li value="0x20">1
> 
> Firefox, Opera, and WebKit all get different results, but none of them match
> ES. (WebKit and IE agree on this case, but that's because they clamp to 1
> rather than 0 like the other browsers. Note that the spec here has removed
> clamping to allow negative numbers, but that's another issue.)

parseInt("0x20", 10) results in 0, which is the same as the current algorithm in the spec.  Of course, parseInt("0x20") is 32, but I never said we should use parseInt() without a radix argument.
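
The two calls can be checked directly in any ES5 engine (a small editorial sketch; the variable names are illustrative):

```javascript
// With an explicit radix of 10, the "0x" prefix is not stripped:
// parsing stops at "x", leaving just "0".
const withRadix = parseInt("0x20", 10);

// Without a radix, the "0x" prefix selects base 16, so "20" is read as hex.
const withoutRadix = parseInt("0x20");

console.log(withRadix);    // 0
console.log(withoutRadix); // 32
```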

> Or consider:
> 
>    <p tabindex="0x20"><script>w(document.getElementsByTagName('p')[0].tabIndex)</script>
> 
> Again, Firefox, Opera, and WebKit all get different results. The spec matches
> WebKit/IE for this one. ES doesn't match any of them.

See above.  parseInt("0x20", 10) is 0, same as the spec.

What's a case where the current algorithm gives different results from parseInt(x, 10), *and* where the current algorithm's result is preferable?

> Regarding U+00A0 (NBSP), it's one of a number of areas where browsers happen to
> coincide and not match the spec, but overall the interoperability is pretty
> poor. Unless there are specific compatibility problems, I would much rather we
> keep the spec at the current pretty simple level rather than adding additional
> complexity to handle specific non-conforming cases that happen to be
> interoperable today.
> 
> Here's a test case that shows some of the weird behaviour in how characters are
> handled before the number:
>    http://www.hixie.ch/tests/adhoc/html/attribute-parsing/001.html

It would be much simpler if the spec just matched ES here, not more complex.  You could simply defer to ES.  That's easier to spec, easier to implement, and easier for authors to understand.

Comment 7 Ian 'Hixie' Hickson 2011-05-06 20:16:01 UTC

It's not easier to spec, since this is already done.

Comment 8 Aryeh Gregor 2011-05-06 20:36:32 UTC

Okay, fine.  But spec writers are next-to-last in the priority of constituencies.

Comment 9 Ian 'Hixie' Hickson 2011-08-03 00:32:56 UTC

Status: Rejected
Change Description: no spec change
Rationale: The main way in which the HTML spec's algorithm is superior to the ES algorithm is that ES' algorithm changes every time Unicode adds new whitespace characters. It's also simpler to implement (either from scratch, since it really is a simpler algorithm, or, if you're going to use a library, because you would only need an HTML library, which you're likely already using and which is likely to expose this as a function explicitly, rather than having to find a JS library that exposes this particular algorithm), and simpler to understand from an authoring perspective. It's also easier to spec (no need to track another spec with moving references, not to mention that that spec itself tracks another one), and easier to understand when reading the spec.
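
To make the whitespace point concrete, here is a small editorial probe (not from the original thread) that compares HTML's fixed list of space characters with whatever the running engine's parseInt() skips. The names HTML_SPACE_CHARACTERS and parseIntSkips are hypothetical. U+180E is included only because its Unicode category has changed between Unicode versions, so its result depends on the engine and is deliberately not stated.

```javascript
// The HTML space characters are a fixed list of five code points.
const HTML_SPACE_CHARACTERS = new Set([0x20, 0x09, 0x0a, 0x0c, 0x0d]);

// Hypothetical probe: does this engine's parseInt skip the given
// code point when it appears before a digit?
function parseIntSkips(codePoint) {
  return parseInt(String.fromCodePoint(codePoint) + "2", 10) === 2;
}

// U+000B, U+00A0, U+2003 and U+FEFF are skipped by ECMAScript but are not
// HTML space characters. The result for U+180E varies by engine.
for (const cp of [0x20, 0x0b, 0xa0, 0x2003, 0xfeff, 0x180e]) {
  const label = "U+" + cp.toString(16).toUpperCase().padStart(4, "0");
  console.log(label, "HTML:", HTML_SPACE_CHARACTERS.has(cp),
              "parseInt:", parseIntSkips(cp));
}
```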

Comment 10 Aryeh Gregor 2011-08-03 17:08:50 UTC

I don't think any of those reasons are sufficient to justify the inconsistency, but I doubt I can convince you, so there's no point in arguing further.

Comment 11 Michael[tm] Smith 2011-08-04 05:34:43 UTC

mass-move component to LC1