This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 24241 - Adopt the ES6 "safe integer" range for (unsigned) long longs
Summary: Adopt the ES6 "safe integer" range for (unsigned) long longs
Status: RESOLVED FIXED
Alias: None
Product: WebAppsWG
Classification: Unclassified
Component: WebIDL
Version: unspecified
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Cameron McCormack
QA Contact: public-webapps-bugzilla
Reported: 2014-01-08 19:23 UTC by sof
Modified: 2014-01-28 02:01 UTC
CC List: 8 users

Description sof 2014-01-08 19:23:44 UTC
The request is for WebIDL to consider adopting ES6's "safe integer" range for its (unsigned) long long ES conversion steps ([EnforceRange]).

Some reasoning on why this is preferable is outlined at

 http://lists.w3.org/Archives/Public/public-script-coord/2014JanMar/0000.html
Comment 1 Tab Atkins Jr. 2014-01-08 20:54:51 UTC
To inline the reasoning:

2^53 is the endpoint of the contiguous representable integers.  However, it's not *unambiguously* representable.  In particular, if an integral expression would evaluate to 2^53 + 1, it'll get stored as 2^53.

This means that you can't be sure that your answer is exact just because it's in the [-2^53, 2^53] range.  To be precise, something like this:

var c = a + b;
if(safe(a) && safe(b) && safe(c)) {
  // Definitely got the exact answer!
  return c;
} else {
  // Maybe got a slightly incorrect answer!
  throw "Whoops!";
}

...is buggy, because `var a=2^53, b=1, c=a+b;` passes the checks but gives the wrong result.  You have to lop off one integer from each side of the range for it to work correctly.
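
As a rough illustration of the point above, here is a minimal runnable sketch. The names `isInWideRange`, `isSafe`, and `checkedAdd` are hypothetical helpers made up for this example (they are not from WebIDL or from the comment), and 2^53 is written as Math.pow(2, 53) because `^` is XOR in JavaScript.

```javascript
// Illustrative sketch only; all helper names here are hypothetical.
const TWO_53 = Math.pow(2, 53); // 9007199254740992

// The wider range [-2^53, 2^53], which the comment argues is not enough.
function isInWideRange(n) {
  return Number.isInteger(n) && Math.abs(n) <= TWO_53;
}

// The "safe" range drops one integer from each end: [-(2^53 - 1), 2^53 - 1].
function isSafe(n) {
  return Number.isInteger(n) && Math.abs(n) <= TWO_53 - 1;
}

// Add two numbers and only return the sum if all three values pass the check.
function checkedAdd(a, b, inRange) {
  const c = a + b;
  if (inRange(a) && inRange(b) && inRange(c)) {
    return c;
  }
  throw new RangeError("result may be inexact");
}

// 2^53 + 1 is not representable as a double, so the sum rounds back to 2^53.
console.log(TWO_53 + 1 === TWO_53); // true

// With the wide range the inexact sum passes every check unnoticed.
console.log(checkedAdd(TWO_53, 1, isInWideRange) === TWO_53); // true

// With the safe range the same call is rejected.
try {
  checkedAdd(TWO_53, 1, isSafe);
} catch (e) {
  console.log(e instanceof RangeError); // true
}
```

The `isSafe` helper here should agree with ES6's built-in `Number.isSafeInteger` for number inputs; it is spelled out only to make the range boundaries visible.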

Since WebIDL's definition of long long int is the larger range, this means that spec code that wants unambiguous integers has to specify an additional test, which is wasteful.

It would be nice if WebIDL's definition of long long int was the "safe" range instead.
Comment 2 Marcos Caceres 2014-01-09 01:15:14 UTC
Maybe we should just add a new "number" type that represents ES numbers (and deprecate all the other ones that don't make sense in JS - and are used wrongly in a lot of specs). EnforceRange is not really used anywhere in the platform, AFAIK - so should probably be deprecated also.
Comment 3 Boris Zbarsky 2014-01-09 03:24:47 UTC
Marcos, what is the goal of getting rid of numeric types?

They were added because various specifications wanted to coerce numbers to integers in some range, and having everyone reinvent the wheel for how to do that is pretty suboptimal....  now you may argue that people shouldn't be coercing numbers to integers, but it turns out to be necessary when you need to talk to platform (like operating system, outside the browser) APIs or whatnot.  And then we need to define the exact coercion steps.

As for EnforceRange, it was added so that specifications that want to throw on out of range input instead of the somewhat asinine default modulo behavior (which was left for backwards compat) don't have to reimplement coercion to a narrower integer type from scratch.  And it's used by at least IndexedDB and the Media Source Extensions APIs.
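
To make the difference between the default modulo behavior and [EnforceRange] concrete, here is a simplified sketch for an 8-bit unsigned type. The function `toOctet` and its `enforceRange` option are hypothetical names chosen for this example; the sketch follows the general shape of the WebIDL conversion steps as I understand them and is not the normative algorithm (for instance, it uses `Number()` as a stand-in for ToNumber).

```javascript
// Simplified, non-normative sketch; `toOctet` and `enforceRange` are
// hypothetical names used only for illustration.
function toOctet(value, { enforceRange = false } = {}) {
  let x = Number(value); // stand-in for the ES ToNumber step

  if (enforceRange) {
    // With range enforcement, non-finite and out-of-range inputs throw.
    if (!Number.isFinite(x)) {
      throw new TypeError("value is not a finite number");
    }
    x = Math.trunc(x);
    if (x < 0 || x > 255) {
      throw new TypeError("value is outside the range [0, 255]");
    }
    return x + 0; // normalize -0 to +0
  }

  // Default behavior: NaN and infinities become 0, everything else is
  // truncated and then wrapped modulo 2^8.
  if (!Number.isFinite(x)) {
    return 0;
  }
  x = Math.trunc(x);
  return ((x % 256) + 256) % 256;
}

console.log(toOctet(300)); // 44 (300 modulo 256)
console.log(toOctet(-1)); // 255

try {
  toOctet(300, { enforceRange: true });
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```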
Comment 4 Domenic Denicola 2014-01-09 03:53:22 UTC
As much as I would love to kill all the number types, I think their replacement needs a bit more thought first. E.g. a survey of the usages is important, to find out what predefined ranges people use (and whether they actually depend on it being in that range, or if they also do other range validation and just choose e.g. "short" because the valid values are in the range 1-3, or...).

And you'd want some kind of specification for what to replace it with, which would probably be something vaguely like [ThrowIf(>= 10, < 0)] or [ClampTo(1, 8)] or [Round] or [Floor] or similar.

I think what's most offensive about the numeric types is not the behavior of coercing to integers or mandating specific ranges, both of which are useful and in some cases necessary. What's offensive is that we pretend this behavior has something to do with traditional C-ish concepts of float, double, short, long, long long, etc., or that the variables are actually "typed" that way, when of course they're all just `Number` in JS.
Comment 5 Marcos Caceres 2014-01-09 04:36:02 UTC
(In reply to Boris Zbarsky from comment #3)
> Marcos, what is the goal of getting rid of numeric types?

I'm not suggesting we get rid of the types (as some have a legacy purpose in DOM/HTML, like "long" in various places - though even there the use is highly suspect) - I'm suggesting we make a new type and discourage the use of the other types unless there is a really good reason to do so (as it's already done for various things in WebIDL). A lot of the time, people just need some argument to be coerced into a signed or unsigned number - or need a readonly attribute that returns a number when gotten.

As Domenic suggests, I guess we would need to look at what predefined ranges specs use (and why).
 
> They were added because various specifications wanted to coerce numbers to
> integers in some range, and having everyone reinvent the wheel for how to do
> that is pretty suboptimal....

I thought this was mostly a carryover from OMG IDL from the DOM1 days and for compatibility with Java. But yes, reinventing the wheel would not be fun. However, a more flexible range mechanism (as Domenic suggested in #4 and as I've suggested to Cam in the past also) would make more sense - especially for the silly types like Octet, Byte, Short, etc. which are just range enforcers that are masquerading as types.

>  now you may argue that people shouldn't be
> coercing numbers to integers, but it turns out to be necessary when you need
> to talk to platform (like operating system, outside the browser) APIs or
> whatnot.  And then we need to define the exact coercion steps.

Right, but this is the exception - not the rule AFAICT. As Domenic suggests, we would probably need to check what's actually used in the platform and what value each type actually brings. 

> As for EnforceRange, it was added so that specifications that want to throw
> on out of range input instead of the somewhat asinine default modulo
> behavior (which was left for backwards compat) don't have to reimplement
> coercion to a narrower integer type from scratch.  And it's used by at least
> IndexedDB and the Media Source Extensions APIs.

We should check if the uses there are actually valid/necessary or if there is a better solution (or something that a new type could give you for free).
Comment 6 Boris Zbarsky 2014-01-09 04:44:06 UTC
> What's offensive is that we pretend this behavior has something to do with
> traditional C-ish concepts of float, double, short, long, long long, etc

Well, it does, in actual browser implementations right now...  It's not so much pretending.

> or that the variables are actually "typed" that way, when of course they're
> all just `Number` in JS.

I think that depends on what you mean by "typed", and I think it's worth thinking of incoming and outgoing values separately.

Incoming values aren't even "Number" in JS, really; they're just arbitrary values.  If you pass { valueOf: function() { return Math.random() * 1000; } } to an IDL method labeled as taking "long" no one will stop you.  Of course it might be faster to pass in actual Number values.  But it might also be faster to pass in integer-valued Numbers than it is to pass in Numbers with a nonzero fractional part!  I agree that for incoming values we may want a richer set of constraint/behavior annotations than WebIDL provides now, by the way, so that spec algorithms can just assume their inputs are sane based on the IDL and get on with the bits they really care about.
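
A small sketch of what that coercion looks like from script. The helper `toLong` is a hypothetical name; it uses `| 0`, which applies the ECMAScript ToInt32 operation and, as far as I know, gives the same results as the default (no extended attribute) conversion to "long". Treat that equivalence as an assumption of this example rather than a statement of the spec text.

```javascript
// `toLong` is a hypothetical helper for illustration. `value | 0` runs the
// ECMAScript ToInt32 operation, which calls valueOf() on objects, truncates
// the fractional part, and wraps into the signed 32-bit range.
function toLong(value) {
  return value | 0;
}

// An arbitrary object is accepted; its valueOf() result is what gets coerced.
const sneaky = { valueOf: function () { return 1234.75; } };
console.log(toLong(sneaky)); // 1234

// Out-of-range numbers wrap modulo 2^32 rather than throwing.
console.log(toLong(Math.pow(2, 32) + 5)); // 5
console.log(toLong(Math.pow(2, 31))); // -2147483648

// Strings are converted to numbers first.
console.log(toLong("42")); // 42
```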

For outgoing values, on the other hand, at least SpiderMonkey actually uses information like "this DOM method always returns an integer that we can losslessly represent as a 32-bit machine integer" to perform certain JIT optimizations, and I hope to do more of that sort of thing in the future (e.g. feeding more information to range analysis).  While of course such optimizations can be attempted based on some sort of dynamic guards, it sure helps when you statically know something about value ranges.  Again, it might help to have annotations about the _actual_ range of values here, not just the current size labels, but it's a lot harder to make the case that spec writers should care about annotations like this that in fact do not affect observable behavior...  Right now picking which particular numeric type to return from a method is somewhat a waste of the spec writer's time, since they really do all end up as Number in the end, conceptually.  Except that given JITs, they _don't_ actually all end up as IEEE doubles in practice.
Comment 7 Boris Zbarsky 2014-01-09 04:47:00 UTC
And to be clear, I'm all for people reading through existing specs and so forth.  We definitely need to do it!  Separate bug?
Comment 8 Marcos Caceres 2014-01-09 04:53:12 UTC
Separate bug WFM. Apologies for the derail.
Comment 9 Cameron McCormack 2014-01-28 02:01:47 UTC
Thanks, I've changed the spec to match this:

https://github.com/heycam/webidl/commit/c569851a755f7e51f53ec7b751f2aaaa08abbd4a