Bug 16604 - RFE: add unsigned byte as synonym for octet
Summary: RFE: add unsigned byte as synonym for octet
Alias: None
Product: WebAppsWG
Classification: Unclassified
Component: WebIDL
Version: unspecified
Hardware: PC Windows 3.1
Importance: P2 enhancement
Target Milestone: ---
Assignee: Cameron McCormack
QA Contact: public-webapps-bugzilla
Depends on:
Reported: 2012-04-02 21:46 UTC by Kenneth Russell
Modified: 2013-06-17 01:14 UTC
Description Kenneth Russell 2012-04-02 21:46:20 UTC
Would it be possible to add the type 'unsigned byte' as a synonym for 'octet'? Doing so would improve symmetry with the other signed and unsigned types in the spec (short/unsigned short, etc.).
Comment 1 Anne 2012-04-03 07:30:52 UTC
Why do we have both byte and octet to begin with? This is confusing with many specifications that use byte as terminology and treat e.g. 0xFF as byte. I'd prefer renaming octet to byte and dropping the current byte.
Comment 2 Kenneth Russell 2012-04-03 19:21:10 UTC
From the standpoint of the typed array and WebGL specs it's essential to have both signed and unsigned byte concepts, so dropping support for signed bytes doesn't work for us.

Since the other integer types in Web IDL are signed, and use the "unsigned" modifier for the unsigned variants, it seems most symmetric to do the same for byte.
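
To illustrate why the typed array work needs both concepts, here is a minimal TypeScript sketch (not taken from any spec text; the variable names are made up for illustration). It stores one 8-bit pattern and reads it back both as an unsigned and as a signed value:

```typescript
// Illustrative sketch only: one byte of storage, read two ways.
const sharedBuffer = new ArrayBuffer(1);
new Uint8Array(sharedBuffer).fill(0xff); // set all eight bits

const sharedView = new DataView(sharedBuffer);
const readAsUnsigned: number = sharedView.getUint8(0); // range 0..255
const readAsSigned: number = sharedView.getInt8(0);    // range -128..127

// Both element types occupy exactly one byte; only the interpretation differs.
const sameStorageSize: boolean =
  Int8Array.BYTES_PER_ELEMENT === Uint8Array.BYTES_PER_ELEMENT;

console.log(readAsUnsigned, readAsSigned, sameStorageSize); // 255 -1 true
```

The same bits yield 255 through the unsigned view and -1 through the signed view, which is the distinction the two IDL type names have to carry.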
Comment 3 Ian 'Hixie' Hickson 2012-04-03 21:58:20 UTC
"byte" should (and indeed does, in almost all contexts) mean 0..255, IMHO.

I recommend "short int" and "unsigned short int" if you need symmetric names, with "byte" as a synonym for "unsigned short int". Or alternatively, "tinyint" and "unsigned tinyint".
Comment 4 Kenneth Russell 2012-04-03 22:11:41 UTC
short and unsigned short already exist in Web IDL and map to 16-bit integers (int16_t / uint16_t in C). Those typedefs are also needed for the typed array spec and likely elsewhere.

"tiny" / "tiny int" and unsigned variants could work. Not sure about the potential for namespace collisions with existing code.

Changing byte to be an unsigned type has downsides. It requires updating all existing HTML5 specs which refer to that type, and would imply introducing a "signed byte" type which is again asymmetric with how the other integer types in Web IDL behave.

I would still prefer "byte" and "unsigned byte", but would also prefer the pair of types "signed byte" and "byte" over introducing a new "tiny" concept.
Comment 5 Anne 2012-04-04 07:34:33 UTC
How many specifications are we talking about here? I cannot find usage of this within HTML itself, for instance. We could also name the type "short short": just as we have "long long" for longer than long, "short short" could be shorter than short.
Comment 6 Marcos Caceres 2012-04-04 19:20:48 UTC
I still think if we go down this route we should just be allowed to define custom ranges (e.g., percent [0-100], angle [0-360], richter[0-10.0], etc.)
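
As a rough idea of what such custom ranges might look like at the binding level, here is a hypothetical TypeScript sketch. Nothing in it is Web IDL syntax or a proposed API: `makeRangeCheck` and the names `checkPercent` and `checkAngle` are invented for illustration, and rejecting out-of-range values with a RangeError is an assumption (clamping would be another option).

```typescript
// Hypothetical helper: builds a validator for a named, closed numeric range.
function makeRangeCheck(
  name: string,
  min: number,
  max: number
): (value: number) => number {
  return (value: number): number => {
    // NaN fails every comparison, so it has to be rejected explicitly.
    if (Number.isNaN(value) || value < min || value > max) {
      throw new RangeError(`${name} must be within [${min}, ${max}], got ${value}`);
    }
    return value;
  };
}

// Two of the ranges mentioned above, expressed with the helper.
const checkPercent = makeRangeCheck("percent", 0, 100);
const checkAngle = makeRangeCheck("angle", 0, 360);

console.log(checkPercent(42), checkAngle(360)); // 42 360
```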
Comment 7 John Thomas 2012-04-05 00:02:03 UTC
Nice context for byte vs octet - http://www.tcpipguide.com/free/t_BinaryInformationandRepresentationBitsBytesNibbles-3.htm

Careful with "short int" and "unsigned short int": I believe octet generally maps to C/C++ signed/unsigned char, not signed/unsigned int (though with basic data type lengths you can never be sure).


Ultimately, even though you might confuse byte with short with octet, you will rarely confuse octet with something that is not 8 bits long.

Also, with octet, it's not about range, it's about storage size, although if an octet is unsigned (which I think it has to be), it should go from 0 to 255.
Comment 8 Cameron McCormack 2012-04-06 01:47:18 UTC
Bits, bytes, words and longs, how many were going to St Ives?

octet came from OMG IDL, where it already meant an unsigned 8-bit integer type. When we needed to introduce a signed type, because OMG IDL didn't have one, I used byte because Java's byte type is signed, rather than introduce "signed octet".

I agree the current names are a bit sucky, though.  But I'm not really in favour of introducing synonyms.  Seeing different names for the same concept in different specs, depending on the style preferences of the author, will be confusing.

I think there are sufficiently few specs using byte that updating them if we decide to change the names will be easy enough.  (It might even just be WebGL and the Typed Arrays spec.)

If people think the status quo is unacceptable, then I think my next preference would be to have "byte" and "signed byte", or "octet" and "signed octet".  The former looks and sounds nicer, but the latter more strongly feels like something unsigned to me.  (And true enough maybe technically a byte doesn't imply 8 bits necessarily, but I don't think it's a real enough concern.)
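
For reference, the practical difference between the two existing types can be sketched as follows. This is a simplified TypeScript illustration of wrap-around conversion to an 8-bit integer, assuming the default behaviour with no clamping or range enforcement; the function names `wrapTo8Bits`, `toOctetSketch` and `toByteSketch` are invented here and are not spec terms.

```typescript
// Simplified sketch: wrap a number into an 8-bit integer, signed or unsigned.
function wrapTo8Bits(value: number, signed: boolean): number {
  // NaN and the infinities are treated as 0 in this sketch.
  if (!Number.isFinite(value)) return 0;
  const truncated = Math.trunc(value);             // drop any fractional part
  const wrapped = ((truncated % 256) + 256) % 256; // reduce into 0..255
  // For the signed type, the upper half of the range maps to negatives.
  return signed && wrapped >= 128 ? wrapped - 256 : wrapped;
}

// "octet" as it stands today: unsigned, 0..255.
const toOctetSketch = (value: number): number => wrapTo8Bits(value, false);
// "byte" as it stands today: signed, -128..127.
const toByteSketch = (value: number): number => wrapTo8Bits(value, true);

console.log(toOctetSketch(-1), toByteSketch(255), toByteSketch(128)); // 255 -1 -128
```

Under either renaming proposal only the labels would change; the two conversions themselves stay as they are.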
Comment 9 Anne 2012-04-06 06:40:38 UTC
byte and signed byte works. E.g. http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html and HTML and many other specifications always talk about bytes in the unsigned sense. I think it would make sense if we did that consistently throughout the platform standards.
Comment 10 Cameron McCormack 2013-06-17 01:14:08 UTC
I don't think it's worth changing at this point.