TF-JSON/Semantics of JSON

From RDF Working Group Wiki
Revision as of 17:08, 25 March 2011 by Rcygania2 (Talk | contribs)


To define a representation of RDF in JSON, the TF-JSON might require an account of the semantics of JSON: What exactly does a JSON document serialize? This page states such a semantics, based entirely on RFC 4627. This sometimes requires reading between the lines of that document.

Syntax of JSON

The syntax is defined in RFC 4627.

Question: Are there any glitches there that the WG needs to keep in mind?

Semantics of JSON

A JSON text is a serialized object or array.

An object is an unordered collection of zero or more name/value pairs.

A name is a string. The names within an object SHOULD be unique.

An array is an ordered list of zero or more values.
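The difference between the two collection types can be seen in a small sketch. It uses Python's standard json module, picked here only as one convenient implementation, and relies on the fact that Python compares dicts without regard to order and lists with regard to order. The names "s" and "p" are arbitrary.

```python
import json

# Two texts whose objects hold the same pairs in a different order.
obj_a = json.loads('{"s": 1, "p": 2}')
obj_b = json.loads('{"p": 2, "s": 1}')

# Two texts whose arrays hold the same values in a different order.
arr_a = json.loads('[1, 2]')
arr_b = json.loads('[2, 1]')

# Dict comparison ignores pair order; list comparison does not.
objects_equal = (obj_a == obj_b)
arrays_equal = (arr_a == arr_b)
```

Under this reading, a representation of RDF in JSON cannot rely on the order of name/value pairs to carry information, but may rely on the order of array elements.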

A value is an object, array, number, or string, or one of the following three literal names:

  • false
  • null
  • true

A string is a sequence of zero or more Unicode characters.
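Since a string is a sequence of Unicode characters, an escaped and an unescaped spelling of the same character should denote the same string. The sketch below checks this with Python's standard json module (one implementation, used for illustration only). The G clef character U+1D11E is the surrogate-pair example given in RFC 4627.

```python
import json

# The JSON text on the first line contains a six-character \u escape;
# the text on the second line contains the character itself.
escaped = json.loads(r'"\u00e9"')
literal = json.loads('"é"')
same_string = (escaped == literal)

# A character outside the Basic Multilingual Plane, written in the
# JSON text as an escaped UTF-16 surrogate pair.
clef = json.loads(r'"\ud834\udd1e"')
clef_is_one_character = (len(clef) == 1 and clef == "\U0001D11E")
```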

A number is any decimal number (including fractions). Note that IEEE floating point “numbers” that cannot be represented as sequences of digits (such as Infinity and NaN) are not permitted.
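Some implementations are more lenient than the grammar here. As a sketch, Python's standard json module accepts and emits NaN and Infinity by default; the helper names strict_loads, strict_dumps and reject_constant below are hypothetical wrappers that switch this off using the module's documented parse_constant and allow_nan options.

```python
import json

def reject_constant(name):
    # json.loads calls this hook for the tokens NaN, Infinity and -Infinity.
    raise ValueError("not a JSON number: " + name)

def strict_loads(text):
    # Parse a JSON text, refusing the non-standard numeric tokens.
    return json.loads(text, parse_constant=reject_constant)

def strict_dumps(value):
    # allow_nan=False makes the encoder raise ValueError for NaN and
    # infinities instead of writing them out.
    return json.dumps(value, allow_nan=False)
```

With these wrappers, strict_loads('[NaN]') and strict_dumps([float("inf")]) both raise ValueError, while ordinary numbers pass through unchanged.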

Question: What about -0?
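As one data point for this question: the grammar permits -0 and -0.0 syntactically, and the sketch below shows what Python's standard json module does with them. Other implementations may well behave differently, so this is an observation about one parser rather than a statement about JSON itself.

```python
import json
import math

neg_int, neg_float = json.loads('[-0, -0.0]')

# "-0" has no fraction or exponent, so it becomes a Python int,
# and Python integers have no negative zero: the sign is lost.
int_is_plain_zero = (type(neg_int) is int and neg_int == 0)

# "-0.0" becomes a float, which does keep the sign of zero.
float_keeps_sign = (math.copysign(1.0, neg_float) == -1.0)

# Serializing the float again writes the sign back out.
round_trip = json.dumps([neg_float])
```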

Implementation considerations

According to RFC 4627:

  • An implementation may set limits on the size of texts that it accepts.
  • An implementation may set limits on the maximum depth of nesting.
  • An implementation may set limits on the range of numbers.
  • An implementation may set limits on the length and character contents of strings.
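To make the limit on the range of numbers concrete, the following sketch feeds a few syntactically valid numbers to Python's standard json module. It illustrates one implementation's limits only; the specific literals are arbitrary examples.

```python
import json

# A number beyond the range of an IEEE double overflows to infinity.
too_large = json.loads('[1e400]')[0]
overflowed = (too_large == float("inf"))

# Integers without fraction or exponent are kept exactly,
# even beyond 64 bits.
big_int = json.loads('[12345678901234567890123]')[0]
int_exact = (big_int == 12345678901234567890123)

# Fractions are rounded to the nearest double, so digits beyond
# double precision are lost.
fraction = json.loads('[0.1000000000000000055511151231257827]')[0]
fraction_rounded = (fraction == 0.1)
```

A consumer that needs exact decimal values would have to configure or replace the number parsing, which suggests that an RDF representation should not depend on numeric precision beyond what common parsers preserve.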

Question: What are the reasonable limits here? Can I cut strings down to the empty string?

Practical considerations

Duplicate names: Almost no popular implementation handles objects with duplicate names, so duplicate names SHOULD (very strongly) be avoided.

Question: Does this mean they raise a syntax error, or (silently) throw away some information?
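For one implementation, Python's standard json module, this can be checked directly. The sketch below shows the default behaviour and then uses the module's object_pairs_hook option to surface or reject duplicates; reject_duplicates is a hypothetical helper, not part of any library.

```python
import json

def reject_duplicates(pairs):
    # Build the object from its name/value pairs, failing on a repeat.
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError("duplicate name: " + name)
        result[name] = value
    return result

text = '{"a": 1, "a": 2}'

# Default behaviour: no error, the last pair silently wins.
default_result = json.loads(text)

# The hook receives every pair in document order, so nothing is lost.
all_pairs = json.loads(text, object_pairs_hook=list)

def strict_loads(source):
    return json.loads(source, object_pairs_hook=reject_duplicates)
```

Here default_result is {"a": 2}, all_pairs is [("a", 1), ("a", 2)], and strict_loads(text) raises ValueError. So in this parser the default is silent information loss, not a syntax error.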

Dot notation compatibility: If names match the Identifier production in Section 7.6 of ECMA-262, then they can be accessed using JavaScript's convenient “dot notation”. This has usability advantages for JavaScript developers. Glossing over a lot of details, the gist is that identifiers can start with a UnicodeLetter, “$” or “_”, and continue with these characters as well as UnicodeDigit, UnicodeCombiningMark, or UnicodeConnectorPunctuation.
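The gist of that rule can be approximated in a few lines. The sketch below is a simplification under stated assumptions: is_dot_accessible is a hypothetical helper, the Unicode general categories are those ECMA-262 (5th edition) lists for the productions named above, and reserved words, Unicode escape sequences and the zero-width joiner characters are ignored.

```python
import unicodedata

# UnicodeLetter: categories Lu, Ll, Lt, Lm, Lo, Nl.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# Adds UnicodeCombiningMark (Mn, Mc), UnicodeDigit (Nd)
# and connector punctuation (Pc).
PART_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}

def is_start(ch):
    return ch in ("$", "_") or unicodedata.category(ch) in START_CATEGORIES

def is_part(ch):
    return ch in ("$", "_") or unicodedata.category(ch) in PART_CATEGORIES

def is_dot_accessible(name):
    # Approximate test for "name could be written as obj.name".
    if not name:
        return False
    return is_start(name[0]) and all(is_part(ch) for ch in name[1:])
```

Names such as "label", "$ref" or "_id" pass this check, while "rdf:type", "2nd" and full IRIs such as "http://example.org/p" do not and would need bracket notation instead.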

Numbers and booleans: Popular implementations tend to treat non-fractional numbers as Integer, and fractional numbers as Double, in the programming language's type system. The true and false literal names are usually mapped to the boolean type.
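As an example of such a mapping, this sketch records the types that Python's standard json module assigns to a few values. Python's float is a double; other languages and libraries may choose differently.

```python
import json

values = json.loads('[1, 1.0, 1e2, true, false]')

# Name of the Python type chosen for each value, in array order.
type_names = [type(v).__name__ for v in values]
```

In this implementation type_names is ["int", "float", "float", "bool", "bool"]: a number written with a fraction or an exponent becomes a float even when its value is integral.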

Null values: Users may not be fully aware of the difference between a null-valued name/value pair, and an absent name/value pair. If possible, this distinction should not carry significant application semantics.
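The sketch below shows how easily the distinction is lost in practice, using Python's standard json module as one example. The name "label" is arbitrary.

```python
import json

with_null = json.loads('{"label": null}')
without = json.loads('{}')

# A common lookup idiom returns None in both cases.
lookup_with_null = with_null.get("label")
lookup_without = without.get("label")

# Only an explicit membership test tells the two apart.
present_with_null = "label" in with_null
present_without = "label" in without
```

Both lookups yield None, and only the membership tests differ (True and False respectively), so code written with the lookup idiom cannot see the difference.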

Question: What about this difference in the presence of another value for the name, particularly if it comes from using a prototype?