JSON Syntax Options
From RDF Working Group Wiki
This page is being used by the RDF WG to harvest different approaches to enabling the key features of RDF, in JSON.
URI Properties
RDF uses URIs to name things, including properties. A key benefit of this is that it allows different data sources to all use properties defined in open vocabularies, thus enabling shared understanding of data.
JSON, on the other hand, is typically used for domain-specific, siloed information where properties are simple lexical terms (like "name") and what the property "means" is documented somewhere out of band, for instance in API documentation or in a JSON-Schema document.
The following is a collection of different approaches which enable the use of URI-identified properties in JSON.
Full URIs
{ "http://xmlns.com/foaf/0.1/name": "Bob" }
Benefits:
- Unambiguous and easy to process.
- When following your nose around the web, property equivalence can be checked directly against the URI that appears in the serialization.
Drawbacks:
- Increased bytesize over the wire.
- Can be verbose to use when using the returned (JSON.parsed) data without an API or tooling.
- Verbose to author.
Example usage (assuming the returned data has been JSON.parsed):
obj["http://xmlns.com/foaf/0.1/name"]
obj[ foaf('name') ] // when using a tabulator ns style approach in your code
obj[ resolve('foaf:name') ] // when using a function which allows the resolution of CURIEs as found in the RDF API
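The "ns style" lookup above can be sketched as follows. This is only an illustration: the `ns` helper is a hypothetical name, not taken from any particular library.

```javascript
// Hypothetical helper: returns a function that prepends a namespace URI
// to a local name, so foaf("name") yields the full property URI.
function ns(base) {
  return function (localName) {
    return base + localName;
  };
}

const foaf = ns("http://xmlns.com/foaf/0.1/");

// Data as it would look after JSON.parse of the Full URIs example.
const obj = JSON.parse('{ "http://xmlns.com/foaf/0.1/name": "Bob" }');

// Same lookup as obj["http://xmlns.com/foaf/0.1/name"], with less typing.
console.log(obj[foaf("name")]);
```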
CURIEs
Note: this example uses JSON-LD syntax for the prefix maps:
{
"#": { "foaf": "http://xmlns.com/foaf/0.1/" },
"foaf:name": "Bob"
}
To reconstruct the URI, one must split "foaf:name" on the colon, replace "foaf" with its mapping in the prefix map ("http://xmlns.com/foaf/0.1/"), and concatenate "name" to it, giving "http://xmlns.com/foaf/0.1/name".
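A minimal sketch of that reconstruction, assuming the prefix map lives under the "#" key as in the example. The function name `expandCurie` is hypothetical.

```javascript
// Expand a CURIE such as "foaf:name" using a prefix map.
// Keys without a colon, or with an unknown prefix, are returned unchanged.
function expandCurie(curie, prefixes) {
  const i = curie.indexOf(":"); // split on the first colon only
  if (i === -1) return curie;
  const prefix = curie.slice(0, i);
  const local = curie.slice(i + 1);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) return curie;
  return prefixes[prefix] + local;
}

const doc = JSON.parse(
  '{ "#": { "foaf": "http://xmlns.com/foaf/0.1/" }, "foaf:name": "Bob" }'
);

console.log(expandCurie("foaf:name", doc["#"]));
```

Because unknown prefixes are left alone, a key that is already a full URI (whose "prefix" would be "http") passes through untouched in this sketch.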
Separator Options:
- : colon (familiar, but can't use . notation in JSON.parsed output)
- _ underscore (unfamiliar, ambiguous when property also contains an underscore)
- $ dollar (unfamiliar, but can use . notation in JSON.parsed output)
Benefits:
- Reduced bytesize over the wire
- Familiar to traditional RDF users
- Easier to author
Drawbacks:
- Requires tooling to normalize CURIEs prior to using the data when following your nose around the web.
- Requires CURIE resolution to do property comparison (equivalence must be between URIs not CURIEs)
- Unreliable when following your nose around the web (the same URI could be shortened to "ns0:name" or "f:name")
- Unfamiliar to traditional JSON users
- Verbose to use when using the returned (JSON.parsed) data without an API or tooling.
Example usage (assuming the returned data has been JSON.parsed):
obj["foaf:name"]; // but ONLY when you are familiar with the data and NOT when following your nose
TERMs (no colon)
Note: this example uses JSON-LD syntax for the prefix maps:
{
"#": { "name": "http://xmlns.com/foaf/0.1/name" },
"name": "Bob"
}
To reconstruct the URI, one must replace "name" with its value in the map ("http://xmlns.com/foaf/0.1/name").
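The normalization step this implies can be sketched as below, assuming the term map sits under the "#" key as in the example. `expandTerms` is a hypothetical name.

```javascript
// Rewrite every key of a parsed document to its full URI using the
// term map under "#". Keys that are not in the map are kept as they are.
function expandTerms(doc) {
  const map = doc["#"] || {};
  const out = {};
  for (const key of Object.keys(doc)) {
    if (key === "#") continue; // the map itself is not data
    const known = Object.prototype.hasOwnProperty.call(map, key);
    out[known ? map[key] : key] = doc[key];
  }
  return out;
}

const doc = JSON.parse(
  '{ "#": { "name": "http://xmlns.com/foaf/0.1/name" }, "name": "Bob" }'
);
const expanded = expandTerms(doc);

console.log(expanded["http://xmlns.com/foaf/0.1/name"]);
```

Since each key is matched whole rather than split, the same sketch would also handle terms that happen to contain a colon.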
Benefits:
- Reduced bytesize over the wire
- Familiar to traditional JSON users
- Easy to author
- Easy to use when using the returned (JSON.parsed) data without an API or tooling.
Drawbacks:
- Requires tooling to normalize TERMs prior to using the data when following your nose around the web.
- Requires TERM resolution to do property comparison (equivalence must be between URIs not TERMs)
- Unreliable when following your nose around the web (the same URI could be shortened to "foo" or "bar")
- Unfamiliar to traditional RDF users
Example usage (assuming the returned data has been JSON.parsed):
obj.name; // but ONLY when you are familiar with the data and NOT when following your nose
TERMs (with colon allowed)
Note: this example uses JSON-LD syntax for the prefix maps:
{
"#": { "name": "http://xmlns.com/foaf/0.1/name", "rdfs:label": "http://www.w3.org/2000/01/rdf-schema#label" },
"name": "Bob",
"rdfs:label": "Bob"
}
To reconstruct the URI, one must replace the term ("name", "rdfs:label") with its value in the map ("http://xmlns.com/foaf/0.1/name", "http://www.w3.org/2000/01/rdf-schema#label").
Benefits:
- Reduced bytesize over the wire
- Familiar to traditional JSON users
- Familiar to traditional RDF users
- Easy to author
- non-colon names only: Easy to use when using the returned (JSON.parsed) data without an API or tooling.
Drawbacks:
- Requires tooling to normalize TERMs prior to using the data when following your nose around the web.
- Requires TERM resolution to do property comparison (equivalence must be between URIs not TERMs)
- Unreliable when following your nose around the web (the same URI could be shortened to "foo" or "bar")
- with colon names only: Verbose to use when using the returned (JSON.parsed) data without an API or tooling.
Example usage (assuming the returned data has been JSON.parsed):
obj.name; // non-colon - but ONLY when you are familiar with the data and NOT when following your nose
obj["rdfs:label"]; // with colon - but ONLY when you are familiar with the data and NOT when following your nose
TERMs + Single Vocab
Note: this example uses a made up syntax!
{
"#vocab": "http://example.org/my-vocab#",
"name": "Bob"
}
To reconstruct the URI, one must append "name" to the value of #vocab ("http://example.org/my-vocab#")
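As a rough sketch of that rule, using the made-up "#vocab" key from the example (`expandWithVocab` is a hypothetical name):

```javascript
// Prefix every key with the single vocabulary URI given under "#vocab".
function expandWithVocab(doc) {
  const vocab = doc["#vocab"];
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key === "#vocab") continue; // the vocab declaration is not data
    out[vocab + key] = value;
  }
  return out;
}

const doc = JSON.parse(
  '{ "#vocab": "http://example.org/my-vocab#", "name": "Bob" }'
);
const expanded = expandWithVocab(doc);

console.log(Object.keys(expanded)[0]);
```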
Note: This may look wonderful, but it comes with a one-vocab caveat: when publishers require terms from multiple vocabularies, they are likely to create "proxy" vocabularies that simply pull together and merge terms from many different vocabularies. That carries a processing and understanding cost which can't be taken lightly.
Benefits:
- Minimal bytesize over the wire
- Familiar to traditional JSON users
- Familiar to RDFa users
- Easy to author
- Unambiguous and easy to process.
- Easy to use when using the returned (JSON.parsed) data without an API or tooling.
- Encourages vocabulary merging and reuse.
- Potentially far easier to deploy, doesn't require publishers to implement/have a sem web stack.
Drawbacks:
- Requires understanding of equivalent property statements in custom vocabularies when following your nose around the web.
- Real world property equivalence is far more complicated.
Example usage (assuming the returned data has been JSON.parsed):
obj.name; // but ONLY when you are familiar with the data and NOT when following your nose
Option: External Maps
Three of the above options (CURIEs, TERMs no colon, TERMs with colon) require a prefix or term map to be included in order to turn shortened properties into full URIs.
These maps could be factored out and referenced externally. This option comes with its own set of benefits and drawbacks.
Benefits:
- Minimal data over the wire.
- When used with either of the TERMs options, allows bootstrapping of existing JSON data in the wild.
- Encourages vocabulary merging and reuse.
- Potentially far easier to deploy, doesn't require publishers to implement/have a sem web stack.
Drawbacks:
- Sometimes requires two GETs when following your nose.
- External map unavailability removes your ability to see the data as RDF.
- Some changes to external maps could change the meaning of the data.
Datatypes
RDF includes support for specifying the datatype of literals, commonly referred to as "Typed Literals". This allows any literal to be given a specific datatype, typically one of the xsd: types.
JSON has built-in support for a minimal set of datatypes, namely strings, numbers (which covers integers, doubles and decimals), booleans, arrays and objects.
Datatypes which are frequently used in RDF but are not in JSON include IRI and the various forms of date and time.
Note: Many other JSON related specifications have found a need to define support for IRIs and various forms of date/time, for example Activity Streams JSON.
Note: Objects and Arrays will typically have special meaning/usage for RDF - JSON, so will not be discussed further here.
Limited Expressibility
This approach would constrain the syntax to only being able to express those datatypes already existing in JSON, namely:
- String (Unicode)
- Number (Integer, Double, Decimal)
- Boolean (true, false)
- Null (open question: does RDF have a concept of null, or a datatype for it?)
Benefits:
- Requires no special processing of data
- Familiar to most JSON users
- Simple
Drawbacks:
- No way to use other common or custom datatypes
Limited Expressibility + IRIs and Date/Time
This approach would extend the native JSON datatypes to include support for IRI, Date, Time and DateTime:
- IRI
- Date
- DateTime
- Time
- String (Unicode)
- Number (Integer, Double, Decimal)
- Boolean (true, false)
- Null (open question: does RDF have a concept of null, or a datatype for it?)
note: the additional types would need to be quoted like strings in order to keep JSON compatibility, e.g. "http://example.org/" rather than the same without quotes.
Benefits:
- Potentially requires no special processing of data
- Familiar to most users
- Simple
- Enough to cover most common use cases.
Drawbacks:
- No way to use other common or custom datatypes
Property Range from Vocab
Either of the Limited Expressibility options could be augmented with type hinting from the range of the property being used.
Benefits:
- Potentially requires no special processing of data (when not following your nose)
- Familiar to most users
- Simple
- Enough to cover most common use cases.
- Allows expression of common or custom datatypes
Drawbacks:
- Potentially requires understanding of properties when following your nose & tooling to do so. (nathan: is this a drawback??)
Map the property to a datatype
Either of the Limited Expressibility options could be augmented with type hinting on the property. The hint could be included in the serialization, or in an external map as with the External Maps option for URIs.
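One way this could look in practice is sketched below. The map format, the property names "created" and "homepage", the "IRI" token, and the function `applyDatatypes` are all assumptions made for illustration; the page does not define any of them.

```javascript
// Hypothetical property-to-datatype map. It could be shipped inside the
// document or fetched from an external map.
const datatypes = {
  created: "xsd:dateTime",
  homepage: "IRI"
};

// Pair every value with the datatype hinted for its property, or null
// when the map says nothing (the value keeps its native JSON type).
function applyDatatypes(doc, datatypeMap) {
  const out = {};
  for (const [property, value] of Object.entries(doc)) {
    const hinted = Object.prototype.hasOwnProperty.call(datatypeMap, property);
    out[property] = { value: value, datatype: hinted ? datatypeMap[property] : null };
  }
  return out;
}

const data = JSON.parse(
  '{ "name": "Bob", "created": "2011-03-01T12:00:00Z", "homepage": "http://example.org/bob" }'
);
const typed = applyDatatypes(data, datatypes);

console.log(typed.created.datatype);
```

Consumers who do not care about datatypes can ignore the map and read `data.created` as a plain string, which matches the "no special processing" benefit listed for this option.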
Benefits:
- Potentially requires no special processing of data (when not following your nose)
- Familiar to most users
- Simple
- Enough to cover most common use cases.
- Allows expression of common or custom datatypes
Drawbacks:
- Potentially requires understanding of properties when following your nose & new tooling to do so.
Datatypes from JSON schema
As above, what we're doing could be merged with JSON Schema. In fact, we could fully externalize this and work with JSON Schema to create a single spec which covers most of the web's JSON needs as well as our own RDF needs, but that's perhaps too wild for this group and out of charter.
(nathan likes this idea)
In-String TypedLiterals
This approach involves including both the data and the datatype in a single quoted string, for example "FDE3^^xsd:base64Binary"
note: the exact format of the combined string would be up for discussion. We may want to use full IRIs for datatypes, may explicitly offer a set of predefined tokens mapped to IRIs (e.g. "^int"), or may have the datatype prefixed or postfixed; many different approaches are possible.
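The special processing this option requires might look like the following sketch. It assumes the postfixed "value^^datatype" form shown above; `parseTypedLiteral` is a hypothetical name.

```javascript
// Split "FDE3^^xsd:base64Binary" into its value and datatype parts.
// Strings without "^^" are treated as plain strings (datatype null).
function parseTypedLiteral(s) {
  const i = s.lastIndexOf("^^"); // the last "^^" is taken as the separator
  if (i === -1) return { value: s, datatype: null };
  return { value: s.slice(0, i), datatype: s.slice(i + 2) };
}

const literal = parseTypedLiteral("FDE3^^xsd:base64Binary");
console.log(literal.value, literal.datatype);
```

A plain string that legitimately ends in something like "^^foo" would be misread by this sketch, which is one reason the exact format would need careful definition.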
Benefits:
- Can express all common and custom datatypes
Drawbacks:
- Always requires special processing
- Unfamiliar to most typical JSON users
- Verbose
- What to do when you don't understand a datatype?
Paired Values - value/datatype
Using either the object or array syntax from JSON, we could specify typed literals like so:
{ "property": {
"_value": "FDE3",
"_datatype": "xsd:base64Binary"
}
}
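A sketch of how a consumer might read such values, assuming the "_value" and "_datatype" keys from the example and the "only some typed literals like this" option, so native JSON values are passed through. `readLiteral` is a hypothetical name.

```javascript
// Normalize a property value to { value, datatype }.
// Objects carrying "_value" are typed literals; anything else is native JSON.
function readLiteral(v) {
  const isPair = v !== null && typeof v === "object" && !Array.isArray(v) && "_value" in v;
  if (isPair) {
    return { value: v._value, datatype: v._datatype || null };
  }
  return { value: v, datatype: null };
}

const doc = JSON.parse(
  '{ "property": { "_value": "FDE3", "_datatype": "xsd:base64Binary" }, "age": 42 }'
);

console.log(readLiteral(doc.property).datatype);
console.log(readLiteral(doc.age).value);
```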
Options:
- All typed literals like this, including numbers.
- Only some typed literals like this.
Benefits:
- Can express all common and custom datatypes
Drawbacks:
- Always requires special processing
- Unfamiliar to most typical JSON users
- Verbose
Paired Values - datatype arcs
Using either the object or array syntax from JSON, we could specify typed literals like so:
{ "property": { "xsd:base64Binary": "FDE3" } }
note: see JSN3 for more examples
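The sketch below reads the arc form into a flat list of literals. Treating an array under a datatype key as several values of that datatype is an assumption about what "allows repetition" means here; `readDatatypeArcs` is a hypothetical name.

```javascript
// Turn { "xsd:base64Binary": "FDE3" } into [{ value, datatype }].
// Assumption: an array under a datatype key holds several values of that type.
function readDatatypeArcs(arcs) {
  const literals = [];
  for (const [datatype, values] of Object.entries(arcs)) {
    const list = Array.isArray(values) ? values : [values];
    for (const value of list) {
      literals.push({ value: value, datatype: datatype });
    }
  }
  return literals;
}

const doc = JSON.parse('{ "property": { "xsd:base64Binary": "FDE3" } }');
const literals = readDatatypeArcs(doc.property);

console.log(literals.length, literals[0].datatype);
```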
Options:
- All typed literals like this, including numbers.
- Only some typed literals like this.
Benefits:
- Can express all common and custom datatypes
- Smaller bytesize on the wire (allows repetition)
Drawbacks:
- Always requires special processing
- Unfamiliar to most typical JSON users
Languages
RDF currently includes support for specifying the language of strings (for example English or Dutch) through Plain Literals. Support is often serialization-specific, with RDFa delegating to the lang/xml:lang attributes and Turtle taking the "Bob"@en approach.
JSON currently has no support for specifying the language of strings.
No Language
It's an option. JSON natively supports Unicode, so strings like "花澄" are perfectly acceptable, and JSON is used effectively throughout the web without requiring a language tag. Further, text often mixes several languages, in which case it is not clear which language tag to use. For example:
彭博社:2987名人大代表中70名最富的人资产总值为4931亿人民币,约751亿美元!The richest 70 of the 2,987 members have a combined wealth of 493.1 billion yuan ($75.1 billion)
Property Specifies Language
This option would involve language specific properties being created in vocabs, for example "rdfs:label-en" and "rdfs:label-ja".
Not saying much about this one as it's a huge change to RDF and quite possibly entirely impractical from almost every angle. But, it is an option.
Property Modifiers
This option involves adding a language hint to the property, as serialization sugar only, for example:
{ "label@en": "London" }
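The processing needed to use such data could be sketched as follows, assuming the "property@lang" key form from the example. Both function names are hypothetical.

```javascript
// Split a key such as "label@en" into property and language.
// Keys without "@" have language null.
function splitPropertyModifier(key) {
  const at = key.lastIndexOf("@");
  if (at <= 0) return { property: key, language: null };
  return { property: key.slice(0, at), language: key.slice(at + 1) };
}

// Group values by bare property name, keeping the language with each value.
function groupByProperty(doc) {
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    const parts = splitPropertyModifier(key);
    if (!out[parts.property]) out[parts.property] = [];
    out[parts.property].push({ value: value, language: parts.language });
  }
  return out;
}

const doc = JSON.parse('{ "label@en": "London", "label@fr": "Londres" }');
const grouped = groupByProperty(doc);

console.log(grouped.label.length);
```

The second key in the sample data ("label@fr") is added for illustration and shows the "one property value pair per language" cost noted in the drawbacks.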
Benefits:
- Can express languages
- Potentially lighter to process than "in-string language"
- Potentially smaller bytesize on the wire than either of the paired values options (and allows repetition)
Drawbacks:
- Always requires special processing to use the data
- Unfamiliar to most typical JSON users
- Can be verbose when working with data in many languages (requires a min of one property value pair per language)
In-String Language
This approach involves including both the data and the language in a single quoted string, for example "花澄@ja"
note: the exact format of the combined string would be up for discussion. We may want to use IRIs for languages, may explicitly offer a set of predefined tokens (e.g. "@en"), or may have the language prefixed ("ja@花澄") or postfixed ("花澄@ja"); many different approaches are possible.
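A minimal sketch of the postfixed form, assuming the suffix after the last "@" is only treated as a language when it looks like a simple language tag. The pattern used is a rough approximation chosen for this example, not a full language-tag grammar, and `parseLangString` is a hypothetical name.

```javascript
// Rough approximation of a language tag: letters, then optional
// hyphen-separated alphanumeric subtags (e.g. "ja", "en-GB").
const LANG_TAG = /^[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*$/;

// Split "花澄@ja" into value and language; leave other strings alone.
function parseLangString(s) {
  const at = s.lastIndexOf("@");
  if (at === -1) return { value: s, language: null };
  const tag = s.slice(at + 1);
  if (!LANG_TAG.test(tag)) return { value: s, language: null };
  return { value: s.slice(0, at), language: tag };
}

console.log(parseLangString("花澄@ja"));
console.log(parseLangString("bob@example.org"));
```

The second call shows why every string has to be inspected: an email address is left alone here only because "example.org" contains a dot, while something like "bob@home" would be misread as a language-tagged string.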
Benefits:
- Can express languages
Drawbacks:
- Always requires special processing (including tracking back over parsed data)
- Unfamiliar to most typical JSON users
Paired Values - value/language
Using either the object or array syntax from JSON, we could specify plain literals with languages as such:
{ "property": {
"_value": "花澄",
"_language": "ja"
}
}
Benefits:
- Can express languages
- Lighter to process than "in-string language"
Drawbacks:
- Always requires special processing to use the data
- Unfamiliar to most typical JSON users
- Verbose
Paired Values - language arcs
Using either the object or array syntax from JSON, we could specify plain literals with languages as such:
{ "property": {"@en": "London"} }
note: see JSN3 for more examples
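Reading a value in a preferred language from the arc form could be sketched like this, assuming keys of the form "@" plus a language tag as in the example. `valueForLanguage` is a hypothetical name and the "@fr" entry is added for illustration.

```javascript
// Look up the value for one language in { "@en": "London", ... }.
// Returns undefined when that language is not present.
function valueForLanguage(arcs, lang) {
  const key = "@" + lang;
  return Object.prototype.hasOwnProperty.call(arcs, key) ? arcs[key] : undefined;
}

const doc = JSON.parse('{ "property": { "@en": "London", "@fr": "Londres" } }');

console.log(valueForLanguage(doc.property, "en"));
console.log(valueForLanguage(doc.property, "de"));
```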
Benefits:
- Can express languages
- Lighter to process than "in-string language"
- Smaller bytesize on the wire than the other paired values option (allows repetition)
Drawbacks:
- Always requires special processing to use the data
- Unfamiliar to most typical JSON users
