- 1 JSON RDF Task Force
- 1.1 Inputs
- 1.2 Deliverables
- 1.3 Questions to Contemplate
- 1.4 RDF in JSON Use Cases
- 1.5 RDF in JSON Design Requirements
- 1.5.1 There should be two serialization formats
- 1.5.2 A primary goal SHOULD be to build a human-friendly version of the serialization for JSON developers
- 1.5.3 A primary goal SHOULD be to build a machine-optimized version of the serialization
- 1.5.4 The serialization SHOULD be able to transform most JSON in use today into RDF
- 1.5.5 Developers do not need to be familiar at all with RDF to start using the serialization
- 1.5.6 The serialization MAY include features not in RDF
- 1.5.7 The serialization MUST be 100% compatible with the JSON spec
- 1.5.8 It is a requirement that all RDF concepts MUST be expressible in the serialization
- 1.5.9 There should be a migration story for going from existing JSON in the wild to this new format
- 1.5.10 Memory usage and CPU usage while processing SHOULD be a primary consideration
- 1.5.11 The design target is small snippets of RDF Data
- 1.5.12 Design target: graphs or resources
- 1.5.13 The serialization MUST support disjoint/unconnected graphs
- 1.5.14 The serialization MUST provide a normalization algorithm
- 1.5.15 The serialization SHOULD enable digital signatures
- 1.5.16 The serialization SHOULD support advanced graph concepts
- 1.5.17 The serialization MUST support automatic typing
- 1.5.18 The serialization SHOULD support type coercion
- 1.5.19 The serialization SHOULD rely on microsyntaxes instead of nested structures
- 1.5.20 The serialization SHOULD provide an API
- 1.5.21 There SHOULD be one and only one way to serialize a given triple
- 1.6 Participants
JSON RDF Task Force
The JSON RDF Task Force is primarily responsible for creating a JSON serialization of RDF.
- RDF JSON, by Talis.
- JSON-LD, by Manu Sporny.
- JRON by Sandro Hawke.
- JSON serialization in the Linked Data API.
- SPARQL Query Results in JSON by DAWG.
- JSN3 by Nathan.
- Flat triples approach to RDF graphs in JSON by Dominik Tomaszuk.
- Ideas and issues from the community, taken from the RDF Core Work Items that build on RDF/NextStepWorkshop, are reproduced below.
- JTriples by Michael Hausenblas
Materials from the RDF Next Step Workshop
- Multiple JSON formats and implementations (some interoperable) already exist showing interest in this work
- Current JSON formats are not aligned: they take different approaches, trading off being friendly to JSON users against being familiar to existing RDF users.
- Needs some R&D and alignment.
- Risk that the result would be some standard that would not be adopted if it was not 'web author' friendly.
- JSON Serialization of RDF
Questions to Contemplate
- What are the use cases for the JSON serialization?
- Are we to create a lightweight JSON based RDF interchange format optimized for machines and speed, or an easy to work with JSON view of RDF optimized for humans (developers)?
- Is it necessary for developers to know RDF in order to use the simplest form of the RDF-in-JSON serialization?
- Should we attempt to support more than just RDF? Key-value pairs as well? Literals as subjects?
- Must all major RDF concepts be expressible via the RDF in JSON syntax?
- Should we go more for human-readability, or terse/compact/machine-friendly formats? What is the correct balance?
- Should there be a migration story for the JSON that is already used heavily on the Web? For example, in REST-based services?
- Should processing be a single-pass or multi-pass process? Should we support SAX-like streaming?
- Should there be support for disjoint graphs?
- Should we consider how the structure may be digitally signed?
- How should normalization occur?
- Should graph literals be supported?
- Should named graphs be supported?
- Should automatic typing be supported?
- Should type coercion be supported?
- Should there be an API defined in order to easily map RDF-in-JSON to/from language-native formats?
RDF in JSON Use Cases
RDF REST Web Services
Frank wants to be able to easily post and get RDF data RESTfully via Web Services. He wants to make sure that the data that is exchanged looks very much like the JSON data that is passed to and from popular services like Twitter's API. He wants to utilize the current JSON-based tools and workflows that he uses for all of his other data on the Web, but add semantics to that data in a way that is easy to explain to his fellow developers.
Expose a service that internally uses RDF in a JSON-friendly way
Stacy operates several Web Services. She designed the data that is sent and received by her Web Services in a way that maps very easily to RDF. She wants to be able to take the data that she is already publishing and transform it into RDF for internal use. She wants to be able to do this without impacting the developers that are currently using her system.
She also wants to be able to give the developers that care about RDF a data model that maps to RDF well. She would like to support both regular JSON developers and semantic web JSON developers at the same time via her JSON-based Web Services API.
Digital Signatures on Graphs
Graeme would like to publish assets for sale on his website via a JSON-based Web Services API. He would like this data to be cached on third party sites without the pricing information being changed or forged. He accomplishes this by digitally signing the graph of information that he publishes such that search engines and other caching mechanisms can relay the information without needing to directly access his site. By cryptographically signing the graph, he is also ensuring that information about the asset, including pricing information, cannot be changed or forged to different values.
Universal Payment Standard for the Web
The PaySwarm Web platform is an open web standard that enables Web browsers and Web devices to perform Universal Web Payment. The nascent standard is using a form of RDF in JSON extensively in order to support distributed listing of assets, description of licenses and digital contracts, and digital signatures on graphs of RDF information. Information is published via HTML+RDFa and then used in JSON-form when transmitted to and from PaySwarm-aware Web Services.
RDF in JSON Design Requirements
There should be two serialization formats
There should be a machine-friendly serialization format and there should be a human-friendly serialization format.
- -1 Manu Sporny, given the limited time for this working group, I think we should focus on the human-friendly serialization format. RDF already has a number of machine-friendly serialization formats.
- +1 Andy. A simple "s", "p", "o" format is not the same amount of work as a human-friendly form. See SPARQL JSON result format
- 0 Lee. I'd worry about the WG's available time and resources.
- +1 Nathan if possible.
- -0 Matteo Brunati not enough time maybe
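To make the difference between the two styles concrete, here is a minimal Python sketch. It is not taken from any of the input proposals: the key names "s", "p", "o" and "subject", the example IRIs, and the helper to_resource_view are all assumptions made for illustration. It shows a flat machine-friendly triple list and one possible way to regroup it into a per-subject object of the kind a JSON developer might expect.

```python
import json

# Machine-friendly form: a flat list of triples, one object per triple.
# The key names "s", "p", "o" are illustrative, not from any specification.
flat_doc = json.loads("""
[
  {"s": "http://example.org/alice", "p": "http://xmlns.com/foaf/0.1/name", "o": "Alice"},
  {"s": "http://example.org/alice", "p": "http://xmlns.com/foaf/0.1/nick", "o": "ali"}
]
""")

def to_resource_view(triples):
    """Group flat triples by subject into one key-value object per subject."""
    resources = {}
    for t in triples:
        # "subject" is a hypothetical key holding the subject IRI.
        resource = resources.setdefault(t["s"], {"subject": t["s"]})
        # Collect values in a list because a predicate may repeat.
        resource.setdefault(t["p"], []).append(t["o"])
    return list(resources.values())

human_view = to_resource_view(flat_doc)
```

The flat form needs no design work beyond naming three keys, while the grouped form already raises questions (single values versus lists, how to name the subject key) that a human-friendly format would have to settle.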
A primary goal SHOULD be to build a human-friendly version of the serialization for JSON developers
- +1 Manu Sporny
- -1 Lee. Given the existing work in the RDFa group on an API, I'd rather see a simple, machine-friendly format that implementations can then make available via an API. I'm not convinced that a standard human-friendly JSON format is a big win.
- -0 Andy Different use cases lead to different design tradeoffs. (e.g. LDA is a tree; ideal for them, bad for different uses.)
- +1 Nathan but only if the product can be considered simple JSON objects (k/v objects with a subject set) and the caveat is recognized that by not requiring an RDF toolkit or understanding of properties, inference etc, the data isn't really RDF... it's RDF-able - else -1, waste of time.
- +1 Matteo Brunati +1 Nathan observations
A primary goal SHOULD be to build a machine-optimized version of the serialization
The serialization SHOULD be able to transform most JSON in use today into RDF
There should be a flexible mechanism, such as a "context", that is capable of mapping from JSON key-value pairs to RDF triples. This mechanism could be specified either in-band or out-of-band from the serialization. Having this feature could map much of the existing JSON in the wild into RDF.
- +1 Manu Sporny
- -1 Lee. Seems out-of-scope; do existing RDF-in-JSON solutions already have such mechanisms?
- -1 Andy The original data was not written to be used in this way.
- +1 Nathan Assuming we're still talking two serializations, then this would be very valuable, for twitter to be able to say here's our data, view it as simple objects or rdf graphs; although I'm unsure we can get there without a common vision across the water.
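As a rough illustration of the "context" idea, the following Python sketch maps plain JSON keys to predicate IRIs and emits triples. The context contents, the subject IRI, and the function json_to_triples are hypothetical; the sketch assumes the context is supplied out-of-band and simply skips keys it does not know.

```python
import json

# Hypothetical context: maps plain JSON keys to predicate IRIs.
CONTEXT = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
}

def json_to_triples(obj, subject, context):
    """Turn top-level key-value pairs into (subject, predicate, object) tuples.

    Keys missing from the context are skipped rather than guessed.
    """
    triples = []
    for key, value in obj.items():
        predicate = context.get(key)
        if predicate is None:
            continue
        # A JSON array is treated as several values of the same predicate.
        values = value if isinstance(value, list) else [value]
        for v in values:
            triples.append((subject, predicate, v))
    return triples

doc = json.loads(
    '{"name": "Frank", "homepage": "http://example.org/frank", "session": "abc123"}'
)
triples = json_to_triples(doc, "http://example.org/frank#me", CONTEXT)
```

The JSON document itself is untouched, which is the point of the requirement: existing consumers keep reading it as ordinary JSON, while RDF-aware consumers apply the context.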
Developers do not need to be familiar at all with RDF to start using the serialization
Understanding the semantic web and the concepts of RDF (triples, graphs, etc.) should not be required in order to use the format. That means that the format may have a very simple, stripped down version for beginners and a more advanced set of features for semantic web enthusiasts.
- +1 Manu Sporny
- +1 Nathan only if two serializations, and as per previous comments.
- -1 Richard Cyganiak I think I disagree. If you don't want to expose developers to RDF at all, then why not just use vanilla JSON? Also I don't understand how the beginner/advanced thing should work. A server will have to generate the one or the other, so it's not like client-side developers get to choose which version they want to be exposed to.
- -1 Matteo Brunati I think a minimal semweb context is necessary: thinking on SIMILE Exhibit framework. It's not simple to use without a prior knowledge of the model.
The serialization MAY include features not in RDF
There are certain features, such as generic key-value pairs in JSON that do not map well to RDF. They would map well if RDF had a concept of plain literals in the subject or predicate position. The serialization could include these concepts but may specify that the values may not be serialized to all RDF serialization formats (such as RDF/XML, TURTLE or RDFa).
- +1 Manu Sporny
- -1 Andy creates an incompatible sub-community of applications.
- +1 Nathan useful for allowing "junk" data like debugging info and session tokens, again only if two serializations.
- -1 Richard Cyganiak as per Andy. Generic key-value pairs can be translated to <> <#key> "value" or somesuch.
- -1 Matteo Brunati as for Andy. making a default rule to the generic key-value stuff
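The translation Richard suggests can be sketched in a few lines of Python. This is only one reading of "<> <#key> \"value\"": it assumes the document IRI serves as the subject, that each key becomes a fragment IRI on that document, and that every value is written as a plain string literal. The function name and example IRI are hypothetical, and keys are not IRI-escaped here.

```python
import json

def key_values_to_triples(obj, base):
    """Map each key to <base#key>, using the document itself (base) as subject."""
    lines = []
    for key, value in obj.items():
        # json.dumps yields a quoted, escaped string; used here as an
        # approximation of literal escaping, not a full N-Triples writer.
        literal = json.dumps(str(value))
        lines.append(f"<{base}> <{base}#{key}> {literal} .")
    return lines

lines = key_values_to_triples(
    {"debug": "on", "token": "abc123"}, "http://example.org/doc"
)
```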
The serialization MUST be 100% compatible with the JSON spec
Additional features such as comments or short-hand notation to support datatypes could be supported in the serialization if we extended the JSON format. This would mean that the serialization would be incompatible with vanilla JSON readers and writers. While this may make serialization nicer, we should not make any additions/modifications to the JSON format to ensure maximum compatibility with pre-existing processors.
It is a requirement that all RDF concepts MUST be expressible in the serialization
There are concepts like RDF datatypes and g-snaps/graph literals that could be omitted from the serialization in order to reduce learning and implementation complexity.
- -1 Manu Sporny, Good design is a balancing act - we should only include what will help the most number of people.
- +1 Lee. I'd hesitate to say "all", but in general, a JSON RDF serialization would not be useful to us unless it was as much a 1st-class serialization of the RDF model as turtle, RDF/XML, etc.
- +1 Andy for the machine-friendly form to work with non-JSON apps and systems.
- -1 Andy for the human-friendly form but the features dropped will vary from usage to usage.
- +1 Nathan for machine (rdf in json)
- -1 Nathan for human (rdf-able json objects)
There should be a migration story for going from existing JSON in the wild to this new format
The serialization task force should ensure that there is a subset of the serialization that is useful to beginners that use pure JSON, then show how developers could sprinkle in a little RDF into their JSON, then show how developers can fully migrate to the new serialization format. The transition to the serialization format will probably take multiple years. The transition should be as smooth and organic as possible. We should also understand that many may not need to transition to RDF - JSON may work just fine for their application. We should not assume that people will go straight from regular JSON to the new serialization format.
Memory usage and CPU usage while processing SHOULD be a primary consideration
Memory and CPU usage for processing JSON is low. We should ensure that processing the serialization format is only slightly more complex than processing regular JSON.
- +0 Manu Sporny, we want to be cognizant of resource usage but I don't think this should be a primary driver for design decisions for the language.
- -1 Lee. Seems like an implementation detail to me.
- -1 Andy (NB: JSON structures are read entirely into memory before the application gets to see them.)
- +0.5 Nathan there is a balance between memory and processing to be struck, ntriples = more byte, turtle = more processing, same considerations for JSON.
The design target is small snippets of RDF Data
"Small" might mean fewer than 1 million triples, not 10.
- +1 Andy
- 0 Nathan two different considerations for machine or human, I'd say under 10k for human, over and beyond for machine
Design target: graphs or resources
A human friendly JSON format can be designed more towards graphs (multiple subjects) or more targeted on just describing one resource (subject). This is not to exclude one possibility over the other - this is to decide the focus.
- graphs Andy
- machine: graphs, human: resource Nathan
- graphs Manu Sporny, but I don't think we'll need to choose between the two if we're smart about it. For instance, JSON-LD allows expressing graphs just as easily as expressing resources.
The serialization MUST support disjoint/unconnected graphs
All current RDF serialization formats allow you to express two graphs that are not necessarily connected to one another. The new serialization format should allow the same mechanism. This is also important because normalization is difficult to achieve in a general way without also supporting disjoint graphs in the serialization. JSON-LD disjoint graphs example.
- +1 Manu Sporny
- +1 Andy One graph with two+ disjoint components per serialization
- +0 Andy Multiple graphs per serialization. No more than follow work in other TFs.
- +1 Nathan as per andy's comments
The serialization MUST provide a normalization algorithm
Normalization, also known as canonicalization, is typically used when determining whether two sub-graphs that are expressed in different ways are identical. It is also very useful when hashing sub-graphs for checksumming or digital signature purposes. JSON-LD normalization example.
- +1 Manu Sporny, I think we need normalization because we need to have a good digital signatures story
- ? Andy. Unclear - are we signing the graph or the serialization? Is a Turtle-signed graph the same graph? Would it include IRI normalization?
- +0 Nathan
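A deliberately simplified Python sketch of the idea follows. It assumes the graph contains only IRIs and literals (no blank nodes, which are the hard part of any real normalization algorithm) and that triples are represented as three-element lists; the functions normalize and graph_hash and the "ex:" names are hypothetical. It does not perform IRI normalization, one of the open points Andy raises.

```python
import hashlib
import json

def normalize(triples):
    """Return a canonical string: one triple per line, sorted, duplicates removed.

    Assumes IRIs and literals only; blank nodes would need relabeling first.
    """
    unique = sorted(set(tuple(t) for t in triples))
    return "\n".join(json.dumps(list(t), ensure_ascii=False) for t in unique)

def graph_hash(triples):
    """Hash the canonical form so that equal graphs produce equal digests."""
    return hashlib.sha256(normalize(triples).encode("utf-8")).hexdigest()

# The same two triples, written in a different order.
a = [["ex:asset", "ex:price", "10.00"], ["ex:asset", "ex:title", "Song"]]
b = [["ex:asset", "ex:title", "Song"], ["ex:asset", "ex:price", "10.00"]]
```

Here graph_hash(a) and graph_hash(b) are equal even though the two documents differ byte for byte, which is what makes the canonical form usable for comparison and checksumming.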
The serialization SHOULD enable digital signatures
Digital Signatures have a number of useful purposes. When combined with g-snaps/graph literals they provide a very easy way of establishing cryptographically verifiable provenance. These features are used heavily in electronic commerce. JSON-LD digital signature example.
- +1 Manu Sporny
- +0 Nathan
- 0 Richard Cyganiak Nice to have but won't make or break the format.
- +0 Matteo Brunati
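The following Python sketch illustrates signing a graph rather than a particular byte sequence. It is a stand-in, not a proposal: it uses an HMAC with a shared demo key from the standard library in place of a real public-key signature, it canonicalizes simply by sorting triples, and all names (SECRET, sign, verify, the "ex:" terms) are hypothetical.

```python
import hashlib
import hmac
import json

SECRET = b"demo-key"  # hypothetical shared key; a real system would use key pairs

def canonical_bytes(triples):
    # Sort so that triple order in the document does not affect the signature.
    return json.dumps(sorted(triples)).encode("utf-8")

def sign(triples):
    """Compute an HMAC-SHA256 tag over the canonical form of the graph."""
    return hmac.new(SECRET, canonical_bytes(triples), hashlib.sha256).hexdigest()

def verify(triples, signature):
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(triples), signature)

offer = [["ex:asset", "ex:price", "10.00"], ["ex:asset", "ex:title", "Song"]]
signature = sign(offer)

# A cached copy whose price has been altered should fail verification.
tampered = [["ex:asset", "ex:price", "0.01"], ["ex:asset", "ex:title", "Song"]]
```

A cache can reorder or re-serialize the offer without invalidating the signature, but changing the price makes verify return False.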
The serialization SHOULD support advanced graph concepts
The serialization format should support advanced graph concepts such as g-box, g-snap and g-text such that you can make statements about snapshots of graphs. Annotating graphs with metadata such as graph retrieval time, digital signatures on the contents of the graph, and other metadata associated with graphs are an important feature for higher-level concepts like provenance. Sandro's explanation of advanced graph concepts.
- +1 Manu Sporny
- -1 Richard Cyganiak Has security implications for RDF crawlers; requires larger API surface; SPARQL only returns single graphs anyways; use case is unclear
- -1 Andy Not unless the format is following standard work done in other TFs.
- +0.5 Nathan follow other TFs
The serialization MUST support automatic typing
Being able to transform a JSON document into a native object is one of the key benefits of using JSON over other serialization formats. Automatic typing of numbers and boolean values into language-native datatypes removes an extra step that developers must perform without this feature. For example, one could easily transform a serialized number that is an xsd:integer into a language-native integer. JSON-LD automatic typing example.
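As a small Python sketch of this behaviour: a standard JSON parser already yields native numbers and booleans, and a serialization could pair those native types with XSD datatypes. The mapping below (in particular float to xsd:double) and the function xsd_datatype are assumptions for illustration, not something the task force has decided.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"

def xsd_datatype(value):
    """Pick an XSD datatype IRI for a native value produced by json.loads."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return XSD + "boolean"
    if isinstance(value, int):
        return XSD + "integer"
    if isinstance(value, float):
        return XSD + "double"
    return None  # strings and other values are left untyped in this sketch

doc = json.loads('{"age": 42, "active": true, "score": 9.5, "name": "Stacy"}')
types = {key: xsd_datatype(value) for key, value in doc.items()}
```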
The serialization SHOULD support type coercion
While not immediately obvious, type coercion allows one to map regular JSON into RDF in a way that may add datatype decorators to object literals. In other words, it provides for a way to get Typed Literals from regular JSON data. JSON-LD type coercion example.
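A minimal Python sketch of type coercion, under the assumption that coercion rules are declared per key outside the data itself. The COERCE map, the coerce function, and the example keys are hypothetical.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical coercion map: JSON key -> datatype IRI to attach to its values.
COERCE = {"created": XSD + "dateTime", "price": XSD + "decimal"}

def coerce(obj, rules):
    """Return (key, lexical form, datatype) tuples; datatype is None if no rule applies."""
    return [(key, str(value), rules.get(key)) for key, value in obj.items()]

typed = coerce(
    {"created": "2011-03-01T12:00:00Z", "price": "10.00", "title": "Song"},
    COERCE,
)
```

The input is ordinary JSON with plain string values; only the coercion rules know that "created" is a date and "price" a decimal, so the data needs no datatype decoration of its own.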
The serialization SHOULD rely on microsyntaxes instead of nested structures
There are two common approaches to expressing RDF in JSON. One of them is to use nested structures to express language and type information for literals. The other approach is to use shallow structures with microsyntaxes mirroring TURTLE to express language and type information for literals.
- +1 Manu Sporny
- -1 Richard Cyganiak It's ugly as hell and makes the language unusable without an API
- -1 Nathan
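The two approaches can be compared with a short Python sketch. The nested key names "value", "lang" and "type" are illustrative rather than taken from a specific proposal, and the regular expression handles only simple Turtle-like literals (no escaped quotes inside the value). It also shows why the microsyntax draws objections: a consumer has to parse the string before it can use the value.

```python
import re

# Nested form: language carried as a separate key (key names are illustrative).
nested = {"value": "chat", "lang": "fr"}

# Shallow form: the same information packed into one Turtle-like string.
shallow = '"chat"@fr'

# Quoted value, optionally followed by @lang or ^^datatype.
LITERAL = re.compile(
    r'^"(?P<value>.*)"(?:@(?P<lang>[A-Za-z0-9-]+)|\^\^(?P<type>.+))?$',
    re.S,
)

def parse_literal(text):
    """Split a Turtle-like literal string into value, lang and type parts."""
    m = LITERAL.match(text)
    if m is None:
        raise ValueError(f"not a literal: {text!r}")
    # Drop the parts that were not present in the input.
    return {k: v for k, v in m.groupdict().items() if v is not None}
```

With the nested form, nested["value"] is available directly; with the shallow form, the same value is only reachable through something like parse_literal(shallow).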
The serialization SHOULD provide an API
An API would allow developers to transform incoming documents into a format that is easier for them to work with. In other words, it would allow them to drop all type information if it wasn't useful to them, or remove any micro-syntaxes that would get in the way of basic usage of the data. Keep in mind that even JSON has an API: JSON.parse(). JSON-LD API example.
- +1 Manu Sporny
- -1 Nathan the machine one will have the RDF API, the human one is pointless if it needs an API.
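One thing such an API might offer is a call that strips type and language information from an incoming document. The Python sketch below assumes a hypothetical nested literal layout with the keys "value", "type" and "lang"; the function simplify is not part of any existing API.

```python
import json

def simplify(node):
    """Recursively replace {"value": ..., "type"/"lang": ...} wrappers with the bare value."""
    if isinstance(node, dict):
        # Treat a dict as a literal wrapper only if it has no other keys.
        if "value" in node and set(node) <= {"value", "type", "lang"}:
            return node["value"]
        return {k: simplify(v) for k, v in node.items()}
    if isinstance(node, list):
        return [simplify(item) for item in node]
    return node

incoming = json.loads(
    '{"title": {"value": "Song", "lang": "en"},'
    ' "price": {"value": "10.00", "type": "xsd:decimal"},'
    ' "tags": ["a", "b"]}'
)
plain = simplify(incoming)
```

After the call, plain is an ordinary key-value object that can be used without any knowledge of datatypes or language tags.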
There SHOULD be one and only one way to serialize a given triple
The more different ways there are to express the same triple or graph, the harder it gets to use the host language's native toolbox (that is, pure JS expressions) to process data. At some point, using the host language becomes impossible without using a parser library layered on top of the host language, negating the benefit of basing the language on JSON in the first place. (Note, this is about using different JSON structures to express the same triple; not about different triples expressing the same statement in RDF Semantics, like "foo" vs "foo"^^xsd:string).
- +1 Richard Cyganiak This is the lesson to be learnt from RDF/XML.
- +0 Manu Sporny, while I agree in principle I don't know how we'd enforce this in practice - that is, what's the difference between "foo" and "foo"^^xsd:string in JSON? Would you serialize the plain literal "foo" and the Typed Literal "foo"^^xsd:string in the same way in JSON? If the answer is yes, isn't the translation lossy?
- This one is inherent to the way the RDF model is defined. There's nothing that can be done about it in the syntax. The concern here was about using different JSON structures to express the same triple. I clarified the description.
- +1 Matteo Brunati as for Richard, RDF/XML is the lesson