YAML-LD

Final Community Group Report

This version:
https://www.w3.org/community/reports/json-ld/CG-FINAL-yaml-ld-20231206/
Latest published version:
https://www.w3.org/community/reports/json-ld/CG-FINAL-yaml-ld-20231206/
Latest editor's draft:
https://json-ld.github.io/yaml-ld/
Test suite:
https://json-ld.github.io/yaml-ld/tests/
Editor:
JSON-LD Community
Feedback:
GitHub json-ld/yaml-ld (pull requests, new issue, open issues)
public-linked-json@w3.org with subject line [yaml-ld] … message topic … (archives)

Abstract

Objective
This document defines YAML-LD, a set of conventions built on top of YAML, which outlines how to serialize Linked Data as YAML based on JSON-LD syntax, semantics, and APIs. The emergence of YAML as a more concise format for representing information previously serialized as JSON, including Linked Data, has led to the development of YAML-LD.

Methods
This document defines constraints on YAML so that any YAML-LD document can be represented in JSON-LD. This is necessary because YAML is more expressive than JSON, in terms of both available data types and document structure. This document also registers the application/ld+yaml media type.

Results
This document provides a clear description of how to serialize Linked Data in YAML. It also describes the basic concepts and core requirements for implementing YAML-LD, including a comparison of JSON versus YAML, the supported YAML features, and encoding considerations.

Limitations
The YAML feature set is richer than that of JSON, and a number of YAML features are not supported in this specification. However, ground is laid for future development of a version of YAML-LD which will support those features — via the Extended YAML-LD Profile.

Conclusions
YAML-LD offers an efficient way to encode Linked Data in a variety of programming languages which can use YAML.

An introductory YAML-LD example is presented below.

Example 1: Basic YAML-LD document
"@context":
  - https://json-ld.org/contexts/dollar-convenience.jsonld
  - "@base": https://json-ld.github.io/yaml-ld/spec/
    rdfs: http://www.w3.org/2000/01/rdf-schema#
    schema: https://schema.org/
    license:
      "@type": "@id"

$id: https://json-ld.github.io/yaml-ld/spec/
rdfs:label: YAML-LD
license: https://spdx.org/licenses/W3C.html
schema:hasPart:
  - rdfs:label: Abstract
  - rdfs:label: Status of This Document
  - rdfs:label: Introduction

Status of This Document

This specification was published by the JSON for Linking Data Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Final Specification Agreement (FSA) other conditions apply. Learn more about W3C Community and Business Groups.

This document has been developed by the JSON-LD Community Group.

GitHub Issues are preferred for discussion of this specification. Alternatively, you can send comments to our mailing list. Please send them to public-linked-json@w3.org (subscribe, archives).

1. Introduction

[JSON-LD11] is a JSON-based format to serialize Linked Data. In recent years, [YAML] has emerged as a more concise format to represent information that had previously been serialized as [JSON], including API specifications, data schemas, and Linked Data.

This document defines YAML-LD as a set of conventions on top of YAML which specify how to serialize Linked Data [LINKED-DATA] as [YAML] based on JSON-LD syntax, semantics, and APIs.

Since YAML is more expressive than JSON, both in the available data types and in the document structure (see [I-D.ietf-httpapi-yaml-mediatypes]), this document identifies constraints on YAML such that any YAML-LD document can be represented in JSON-LD.

Editor's note
See the YAML-LD description of this specification at spec.yaml.

1.1 How to read this document

This section is non-normative.

To understand the basics of this specification, one must be familiar with the following:

This document is intended primarily for two main audiences, as described below.

1.2 Terminology

This section is non-normative.

This document uses the following terms as defined in external specifications and defines terms specific to YAML-LD.

A YAML-LD stream is a YAML stream of YAML-LD documents.

A YAML-LD document is any YAML document from which a conversion to [JSON] produces a valid JSON-LD document which can be interpreted as [LINKED-DATA].

The term media type is imported from [RFC6838].

The term JSON is imported from [JSON].

The term JSON document represents a serialization of a resource conforming to the [JSON] grammar.

The terms JSON-LD document and value object are imported from [JSON-LD11].

The terms internal representation and documentLoader are imported from [JSON-LD11-API].

The terms array, boolean, map, map entry, null, and string are imported from [INFRA].

The term number is imported from [ECMASCRIPT].

The terms YAML, YAML representation graph, YAML stream, YAML directive, TAG directive, YAML document, YAML sequence (either block sequence or flow sequence), YAML mapping (either block mapping or flow mapping), node, scalar, node anchor, node tags, and alias node, are imported from [YAML].

The term content negotiation is imported from [RFC9110].

The terms RDF literal, language-tagged string, datatype IRI, and language tag are imported from [RDF11-CONCEPTS].

The terms fragment and fragment identifier in this document are to be interpreted as in [URI].

The term Linked Data is imported from [LINKED-DATA].

1.3 Namespace Prefixes

This section is non-normative.

This specification makes use of the following namespace prefixes:

Prefix IRI
ex https://example.org/
i18n https://www.w3.org/ns/i18n#
rdf http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs http://www.w3.org/2000/01/rdf-schema#
xsd http://www.w3.org/2001/XMLSchema#
schema https://schema.org/
prov http://www.w3.org/ns/prov#
Editor's note
See the YAML-LD version of this table at namespace-prefixes.yaml.

These are used within this document as part of a compact IRI as a shorthand for the resulting IRI, such as schema:url used to represent https://schema.org/url.

2. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, MUST NOT, RECOMMENDED, and SHOULD in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

A YAML-LD document complies with the YAML-LD Basic profile of this specification if it follows the normative statements from this specification and can be transformed into a JSON-LD representation, then back to a conforming YAML-LD document, without loss of semantic information.

For convenience, normative statements for documents are often phrased as statements on the properties of the document.

3. Basic Concepts

This section is non-normative.

3.1 JSON vs YAML comparison

YAML is more flexible than JSON, as illustrated by the comparison table below.

Feature | [JSON] | [YAML]
Allowed encodings:
  UTF-8 | ✅ | ✅
  UTF-16 | ❌ | ✅
  UTF-32 | ❌ | ✅
Native data types:
  {} object | ✅ | ✅
  [] array | ✅ | ✅
  string | ✅ | ✅
  number | ✅ | ✅
    integer | ❌ | ✅
    floating point | ❌ | ✅
  bool | ✅ | ✅
  null | ✅ | ✅
Features:
  Custom types | ❌ | ✅ via tags
  Cycles | ❌ | ✅
  Documents per file | 1 | ⩾ 1 via YAML stream
  Comments | ❌ | ✅
  Anchors & aliases | ❌ | ✅
  Mapping key types | string | Any type representable in YAML, from strings to mappings
Editor's note
See the YAML-LD version of this table at json-vs-yaml.yaml.

The first goal of this specification is to allow a JSON-LD document to be processed and serialized into YAML, and then back into JSON-LD, without losing any semantic information.

This is always possible because

Example: The JSON-LD document below

Example 2: Basic JSON-LD document
{
  "@context": "https://schema.org",

  "@id": "https://w3.org/yaml-ld/",
  "@type": "WebContent",
  "name": "YAML-LD",
  "author": {
    "@id": "https://www.w3.org/community/json-ld",
    "name": "JSON-LD Community Group"
  }
}

Can be serialized as YAML as follows. Note that entries starting with @ need to be enclosed in quotes because @ is a reserved character in YAML.

Example 3: Basic YAML-LD document
"@context": https://schema.org

"@id": https://w3.org/yaml-ld/
"@type": WebContent
name: YAML-LD
author:
  "@id": https://www.w3.org/community/json-ld
  name: JSON-LD Community Group

This document is based on YAML 1.2.2, but YAML-LD is not tied to a specific version of YAML. Implementers concerned about features related to a specific YAML version can specify it in documents using the %YAML directive (see 6. Interoperability Considerations).

4. Core Requirements

4.1 YAML features supported by YAML-LD

This section is non-normative.

Encoding
UTF-8 only.
Native data types
Every native data type that the YAML-LD parser supports.
Tags
Ignored.
Comments
Treated as whitespace.
Anchors & aliases
Resolved by the YAML parser. Anchor and alias names are ignored.
Cycles defined using anchors & aliases
Not permitted.
YAML Streams
Not supported.
Mapping key types other than string
Not supported.

Prospects for supporting the additional YAML features are analyzed in the Extended YAML-LD Profile, an informative addendum to this specification.

4.2 Encoding

A YAML-LD document MUST be encoded in UTF-8, to ensure interoperability with [JSON]; otherwise, an invalid-encoding error MUST be detected, and processing aborted.
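As an illustration only, the encoding requirement could be enforced before parsing by decoding the raw bytes strictly as UTF-8. The sketch below is in Python; the InvalidEncodingError class and the decode_yaml_ld function are hypothetical names chosen for this example and are not defined by this specification.

```python
class InvalidEncodingError(ValueError):
    """Hypothetical error type standing in for the invalid-encoding error."""


def decode_yaml_ld(raw: bytes) -> str:
    """Decode the bytes of a YAML-LD document, accepting UTF-8 only."""
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8,
        # including the byte order marks used by UTF-16 and UTF-32.
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise InvalidEncodingError("invalid-encoding") from exc


text = decode_yaml_ld("name: YAML-LD".encode("utf-8"))
```

A strict decode does not catch every non-UTF-8 input (for example, UTF-16 text without a byte order mark can decode to a string full of NUL characters), so a real implementation might add further checks.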

4.3 Comments

Comments in YAML-LD documents are treated as white space.

See Interoperability considerations of [I-D.ietf-httpapi-yaml-mediatypes] for more details.

4.4 Anchors and Aliases

Since anchor names are a serialization detail, such anchors MUST NOT be used to convey relevant information, MAY be altered when processing the document, and MAY be dropped when interpreting the document as JSON-LD.

A YAML-LD document MAY contain anchored nodes and alias nodes, but its representation graph MUST NOT contain cycles; otherwise, a loading-document-failed error MUST be detected, and processing aborted.

When interpreting the document as JSON-LD, alias nodes MUST be resolved by value to their target nodes.

The YAML-LD document in the following example contains alias nodes for the {"@id": "country:US"} object:

Example 4: YAML-LD with node anchors
"@context":
  "@import": https://schema.org
  country: https://example.org/country/

"@included":
  - &US
    "@id": country:US
  - "@id": https://www.w3.org/community/json-ld
    "@type": Organization
    member:
      - "@id": https://github.com/gkellogg
        "@type": Person
        name: Gregg Kellogg
        country: *US
      - "@id": https://github.com/BigBlueHat
        "@type": Person
        name: Benjamin Young
        country: *US
    # - …

While the representation graph (and eventually the in-memory representation of the data structure, e.g., a Python dictionary or a Java hashmap) will still contain references between nodes, the JSON-LD serialization will not — since, by the time it is formed, all the anchors have been resolved, as shown below.

Example 5: JSON-LD resulting from YAML with node anchors
{
  "@context": {
    "@import": "https://schema.org",
    "country": "https://example.org/country/"
  },
  "@included": [
    {
      "@id": "country:US"
    },
    {
      "@id": "https://www.w3.org/community/json-ld",
      "@type": "Organization",
      "member": [
        {
          "@id": "https://github.com/gkellogg",
          "@type": "Person",
          "name": "Gregg Kellogg",
          "country": {
            "@id": "country:US"
          }
        },
        {
          "@id": "https://github.com/BigBlueHat",
          "@type": "Person",
          "name": "Benjamin Young",
          "country": {
            "@id": "country:US"
          }
        },
        …
      ]
    }
  ]
}
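Many YAML libraries load alias nodes as shared references to one in-memory object. The following Python sketch assumes the document has already been loaded into plain dicts and lists by such a library (not shown), expands shared references by value, and rejects cycles. The names resolve_aliases and LoadingDocumentFailed are hypothetical.

```python
class LoadingDocumentFailed(Exception):
    """Hypothetical error type standing in for loading-document-failed."""


def resolve_aliases(node, _path=()):
    """Return a copy of node in which every shared reference is expanded by value."""
    if isinstance(node, (dict, list)):
        if id(node) in _path:
            # The node is its own ancestor: the representation graph has a cycle.
            raise LoadingDocumentFailed("cycle detected")
        path = _path + (id(node),)
        if isinstance(node, dict):
            return {key: resolve_aliases(value, path) for key, value in node.items()}
        return [resolve_aliases(item, path) for item in node]
    return node  # scalars are immutable and can be shared safely


# Simulates what a loader might yield for &US and *US in Example 4.
us = {"@id": "country:US"}
loaded = {"@included": [us, {"name": "Gregg Kellogg", "country": us}]}
resolved = resolve_aliases(loaded)
```

After this step the two occurrences of the country node are equal but independent objects, which mirrors the JSON-LD output in Example 5.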

4.5 Mapping Key Types

Every mapping key MUST be a string; otherwise, a processing error is raised.
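YAML loaders typically turn a key such as 2023 into an integer, so a check of this kind may be needed after loading. The Python sketch below assumes the document is already held as dicts and lists; check_mapping_keys and MappingKeyError are hypothetical names.

```python
class MappingKeyError(Exception):
    """Hypothetical error type for a non-string mapping key."""


def check_mapping_keys(node):
    """Raise MappingKeyError if any mapping key within node is not a string."""
    if isinstance(node, dict):
        for key, value in node.items():
            if not isinstance(key, str):
                raise MappingKeyError(f"mapping key {key!r} is not a string")
            check_mapping_keys(value)
    elif isinstance(node, list):
        for item in node:
            check_mapping_keys(item)


check_mapping_keys({"@id": "https://w3.org/yaml-ld/", "name": "YAML-LD"})  # passes silently
```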

5. Security Considerations

This section is non-normative.

See Security considerations in JSON-LD 1.1. Also, see the YAML media type registration.

6. Interoperability Considerations

This section is non-normative.

For general interoperability considerations on the serialization of JSON documents in [YAML], see YAML and the Interoperability consideration of application/yaml [I-D.ietf-httpapi-yaml-mediatypes].

The YAML-LD format and the media type registration are not restricted to a specific version of YAML, but implementers that want to use YAML-LD with YAML versions other than 1.2.2 need to be aware that the considerations and analysis provided here, including interoperability and security considerations, are based on the YAML 1.2.2 specification.

A. IANA Considerations

This section has been submitted to the Internet Engineering Steering Group (IESG) for review, approval, and registration with IANA.

This section describes the information required to register the above media type according to [RFC6838].

A.1 application/ld+yaml

Type name:
application
Subtype name:
ld+yaml
Required parameters:
N/A
Optional parameters:
profile

A non-empty list of space-separated URIs identifying specific constraints or conventions that apply to a YAML-LD document according to [RFC6906]. A profile does not change the semantics of the resource representation when processed without profile knowledge, so that clients both with and without knowledge of a profiled resource can safely use the same representation. The profile parameter MAY be used by clients to express their preferences in the content negotiation process. If the profile parameter is given, a server SHOULD return a document that honors the profiles in the list which it recognizes, and MUST ignore the profiles in the list which it does not recognize. It is RECOMMENDED that profile URIs are dereferenceable and provide useful documentation at that URI. For more information and background please refer to [RFC6906].

This specification allows the use of the profile parameters listed in and additionally defines the following:

http://www.w3.org/ns/json-ld#extended
To request or specify extended YAML-LD document form.
Editor's note
This is a placeholder for specifying something like an extended YAML-LD document form making use of YAML-specific features.

When used as a media type parameter [RFC4288] in an HTTP Accept header field [RFC9110], the value of the profile parameter MUST be enclosed in quotes (") if it contains special characters such as whitespace, which is required when multiple profile URIs are combined.

When processing the "profile" media type parameter, it is important to note that its value contains one or more URIs and not IRIs. In some cases it might therefore be necessary to convert between IRIs and URIs as specified in section 3 Relationship between IRIs and URIs of [RFC3987].

Encoding considerations:
See YAML media type.
Security considerations:
See 5. Security Considerations.
Interoperability considerations:
See 6. Interoperability Considerations.
Published specification:
http://www.w3.org/TR/yaml-ld
Applications that use this media type:
Any programming environment that requires the exchange of directed graphs.
Additional information:
Magic number(s):
See application/yaml
File extension(s):
  • .yaml
  • .yamlld
Macintosh file type code(s):
TEXT
Person & email address to contact for further information:
Philippe Le Hégaret <plh@w3.org>
Intended usage:
Common
Restrictions on usage:
N/A
Author(s):
Roberto Polli, Gregg Kellogg
Change controller:
W3C
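The quoting rule for the profile parameter described in this registration can be illustrated with a short Python sketch. The accept_header function is a hypothetical helper, and https://example.org/profile is a made-up profile URI used only to show the space-separated form.

```python
def accept_header(profiles=()):
    """Build an Accept header value for application/ld+yaml with optional profiles."""
    media_type = "application/ld+yaml"
    if not profiles:
        return media_type
    # Several profile URIs are separated by spaces, so the value is quoted.
    # Quoting a single URI is also valid and keeps the code uniform.
    value = " ".join(profiles)
    return f'{media_type};profile="{value}"'


header = accept_header([
    "http://www.w3.org/ns/json-ld#extended",
    "https://example.org/profile",  # hypothetical second profile
])
```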

B. Best Practices

This section is non-normative.

This section offers YAML-LD users some optional advice that might prove useful in practice.

Best Practice 1: Follow JSON-LD best practices

…in order to achieve a greater level of reusability, performance, and human friendliness among YAML-LD aware systems. The [json-ld-bp] document is as relevant to YAML-LD as it is to [JSON-LD11].

Best Practice 2: Do not force users to author contexts

Instead, provide pre-built contexts that the user can reference by URL for a majority of common use cases.

YAML-LD is intended to simplify the authoring of Linked Data for a wide range of domain experts; its target audience is not comprised solely of IT professionals. [YAML] is chosen as a medium to minimize syntactic noise, and to keep the authored documents concise and clear. [JSON-LD11] (and hence YAML-LD) Context comprises a special language of its own. A requirement to author such a context would make the domain expert's job much harder, which we, as system architects and developers, should try to avoid.

Best Practice 3: Use a default context

If most, or all, of a user's documents are based on one particular context, try to make it the default in order to spare the user from copy-pasting the same technical incantation from one document to another.

For instance, according to [JSON-LD11-API], the expand() method of a JSON-LD processor accepts an expandContext argument which can be used to provide a default system context.

Best Practice 4: Alias JSON-LD keywords

If possible, map JSON-LD keywords containing the @ character to keywords that do not contain it.

The @ character is reserved in YAML, and thus requires quoting (or escaping), as in the following example:

Example 6: Example YAML-LD document with quoted keywords
"@context": https://schema.org
"@id": https://w3.org/yaml-ld/
"@type": WebContent

The need to quote these keywords has to be learnt, and introduces one more little irregularity to the document author's life. Further, on most keyboard layouts, typing quotes will require Shift, which reduces typing speed, albeit slightly.

In order to avoid this, the context might introduce custom mappings for JSON-LD keywords — to make authoring more convenient. The exact mapping might vary depending on the domain, but we provide two examples, both published at json-ld.org:

Best Practice 5: Use a Convenience Context

YAML-LD users may use a JSON-LD context provided as part of this specification, or a similar custom context, to improve the authoring experience and readability.

Unfortunately, the @context keyword cannot be aliased, as per the JSON-LD specification, and has to stay as-is.

Consider Example 6 reformatted using the $-convenience context:

Example 7: Example YAML-LD document with Convenience Context
"@context":
  - https://json-ld.org/contexts/dollar-convenience.jsonld
  - https://schema.org

$id: https://w3.org/yaml-ld/
$type: WebContent

C. Why are comments treated as whitespace?

This section is non-normative.

C.1 Consistency

  • [TURTLE] and other Linked Data serializations which support comments do not provide a means to preserve them when processing and serializing the document in other formats.
  • [YAML] requires that parts of the document not reflected by the representation graph, such as comments, directives, mapping key order, and anchor names, must not be used to convey application-level information.

C.2 Predictability

Theoretically, we could try harvesting YAML comments into JSON-LD documents. We would define a specific predicate, like https://json-ld.org/yaml-ld/comment, and convert every # My comment fragment into a {"yaml-ld:comment": "My comment"} piece of the JSON-LD document.

This would, however, have the following impacts on implementations:

D. Streams

This section is non-normative.

Every YAML-LD file is a YAML-LD stream and might contain multiple YAML-LD documents, as shown in the example below.

Example 8: YAML-LD with several documents in one file
"@context": https://schema.org
"@id": https://w3.org/yaml-ld/
"@type": WebContent
name: YAML-LD
---
"@context": https://schema.org
"@id": https://www.w3.org/TR/json-ld11/
"@type": WebContent
name: JSON-LD

Issue 63: YAML Streams and JSON Sequences

YAML streams may correspond more directly to JavaScript Object Notation (JSON) Text Sequences, which are not presently part of the JSON-LD internal representation. The description here more closely aligns with how JSON-LD interprets HTML Scripts.

The current specification does not support this feature. Implementations MAY choose, for example, to do any of the following:

Note: Interoperability considerations on YAML streams

For interoperability considerations on YAML streams, see the relevant section in YAML Media Type.

E. Extended YAML-LD Profile

This section is non-normative.

E.1 Motivation

The YAML-LD specification relies upon YAML to serialize Linked Data to the extent that YAML is compatible with JSON, which simplifies the operation and usage of YAML-LD. However, the more expressive feature set of YAML invites us to represent Linked Data in a more expressive way.

In the cases described above, one of the possible expressive methods is a specific feature of the YAML language. To leverage those methods, we propose an Extended YAML-LD Profile which will implement all such features.

Editor's note
The Extended Profile is out of scope for the normative part of this specification; we leave it for later versions, pending feedback from the community and new knowledge gained from practical experience of using the basic version of YAML-LD that we will henceforth call Basic Profile of YAML-LD.

E.1.1 Specify node @type

When converting JSON-LD to RDF, @type translates to one of the following:

  • an rdf:type edge
  • a datatype mark for a Literal node

Possible ways to specify this in YAML-LD are the following:

  • In the @context, but there we can only say that the node is an IRI; we cannot specify a particular rdf:type
  • Using [RDF-SCHEMA] and [OWL2-SYNTAX] based logical reasoning, for instance, via rdfs:domain or rdfs:range properties
  • Inline, using the @type keyword
  • Using a YAML Tag, as shown below:

    Example 9: YAML-LD with tags
    %TAG !xsd! http://www.w3.org/2001/XMLSchema%23
    ---
    "@context": https://schema.org
    "@id": https://w3.org/yaml-ld/
    dateModified: !xsd!date 2023-06-26

    Here, %TAG declares the !xsd! handle for tags used in the document. YAML treats tags as IRIs, which brings it close to the LD family of data formats. Note that the directives section must be separated from the main document with --- (a line containing exactly three hyphens).
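To make the effect of the %TAG directive in Example 9 concrete, the Python sketch below expands a tag shorthand that uses the !xsd! handle into a full IRI. It handles only named handles of the form !name!, and it assumes that percent-escapes such as %23 are decoded after expansion; expand_tag is a hypothetical helper, not part of any YAML library.

```python
from urllib.parse import unquote


def expand_tag(shorthand, handles):
    """Expand a tag shorthand such as '!xsd!date' using %TAG handle declarations."""
    # A named handle has the form '!name!'; the remainder is the tag suffix.
    end = shorthand.index("!", 1) + 1
    handle, suffix = shorthand[:end], shorthand[end:]
    # Percent-escapes such as %23 stand for characters like '#'.
    return unquote(handles[handle] + suffix)


handles = {"!xsd!": "http://www.w3.org/2001/XMLSchema%23"}
iri = expand_tag("!xsd!date", handles)
# iri is "http://www.w3.org/2001/XMLSchema#date"
```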

E.1.2 Reduce duplication

If a segment of a YAML document has to be repeated more than once, one of the following approaches can be taken:

  • Repeat the segment as many times as necessary
  • If the segment represents a node, designate it once with a YAML-LD @id, and then address it by the given identifier
  • Use YAML anchors & aliases as shown in Example 4.

E.2 Approaches

Two alternative approaches have been proposed to implement the Extended profile:

E.3 Extended Internal Representation

This approach implies extending the JSON-LD internal representation to allow a more complete expression of native data types within YAML-LD, and allows use of the complete JSON-LD 1.1 Processing Algorithms and API [JSON-LD11-API] to manipulate extended YAML-LD documents.

A YAML-LD document complies with the YAML-LD extended profile of this specification if it follows the normative statements from this specification and can be transformed into the JSON-LD extended internal representation, then back to a conforming YAML-LD document, without loss of semantic information.

As [YAML] has well-defined representation requirements, all YAML-LD streams MUST be well-formed, and every alias node MUST refer to a previous node with a corresponding anchor; otherwise, a loading-document-failed error has been detected and processing is aborted.

The YAML-LD extended profile allows full use of anchor names and alias nodes subject to the requirements described above in this section.

If the extendedYAML API flag is true, the processing result will be in the extended internal representation.

When processing using the YAML-LD Basic profile, documents MUST NOT contain alias nodes; otherwise, a profile-error error has been detected and processing is aborted.

E.3.1 Conversion to the Internal Representation

YAML-LD processing is defined by converting YAML to the internal representation and using JSON-LD 1.1 Processing Algorithms and API to process that representation, after which the representation is converted back to YAML. Information specific to a given YAML document structure is lost in this transformation, which limits the ability to fully round-trip a YAML-LD document back to an equivalent representation. Consequently, round-tripping in this context is limited to preservation of the semantic representation of a document, rather than a specific syntactic representation.

The conversion process represented here is compatible with the description of "Composing the Representation Graph" from the 3.1.2 Load section of [YAML]. The steps described below for converting to the internal representation operate upon that representation graph, as defined in YAML Ain’t Markup Language (YAML™) version 1.2.2.

When operating using the YAML-LD Basic profile, it is intended that the common feature provided by most YAML libraries of transforming YAML directly to JSON satisfies the requirements for parsing a YAML-LD file.

Issue 12: Convert JSON-LD to YAML-LD using standard YAML libraries

As a developer, I want to be able to convert JSON-LD documents to YAML-LD by simply serializing the document using any standard YAML library, so that the resulting YAML is valid YAML-LD, resolving to the same graph as the original JSON-LD.

E.3.1.1 Converting a YAML stream

A YAML stream is composed of zero or more YAML documents.

  1. Set stream content to an empty array.
  2. If the stream is empty, set stream content to an empty array.
  3. Otherwise, if the stream contains a single YAML document, set stream content to the result of E.3.1.2 Converting a YAML document.
  4. Otherwise: for each document in the stream:
    1. Set doc to the result of E.3.1.2 Converting a YAML document for document.
    2. If doc is an array, merge it to the end of stream content.
    3. Otherwise, append doc to stream content
    Editor's note
    This step is inconsistent with other statements about processing each document separately, resulting in some other stream of JSON-LD output (i.e., something like NDJSON-LD). Also, presumably an empty stream would result in either an empty NDJSON-LD document, or an empty [JSON-LD] document.
  5. The conversion result is stream content.

Any error reported in a recursive processing step MUST result in the failure of this processing step.
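The stream steps above can be sketched in Python as follows. The sketch assumes each YAML document has already been converted (per E.3.1.2) into a dict or a list; convert_stream is a hypothetical name.

```python
def convert_stream(documents):
    """Combine already-converted YAML documents into stream content."""
    if len(documents) == 1:
        # A single document is returned as-is rather than wrapped in an array.
        return documents[0]
    stream_content = []  # also the result for an empty stream
    for doc in documents:
        if isinstance(doc, list):
            stream_content.extend(doc)  # merge arrays into the result
        else:
            stream_content.append(doc)
    return stream_content


combined = convert_stream([{"name": "YAML-LD"}, [{"name": "JSON-LD"}]])
# combined is [{"name": "YAML-LD"}, {"name": "JSON-LD"}]
```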

E.3.1.2 Converting a YAML document

From the YAML grammar, a YAML document MAY be preceded by a Document Prefix and/or a set of directives followed by a YAML bare document, which is composed of a single node.

  1. Create an empty named nodes map which will be used to associate each alias node with the node having the corresponding node anchor.
  2. Set document content to the result of processing the node associated with the YAML bare document, using the appropriate conversion step defined in this section. If that node is not one of the following, a loading-document-failed error has been detected and processing is aborted.
    Note
    A node may be of another type, but this is incompatible with JSON-LD, where the top-most node must be either an array or map.
  3. The conversion result is document content.

Any error reported in a recursive processing step MUST result in the failure of this processing step.

E.3.1.3 Converting a YAML sequence

Both block sequences and flow sequences are directly aligned with an array in the internal representation.

  1. Set sequence content to an empty array.
  2. If the sequence has a node anchor, add a reference from the anchor name to the sequence in the named nodes map.
  3. For each node n in the sequence, append the result of processing n to sequence content using the appropriate conversion step.
  4. The conversion result is sequence content.

Any error reported in a recursive processing step MUST result in the failure of this processing step.

E.3.1.4 Converting a YAML mapping

Both block mappings and flow mappings are directly aligned with a map in the internal representation.

  1. Set mapping content to an empty map.
  2. If the mapping has a node anchor, add a reference from the anchor name to the mapping in the named nodes map.
  3. For each entry in the mapping composed of a key/value pair:
    1. Set key and value to the result of processing entry using the appropriate conversion step.
    2. If key is not a string, a mapping-key-error error has been detected and processing MUST be aborted.
    3. Add a new entry to mapping content using key and value.
  4. The conversion result is mapping content.

Any error reported in a recursive processing step MUST result in the failure of this processing step.

E.3.1.5 Converting a YAML scalar

  1. If the extendedYAML flag is true, and node n has a node tag t, n is mapped as follows:
    1. If t resolves with a prefix of tag:yaml.org,2002:, the conversion result is mapped through the YAML Core Schema.
    2. Otherwise, if t resolves with a prefix of https://www.w3.org/ns/i18n#, and the suffix does not contain an underscore ("_"), the conversion result is a language-tagged string with value taken from n, and a language tag taken from the suffix of t.
      Note
      Node tags including an underscore ("_"), such as i18n:ar-eg_rtl describe a combination of language and text direction. See The i18n Namespace in [JSON-LD11].
    3. Otherwise, the conversion result is an RDF literal with value taken from n and datatype IRI taken from t.
  2. Otherwise, the conversion result is mapped through the YAML Core Schema.
Note

Implementations may retain the representation as a YAML Integer, or YAML Floating Point, but a JSON-LD processor must treat them uniformly as a number, although the specific type of number value SHOULD be retained for round-tripping.
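The two branches of the scalar steps that do not involve the YAML Core Schema can be sketched in Python as shown below. This is only an illustration: it uses JSON-LD value objects as a stand-in for language-tagged strings and RDF literals in the extended internal representation, which is an assumption of this example, and convert_non_core_scalar is a hypothetical name. Tags with the tag:yaml.org,2002: prefix are assumed to be resolved by the YAML library and are out of scope here.

```python
I18N = "https://www.w3.org/ns/i18n#"


def convert_non_core_scalar(value, tag):
    """Map a scalar carrying a non-Core-Schema tag to a value object."""
    if tag.startswith(I18N):
        suffix = tag[len(I18N):]
        if "_" not in suffix:
            # The suffix is a plain language tag such as "en".
            return {"@value": value, "@language": suffix}
    # Everything else, including i18n tags that combine language and text
    # direction with an underscore, uses the tag as the datatype IRI.
    return {"@value": value, "@type": tag}


greeting = convert_non_core_scalar("Hello", I18N + "en")
modified = convert_non_core_scalar("2023-06-26", "http://www.w3.org/2001/XMLSchema#date")
```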

E.3.1.6 Converting a YAML alias node

The conversion result is the value of the entry in the named nodes map whose key is the anchor name referenced by the alias node. If no such entry exists, the document is invalid, and processing MUST end in failure.

If an alias node is encountered when processing the YAML representation graph and the extendedYAML flag is false, the YAML-LD Basic profile has been selected. A profile-error error has been detected and processing MUST be aborted.

If a cycle is detected, a processing error MUST be returned, and processing aborted.
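The conversion steps for sequences, mappings, scalars, and alias nodes can be drawn together in one recursive Python sketch. It works on a deliberately simplified, hypothetical Node model rather than on the data structures of any real YAML library, stores converted content (not the nodes themselves) in the named nodes map, and reports a cyclic alias with the same error as an undefined one; all of these are assumptions made for brevity.

```python
from dataclasses import dataclass
from typing import Any, Optional


class ProfileError(Exception):
    """Hypothetical stand-in for the profile-error error."""


class MappingKeyError(Exception):
    """Hypothetical stand-in for the mapping-key-error error."""


class LoadingDocumentFailed(Exception):
    """Hypothetical stand-in for the loading-document-failed error."""


@dataclass
class Node:
    """Hypothetical, simplified node of a YAML representation graph."""
    kind: str                     # "scalar", "sequence", "mapping", or "alias"
    value: Any = None             # scalar value, child nodes, key/value node pairs, or anchor name
    anchor: Optional[str] = None  # node anchor, if any


def convert(node, named_nodes, extended_yaml=True):
    """Convert a node to the internal representation (dict, list, or scalar)."""
    if node.kind == "alias":
        if not extended_yaml:
            raise ProfileError("alias nodes are not allowed in the Basic profile")
        if node.value not in named_nodes:
            raise LoadingDocumentFailed(f"unknown or cyclic alias *{node.value}")
        return named_nodes[node.value]
    if node.kind == "sequence":
        result = [convert(child, named_nodes, extended_yaml) for child in node.value]
    elif node.kind == "mapping":
        result = {}
        for key_node, value_node in node.value:
            key = convert(key_node, named_nodes, extended_yaml)
            if not isinstance(key, str):
                raise MappingKeyError(f"mapping key {key!r} is not a string")
            result[key] = convert(value_node, named_nodes, extended_yaml)
    else:
        result = node.value
    if node.anchor is not None:
        # Registering the anchor only after its node is converted means an alias
        # that points at an enclosing node finds no entry, so cycles are rejected.
        named_nodes[node.anchor] = result
    return result


us = Node("mapping", [(Node("scalar", "@id"), Node("scalar", "country:US"))], anchor="US")
person = Node("mapping", [(Node("scalar", "country"), Node("alias", "US"))])
converted = convert(Node("sequence", [us, person]), {})
```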

E.3.2 Conversion to YAML

The conversion process from the internal representation involves turning that representation back into a YAML representation graph and relies on the description of "Serializing the Representation Graph" from the 3.1.1 Dump section of [YAML] for the final serialization.

As the internal representation is rooted by either an array or a map, the process of transforming the internal representation to YAML begins by preparing an empty representation graph which will be rooted with either a YAML mapping or YAML sequence.

Although outside of the scope of this specification, processors MAY use YAML directives, including TAG directives, and Document markers, as appropriate for best results. Specifically, if the extendedYAML API flag is true, the document SHOULD use the %YAML directive with version set to at least 1.2. To improve readability and reduce document size, the document MAY use a %TAG directive appropriate for RDF literals contained within the representation.

Note

The use of %TAG directives in YAML-LD is similar to the use of the PREFIX directive in [TURTLE] or the general use of terms as prefixes to create Compact IRIs in [JSON-LD11]: they do not change the meaning of the encoded scalars.

Example 11: Serialized representation of the extended internal representation
%TAG !xsd! http://www.w3.org/2001/XMLSchema%23
---
"@context": https://schema.org
"@id": https://github.com/gkellogg
"@type": Person
name: !xsd!string Gregg Kellogg
birthDate: !xsd!date 1970-01-01

Note

Although allowed within the YAML Grammar, some current YAML parsers do not allow the use of "#" within a tag URI. Substituting the "%23" escape is a workaround for this problem, that will hopefully become unnecessary as implementations are updated.

Issue 6: Use tags to distinguish "plain" YAML-LD from "idiomatic" YAML-LD

A concrete proposal in that direction would be to use a tag at the top-level of any "idiomatic" YAML-LD document, applying to the whole object/array that makes the document.

It might also include a version to identify the specification that it relates to, allowing for version announcement that could be used for future-proofing.

The following block is one example:

!yaml-ld
$context: http://schema.org
$type: Person
name: Pierre-Antoine Champin

See Example 11 for an example of serializing the extended internal representation.

E.3.2.1 Converting From the Internal Representation

This algorithm describes the steps to convert each element from the internal representation into corresponding YAML nodes by recursively processing each element n.

  1. If n is an array, the conversion result is a YAML sequence with child nodes of the sequence taken by converting each value of n using this algorithm.
  2. Otherwise, if n is a map, the conversion result is a YAML mapping with keys and values taken by converting each key/value pair of n using this algorithm.
  3. Otherwise, if n is an RDF literal:
    1. If the datatype IRI of n is xsd:string, the conversion result is a YAML scalar with the value taken from the value of n.
    2. Otherwise, if n is a language-tagged string, the conversion result is a YAML scalar with the value taken from the value of n and a node tag constructed by appending the language tag of n to https://www.w3.org/ns/i18n#.
    3. Otherwise, the conversion result is a YAML scalar with the value taken from the value of n and a node tag taken from the datatype IRI of n.
  4. Otherwise, if n is a number, the conversion result is a YAML scalar with the value taken from n.
  5. Otherwise, if n is a boolean, the conversion result is a YAML scalar with the value either true or false based on the value of n.
  6. Otherwise, if n is null, the conversion result is a YAML scalar with the value null.
  7. Otherwise, the conversion result is a YAML scalar with the value taken from n.
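
The steps above can be sketched as a single recursive function. This is an illustration under stated assumptions, not a normative implementation: RDFLiteral and Scalar are hypothetical stand-ins for an RDF literal of the extended internal representation and for a YAML scalar node, and language-tagged strings are assumed to carry the rdf:langString datatype so that step 3.1 does not capture them.

```python
from dataclasses import dataclass
from typing import Any, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"
I18N = "https://www.w3.org/ns/i18n#"


@dataclass(frozen=True)
class RDFLiteral:
    """Hypothetical RDF literal: lexical value, datatype IRI, optional language."""
    value: str
    datatype: str = XSD_STRING
    language: Optional[str] = None


@dataclass(frozen=True)
class Scalar:
    """Hypothetical YAML scalar node: a value plus an optional node tag."""
    value: Any
    tag: Optional[str] = None


def to_yaml_node(n: Any) -> Any:
    if isinstance(n, list):                       # step 1: array -> sequence
        return [to_yaml_node(v) for v in n]
    if isinstance(n, dict):                       # step 2: map -> mapping
        return {to_yaml_node(k): to_yaml_node(v) for k, v in n.items()}
    if isinstance(n, RDFLiteral):                 # step 3: RDF literal
        if n.datatype == XSD_STRING:              # 3.1: plain string, no tag
            return Scalar(n.value)
        if n.language is not None:                # 3.2: language-tagged string
            return Scalar(n.value, I18N + n.language)
        return Scalar(n.value, n.datatype)        # 3.3: tag is the datatype IRI
    # Steps 4 to 7: numbers, booleans, null, and strings become untagged scalars.
    return Scalar(n)
```

In this sketch a YAML sequence is modelled as a Python list and a YAML mapping as a dict keyed by Scalar nodes; a real processor would build nodes of its YAML library instead.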

E.3.3 Application Profiles

This section identifies two application profiles for operating with YAML-LD: the YAML-LD Basic profile and the YAML-LD extended profile.

Application profiles allow publishers to use YAML-LD either for maximum interoperability, or for maximum expressivity. The YAML-LD Basic profile provides for complete round-tripping between YAML-LD documents and JSON-LD documents. The YAML-LD extended profile allows for fuller use of YAML features to enhance the ability to represent a larger number of native datatypes and reduce document redundancy.

Application profiles can be set using the JsonLdProcessor API interface, as well as an HTTP request profile (see A. IANA Considerations).

E.3.3.1 YAML-LD Basic Profile

The YAML-LD Basic profile is based on the YAML Core Schema, which interprets only a limited set of node tags. YAML scalars with node tags outside of the defined range SHOULD be avoided and MUST be converted to the closest scalar type from the YAML Core Schema, if found. See E.3.1.5 Converting a YAML scalar for specifics.

Although YAML supports several additional encodings, YAML-LD documents in the YAML-LD Basic Profile MUST NOT use encodings other than UTF-8.

Keys used in a YAML mapping MUST be strings.

Although YAML-LD documents MAY include node anchors, documents MUST NOT use alias nodes.

A YAML stream MUST include only a single YAML document, as the JSON-LD internal representation only supports a single document model.

E.3.3.2 YAML-LD Extended Profile

The YAML-LD extended profile extends the YAML Core Schema, allowing node tags to specify RDF literals by using a JSON-LD extended internal representation capable of directly representing RDF literals.

As with the YAML-LD Basic profile, YAML-LD documents in the YAML-LD extended profile MUST NOT use encodings other than UTF-8.

As with the YAML-LD Basic profile, keys used in a YAML mapping MUST be strings.

YAML-LD documents MAY use alias nodes, as long as dereferencing these aliases does not result in a loop.

As with the YAML-LD Basic profile, a YAML stream MUST include only a single YAML document, as the JSON-LD extended internal representation only supports a single document model.

Issue 79: YAML-LD IRI tags

Consider something like !id as a local tag to denote IRIs.

E.3.3.2.1 The JSON-LD Extended Internal Representation

This specification defines the JSON-LD extended internal representation, an extension of the JSON-LD internal representation.

In addition to maps, arrays, and strings, the internal representation allows native representation of numbers, boolean values, and nulls. The extended internal representation allows for native representation of RDF literals, both with a datatype IRI, and language-tagged strings.

When transforming from the extended internal representation to the internal representation — for example when serializing to JSON or to the YAML-LD Basic profile — implementations MUST transform RDF literals to the closest native representation of the internal representation:

Editor's note

An alternative would be to transform such literals to JSON-LD value objects, and we may want to provide a means of transforming between the internal representation and extended internal representation using value objects, but this treatment is consistent with [YAML] Core Schema Tag Resolution.

E.3.4 The Application Programming Interface

This specification extends the JSON-LD 1.1 Processing Algorithms and API [JSON-LD11-API] Application Programming Interface and the JSON-LD 1.1 Framing [JSON-LD11-FRAMING] Application Programming Interface to manage the serialization and deserialization of [YAML] and to enable an option for setting the YAML-LD extended profile.

E.3.4.1 JsonLdProcessor

The JSON-LD Processor interface is the high-level programming structure that developers use to access the JSON-LD transformation methods. The updates below are an experimental extension of the JsonLdProcessor interface defined in the JSON-LD 1.1 API [JSON-LD11-API] to serialize output as YAML rather than JSON.

compact()
Updates step 10 of the compact() algorithm to serialize the result as YAML rather than JSON as defined in E.3.2 Conversion to YAML.
expand()
Updates step 9 of the expand() algorithm to serialize the result as YAML rather than JSON as defined in E.3.2 Conversion to YAML.
flatten()
Updates step 7 of the flatten() algorithm to serialize the result as YAML rather than JSON as defined in E.3.2 Conversion to YAML.
frame()
Updates step 22 of the frame() algorithm to serialize the result as YAML rather than JSON as defined in E.3.2 Conversion to YAML.
fromRdf()
Updates step 3 of the fromRdf() algorithm to serialize the result as YAML rather than JSON as defined in E.3.2 Conversion to YAML.
Updates the RDF to Object Conversion algorithm before step 2.6 as follows:
Otherwise, if both the useNativeTypes and extendedYAML flags are set and the datatype IRI of value is not xsd:string:
  1. If value is a language-tagged string set converted value to a new RDF literal composed of the lexical form of value and datatype IRI composed of https://www.w3.org/ns/i18n# followed by the language tag of value.
  2. Otherwise, set converted value to value.
toRdf()
Updates the Object to RDF Conversion algorithm before step 10 as follows:
  1. Otherwise, if value is an RDF literal, value is left unmodified. This will only be the case when processing a value from an extended internal representation.
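
The step added to the RDF to Object Conversion algorithm for fromRdf() might look like the sketch below. It is illustrative only: RDFLiteral and keep_as_literal are hypothetical names, and the function returns None when the added step does not apply so that the remaining steps of the original algorithm can run.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
I18N = "https://www.w3.org/ns/i18n#"


@dataclass(frozen=True)
class RDFLiteral:
    """Hypothetical RDF literal: lexical form, datatype IRI, optional language."""
    lexical_form: str
    datatype: str
    language: Optional[str] = None


def keep_as_literal(value: RDFLiteral, use_native_types: bool,
                    extended_yaml: bool) -> Optional[RDFLiteral]:
    """Return the converted value for the added step, or None if it does not apply."""
    if not (use_native_types and extended_yaml) or value.datatype == XSD_STRING:
        return None
    if value.language is not None:
        # Language-tagged string: new literal typed with the i18n namespace IRI.
        return RDFLiteral(value.lexical_form, I18N + value.language)
    # Any other literal is carried over unchanged.
    return value
```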
E.3.4.2 JsonLdOptions

The JsonLdOptions type is used to pass various options to the JsonLdProcessor methods.

WebIDL
partial dictionary JsonLdOptions {
  boolean extendedYAML = false;
};

In addition to those options defined in the JSON-LD 1.1 API [JSON-LD11-API] and JSON-LD 1.1 Framing [JSON-LD11-FRAMING], this specification defines these additional options:

extendedYAML
When used for serializing the internal representation (or extended internal representation) into a YAML representation graph:
When used for the documentLoader, it causes documents of type application/ld+yaml to be parsed into a YAML representation graph and generates an internal representation (or extended internal representation):
E.3.4.3 Remote Document and Context Retrieval

This section describes an update to the built-in LoadDocumentCallback to load YAML streams and documents into the internal representation, or into the extended internal representation if the extendedYAML API flag is true.

The LoadDocumentCallback algorithm in [JSON-LD11-API] is updated as follows:

Note

These updates are intended to be compatible with other updates to the LoadDocumentCallback, such as Process HTML as defined in [JSON-LD11-API].

E.3.4.4 YamlLdErrorCode

The YamlLdErrorCode represents the collection of valid YAML-LD error codes, which extends the JsonLdErrorCode definitions.

WebIDL
enum YamlLdErrorCode {
  "invalid-encoding",
  "mapping-key-error",
  "profile-error"
};
invalid-encoding
The character encoding of an input is invalid.
mapping-key-error
A YAML mapping key was found that was not a string.
profile-error
The parsed YAML document contains features incompatible with the specified profile.
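
A minimal sketch of how a processor might surface two of these error codes follows. The enum mirrors the WebIDL above; the YamlLdError exception and the decode_utf8 and check_string_keys helpers are hypothetical and assume the document has already been loaded into Python lists and dicts.

```python
from enum import Enum


class YamlLdErrorCode(str, Enum):
    INVALID_ENCODING = "invalid-encoding"
    MAPPING_KEY_ERROR = "mapping-key-error"
    PROFILE_ERROR = "profile-error"


class YamlLdError(Exception):
    """Hypothetical exception carrying one of the YAML-LD error codes."""

    def __init__(self, code: YamlLdErrorCode, message: str = ""):
        super().__init__(f"{code.value}: {message}" if message else code.value)
        self.code = code


def decode_utf8(data: bytes) -> str:
    """Decode input bytes, reporting invalid-encoding if they are not UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise YamlLdError(YamlLdErrorCode.INVALID_ENCODING, str(exc)) from exc


def check_string_keys(node) -> None:
    """Recursively report mapping-key-error for any non-string mapping key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if not isinstance(key, str):
                raise YamlLdError(YamlLdErrorCode.MAPPING_KEY_ERROR, repr(key))
            check_string_keys(value)
    elif isinstance(node, list):
        for item in node:
            check_string_keys(item)
```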

E.3.5 Implementations

TODO: Implementations for Extended Internal Representation.

E.4 Convert Extended YAML-LD to Basic YAML-LD and back

This approach is simpler than the Extended Internal Representation because it does not require any changes to the internal structures of existing JSON-LD libraries.

Instead, we implement two API functions:

extended_to_basic(extended_document: YAML-LD) → YAML-LD
  • Converts the document to Basic YAML-LD form
basic_to_extended(basic_document: YAML-LD) → YAML-LD
  • Converts YAML-LD → JSON
  • Performs JSON-LD expansion → the resulting JSON-LD document
  • Converts Expanded JSON-LD document back to YAML-LD
  • Converts it to the Extended form, making use of YAML-LD features to express the document more concisely.
Note
You won't typically need to perform these steps manually because libraries such as rdflib will take care of them under the covers, but it can help with troubleshooting and optimization to know what's going on. So, you start with YAML, convert it to JSON, perform JSON-LD Expansion, convert that to YAML-LD, and do any necessary basic → extended or extended → basic conversion on the YAML-LD. Alternatively, your library might do YAML-LD expansion directly on the initial YAML document, and then do any necessary basic → extended or extended → basic conversion on the YAML-LD.

Both of these functions recursively process the source document. Every branch and leaf is copied as-is unless it matches one of the following cases.

Generally, these two equalities do not hold:

When the extended → basic conversion resolves YAML tags, we no longer know where the original document used tags and where it used @type declarations. Thus, information is lost.

Both of these functions lose information about anchors and references because they're resolved by the YAML processor underlying the implementation.

                     extended_to_basic                             basic_to_extended
YAML Tags            Convert YAML !tags → @type JSON-LD keywords   (nothing)
Anchors and aliases  Resolve anchors and aliases                   (nothing)
Comments             Keep as-is                                    Remove (due to JSON-LD & Expansion)

E.4.1 YAML !tags → @type declarations

Example 12: Extended YAML-LD with tags
%TAG !xsd! http://www.w3.org/2001/XMLSchema%23
---
"@context": https://schema.org
"@id": https://github.com/gkellogg
"@type": Person
name: !xsd!string Gregg Kellogg
birthDate: !xsd!date 1970-01-01
Example 13: Basic YAML-LD
"@context":
  - "@import": https://schema.org
  - xsd: "http://www.w3.org/2001/XMLSchema#"
"@id": https://github.com/gkellogg
"@type": Person
name: Gregg Kellogg
birthDate:
  "@value": 1970-01-01
  "@type": xsd:date

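The tag handling in the two examples above can be approximated by the sketch below. It is not a definitive extended_to_basic implementation: Tagged is a hypothetical stand-in for a scalar whose tag has already been resolved to a full IRI (with "%23" decoded to "#"), only datatype tags are handled, and the function emits the full datatype IRI rather than a compact form such as xsd:date, which would additionally need the context entry shown in Example 13.

```python
from dataclasses import dataclass
from typing import Any

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Tagged:
    """Hypothetical tagged scalar: resolved tag IRI plus lexical value."""
    tag: str
    value: str


def extended_to_basic(node: Any) -> Any:
    # Branches are copied as-is, recursing into their children.
    if isinstance(node, dict):
        return {key: extended_to_basic(value) for key, value in node.items()}
    if isinstance(node, list):
        return [extended_to_basic(item) for item in node]
    if isinstance(node, Tagged):
        # An xsd:string tag adds no information, so a plain string is enough.
        if node.tag == XSD + "string":
            return node.value
        # Any other datatype tag becomes a JSON-LD value object.
        return {"@value": node.value, "@type": node.tag}
    # Untagged leaves are copied as-is.
    return node
```
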
E.4.2 &anchors and *aliases

Substitute every *alias with the content of the &anchor that the alias refers to. This is standard behavior of YAML tools and libraries.

E.5 Fragment identifiers

This section is non-normative.

Fragment identifiers used with application/ld+yaml are treated as in RDF syntaxes, as per RDF 1.1 Concepts and Abstract Syntax [RDF11-CONCEPTS] and do not follow the process defined for application/yaml.

Editor's note
Perhaps more on fragment identifiers from Issue 31.

F. References

F.1 Normative references

[I-D.ietf-httpapi-yaml-mediatypes]
YAML Media Type. Roberto Polli; Erik Wilde; Eemeli Aro. IETF. 2022-08-05. WG Document. URL: https://www.ietf.org/archive/id/draft-ietf-httpapi-yaml-mediatypes-03.html
[JSON]
The JavaScript Object Notation (JSON) Data Interchange Format. T. Bray, Ed. IETF. December 2017. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc8259
[JSON-LD11]
JSON-LD 1.1. Gregg Kellogg; Pierre-Antoine Champin; Dave Longley. W3C. 16 July 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld11/
[JSON-LD11-API]
JSON-LD 1.1 Processing Algorithms and API. Gregg Kellogg; Dave Longley; Pierre-Antoine Champin. W3C. 16 July 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld11-api/
[LINKED-DATA]
Linked Data Design Issues. Tim Berners-Lee. W3C. 27 July 2006. W3C-Internal Document. URL: https://www.w3.org/DesignIssues/LinkedData.html
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119
[RFC3986]
Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; L. Masinter. IETF. January 2005. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3986
[RFC3987]
Internationalized Resource Identifiers (IRIs). M. Duerst; M. Suignard. IETF. January 2005. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc3987
[RFC4288]
Media Type Specifications and Registration Procedures. N. Freed; J. Klensin. IETF. December 2005. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc4288
[RFC6838]
Media Type Specifications and Registration Procedures. N. Freed; J. Klensin; T. Hansen. IETF. January 2013. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc6838
[RFC6906]
The 'profile' Link Relation Type. E. Wilde. IETF. March 2013. Informational. URL: https://www.rfc-editor.org/rfc/rfc6906
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc8174
[RFC9110]
HTTP Semantics. R. Fielding, Ed.; M. Nottingham, Ed.; J. Reschke, Ed. IETF. June 2022. Internet Standard. URL: https://httpwg.org/specs/rfc9110.html
[YAML]
YAML Ain’t Markup Language (YAML™) version 1.2.2. Oren Ben-Kiki; Clark Evans; Ingy döt Net. 2021-10-01. URL: https://yaml.org/spec/1.2.2/

F.2 Informative references

[ECMASCRIPT]
ECMAScript Language Specification. Ecma International. URL: https://tc39.es/ecma262/multipage/
[INFRA]
Infra Standard. Anne van Kesteren; Domenic Denicola. WHATWG. Living Standard. URL: https://infra.spec.whatwg.org/
[JSON-LD]
JSON-LD 1.0. Manu Sporny; Gregg Kellogg; Markus Lanthaler. W3C. 3 November 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld/
[json-ld-bp]
JSON-LD Best Practices. Gregg Kellogg; Ivan Herman; BigBlueHat; A. Soroka; Ruben Taelman; David I. Lehn; Philippe Le Hegaret. W3C. 2022-05-24. W3C Group Note. URL: https://w3c.github.io/json-ld-bp/
[JSON-LD11-FRAMING]
JSON-LD 1.1 Framing. Dave Longley; Gregg Kellogg; Pierre-Antoine Champin. W3C. 16 July 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld11-framing/
[OWL2-SYNTAX]
OWL 2 Web Ontology Language Structural Specification and Functional-Style Syntax (Second Edition). Boris Motik; Peter Patel-Schneider; Bijan Parsia. W3C. 11 December 2012. W3C Recommendation. URL: https://www.w3.org/TR/owl2-syntax/
[RDF-SCHEMA]
RDF Schema 1.1. Dan Brickley; Ramanathan Guha. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/rdf-schema/
[RDF11-CONCEPTS]
RDF 1.1 Concepts and Abstract Syntax. Richard Cyganiak; David Wood; Markus Lanthaler. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/rdf11-concepts/
[rfc2045]
Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. N. Freed; N. Borenstein. IETF. November 1996. Draft Standard. URL: https://www.rfc-editor.org/rfc/rfc2045
[RFC6839]
Additional Media Type Structured Syntax Suffixes. T. Hansen; A. Melnikov. IETF. January 2013. Informational. URL: https://www.rfc-editor.org/rfc/rfc6839
[RFC7464]
JavaScript Object Notation (JSON) Text Sequences. N. Williams. IETF. February 2015. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc7464
[TURTLE]
RDF 1.1 Turtle. Eric Prud'hommeaux; Gavin Carothers. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/turtle/
[URI]
Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; L. Masinter. IETF. January 2005. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3986
[xmlschema11-2]
W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes. David Peterson; Sandy Gao; Ashok Malhotra; Michael Sperberg-McQueen; Henry Thompson; Paul V. Biron et al. W3C. 5 April 2012. W3C Recommendation. URL: https://www.w3.org/TR/xmlschema11-2/