Avoid that collections break relationships
Disclaimer: The various proposals used both the term List and Collection. For consistency, the summary in this document has been harmonized to use the term Collection. It is still undecided whether Collection, List, or another term will be used in the final design.
Problem description
Let's assume we want to build a Web API that exposes information about persons and their friends. Using schema.org, the data would look somewhat like this:
</alice> a schema:Person ; schema:knows </bob>, ... </zorro> .
respectively
{ "@id": "/alice", "@type": "Person", "knows": [ "/bob", ... "/zorro" ] }
All this information would be available in the document at /alice. Depending on the number of friends, however, the document may grow too large. Web APIs typically solve that by introducing an intermediary (paged) resource such as /alice/friends/. In Hydra we have collections to facilitate that:
</alice> a schema:Person ; schema:knows </alice/friends/> .
</alice/friends/> a hydra:Collection ; hydra:member </bob>, ... </zorro> .
respectively
{ "@id": "/alice", "@type": "Person", "knows": "/alice/friends/" }
{ "@id": "/alice/friends/", "@type": "Collection", "member": [ "/bob", ... "/zorro" ] }
This works, but has two problems:
- it breaks the /alice --[knows]--> /bob relationship
- it states that /alice --[knows]--> /alice/friends/
While the first problem can easily be fixed, the second is much trickier, especially if we consider cases that don't use schema.org with its "weak semantics" but a vocabulary that uses rdfs:range, such as FOAF. In that case, the statement
</alice> foaf:knows </alice/friends/> .
and the fact that
foaf:knows rdfs:range foaf:Person .
would lead to the wrong inference that /alice/friends/ is a foaf:Person.
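The unwanted inference can be reproduced in a few lines. The following is a minimal sketch, assuming triples are plain Python tuples of strings and implementing only the RDFS range entailment rule (rdfs3); the helper name apply_range_rule is hypothetical and not part of any RDF library.

```python
# Minimal sketch of the RDFS range entailment rule (rdfs3):
#   (p rdfs:range C) and (s p o)  =>  (o rdf:type C)
# Triples are plain tuples of strings; no RDF library is assumed.

RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def apply_range_rule(triples):
    """Return the rdf:type triples entailed by rdfs:range statements."""
    # Collect every (property, class) pair declared via rdfs:range.
    ranges = {(s, o) for (s, p, o) in triples if p == RDFS_RANGE}
    inferred = set()
    for prop, cls in ranges:
        for (s, p, o) in triples:
            if p == prop:
                # The object of the property is inferred to be of class cls.
                inferred.add((o, RDF_TYPE, cls))
    return inferred


data = {
    ("/alice", "foaf:knows", "/alice/friends/"),
    ("foaf:knows", RDFS_RANGE, "foaf:Person"),
}

# The collection ends up typed as a person, which is the problem described above.
inferred = apply_range_rule(data)
print(inferred)
```

Any reasoner that honors rdfs:range would draw the same conclusion, which is why the proposals below avoid using the original property to point at the collection.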
Proposed solutions
There have been a lot of discussions, both on public-hydra and on wider lists such as public-vocabs (schema.org's mailing list). Below is a summary of the proposed solutions.
Link to the collection via a generic property
</alice> a schema:Person ; rdfs:seeAlso </alice/friends/> .
</alice/friends/> a hydra:Collection ; foaf:topic schema:knows .
respectively
{ "@id": "/alice", "@type": "Person", "seeAlso": { "@id": "/alice/friends/", "@type": "Collection", "foaf:topic": "schema:knows" } }
or with different terms such as :describedBy, schema:about or VoID:
</alice/friends/> a void:Linkset ; # or e.g. :LinkPage
  void:linkPredicate :knows .
(VoID is typically used for different datasets whereas the use cases at hand typically deal with a single dataset)
It would of course also be possible to introduce a more explicit property for this "indirection". Something like:
</alice> hydra:hasCollection </alice/friends/> .
</alice/friends/> a hydra:Collection ; foaf:topic schema:knows .
very similar to
{ "@id": "/alice", "hasCollection": { "@id": "/alice/friends/", "@type": "Collection", "manages": { "property": "schema:knows", "subject": "/alice" } } }
or
{ "@id": "/alice", "hasRelationshipIndirection": { "property": "schema:knows", "resource": "/alice/friends/" } }
or (same model, different terms) hasList, hasMany, relatesTo, relatesToMany:
{ "@id": "/alice", "hasList": { "property": "schema:knows", "object": "/alice/friends/" } }
Pros:
- clean modeling, explicit semantics (especially with more explicit properties than seeAlso)
Cons:
- bigger payloads, as the collection must not only be referenced but also partly described so that clients have enough information to decide whether to dereference it
- decreased performance: instead of a direct lookup, a query is needed to find the right collection
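The lookup cost mentioned in the last point can be illustrated with a small client-side sketch. It assumes the JSON-LD has already been parsed into Python dictionaries and that seeAlso holds a list of embedded collection descriptions. The /alice/photos/ entry and the helper find_collection are invented for illustration only.

```python
# Sketch: a client looking for the collection that holds Alice's
# schema:knows relationships when only generic rdfs:seeAlso links exist.
# The "/alice/photos/" entry is made up to show why a search is needed.

alice = {
    "@id": "/alice",
    "@type": "Person",
    "seeAlso": [
        {"@id": "/alice/photos/", "@type": "Collection",
         "foaf:topic": "schema:image"},
        {"@id": "/alice/friends/", "@type": "Collection",
         "foaf:topic": "schema:knows"},
    ],
}


def find_collection(resource, prop):
    """Scan all seeAlso targets for a Collection about the given property."""
    for target in resource.get("seeAlso", []):
        if (target.get("@type") == "Collection"
                and target.get("foaf:topic") == prop):
            return target["@id"]
    return None  # no collection describes this property


friends_url = find_collection(alice, "schema:knows")
print(friends_url)
```

With a direct property the client would read a single key. Here it has to inspect every linked resource, and each of them must carry enough description (type and topic) to be matched.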
Furthermore, in order to indicate to a client that it should look into rdfs:seeAlso etc., the API documentation (SupportedProperty or Constraint) could be augmented to express that:
property: schema:knows
managedByCollection: true -- or -- managedIndirectly: true
respectively
{ "property": "schema:knows", "managedByCollection": true }
Pros:
- increased efficiency
Cons:
- overhead / added complexity due to a larger vocabulary
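A client could use such a flag to decide up front whether a search through linked collections is worthwhile. This is a rough sketch under the assumption that the API documentation is available as a list of dictionaries shaped like the JSON above. The schema:name entry and the function is_managed_by_collection are hypothetical.

```python
# Sketch: consulting the API documentation before searching for a collection.
# Each entry mirrors the JSON shown above; "schema:name" is an invented
# example of a property that is NOT managed by a collection.

supported_properties = [
    {"property": "schema:name"},
    {"property": "schema:knows", "managedByCollection": True},
]


def is_managed_by_collection(prop, docs):
    """True if the documentation marks the property as collection-managed."""
    return any(
        entry["property"] == prop and entry.get("managedByCollection", False)
        for entry in docs
    )


# Only properties flagged in the documentation trigger the indirect lookup.
needs_lookup = is_managed_by_collection("schema:knows", supported_properties)
print(needs_lookup)
```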
To "reduce the cost of the indirection", the following solution was also discussed. It has the same problem as using the property directly on /alice, namely that the wrong inference would be made that /alice/friends/ is a person:
</alice> hydra:hasRelationshipIndirector [ foaf:knows </alice/friends/> ] .
respectively
{ "@id": "/alice", "hasRelationshipIndirector": { "foaf:knows": "/alice/friends/" } }
Use of a blank node collection member to indirectly point to the collection
</alice> a foaf:Person; schema:knows [ hydra:isMemberOf </alice/friends/> ] .
respectively
{ "@id": "/alice", "schema:knows": { "isMemberOf": "/alice/friends/" } }
Pros:
- nothing to add to the vocabulary
Cons:
- difficult to understand
- introduces an undesired triple that has to be filtered out when ingesting the data
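The filtering step mentioned in the last point might look like the following. This is only a sketch: it assumes triples are tuples of strings, that blank nodes are written with a "_:" prefix as in N-Triples, and the function name split_placeholders is made up.

```python
# Sketch: ingesting the blank-node pattern
#   </alice> schema:knows [ hydra:isMemberOf </alice/friends/> ] .
# Blank nodes are strings starting with "_:" (an assumption of this sketch).

triples = [
    ("/alice", "rdf:type", "foaf:Person"),
    ("/alice", "schema:knows", "_:b0"),
    ("_:b0", "hydra:isMemberOf", "/alice/friends/"),
]


def split_placeholders(triples):
    """Separate real statements from blank-node placeholders.

    Returns (clean, pointers): the triples to keep, and a mapping from
    (subject, property) to the collection the placeholder points to.
    """
    member_of = {s: o for (s, p, o) in triples
                 if p == "hydra:isMemberOf" and s.startswith("_:")}
    clean, pointers = [], {}
    for (s, p, o) in triples:
        if s in member_of:
            continue  # the placeholder's own description
        if o in member_of:
            # Undesired triple: drop it, but remember where the members live.
            pointers[(s, p)] = member_of[o]
            continue
        clean.append((s, p, o))
    return clean, pointers


clean, pointers = split_placeholders(triples)
print(clean, pointers)
```

Every consumer of the data would need an equivalent step, which is part of what makes this approach hard to understand.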
Use of a separate property to reference collections
:knowsCollection :collectionPropertyOf schema:knows .
where :collectionPropertyOf has the semantic condition
?collectionProperty :collectionPropertyOf ?property
?subject ?collectionProperty ?collection
?collection hydra:member ?member
imply
?subject ?property ?member
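The semantic condition reads as a simple forward-chaining rule. Below is a minimal sketch of how a client might apply it, again assuming triples are plain tuples of strings. The function expand_collection_properties and the sample data are illustrative and not part of Hydra.

```python
# Sketch of the rule:
#   ?cp :collectionPropertyOf ?p . ?s ?cp ?c . ?c hydra:member ?m
#   =>  ?s ?p ?m

def expand_collection_properties(triples):
    """Derive the direct relationships implied by collection properties."""
    triples = set(triples)
    # Map each collection property to the property it stands for.
    prop_of = {s: o for (s, p, o) in triples if p == ":collectionPropertyOf"}
    # Map each collection to its members.
    members = {}
    for (s, p, o) in triples:
        if p == "hydra:member":
            members.setdefault(s, set()).add(o)
    inferred = set()
    for (s, p, o) in triples:
        if p in prop_of:
            for member in members.get(o, ()):
                inferred.add((s, prop_of[p], member))
    return inferred


data = [
    (":knowsCollection", ":collectionPropertyOf", "schema:knows"),
    ("/alice", ":knowsCollection", "/alice/friends/"),
    ("/alice/friends/", "hydra:member", "/bob"),
    ("/alice/friends/", "hydra:member", "/zorro"),
]

direct = expand_collection_properties(data)
print(sorted(direct))
```

Applying the rule restores the /alice --[knows]--> /bob relationship without ever stating that /alice knows the collection itself.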
The direction could of course also be turned around:
schema:knows :collectionProperty :knowsCollection
Pros:
- explicit semantics
Cons:
- doubles the size of the vocabulary
- perhaps difficult to find the "collection property"/interpret it as such
There have also been suggestions to use either plural property names (colleagues vs. colleague) or specific URL templates (/{property}/collection) to reference the collection instead of its members:
</alice> a schema:Person; schema:colleagues </alice/friends/> ; schema:colleague </bob>, ...
</alice/friends/> a hydra:Collection; hydra:member </bob>, ...
or
</alice> schema:knows/collection </alice/friends/>
Pros:
- simple
Cons:
- error prone (plural vs. singular)
- violates the URI opacity principle (template)
- effectively doubles the size of the vocabulary
Use of an operation with an explicitly defined target
</alice> a foaf:Person ; hydra:supportedOperation [ a GetRelatedCollectionOperation; hydra:title "Get known relations"; hydra:method "GET"; hydra:uri </alice/friends/>; hydra:property schema:knows; ] .
Pros:
- more or less explicit semantics
Cons:
- operations are generally used for state-changing (unsafe) interactions; navigation isn't one
- the introduction of hydra:uri might motivate people to hardcode the information into their clients