Warning:
This wiki has been archived and is now read-only.

ISSUE-37

From Linked Data Platform

1 Linked Data Platform

This section is work towards ISSUE-37: What is the LDP data model and the LDP interaction model? Note that this issue is not about proposing alternative models or other new features; the normal WG process of opening, proposing, and resolving issues should work for that. ISSUE-37 is about gathering an alternative, informative description of the model that could be inserted into the specification as introductory language. The plan is to annotate this description with references to existing issues, or with the need to create new ones (all working towards the normal issue resolution process). Aggregation is covered by separate issues (ISSUE-34 and the like) and is not part of ISSUE-37.

Model Aspect Linked Data Platform (LDP)
End-point basics

An LDP server exposes at least one URI to a container-type resource. A client can learn about the container and its contents (members) by simply fetching the container. If only the non-member information is needed, a client can request only that. If a client needs to know whether it can create things in this container, it can use the usual HTTP methods, such as HEAD or OPTIONS, to learn which verbs are permitted on the request URI.
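The introspection step can be sketched in Python as follows. This is only an illustration: it assumes the server answers HEAD or OPTIONS with a standard HTTP Allow header, and the header value and function names used here are hypothetical rather than anything defined by LDP.

```python
def allowed_methods(allow_header: str) -> set[str]:
    """Split an HTTP Allow header value into a set of method names."""
    return {token.strip() for token in allow_header.split(",") if token.strip()}


def can_create_members(allow_header: str) -> bool:
    """A client may attempt to create members only if POST is advertised."""
    return "POST" in allowed_methods(allow_header)


# Hypothetical header value, as a server might return it for a container URI.
example_allow = "GET, HEAD, OPTIONS, POST"
print(sorted(allowed_methods(example_allow)))
print(can_create_members(example_allow))
```

A client would obtain the header value from a real HEAD or OPTIONS response; the parsing stays the same.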

Need: It might be useful to define a common property for external vocabularies to use to indicate the end-point, take for example: <subject> <ldp:endpoint> <my-LDP-Server>

Alternative: LDP provides no new facilities for web clients to acquire URLs to Containers. Web clients might use out of band knowledge, home resources, POWDER, emailing a friend a link, or any other mechanism deemed appropriate by the application's needs.

It is possible for a server to expose a single Container as a home resource, which web clients GET and examine to locate other resources (including other Containers). It is also possible for that same home resource to be serialized and provided to a client as a document. LDP assumes that each client is able to obtain whatever URLs it needs through non-LDP means, and restricts itself to an interaction model for Containers and their members.

Container basics

A container has a URI where you can start interacting with the container. At the most basic level you can GET a list of members; for containers with many members, the response is likely to be broken into pages. Through HTTP HEAD/OPTIONS introspection, a client can learn which HTTP verbs are permitted on this URI. Since LDP containers are described using RDF constructs, an LDP server may allow clients to POST the representation of a container to create it.

Need: How are the acceptable content-types communicated via HTTP headers?

Alternative: Before asking 'how' acceptable content-types are communicated, we need to agree that doing so is required (should it be in-scope for LDP - part of ISSUE-21 perhaps). In the simplest case, once the client "knows" it is dealing with a conformant LDP Container it does know (at least) that Turtle is supported.

If communicating supported content-types is in-scope, then a syntax needs to be defined (either new, or re-used). Potential re-use choices might include (and these are not necessarily mutually exclusive, although to keep clients simple it would help to have at least one required method if it is in-scope):

Aside: If we do place it in-scope, we have the same question about members, so it might be that we talk about this introspection at the LDPR level and let LDPCs "inherit" it from LDPRs. LDPRs might leave everything optional, and LDPC requires one particular method while allowing others, for example.

Creating container members (resources)

Any kind of resource can be listed in a container, and everything that is in a container is represented by a member. Members follow the standardized metadata model defined by RDF. If a client POSTs a "media resource" (pretty much anything other than RDF) to a container URI, and its media type is one that the container accepts, the server accepts the media resource and creates a member resource, which then represents the media resource when the container contents are listed. The interesting aspect of this setup is that you POST one thing, the server converts its representation to some appropriate storage, and its membership is added to the container.

email: there is only one container, and what you POST determines whether it's self-contained or not. if the POSTed resource contains all data (<content>...</content> in XML), it's what we now call "composition". if the POSTed resource contains a link to "external content" (<content src=""/> in XML), it's what we now call "aggregation".

Servers might try to be smart about populating additional data fields for the resource, for example creation date and who created it. They also might consider some properties to be "server managed", i.e. read-only from any client's point of view. ISSUE-11 questions how much the LDP spec constrains them.

Member (resource) identity

A container is a set of members and (perhaps) non-member properties, and thus contains a list of entries, one representing each individual member. The model uses a URI to represent the identity of each member resource, and this URI identifies the member globally. The membership of a resource in a container is represented using a simple RDF statement: <container> rdfs:member <member>.
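As a rough illustration of the membership statement, the following Python sketch builds membership triples and serializes them as N-Triples. The container and member URIs are made-up examples, and the helper names are hypothetical; only the rdfs:member property URI is taken from the RDF Schema vocabulary.

```python
RDFS_MEMBER = "http://www.w3.org/2000/01/rdf-schema#member"


def membership_triples(container_uri: str, member_uris: list[str]) -> list[tuple[str, str, str]]:
    """One (container, rdfs:member, member) triple per member URI."""
    return [(container_uri, RDFS_MEMBER, member) for member in member_uris]


def to_ntriples(triples: list[tuple[str, str, str]]) -> str:
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical container with two members.
container = "http://example.org/container/"
members = ["http://example.org/container/m1", "http://example.org/container/m2"]
print(to_ntriples(membership_triples(container, members)))
```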

Note: some proposals on the mailing list for ISSUE-34 include the use of different membership predicates like ldp:owns or ldp:contains in place of rdfs:member. If accepted, those predicates might cover some of the ground for ISSUE-21 (affordances).

Interacting with container members

Members are listed when GETting a container, and their data is exposed through regular RDF mechanisms. If members are editable, a client can introspect via HEAD/OPTIONS to see if PUT, PATCH or DELETE verbs are permitted. A server may provide some member data or just a reference to it in the response of a request for container information.

Interacting with containers (composition)

Bringing into existence: typically containers just exist, though some servers may allow creation of containers by POSTing a representation of a container to an already-existing container.

Removing from existence (DELETEing): containers are like any other HTTP web resource; they have a URI on which a server may accept a DELETE request. The question becomes, what happens to the members when a container is DELETEd? Based on the resolution of ISSUE-25, in order for the container to be deleted all its members must also be deleted. There have been proposals for another model of containers, which may or may not have this rule; see ISSUE-34.

Removing a member from a container: simply update the container's membership by PATCHing it with a statement to remove the membership triple. The resource still exists, but its ownership is unknown (or perhaps there is a concept of a "default" container?).

Adding an existing member to a container: potentially not allowed (ISSUE-25 resolution)

Moving a member from one container to another: potentially not allowed (ISSUE-25 resolution)

Alternative: PATCH to remove a member is not allowed; the client must DELETE the member to remove it from the container.
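The composition behaviour discussed above can be sketched with a small in-memory model. This is not a proposed implementation: the class and method names are hypothetical, the cascading delete follows one reading of the ISSUE-25 resolution, and the membership removal stands in for whatever PATCH format a server might accept.

```python
class Server:
    """Toy in-memory model of containers and members (illustrative only)."""

    def __init__(self):
        self.resources = {}   # URI -> representation
        self.membership = {}  # container URI -> set of member URIs

    def create_container(self, uri):
        self.resources[uri] = "container"
        self.membership[uri] = set()

    def post(self, container, member_uri, representation):
        """Create a member resource and record its membership."""
        self.resources[member_uri] = representation
        self.membership[container].add(member_uri)

    def remove_membership(self, container, member_uri):
        """PATCH-style removal: drop the membership triple, keep the resource."""
        self.membership[container].discard(member_uri)

    def delete_container(self, container):
        """Cascading delete: members (and nested containers) go with the container."""
        for member in self.membership.pop(container, set()):
            if member in self.membership:
                self.delete_container(member)
            else:
                self.resources.pop(member, None)
        self.resources.pop(container, None)


server = Server()
server.create_container("/c")
server.post("/c", "/c/a", "resource a")
server.post("/c", "/c/b", "resource b")
server.remove_membership("/c", "/c/a")  # /c/a survives without a container
server.delete_container("/c")           # /c/b is deleted with /c
print(sorted(server.resources))
```

Under the alternative in which PATCH removal is not allowed, `remove_membership` would simply not be offered, and a client would DELETE the member instead.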

2 AtomPub example

The AtomPub example is just used as an example of how we could describe the model. The intent is not to compare and contrast. This section will be removed in the short-term.

Model Aspect AtomPub
service basics

any atompub server exposes a "service document" that describes what the server is exposing. each service document lists a set of "workspaces", which are just a grouping construct for "collections". workspaces have no interaction semantics in atompub, there is no protocol for creating or deleting them; they just exist. each workspace lists a set of "collections", which is by far the most central construct in atompub. a collection can be listed in more than one workspace.

collection basics

a collection has a URI where you can start interacting with the collection, at the most basic level you can GET a list of members, and very likely this is somehow paginated. through its listing in a service document, the collection exposes some interaction information, such as which kind of mediatypes it will accept, and how members of the collection might use categories to be classified.

creating collection members

any kind of resource can be listed in a collection, and everything that is in a collection is represented by an entry. entries follow the standardized metadata model defined by atom, but atompub distinguishes two kinds of entries. if a client POSTs an "entry resource" (an atom entry following atom's metadata model), the server pretty much takes this entry resource and starts listing it in the collection as a member. if a client has POSTed a "media resource" (pretty much anything that's not an entry by itself, often something like an image media type), which has to be in one of the accepted media types of the collection, the server accepts this media resource and creates a "media link entry", which will then represent this media resource when you list the collection contents. servers might try to be smart about populating metadata fields in the entry and for example look at exif data to populate certain fields. the interesting aspect of this setup is that you POST one thing, and create two resources, and big media files might for example get added to a CDN and get a CDN URI, whereas the entry gets some URI under the control of the atompub server.

member identity

a collection is a representation of members, and thus contains a list of entries representing each individual member. according to the atom model, each entry MUST have an <id>, which is a URI but has no interaction semantics (specifically, best practice suggests that minting URIs that are not actionable might be a good idea). entries may have embedded content, or may link to the content they are representing. identity is established by the entry <id>, and this is particularly important in scenarios where collections may be aggregated and filtered and repurposed: entry identity must always be visible in the <id>, and thus identity can be tracked across paths where entries may get repurposed in various collections.

interacting with members

members are listed when GETting a collection, and their identity and metadata about them is exposed through regular atom mechanisms. if members are editable, an "edit" link in the entry will allow clients to update the member entry, by using this link to PUT or DELETE the entry resource. if the entry is a media link entry, then there might be an "edit-media" link in the entry, which will allow clients to update the media resource, by using this link to PUT or DELETE the media resource. this model allows clients to both interact with a media resource's metadata (the "media link entry"), and the media resource itself.
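To make the link-driven interaction concrete, here is a minimal Python sketch that extracts the "edit" and "edit-media" links from an Atom entry. The sample entry and its URIs are invented for illustration; only the Atom namespace and the two link relation names come from the Atom and AtomPub specifications.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def edit_links(entry_xml: str) -> dict[str, str]:
    """Return the href of the 'edit' and 'edit-media' links found in an Atom entry."""
    entry = ET.fromstring(entry_xml)
    links = {}
    for link in entry.findall(f"{ATOM_NS}link"):
        rel = link.get("rel")
        if rel in ("edit", "edit-media"):
            links[rel] = link.get("href")
    return links


# Hypothetical media link entry; the URIs are placeholders.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>mouse</title>
  <link rel="edit" href="http://example.org/entries/3"/>
  <link rel="edit-media" href="http://example.org/media/mouse.gif"/>
</entry>"""

print(edit_links(SAMPLE_ENTRY))
```

A client would PUT or DELETE against the "edit" URI to change the entry, and against the "edit-media" URI to change the media resource itself.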

interacting with collections

like workspaces, collections just exist, and atompub does not define how to create or delete them (i am currently working on a small addition to the spec that addresses that). also, collections have no structure, they have a URI and accept entries. this means there is no hierarchy to collections, it's a flat space.

2.1 Atom Pub example in LDP

Below is an example of what LDP looks like with an Atom-Pub model. This was sent initially to the ldp mailing list and received Erik Wilde's approval, with comments that should be worked into the summary below.

Summary:

  • only the rdfs:member relation on LDPCs, as specified in the current spec (and the minimal proposed ontology)
  • one can add links, i.e. aggregation semantics (without deletion behaviour), through an indirection to the metadata resource. This is in fact a special case of the Simple Aggregation Proposal
  • one can POST binaries, which creates intermediate metadata resources (this is the major difference from the current spec)
  • metadata about resources is shown consistently in the LDPC for each resource (not incompatible with the current spec)

currently missing from the examples:

  • use of :edit links to resource to edit them, and other variations thereof
  •  ?

To do this we are using the Atom-OWL ontology, which comes with XSLT and XQuery transforms from Atom XML to the ontology. (Below we have made some obvious simplifications to the ontology for the sake of clarity.)

2.1.1 Relation of Atom Ontology to LDP

As argued on 31 January 2010 in "Feeds and Entries", with detailed links to Atom history and ontology, the following would probably make most sense as a way of relating atom:Entry and atom:Feed to LDPRs and LDPCs:

@prefix awol: <http://bblfish.net/work/atom-owl/2006-06-06/#> .

awol:Feed rdfs:subClassOf ldp:Container .

awol:Entry rdfs:subClassOf [ 
          a rdfs:Class;
          rdfs:label "LDP_:X";
          owl:intersectionOf ( [ owl:complementOf ldp:Container ] ldp:Resource ) 
  ] .

Perhaps with serious work one could find a way of thinking of atom:Entries to be closer to LDPRs. But a number of issues would need to be worked out, such as that atom:entries can have certain types of links which containers cannot have.

2.1.2 Presupposition: an empty Container

Let us start with an empty Container at http://atom.example/ which we GET

GET / HTTP/1.1
<> a ldp:Container .

Let us ignore, for this example, other data that could be part of that container.

2.1.3 POSTing Content

Now we post an Atom entry using the AtomOWL ontology, with some obvious simplifications.

POST / HTTP/1.1
...

@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

 <> a :Entry;
    :author <http://joe.example/#i>;
    :title "Atom Powered Robots Run Amok";
    :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
    :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
    :summary "Some text.";
    :content "Some text or other content - of course the mime type of the content needs to be specified" .
HTTP/1.1 201 Created
Content-Type: text/turtle
Link: <http://bblfish.net/work/atom-owl/2006-06-06/#Entry>; rel="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
Location: <entry1>
...

Note that the Atom Publishing Protocol (RFC 5023) mixes the content type and the returned type of the content. In section 9.2.1, the 201 Created response is shown with a Content-Type of:

 Content-Type: application/atom+xml;type=entry;charset="utf-8"

Given that we don't want to make the same mistake of mixing syntax and semantics as the AtomPub protocol did, I have moved the type to a Link header. Then if, say, Twitter abandons one format (as it recently abandoned Atom XML in favor of JSON), the protocol will have no problem and will still be able to adapt.
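A client could recover the type from such a Link header along the following lines. This Python sketch is deliberately simplified and rests on assumptions: it handles only links of the form `<uri>; rel="..."` with a single quoted rel value, and the function name is hypothetical.

```python
import re

# Matches one link-value of the simplified form: <target>; rel="relation"
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def parse_link_header(value: str) -> dict[str, str]:
    """Map each rel value to its target URI (simplified Link header parsing)."""
    return {rel: target for target, rel in LINK_RE.findall(value)}


# Header value taken from the 201 Created response shown above.
header = (
    '<http://bblfish.net/work/atom-owl/2006-06-06/#Entry>; '
    'rel="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"'
)
print(parse_link_header(header).get(RDF_TYPE))
```

Because the type travels in a header rather than in the media type, the same lookup works regardless of which serialization the body uses.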

Note that this would create a new entry in the ldp:Container, of type :Entry which one could find out by doing a GET on the container:

GET / HTTP/1.1
@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

<> a ldp:Container;
   rdfs:member [ owl:sameAs <entry1>;
                 a :Entry;
                 :author  <http://joe.example/#i>;
                 :title "Atom Powered Robots Run Amok";
                 :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
                 :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
                 :summary "Some text." ] .

Notice that the ldp:Container does not show all the information: for example it does not show the content. To do that one has to GET <entry1> like this:

GET /entry1 HTTP/1.1
@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

<> a :Entry;
   :author  <http://joe.example/#i>;
   :title "Atom Powered Robots Run Amok";
   :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
   :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
   :summary "Some text.";
   :content "Some text or other content - of course the mime type of the content needs to be specified" .

So DELETEing <entry1> would also delete the rdfs:member information in the <http://atom.example/> container, along with, of course, all the metadata about it.

2.1.4 POSTing a link to another resource

Next we want to post a link to something without content

POST / HTTP/1.1
...

@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

 <> a :Entry;
    :author <http://joe.example/#i>;
    :title "picture of a cat";
    :updated "2013-01-13T18:45:23Z"^^xsd:dateTime;
    :summary "Cat with a funny hat made of bread";
    :content <http://cat.example/cat1.jpg> .

So this creates a resource <entry2> which contains exactly what was POSTed, as in the example with the content. A GET on the container now returns the following (assuming we have not deleted <entry1> yet, of course):

GET / HTTP/1.1

@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

<> a ldp:Container;
  rdfs:member [ owl:sameAs <entry1>;
                a :Entry;
                :author  <http://joe.example/#i>;
                :title "Atom Powered Robots Run Amok";
                :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
                :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
                :summary "Some text." ],
              [ owl:sameAs <entry2>;
                a :Entry;
                :author <http://joe.example/#i>;
                :id "http://atom.example/entry2";
                :title "picture of a cat";
                :updated "2013-01-13T18:45:23Z"^^xsd:dateTime;
                :summary "Cat with a funny hat made of bread";
                :content <http://cat.example/cat1.jpg> ] .

Now again DELETEing <entry2> deletes it from the container <http://atom.example/> but does not delete the content.

2.1.5 POSTing Binary Content

Here we post some binary content onto the container at <http://atom.example/>

POST / HTTP/1.1
Slug: mouse
Content-Type: image/gif
Content-Length: 1024
...

The result is meant to be that the server will create a binary and an atom entry about that published resource.

GET / HTTP/1.1
@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

<> a ldp:Container;
   rdfs:member [ owl:sameAs <entry1>;
                 a :Entry;
                 :author  <http://joe.example/#i>;
                 :title "Atom Powered Robots Run Amok";
                 :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
                 :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
                 :summary "Some text." ],
               [ owl:sameAs <entry2>;
                 a :Entry;
                 :author <http://joe.example/#i>;
                 :id "http://atom.example/entry2";
                 :title "picture of a cat";
                 :updated "2013-01-13T18:45:23Z"^^xsd:dateTime;
                 :summary "Cat with a funny hat made of bread";
                 :content <http://cat.example/cat1.jpg> ],
               [ owl:sameAs <entry3>;
                 a :Entry;
                 :author <http://joe.example/#i>;
                 :id "http://atom.example/entry3";
                 :title "mouse";
                 :updated "2013-01-13T19:10:02Z"^^xsd:dateTime;
                 :content <mouse.gif> ].  

Notice how the <entry3> is now an element of the <> Container. But this does not quite match the behavior of the previous examples: with <entry1> and <entry2> a GET on each of them would return you the content you POSTed. But with <entry3> you GET back the metadata. For more on this issue, and a proposed solution, see the mail POSTing binary data - the Atom Use Case sent on February 1, 2013.

Here DELETEing <entry3> may or may not also delete <mouse.gif>, or vice versa; I am not sure. But a GET on <entry3> would return:

GET /entry3 HTTP/1.1
@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

 <> a :Entry;
    :author <http://joe.example/#i>;
    :title "mouse";
    :updated "2013-01-13T19:10:02Z"^^xsd:dateTime;
    :content <mouse.gif> .
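The server-side effect of POSTing a binary, one request creating both a binary resource and an entry describing it, can be sketched as below. Everything here is an assumption made for illustration: the class, the way URIs are minted from the Slug header, and the small media-type table are not specified by LDP or by this proposal.

```python
# Illustrative subset of media types; a real server would know many more.
EXTENSIONS = {"image/gif": ".gif", "image/jpeg": ".jpg"}


class Container:
    """Toy container that stores binaries and the entries describing them."""

    def __init__(self):
        self.binaries = {}  # relative URI -> raw bytes
        self.entries = {}   # relative URI -> metadata about a member

    def post_binary(self, slug: str, content_type: str, body: bytes) -> str:
        """Store the binary and create a metadata entry that links to it."""
        binary_uri = slug + EXTENSIONS.get(content_type, "")
        entry_uri = f"entry{len(self.entries) + 1}"
        self.binaries[binary_uri] = body
        self.entries[entry_uri] = {"title": slug, "content": binary_uri}
        return entry_uri


container = Container()
entry_uri = container.post_binary("mouse", "image/gif", b"GIF89a")
print(entry_uri, container.entries[entry_uri])
```

A GET on the returned entry URI yields the metadata, while the bytes that were POSTed live at the binary URI, which is the asymmetry with the earlier examples noted above.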

2.1.6 Conclusion and further work

That seems to be along the lines of what it would take to include Atom as part of LDP.

But there are a number of issues when one extends this a bit:

  • Non Atom resources including RDF resources all seem to fall into the binary case above, as explained in Creating non-Atom LDPRs: AtomStrict & AtomRelax. As a result it may be better to think of Atom as just a metadata vocabulary to describe resources in the LDPC, and not to use it to wrap content POSTed to an LDPC.
  • Modelling Atom correctly seems to require N3 (as argued here), not Turtle, since Atom is a metadata format that speaks about data. To speak of other data inside a string literal requires tedious and error-prone escaping of the content. It may be then that one needs to wait for N3 to become a standard.

3 Ontological Modelling

One very good way of modelling a space is to expose the terms of the discussion as an ontology, and to define the relations the elements have to each other in a set-theoretical manner or via rules. By looking at the consequences of each statement one can better reveal implicit assumptions and ask questions much more precisely. See the Ontology page of the wiki for details.

4 Feedback / Comments

4.1 Ashok's comments

--Ashok Malhotra Dec 10, 2012

Some questions and comments:

This is a good start but the LDP Model is richer and more complex than AtomPub.

1. We need to be able to create and delete collections.

(Steve S) What makes you think this is prohibited?

2. When a collection is deleted are its members deleted also? Or is there an option?

(Steve S) This has already been resolved with ISSUE-25.

3. Can collections contain collections? In other words, are collections hierarchical?

(Steve S) The spec has no language to prevent this, therefore it is implementation dependent. One can also look at it as just how RDF works: an implementation may support this, so what do we need to say in the spec? Are there scenarios where we need to be more specific? If so, we could create an issue to address specific gaps in the spec.

4. Does each LDP model have/need a service document? If yes, perhaps collections could be created by PUT on the service document?

(Steve S) I look at what a client needs to know about a container or server: it just wants to poke a URL to see if it can do certain operations. If there are questions that can't be answered, then we can open issues for them.


5 Proposed Spec Section (for the LDP Spec)

5.1 Top Level

5.1.1 Erik Wilde

LDP is based on a simple model of containers and members. An LDP Service exposes a home resource which can be used to gain access to all containers exposed by the service. Following links from the home resource, clients can gain access to containers. Containers expose information about the containers themselves, as well as about their contents. They optionally provide mechanisms for paging. Contents of containers can be members or other containers, meaning that LDP allows containers to be nested. When getting a container's contents, clients get access to container metadata and a list of entries (members and/or containers), and get links which can be used to interact with the entries (members and/or containers) individually. A member provides access to metadata properties managed by the LDP service; it also provides access to member content, which is either embedded (and thus managed by the LDP service), or linked (and thus managed by some other service and made available at a URI that is linked from the member).

The following sections explain those concepts in more detail, and also describe how interactions are designed, and how hyperlinks can be used to use an LDP service for creating, reading, updating, and deleting LDP containers and members. When in doubt about the specifics of interactions and data models, please consult the specification text about data and interaction models, which always is the normative reference.

5.1.2 Henry Story

An LDP server serves resources (LDPRs) over HTTP. Some of these are Containers (LDPCs). Containers expose information about the containers themselves, as well as about their contents, which they refer to by the name at which each can be dereferenced and interacted with. They optionally provide mechanisms for paging. Containers can contain LDPRs, and thus other Containers. For each member, a Container should display a standard set of metadata describing relations such as its ownership, creation time, types, metadata resource, the types of interactions available on it, etc.

The following document explains those concepts in more detail, and especially how to interact with resources using the HTTP GET, PUT, POST, DELETE and PATCH methods. In order to accommodate legacy browsers that do not have access to all these HTTP verbs, factory resources are introduced that allow these limitations to be bypassed. When in doubt about the specifics of interactions and data models, please consult the specification text about data and interaction models, which always is the normative reference.


5.2 Home Resource

<< at this point in time it still is controversial whether we want to have something like that, or not. it might allow us to be a little friendlier to generic clients (providing a convenient starting point for exploration) but it certainly is not essential for LDP. >>

5.3 Containers

Containers are the most important resource type of an LDP service. Containers provide access to their entries (members and/or containers), and thus expose the most important functionality of LDP: the ability to manage sets of resources. Containers provide a variety of interaction affordances:

  • They link to their parent (a container if they have a parent container, otherwise they link to the home resource).
  • Along with listing their entries, they also include links to each entry (member and/or container), allowing clients to navigate to these entries.
  • By using HTTP POST, clients can create new entries in the container, and clients can POST entry or container resources. The LDP service issues an HTTP redirect to allow clients to discover the newly created entries.
  • By using HTTP DELETE, clients can delete containers, which results in a cascading delete following the containment hierarchy of containers.
  • By using HTTP PATCH, clients can update a container's metadata, without making any changes to the container's set of entries.

Optionally, LDP services may provide paged access to a container's entries, which is provided via additional links that are included in the container representation. These links allow clients to navigate through pages using "next" and "prev" links that provide access to the next and previous page of container entries, respectively. Notice that this mechanism is at the sole discretion of the server, and clients have no control over turning paging on and off, specifying the page size, or jumping directly to a specific page number. Paging URIs found in the "next" and "prev" links may often follow certain URI patterns, but clients SHOULD NOT make any assumptions that these patterns will be followed consistently, and thus should not attempt to "guess" URIs of specific pages.
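A client that respects this rule only ever follows the links it is given. The Python sketch below illustrates that under stated assumptions: the page structure, the page URIs, and the `fetch_page` stand-in for an HTTP GET are all hypothetical, and the page URIs are deliberately opaque to show that nothing is guessed.

```python
from typing import Callable, Iterator

# Hypothetical paged container; a real client would fetch these over HTTP.
PAGES = {
    "/container": {"members": ["/container/a", "/container/b"], "next": "/container?p=x9"},
    "/container?p=x9": {"members": ["/container/c"], "prev": "/container", "next": "/container?p=k2"},
    "/container?p=k2": {"members": ["/container/d"], "prev": "/container?p=x9"},
}


def fetch_page(uri: str) -> dict:
    """Stand-in for GETting one page of a container."""
    return PAGES[uri]


def iter_members(first_page: str, fetch: Callable[[str], dict]) -> Iterator[str]:
    """Yield all members by following server-provided 'next' links only."""
    uri = first_page
    visited = set()
    while uri is not None and uri not in visited:  # guard against link cycles
        visited.add(uri)
        page = fetch(uri)
        yield from page["members"]
        uri = page.get("next")


print(list(iter_members("/container", fetch_page)))
```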

5.4 Members