Difference between revisions of "ISSUE-37"


Revision as of 08:08, 31 January 2013

1 Linked Data Platform

This section is work towards ISSUE-37: What is the LDP data model and the LDP interaction model? Note that this issue is not about proposing alternative models or other new features; the normal WG process of opening, proposing, and resolving issues should work for that. ISSUE-37 is about gathering an alternative informative description of the model that could be inserted into the specification as introductory, informative language. The plan is to insert into this description of the model references to existing issues, or to note the need to create new ones (all working towards the normal issue resolution process). There are separate issues (ISSUE-34 and the like) for aggregation, which ISSUE-37 is not about.

Model Aspect Linked Data Platform (LDP)
End-point basics

An LDP server exposes at least one URI to a container-type resource. A client can learn about the container and its contents (members) by simply fetching the container. If only the non-member information is needed, a client can request only that. If a client needs to know whether it can create things in this container, it can use the usual HTTP methods, such as HEAD or OPTIONS, to learn which verbs are permitted on the request URI.
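As a minimal sketch of the introspection step described above, the Python below sends an OPTIONS request and reads the standard HTTP Allow header. The function names and the host passed to permitted_methods are hypothetical, and nothing here is mandated by the LDP drafts; it only illustrates one way a client might check whether POST is advertised.

```python
import http.client


def parse_allow(header_value):
    """Split an HTTP Allow header value into a set of upper-case method names."""
    if not header_value:
        return set()
    return {token.strip().upper() for token in header_value.split(",") if token.strip()}


def permitted_methods(host, path="/"):
    """Send OPTIONS to a (hypothetical) LDP server and return the advertised methods."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("OPTIONS", path)
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return parse_allow(response.getheader("Allow"))
    finally:
        conn.close()


def can_create_members(methods):
    """Assumption: a client attempts member creation only if POST is advertised."""
    return "POST" in methods
```

A client would call permitted_methods with the host and path of a container it already knows about, then pass the result to can_create_members before attempting a POST.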

Need: It might be useful to define a common property for external vocabularies to use to indicate the end-point, take for example: <subject> <ldp:endpoint> <my-LDP-Server>

Alternative: LDP provides no new facilities for web clients to acquire URLs to Containers. Web clients might use out of band knowledge, home resources, POWDER, emailing a friend a link, or any other mechanism deemed appropriate by the application's needs.

It is possible for a server to expose a single Container as a home resource, which web clients GET and examine to locate other resources (including other Containers). It is also possible for that same home resource to be serialized and provided to a client as a document. LDP assumes that each client is able to obtain whatever URLs it needs through non-LDP means, and restricts itself to an interaction model for Containers and their members.

Container basics

A container has a URI where you can start interacting with it. At the most basic level you can GET a list of members; for containers with many members, the response is likely to be broken into pages. Through HTTP HEAD/OPTIONS introspection, a client can learn which HTTP verbs are permitted on this URI. Since LDP containers are described using RDF constructs, an LDP server may allow clients to POST the representation of a container to create it.

Need: How are the acceptable content-types communicated via HTTP headers?

Alternative: Before asking 'how' acceptable content-types are communicated, we need to agree that doing so is required (should it be in-scope for LDP - part of ISSUE-21 perhaps). In the simplest case, once the client "knows" it is dealing with a conformant LDP Container it does know (at least) that Turtle is supported.

If communicating supported content-types is in-scope, then a syntax needs to be defined (either new, or re-used). Potential re-use choices might include (and these are not necessarily mutually exclusive, although to keep clients simple it would help to have at least one required method if it is in-scope):

Aside: If we do place it in-scope, we have the same question about members, so it might be that we talk about this introspection at the LDPR level and let LDPCs "inherit" it from LDPRs. LDPRs might leave everything optional, and LDPC requires one particular method while allowing others, for example.

Creating container members (resources)

Any kind of resource can be listed in a container, and everything that is in a container is represented by a member. Members follow the standardized metadata model defined by RDF. If a client has POSTed a "media resource" (pretty much anything other than RDF) to a container URI, which has one of the media types that the container accepts, the server accepts this media resource and creates the resource, which will then represent this media resource when you list the container contents. The interesting aspect of this setup is that you POST one thing, its representation gets converted by the server to some appropriate storage and its membership added to the container.

email: there is only one container, and what you POST determines whether it's self-contained or not. if the POSTed resource contains all data (<content>...</content> in XML), it's what we now call "composition". if the POSTed resource contains a link to "external content" (<content src=""/> in XML), it's what we now call "aggregation".

Servers might try to be smart about populating additional data fields for the resource, for example creation date and who created it. They also might consider some properties to be "server managed", i.e. read-only from any client's point of view. ISSUE-11 questions how much the LDP spec constrains them.

Member (resource) identity

A container is a set of members and (perhaps) non-member properties, and thus contains a list of entries, one representing each individual member. The model uses a URI to represent the identity of each member resource, which identifies it globally. The identification of a member as an entry is represented using a simple RDF statement: <container> rdfs:member <member>.

Note: some proposals on the mailing list for ISSUE-34 include the use of different membership predicates like ldp:owns or ldp:contains in place of rdfs:member. If accepted, those predicates might cover some of the ground for ISSUE-21 (affordances).

Interacting with container members

Members are listed when GETting a container, and their data is exposed through regular RDF mechanisms. If members are editable, a client can introspect via HEAD/OPTIONS to see if PUT, PATCH or DELETE verbs are permitted. A server may provide some member data or just a reference to it in the response of a request for container information.

Interacting with containers (composition)

Bringing into existence: typically containers just exist, though some servers may allow creation of containers by POSTing a representation of a container to an already-existing container.

Removing from existence (DELETEing): containers are like any other HTTP web resource, they have a URI which a server may accept a DELETE request on. The question becomes, what happens to the members when a container is DELETEd? Based on resolution for ISSUE-25, in order for the container to be deleted all its members must also be deleted. There have been proposals for another model of containers, which may or may not have this rule, see ISSUE-34.

Removing a member from a container: simply update the container's membership by PATCHing it with a statement to remove the membership triple. The resource still exists, its ownership is unknown (or perhaps a concept of a "default" container?).

Adding an existing member to a container: potentially not allowed (ISSUE-25 resolution)

Moving a member from one container to another: potentially not allowed (ISSUE-25 resolution)

Alternative: PATCH to remove a member is not allowed; the client must DELETE the member to remove it from the container.
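The behaviours listed in this section can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the class ToyStore and its method names are invented here, membership is held as plain (container, rdfs:member, member) tuples, and the delete rule follows the composition reading of ISSUE-25 given above rather than any final spec text.

```python
RDFS_MEMBER = "http://www.w3.org/2000/01/rdf-schema#member"


class ToyStore:
    """Hypothetical stand-in for a server: resource states plus membership triples."""

    def __init__(self):
        self.resources = {}   # URI -> arbitrary resource state
        self.triples = set()  # (container URI, predicate, member URI)

    def create_container(self, container):
        self.resources[container] = "container"

    def create_member(self, container, member, state=None):
        """Models POSTing to a container: a new resource plus a membership triple."""
        self.resources[member] = state
        self.triples.add((container, RDFS_MEMBER, member))

    def members(self, container):
        return {o for (s, p, o) in self.triples if s == container and p == RDFS_MEMBER}

    def remove_membership(self, container, member):
        """Models the PATCH variant: only the membership triple is removed."""
        self.triples.discard((container, RDFS_MEMBER, member))

    def delete_member(self, member):
        """Models DELETE on a member: the resource and its membership triples go."""
        self.resources.pop(member, None)
        self.triples = {t for t in self.triples if t[2] != member}

    def delete_container(self, container):
        """Models the composition rule: members are deleted with the container."""
        for member in self.members(container):
            self.delete_member(member)
        self.resources.pop(container, None)
```

In this sketch a member whose membership triple was removed by remove_membership survives a later delete_container, which is exactly the "ownership is unknown" situation noted above; the alternative that forbids PATCH removal would simply drop remove_membership.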

2 AtomPub example

The AtomPub example is just used as an example of how we could describe the model. The intent is not to compare and contrast. This section will be removed in the short-term.

Model Aspect AtomPub
service basics

any atompub server exposes a "service document" that describes what the server is exposing. each service document lists a set of "workspaces", which are just a grouping construct for "collections". workspaces have no interaction semantics in atompub, there is no protocol for creating or deleting them; they just exist. each workspace lists a set of "collections", which is by far the most central construct in atompub. a collection can be listed in more than one workspace.

collection basics

a collection has a URI where you can start interacting with the collection, at the most basic level you can GET a list of members, and very likely this is somehow paginated. through its listing in a service document, the collection exposes some interaction information, such as which kind of mediatypes it will accept, and how members of the collection might use categories to be classified.

creating collection members

any kind of resource can be listed in a collection, and everything that is in a collection is represented by an entry. entries follow the standardized metadata model defined by atom, but atompub distinguishes two kinds of entries. if a client POSTs an "entry resource" (an atom entry following atom's metadata model), the server pretty much takes this entry resource and starts listing it in the collection as a member. if a client has POSTed a "media resource" (pretty much anything that's not an entry by itself, often something like an image media type), which has to be in one of the accepted media types of the collection, the server accepts this media resource and creates a "media link entry", which will then represent this media resource when you list the collection contents. servers might try to be smart about populating metadata fields in the entry and for example look at exif data to populate certain fields. the interesting aspect of this setup is that you POST one thing, and create two resources, and big media files might for example get added to a CDN and get a CDN URI, whereas the entry gets some URI under the control of the atompub server.

member identity

a collection is a representation of members, and thus contains a list of entries representing each individual member. according to the atom model, each entry MUST have an <id>, which is a URI but has no interaction semantics (specifically, best practice suggests that minting URIs that are not actionable might be a good idea). entries may have embedded content, or may link to the content they are representing. identity is established by the entry <id>, and this is particularly important in scenarios where collections may be aggregated and filtered and repurposed: entry identity must always be visible in the <id>, and thus identity can be tracked across paths where entries may get repurposed in various collections.

interacting with members

members are listed when GETting a collection, and their identity and metadata about them is exposed through regular atom mechanisms. if members are editable, an "edit" link in the entry will allow clients to update the member entry, by using this link to PUT or DELETE the entry resource. if the entry is a media link entry, then there might be an "edit-media" link in the entry, which will allow clients to update the media resource, by using this link to PUT or DELETE the media resource. this model allows clients to both interact with a media resource's metadata (the "media link entry"), and the media resource itself.

interacting with collections

like workspaces, collections just exist, and atompub does not define how to create or delete them (i am currently working on a small addition to the spec that addresses that). also, collections have no structure, they have a URI and accept entries. this means there is no hierarchy to collections, it's a flat space.

2.1 Atom Pub example in LDP

Below is an example of what LDP looks like with an Atom-Pub model. This was sent initially to the ldp mailing list and received Erik Wilde's approval, with comments that should be worked into the summary below.

It satisfies the following properties:

  • only rdfs:member for containers as specified in the current spec (and the minimal proposed ontology)
  • one can add a light weight relation ( i.e. without deletion behaviour ). This is in fact a special case of Simple Aggregation Proposal
  • one can POST binaries which create intermediate metadata resources. This is the major difference from the current spec.

To do this we are using the Atom-OWL ontology which comes with XSLTs and XQuery transforms from atom xml to the ontology. ( Below we have made some obvious simplification to the ontology for the sake of clarity ).

2.1.1 Presupposition: an empty Container

Let us start with an empty Container at http://atom.example/ which we GET

GET / HTTP/1.1
<> a ldp:Container .

For this example, let us ignore any other data that could be part of that container.

2.1.2 POSTing Content

Now we post an Atom entry using the AtomOWL ontology, with some obvious simplifications.

POST / HTTP/1.1
...

@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

 <> a :Entry;
    :author <http://joe.example/#i>;
    :title "Atom Powered Robots Run Amok";
    :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
    :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
    :summary "Some text.";
    :content "Some text or other content - of course the mime type of the content needs to be specified" .
HTTP/1.1 201 Created
Content-Type: text/turtle
Link: <http://bblfish.net/work/atom-owl/2006-06-06/#Entry>; rel="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
Location: http://atom.example/entry1
...

Note that the Atom Publishing Protocol (RFC 5023) mixes the media type with the type of the returned content. Section 9.2.1 shows a 201 Created response with a Content-Type of:

 Content-Type: application/atom+xml;type=entry;charset="utf-8"

Given that we don't want to make the same mistake of mixing syntax and semantics as the AtomPub protocol did, I have moved the type to a Link header. Then if, say, Twitter abandons one format, as it recently abandoned Atom XML in favor of JSON, the protocol will have no problem and will still be able to adapt.
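To make the separation concrete, here is a small sketch of how a client might read the two headers independently: the media type from Content-Type, and the RDF type from the Link header. The function names are hypothetical, and the regular expression only handles the simple shape used in the response above (a single rel parameter directly after the target); it is not a general Link header parser.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches <target>; rel="relation types" as written in the example response.
_LINK = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def media_type(content_type):
    """Return the bare media type, ignoring parameters such as type= or charset=."""
    return content_type.split(";")[0].strip().lower()


def rdf_types_from_link(link_header):
    """Return the targets of all links whose relation is rdf:type."""
    return [
        target
        for target, rel in _LINK.findall(link_header)
        if RDF_TYPE in rel.split()
    ]
```

With this split, a change of serialization only changes what media_type returns, while the result of rdf_types_from_link stays the same.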

Note that this would create a new entry in the ldp:Container, of type :Entry which one could find out by doing a GET on the container:

GET / HTTP/1.1
@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

<> a ldp:Container;
   rdfs:member [ owl:sameAs </entry1>;
                 a :Entry;
                 :author  <http://joe.example/#i>;
                 :title "Atom Powered Robots Run Amok";
                 :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
                 :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
                 :summary "Some text." ] .

Notice that the ldp:Container does not show all the information: for example it does not show the content. To do that one has to GET </entry1> like this:

GET /entry1 HTTP/1.1
@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

<> a :Entry;
   :author  <http://joe.example/#i>;
   :title "Atom Powered Robots Run Amok";
   :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
   :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
   :summary "Some text.";
   :content "Some text or other content - of course the mime type of the content needs to be specified" .

So a DELETE on </entry1> would also delete the rdfs:member information in the <http://atom.example/> container, along with all the metadata about it.

2.1.3 POSTing a link to another resource

Next we want to post a link to something without content

POST / HTTP/1.1
...

@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

 <> a :Entry;
    :author <http://joe.example/#i>;
    :title "picture of a cat";
    :updated "2013-01-13T18:45:23Z"^^xsd:dateTime;
    :summary "Cat with a funny hat made of bread";
    :content <http://cat.example/cat1.jpg> .

So this creates a resource <entry2> which contains exactly what was POSTed, as in the example with the content. A GET on the container now returns the following (assuming we have not deleted </entry1> yet, of course):

GET / HTTP/1.1

@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

<> a ldp:Container;
  rdfs:member [ owl:sameAs </entry1>;
                a :Entry;
                :author  <http://joe.example/#i>;
                :title "Atom Powered Robots Run Amok";
                :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
                :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
                :summary "Some text." ],
              [ owl:sameAs </entry2>;
                a :Entry;
                :author <http://joe.example/#i>;
                :id "http://atom.example/entry2";
                :title "picture of a cat";
                :updated "2013-01-13T18:45:23Z"^^xsd:dateTime;
                :summary "Cat with a funny hat made of bread";
                :content <http://cat.example/cat1.jpg> ] .

Now again deleting </entry2> deletes it from the container <http://atom.example/> but does not delete the content.

2.1.4 POSTing Binary Content

So here we post some binary content onto the container at <http://atom.example/>

POST / HTTP/1.1
Slug: mouse
Content-Type: image/gif
Content-Length: 1024
...

The result is meant to be that the server will create a binary and an atom entry about that published resource.

GET / HTTP/1.1
@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

<> a ldp:Container;
   rdfs:member [ owl:sameAs <entry1>;
                 a :Entry;
                 :author  <http://joe.example/#i>;
                 :title "Atom Powered Robots Run Amok";
                 :id "urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a"^^xsd:anyURI;
                 :updated "2013-01-13T18:30:02Z"^^xsd:dateTime;
                 :summary "Some text." ],
               [ owl:sameAs <entry2>;
                 a :Entry;
                 :author <http://joe.example/#i>;
                 :id "http://atom.example/entry2";
                 :title "picture of a cat";
                 :updated "2013-01-13T18:45:23Z"^^xsd:dateTime;
                 :summary "Cat with a funny hat made of bread";
                 :content <http://cat.example/cat1.jpg> ],
               [ owl:sameAs <entry3>;
                 a :Entry;
                 :author <http://joe.example/#i>;
                 :id "http://atom.example/entry3";
                 :title "mouse";
                 :updated "2013-01-13T19:10:02Z"^^xsd:dateTime;
                 :content <mouse.gif> ].  

Here DELETEing <entry3> may or may not also delete <mouse.gif>, or vice versa. I am not sure. But a GET on <entry3> would return

GET /entry3 HTTP/1.1
@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

 <> a :Entry;
    :author <http://joe.example/#i>;
    :title "mouse";
    :updated "2013-01-13T19:10:02Z"^^xsd:dateTime;
    :content <mouse.gif> .

Ok, so that seems to be along the lines of what it would take to include Atom as part of LDP.
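The three POST cases walked through above can be summarised in a toy in-memory sketch. Everything named here (ToyAtomContainer, Link, the entryN URI scheme, the way the file extension is derived from the media type) is an assumption made for illustration and is not taken from the LDP drafts or from AtomPub. In particular, the choice to keep the binary when its entry is deleted is arbitrary, since the text above leaves that question open.

```python
import itertools


class Link(str):
    """Marks a content value as a URI reference rather than inline text."""


class ToyAtomContainer:
    """Hypothetical in-memory model of the container at http://atom.example/."""

    def __init__(self, base="http://atom.example/"):
        self.base = base
        self.entries = {}   # entry URI -> dict of entry properties
        self.binaries = {}  # binary URI -> (media type, bytes)
        self._ids = itertools.count(1)

    def post_entry(self, properties):
        """POST of an entry (inline content or a link); returns the new entry URI."""
        uri = f"{self.base}entry{next(self._ids)}"
        self.entries[uri] = dict(properties)
        return uri  # a server would send this in the Location header of the 201

    def post_binary(self, slug, media_type, data):
        """POST of a binary: store it and mint a metadata entry pointing at it."""
        extension = media_type.split("/")[-1]
        binary_uri = f"{self.base}{slug}.{extension}"
        self.binaries[binary_uri] = (media_type, data)
        return self.post_entry({"title": slug, "content": Link(binary_uri)})

    def listing(self):
        """Container view: all entries, with inline (non-link) content left out."""
        view = {}
        for uri, props in self.entries.items():
            shown = dict(props)
            if "content" in shown and not isinstance(shown["content"], Link):
                del shown["content"]
            view[uri] = shown
        return view

    def delete_entry(self, uri):
        """Remove the entry and its membership; linked or binary content is kept."""
        self.entries.pop(uri, None)
```

Posting an entry with inline text, an entry with a Link to <http://cat.example/cat1.jpg>, and a binary with the slug "mouse" yields entries entry1 to entry3 in this model, and listing() omits only the inline content of the first one.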

3 Ontological Modelling

One very good way of modelling a space is to expose the terms of the discussion as an ontology, and to define the relations the elements have to each other in a set-theoretical manner or via rules. By looking at the consequences of each statement one can better reveal implicit assumptions and ask questions much more precisely. See the Ontology page of the wiki for details.

4 Feedback / Comments

4.1 Ashok's comments

--Ashok Malhotra Dec 10, 2012

Some questions and comments:

This is a good start but the LDP Model is richer and more complex than AtomPub.

1. We need to be able to create and delete collections.

(Steve S) What makes you think this is prohibited?

2. When a collection is deleted are its members deleted also? Or is there an option?

(Steve S) This has already been resolved with ISSUE-25.

3. Can collections contain collections? In other words, are collections hierarchical?

(Steve S) The spec has no language to prevent this, therefore it is implementation-dependent. One can also look at it as just how RDF works: an implementation may support this, so what do we need to say in the spec? Are there scenarios where we need to be more specific? If so, we could create an issue to address specific gaps in the spec.

4. Does each LDP model have/need a service document? If yes, perhaps collections could be created by PUT on the service document?

(Steve S) I look at what a client needs to know about a container or server: it just wants to poke a URL to see if it can do certain operations. If there are questions that can't be answered, then we can open issues for them.