Use Cases And Requirements

1 Linked Data Platform Use Cases And Requirements

This is a working document used to collect use cases and requirements for consideration by the WG. The starting point comes from Linked Data Basic Profile Use Cases and Requirements.

1.1 Scope and Motivation

Linked Data was defined by Tim Berners-Lee with the following four rules [1]:

  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names
  3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
  4. Include links to other URIs, so that they can discover more things

These four rules have proven very effective in guiding and inspiring people to publish Linked Data on the web. The amount of data, especially public data, available on the web has grown rapidly, and an impressive number of extremely creative and useful “mashups” have been created using this data as a result.

There has been much less focus on the potential of Linked Data as a model for managing data on the web - the majority of the Application Programming Interfaces (APIs) available on the Internet for creating and updating data follow a Remote Procedure Call (RPC) model rather than a Linked Data model.

If Linked Data were just another model for doing something that RPC models can already do, it would be of only marginal interest. Interest in Linked Data arises from the fact that applications with an interface defined using Linked Data can be much more easily and seamlessly integrated with each other than applications that offer an RPC interface. In many problem domains, the most important problems and the greatest value are found not in the implementation of new applications, but in the successful integration of multiple applications into larger systems.

Some of the features that make Linked Data exceptionally well suited for integration include:

  • A single interface – defined by a common set of HTTP methods – that is universally understood and is constant across all applications. This is in contrast with the RPC architecture where each application has a unique interface that has to be learned and coded to.
  • A universal addressing scheme – provided by HTTP URLs – for both identifying and accessing all “entities”. This is in contrast with the RPC architecture where there is no uniform way to either identify or access data.
  • A simple yet extensible data model – provided by RDF – for describing data about a resource in a way which doesn’t require prior knowledge of the vocabulary being used.

Experience implementing applications and integrating them using Linked Data has shown very promising results, but has also demonstrated that the original four rules defined by Tim Berners-Lee for Linked Data are not sufficient to guide and constrain a writable Linked Data API. As was the case with the original four rules, the need generally is not for the invention of fundamental new technologies, but rather for a series of additional rules and patterns that guide and constrain the use of existing technologies in the construction of a Basic Profile for Linked Data to achieve interoperability.

The following list illustrates a few of the issues that require additional rules and patterns:

  • What URLs do I post to in order to create new resources?
  • How do I get lists of existing resources, and how do I get basic information about them without having to access each one?
  • How should I detect and deal with race conditions on write?
  • What media-types/representations should I use?
  • What standard vocabularies should I use?
  • What primitive data types should I use?

A good goal for the Basic Profile for Linked Data would be to specify what is required to define a writable Linked Data API equivalent to the simple application APIs that are often written on the web today using the Atom Publishing Protocol (APP). APP shares some characteristics with Linked Data, such as the use of HTTP and URLs. One difference is that Linked Data relies on a flexible data model with RDF, which allows for multiple representations.

1.2 Organization of this Document

This document is organized as follows:

  • User Stories capture statements about system requirements written from a user or application perspective. They are typically lightweight and informal and can run from one line to a paragraph or two (sometimes described as an 'epic') [2]. Analysis of each user story will reveal a number of (functional) use-cases and other non-functional requirements. See Device API Access Control Use Cases and Requirements for a good example of user stories and their analysis.
  • Use Cases are used to capture and model functional requirements. Use cases describe the system’s behavior under various conditions [3], cataloguing who does what with the system, for what purpose, but without concern for system design or implementation [4]. Each use case is identified by a reference number to aid cross-reference from other documentation; use-case indexing in this document is based on rdb2rdf use-cases. A variety of styles may be used to capture use-cases, from a simple narrative to a structured description with actors, pre/post conditions, and step-by-step behaviours as in POWDER: Use Cases and Requirements, and non-functional requirements raised by the use-case. Use cases act like the hub of a wheel, with spokes supporting requirements analysis, scenario-based evaluation, testing, and integration with non-functional, or quality requirements.
  • Use Case Scenarios are more focused still, representing a single instance of a use case in action. Scenarios may be lightweight narratives as seen in Use cases and requirements for Media Fragments, or modeled as interaction diagrams. Each scenario may lead to the development of one or more test cases.

1.3 User Stories

1.3.1 Maintaining Social Contact Information

Many of us have multiple email accounts that include information about the people and organizations we interact with – names, email addresses, telephone numbers, instant messenger identities and so on. When someone’s email address or telephone number changes (or they acquire a new one), our lives would be much simpler if we could update that information in one spot and all copies of it would automatically be updated. In other words, those copies would all be linked to some definition of “the contact.” There might also be good reasons (like off-line email addressing) to maintain a local copy of the contact, but ideally any copies would still be linked to some central “master.”

Agreeing on a format for “the contact” is not enough, however. Even if all our email providers agreed on the format of a contact, we would still need to use each provider’s custom interface to update or replace the provider’s copy, or we would have to agree on a way for each email provider to link to the “master”. If we look outside our own personal interests, it would be even more useful if the person or organization exposed their own contact information so we could link to it.

What would work in either case is a common understanding of the resource, a small set of formats, and access guidance for these resources. This would cover how to acquire a link to a contact; how to use those links to interact with a contact (including reading, updating, and deleting it); how to easily create a new contact and add it to my contacts; and, when deleting a contact, how it would be removed from my list of contacts. It would also be good to be able to add some application-specific data about my contacts that the original design didn’t consider. Ideally we’d like to eliminate multiple copies of contacts, but additional valuable information about my contacts may be stored on separate servers, so we need a simple way to link this information back to the contacts. Regardless of whether a contact collection is my own, shared by an organization, or all contacts known to an email provider (or to a single email account at an email provider), it would be nice if they all worked pretty much the same way.

1.3.2 Keeping Track of Personal and Business Relationships

In our daily lives, we deal with many different organizations in many different relationships, and they each have data about us. However, it is unlikely that any one organization has all the information about us. Each of them typically gives us access to the information (at least some of it), many through websites where we are uniquely identified by some string – an account number, user ID, and so on. We have to use their applications to interact with the data about us, however, and we have to use their identifier(s) for us. If we want to build any semblance of a holistic picture of ourselves (more accurately, collect all the data about us that they externalize), we as humans must use their custom applications to find the data, copy it, and organize it to suit our needs.

Would it not be simpler if at least the Web-addressable portion of that data could be linked to consistently, so that instead of maintaining various identifiers in different formats and instead of having to manually supply those identifiers to each one’s corresponding custom application, we could essentially build a set of bookmarks to it all? When we want to examine or change their contents, would it not be simpler if there were a single consistent application interface that they all supported? Of course it would.

Our set of links would probably be a simple collection. The information held by any single organization might be a mix of simple data and collections of other data, for example, a bank account balance and a collection of historical transactions. Our bank might easily have a collection of accounts for each of its collection of customers.

1.3.3 System and Software Development Tool Integration

System and software development tools typically come from a diverse set of vendors and are built on various architectures and technologies. These tools are purpose-built to meet the needs of a specific domain scenario (modeling, design, requirements and so on). Often tool vendors view integrations with other tools as a necessary evil rather than as additional value for their end-users. Even more of an afterthought is how these tools’ data -- such as people, projects, customer-reported problems and needs -- integrate and relate to corporate and external applications that manage data such as customers, business priorities and market trends. The problem can be isolated by standardizing on a small set of tools or a set of tools from a single vendor, but this rarely occurs, and if it does, it usually does so only within small organizations. As these organizations grow both in size and complexity, they need to work with outsourced development and with diverse internal organizations that have their own sets of tools and processes. There is a need for better support of more complete business processes (system and software development processes) that span the roles, tasks, and data addressed by multiple tools. This demand has existed for many years, and the tools vendor industry has tried several different architectural approaches to address the problem. Here are a few:

  • Implement an API for each application, and then, in each application, implement “glue code” that exploits the APIs of other applications to link them together.
  • Design a single database to store the data of multiple applications, and implement each of the applications against this database. In the software development tools business, these databases are often called “repositories.”
  • Implement a central “hub” or “bus” that orchestrates the broader business process by exploiting the APIs described previously.

It is fair to say that although each of those approaches has its adherents and can point to some successes, none of them is wholly satisfactory. The use of Linked Data as an application integration technology has a strong appeal. See OSLC.

1.3.4 Library Linked Data

The W3C Library Linked Data working group cites a number of use cases in their Use Case Report (LLD-UC). These referenced use cases focus on the need to extract and correlate library data from disparate sources. Variants of these use cases that provide consistent formats, as well as ways to improve or update the data, would enable simplified methods both for efficiently sharing this data and for producing incremental updates without the need for repeated full extractions and imports of data.

The 'Digital Objects Cluster' in LLD-UC contains a number of relevant use-cases:

  • Grouping: This should "Allow the end-users to define groups of resources on the web that for some reason belong together. The relationship that exists between the resources is often left unspecified. Some of the resources in a group may not be under control of the institution that defines the groups."
  • Enrichment: "Enable end-users to link resources together."
  • Re-use: "Users should have the capability to re-use all or parts of a collection, with all or part of its metadata, elsewhere on the linked Web."

The 'Collections' cluster also contains a number of relevant use-cases:

  • Collections discovery: "Enable innovative collection discovery such as identification of nearest location of a physical collection where a specific information resource is found or mobile device applications ... based on collection-level descriptions." The LDP does not define how clients discover LDPCs.
  • Community information services: Identify and classify collections of special interest to the community.

1.3.5 Municipality Operational Monitoring

Across cities, towns, counties, and other municipalities, there is a growing number of services managed and run by the municipality that produce and consume a vast amount of information. This information is used to help monitor services, predict problems, and handle logistics. In order to effectively and efficiently collect, produce, and analyze all this data, a fundamental set of loosely coupled standard data sources is needed. A simple, low-cost way to expose data from the diverse set of monitored services is needed, one that can easily integrate into the municipality's other systems that inspect and analyze the data. All these services have links and dependencies on other data and services, so having a simple and scalable linking model is key.

1.3.6 Healthcare

For physicians to analyze, diagnose, and propose treatment for patients requires a vast amount of complex, changing and growing knowledge. This knowledge needs to come from a number of sources, including physicians’ own subject knowledge, consultation with their network of other healthcare professionals, public health sources, food and drug regulators, and other repositories of medical research and recommendations.

To diagnose a patient’s condition requires current data on the patient’s medications and medical history. In addition, recent pharmaceutical advisories about these medications are linked into the patient’s data. If the patient experiences adverse effects from medications, physicians need to publish information about this to an appropriate regulatory source. Other medical professionals require access to both validated and emerging effects of the medication. Similarly, if there are geographical patterns around outbreaks that allow awareness of new symptoms and treatments, this information needs to quickly reach a very distributed and diverse set of medical information systems. Also, reporting back to these regulatory agencies regarding new occurrences of an outbreak, including additional details of symptoms and causes, is critical in producing the most effective treatment for future incidents.

1.3.7 Metadata enrichment in broadcasting

There are many different use cases in which broadcasters show interest in metadata enrichment:

  • enrich archive or news metadata by linking facts, events, locations and personalities
  • enrich metadata generated by automatic extraction tools such as person identification, etc.
  • enrich definitions of terms in classification schemes or enumeration lists

This comes in support of more effective information management and data/content mining (if you can't find your content, it's as if you don't have it and must either recreate or acquire it again, which is not financially effective).

However, there is a need for solutions facilitating linkage to other data sources and taking care of issues such as discovery, automation, disambiguation, etc. Other important issues that broadcasters would face are the editorial quality of the linked data, its persistence, and usage rights.

1.3.8 Aggregation and Mashups of Infrastructure Data

For infrastructure management (such as storage systems, virtual machine environments, and similar IaaS and PaaS concepts), it is important to provide an environment in which information from different sources can be aggregated, filtered, and visualized effectively. Specifically, the following use cases need to be taken into account:

  • While some data sources are based on Linked Data, others are not, and aggregation and mashups must work across these different sources.
  • Consumers of the data sources and aggregated/filtered data streams are not necessarily implementing Linked Data themselves; they may be off-the-shelf components such as dashboard frameworks for composing visualizations.
  • Simple versions of this scenario are pull-based, where the data is requested from data sources. In more advanced settings, without a major change in architecture it should be possible to move to a push-based interaction model, where data sources push notifications to subscribers, and data sources provide different services that consumers can subscribe to (such as "informational messages" or "critical alerts only").

In this scenario, the important factors are to have abstractions that allow easy aggregation and filtering, are independent from the internal data model of the sources that are being combined, and can be used for pull-based interactions as well as for push-based interactions.

1.3.9 Data Sharing

In a downscaled context, where the use of a central data repository is replaced by several smaller servers, it is necessary to be able to ship information among the servers. A device in the network may publish information on a server with another device as the target receiver. This message will then have to be forwarded from server to server until that target is reached. A set of common standards for updating the content of containers and the description of the resources will be necessary to implement such a feature (not taking the routing aspect into consideration here).

1.3.10 RESTful Interactions

REST's main focus is on building interactions around the transfer of state between clients and servers. For this to work, it must be possible to define and communicate expectations for certain state transfers. In this gist the discussion centers around book orders, but pretty much any interaction in a SOA context could be used: some interaction requires a specific state transfer between client and server, and there must be a way in which this state transfer is

  • captured in the context of a bigger interaction flow (what a media type defines on the web), and
  • expressed by means of expectations/constraints that apply to specific representations, so that a server can validate against those expectations/constraints and only accept those representations which satisfy them (what is often done with a combination of schemas and prose in a media type's conventions).

"What Are Linked Data Services?" describes these requirements as the "service surface" that needs to be defined by any platform that is providing some sort of services. This becomes particularly important in any kind of loosely coupled scenario, where servers/services cannot trust clients to always do the "right thing" or "behave cooperatively". instead, the platform must provide support so that misbehaving and adversarial clients can be dealt with effectively, and that means that the "service contract" needs to define the service surface based on the state representations that are acceptable in the context of the use case that is addressed by the service, so that anything else can be easily rejected.

1.3.11 Hosting POSTed Resources

<http://dev.example/bugs> is a factory resource for creating new bugs (well, documenting existing bugs). It accepts <Bug>s of the form:

 @prefix dc:  <http://purl.org/dc/elements/1.1/> .
 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

 _:newBug a <Bug> ;
          <product> <http://products.example.com/gun> ;
          <issueText> "kills people" ;
          dc:author "Bob" ;
          dc:date "2012-07-04T23:54:00"^^xsd:dateTime .

By this definition "hosting" means changing _:newBug to <http://dev.example/bug/7>. LDP doesn't provide any guidance around that.
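
For illustration, the hosted resource might then be served with the blank node rewritten to the minted URI (a sketch only; LDP leaves the rewriting behaviour open):

 @prefix dc:  <http://purl.org/dc/elements/1.1/> .
 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

 # the server has replaced _:newBug with the URI it minted
 <http://dev.example/bug/7> a <Bug> ;
          <product> <http://products.example.com/gun> ;
          <issueText> "kills people" ;
          dc:author "Bob" ;
          dc:date "2012-07-04T23:54:00"^^xsd:dateTime .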

1.3.12 LDP and Authentication/Authorization

Access to the Linked Data Platform often may require authentication and authorization. Access by clients can depend on the interaction context (different resources are needed for accomplishing different goals), client identity (different clients have different levels of access), and possibly client roles (access control may be coupled to roles instead of identities, in which case client/role associations need to be established) and/or client attributes (access control may be coupled to attributes which can be used even when the client identities are unknown, assuming that the attributes can be reliably determined). On the Web, many different ways of identification (establishing naming schemes that uniquely identify entities of interest), authentication (frameworks for verifying someone's claim to have an identity), and authorization (granting access to a resource based on an identity and some access control scheme) exist. In addition, in many cases platforms must integrate into existing scenarios around these issues, and cannot freely pick a framework from scratch. Thus, LDP should provide developers with some guidance around the following issues:

  • How should LDP services integrate into existing landscapes of identification, authentication, and authorization? For some established technologies, maybe we can provide guidance and patterns on how to support them.
  • What is a reasonable authentication and authorization model? LDP will be used to drive many different services, and these might have different models of how clients should have access to LDP services in the context of those specific application scenarios. What is a reasonable model so that LDP providers can handle this flexibility, and still can support identification, authentication, and authorization frameworks that are dictated by the environment? In this context, is it more reasonable to deal with roles or attributes in granting the access to the clients?

1.3.13 Sharing binary resources and metadata

When publishing datasets about stars one may want to publish links to the pictures in which those stars appear, and this may well require publishing the pictures themselves. Vice versa: when publishing a picture of space we need to know which telescope took the picture, which part of the sky it was pointing at, what filters were used, which identified stars are visible, who can read it, who can write to it, ...

If Linked Data contains information about resources that are most naturally expressed in non-RDF formats (be they binary, such as pictures or videos, or human-readable documents in XML formats), those non-RDF formats should be just as easy to publish to the Linked Data server as the RDF relations that link those resources up. A Linked Data server should therefore allow publishing of non-Linked Data resources too, and make it easy to publish and edit metadata about those resources.


The resource comes in two parts: the image, and information about the image (which may be embedded in the image file, but is better kept external to it, as that is more general). The information about the image is vital. The resource is a compound item of image data and other data (the fact that the other data is application metadata about the image makes no difference from the platform's point of view).

A key issue for our work is whether to link these two elements or treat them separately:

Coupled: treat as a single LDPR; one implication might be to allow a single POST/PUT with RDF and non-RDF parts, and have the LDPR server manage the URI naming for the non-RDF part.

Separate: treat the RDF data as an LDPR; one implication might be that the image is put somewhere with a URL, and the server then receives just the metadata as an LDPR which references that URL.

1.3.14 Data catalogs

The Asset Description Metadata Schema (ADMS) provides the data model to describe the contents of semantic asset repositories, but this leaves many open challenges when building a federation of these repositories to serve the need for asset reuse. These include accessing and querying individual repositories and efficiently retrieving updated content without having to retrieve the whole content. Hence, we chose to build the integration solution capitalizing on the Data Warehousing integration approach. This allows us to cope with the heterogeneity of source technologies and to benefit from the optimized performance it offers, given that individual repositories do not usually change frequently. With Data Warehousing, the federation needs to:

  • understand the data of other systems, i.e. understand their semantic descriptions
  • seamlessly exchange the semantic assets metadata from different repositories
  • keep itself up-to-date.

Repository owners can maintain de-referenceable URIs for their repository description and contained assets in a Linked Data compatible manner. ADMS provides the necessary data model to enable meaningful exchange of data. However, this leaves the challenge of efficient access to the data not fully addressed.

Related: Data Catalog Schema and Protocol

1.3.15 Constrained Devices and Networks

Information coming from resource-constrained devices in the Web of Things (WoT) has been identified as a major driver in many domains, from smart cities to environmental monitoring to real-time tracking. The amount of information produced by these devices is growing exponentially and needs to be accessed and integrated in a systematic, standardized and cost-efficient way. By using the same standards as on the Web, integration with applications will be simplified and higher-level interactions among resource-constrained devices, abstracting away heterogeneities, will become possible. Upcoming IoT/WoT standards such as 6LoWPAN - IPv6 for resource-constrained devices - and the Constrained Application Protocol (CoAP), which provides a downscaled version of HTTP on top of UDP for use on constrained devices, are already at a mature stage. The next step now is to support RESTful interfaces also on resource-constrained devices, adhering to the Linked Data principles. Due to the limited resources available, both on the device and in the network (such as bandwidth, energy, memory), a solution based on SPARQL Update is at present not considered useful or feasible. An approach based on the HTTP-CoAP Mapping would enable constrained devices to participate directly in a Linked Data-based environment.

For a detailed description of application scenarios for constrained devices see a separate document kindly compiled by Myriam Leggieri.

1.3.16 Services supporting the process of science

General motivation

Many fields of science now include branches with in silico data-intensive methods, e.g. bioinformatics, astronomy. To support these new methods we look to move beyond the established platforms provided by scientific workflow systems to capture, assist, and preserve the complete lifecycle from record of the experiment, through local trusted sharing, analysis, dissemination (including publishing of experimental data "beyond the PDF"), and re-use.

Specific requirements

  • Aggregations, specifically Research Objects (ROs) that are exchanged between services and clients, bringing together workflows, data sets, annotations, and provenance. We use an RDF model for this. While some aggregated contents are encoded using RDF and an increasing number are linked data sources, others are not; while some are stored locally "within" the RO, others are remote (in both cases this is often due to the size of the resources or access policies).
  • Services that are distributed and linked. Some may be centralised, e.g. for publication; others may be local, e.g. per lab. We need lightweight services that can be simply and easily integrated into, and scale across, the wide variety of software and data used in science: we have adopted a RESTful approach where possible.
    • Foundation services that collect and expose ROs for storage, modification, exploration, and reuse.
    • Services that provide added value to ROs, such as seamless import/export from scientific workflow systems, automated stability evaluation, or recommendation (and therefore interact with the foundation services to retrieve/store/modify ROs).
  • Compatibility with access control that can reflect the needs for privacy and publication at different stages of the research lifecycle.

seeAlso: Wf4Ever

1.3.17 Project Membership Information: Information Evolution

Information about people and projects changes as roles change, as organisations change and as contact details change. Finding the current state of a project is important in enabling people to contact the right person in the right role. It can also be useful to look back and see who was performing what role in the past.

A use of a Linked Data Platform could be to give responsibility for managing such information to the project team itself, not requiring updates to be requested of a centralised website administrator.

This could be achieved with:

  • Resource descriptions for each person and project
  • A container resource to describe roles/membership in the project.

To retain the history of the project, old versions of a resource, including container resources, should be retained, so there is a need to address specific versions and also to have a notion of "current".

Access to information has two aspects:

  • Access to the "current" state, regardless of the version of the resource description
  • Access to historical state, via access to a specific version of the resource description

See also Maintaining Social Contact Information.

1.3.18 Cloud Infrastructure Management

Cloud operators offer API support to provide customers with remote access for infrastructure management. Infrastructure consists of Systems, Computers, Networks, Disks, etc., and the overall structure can be seen as mostly hierarchical (a Cloud contains Systems, Systems contain Machines, etc.). This is complemented with cross-links (e.g. Machines connected to a Network). The IaaS scenario also imposes requirements for lifecycle management, non-instant changes, and history capture. Infrastructure management can be seen as the manipulation of the underlying graph.

1.4 Use Cases

1.4.1 UC1: Create Resource

Create a new resource by sending an RDF representation to the LDP server. A new URI is minted for the resource. The resource may be created independently, or it may be created within a container. The RDF representation can contain links to other resources. These may be links to other (resolvable) information resources, and to (unresolvable) non-information resources. The RDF representation may be self-describing in that it may include same-document references (i.e. so-called null relative and fragment URIs). The LDP shouldn't mandate the form of the RDF, and a given LDP application should not restrict the use of non-application-specific properties. It is therefore not necessary for the LDP to inspect the RDF. In the simplest cases, a request for the same resource will simply return the same RDF as was placed there. This means that secondary resources mentioned in the RDF may not be served by the LDP unless they are created independently. As most resources could also be seen as containers, this use-case also serves to create containers, and nested containers.
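
A minimal HTTP sketch of such a creation (the container URI, minted URI, and media type are illustrative assumptions):

 POST /contacts/ HTTP/1.1
 Host: example.org
 Content-Type: text/turtle

 @prefix foaf: <http://xmlns.com/foaf/0.1/> .
 <> a foaf:Person ;
    foaf:name "Alice" .

 HTTP/1.1 201 Created
 Location: http://example.org/contacts/alice

The Location header carries the URI minted by the server; any same-document references in the POSTed body then resolve against that new URI.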

1.4.1.1 Scenarios

1.4.1.1.1 Create a resource within a container

From the user-story Maintaining Social Contact Information, it should be possible to "easily create a new contact and add it to my contacts." Contact details are captured as an RDF description and its properties, including "names, email addresses, telephone numbers, instant messenger identities and so on." The description may include non-standard RDF; "data about my contacts that the original design didn’t consider." The new resource is created in a container representing "my contacts." The following RDF describes a contact resource, including examples of same-document references.

 @prefix foaf: <http://xmlns.com/foaf/0.1/> .

 <> a foaf:PersonalProfileDocument;
   foaf:primaryTopic <#me> .

 <#me> a foaf:Person;
     foaf:name "Henry" .
 
1.4.1.1.2 Create an un-contained resource
1.4.1.1.3 Create a container
1.4.1.1.4 Create a nested container

The motivation for nested containers comes from the Hosting POSTed Resources user-story. The Helios bug-tracking ontology allows bugs to have sub-issues referenced by the membership predicate helios_bt:hasSubIssue. The 'top-level-container' contains issues that may themselves be containers.

@prefix helios_bt: <http://heliosplatform.sourceforge.net/ontologies/2010/05/helios_bt.owl#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix dc: <http://purl.org/dc/elements/1.1/>.
@prefix eg: <http://example.org/>.                          # illustrative namespace
@prefix bp: <http://open-services.net/ns/basicProfile#>.    # Basic Profile namespace (assumed)

<top-level-container> rdfs:member [
      a helios_bt:BugtrackerIssue;
      dc:identifier "58366";
      dc:type "bug";
      helios_bt:isInBugtracker eg:bugtracker;
      bp:membershipPredicate helios_bt:hasSubIssue
   ] .
 

1.4.1.2 Issues

  • ISSUE-7: Confirm that it is possible to create a container under another container.
  • ISSUE-12: Should HTTP PATCH support the creation of resources as defined in RFC 5789?
  • ISSUE-20: How is the inserted resource identified (in the POSTed RDF) given that the resource URI is unknown by the client? See #Hosting POSTed Resources
  • ISSUE-20: Should POST support a user-supplied local-name 'hint', e.g. based on the Atom 'Slug' header, to support more human-readable URIs?

1.4.2 UC2: Retrieve resource description

Access the current description of a resource, containing properties of that resource and links to related resources. The representation may include descriptions of related resources that cannot be accessed directly.

Depending upon the application, an LDP may enrich the retrieved RDF with additional triples. Examples include adding incoming links, sameAs closure and type closure.

The HTTP response should also include versioning information (i.e. last update or entity tag) so that subsequent updates can ensure they are being applied to the correct version.
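
A sketch of such a retrieval, with illustrative versioning headers (URIs and header values are assumptions):

 GET /people/Alice HTTP/1.1
 Host: example.com
 Accept: text/turtle

 HTTP/1.1 200 OK
 Content-Type: text/turtle
 ETag: "a1b2c3"
 Last-Modified: Sun, 18 Nov 2012 00:00:00 GMT

 # Turtle representation of the resource follows, as in the scenario below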

1.4.2.1 Scenarios

1.4.2.1.1 Retrieve RDF representation of an information resource

Based on Maintaining Social Contact Information, a user should be able to read contact details so that they are able to interact with a contact. An LDP holds social contact information about Alice. In this example the contact details make no distinction between resources and the people they describe. The resource http://example.com/people/Alice is described by the following RDF model. A request for this resource returns an RDF representation in the desired format, which could be Turtle or another RDF serialisation.

@prefix : <http://example.com/people/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

:Alice a foaf:Person;
   rdfs:label "Alice";
   foaf:mbox <mailto:alice@example.com>.
 
1.4.2.1.2 Retrieve RDF description of a non-information resource

1.4.2.2 Issues

  • ISSUE-10: Weak entity tags assure semantic equivalence, whereas strong entity tags assure byte-for-byte equivalence. Weak entity tags may be more suited to dynamic content.
  • ISSUE-16: Does the LDP support the redirection of non-information resources to LDPRs? Can it redirect to a different authority?

1.4.3 UC3: Update existing resource

Change the RDF description of an LDP resource, potentially removing or overwriting existing data. This allows applications to enrich the representation of a resource by adding additional links to other resources.
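
A conditional overwrite sketch, reusing the entity tag obtained in UC2 so that the update only succeeds against the expected version (URI and tag value are illustrative):

 PUT /people/Alice HTTP/1.1
 Host: example.com
 Content-Type: text/turtle
 If-Match: "a1b2c3"

 # full replacement representation follows

If the resource has changed in the meantime, the server responds with 412 Precondition Failed, and the client can re-fetch and retry.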

1.4.3.1 Scenarios

1.4.3.1.1 Selective update of a resource

This relates to user-story Data catalogs, based on the Data Catalog Vocabulary. A catalogue is described by the following RDF model.

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix : <http://example.com/> .

 :catalog a dcat:Catalog ;
    dcat:dataset <http://example.com/dataset/001> ;
    dcterms:issued "2012-12-11"^^xsd:date .
 

A catalog may contain multiple datasets, so when linking to new datasets it would be simpler and preferable to selectively add just the new dataset links. A Talis changeset [5][6] could be used to make such a selective addition. The following update would be directed to the catalogue to add an additional dataset.

@prefix : <http://example.com/>.
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix cs: <http://purl.org/vocab/changeset/schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.

<change1>
  a cs:ChangeSet ;
  cs:subjectOfChange :catalog ;
  cs:createdDate "2012-01-01T00:00:00Z" ;
  cs:changeReason "Update catalog datasets" ;
  cs:addition [
    a rdf:Statement ;
    rdf:subject :catalog ;
    rdf:predicate dcat:dataset ;
    rdf:object <http://example.com/dataset/002>
  ] .
 
1.4.3.1.2 Overwrite a resource in full

This relates to user-story Metadata enrichment in broadcasting and is based on the BBC Sports Ontology.

@prefix : <http://example.com/> .
@prefix sport: <http://www.bbc.co.uk/ontologies/sport/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
 
 :mens_sprint a sport:MultiStageCompetition;
    rdfs:label "Men's Sprint";
    sport:award <#gold_medal> .
 <#gold_medal> a sport:Award .
 

We can enrich the description as events unfold, linking to the winner of the gold medal by substituting the above description with the following.

@prefix : <http://example.com/> .
@prefix sport: <http://www.bbc.co.uk/ontologies/sport/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
 
 :mens_sprint a sport:MultiStageCompetition;
    rdfs:label "Men's Sprint";
    sport:award <#gold_medal> .
 <#gold_medal> a sport:Award;
    sport:awarded_to [
        a foaf:Agent ;
        foaf:name "Chris Hoy"
    ] .
 

1.4.3.2 Issues

  • ISSUE-17: Should the LDP recommend at least one patch format (for example, the Talis Changeset), so that we are able to develop concrete examples and test-cases?

1.4.4 UC4: Determine if a resource has changed

It should be possible to retrieve versioning information about a resource (e.g. last modified or entity tag) without having to download a representation of the resource. This information can then be compared with previous information held about that resource to determine if it has changed. This versioning information can also be used in subsequent conditional requests to ensure they are only applied if the version is unchanged.
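
A HEAD request is one way to obtain this information without transferring the representation (URI and header values are illustrative):

 HEAD /people/Alice HTTP/1.1
 Host: example.com

 HTTP/1.1 200 OK
 ETag: "a1b2c3"
 Last-Modified: Sun, 18 Nov 2012 00:00:00 GMT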

1.4.4.1 Scenarios

1.4.4.1.1 Retrieve resource version

Based on the user-story Constrained Devices and Networks, an LDP could be configured to act as an HTTP proxy for a CoAP-based Web of Things. As an observer of CoAP resources, the LDP registers its interest so that it will be notified whenever the resource undergoes a change in state. In this way the LDP can respond to requests for, or based on, versioning information.

1.4.4.2 Issues

1.4.5 UC5: Delete resource

Delete a resource and all its properties. If the resource resides within a container it will be removed from that container; however, other links to the deleted resource may be left as dangling references. In the case where the resource is a container, the server may also delete any or all contained resources. In normal practice, a deleted resource cannot be reinstated. There are, however, edge-cases where limited undelete may be desirable. Best practice states that "Cool URIs don't change", which means that deleted URIs shouldn't be recycled.

1.4.5.1 Scenarios

1.4.5.1.1 Delete resource

The client requests that the LDP delete a resource. Deletion is treated as an idempotent operation so that it is not an error to delete a resource that has already been deleted.
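
A minimal sketch (the resource URI is assumed from the create scenario); a repeated DELETE of the same resource might return 404 Not Found, which the client need not treat as a failure:

 DELETE /contacts/alice HTTP/1.1
 Host: example.org

 HTTP/1.1 204 No Content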

1.4.5.2 Issues

  • ISSUE-24: Should DELETED resources remain deleted?
  • ISSUE-25: We must be clear about how the LDP supports both weak aggregation and strong composition in containers.

1.4.6 UC6: List resources within a collection

There is a requirement to be able to manage collections of resources. These are (weak) aggregations, potentially including resources that may not be owned by any specific container. Collections are potentially very large, so some means may be required to limit the size of the RDF representation returned by the LDP (e.g. pagination). This use-case focuses on obtaining an item-level description of the resources aggregated by the collection.
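
As a sketch, an item-level container representation might look like the following (the rdfs:member membership predicate follows the nested-container example above; URIs and data are illustrative assumptions):

 @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
 @prefix foaf: <http://xmlns.com/foaf/0.1/> .

 <http://example.org/contacts/> rdfs:member <alice>, <bob> .

 # basic information about each member, without accessing each one
 <alice> a foaf:Person ; foaf:name "Alice" .
 <bob>   a foaf:Person ; foaf:name "Bob" .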

1.4.6.1 Scenarios

From Keeping Track of Personal and Business Relationships, these relationships would be managed as a collection.

1.4.6.1.1 List contained resources

1.4.6.2 Issues

  • ISSUE-7: There should be hyperlinks representing affordances of the container, so that clients understand what they can do, and how they are supposed to do that, if they want to engage in these interactions.
  • ISSUE-7: Should the number of resources per page be an explicit property of the container along with other container affordances?
  • ISSUE-14: The default ordering is ascending. Can other (syntactic) orderings be specified?
  • ISSUE-18: Pagination is not (necessarily) robust to deletions and insertions of container members.

1.4.7 UC7: Retrieve collection-level description

This use-case extends the normal behaviour of retrieving an RDF description of a resource, by dynamically excluding specific (membership) properties. For containers, it is often desirable to be able to read a collection-level description that excludes the container membership. This may include container affordances, such as detail of the container membership predicate whose triples are specifically excluded in this use-case.
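
A sketch of a collection-level description with the membership triples excluded (the property choices and URIs are assumptions; the exclusion mechanism itself is left open):

 @prefix dcterms: <http://purl.org/dc/terms/> .
 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

 # membership triples (e.g. rdfs:member ...) are omitted from this view
 <http://example.org/contacts/> dcterms:title "My contacts" ;
     dcterms:modified "2012-11-18T00:00:00Z"^^xsd:dateTime .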

1.4.7.1 Scenarios

1.4.7.1.1 Retrieve non-membership properties of a container

1.4.7.2 Issues

1.4.8 UC8: Aggregate resources

Collections are sets of (weakly) aggregated resources. There is a need to be able to manage resource aggregation at the level of adding and deleting individual membership properties.

1.4.8.1 Scenarios

1.4.8.1.1 Add an existing resource to a collection

This example is from Library Linked Data and LLD-UC, specifically Subject Search.

There is an existing collection at <http://example.com/concept-scheme/subject-heading> that defines a collection of subject headings. This collection is defined as a skos:ConceptScheme, and the client wishes to insert a new concept into the scheme, which will be related to the collection via a skos:inScheme link. The new subject-heading, "outer space exploration", is not necessarily owned by a container. The following RDF would be added to the (item-level) description of the collection.

@prefix scheme: <http://example.com/concept-scheme/>.
@prefix concept: <http://example.com/concept/>.
@prefix skos: <http://www.w3.org/2004/02/skos/core#>.

scheme:subject-heading a skos:ConceptScheme.

concept:OuterSpaceExploration skos:inScheme scheme:subject-heading.
 

As this is simply a manipulation of the RDF description of a collection, it should be possible to add the same resource to multiple collections.
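
For example, adding the same concept to a second scheme needs nothing more than another membership triple (the second scheme URI is an illustrative assumption):

 @prefix scheme: <http://example.com/concept-scheme/>.
 @prefix concept: <http://example.com/concept/>.
 @prefix skos: <http://www.w3.org/2004/02/skos/core#>.

 concept:OuterSpaceExploration skos:inScheme scheme:subject-heading ,
                               scheme:space-topics .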

1.4.8.2 Issues

  • ISSUE-7: According to the Linked Data Platform 1.0, PUT should not be used: "LDPC servers should not allow HTTP PUT to update a LDPC’s members."
  • ISSUE-13: Can LDP containers have members that are not LDP resources (e.g. non-information resources)?
  • ISSUE-21: Container membership properties should permit the use of reversed relationships such as skos:inScheme. The SKOS examples show that both the forward and reverse properties may be in use at the same time (broader/narrower).
  • ISSUE-25: We must be clear about how the LDP supports both weak aggregation and strong composition in containers.

1.5 Requirements

TODO: Refine these based on use case and scenario updates

  1. Define a minimal set of RDF media-types/representations
  2. Define a limited number of literal value types
  3. Use standard vocabularies as appropriate
  4. Update resources, either RDF-based or not
  5. Use optimistic collision detection on updates
  6. Ensure clients are ready for resource format and type changes
  7. Apply minimal constraints for creation and update
  8. Add a resource to an existing container
  9. Remove a resource, including any associations with a container
  10. Get members of a container
  11. When getting members of a container, provide data about the members
  12. Get just data about a container, without all the members
  13. Handle a large number of members of a container, breaking up representation into pages
  14. Allow pages to have order information for members, within a page and across all pages

1.6 Acknowledgements

1.7 References