W3C Member Submission

SA-REST: Semantic Annotation of Web Resources

W3C Member Submission 05 April 2010

This version:
http://www.w3.org/submissions/2010/SUBM-SA-REST-20100405/
Latest version:
http://www.w3.org/submissions/SA-REST/
Authors:
Karthik Gomadam (Wright State University*)
Ajith Ranabahu (Wright State University)
Amit Sheth (Wright State University)

Abstract

SA-REST is a poshformat [Poshformat] for adding meta-data to REST [REST] API descriptions (among other documents) in HTML or XHTML. Meta-data from various models, such as an ontology, a taxonomy, or a tag cloud, can be embedded into the documents. This embedded meta-data permits various enhancements, such as improved search, easier data mediation, and easier integration of services.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications can be found in the W3C technical reports index at http://www.w3.org/TR/.

By publishing this document, W3C acknowledges that the Submitting Members have made a formal Submission request to W3C for discussion. Publication of this document by W3C indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. This document is not the product of a chartered W3C group, but is published as potential input to the W3C Process. A W3C Team Comment has been published in conjunction with this Member Submission. Publication of acknowledged Member Submissions at the W3C site is one of the benefits of W3C Membership. Please consult the requirements associated with Member Submissions of section 3.3 of the W3C Patent Policy. Please consult the complete list of acknowledged W3C Member Submissions.

Table of Contents

1. Introduction
  1.1. Notational Conventions
  1.2. XML Namespaces
2. SA-REST Properties
  2.1. Design Principles
  2.2. Property Types
  2.3. Basic SA-REST Properties
  2.4. Basic Usage Examples
  2.5. RESTful API Example
3. SA-REST Use Cases
4. Processing SA-REST Documents
5. Acknowledgements
6. References
7. Appendix

1. Introduction

Semantic Annotations for REST (SA-REST) defines three basic properties that can be used to non-intrusively annotate HTML/XHTML documents, typically to embed ontological meta-data. These properties, defined as a poshformat [Poshformat], are included as part of the XHTML document, allowing a capable processor to gain extra information about the content of the document. Poshformats are a superset of microformats [Microformat]. While a poshformat may follow certain microformat design principles, it may not have gone through the rigorous community process defined by the microformat process guidelines [MicroformatProcess].

The basic SA-REST properties, namely domain-rel, sem-rel and sem-class, are specified using the class attribute and the title attribute defined by the HTML specification [HTML 4.01]. Similar to microformats, the scope of the annotation is defined by the HTML element that bears the annotation.

The following example illustrates an XHTML fragment embedded with SA-REST annotations. The original text fragment is from the Wikipedia article on the subject "computer" [Computer]. The markup in bold highlights the SA-REST annotations.

(001) <p>
(002) A <b><span class="sem-class" title="http://tap.stanford.edu/#computer"> computer </span></b> 
(003) is a <a href="/wiki/Machine" title="Machine">machine</a> that manipulates
(004) <a href="/wiki/Data_(computing)" title="Data (computing)">data</a> according 
(005) to a set of <a href="/wiki/Source_code" title="Source code">instructions</a>.
(006) </p>
(007) <p>
(008) <span class="domain-rel" title="http://www.owl-ontologies.com/ComputingOntology.owl#History_of_Computing" >
(009) Although mechanical examples of computers have existed through much of recorded human 
(010) history, the first electronic computers were developed in the mid-20th century (1940–1945).</span> </p>

Line (002) illustrates the specification of the term computer using the sem-class property. Lines (008) to (010) exemplify the marking up of the text fragment to indicate that it belongs to the domain History of Computing.
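
As an informal illustration (not part of this submission), the annotations in a fragment like the one above could be collected with a few lines of Python. The sketch assumes Python's standard html.parser module; the names SaRestExtractor and extract_annotations are hypothetical and chosen only for this example, and the fragment is an abbreviated copy of the one above.

```python
from html.parser import HTMLParser

SA_REST_PROPERTIES = {"domain-rel", "sem-rel", "sem-class"}


class SaRestExtractor(HTMLParser):
    """Collects (property, [URIs], element name) tuples from class/title attributes."""

    def __init__(self):
        super().__init__()
        self.annotations = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        class_names = (attributes.get("class") or "").split()
        title = attributes.get("title") or ""
        for name in class_names:
            if name in SA_REST_PROPERTIES and title:
                # The title attribute carries one or more whitespace-separated URIs.
                self.annotations.append((name, title.split(), tag))


def extract_annotations(markup):
    parser = SaRestExtractor()
    parser.feed(markup)
    parser.close()
    return parser.annotations


fragment = """
<p>A <span class="sem-class" title="http://tap.stanford.edu/#computer">computer</span>
is a <a href="/wiki/Machine" title="Machine">machine</a> that manipulates data.</p>
<p><span class="domain-rel"
 title="http://www.owl-ontologies.com/ComputingOntology.owl#History_of_Computing">
The first electronic computers were developed in the mid-20th century.</span></p>
"""

for prop, uris, tag in extract_annotations(fragment):
    print(prop, tag, uris)
```

The ordinary link to /wiki/Machine is skipped because its class attribute does not name an SA-REST property, even though it has a title attribute.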

1.1. Notational Conventions

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC 2119].

When describing abstract data models, this specification uses the notational convention used by the XML Infoset [XML Infoset]. Specifically, abstract property names always appear in square brackets (e.g., [some property]).

1.2. XML Namespaces

Table 1 lists XML namespaces that are used in this specification. The choice of any namespace prefix is arbitrary and not semantically significant.

Table 1: Prefixes and XML namespaces used in this specification

Prefix   XML Namespace                                             Specification(s)
rdfs     http://www.w3.org/2000/01/rdf-schema#                     RDF Schema [RDFS]
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#               RDF [RDF]
sarest   http://www.knoesis.org/research/srl/standards/sa-rest/#   SA-REST
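
The example in Section 2.5 uses prefixed values such as sarest:Service in title attributes. This submission does not state how a processor should resolve such values; the following Python sketch assumes they are expanded against the prefixes of Table 1, and the function name expand is hypothetical.

```python
# Prefix table transcribed from Table 1.
NAMESPACES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sarest": "http://www.knoesis.org/research/srl/standards/sa-rest/#",
}


def expand(value, namespaces=NAMESPACES):
    """Expand 'prefix:local' to a full URI; return any other value unchanged.

    Assumption: a value is treated as prefixed only when the text before the
    first colon is a known prefix, so absolute URIs such as http://... pass through.
    """
    prefix, separator, local = value.partition(":")
    if separator and prefix in namespaces:
        return namespaces[prefix] + local
    return value


print(expand("sarest:Operation"))
print(expand("http://tap.stanford.edu/#computer"))
```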

2. SA-REST Properties

2.1. Design Principles

A list of design patterns to follow when designing an XHTML-based microformat is defined by the microformat community [Microformat]. These principles are exemplified in the hCalendar microformat specification [Hcal]. The two main design considerations taken into account in designing SA-REST are listed below.

  1. Reuse existing XHTML constructs to introduce minimal or no disruption to the regular machinery that interacts with the markup.
  2. Cater to human friendliness. SA-REST is designed for humans first and machines later. The human in this case is the developer or the annotator.

2.2. Property Types

SA-REST has two types of properties. These types are only meant to distinguish the capability of a property to nest other properties. As listed in Section 2.3, all properties are multi-valued.

Block property
A block property may contain other (SA-REST) properties within the content that is being marked up.
Element property
A property that applies only to a single term and should not contain any other properties.

2.3. Basic SA-REST Properties

SA-REST has three basic properties.

[domain-rel]: Enumeration of URIs (Optional): Block property
The domain-rel property describes the domain of a resource. If a given resource (such as a blog post) has content spanning multiple domains, it is preferable to add multiple domain-rel property entries, each corresponding to a section of the resource. If such a separation cannot be made, the resource may carry an enumeration of values as the domain-rel property value.
[sem-rel]: Enumeration of URIs (Optional): Element property
The sem-rel property captures the semantics of a link and evolves from the popular rel attribute. The sem-rel property enables the addition of externalized annotations to third party documents. A sem-rel property may only be used with an anchor (<a>) element.
[sem-class]: Enumeration of URIs (Optional): Element property
The sem-class property can be used to mark up a single entity within a resource. The entity may be a text fragment or an embedded object such as a video.
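
The two constraints stated above (sem-rel only on anchor elements, and element properties not containing other properties) lend themselves to a mechanical check. The following Python sketch is one possible way to report violations. It assumes well-formed XHTML in which every non-void element is explicitly closed, and the names SaRestChecker and check are hypothetical.

```python
from html.parser import HTMLParser

BLOCK_PROPERTIES = {"domain-rel"}
ELEMENT_PROPERTIES = {"sem-rel", "sem-class"}
# Elements that never have an end tag in HTML 4.01.
VOID_ELEMENTS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class SaRestChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_elements = []  # per open element: the set of SA-REST properties on it
        self.problems = []

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        found = classes & (BLOCK_PROPERTIES | ELEMENT_PROPERTIES)
        if "sem-rel" in found and tag != "a":
            self.problems.append(f"sem-rel on <{tag}>; only <a> is allowed")
        # An element property should not contain further SA-REST properties.
        if found and any(props & ELEMENT_PROPERTIES for props in self.open_elements):
            self.problems.append(f"<{tag}> is annotated inside an element property")
        if tag not in VOID_ELEMENTS:
            self.open_elements.append(found)

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS and self.open_elements:
            self.open_elements.pop()


def check(markup):
    checker = SaRestChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


print(check('<span class="sem-rel" title="http://taxonomy.org/computerscience#firstname">x</span>'))
print(check('<div class="domain-rel" title="sarest:Service">'
            '<span class="sem-class" title="sarest:Parameter">x</span></div>'))
```

Because the nesting rule is phrased as "should not", a processor may prefer to treat the second kind of report as a warning rather than an error.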

2.4. Basic Usage Examples

Basic usage examples are illustrated below. The preferred style is to use a generic language/style container (see [HTML 4.01]) surrounding the text fragment of interest. The class attribute indicates the property and the title attribute carries its value. When adding a generic structural element is cumbersome or not possible, any structural element with class and title attributes may be used.

[domain-rel]: Single reference
	
 <span class="domain-rel" title="http://apihut.com/schemas/socialnetworking#socialnetworks" >
   The growing trend of "liking" has recently caught a lot of attention of both network users as well as developers.
 </span>
	
[domain-rel]: Multiple references
	
 <span class="domain-rel" title="http://apihut.com/schemas/socialnetworking#socialnetworks http://apihut.com/schemas/economy#recession" >
   One often wonders about the future of advertisement-driven Web applications in the current economic scenario. 
   For example, social networking applications such as...
 </span>
	
[sem-rel]
	
    <a href="http://foo.xsd" class="sem-rel" title="http://taxonomy.org/computerscience#firstname" > This is the input schema </a> 
	
[sem-class]
	
   One striking observation in evolution of 
   <span class="sem-class" title="http://tap.stanford.edu/#computer">Computers</span> 
   is the relationship between speed and size.
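
The pattern shared by the domain-rel and sem-class examples above (a container element, the property name in class, and whitespace-separated URIs in title) can also be produced programmatically. The Python sketch below is only an illustration; the helper name annotate is hypothetical, and sem-rel is left out because it additionally requires an anchor element with an href.

```python
from html import escape

CONTAINER_PROPERTIES = {"domain-rel", "sem-class"}


def annotate(text, prop, uris, element="span"):
    """Wrap text in an element carrying one SA-REST property and its URI value(s)."""
    if prop not in CONTAINER_PROPERTIES:
        raise ValueError(f"unsupported property: {prop}")
    if not uris:
        raise ValueError("at least one URI is required")
    # Multiple references are joined with single spaces, as in the example above.
    title = escape(" ".join(uris), quote=True)
    return f'<{element} class="{prop}" title="{title}">{escape(text)}</{element}>'


print(annotate("Computers", "sem-class", ["http://tap.stanford.edu/#computer"]))
print(annotate(
    "One often wonders about the future of advertisement-driven Web applications.",
    "domain-rel",
    [
        "http://apihut.com/schemas/socialnetworking#socialnetworks",
        "http://apihut.com/schemas/economy#recession",
    ],
))
```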
	

2.5. RESTful API Example

SA-REST can be used to annotate RESTful API descriptions. The following example annotates an HTML description of an API; the snippet is taken from the Yahoo! Developer Network Mail Web Service API documentation [YDN-JSON-RPC]. The annotations use the existing block elements to integrate seamlessly into the existing document.

<div class="section domain-rel" lang="en" title="sarest:Service" >
   <span class="domain-rel" title="sarest:Operation" > 
   <div class="titlepage">
      <div>
         <div>
            <h3 id="JSON-RPCEndpoint">JSON-RPC Endpoint</h3>
         </div>
      </div>
   </div>
   <p>The JSON-RPC endpoint implements the <a class="ulink" href="http://json-rpc.org/wiki/specification" target="_top">JSON-RPC spec</a> on
   top of the Web service. Requests are serialized JavaScript following a specific data format. 
   Each serialized JavaScript object contains the following properties:
   </p>

   <div class="itemizedlist domain-rel" title="sarest:InputMessage">
      <ul>
         <li class="bullist sem-class" title="sarest:Parameter">
         <code class="code">method</code>: 
             name of the API method being called.
         </li>
         ...
      </ul>
   </div>
   </span>
</div>

The above annotations are based on an extended version of the service model (Appendix A) described in [hRESTS].
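
To make the nesting in this example easier to see, the Python sketch below lists each annotation with its depth among the annotated ancestors. It is an illustration only: it assumes well-formed XHTML in which every element is closed, it runs on an abbreviated copy of the snippet above, and the names OutlineBuilder and build_outline are hypothetical.

```python
from html.parser import HTMLParser

PROPERTIES = {"domain-rel", "sem-rel", "sem-class"}


class OutlineBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []    # one boolean per open element: does it carry an annotation?
        self.outline = []  # (depth among annotated ancestors, property, title value)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        props = [c for c in (attributes.get("class") or "").split() if c in PROPERTIES]
        depth = sum(self.stack)
        for prop in props:
            self.outline.append((depth, prop, attributes.get("title") or ""))
        self.stack.append(bool(props))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()


def build_outline(markup):
    builder = OutlineBuilder()
    builder.feed(markup)
    builder.close()
    return builder.outline


page = """
<div class="section domain-rel" lang="en" title="sarest:Service">
  <span class="domain-rel" title="sarest:Operation">
    <h3 id="JSON-RPCEndpoint">JSON-RPC Endpoint</h3>
    <div class="itemizedlist domain-rel" title="sarest:InputMessage">
      <ul>
        <li class="bullist sem-class" title="sarest:Parameter">
          <code class="code">method</code>: name of the API method being called.
        </li>
      </ul>
    </div>
  </span>
</div>
"""

for depth, prop, value in build_outline(page):
    print("  " * depth + f"{value} ({prop})")
```

On this input the outline descends from sarest:Service through sarest:Operation and sarest:InputMessage to sarest:Parameter, mirroring the chain of properties in the service model of Appendix A.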

3. SA-REST Use Cases

Faceted Search

The number of API descriptions available in ProgrammableWeb [ProgrammableWeb] as of March 2010 is 1763, a 200% increase in one year. A general-purpose search engine, such as Google, treats API documents like any other document when indexing and ranking them. As a result, a search for APIs (even with a specific query such as "Maps API") returns API resources scattered all over the result set. Web API directories like ProgrammableWeb do present a more domain-specific solution. However, they largely rely on user tags for classification and searching. Adding meta-data that captures the various facets of APIs (functionality, supported message types, client-side bindings, protocol, etc.) is essential for better indexing and searching. One such effort is APIHut [APIHut]. SA-REST can significantly improve faceted search by attaching explicit meta-data to the API descriptions.

Other enhancements include improvements to domain specific search capabilities. There is an ongoing effort with NCBO [NCBO] to test-drive the annotations on a selected set of NCBO Web services to improve the search capabilities.
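
As a rough illustration of how explicit annotations could feed a faceted search, the Python sketch below indexes documents by (property, URI) pairs and intersects the matches. It does not describe how APIHut or any existing search engine works; the document URLs and the maps URI are invented for the example, and the function names are hypothetical.

```python
from collections import defaultdict


def build_facet_index(documents):
    """documents: mapping of document URL -> list of (property, URI) annotations."""
    index = defaultdict(set)
    for url, annotations in documents.items():
        for prop, uri in annotations:
            index[(prop, uri)].add(url)
    return index


def search(index, *facets):
    """Return the documents that carry every requested (property, URI) facet."""
    matches = [index.get(facet, set()) for facet in facets]
    return sorted(set.intersection(*matches)) if matches else []


# Hypothetical annotated API descriptions.
SOCIAL = "http://apihut.com/schemas/socialnetworking#socialnetworks"
MAPS = "http://example.org/schemas/geo#maps"
documents = {
    "http://example.org/apis/friends.html": [("domain-rel", SOCIAL)],
    "http://example.org/apis/checkins.html": [("domain-rel", SOCIAL), ("domain-rel", MAPS)],
    "http://example.org/apis/tiles.html": [("domain-rel", MAPS)],
}

index = build_facet_index(documents)
print(search(index, ("domain-rel", SOCIAL), ("domain-rel", MAPS)))
```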

Data Mediation and Mediatability

The importance of enabling easier approaches to data mediation is well understood. Data mediation is particularly important in the context of lightweight service compositions (also known as mashups), primarily because mediation would need to happen manually. SA-REST addresses this concern in two ways.

  1. It facilitates a lifting and lowering scheme similar to the mechanism described in SAWSDL [SAWSDL].
  2. It facilitates the calculation of Mediatability [Mediatability], a measure of the estimated human effort for manually performing data mediation.

Smart Mashups

A lightweight service composition, also known as a mashup, is a popular way of creating new composite services. A smart mashup is a mashup with enough flexibility to give the end user a choice for certain services. An example is letting the end user select a map provider for a mashup that includes a map. SA-REST annotations may be utilized to create such flexible mashups.

Semi-automatic Text Annotation

Text annotation is an important research area, primarily due to the large volume of text data available. It is not viable to annotate such volumes of data by purely human effort, and one needs to employ text processing techniques for automatic markup. One major challenge in text processing is disambiguation: selecting the correct semantics of a word that may be used across domains to represent different concepts. SA-REST annotations can act as a guide to qualify the semantics of a certain text fragment and provide disambiguation hints to automatic text annotators.

Service Oriented Sensor Networks (SOSN)

SOSNs transfer the concept of Service Oriented Architecture (SOA) into the sensor network domain. The objective of SOSN is to enable exploration, composition, and sharing of sensor devices on the Web. Exposing sensors via the Web allows two types of operations.

  1. Querying a sensor node for one or more sensor readings.
  2. Updating the state or the configuration of a sensor.

Similar to Web APIs, SA-REST annotations facilitate better exploration and composition capabilities in SOSNs.

4. Processing SA-REST Documents

The preferred way to process SA-REST annotations is to use GRDDL [GRDDL] and XSLT [XSLT] to extract a useful representation. In most cases an RDF/XML representation (see [RDF]) would be useful. However, SA-REST annotations may also be used to generate a structured representation such as a Web Services Description Language (WSDL) [WSDL] document. A simple GRDDL transformation reference is illustrated below.

     <html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:grddl='http://www.w3.org/2003/g/data-view#'
      grddl:transformation="glean_api.xsl"  >
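
Where an XSLT processor is not at hand, a comparable extraction can be sketched in Python. This is not GRDDL and does not reproduce glean_api.xsl, whose contents are not given here. The mapping is an assumption made for illustration: each annotation becomes a triple whose subject is the document URI and whose predicate is the property name appended to the sarest namespace. This submission does not define such predicates, and the names TripleCollector and to_ntriples are hypothetical.

```python
from html.parser import HTMLParser

SAREST_NS = "http://www.knoesis.org/research/srl/standards/sa-rest/#"
PROPERTIES = ("domain-rel", "sem-rel", "sem-class")


class TripleCollector(HTMLParser):
    def __init__(self, document_uri):
        super().__init__()
        self.document_uri = document_uri
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        for prop in PROPERTIES:
            if prop in classes:
                for uri in (attributes.get("title") or "").split():
                    # Hypothetical mapping: document --sarest:<property>--> annotation URI.
                    self.triples.append((self.document_uri, SAREST_NS + prop, uri))


def to_ntriples(markup, document_uri):
    """Serialize the collected annotations as N-Triples lines."""
    collector = TripleCollector(document_uri)
    collector.feed(markup)
    collector.close()
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in collector.triples)


markup = (
    '<span class="domain-rel" title="http://apihut.com/schemas/socialnetworking#socialnetworks '
    'http://apihut.com/schemas/economy#recession">Social networking applications</span>'
)
print(to_ntriples(markup, "http://example.org/post.html"))
```

The sketch expects absolute URIs in the title attribute; prefixed values such as sarest:Service would need to be expanded first.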

5. Acknowledgements

This submission includes use cases contributed by the following individuals and teams.

This submission has been developed as a result of discussions with and supported by the following W3C member institutes (listed with the W3C representative).

We also acknowledge the contributions from Jonathan Marsh of WSO2 during the early phases of SA-REST.

6. References

[RFC 2119]
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner, Author. Internet Engineering Task Force, March 1997. Available at http://www.ietf.org/rfc/rfc2119.txt.
[XML Infoset]
XML Information Set, Cowan J., Tobin R. (Editors), W3C Recommendation, 24 October 2001.
[Computer]
"Computer", Wikipedia, 2009.
[Poshformat]
"poshformats", microformats.org, 2010.
[Microformat]
"About Microformats", microformats.org, 2010.
[MicroformatProcess]
"Microformat Development Process", microformats.org, 2010.
[REST]
"Architectural Styles and the Design of Network-based Software Architectures", R. T. Fielding, University of California, Irvine, 2000.
[HTML 4.01]
"HTML 4.01 Specification", W3C, December 1999.
[ProgrammableWeb]
"ProgrammableWeb: API and Mashup Listing", programmableweb.com, 2009.
[APIHut]
A Faceted Classification Based Approach to Search and Rank Web APIs, Gomadam K., Ranabahu A., Nagarajan M., Sheth A. P., Verma K. (Authors), in proceedings of the 6th IEEE International Conference on Web Services (ICWS), September 2008.
[NCBO]
The National Center for Biomedical Ontology, National Centers for Biomedical Computing , 2010.
[Hcal]
"Hcalendar Microformat - Version 1.0", microformats.org, 2009.
[RDF]
Resource Description Framework (RDF): Concepts and Abstract Syntax, Klyne G., Carroll J. (Editors), W3C Recommendation, 10 February 2004. The latest version is http://www.w3.org/TR/rdf-concepts/.
[RDFS]
RDF Vocabulary Description Language 1.0: RDF Schema, Brickley D., Guha R.V. (Editors), W3C Recommendation, 10 February 2004.
[hRESTS]
hRESTS: An HTML Microformat for Describing RESTful Web Services, Kopecky J., Gomadam K., Vitvar T. (Authors), in proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01, December 2008.
[Mediatability]
Mediatability: Estimating the Degree of Human Involvement in XML Schema Mediation, Gomadam K., Ranabahu A., Ramaswamy L., Verma K., Sheth A. P. (Authors), in proceedings of the 2nd IEEE International Conference on Semantic Computing (ICSC), July 2008.
[YDN-JSON-RPC]
Yahoo! Mail Web Service User Guide and API Reference - JSON-RPC Endpoint, January 2010.
[XSLT]
XSL Transformations (XSLT) Version 2.0, Kay M. (Editor). World Wide Web Consortium, 23 January 2007. This version is http://www.w3.org/TR/2007/REC-xslt20-20070123/. The latest version is available at http://www.w3.org/TR/xslt20/.
[SAWSDL]
Semantic Annotations for WSDL and XML Schema - Recommendation, Farrell J., Lausen H. (Editors). World Wide Web Consortium, 28 August 2007. The latest version is available at http://www.w3.org/TR/sawsdl/.
[GRDDL]
Gleaning Resource Descriptions from Dialects of Languages (GRDDL) - Recommendation, Connolly D. (Editor). World Wide Web Consortium, 11 September 2007. The latest version is available at http://www.w3.org/TR/grddl/.
[WSDL]
Web Services Description Language (WSDL) Version 2.0 Part 1: Core Language, R. Chinnici, J-J. Moreau, A. Ryman, S. Weerawarana, Editors. World Wide Web Consortium, 23 May 2007. This version of the "Web Services Description Language (WSDL) Version 2.0 Part 1: Core Language" Specification is available at http://www.w3.org/TR/2007/PR-wsdl20-20070523. The latest version of "Web Services Description Language (WSDL) Version 2.0 Part 1: Core Language" is available at http://www.w3.org/TR/wsdl20.

7. Appendix A: Service Model

The service model in RDFS/N3 is listed below.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sarest: <http://www.knoesis.org/research/srl/standards/sa-rest/#> .

sarest:hasParameter rdf:type owl:ObjectProperty ;
              rdfs:range sarest:Parameter ;
              rdfs:domain sarest:Message .

sarest:hasAddress rdf:type owl:ObjectProperty ;
                  rdfs:domain sarest:Operation ;
                  rdfs:range sarest:URITemplate .

sarest:hasInputMessage rdf:type owl:ObjectProperty ;
                       rdfs:range sarest:InputMessage ;
                       rdfs:domain sarest:Operation .

sarest:hasMethod rdf:type owl:ObjectProperty ;
                 rdfs:range sarest:HTTPMethod ;
                 rdfs:domain sarest:Operation .

sarest:hasOperation rdf:type owl:ObjectProperty ;
                    rdfs:range sarest:Operation ;
                    rdfs:domain sarest:Service .

sarest:hasOutputMessage rdf:type owl:ObjectProperty ;
                        rdfs:range sarest:OutputMessage ;
                        rdfs:domain sarest:Operation .

sarest:InputMessage rdf:type owl:Class ;
              rdfs:subClassOf sarest:Message .

sarest:OutputMessage rdf:type owl:Class ;
               rdfs:subClassOf sarest:Message .

sarest:Parameter rdf:type owl:Class .
sarest:HTTPMethod rdf:type owl:Class .
sarest:Message rdf:type owl:Class .
sarest:Operation rdf:type owl:Class .
sarest:Service rdf:type owl:Class .
sarest:URITemplate rdf:type owl:Class .

sarest:DELETE rdf:type owl:Thing ,
                       sarest:HTTPMethod .
sarest:GET rdf:type owl:Thing ,
                    sarest:HTTPMethod .
sarest:POST rdf:type owl:Thing ,
                     sarest:HTTPMethod .
sarest:PUT rdf:type owl:Thing ,
                    sarest:HTTPMethod .
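
For readers who prefer code to N3, the Python sketch below transcribes the domains, ranges and subclass relations of the listing above into plain dictionaries and looks up which property could link two classes, for instance when interpreting the nested annotations of Section 2.5. Treating rdfs:domain and rdfs:range as a lookup table is a simplification made for this illustration, and the function names are hypothetical.

```python
# Subclass relations from the listing (child -> parent).
SUBCLASS_OF = {"InputMessage": "Message", "OutputMessage": "Message"}

# property -> (rdfs:domain, rdfs:range), transcribed from the listing.
PROPERTIES = {
    "hasParameter": ("Message", "Parameter"),
    "hasAddress": ("Operation", "URITemplate"),
    "hasInputMessage": ("Operation", "InputMessage"),
    "hasMethod": ("Operation", "HTTPMethod"),
    "hasOperation": ("Service", "Operation"),
    "hasOutputMessage": ("Operation", "OutputMessage"),
}


def ancestors(cls):
    """Yield cls followed by each of its superclasses."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def linking_properties(parent, child):
    """Properties whose domain covers `parent` and whose range covers `child`."""
    parent_classes = set(ancestors(parent))
    child_classes = set(ancestors(child))
    return sorted(
        name
        for name, (domain, range_) in PROPERTIES.items()
        if domain in parent_classes and range_ in child_classes
    )


print(linking_properties("Service", "Operation"))
print(linking_properties("Operation", "InputMessage"))
print(linking_properties("InputMessage", "Parameter"))
```

The last call finds hasParameter because InputMessage is a subclass of Message, which is the declared domain of that property.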

* Karthik Gomadam participated in this work before graduating in August 2009 and moving to the University of Southern California.