Use Case 5 - Your Website is your API

From W3C eGovernment Wiki
Revision as of 19:53, 23 December 2008 by Oscar.azanon



Use Case: Your Website is your API

Identifier

UC-EGIG-SID-005

Author(s)

Oscar Azañón

Problem Definition

Public organizations provide a wealth of services to citizens, companies and other public organizations. These services differ in nature; on the web, we focus on two kinds:

  • Informational services, where other parties can access information generated by governments (for instance, GDP figures at the national level or the list of sports facilities at the local level).
  • Transactional services, which require some kind of data submission from the parties involved (for instance, a tax submission form for a citizen). Public organizations currently provide a number of transactional services with an ever-increasing level of automation and technical sophistication.

Governments originally built their web presence around informational services and only later started providing on-line transactional services. However, even with current technologies, many governmental organizations are not leveraging web technologies to their full potential in informational services, for several reasons:

  • Use of proprietary formats. In many cases, information is not made available in standard formats (e.g. HTML or the Open Document Format) but in proprietary formats, thus limiting access to people who use a particular software brand or component.
  • Limited automatic re-use of information. This happens when informational services are provided as human-readable documents that mix content and structure. Additional automatic processing (for instance, combining two datasets) is then not directly possible. If statistical data is made available only as a <table> element embedded in a web page, automatic/machine re-use is severely limited.
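
To illustrate the last point, the following Python sketch shows the kind of ad hoc scraping a third party has to write when figures are available only inside an HTML table. The page fragment, its column names and its values are invented for illustration, and only the Python standard library is assumed.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: the figures exist only as presentation markup.
PAGE = """
<html><body>
<h1>Sports facilities</h1>
<table>
  <tr><th>City</th><th>Facilities</th></tr>
  <tr><td>Oviedo</td><td>12</td></tr>
  <tr><td>Gijon</td><td>9</td></tr>
</table>
</body></html>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Text outside table cells (headings, whitespace) is ignored.
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def scrape_table(page):
    """Returns one dict per body row, keyed by the header row."""
    scraper = TableScraper()
    scraper.feed(page)
    header, *body = scraper.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(PAGE)
print(records)
```

A scraper like this depends on the exact page layout and tends to break when the markup changes, which is one reason why publishing the same figures in a structured format makes re-use easier.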

Taking this last scenario into account when designing a data publication strategy greatly boosts everyone’s ability to reuse the information, including:

  • Other public sector organizations, which may use someone else’s information to provide added value by mixing and combining their own information with other sources using web technologies, thus increasing data usability, visibility and value.
  • Other non-public organizations (NGOs, companies, social web communities, etc.) that may create pure-web, standards-based applications that combine different datasets (mashups). For instance, someone can create a layer on top of a geospatial map showing data derived from several sources of information.

Target population

information & services portal designers, mashup builders, etc.

Description

When information is made available through the web using the appropriate standards, it can be used again and again in new, unanticipated and exciting ways that greatly enhance the value of the data through its re-use and combination with increasing automation. With this vision, governments provide basic and elaborated information using standards that empower third parties to further mix, enhance and share this information. This use case exemplifies several ways in which governments can achieve this.

Target software

web browsers, semantic web platforms

Identified problems or limitations

There is great room for improvement in current technologies. Semantic Web technologies can greatly boost this area by providing the data and letting third parties decide how information is generated from that data.

Related initiatives

The following technologies are related to this problem, and offer some of the tools required:

  • XML
  • RDF
  • SOAP
  • REST
  • Linked Data
  • HTTP

Prioritization

high


Examples

Example 1 - RSS/Atom information

Many pieces of information provided by governments are suitable for distribution as news feeds. In this scenario, people subscribe to a set of channels and receive the information. For instance:

  • Job openings
  • eProcurement / acquisitions
  • news
  • etc.

One of the core benefits of this approach is update notification: when a piece of information is added or modified, subscribers can easily find out.

Users only need a news feed reader, which they use to subscribe and read the information.
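
As a rough illustration of what a feed reader does with such a channel, the Python sketch below parses an RSS 2.0 document and reports only the items it has not seen before. The feed content is invented for illustration (it reuses the example host that appears in Example 2), and the function name new_items is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed of job openings (content invented for illustration).
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Job openings</title>
    <link>http://www.mygov.org/jobopenings</link>
    <description>New positions</description>
    <item>
      <title>IT Specialist</title>
      <link>http://www.mygov.org/jobopenings/details?id=54</link>
      <guid>http://www.mygov.org/jobopenings/details?id=54</guid>
    </item>
    <item>
      <title>International Lawyer</title>
      <link>http://www.mygov.org/jobopenings/details?id=55</link>
      <guid>http://www.mygov.org/jobopenings/details?id=55</guid>
    </item>
  </channel>
</rss>
"""


def new_items(feed_xml, seen_guids):
    """Returns (title, link) for items whose guid has not been seen before.

    The set seen_guids is updated in place, so a second call with the
    same feed reports nothing new.
    """
    channel = ET.fromstring(feed_xml).find("channel")
    fresh = []
    for item in channel.findall("item"):
        guid = item.findtext("guid")
        if guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh


# The reader already knows about opening 54, so only 55 counts as an update.
seen = {"http://www.mygov.org/jobopenings/details?id=54"}
updates = new_items(FEED, seen)
for title, link in updates:
    print(title, link)
```

Tracking the guid of each item is one simple way to detect additions; detecting modified items would need extra information, such as a publication date.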

Example 2 - REST interfaces

REST provides an architecture for building web applications on top of standards such as HTTP and XML. Basically, a 'resource' is associated with a URI that can be used to access or modify its information. Under this paradigm, a web site can publish a set of URLs that form a real programmer's API, which third parties can use to build applications that extend the site's capabilities, perhaps by combining several different sites. This technology is therefore highly suitable for developing mashup applications.

In the following example, the list of job openings could be available at the following URL (note the parameter location, passed as an HTTP GET parameter):

http://www.mygov.org/jobopenings/list?location=Oviedo

which would return an XML description of the job openings:

<?xml version='1.0'?>
<jobopenings>
  <jobopening>
    <id>54</id>
    <subject>IT Specialist</subject>
    <description>The Government of the Principado de Asturias has open positions for IT specialists. Required
       skills include:...</description>
    <location>Oviedo</location>
    <startdate>10/10/2008</startdate>
    <enddate>05/12/2008</enddate>
   ...
  </jobopening>
  <jobopening>
    <id>55</id>
    <subject>International Lawyer</subject>
    <description>A lawyer with a degree in international policy is wanted for ...</description>
    <location>Gijon</location>
    <startdate>10/12/2008</startdate>
    <enddate>06/02/2009</enddate>
   ...
  </jobopening>
</jobopenings>

If more information is needed for a given entry, a details service could be made available:

http://www.mygov.org/jobopenings/details?id=54

which would return all the information for that specific post. Other operations (update, delete, etc.) can also be published this way.

Other agencies could use this API to publish the information, perhaps combining several sites at the regional level and plotting the data on a web map.
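
A third-party client for such an API could look roughly like the following Python sketch. It builds the list URL from the location parameter and turns a response into plain records grouped by location. The response is an inline stand-in with a reduced set of fields rather than a real HTTP call, and the function names list_url and parse_openings are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://www.mygov.org/jobopenings"  # example host used in the text


def list_url(location):
    """Builds the URL of the list service for a given location."""
    return f"{BASE_URL}/list?{urlencode({'location': location})}"


def parse_openings(xml_text):
    """Turns a <jobopenings> document into a list of dicts, one per opening."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in opening}
        for opening in root.findall("jobopening")
    ]


# Stand-in for the HTTP response body; a real client might fetch
# list_url("Oviedo") with urllib.request.urlopen instead.
SAMPLE_RESPONSE = """<?xml version='1.0'?>
<jobopenings>
  <jobopening>
    <id>54</id>
    <subject>IT Specialist</subject>
    <location>Oviedo</location>
  </jobopening>
  <jobopening>
    <id>55</id>
    <subject>International Lawyer</subject>
    <location>Gijon</location>
  </jobopening>
</jobopenings>
"""

openings = parse_openings(SAMPLE_RESPONSE)

# Group the subjects by location, as a mashup might do before plotting them.
by_location = {}
for opening in openings:
    by_location.setdefault(opening["location"], []).append(opening["subject"])

print(list_url("Oviedo"))
print(by_location)
```

Using urlencode rather than string concatenation keeps the URL valid when a location contains spaces or accented characters.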

The Seniors Canada Online web site is currently providing such interfaces to perform searches on their databases - for instance, the following URL provides sports information:

http://www.seniors.gc.ca/servlet/SeniorsXMLSearch?search=sports

which returns entries such as the following:

<referenceid>276061</referenceid>
<language>en</language>
<url>http://www.cserv.gov.bc.ca/seniors/guide/index.htm</url>
<dctitle>Bruce-Grey Trail Network</dctitle>
<dcdescription>Bruce and Grey Counties have a multitude of trail experiences available, from snowmobile 
trails, bike paths and walking trails along the Lake Huron and Georgian Bay shorelines, through unique habitats 
of the Bruce Peninsula, to inland forests and waterways of Saugeen Country and the Beaver Valley.</dcdescription>
<dcsource>Bruce Grey Trails Network</dcsource>

This approach can be used to provide rather sophisticated services through parameterized queries. For instance, this link provides all keywords starting with the letter 'L' in French:

http://www.seniors.gc.ca/servlet/SeniorsXMLKeywords?lang=fr&letter=l

Example 3 - Semantic Web technologies

Though powerful, the REST approach helps only as long as the information needed is supplied by a service behind the RESTful interface. Web site owners therefore need to think carefully about which information they hold and which queries are needed in order to implement them. As a result, only these queries are available, which limits the possibilities to the needs foreseen in advance.

Semantic Web technologies can bring a major change in the way the Internet is thought of and used. Under this approach, data is made available in a standard way (RDF triples), and users of the information can query it in ways unforeseen by the owners of the data or the API designers. The public sector provides the information, while external actors (citizens, other public or private organizations, etc.) use it in their own context, thus extracting all the value from it.

Imagine a flight attendant who is thinking about relocating to another region and is looking for a city with an airport and a good standard of living. Median income information is publicly available in the census, and many sources hold airport location information. This scenario makes good use of public data that already exists on the Internet, but it is hard to support and implement using the previous standards, since it requires fetching data from different datasets, merging them and querying the result to find the information.
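
The fetch, merge and query steps of this scenario can be sketched in a few lines of Python once both datasets are available in a structured form. The city names, airport codes and income values below are invented for illustration, and the function name is hypothetical.

```python
# Hypothetical dataset 1: median income per city (values invented).
median_income = {
    "Aville": 42000,
    "Bburg": 38000,
    "Ctown": 51000,
}

# Hypothetical dataset 2: airports and the city each one serves.
airports = [
    {"code": "AAA", "city": "Aville"},
    {"code": "BBB", "city": "Bburg"},
    {"code": "DDD", "city": "Dport"},
]


def cities_with_airport_and_income(airports, median_income, threshold):
    """Joins both datasets on the city name and filters by median income.

    Returns (city, income) pairs sorted from highest to lowest income.
    """
    airport_cities = {airport["city"] for airport in airports}
    matches = [
        (city, income)
        for city, income in median_income.items()
        if city in airport_cities and income > threshold
    ]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


candidates = cities_with_airport_and_income(airports, median_income, 40000)
print(candidates)
```

This sketch joins on plain city names, which is fragile when two sources spell or scope a name differently. Shared identifiers such as URIs make this kind of join more reliable.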


The DBpedia project (http://wiki.dbpedia.org/) is an example of how a given web site can be prepared for this kind of application, using:

  • Internet standards and XML technologies. HTTP, URIs, XML Schema, etc.
  • The Resource Description Framework (RDF) for representing extracted information. Query results would be represented as XML; in the example, the available flights.
  • A set of web sites that provide information (datasets). In the example above, travel agencies or airlines would be dataset providers.
  • A query language. A Semantic Web query language such as SPARQL would be used instead of SQL. The following query would retrieve the list of cities in Spain that have an airport, using third-party datasets:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX air: <http://www.daml.ri.cmu.edu/ont/AirportCodes.daml#>
SELECT ?city
FROM <http://www.daml.ri.cmu.edu/ont/AirportCodes.daml>
WHERE
{
 ?x air:country "Spain".
 ?x air:city ?city.
}
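
To give an intuition of how the two triple patterns in this query are evaluated, the following Python sketch matches them against a tiny in-memory list of triples. It is a toy illustration of pattern matching with shared variables, not a SPARQL engine, and every identifier and value in the data is invented.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All identifiers and values are invented for illustration.
TRIPLES = [
    ("airport:1", "air:country", "Spain"),
    ("airport:1", "air:city", "Oviedo"),
    ("airport:2", "air:country", "Spain"),
    ("airport:2", "air:city", "Madrid"),
    ("airport:3", "air:country", "France"),
    ("airport:3", "air:city", "Paris"),
]


def is_variable(term):
    """Variables are written with a leading question mark, as in the query."""
    return term.startswith("?")


def match_pattern(pattern, triple, bindings):
    """Extends bindings so that pattern equals triple, or returns None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if is_variable(term):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def solve(patterns, triples):
    """Returns every variable binding that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# The two patterns of the WHERE clause: ?x links the country to the city.
query = [("?x", "air:country", "Spain"), ("?x", "air:city", "?city")]
cities = sorted(solution["?city"] for solution in solve(query, TRIPLES))
print(cities)
```

Because the variable ?x appears in both patterns, only cities whose airport is also recorded as being in Spain survive the second step.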

Similarly, this query would return the cities with a median income greater than 40,000 within a given region:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc:  <http://purl.org/dc/elements/1.1/>
PREFIX dcterms:  <http://purl.org/dc/terms/>
PREFIX political: <http://www.political.org/rdf/schema/political/>
PREFIX census: <http://www.mycountry.org/rdf/census/>
SELECT ?city ?medianincome WHERE {
  ?region dcterms:isPartOf <http://www.mycountry.org/rdf/geo/location/myregion> ;
          rdf:type political:Region ;
          dc:title ?city .
  ?region census:data [
    census:populationIncome [
      census:medianIncome ?medianincome
    ]
  ] .
  FILTER(?medianincome > 40000) .
} ORDER BY ?medianincome


Governments would then publish their information so that third parties could query it in distributed Internet applications. This could provide huge benefits:

  • Publishing a PDF document on a portal provides no means for automation, while the Semantic Web would provide a high degree of automation.
  • While current technologies (web services, REST, etc.) provide such automation, public administrations need to program a set of queries and offer them as an API. This provides value, but it requires design work and a decision on which queries are supported (and which are not). It is impossible to foresee all the scenarios of data usage, so the potential for re-use is limited.
  • With Semantic Web approaches, public organizations publish 'data sets' and offer a query interface for applications to access the information in a non-predefined way. This greatly boosts the ability of third parties to use and reuse the information provided by governments, in ways and applications perhaps unforeseen (and unforeseeable) by them.
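
As a rough sketch of what such a query interface could look like from the client side, the Python code below prepares an HTTP GET request that carries a query in the query parameter, as defined by the SPARQL Protocol. The endpoint address is hypothetical, and the request is only built, not sent, since sending it needs a live endpoint.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; a real deployment would publish its own address.
ENDPOINT = "http://www.mygov.org/sparql"

QUERY = """
PREFIX air: <http://www.daml.ri.cmu.edu/ont/AirportCodes.daml#>
SELECT ?city
WHERE {
  ?x air:country "Spain" .
  ?x air:city ?city .
}
"""


def build_sparql_request(endpoint, query):
    """Builds a GET request that passes the query in the 'query' parameter."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask for the standard XML serialization of SPARQL query results.
    headers = {"Accept": "application/sparql-results+xml"}
    return Request(url, headers=headers)


request = build_sparql_request(ENDPOINT, QUERY)
print(request.get_method())
print(request.full_url)
# Sending it would be urllib.request.urlopen(request), given a live endpoint.
```

The point of the sketch is that the client, not the publisher, decides what the query text says; the publisher only has to expose the endpoint.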