Projects/SDW

From W3C eGovernment Wiki

Goal

  • Leverage the Social Data Web to develop metadata standards for government data to enable local design and achieve global integration across federated information domains.
    • (see below for a brief explanation of the Social Data Web)

Objectives

  • Articulate the mechanics (standards and tools) and value proposition (features and capabilities) of the Social Data Web to government communities of interest engaged in collaborative development of metadata standards specifications for transparent information sharing.
  • Subject matter experts gather in multiple autonomous communities of interest and collaborate using Social Media tools that remove institutional boundaries to transparently standardize metadata in their respective government data domains.
  • Standardized domain metadata is expressed using the languages and standards of the Data Web, enabling the emergence of integration metadata that correlates the independently designed, developed and deployed domain metadata standards.
  • Authoritative government sponsors steward the management of domain and integration metadata in a voluntary consensus standards organization.


Below is a brief explanation of the Social Data Web and its use in a 'Data Management Strategy' - addressing Transparency through Web publishing, Collaboration through the use of Social Networking Software (and the resulting 'User Generated Content', most often referred to as 'Social Media'), and Information Sharing through the use of the Data Web. Together these three elements form the 'Social Data Web' named in the Goal.


What is the Social Data Web?

First we briefly acknowledge the foundational features of the World Wide Web that make it the most successful distributed computing architecture in existence. We then describe the capabilities of the Data Web as the most ubiquitous information sharing infrastructure ever invented. Finally, we highlight the features of Social Media that make the Social Data Web the most powerful transparency, collaboration and information sharing medium we have ever known.

The Web

HTTP is the application protocol of the Web. It provides a set of actions that together are commonly referred to as an Application Programming Interface (or 'API') for interacting with the many and varied things on the Web. A user or program interacting with the Web invokes these actions to transition from thing to thing, with each thing being considered a Web 'resource'. Resources on the Web are typically documents that contain unstructured, semi-structured or structured data, and often include or refer to multimedia formats such as audio and video. These different resource 'representations' are served in the format preferred by the user agent, which acts on behalf of the human or program. Browsers act as agents of human users, requesting and displaying Web documents that are most often represented in HTML for reading, viewing or watching. Hyperlinks (or just 'links') that connect any page to any other page (or more correctly, any resource representation to any other resource representation) are the basis of the Web ecosystem. Links establish value on the Web by virtue of a single 'anchor' relationship defined in HTML, which links one document to another by invoking the built-in actions of the HTTP protocol.
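The idea that one resource can be served as different representations, chosen according to the user agent's preference, can be sketched in a few lines of Python. This is a simplified illustration, not a full implementation of HTTP content negotiation: the resource content, the `REPRESENTATIONS` table and the `negotiate` function are all hypothetical, and partial wildcards such as `text/*` are not handled.

```python
# Hypothetical resource available in three representations, keyed by media type.
REPRESENTATIONS = {
    "text/html": "<p>Budget 2010</p>",
    "application/json": '{"title": "Budget 2010"}',
    "text/csv": "title\nBudget 2010\n",
}


def parse_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    prefs = []
    for position, part in enumerate(header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0], 1.0  # a missing q parameter means q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if media_type and q > 0:  # q=0 marks a type as not acceptable
            prefs.append((-q, position, media_type))
    # Sort by descending q, then by the order given in the header.
    return [media_type for _, _, media_type in sorted(prefs)]


def negotiate(accept_header, available=REPRESENTATIONS):
    """Pick the representation the user agent prefers, or None if none fits."""
    for media_type in parse_accept(accept_header):
        if media_type in available:
            return media_type, available[media_type]
        if media_type == "*/*":  # the agent accepts anything: serve the default
            default = next(iter(available))
            return default, available[default]
    return None


# A browser-like agent prefers HTML; a program might ask for JSON instead.
print(negotiate("text/html, */*;q=0.1"))
print(negotiate("application/json"))
```

A request for a media type the resource does not offer returns `None` in this sketch; a real server would typically answer such a request with an error status or a default representation.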

The Data Web

The 'Data Web' (also referred to as the 'Web of Data') is an extension of the 'Document Web'. We all use the Document Web every day, using our browsers to invoke HTTP actions on web servers that provide access to resource representations we read or look at. Just as hyperlinks on the Document Web are links from one document to another document, hyperlinks on the Data Web are links from a datum in one dataset to a datum in another dataset. Datasets are defined by metadata that describes the structure and relationships of the data entities, and so specifies all the data possible in the dataset. Metadata is therefore a formal specification of allowable data, sometimes referred to as a data schema, and can be used to constrain, govern, and validate data. A column named 'Amount' in a spreadsheet, with numbers in the cells of every row beneath it, will likely be interpreted by people as numeric content, but without a formal schema specifying a numeric 'datatype', machines may interpret each datum as textual content. When datasets are instantiations of datatypes that are formally defined by metadata, and that schema is itself designed using standards-based languages, data processing is easier to automate. Metadata not only enhances automation, but also significantly increases the quality and integrity of data, both of which are important when sharing information between people and machines. On the Data Web, hyperlinks are extended through custom named relationships corresponding to metadata specifications that define how one data entity and its instance data are related to another, within or across datasets that are independently published on the Web.
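The 'Amount' example can be made concrete with a minimal Python sketch. Everything named here is an assumption made for illustration: the `SCHEMA` mapping, the sample rows in `RAW`, the `load` function, and the example.org and example.com URIs. A real Data Web deployment would express the schema and the links in standard languages rather than in a Python dictionary.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

# Hypothetical dataset: without a schema, every cell is read as text.
RAW = "Agency,Amount\nParks,1200.50\nTransit,n/a\n"

# Hypothetical schema: metadata declaring the datatype of each column.
SCHEMA = {"Agency": str, "Amount": Decimal}


def load(text, schema):
    """Convert each row to the declared datatypes and collect invalid rows."""
    rows, errors = [], []
    reader = csv.DictReader(io.StringIO(text))
    for line_no, record in enumerate(reader, start=2):  # line 1 is the header
        typed = {}
        try:
            for column, datatype in schema.items():
                typed[column] = datatype(record[column])
        except (InvalidOperation, ValueError, TypeError, KeyError):
            # Record the line, the offending column and the rejected value.
            errors.append((line_no, column, record.get(column)))
            continue
        rows.append(typed)
    return rows, errors


# A named link: (subject datum, relationship, object datum).
# The relationship is itself identified by a URI defined in metadata.
LINKS = [
    ("http://example.org/budget#Parks",
     "http://example.org/terms#fundedBy",
     "http://example.com/agencies#Treasury"),
]


def related(subject, relationship, links=LINKS):
    """Follow a named relationship from one datum to data in other datasets."""
    return [o for s, p, o in links if s == subject and p == relationship]


untyped = next(csv.DictReader(io.StringIO(RAW)))
print(type(untyped["Amount"]).__name__)  # the raw cell is text
rows, errors = load(RAW, SCHEMA)
print(rows)    # rows that satisfy the schema, with numeric amounts
print(errors)  # rows rejected by the schema
```

The schema serves two purposes in this sketch: it tells the machine that 'Amount' is numeric, and it rejects the row whose amount is not a number, which is the constraining and validating role described above.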

The Social Web

Every government agency has an existing Web presence; however, the dominant use is 'one to many' publishing of just a few unstructured, non-machine-readable file formats. This is characteristic of first generation websites, where publishers have a one-way conversation with anonymous consumers. Participation features, including but not limited to the existing data.gov ratings and suggestion functionality, are common characteristics of so-called Web 2.0 sites that engender a 'many to many' dialog between content consumers and publishers. The 'Social Web' further enhances collaborative capabilities by making it free and easy for anyone to publish content, often referred to as 'Social Media', through user-friendly tools like wikis, blogs and forums. Communities of interest using 'Social Networking Software' form around the mutually shared interests and abilities of individuals rather than the organizational hierarchies or corporate mores previously imposed on them. Contributions among collaborators accumulate through 'activity streams' that feed historical events to subscribers interested in the emergent structure of their evolving content, which is organized by user generated metadata called 'tags' into informal groupings called 'folksonomies'. Boundaries of interpersonal trust adapt to the information sharing needs of varying community challenges, ranging from emergency notification and rapid response, to marketing and communications, to research and development, and many other collaborative endeavors of greater or lesser seriousness.
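How tags from an activity stream accumulate into a folksonomy can be sketched as follows. The stream entries, user names, resource names and the `build_folksonomy` function are hypothetical, and the sketch assumes that tags differing only in letter case refer to the same grouping.

```python
from collections import Counter, defaultdict

# Hypothetical activity stream: (contributor, resource, tags applied).
ACTIVITY_STREAM = [
    ("alice", "dataset/air-quality", ["environment", "health"]),
    ("bob", "dataset/air-quality", ["Environment", "sensors"]),
    ("carol", "dataset/clinic-locations", ["health"]),
]


def build_folksonomy(stream):
    """Group resources by tag and count how often each tag is applied."""
    groups = defaultdict(set)
    usage = Counter()
    for _user, resource, tags in stream:
        for tag in tags:
            key = tag.lower()  # assumption: tags are case-insensitive
            groups[key].add(resource)
            usage[key] += 1
    return dict(groups), usage


groups, usage = build_folksonomy(ACTIVITY_STREAM)
for tag, resources in sorted(groups.items()):
    print(tag, sorted(resources), usage[tag])
```

No contributor defined the groupings in advance; they emerge from the tags that independent contributors happened to apply, which is what distinguishes a folksonomy from a formally specified schema.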

The Social Data Web

The Social Data Web is simply a way to describe the ever expanding utility of the Web: a combination of existing social forms and data functions that significantly enhances our ability to realize the Open Government phenomenon we're now experiencing, resulting in transparent, participatory and collaborative 'Government Linked Data'. Applying today's social collaborative capabilities to the development of formal metadata for standardized data schemas is just the beginning of the Social Data Web cycle. The social evolution of data standards enables the 'long tail' of 'crowdsourced' contributors to apply authoritative metadata as tags where such metadata exists. Perhaps more importantly, it also allows contributors to assert new folksonomies within or across the gaps between authoritative standards, to describe new or nonstandard datasets. The Social Data Web cycle then repeats when these emergent folksonomic structures ultimately become authoritative ontologies, in a virtuous feedback loop of continuous improvement in government data and public information.