Draft. This plan is still under development. Please send feedback to sandro@w3.org.
Latest version at: http://www.w3.org/2013/04/vocabs/
Old versions and diffs at: http://www.w3.org/2013/04/vocabs/Overview-history (version history).
This is Revision: 1.22, Date: 2013-04-26 14:02:10
In order to promote the widespread interoperability of data, the W3C is beginning to offer a set of services to help people select, create, and maintain IRI vocabularies useful for creating reusable data. These vocabulary management services will build on existing W3C activities to create a sustainable community of people creating, maintaining, and using vocabularies for data sharing.
Effective data sharing requires people or systems to know how elements
in a dataset are supposed to be understood. For example, a CSV file
can only be used by people who know what the column headings mean.
When software is written to collect and analyze data, its developers
need to understand these structural elements. When data is coming
from many different sources, the data producers have to agree on
structural elements (column names) or else the data consumers
will have to analyze, and then write code for, each different style.
One technique for addressing this problem is to use web addresses (URLs, URIs, or more recently IRIs) as identifiers for elements of the structure (e.g., column headings). This establishes a single authoritative source of information about the identifier's meaning, namely the web page, while still allowing everyone who can create a website to create as many new identifiers as desired. It also allows people to easily and unambiguously refer to the identifiers, for example when recommending them to a colleague or asking a question about the meaning.
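As a rough illustration of the idea, the following Python sketch re-keys two CSV sources that use different column headings by shared IRIs, so that a consumer needs only one code path. The IRIs under example.org, the column names, and the `read_rows` helper are all hypothetical and chosen only for this example; they are not part of any W3C vocabulary.

```python
import csv
import io

# Hypothetical IRIs; example.org is a placeholder domain, not a real vocabulary.
TEMP = "http://example.org/ns/weather#temperatureCelsius"
CITY = "http://example.org/ns/weather#cityName"

# Each producer publishes a mapping from its own column headings to shared IRIs.
SOURCE_A_COLUMNS = {"city": CITY, "temp_c": TEMP}
SOURCE_B_COLUMNS = {"Town": CITY, "Temperature": TEMP}


def read_rows(csv_text, column_iris):
    """Parse CSV text and re-key every row by IRI instead of local heading."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {column_iris[heading]: value for heading, value in row.items()}
        for row in reader
    ]


source_a = "city,temp_c\nOslo,4\n"
source_b = "Town,Temperature\nLima,19\n"

# One consumer code path handles both sources once headings are IRIs.
rows = read_rows(source_a, SOURCE_A_COLUMNS) + read_rows(source_b, SOURCE_B_COLUMNS)
cities = [row[CITY] for row in rows]
```

Because the consumer looks values up by IRI, adding a third producer requires only a new heading-to-IRI mapping rather than new analysis code.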
While this use of IRIs has been adopted in some technical
communities, several barriers remain to wider adoption:
Cost of collaboration. Creating a vocabulary of identifiers that will be suitable for a range of applications usually requires input from a range of people. Finding and organizing the people with expertise in the domain of the vocabulary is one challenge; finding people with expertise in creating vocabularies is another. The result is often vocabularies developed by one or two people and suited only for narrow audiences.
Maintaining a website. In practice, creating and maintaining a website for your identifiers can be difficult. Many organizations structure their website around marketing requirements, not this kind of technical work, and the people who need to create new vocabularies of IRIs have little immediate motivation for creating a public-facing site for their work.
Lack of market information. There is no definitive listing of vocabularies, so it can be hard to determine whether a vocabulary for a particular purpose might exist. Even when one or more candidate vocabularies are found, it can be hard to find out how suitable they are: who else is using them, what are the relative strengths and weaknesses of the design, what software exists to support them, etc.
With the growing world-wide demand for data interoperability, these barriers are becoming increasingly problematic. Fortunately, W3C is well-positioned to address these problems. The vocabulary services outlined below build on existing W3C services and strengths. These services will make it much easier for people to obtain high-quality vocabularies and help create new ones; this in turn will promote data sharing, reuse, and interoperability.
In general, these services are inexpensive to provide and can be offered for free to the public. W3C may, however, decide to charge for certain services and/or limit their use to W3C members.
Summary: We will promote the use of W3C Community Groups for gathering
people interested in developing and maintaining individual
vocabularies, and we will form a Vocabulary Review Group to help guide
others in creating better vocabularies.
W3C has several different kinds of groups, including Community Groups
(which can be created by anyone with the approval of just four other
interested people), and Working Groups (which are created after a
review process by the W3C Advisory Committee). For creating or
maintaining a vocabulary, Community Groups are likely to be the
preferred option, because of their lower financial and procedural
overhead.
One drawback to Community Groups (in contrast to Working Groups) is
that without W3C staff to help guide them, their progress depends on
their leadership figuring out, on their own, how to make a new
standard vocabulary. To reduce this burden and draw in people to use
Community Groups, we will create a guide to developing
vocabularies at W3C.
In order to further help people, particularly on the technical
aspects of the process, we will form a cross-domain group of experts
prepared to examine vocabularies and give suggestions, a Vocabulary
Review Group. We plan to create this group by re-targeting the WebSchemas
(public-vocabs@w3.org) group. That group was created as a public
forum for discussing and reviewing new vocabularies and vocabulary
extensions, but it was initially motivated by just the schema.org
vocabulary and since then has largely focused on that one (large)
vocabulary.
With help from the chairs and schema.org, we intend to rename the
group and clarify the mission. The new Vocabulary Review Group (VRG) will
be charged with helping people produce high-quality vocabularies by
examining ones that are submitted and offering advice. Depending on
interest within the VRG, these reviews might be performed by the
entire VRG (e.g., during a teleconference), by a subgroup, or by
individuals who volunteer. The advice may be public, and become part
of the information people consider in deciding whether to adopt a
vocabulary. The VRG will not, however, be a decision-making body, and
such advice will be attributed to particular individuals (perhaps with
other individuals concurring or dissenting), not the VRG as a whole.
Summary: We will promote and clarify the existing policy of giving www.w3.org/ns space to any W3C group, including community groups. The goal is to allow vocabularies useful for open data interchange to be hosted by W3C and maintained by the people who care about them.
The longstanding W3C policy has been to give
out namespaces (web space for vocabularies) upon request by any W3C
group, including
Community Groups and Business Groups. In
practice, few groups have taken advantage of this policy. It is not
widely known, and the process for updating the namespace document (the
vocabulary website) is not specified. With this in mind, we intend to
publicize this service and automate the namespace document publication
process.
Specifically: the chairs of W3C groups, including Community Groups,
will have access to a web form which allows them to reserve and update
namespace documents. The form will require the chair to confirm several facts.
When the system publishes the documents, it will add a prominent notice of the status (not being endorsed by W3C) along with instructions for how to give feedback. At some point this interface may be expanded to provide software tools which support group development of vocabularies.
This service provides both an easy-to-maintain vocabulary website
and an institutional commitment to maintain that site as long as
people are willing to participate in W3C groups to do the work. This
second feature, a strong persistence policy, is essential to some
potential vocabulary users, and we will continue exploring ways to
make it even stronger. Options include establishing institutional
backup relationships (what happens to w3.org if W3C shuts down?) and
allowing vocabularies to be hosted on other domains so they can be
transferred if there is community consensus that the work is better
done away from W3C.
Summary: We will collect, maintain, and distribute
information about all available vocabularies, with the aim of helping
people identify and decide among alternatives. This will be an open
data application, freely interoperating with other related information
services.
Even though at present vocabularies are generally available free of
charge, we may consider vocabulary adoption as a market, with
"consumers" trying to identify "products" and choose among them. From
this perspective, consumers in the current vocabulary market have very
little information about available products and their features. This
is hardly surprising: there is little or no "advertising", there is no
simple business case for "retailers" to attract and guide consumers,
and there is little available "product information" as might be
printed on a package.
The current market has had some "retailers" who have since gone
away (schema.net), some promising newcomers (LOV), and some successful
efforts in subdomains with available funding (BioPortal). We plan to
improve the flow of information in this market in two complementary
ways:
Vocabulary Directory Website. We will provide a "retail" website
where people can maintain and search listings of vocabularies, along
with useful metadata ("product information"). Metadata may include
simple endorsements ("like", "+1", star ratings) and more detailed
information like reviews, the list of open/closed issues, and the list
of public users/implementations.
Vocabulary Market Database. The directory will be an
open data application, making its internal data available for others
and consuming data feeds from others. People who have existing
metadata will be able to easily provide it to the directory, and
people will be able to create new interfaces for exploring and
exploiting the data.
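To make the "open data application" idea concrete, here is a minimal Python sketch of a directory that exports its listings as a JSON feed and merges feeds received from others. The `VocabularyListing` class, its field names, the JSON layout, and the example.org IRI are assumptions made for illustration; the plan does not specify a data model or feed format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class VocabularyListing:
    """One directory entry; every field name here is a hypothetical choice."""
    iri: str
    title: str
    reviews: list = field(default_factory=list)
    known_users: list = field(default_factory=list)


def export_feed(listings):
    """Serialize listings as JSON so other services can consume them."""
    return json.dumps([asdict(item) for item in listings], sort_keys=True)


def import_feed(feed_text, directory):
    """Merge an external feed into a directory dict keyed by vocabulary IRI."""
    for record in json.loads(feed_text):
        incoming = VocabularyListing(**record)
        existing = directory.get(incoming.iri)
        if existing is None:
            directory[incoming.iri] = incoming
            continue
        # Combine metadata rather than overwrite it, skipping repeated entries.
        existing.reviews += [r for r in incoming.reviews if r not in existing.reviews]
        existing.known_users += [
            u for u in incoming.known_users if u not in existing.known_users
        ]
    return directory


WEATHER = "http://example.org/ns/weather#"  # placeholder IRI
directory = {}
feed_one = export_feed(
    [VocabularyListing(WEATHER, "Weather", ["Clear docs"], ["Org A"])]
)
feed_two = export_feed(
    [VocabularyListing(WEATHER, "Weather", ["Clear docs"], ["Org B"])]
)
import_feed(feed_one, directory)
import_feed(feed_two, directory)
```

Keying the directory by vocabulary IRI means that metadata from independent sources accumulates on a single listing, which is one way an alternative interface could be built over the same data.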
This will be a development project involving both developing
software and developing metadata vocabularies. W3C staff effort will
be partially supported by external funding, and we will work with
volunteers/partners. The software will be open source and the
vocabularies will, of course, make use of the vocabulary management
services described above, using a Community Group and review by the
VRG. Because this is likely to be work of interest to many members of
the VRG, we expect a particularly detailed consultation process.
Using an open data architecture for these services is important not
just as a validation and demonstration of the underlying technologies,
but because it keeps down the barriers to entry for everyone,
everywhere, trying to share data. Where a traditional closed
directory acts as a bottleneck, potentially stifling new approaches
and dissuading people from participating because of uncertainties
in how that directory might be run, an open directory welcomes
everyone to participate — listing new vocabularies, adding
information about vocabularies, and creating apps to help people find
and work with vocabularies. This openness and innovation is a hallmark
of shared data and will be a key benefit of this service.
Copyright © 2013 W3C® (MIT, ERCIM, Keio), All Rights Reserved.