Draft. This plan is still under development. Please send feedback to sandro@w3.org.
Latest version at: http://www.w3.org/2013/04/vocabs/
(version history).
This is Revision: 1.25, Date: 2013-05-24 12:56:27
In order to promote the widespread interoperability of data, the W3C is beginning to offer a set of services to help people select, create, and maintain IRI vocabularies useful for creating reusable data. These vocabulary management services will build on existing W3C activities to create a sustainable community of people creating, maintaining, and using vocabularies for data sharing.
Effective data sharing requires people or systems to know how elements in a dataset are supposed to be understood. For example, a CSV file can only be used by people who know what the column headings mean. When software is written to collect and analyze data, its developers need to understand these structural elements. When data comes from many different sources, the data producers have to agree on structural elements (column names), or else the data consumers have to analyze each different style and write code for it.
One technique for addressing this problem is to use web addresses
(URLs, URIs, or more recently IRIs) as identifiers for elements of the
structure (e.g., column headings). This establishes a single
authoritative source of information about the identifier's meaning
(the web page) while still allowing everyone who can
create a website to create as many new identifiers as desired. It
also allows people to easily and unambiguously refer to the
identifiers, for example when recommending them to a colleague or
searching for datasets which provide particular data.
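As a rough illustration of the technique, the sketch below rekeys two CSV sources that use different column headings onto a shared pair of IRIs, so a consumer can process both with the same code. The IRIs under example.org, the sample data, and the helper name read_with_iris are all hypothetical and exist only for this example.

```python
import csv
import io

# Hypothetical IRIs; example.org is a placeholder, not a real vocabulary.
STATION = "http://example.org/vocab/stationName"
TEMP = "http://example.org/vocab/airTemperatureCelsius"

# Each producer keeps its own headings but publishes a mapping to shared IRIs.
SOURCE_A = ("station,temp_c\nOslo,4.5\n", {"station": STATION, "temp_c": TEMP})
SOURCE_B = ("Temperature,Site\n7.0,Bergen\n", {"Site": STATION, "Temperature": TEMP})


def read_with_iris(text, heading_to_iri):
    """Parse CSV text and rekey every row by IRI instead of local heading."""
    rows = csv.DictReader(io.StringIO(text))
    return [{heading_to_iri[h]: v for h, v in row.items()} for row in rows]


# The consumer no longer needs per-source code: both sources share the same keys.
merged = read_with_iris(*SOURCE_A) + read_with_iris(*SOURCE_B)
for row in merged:
    print(row[STATION], row[TEMP])
```

The consumer here depends only on the IRIs, not on either producer's choice of heading text or column order.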
While this use of IRIs has been adopted in some technical communities, there are several barriers to wider adoption:
Cost of collaboration. Creating a vocabulary of identifiers that will be suitable for a range of applications usually requires input from a range of people. Finding and organizing the people with expertise in the domain of the vocabulary is one challenge; finding people with expertise in creating vocabularies is another. The result is often vocabularies developed by one or two people and suited only for narrow audiences.
Maintaining a website. In practice, creating and maintaining a website for your identifiers can be difficult. Many organizations structure their website around marketing requirements, not this kind of technical work, and the people who need to create new vocabularies of IRIs have little immediate motivation for creating a public-facing site for their work.
Lack of market information. There is no definitive listing of vocabularies, so it can be hard to determine whether a vocabulary for a particular purpose might exist. Even when one or more candidate vocabularies are found, it can be hard to find out how suitable they are: who else is using them, what are the relative strengths and weaknesses of the design, what software exists to support them, etc.
With the growing world-wide demand for data interoperability, these barriers are becoming increasingly problematic. Fortunately, W3C is well-positioned to address these problems. The vocabulary services outlined below build on existing W3C services and strengths. These services will make it much easier for people to obtain high quality vocabularies and help create new ones; this in turn will promote data sharing, reuse, and interoperability.
In general, these services are inexpensive to provide and can be offered for free to the public. W3C may, however, decide to charge for certain services and/or limit their use to W3C members.
Summary: We will promote the use of W3C Community
Groups for gathering people interested in developing and maintaining
individual vocabularies, and we will encourage new and existing
cross-domain groups to help guide others in creating better
vocabularies.
W3C has several different kinds of groups, including Community Groups (which can be created by anyone with just four other interested people), and Working Groups (which are created after a review process by the W3C Advisory Committee). For creating or maintaining a vocabulary, Community Groups are likely to be the preferred option, because of their lower financial and procedural overhead.
One drawback to Community Groups (in contrast to Working Groups) is
that without W3C staff to help guide them, their progress depends on
their leadership figuring out, on their own, how to make a new
standard vocabulary. To reduce this burden and draw in people to use
Community Groups, we are creating a guide
to developing vocabularies at W3C.
In order to further help people, particularly on the technical
aspects of the process, we continue to support the WebSchemas
(public-vocabs@w3.org) group, a public forum for discussing and
reviewing new vocabularies and vocabulary extensions and a community
of practitioners willing to offer advice on vocabulary design.
Although this group has so far largely focused on vocabularies hosted
at schema.org, its mission covers all vocabularies. With help from the
chair and schema.org, we intend to promote this group more broadly as
a source for vocabulary technical reviews, coordination, and
developing vocabulary design expertise.
Other cross-domain groups may also be formed, following the normal
processes, such as to produce a vocabulary-design Best Practices
document. As explained below, we may also form a group to develop
vocabulary metadata suitable for helping people choose among available
vocabularies.
Summary: We will promote and clarify the existing policy of giving www.w3.org/ns space to any W3C group, including community groups. The goal is to allow vocabularies useful for open data interchange to be hosted by W3C and maintained by the people who care about them.
The longstanding W3C policy has been to give
out namespaces (web space for vocabularies) upon request by any W3C
group, including
Community Groups and Business Groups. In
practice, few groups have taken advantage of this policy. It is not
widely known, and the process for updating the namespace document (the
vocabulary website) is not specified. With this in mind, we intend to
publicize this service and streamline the namespace document publication
process.
Specifically: the chairs of W3C groups, including Community Groups,
will have a simple way to reserve and update namespace documents,
should they choose to have their vocabulary hosted by W3C and managed
according to its policies. If they make use of this service, they
will be required to confirm several facts.
When we publish the documents, we will add a prominent notice of the
status (not being endorsed by W3C) along with instructions for how to
give feedback. At some point we may provide software tools which
support group development of vocabularies.
This service provides both an easy-to-maintain vocabulary website
and an institutional commitment to maintain that site as long as
people are willing to participate in W3C groups to do the work. This
second feature, a strong persistence policy, is
essential to some potential vocabulary users, and we will continue
exploring ways to make it even stronger. Options include establishing
institutional backup relationships (what happens to w3.org if W3C
shuts down?) and allowing vocabularies to be hosted on other domains
so they can be transferred if there is community consensus that the
work is better done away from W3C.
Summary: We will collect, maintain, and distribute information about all available vocabularies, with the aim of helping people identify and decide among alternatives. This will be an open data application, freely interoperating with related information services.
Even though at present vocabularies are generally available free of
charge, one may consider vocabulary adoption as a market, with
"consumers" trying to identify "products" and choose among them. From
this perspective, consumers in the current vocabulary market have
little information about available products and their features. This
is hardly surprising: there is little or no "advertising", there is no
simple business case for "retailers" to attract and guide consumers,
and there is little available "product information" as might be
printed on a package.
The current market has had some "retailers" who have since gone away (schema.net), some promising newcomers (LOV), and some successful efforts in subdomains with available funding (BioPortal). We plan to improve the flow of information in this market in two complementary ways:
Vocabulary Directory Website. We will provide a "retail"
website where people can maintain and search listings of vocabularies
(wherever they are hosted), along with useful metadata ("product
information"). Metadata may include simple endorsements ("like",
"+1", star ratings) and more detailed information like reviews, the
list of open/closed issues, and the list of public
users/implementations.
Vocabulary Market Database. The directory will be an open data application, making its internal data available for others and consuming data feeds from others. People who have existing metadata will be able to easily provide it to the directory, and people will be able to create new interfaces for exploring and exploiting the data.
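To make the open data idea more concrete, here is a minimal sketch of how a directory might combine its own listings with a metadata feed supplied by another party, keyed by vocabulary IRI. The record fields ("iri", "endorsements", "reviews"), the example.org IRIs, and the function name merge_feeds are assumptions made for illustration; the plan does not specify a data format.

```python
import json


def merge_feeds(*feeds):
    """Combine listing feeds into one directory keyed by vocabulary IRI."""
    directory = {}
    for feed in feeds:
        for entry in feed:
            # Create an empty record the first time an IRI is seen.
            record = directory.setdefault(
                entry["iri"],
                {"iri": entry["iri"], "endorsements": 0, "reviews": []},
            )
            record["endorsements"] += entry.get("endorsements", 0)
            record["reviews"].extend(entry.get("reviews", []))
    return directory


# Hypothetical feeds: one held by the directory, one provided by a partner.
local_feed = [
    {"iri": "http://example.org/vocab/weather", "endorsements": 3,
     "reviews": ["Clear term definitions."]},
]
partner_feed = [
    {"iri": "http://example.org/vocab/weather", "endorsements": 2},
    {"iri": "http://example.org/vocab/transit", "reviews": ["Needs examples."]},
]

directory = merge_feeds(local_feed, partner_feed)

# Publishing the merged data lets others build their own interfaces on it.
print(json.dumps(directory, indent=2, sort_keys=True))
```

Because every record is identified by the vocabulary's IRI, feeds from independent sources can be combined without prior coordination on local identifiers.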
This will be a development project involving both developing
software and developing metadata vocabularies. W3C staff effort will
be partially supported by external funding, and we will work with
volunteers/partners. The software will be open source and the
vocabularies will, of course, make use of the vocabulary management
services described above, using a Community Group and review by
public-vocabs@w3.org.
Using an open data architecture for these services is important not
just as validation and demonstration of the underlying technologies,
but because it keeps down the barriers to entry for everyone,
everywhere, trying to share data. Where a traditional closed
directory acts as a bottleneck, stifling new approaches and dissuading
people from participating because of uncertainties in how that
directory might be run, an open data directory welcomes everyone to
participate. Everyone is free to list new vocabularies, add
information about vocabularies, and create apps to help people find and
work with vocabularies. This openness and innovation is a hallmark of
shared data and will be a key benefit of this service.
Copyright © 2013 W3C® (MIT, ERCIM, Keio), All Rights Reserved.