WebSchemas/StyleGuide

~~ACTION: Dan post draft style guide to wiki (2012-03-09)~~

The following text was sketched in Oct 2011. It was intended as a start towards a public style guide. Consider this very rough. Note also that WebSchemas/Singularity has recently been adopted, and the text doesn't reflect that properly yet. For now it's probably best if contributors are mainly from the schema.org team, but that's a soft constraint...

DRAFT Schema.org Style Guide

This doc is designed to eventually be published in a more careful form. It describes the modeling style and assumptions behind the current structure, to help users of the schema and those designing extensions. Some of the material here duplicates content already on Schema.org (see <http://schema.org/docs/extension.html>).

This document does not address questions of markup structure (see Microdata and RDFa specs), or questions of process. For information on the public discussion group and associated issue tracking and proposal process see <www.w3.org/wiki/WebSchemas> and <http://www.w3.org/wiki/SchemaDotOrgProcess>.

Naming

Class and property names are in CamelCase. Properties are always named with an initial lower case letter; classes with an initial capital, e.g. ‘additionalName’ (a property), ‘Person’ (a class).
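
The casing rule can be checked mechanically. The following is a minimal sketch in Python, not part of any Schema.org tooling: the helper names is_class_name and is_property_name are hypothetical, and the assumption that names contain only ASCII letters and digits is ours rather than something this guide states.

```python
import re

# Assumption: names consist of ASCII letters and digits only.
# Classes start with an upper case letter, properties with a lower case one.
CLASS_NAME = re.compile(r"[A-Z][A-Za-z0-9]*")
PROPERTY_NAME = re.compile(r"[a-z][A-Za-z0-9]*")


def is_class_name(name: str) -> bool:
    """Return True if `name` follows the class convention, e.g. 'Person'."""
    return CLASS_NAME.fullmatch(name) is not None


def is_property_name(name: str) -> bool:
    """Return True if `name` follows the property convention, e.g. 'additionalName'."""
    return PROPERTY_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["Person", "additionalName", "additional_name"]:
        print(candidate, is_class_name(candidate), is_property_name(candidate))
```

A check like this only covers casing; it cannot tell whether a name is well chosen, or whether a property name is singular.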

Plurality: we initially had a convention towards repeatable properties being named with a final plural ‘s’. In March 2012 this approach was dropped in favour of singular property names.

General Approach

Schema.org takes a pragmatic approach, but with some general guidelines:

When considering some topic, we do not ask ourselves ‘what is the best model of X’; rather, we consider the kinds of simple factual data currently present in Web sites about X, and ask how its markup could be improved.

Avoid enumerations: we do not want to be in the business of maintaining gigantic lists of things (products, countries etc.). Schema.org should be considered a ‘documentation center’ and may grow towards keeping a few pointers to external resources that provide such lists. For example, we might choose to defer to the UN FAO ‘geopolitical ontology’ rather than attempt to track countries and country-like entities, or their properties.

Publisher-centric: we always try to keep things simple for data publishers, acknowledging that they may not have the time, interest or expertise to read specifications. Our stereotypical publisher is exposing data that is already kept in structured form (e.g. in a relational database), so our role is to help them adjust their HTML-generation templates to include simple additional markup.

Relatively flat model: there is no single right way to model anything. For our purposes, we have a bias towards flat models. For example, in the Library world, the FRBR conceptual model distinguishes ‘Works’, ‘Manifestations’, ‘Expressions’ and ‘Items’ when describing books. Within a Library setting, these distinctions are very useful; however they are not so easily applied for in-page annotations. Many Web pages (and authors) conflate these into a single flattened notion of “book”. Our challenge is to find ways to enrich such pages, while making useful distinctions (e.g. a description of a signed copy of some book, versus any copy).

We don’t use has_ prefixes on property names.

Not Re-inventing Wheels. Wherever possible, Schema.org structures should be based upon existing established standards. For example, rather than invent a news-description vocabulary, we partnered with rNews.

Freely extensible: Schema.org tries to anticipate the basic kinds of thing that need describing, and many of their most useful properties. However, any description can be freely augmented with extra properties, without any need for ‘official’ blessing.

Datatyping: Schema.org uses its own simple datatyping primitives, rather than RDF’s datatyping design, which puts excessive burden on publishers.

URLs and strings and chaining: say something here! When we’re describing a Movie page, what’s our pattern for describing the actor, linking to the actor-describing-page etc.?

For simplicity, we use multiple ranges and domains, to avoid cluttering the hierarchy with redundant similar classes.
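
To illustrate the trade-off, here is a minimal sketch in Python of a property definition that carries a set of domains and a set of ranges directly. The PropertyDef class and its accepts method are hypothetical, the ‘author’ values are illustrative only, and subclass inheritance is deliberately left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyDef:
    """Hypothetical record of a property with several domains and ranges."""
    name: str
    domains: frozenset  # types the property may be used on
    ranges: frozenset   # types its values may have

    def accepts(self, subject_type: str, value_type: str) -> bool:
        # Direct membership only; a fuller model would also walk the
        # class hierarchy (assumption: omitted here for brevity).
        return subject_type in self.domains and value_type in self.ranges


# Illustrative values: one property, two permitted value types.
author = PropertyDef(
    name="author",
    domains=frozenset({"CreativeWork"}),
    ranges=frozenset({"Person", "Organization"}),
)

if __name__ == "__main__":
    print(author.accepts("CreativeWork", "Person"))
    print(author.accepts("CreativeWork", "Place"))
```

The alternative would be to invent a class such as ‘PersonOrOrganization’ purely to act as the single range of the property, which is the kind of redundant class the guideline tries to avoid.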

When we add some extension into Schema.org, we need:

to look for overlaps: things we can already ‘say’ that can be said differently in the new vocab.
to look at the use of datatypes and property naming conventions.
to look at other popular models for carrying this kind of data (e.g. if improving our contacts or people vocab, we would need to compare Portable Contacts, vCard, FOAF and others).