WebSchemas/Singularity

From W3C Wiki
Revision as of 02:02, 14 June 2013 by Scorlosq



This is a WebSchemas proposal Singularity for schema.org. See Proposals listing for more. Status: Published



Overview

  • This 'schema.org singularity' proposal describes a candidate change to schema.org's property naming design, based on feedback in the WebSchemas group.
  • It is not a proposal for substantive new vocabulary. Rather, it suggests minor edits to the spelling of property names to remove plural 's'.
  • Tantek has suggested that we use the opportunity of this change to reconsider bringing some property names closer to usage elsewhere.
  • The schema.org team agreed to move ahead with this proposal on 2012-03-09; details are to be worked out here (see action in tracker).

Background:

  • Schema.org has many properties whose spelling ends in a final plural 's'.
  • The extension mechanism docs describe a naming convention: "Properties start with a lower case letter and are also camelCase. Properties that can take multiple values (such as parents) are plural and those that can take only a single value (such as dateOfBirth) are singular."
  • We have an open issue, "Determining intended cardinality of schema.org properties", raised from the perspective of those wanting to know property cardinality.
  • Others have commented from a markup perspective that the plural 's' looks odd in both Microdata and RDFa (less so in HTML5's Microdata JSON syntax).

Problem statement

The initial schema.org practice of writing plurals for property names gives rise to markup such as this:

  <div itemscope="" itemtype="http://schema.org/SoftwareApplication">
     <p itemprop="operatingSystems">OSX 10.6</p>,
     <p itemprop="operatingSystems">Windows 7</p>
  ...
  </div>

Feedback suggests that authors, publishers and developers are surprised that we write 'operatingSystems' here, rather than 'operatingSystem'.

Another example (from Movie):


  <div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Pirates of the Caribbean: On Stranger Tides (2011)</h1>
  <span itemprop="description">Jack Sparrow and Barbossa embark on a quest to
   find the elusive fountain of youth, only to discover that Blackbeard and
   his daughter are after it too.</span>
  Director:
   <div itemprop="director" itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Rob Marshall</span>
  </div>
  Writers:
   <div itemprop="author" itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Ted Elliott</span>
  </div>
  <div itemprop="author" itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Terry Rossio</span>
  </div>
  , and 7 more credits
  Stars:
   <div itemprop="actors" itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Johnny Depp</span>,
   </div>
  <div itemprop="actors" itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Penelope Cruz</span>,
  </div>
  <div itemprop="actors" itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Ian McShane</span>
  </div>
  ...
  </div>

  • Note that we write 'director' (singular), 'author' (singular) and 'actors' (plural), even though the markup describes two authors.
  • This would look similar in RDFa 1.1 Lite.
  • Since each itemprop describes a single actor, we might instead expect (abbreviating the example):
  <div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Pirates of the Caribbean: On Stranger Tides (2011)</h1>
  ...
   <div itemprop="actor" itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Johnny Depp</span>,
   </div>
  <div itemprop="actor" itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Penelope Cruz</span>,
  </div>
  <div itemprop="actor" itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Ian McShane</span>
  </div>
  ...
  </div>

This proposal explores ways of achieving this, while preserving documentation of the earlier form so that search engines can handle all variants.
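
The mismatch between a property's spelling and the number of values it actually carries can be checked mechanically. The following is a minimal sketch in Python, not a conforming Microdata parser: it simply counts itemprop tokens in a trimmed copy of the Movie markup, ignoring item nesting. The class name ItempropCounter and the helper count_itemprops are hypothetical names introduced for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Trimmed copy of the Movie example; the layout is simplified for the sketch.
MARKUP = """
<div itemscope itemtype="http://schema.org/Movie">
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Rob Marshall</span></div>
  <div itemprop="author" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Ted Elliott</span></div>
  <div itemprop="author" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Terry Rossio</span></div>
  <div itemprop="actors" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Johnny Depp</span></div>
  <div itemprop="actors" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Penelope Cruz</span></div>
  <div itemprop="actors" itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Ian McShane</span></div>
</div>
"""


class ItempropCounter(HTMLParser):
    """Counts how often each itemprop token appears (flat, ignores nesting)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # itemprop may hold several space-separated property names
            if name == "itemprop" and value:
                self.counts.update(value.split())


def count_itemprops(markup):
    parser = ItempropCounter()
    parser.feed(markup)
    return parser.counts


counts = count_itemprops(MARKUP)
for prop in ("director", "author", "actors"):
    print(prop, counts[prop])
```

In this sketch the singular 'author' carries two values and the plural 'actors' carries three, so the final 's' is not a reliable signal of how many values a page supplies.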

Core Proposal

  • For every property name whose final 's' is due to the earlier practice of writing plurals for property names, create a new property with a shorter, singular name.
  • For each previous property name, agree that it is still acceptable (and somehow document this in schema.org for people and machines), but that the singular form is preferred.
  • TODO: check situation regarding classes/types.
  • TODO: check other in-progress draft proposals; they likely use the plural idiom.
  • Note that this gives us much more flexibility, as the things we say about property cardinality are not directly encoded in the spelling of properties that are used in millions of Web pages.
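
One way a consumer might treat both spellings as equivalent is sketched below in Python. This is only an illustration under stated assumptions: the alias table LEGACY_ALIASES and the helpers normalize_property and normalize_item are hypothetical, and the table lists only the two pairs used as examples on this page ('actors'/'actor' and 'operatingSystems'/'operatingSystem'). How schema.org itself would document the equivalence is left open by the proposal.

```python
# Hypothetical alias table: legacy plural name -> preferred singular name.
# Only the two pairs used as examples on this page are included; a real
# table would be derived from the reviewed candidate list.
LEGACY_ALIASES = {
    "actors": "actor",
    "operatingSystems": "operatingSystem",
}


def normalize_property(name):
    """Return the preferred (singular) name if one is known, else the name as given."""
    return LEGACY_ALIASES.get(name, name)


def normalize_item(pairs):
    """Group (property, value) pairs under preferred names, keeping every value."""
    merged = {}
    for name, value in pairs:
        merged.setdefault(normalize_property(name), []).append(value)
    return merged


pairs = [
    ("actors", "Johnny Depp"),         # legacy plural spelling
    ("actor", "Penelope Cruz"),        # preferred singular spelling
    ("benefits", "Health insurance"),  # ends in 's' but has no alias, so unchanged
]
result = normalize_item(pairs)
print(result)
```

An explicit table is used here rather than stripping a trailing 's', because several property names end in 's' without being plurals in the sense this proposal addresses.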

Discussion

TODO - link me

Details

The following draft list of properties was initially collected by Lin Clark. We should double-check before submitting any final change request.

(See also discussion of alumniOf vs alumnusOf; are there other non-'s' plurals we're missing?)

ACTION (09-03-2012): Dan to add detail to this list, and check each property carefully. (work-in-progress; first the shortlist, then the property definitions)

Notes:

  • in some (how many? one?) cases, we end up with a class and a property that have the same name (event/Event). But we have this already (url/URL). We should note this in the intro somewhere to avoid confusion.
  • in many cases, plural form is used in the property definition too; these will need reviewing too.
  • we should extract a list of those that were initially singular, in case it is plausible to mark their status in schema instead.
  • essence of this issue should be extracted into a WebSchemas/StyleGuide.

Candidate list of additions / changes

The italic singular term is the intended addition to schema.org.

Unchanged *-s properties

The following property names end in a final '-s', but are not grammatically plural and are unchanged by this proposal. The list is kept here for reference.

  • educationRequirements (on JobPosting; value is Text): educationRequirement
  • acceptsReservations from FoodEstablishment: Either Yes/No, or a URL at which reservations can be made.
  • benefits from JobPosting: (text) Description of benefits associated with the job.
  • calories from NutritionInformation: (Energy) The number of calories
  • mentions from CreativeWork: Indicates that the CreativeWork contains a reference to, but is not necessarily about a concept. Rationale: 3rd person.
  • numberOfEpisodes from TVSeason TVSeries: The number of episodes in this season or series.
  • numberOfPages from Book: The number of pages in the book.
  • numTracks from MusicPlaylist: The number of tracks in this album or playlist.
  • openingHours from LocalBusiness: The opening hours for a business. Opening hours can be specified as a weekly time range, starting with days, then times per day.
  • publishingPrinciples from CreativeWork: (URL) Link to page describing the editorial principles of the organization primarily responsible for the creation of the CreativeWork.
  • recipeInstructions from Recipe: (text) The steps to make the dish. Rationale: instructions don't necessarily break down very well in different values, see example.
  • workHours from JobPosting: (text) The typical working hours for this job (e.g. 1st shift, night shift, 8am-5pm).
  • experienceRequirements (the rest of these lacking the del markup; but also have been checked and aren't grammatically plural in the sense addressed here)
  • expires
  • incentives
  • ingredients
  • keywords
  • offers
  • qualifications
  • responsibilities
  • skills
  • specialCommitments

RDFa impact

The schema.org team discussed potential impact of this change on RDFa 1.1, since RDFa has a 'chaining' mechanism. See editor's Working Draft for details.


Summary:

  • RDFa 1.1 chaining includes a mechanism that allows a typed relationship to be indicated once, yet apply to the entities described by several sibling subelements.
  • This requires use of rel= attribute, rather than the property= mechanism used in the RDFa 1.1 Lite profile.
  • RDFa WG members (Manu, Stéphane) helped here, and urged that we adopt the WebSchemas/Singularity design; they didn't see a problem w.r.t. RDFa chaining.
  • From 1.1 spec: "in many situations the @property and @rel are interchangeable. This is not true for chaining."
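
The effect of chaining described above can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not an RDFa processor: it assumes the outermost element is the subject, treats each directly nested element carrying typeof as one object of the rel, and labels that object by its 'name' property. The function name chained_links is hypothetical.

```python
import xml.etree.ElementTree as ET

# The chained Movie markup, using the proposed singular 'actor'.
MARKUP = """
<div vocab="http://schema.org/" typeof="Movie">
  <div rel="actor">
    <div typeof="Person"><span property="name">Johnny Depp</span>,</div>
    <div typeof="Person"><span property="name">Penelope Cruz</span>,</div>
    <div typeof="Person"><span property="name">Ian McShane</span></div>
  </div>
</div>
"""


def chained_links(markup):
    """Toy model: emit one (subject type, rel, object name) per typed child of a rel element."""
    root = ET.fromstring(markup)
    subject = root.get("typeof")  # simplification: the root element is the subject
    links = []
    for element in root.iter():
        rel = element.get("rel")
        if rel is None:
            continue
        for child in element:
            if child.get("typeof") is None:
                continue
            name_el = child.find(".//*[@property='name']")
            label = name_el.text if name_el is not None else None
            links.append((subject, rel, label))
    return links


links = chained_links(MARKUP)
for link in links:
    print(link)
```

In this model the relationship is written once, yet each sibling yields its own link, which is why a singular name reads naturally per link.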

Details follow

In this example, the 'dbp-owl:residence' property has two separate values. This creates two links of type 'dbp-owl:residence'; one to the German_Empire URI, the other to the Switzerland URI. The question is whether there is a strong intuition that a property name like 'residence' here should instead be 'residences'. A strong intuition against 'residence' would be evidence against this proposal.

Nobody felt this was a problem, but we record the examples here for reference.

  <div about="http://dbpedia.org/resource/Albert_Einstein" rel="dbp-owl:residence">
   <span about="http://dbpedia.org/resource/German_Empire"></span>
   <span about="http://dbpedia.org/resource/Switzerland"></span>
  </div>

Double-checking by using a real schema.org example, Movie (highlighting lines that indicate the actor(s) relationship):

  <div vocab="http://schema.org/" typeof="Movie">
   <div property="actors" typeof="Person"><span property="name">Johnny Depp</span>,</div>
   <div property="actors" typeof="Person"><span property="name">Penelope Cruz</span>,</div>
   <div property="actors" typeof="Person"><span property="name">Ian McShane</span></div>
  </div>


Can we adjust this so that it chains? Yes:

  <div vocab="http://schema.org/" typeof="Movie">
   <div rel="actors">
     <div typeof="Person"><span property="name">Johnny Depp</span>,</div>
     <div typeof="Person"><span property="name">Penelope Cruz</span>,</div>
     <div typeof="Person"><span property="name">Ian McShane</span></div>
   </div>
  </div>

Here's a complete example of the 'worst case scenario', using the proposed singular wording, i.e. "actors" -> "actor":


  <html>
  <head><title>A movie page</title></head>
  <body>
  <div vocab="http://schema.org/" typeof="Movie">
   <div rel="actor">
     <div typeof="Person"><span property="name">Johnny Depp</span>,</div>
     <div typeof="Person"><span property="name">Penelope Cruz</span>,</div>
     <div typeof="Person"><span property="name">Ian McShane</span></div>
   </div>
  </div>
  </body>
  </html>

This seems pretty OK, but worth documenting. Note that the use of rel="actor" takes us beyond the RDFa 1.1 Lite simple subset of the language.

Related Work

Follow up (June 29th, 2012)

Some properties were missed during the first pass.

Candidate list of additions / changes

The italic singular term is the intended addition to schema.org.

Rationale (sent by Stéphane Corlosquet)

'ingredients' from Recipe ought to be singular since 1) its definition is "An ingredient used in the recipe." and 2) the example shows a list of multiple instances of ingredients. Finally, 'ingredient' (singular) is consistent with the property defined in the Google documentation for Rich snippets - Recipes and used in microformats, microdata and RDFa. (I know this is not schema.org, but it will make the migration to schema.org markup easier and less error-prone.)

'offers' is another one that I think should be singular. It is used in CreativeWork, MediaObject, Event, and Product. Maybe it was meant to be the third person, but still that does not make sense: "Event has an Offer" sounds right, but "Event offers an Offer" does not sound right.