User:Rcygania2/schema.org User Story

From RDF Data Shapes Working Group

This user story focuses on the validation of schema.org instances against the constraints expressed in the schema.org model and metamodel. (The related user story, “S23: schema.org Constraints”, as well as Google's submission to the workshop, focus on domain-specific constraints attached to specific schema.org classes and properties, and not on the model and metamodel.)

A processor for our validation language should be able to accept a schema.org instance as well as the schema.org model, expressed in an RDF syntax, as inputs (perhaps as separate named graphs), and validate the instance against the model.

  • domainIncludes/rangeIncludes: In schema.org, properties can be associated with multiple types via the “domainIncludes” and “rangeIncludes” properties. The semantics is that the domain/range consists of the union of these types (rather than the intersection, as with the “domain” and “range” properties in RDFS). Validation therefore requires that the subject and object of a triple can be compared against a set of types given in the model graph; a validation error would be raised if the subject/object is not an instance of one of these types, or of one of their subtypes.
  • Datatypes and plain literals: In schema.org, properties may be associated with datatypes, but literals in instance data are always plain (string) literals. In other words, a property may be typed as a date property, but the date would be given as a plain literal, not as an xsd:date typed literal. Examples of named datatypes in schema.org include: ISO 8601 dates and datetimes; xsd:time; boolean “True” and “False”; integers. For validation, the language should be able to make use of annotations on the properties. For example, if we have { :thing schema:date "value" }, it should be possible to write a validation rule that depends on a “rangeIncludes” annotation on the schema:date property. As each named datatype is used many times throughout the model, it would also be good if the regular expression (or similar mechanism) for the datatype did not have to be repeated for each property that uses the datatype, but could instead be referenced, or applied by rule.
  • Conformance levels: Processing of schema.org by the major search engines tends to be quite permissive. For example, where an “Organization” instance is expected according to the model, a “Text” literal with the organization's name is often sufficient. This could be treated as a warning/notice rather than an error. Also, some literal properties carry markup recommendations. For the “price” property, for example, putting “USD” into a separate currency property is preferred over sticking “$” into the numeric price literal. Again, values like “$99” could be treated as warnings.
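
The three requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposal for the validation language itself: the model is a hypothetical, heavily simplified excerpt held in plain dictionaries rather than an RDF graph, the `Literal` marker class and the `validate`, `check_literal`, and `is_instance_of` helpers are invented for this example, and the datatype patterns are simplified stand-ins rather than full ISO 8601 grammars.

```python
import re


class Literal(str):
    """Marks a plain string literal, as opposed to a node identifier."""


# Hypothetical, heavily simplified model graph (illustration only).
SUBCLASS_OF = {
    "Book": "CreativeWork", "CreativeWork": "Thing",
    "Corporation": "Organization", "Organization": "Thing", "Person": "Thing",
}
DOMAIN_INCLUDES = {"publisher": {"CreativeWork"}, "datePublished": {"CreativeWork"}}
RANGE_INCLUDES = {"publisher": {"Organization", "Person"}, "datePublished": {"Date"}}

# One pattern per named datatype, shared by every property that uses it.
# The patterns are simplified assumptions, not complete datatype grammars.
DATATYPE_PATTERNS = {
    "Date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "Boolean": re.compile(r"True|False"),
    "Integer": re.compile(r"[+-]?\d+"),
}


def is_instance_of(cls, allowed):
    """True if cls, or any of its superclasses, is in the allowed set (union semantics)."""
    while cls is not None:
        if cls in allowed:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def check_literal(prop, value, allowed):
    """Return None if the plain literal is acceptable, else a (level, message) pair."""
    datatypes = [t for t in allowed if t in DATATYPE_PATTERNS]
    if "Text" in allowed or any(DATATYPE_PATTERNS[t].fullmatch(value) for t in datatypes):
        return None
    if len(datatypes) < len(allowed):
        # A class instance was expected; permissive consumers often accept a name string.
        return ("warning", f"{prop}: literal {value!r} given where an instance was expected")
    return ("error", f"{prop}: {value!r} matches none of {sorted(datatypes)}")


def validate(triples, types):
    """Check (subject, property, object) triples against the model; return a report list."""
    report = []
    for s, p, o in triples:
        domain = DOMAIN_INCLUDES.get(p)
        if domain and not is_instance_of(types.get(s), domain):
            report.append(("error", f"{p}: subject {s} is not in the domain"))
        allowed = RANGE_INCLUDES.get(p)
        if not allowed:
            continue  # no rangeIncludes annotation, nothing to check
        if isinstance(o, Literal):
            issue = check_literal(p, o, allowed)
            if issue:
                report.append(issue)
        elif not is_instance_of(types.get(o), allowed):
            report.append(("error", f"{p}: object {o} is not in the range"))
    return report


if __name__ == "__main__":
    types = {"book1": "Book", "acme": "Corporation"}
    triples = [
        ("book1", "publisher", "acme"),
        ("book1", "publisher", Literal("Acme Inc.")),
        ("book1", "datePublished", Literal("2014-13")),
    ]
    for level, message in validate(triples, types):
        print(level, message)
```

In this sketch the datatype patterns live in a single table keyed by datatype name, so a property only needs its “rangeIncludes” annotation to pick up the right check. Separating “warning” from “error” in the report is one possible way to model conformance levels; a real validation language might expose this as a severity attribute on each constraint.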