Re: ISSUE-3 (DTF): Date and Time Format

All, hi.

Just some random ranting on the possibility of more flexible
underspecified dates than what has been proposed so far. I lean towards
recommending Approach 3 myself, although Approach 1 has the merit of
perfectly fitting the current practice of using the first day of the
month to mean any time during the month.

Best,
Stasinos



Let us say that we want to use the property gld:birthDate to assert
that http://konstant.gr/#stasinos was born between Aug 15th and Sep
15th, 1973, but we are unable to provide a specific date.


Approach 1

We define one specificity property for each property that ranges over
dates that can potentially be underspecified. This property ranges over
xsd:duration and means that the value of the original property should
be understood as an unknown xsd:date instance that lies within the
interval starting on the date shown by the original property and
lasting for the duration shown by the specificity property.

In our example we define
  gld:birthDateSpecificity rdfs:range xsd:duration
and specify that, if present, the value of gld:birthDate should be
understood as an unknown xsd:date instance that lies within the
interval starting on the date shown by gld:birthDate and lasting for
the duration shown by gld:birthDateSpecificity:

<http://konstant.gr/#stasinos>
   gld:birthDate "1973-08-15"^^xsd:date ;
   gld:birthDateSpecificity "P1M"^^xsd:duration .
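
As a rough illustration of how a consumer might turn such a pair into
an interval, here is a minimal Python sketch. The helper name
add_duration is hypothetical, it handles only the PnYnMnD subset of
xsd:duration, and it assumes that the day of the month is clamped when
the target month is shorter. Whether the resulting date itself belongs
to the interval is left open here.

```python
import re
from datetime import date, timedelta

# Hypothetical helper: only the PnYnMnD subset of xsd:duration is handled.
_DURATION = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$")


def add_duration(start: date, duration: str) -> date:
    """Return the calendar date reached by adding `duration` to `start`."""
    m = _DURATION.match(duration)
    if not m or not any(m.groups()):
        raise ValueError(f"unsupported duration: {duration!r}")
    years, months, days = (int(g or 0) for g in m.groups())

    # Add years and months on the calendar first.
    month_index = start.year * 12 + (start.month - 1) + years * 12 + months
    year, month = divmod(month_index, 12)
    month += 1

    # Assumption: clamp the day if the target month is shorter.
    first_of_next = date(year + (month == 12), month % 12 + 1, 1)
    last_day = (first_of_next - timedelta(days=1)).day
    shifted = date(year, month, min(start.day, last_day))

    # Days are added last, as a plain day count.
    return shifted + timedelta(days=days)


interval_start = date(1973, 8, 15)
interval_end = add_duration(interval_start, "P1M")
```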

Pros:
- It is simple to explain and populate.
- It is consistent with current practice of using midnight of the
  first day of the month (year) to mean an unknown date during that
  month (year). We can make explicit that a given "1973-01-01" value
  is actually meant to mean "sometime during 1973" WITHOUT retracting
  any statements, but by adding a specificity statement of
  "P1Y"^^xsd:duration.

Cons:
- It requires a new property for each property that we want to treat.
- It distributes a meaning over two properties that are not nested
  within the same pattern, but are at the same level as other, related
  properties of the same resource.


Approach 2

Both cons above can be treated by introducing blank nodes (shudders)
or genids or whatever name is more palatable as values of
gld:birthDate. Such nodes would have properties of their own,
restricting the range of possible concrete values they can assume:

<http://konstant.gr/#stasinos> gld:birthDate [
   rdf:type gld:underSpecifiedDate ;
   gld:startDate "1973-08-15"^^xsd:date ;
   gld:specificity "P1M"^^xsd:duration
] .

We can, if so inclined, give more formal rigour by making explicit
that gld:underSpecifiedDate is an instance inside an interval and not
the whole interval:

<http://konstant.gr/#stasinos> gld:birthDate [
   rdf:type gld:date ;
   gld:within [
     rdf:type gld:dateInterval ;
     gld:startDate <http://dates.org/1973/08/15> ;
     gld:specificity "P1M"^^xsd:duration
   ]
] .

Note that the range of gld:birthDate is now a resource (since it has
properties of its own) so this breaks compatibility with using
xsd:date values when the date is known exactly. Exact dates would have
to either be date URIs or be blank nodes with a data property ranging
over xsd:date:

<http://konstant.gr/#stasinos>
  gld:birthDate <http://dates.org/1973/09/02> .

or

<http://konstant.gr/#stasinos>
  gld:birthDate [
    rdf:type gld:date ;
    gld:hasValue "1973-09-02"^^xsd:date ] .


Pros:
- Relatively simple to explain
- It defines a handful of types and properties that can be used for
  any property that ranges over dates. gld:date does not need to be a
  new type, but can be the type of any existing date URI schema.
- It collects all the triples about the underspecified date under
  a single resource

Cons:
- Harder to populate than Approach 1
- It breaks compatibility with current practice, even for fully known
  dates.


Approach 3

We define gld:birthDate as a datatype property that ranges over the
union of xsd:date and gld:underspecifiedDate. gld:underspecifiedDate
is a simple datatype, derived by restricting xsd:string to:
  DD(SS)?
where DD is the lexical space of xsd:date and SS is the lexical space
of xsd:duration.

Semantics is start date and specificity as above. SS is optional and,
if missing, defaults to "P1D" (one day).

Examples:

<http://konstant.gr/#stasinos>
  gld:birthDate "1973-08-15P1M"^^gld:underspecifiedDate .

The following values are equal (although not identical, so functional
properties can have only one):
  "1973-09-02P1D"^^gld:underspecifiedDate .
  "1973-09-02"^^gld:underspecifiedDate .
  "1973-09-02"^^xsd:date .

The following values are not equal, as per the definition of
xsd:duration that states that no relationship exists between months
and days:

  gld:birthDate "1973-08-15P1M"^^gld:underspecifiedDate .
  gld:birthDate "1973-08-15P31D"^^gld:underspecifiedDate .
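
To make the DD(SS)? lexical form concrete, here is a small Python
sketch that splits a value into start date and specificity, applying
the "P1D" default. The function name and the regular expression are
assumptions for illustration: the pattern accepts only plain
YYYY-MM-DD dates without timezone and PnYnMnD durations, which is
narrower than the full lexical spaces of xsd:date and xsd:duration.

```python
import re

# Assumed pattern: a date without timezone, then an optional PnYnMnD duration.
_UNDERSPECIFIED = re.compile(
    r"^(\d{4}-\d{2}-\d{2})(P(?:\d+Y)?(?:\d+M)?(?:\d+D)?)?$"
)


def parse_underspecified(lexical: str) -> tuple[str, str]:
    """Split a lexical form into (start date, specificity)."""
    m = _UNDERSPECIFIED.match(lexical)
    if not m or m.group(2) == "P":
        raise ValueError(f"not an underspecified date: {lexical!r}")
    # A missing specificity defaults to one day.
    return m.group(1), m.group(2) or "P1D"


# With and without an explicit "P1D" the parsed value is the same.
same = parse_underspecified("1973-09-02P1D") == parse_underspecified("1973-09-02")

# "P1M" and "P31D" are kept apart; no month-to-day conversion is attempted.
different = (
    parse_underspecified("1973-08-15P1M")
    != parse_underspecified("1973-08-15P31D")
)
```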

Pros:
- Relatively simple to explain and populate
- It maintains compatibility with xsd:date, although inconsistent with
  the practice of using midnight of the first day of the month (year)
  to mean an unknown date during that month (year), as all xsd:date
  values are interpreted as exact dates.

Cons:
- Harder to index, as "1973-08-15P1M", "1973-08-15P15D", and
  "1973-08-15" are all different values. Searching for all documents
  related to "1973-08-15" requires full-text search with globs; not
  a hard requirement (e.g., Solr does prefix* globs), but less
  efficient than searching for exact values.
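
The difference between the two kinds of lookup can be sketched in a few
lines of Python. The list of values and the function names are made up
for illustration, and a plain list scan stands in for whatever index a
real store would use.

```python
def exact_match(values, start):
    """Values whose lexical form is exactly `start`."""
    return [v for v in values if v == start]


def prefix_match(values, start):
    """Values whose lexical form begins with `start` (a prefix* glob)."""
    return [v for v in values if v.startswith(start)]


# Hypothetical lexical forms of gld:underspecifiedDate values.
values = ["1973-08-15P1M", "1973-08-15P15D", "1973-08-15", "1973-08-16"]

exact = exact_match(values, "1973-08-15")
by_prefix = prefix_match(values, "1973-08-15")
```

Note that the prefix match only finds values that start on the given
date, not values whose interval happens to contain it.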



Approach 4

One, rather cumbersome, solution using existing OWL 2 constructs is to
not make a direct gld:birthDate assertion, but instead restrict the
possible values of this property for this resource, if ever
discovered:

ClassAssertion(
  DataAllValuesFrom(
    gld:birthDate
    DatatypeRestriction(
      xsd:dateTime
      xsd:minInclusive "1973-08-15T00:00:00Z"^^xsd:dateTime
      xsd:maxExclusive "1973-09-16T00:00:00Z"^^xsd:dateTime ))
  <http://konstant.gr/#stasinos> )

and as RDF triples:

<http://konstant.gr/#stasinos> rdf:type
  [ rdf:type owl:Restriction ;
    owl:onProperty gld:birthDate ;
    owl:allValuesFrom
      [ rdf:type rdfs:Datatype ;
        owl:onDatatype xsd:dateTime ;
        owl:withRestrictions (
          [ xsd:minInclusive "1973-08-15T00:00:00Z"^^xsd:dateTime ]
          [ xsd:maxExclusive "1973-09-16T00:00:00Z"^^xsd:dateTime ] )
      ]
  ] .

The use of midnight values of xsd:dateTime instead of xsd:date is
mandated by the fact that xsd:date is not in the OWL 2 datatype map,
so it cannot be used in a DatatypeRestriction with the
xsd:minInclusive/xsd:maxExclusive facets.
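
What the two facets say about a candidate value can be mimicked in
Python as follows. This is only a sketch of the half-open range check,
not of OWL 2 reasoning, and the function name is hypothetical.

```python
from datetime import datetime, timezone

# The two facet values from the restriction above.
MIN_INCLUSIVE = datetime(1973, 8, 15, tzinfo=timezone.utc)
MAX_EXCLUSIVE = datetime(1973, 9, 16, tzinfo=timezone.utc)


def satisfies_restriction(value: datetime) -> bool:
    """True if `value` lies in [MIN_INCLUSIVE, MAX_EXCLUSIVE)."""
    return MIN_INCLUSIVE <= value < MAX_EXCLUSIVE


inside = satisfies_restriction(datetime(1973, 9, 2, tzinfo=timezone.utc))
outside = satisfies_restriction(datetime(1973, 9, 16, tzinfo=timezone.utc))
```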

This is, obviously, not something any sane person would suggest that
GLD recommend, but it goes to show that it is entirely possible to
formalize a human-readable underspecified date format by transforming
it to equivalent OWL 2 data.

Received on Monday, 30 January 2012 12:30:46 UTC