ISSUE-131: URI scheme used in NIF conversion

State:
CLOSED
Product:
MLW-LT Standard Draft
Raised by:
Felix Sasaki
Opened on:
2013-08-28
Description:
Copied from
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0057.html

Felix,

this is the official review by the RDF WG of the ITS Draft, more precisely of the NIF conversion section[1]. The RDF WG discussed the issue and adopted a resolution on this response[2].

The problem we see in the conversion algorithm is the URI-s that the algorithm generates, namely the URI-s of the form

<http://example.com/exampledoc.html#char=0,29>
<http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1])>

Although it is quite obvious what these are for, we nevertheless see a problem with them. Indeed:

- RDF Concepts 1.1 Last Call document[3] refers to IRI-s: RFC3987[4]
- IRI-s map to URI-s: RFC3986[5]
- What RFC3986 says about fragments is:

[[[
The fragment's format and resolution is therefore dependent on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced. If no such representation exists, then the semantics of the fragment are considered unknown and are effectively unconstrained.
]]]

Looking at the URI-s above:

- The 'char' fragment id is defined by RFC 5147[6], but only for text/plain. ITS talks about XML and HTML, i.e., about resources whose media types are definitely _not_ text/plain
- The 'xpath' fragment id is fine for XML, but it is not defined for text/html
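
The two bullets above can be sketched as a small check, purely for illustration. The table below is hypothetical and only transcribes what this message says (char is defined for text/plain, xpath is fine for XML). It is not an authoritative registry, and the choice of "application/xml" and "text/xml" to stand for "XML" is an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical table transcribed from the bullets above; not an official registry.
# Using application/xml and text/xml to represent "XML" is an assumption.
FRAGMENT_SCHEME_MEDIA_TYPES = {
    "char": {"text/plain"},
    "xpath": {"application/xml", "text/xml"},
}


def fragment_scheme(uri):
    """Return 'char' or 'xpath' if the fragment uses that scheme, else None."""
    fragment = urlsplit(uri).fragment
    for name in FRAGMENT_SCHEME_MEDIA_TYPES:
        # The draft writes these as '#char=...' and '#xpath(...)'.
        if fragment.startswith(name + "=") or fragment.startswith(name + "("):
            return name
    return None


def fragment_defined_for(uri, media_type):
    """True if the table says the URI's fragment scheme is defined for media_type."""
    name = fragment_scheme(uri)
    return name is not None and media_type in FRAGMENT_SCHEME_MEDIA_TYPES[name]


char_uri = "http://example.com/exampledoc.html#char=0,29"
xpath_uri = "http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1])"

print(fragment_defined_for(char_uri, "text/html"))   # False
print(fragment_defined_for(xpath_uri, "text/html"))  # False
```

Both checks fail for text/html, which is the media type of the example document.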

In view of this, we do not feel comfortable with the choice of mapping: the resulting RDF triples will not be entirely correct because these URI-s are not correct. Additionally, although this is not an RDF requirement per se, the URI-s are not dereferenceable (because they are incorrect), which contradicts the Linked Data Principles that are prevalent in the community.

We do see two ways around this issue:

1. The WG registers the 'char' fragment id-s (see also [7] for guidelines) through IETF for HTML and XML. (Actually, extending the usage of 'char' to XML/HTML would be generally very useful). Also, the WG registers 'xpath' for HTML (although we realize that this may be difficult because it might not be acceptable for the HTML WG which 'owns' the text/html media type)

2. The WG uses a different URI scheme, trying to avoid fragment ids. Something like:

http://www.w3.org/its?resource=http://example.com/exampledoc.html&char=0,29
http://www.w3.org/its?resource=http://example.com/exampledoc.html&xpath=/html/body[1]/h2[1]

where, of course, the www.w3.org/its part can be some other URI and, ideally, would refer to a service returning something feasible and intelligent on the request there.
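
As a rough sketch of option 2, the rewrite from fragment-based to query-based URI-s could look like the following. The function name to_query_uri is hypothetical, and the service URI is only the placeholder used in this message. The set of characters left unescaped is an assumption, chosen so that the output has the same shape as the two examples above.

```python
import re
from urllib.parse import urldefrag, urlencode

# Placeholder service URI taken from this message; any other URI could be used.
SERVICE = "http://www.w3.org/its"


def to_query_uri(fragment_uri, service=SERVICE):
    """Rewrite '...#char=0,29' or '...#xpath(EXPR)' into a query-based URI."""
    resource, fragment = urldefrag(fragment_uri)

    char_match = re.fullmatch(r"char=(.*)", fragment)
    xpath_match = re.fullmatch(r"xpath\((.*)\)", fragment)
    if char_match:
        key, value = "char", char_match.group(1)
    elif xpath_match:
        key, value = "xpath", xpath_match.group(1)
    else:
        raise ValueError(f"unsupported fragment: {fragment!r}")

    # Assumption: ':', '/', ',', '[' and ']' stay unescaped to mirror the
    # examples above; '&', '?', '#' and '=' are still percent-encoded.
    query = urlencode([("resource", resource), (key, value)], safe=":/,[]")
    return f"{service}?{query}"


print(to_query_uri("http://example.com/exampledoc.html#char=0,29"))
# http://www.w3.org/its?resource=http://example.com/exampledoc.html&char=0,29
print(to_query_uri("http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1])"))
# http://www.w3.org/its?resource=http://example.com/exampledoc.html&xpath=/html/body[1]/h2[1]
```

A resource URI that itself carries a query string would have its '?' and '&' percent-encoded by this sketch, so the outer query remains unambiguous.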

However, we also recognize that the mapping in the ITS document is _not_ normative. As a consequence, the ITS WG is perfectly within its rights to go ahead and not follow the comments of the RDF Working Group. In other words, the ITS Working Group does not have to ask again for a formal approval of the RDF Working Group on any decision it may take (although I would be interested in the decision :-)

I hope this was helpful to you.

Sincerely, in the name of the RDF Working Group

Ivan Herman (staff contact for the RDF WG)

P.S. Note that there are similar efforts elsewhere, like the string-range fragment id[8] or the work IDPF did for ebooks[9], but we recognize none of these offer an alternative.


[1] http://www.w3.org/TR/2013/WD-its20-20130820/#conversion-to-nif
[2] https://www.w3.org/2013/meeting/rdf-wg/2013-08-28#resolution_1
[3] http://www.w3.org/TR/2013/WD-rdf11-concepts-20130723/
[4] http://tools.ietf.org/html/rfc3987
[5] http://tools.ietf.org/html/rfc3986
[6] http://tools.ietf.org/html/rfc5147
[7] http://www.w3.org/TR/fragid-best-practices/
Related Action Items:
No related actions
Related emails:
  1. Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from fsasaki@w3.org on 2013-09-08)
  2. Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from hellmann@informatik.uni-leipzig.de on 2013-09-07)
  3. Re: Review response from the MLW-LT WG (Re: Request for review from the RDF working group: ITS 2.0) (from ivan@w3.org on 2013-09-06)
  4. Review response from the MLW-LT WG (Re: Request for review from the RDF working group: ITS 2.0) (from fsasaki@w3.org on 2013-09-06)
  5. Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from fsasaki@w3.org on 2013-09-06)
  6. Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from hellmann@informatik.uni-leipzig.de on 2013-09-06)
  7. Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from fsasaki@w3.org on 2013-09-06)
  8. Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from fsasaki@w3.org on 2013-09-06)
  9. Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from hellmann@informatik.uni-leipzig.de on 2013-09-06)
  10. Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from hellmann@informatik.uni-leipzig.de on 2013-09-06)
  11. Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from fsasaki@w3.org on 2013-09-05)
  12. [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG (from dave.lewis@cs.tcd.ie on 2013-09-05)
  13. [Agenda] MLW-LT call 4 September noon UTC (from fsasaki@w3.org on 2013-09-03)
  14. Re: mlw-lt-track-ISSUE-131: URI scheme used in NIF conversion [MLW-LT Standard Draft] (from fsasaki@w3.org on 2013-09-02)
  15. Re: mlw-lt-track-ISSUE-131: URI scheme used in NIF conversion [MLW-LT Standard Draft] (from fsasaki@w3.org on 2013-08-30)
  16. mlw-lt-track-ISSUE-131: URI scheme used in NIF conversion [MLW-LT Standard Draft] (from sysbot+tracker@w3.org on 2013-08-28)

Related notes:

RESOLUTION: accepted

Felix Sasaki, 6 Sep 2013, 13:38:11

COMMENTER-RESPONSE: satisfied, see http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0031.html

Felix Sasaki, 6 Sep 2013, 13:38:48

CHANGE-TYPE: editorial

Felix Sasaki, 6 Sep 2013, 13:39:03

DECISION-DETAILS: we adopted the proposal of the RDF WG to replace fragment identifiers #xpath and #char with query parameters.

Felix Sasaki, 6 Sep 2013, 13:39:55
