Re: [ISSUE-131] update to NIF mapping section in spec re comments from RDF WG

Hi Felix,
Percent encoding should be fine. I updated the example here as well:
https://dl.dropboxusercontent.com/u/375401/tmp/EX-nif-conversion-output.xml
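
For illustration, here is a minimal Java sketch of what percent-encoding the XPath value could look like. The class and method names (QueryValueEncoder, encode) are made up for this example, and the set of characters left unencoded (RFC 3986 "unreserved" plus "/" and ":") is an assumption on my side, not something the spec prescribes:

```java
import java.nio.charset.StandardCharsets;

public class QueryValueEncoder {
    // Assumption: keep RFC 3986 "unreserved" characters plus "/" and ":",
    // which are both legal inside a query component. Everything else,
    // including "[" and "]", is percent-encoded.
    private static final String SAFE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~/:";

    static String encode(String value) {
        StringBuilder out = new StringBuilder();
        // Work on UTF-8 bytes so that non-ASCII characters are encoded too.
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            if (SAFE.indexOf(c) >= 0) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String prefix = "http://example.com/myitsservice?informat=html&intype=url"
            + "&input=http://example.com/doc.html&xpath=";
        // "[" becomes %5B and "]" becomes %5D.
        System.out.println(prefix + encode("/html/body[1]/h2[1]"));
    }
}
```

With this, the brackets in /html/body[1]/h2[1] come out as %5B and %5D, so the query component stays within the RFC 3986 grammar.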

Note that I replaced:

<http://example.com/exampledoc.html>
with
<http://example.com/doc.html>

everywhere. One of the reasons was that exampledoc.html was misspelled as 
exampldoc.html in some places, and "example" occurs twice in the URI.

Furthermore, there is one small mistake in

# we can attach the metadata to the parent node:
<b its-ta-ident-ref="http://dbpedia.org/resource/Dublin"
    translate="no">Ireland</b>

should be

# we can attach the metadata to the parent node:
<b its-ta-ident-ref="http://dbpedia.org/resource/Ireland"
    translate="no">Ireland</b>


Also, since it is informative now, we could link to the persistent 
URI for the NIF service implementation spec as "further reading":
http://persistence.uni-leipzig.org/nlp2rdf/specification/api.html

All the best,
Sebastian

On 06.09.2013 12:15, Felix Sasaki wrote:
> Hi Sebastian, Dave, all,
>
> I have made all edits, see
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#conversion-to-nif
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#nif-backconversion
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/nif/EX-nif-conversion-output.xml
>
> The issue around "[" and "]" is not resolved yet. But besides that 
> everything (including Sebastian's note) should be ok. Dave, besides 
> the test suite update you described at
>
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0014.html
>
> this should be everything, right?
>
> Best,
>
> Felix
>
>
> On 06.09.13 11:39, Sebastian Hellmann wrote:
>> Ok, here is the updated example file: 
>> https://dl.dropboxusercontent.com/u/375401/tmp/EX-nif-conversion-output.xml
>>
>> There is a problem, however. [ and ] are not allowed in the query 
>> component, so
>> rdf:resource="http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html&amp;xpath=/html/body[1]/h2[1]"
>> violates https://tools.ietf.org/html/rfc3986#appendix-A
>>
>> pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
>> query         = *( pchar / "/" / "?" )
>> fragment      = *( pchar / "/" / "?" )
>> pct-encoded   = "%" HEXDIG HEXDIG
>> unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
>> reserved      = gen-delims / sub-delims
>> gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
>> sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
>>
>> I am not an XPath expert. Could we do it like this?
>> xpath=/html/body/1/h2/1
>>
>>
>> NIF 2.0 has become quite delayed. I am one of the main coordinators, 
>> which keeps the standardization process lightweight, but it also makes 
>> me a bottleneck. One of the reasons for the delay is my timely 
>> contribution here via email and meetings. I hope this is an acceptable 
>> trade-off. The consequence is that most of NIF 2.0 is not yet well 
>> documented, although it is getting better day by day.
>>
>> All the best,
>> Sebastian
>>
>>
>>
>> On 06.09.2013 11:05, Sebastian Hellmann wrote:
>>> Hi Dave,
>>>
>>> On 05.09.2013 13:19, Dave Lewis wrote:
>>>> Following the decision on the 4th December call to opt for a 
>>>> query-style URL for the NIF string in RDF (which will also be 
>>>> supported in NIF) when defining the mapping, the following needs to 
>>>> be changed in the spec:
>>>>
>>>> 1) all occurrences of RDF URLs with #char or #xpath fragments to be 
>>>> changed to a query style as suggested by the RDF group and expanded 
>>>> on by Felix in 
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html 
>>>>
>>>> i.e. all fragment identifiers for NIF strings in annex F and G 
>>>> should be changed from, e.g.:
>>>>
>>>> One other associated question: as we are using the query type to get around the limitations of the RFC 5147
>>>> char fragment in working with XML and HTML, is it still appropriate after the above change to type the NIF string in the example with
>>>> the subclass nif:RFC5147String? Sebastian? E.g.
>>>>
>>>> http://example.com/myitsservice?input=http://example.com/exampldoc.html&char=0,11
>>>>   rdf:type nif:RFC5147String;
>>>
>>> I am currently working on a formal ABNF definition for this, but you 
>>> can consider it to be like this in Java:
>>>
>>> String prefix = "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html&";
>>> String identifier = "char=0,11";
>>> String uri = prefix + identifier;
>>>
>>> // only identifier has to have the syntax given by the rdf:type
>>> validate ("nif:RFC5147String", identifier) ;
>>>
>>> So the syntax is only relevant for the identifier part. These would 
>>> be alternative prefixes as well:
>>> String prefixOption1 = "http://example.com/myitsservice/informat/html/intype/url/input/http://example.com/exampldoc.html/";
>>> String prefixOption2 = "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html#";
>>> String prefixOption3 = "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/exampldoc.html&";
>>>
>>> It really doesn't matter, and all three are valid RDF (the first one 
>>> is a bit awkward, of course).
>>>
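To make the point above concrete, here is a small Java sketch in which only the identifier part is checked and the prefix is treated as opaque. The class name, the method name and the regular expression are assumptions for illustration only. The pattern covers just the simple "char=<start>,<end>" form and is not the formal ABNF for nif:RFC5147String:

```java
import java.util.regex.Pattern;

public class NifIdentifierCheck {
    // Simplified assumption: only the "char=<start>,<end>" form is accepted.
    // RFC 5147 allows more variants than this pattern covers.
    private static final Pattern CHAR_RANGE = Pattern.compile("char=(\\d+),(\\d+)");

    /** Checks the identifier part alone; the prefix plays no role here. */
    static boolean isRfc5147Identifier(String identifier) {
        return CHAR_RANGE.matcher(identifier).matches();
    }

    public static void main(String[] args) {
        String identifier = "char=0,11";
        String[] prefixes = {
            "http://example.com/myitsservice/informat/html/intype/url/input/http://example.com/doc.html/",
            "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html#",
            "http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html&"
        };
        // The same identifier is checked once and is usable with every prefix.
        boolean valid = isRfc5147Identifier(identifier);
        for (String prefix : prefixes) {
            System.out.println(valid + " " + prefix + identifier);
        }
    }
}
```

The check never looks at the prefix, which is why the choice between "/", "#" and "&" as the separator does not affect the typing of the string.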
>>>
>>>> 2) Once this is fixed we need to update the NIF part of the test suite, and the tests need to be rerun by Felix, Leroy and Phil
>>>
>>> As written above, this is not strictly necessary, but it is nice to 
>>> be consistent.
>>>
>>>> 3) Add the following suggested note wording to the end of Annex
>>>> "Note: NIF allows URL for a String resource to be referenced as URIs that are fragments of the original document in the form:
>>>> http://example.com/exampledoc.html#char=0,11
>>>> or
>>>> http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>>>
>>>> Though this offers a potentially convenient mechanism for linking NIF resources in RDF back to the original document, the char
>>>> fragment is defined currently only for text/plain while the xpath fragment is not defined for HTML. Therefore this URL
>>>> recipe does not fulfil the ITS requirements to support both XML and HTML and the aim of this mapping to produce resources adhering
>>>> to the Linked Data principle of dereferenceability. The future definition and registration of these fragment types, while a potentially
>>>> attractive feature, is beyond the scope of this specification."
>>>
>>>
>>> Let's change this like this:
>>> http://example.com/doc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>> maps to
>>> http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html#char=0,11
>>>
>>> Note that RDF is OK with all fragment IDs:
>>> http://www.w3.org/TR/rdf-concepts/#section-fragID
>>>
>>> RFC 3986 as well: http://tools.ietf.org/html/rfc3986#page-24
>>>
>>>
>>>> The semantics of a fragment identifier are defined by the set of
>>>>    representations that might result from a retrieval action on the
>>>>    primary resource.  The fragment's format and resolution is therefore
>>>>    dependent on the media type [RFC2046] of a potentially retrieved
>>>>    representation, even though such a retrieval is only performed if the
>>>>    URI is dereferenced.  If no such representation exists, then the
>>>>    semantics of the fragment are considered unknown and are effectively
>>>>    unconstrained.
>>>
>>>
>>>
>>> The text could be like this:
>>> "Note: NIF allows URL for a String resource to be referenced as URIs 
>>> that are fragments of the original document in the form:
>>> http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html#char=0,11
>>> or
>>> http://example.com/doc.html#xpath(/html/body[1]/h2[1]/text()[1])
>>>
>>> This offers a convenient mechanism for linking NIF resources in RDF 
>>> back to the original document. RDF treats URIs as opaque and does 
>>> not impose any semantic constraints on the used fragment 
>>> identifiers, thus enabling their usage in RDF in a consistent 
>>> manner. However, fragment identifiers get interpreted according to 
>>> the retrieved mime type, if a retrieval action occurs as is the case 
>>> in Linked Data. The char fragment is defined currently only for 
>>> text/plain while the xpath fragment is not defined for HTML. 
>>> Therefore this URL recipe does not fulfil the ITS requirements to 
>>> support both XML and HTML and the aim of this mapping to produce 
>>> resources adhering to the Linked Data principle of 
>>> dereferenceability. The future definition and registration of these 
>>> fragment types, while a potentially attractive feature, is beyond 
>>> the scope of this specification."
>>>
>>> I will try to update the example in the spec as well.
>>>
>>> All the best,
>>> Sebastian
>>>
>>>> cheers,
>>>> Dave
>>>>
>>>>
>>>> On 03/09/2013 09:14, Felix Sasaki wrote:
>>>>> 1) last call item "RDF - NIF conversion". See
>>>>> https://www.w3.org/International/multilingualweb/lt/track/issues/131
>>>>> and these mails
>>>>> Phil 
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0001.html 
>>>>>
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0066.html 
>>>>>
>>>>>
>>>>> Dave
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0067.html 
>>>>>
>>>>>
>>>>> Felix
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Aug/0068.html 
>>>>>
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html 
>>>>>
>>>>>
>>>>> Goal: decide about the option 1) or 2) or something else (see a 
>>>>> variation of option 2) in 
>>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0000.html 
>>>>>
>>>>> IMPORTANT: even if you are not implementing ITS <> NIF, please 
>>>>> state your opinion, since tomorrow we want to form a working 
>>>>> group opinion, to be able to move forward.
>>>>
>>>
>>>
>>> -- 
>>> Dipl. Inf. Sebastian Hellmann
>>> Department of Computer Science, University of Leipzig
>>> Events:
>>> * NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org, 
>>> Extended Deadline: *July 18th*)
>>> * LSWT 23/24 Sept, 2013 in Leipzig (http://aksw.org/lswt)
>>> Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
>>> Projects: http://nlp2rdf.org , http://linguistics.okfn.org , 
>>> http://dbpedia.org/Wiktionary , http://dbpedia.org
>>> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
>>> Research Group: http://aksw.org
>>
>>
>



Received on Friday, 6 September 2013 10:37:21 UTC