Re: ISSUE-28: Nathan's RDFa API Questions and Comments (e-mail 1) from Nathan on 2010-08-01 (public-rdfa-wg@w3.org from August 2010)

From: Nathan <nathan@webr3.org>
Date: Sun, 01 Aug 2010 20:55:00 +0100
To: Manu Sporny <msporny@digitalbazaar.com>
CC: RDFa Working Group <public-rdfa-wg@w3.org>
Message-ID: <4C55D114.4030609@webr3.org>

Manu Sporny wrote:
> Hi Nathan,
> 
> Apologies on the late reply to your input on the RDFa API document. We
> have been very busy with the RDFa Core document and now that we've
> approved publication of an RDFa Core and XHTML+RDFa heartbeat document,
> we are going to focus on the RDFa API document.

No problem, and thanks for the reply, Manu. Comments inline from here:

> As you may have noticed, I wrapped all of your feedback into an RDFa WG
> ISSUE so that we may address all of your concerns:
> 
> http://www.w3.org/2010/02/rdfa/track/issues/28
> 
> More feedback below:
> 
> On 06/08/2010 10:24 PM, Nathan wrote:
>> First, to perhaps contribute something (if it hasn't already been
>> suggested in the archives).
>>
>> I noted under future discussion, the following point:
>>   'A mechanism to load and process triples from remote documents.'
>>
>> The RDFa API currently provides the following method:
>>   parser.parse( document );
>>
>> And XHR [1] has the following attribute:
>>   xhr.responseXML
>>
>> which returns a 'Document'
>>
>> So this may already be covered for any mediatype which is text/xml,
>> application/xml or ends in +xml.
>>
>> Outside of this there is the DOMImplementation.createDocument method,
>> but I'm unsure how you could turn the XHR.responseText into a Document
>> (surely there must be a way??)
> 
> The mechanism to load and process triples from remote documents is a bit
> more involved than that. We could do what you say and depend on XHR, but
> we were wondering if we could enable something like this:
> 
> parser.parse( url );
> 
> That is, could we enable the RDFa DOM API to extract the triples from a
> remote document while ensuring that CORS and XSS issues are mitigated?
> One way that we could approach this is to perform the request for the
> remote URL without sending any cookies or other identifying information.
> That is, parser.parse(url) would use virgin headers when accessing a
> cross-site resource. This would only apply to the browser environment -
> which would allow non-browser environments that want to send cookies to
> cross-site resources to have full access when extracting triples.
> 
> I think that this would solve any XSS problems while simultaneously
> allowing stuff like this:
> 
> // discover movie information for "Inception"
> 
> parser.parse("http://www.freebase.com/view/m/0661ql3");
> parser.parse("http://www.imdb.com/title/tt1375666/");
> parser.parse("http://www.rottentomatoes.com/m/inception/");

First, whilst I'd love anything that gets around CORS/XSS, afaict this 
won't: it solves half the problem but not the other half. Consider this 
call:
   parser.parse("http://intranet.local/accounting");

The primary issue CORS resolves (whilst adding many more) is stopping 
browsers (and thus vendors) from acting as the man in the middle in 
such attacks: even without cookies, the request is made from inside the 
user's network, so a public page could read resources only that network 
can reach. No browser will ever implement anything that exposes private 
intranets, so this is probably a no-go and the document arg will have 
to stick.

To refine my original comment a little:

XHR only returns a Document for XML-based content; for the non-XML 
variants of HTML5 all you get is the text. However, one can call:
  var d = document.implementation.createHTMLDocument('');
  d.open();
  d.write(data);
  d.close();
and d is a valid HTML5 document - thus parser.parse could stick with 
document and be perfectly usable (CORS/XSS issues aside).
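
A rough sketch of that fallback, under some assumptions: `isXmlMediaType` and `toDocument` are made-up helper names (not part of the RDFa API), the host document is passed in as `doc`, and `createHTMLDocument` is assumed to be the way to obtain an empty HTML document. XML-ish media types use `responseXML`; everything else is built from `responseText`.

```javascript
// Hypothetical helpers, not part of the RDFa API draft.

// True for text/xml, application/xml and any media type ending in +xml.
function isXmlMediaType(contentType) {
  // Drop parameters such as "; charset=utf-8" and normalise case.
  var type = String(contentType || '').split(';')[0].trim().toLowerCase();
  return type === 'text/xml' || type === 'application/xml' || /\+xml$/.test(type);
}

// Turn a completed XHR into a Document that could be handed to parser.parse().
// `doc` is the host document (normally window.document).
function toDocument(xhr, doc) {
  var contentType = xhr.getResponseHeader('Content-Type');
  if (isXmlMediaType(contentType) && xhr.responseXML) {
    return xhr.responseXML;
  }
  // Non-XML HTML: build a detached HTML document from the response text.
  var d = doc.implementation.createHTMLDocument('');
  d.open();
  d.write(xhr.responseText);
  d.close();
  return d;
}
```

Either branch yields something that parser.parse( document ) could accept, which is the point: the API itself would not need to know about URLs at all.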

Also, after raising this issue with the TAG, timbl's response was simply 
'lean on datasources you know implement CORS' (and color them in green 
on the LOD diagram). Quite possibly the only real way to get this 
moving is to just opt in to CORS and start using it. From the other side, 
avoiding this issue in full by simply specifying 'document' could be a 
huge time saver: CORS/XSS is a nasty, time-wasting issue I'd hate to see 
you guys get wrapped up in.
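
To make the opt-in concrete, here is a simplified sketch of the decision a browser makes before letting script read a cross-origin response to a request sent without credentials. `mayReadResponse` is a made-up name, and real CORS has more moving parts (preflights, credentials, header allow-lists), so treat this as an illustration only.

```javascript
// Hypothetical check, loosely modelled on the CORS opt-in for anonymous requests.
// pageUrl:     URL of the page running the script.
// targetUrl:   URL of the remote document.
// allowOrigin: value of the Access-Control-Allow-Origin response header, or null.
function mayReadResponse(pageUrl, targetUrl, allowOrigin) {
  var pageOrigin = new URL(pageUrl).origin;
  var targetOrigin = new URL(targetUrl).origin;
  if (pageOrigin === targetOrigin) {
    return true; // same origin: no opt-in needed
  }
  // Cross-origin: the server has to opt in, for everyone or for this origin.
  return allowOrigin === '*' || allowOrigin === pageOrigin;
}
```

An intranet server that sends no such header stays unreadable, whereas a public datasource that opts in becomes usable, which is roughly what 'lean on datasources you know implement CORS' amounts to.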


>> 1: how do you get all data (triples)?
>> ( read this as, please consider adding a DataStore.getAll() method )
> 
> // calling filter() with no arguments, or null for each, retrieves all triples
> var allTriples = document.data.store.filter();
> 
> The language was incorrect and didn't allow the subject to be optional
> in the filter() method. I've made the change to allow a zero-argument
> filter() method. I also added an example of how one can retrieve all
> triples.

ty
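
For illustration, the wildcard behaviour described above might look something like this. It is only a sketch: `filterTriples` and the plain `{ subject, predicate, object }` triple shape are assumptions made here, not the draft's DataStore interface.

```javascript
// Hypothetical in-memory filter; null or undefined acts as a wildcard.
function filterTriples(triples, subject, predicate, object) {
  function matches(pattern, value) {
    return pattern === null || pattern === undefined || pattern === value;
  }
  return triples.filter(function (t) {
    return matches(subject, t.subject) &&
           matches(predicate, t.predicate) &&
           matches(object, t.object);
  });
}

var triples = [
  { subject: '#me', predicate: 'foaf:name', object: 'Nathan' },
  { subject: '#me', predicate: 'foaf:knows', object: '#manu' },
  { subject: '#manu', predicate: 'foaf:name', object: 'Manu' }
];

var all = filterTriples(triples);                      // no arguments: every triple
var names = filterTriples(triples, null, 'foaf:name'); // only the foaf:name triples
```
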

>> 2: merging stores?
>> given the following example:
>>
>> var rdfa = document.data.createParser("rdfa",
>> document.data.createStore() );
>> rdfa.parse();
>> var hcard = document.data.createParser("hCard",
>> document.data.createStore() );
>> hcard.parse();
>>
>> then rdfa.store will hold all the rdfa data, and hcard.store will hold
>> all the hcard data (?). How would one merge all the data from the two
>> stores into a single new one?
> 
> Good point, we had thought previously that one would just write a
> function themselves to merge two stores as there are many ways to do
> this. However, we might as well provide a utility function. I've added
> the following method to the API:
> 
> DataStore.merge( store )

Cheers, that'll be a time saver I'm sure :)
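
One possible reading of merge, sketched with a made-up `SketchStore` (not the draft's DataStore): copy every triple across and skip exact duplicates. Other merge strategies are equally plausible, which is presumably why it was left to authors before.

```javascript
// Hypothetical store: just enough to illustrate merge(); not the draft's DataStore.
function SketchStore() {
  this.triples = [];
}

// Add a triple unless an identical one is already present.
SketchStore.prototype.add = function (s, p, o) {
  var exists = this.triples.some(function (t) {
    return t.subject === s && t.predicate === p && t.object === o;
  });
  if (!exists) {
    this.triples.push({ subject: s, predicate: p, object: o });
  }
  return this;
};

// Copy every triple from `other` into this store, skipping duplicates.
SketchStore.prototype.merge = function (other) {
  var self = this;
  other.triples.forEach(function (t) {
    self.add(t.subject, t.predicate, t.object);
  });
  return this;
};

var rdfaStore = new SketchStore().add('#me', 'foaf:name', 'Nathan');
var hcardStore = new SketchStore()
  .add('#me', 'foaf:name', 'Nathan')
  .add('#me', 'vcard:fn', 'Nathan');

// A new store holding the data from both, with the shared triple kept once.
var combined = new SketchStore().merge(rdfaStore).merge(hcardStore);
```
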

Best,

Nathan
Received on Sunday, 1 August 2010 19:55:41 UTC