RDFa Working Group Teleconference -- 18 Nov 2010

<trackbot> Date: 18 November 2010

<markbirbeck> Manu...I'm at a two-day open data camp: http://blog.okfn.org/2010/08/13/open-government-data-camp-2010-18-19th-november-2010/ I thought I'd be able to find a quiet spot to join the call, but it's not possible. I'll hang out on IRC as much as I can, in case you need any extra votes. :)

<markbirbeck> (I.e., I know there are some issues that you wanted to close sooner rather than later.)

<manu> scribe: Steven

<manu> scribenick: Steven

<manu> Agenda: http://lists.w3.org/Archives/Public/public-rdfa-wg/2010Nov/0101.html

<ShaneM> is muted

Manu: Michael Hausenblas is reviewing the RDFa Core 1.1 LC doc

ISSUE-53: DataParser Upgrades

ISSUE-53: DataParser Upgrades http://www.w3.org/2010/02/rdfa/track/issues/53

<trackbot> ISSUE-53 DataParser Interface and Arguments Upgrades notes added

issue-53?

<trackbot> ISSUE-53 -- DataParser Interface and Arguments Upgrades -- open

<trackbot> http://www.w3.org/2010/02/rdfa/track/issues/53

Manu: No one objected to the changes

<webr3> 57,49,55,58 all past 7 days

Manu: take a look at the coming two changes for potential objections

<manu> http://www.w3.org/2010/02/rdfa/track/issues/53

Nathan: Main changes to this parser are 1) to support lightweight SAX-like parsers by adding a method
... to give a callback on each triple found
... 'parse' method may need discussion
... should we also support fragment parsing
... or just full document
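
A minimal sketch of the SAX-like mode Nathan describes, in which the parser hands each triple to a callback as soon as it is found rather than building a graph. The name `parseEach` and the plain-object triple shape are assumptions made for illustration and do not come from the RDFa API draft. The input here is an already-extracted list of triples, where a real parser would discover them while reading the document.

```javascript
// Hypothetical sketch: none of these names come from the RDFa API draft.
// `source` stands in for a document; here it is simply an iterable of triples.
function parseEach(source, onTriple) {
  let count = 0;
  for (const triple of source) {
    onTriple(triple); // handed over immediately; the parser keeps no copy
    count += 1;
  }
  return count; // number of triples reported to the callback
}

// Usage: collect only the predicates, so memory use stays under the caller's control.
const seenPredicates = [];
const reported = parseEach(
  [
    { subject: "http://example.org/#me", predicate: "http://xmlns.com/foaf/0.1/name", object: "Alice" },
    { subject: "http://example.org/#me", predicate: "http://xmlns.com/foaf/0.1/nick", object: "al" },
  ],
  (triple) => seenPredicates.push(triple.predicate)
);
```

Because the callback decides what to retain, a small-footprint implementation never needs to hold the whole graph.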

Ivan: Procedural point
... is there anyone who will implement the interface on mobile for feedback?

<Zakim> manu, you wanted to comment on mobile implementation

Manu: LibRDFa is SAX-based, keeping the footprint small, so I could give feedback

Ivan: We need feedback

Manu: Is a parser with low memory requirements enough?

Ivan: We define our CR criteria, but they should be decent
... an implementation on a phone would be fine

<markbirbeck> We already support fragment parsing in a sense, since the input to the parse method is a DOMElement object, not a Document object.

Shane: I have access to mobile environments
... we should definitely test it

<markbirbeck> MACRO11

<markbirbeck> brings back memories.

Nathan: I see that Mark points out that we do DOMElement, which allows fragments, but a DOMElement is not the document node

<markbirbeck> @Steven: Right. However, we stipulate that the /entire/ document must be taken into account for context.

<markbirbeck> But we /could/ say that we start parsing at that node, and that's it.

<markbirbeck> (That's what my parser did.)

<webr3> markbirbeck, yes but you can't throw in a Document because Document extends Node not Element

<markbirbeck> Oh yes, sorry.

<markbirbeck> @webr3: Good point...but I think that is a mistake then. The intention was originally to be able to start parsing anywhere.

[Scribe notes that webr3 is Nathan]

Nathan: there are two interfaces, one for each triple, one for the whole graph
... a third allows you to filter triples
... to keep memory requirements down

Manu: I added some items at the bottom of the issue
... such as RDF Graph
... how does the parser callback have access to the filter method?

Nathan: The filter is run by the parser itself, and the callback is only called if the filter allows it
... the callback is a graph
... the graph is first assembled, and filtered, then passed to the callback

Manu: So 'run' is only called once at the end of parsing?

Nathan: Yes

<webr3> parse( document, function(graph) { //etc }, ?filter )

Manu: If there is an error, how do they get hold of the processor graph?

<markbirbeck> @Nathan: Not quite following...how does it save memory to assemble the whole graph? Surely you want to reject triples during assembly? Or have I missed the point? ;)

[scribe missed answer]

Manu: Is the run method called with what could be parsed?

Nathan: I think so

Manu: LibRDFa ends parsing many times because of bad documents

<webr3> @Mark, the whole graph isn't assembled, before the parser adds a triple to the graph it checks if it passes the filter or not before adding
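
A minimal sketch of the behaviour Nathan clarifies here, following the shape `parse( document, function(graph) { }, ?filter )` quoted earlier: the filter is consulted before each triple is added, and the callback runs once at the end with the filtered graph. The helper `extractTriples` and the `{ triples: [...] }` document stand-in are assumptions for illustration only; the implementation linked from the call may differ.

```javascript
// Hypothetical sketch; extractTriples stands in for real RDFa processing.
function* extractTriples(doc) {
  yield* doc.triples;
}

function parse(doc, callback, filter) {
  const graph = [];
  for (const triple of extractTriples(doc)) {
    // The filter is checked before the triple is added,
    // so rejected triples never take up space in the graph.
    if (!filter || filter(triple)) {
      graph.push(triple);
    }
  }
  callback(graph); // called exactly once, after parsing has finished
  return true;
}

// Usage: keep only foaf:name triples.
const FOAF_NAME = "http://xmlns.com/foaf/0.1/name";
let filteredGraph = null;
const succeeded = parse(
  {
    triples: [
      { subject: "_:a", predicate: FOAF_NAME, object: "Alice" },
      { subject: "_:a", predicate: "http://xmlns.com/foaf/0.1/nick", object: "al" },
      { subject: "_:b", predicate: FOAF_NAME, object: "Bob" },
    ],
  },
  (graph) => { filteredGraph = graph; },
  (triple) => triple.predicate === FOAF_NAME
);
```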

Manu: it is more useful to be able to use triples you have found than to give up completely

<markbirbeck> Like this?:

<markbirbeck> parse( document, function( triple ) { if ( favoured( triple ) ) return true; return false; } )

<markbirbeck> Check each triple?

<markbirbeck> Only add if it matches?

<webr3> @Mark: pretty much yes :) see: https://github.com/webr3/rdfa-api/blob/master/source/parsers.js

Ivan: The question is whether any of the triples found are in error because of the parsing error

Manu: Triples can still be wrong without parsing errors

Ivan: True
... I would prefer not to get triples than to get wrong ones

Manu: That should be up to the application

<markbirbeck> @Nathan: Thanks. Was confused by earlier discussion...it sounded like the filter was applied after the entire graph was assembled.

Manu: Most pages are broken, but with librdfa, I can get lots of useful triples despite that

Steven: Isn't there a DOM? We should define it in terms of the DOM you get, or would have got if you had built one

Manu: Not possible with a SAX-based parser
... if you're not running in a browser environment, you don't have that option

Ivan: I propose that we should keep the triples, but someone should analyse what sort of danger we are in
... I don't want incorrect triples.

Manu: The parser returns false for incorrect input

Nathan: I think that's fine
... how about if a profile can't be retrieved? Is that a parsing failure?

Manu: I think so, a processing error should return false

<markbirbeck> Bearing in mind of course that a parser is not obliged to retrieve a profile. :)

Manu: The other issue is the two graphs, default and processor. Should both go into the callback?

Nathan: Only pass in the default graph into the callback
... that would be wise

Manu: Expose the processor graph via an attribute or a method?

Nathan: A method
... would be friendlier
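
A minimal sketch of the points discussed above, under stated assumptions: `parse` returns false when a processing error occurs, the callback receives only the default graph, and the processor graph is reached through a method. The class name `SketchParser`, the method name `getProcessorGraph`, and the `{ items: [...] }` input are hypothetical. Delivering the triples found before an error reflects Manu's preference, which the group had not settled at this point.

```javascript
// Hypothetical sketch; names are illustrative and not taken from the RDFa API draft.
class SketchParser {
  constructor() {
    this._processorGraph = [];
  }

  // Each input item is either { triple } or { error }, standing in for real parsing.
  parse(doc, callback) {
    this._processorGraph = [];
    const defaultGraph = [];
    let ok = true;
    for (const item of doc.items) {
      if (item.error) {
        // Errors are recorded in the processor graph, not the default graph.
        this._processorGraph.push({ type: "Error", message: item.error });
        ok = false;
        continue;
      }
      defaultGraph.push(item.triple);
    }
    callback(defaultGraph); // only the default graph is passed in
    return ok; // false signals that a processing error occurred
  }

  // The processor graph is exposed via a method rather than an attribute.
  getProcessorGraph() {
    return this._processorGraph.slice();
  }
}

// Usage: one good triple, one failure (for example, a profile that could not be retrieved).
const sketchParser = new SketchParser();
let delivered = [];
const parsedCleanly = sketchParser.parse(
  {
    items: [
      { triple: { subject: "_:a", predicate: "http://purl.org/dc/terms/title", object: "Minutes" } },
      { error: "profile could not be retrieved" },
    ],
  },
  (graph) => { delivered = graph; }
);
const problems = sketchParser.getProcessorGraph();
```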

Manu: Agree
... So that handles the parse method
... oh, one more thing

<markbirbeck> Doesn't that latter point depend on the graph v. store issue?

Manu: do we support passing in IRIs?

<markbirbeck> If you pass in the store, you get both graphs anyway.

Manu: Are there any MUSTs in what is supported?

<manu> Does parse() support: DOMElement, Document, text, IRI?

Nathan: MUST be document, and I suggest text, just for existing RDF. Not sure about IRI
... you don't know what's at the other end

<manu> Document, text

Manu: Could I pass in a <div>?

Ivan: You lose the context, prefixes, language, base, ...
... and get wrong triples

<markbirbeck> Obviously difficult to determine the subtleties of the discussion, but why not a node? I.e., element /or/ document.

<markbirbeck> You don't get 'wrong' triples if you know what you are parsing.

<webr3> mark, agreed /if/ partial documents then has to be Node

Manu: Snippet editing is a use-case

<markbirbeck> Lots of people are doing stuff where they embed RDFa in RDFa.

<markbirbeck> E.g., Drupal.

Manu: Don't think we need to support DOM elements

<markbirbeck> So you might want to parse the /embedded/ RDFa using a different context to the one from the source document.

Ivan: I don't even know what the host language is with a fragment

Manu: OK to leave elements out?

Nathan: I'd be happy with that

<markbirbeck> Seems unnecessary.

<markbirbeck> (To leave it out, I mean.)

<webr3> drupal case, is that not 2 documents?

Steven: Is Mark's example a suitable use-case?

Manu: DOM element parsing is more like a convenience
... we can still support the Drupal case without DOM element

<ShaneM> I do not think that arbitrary element subtree parsing makes any sense on its own. It has to be in the context of a document so there is a media type / base.

<manu> @markbirbeck: We don't need to process DOMElements if we support Document and text

<manu> @markbirbeck: DOMElement processing is more of a convenience method...

<markbirbeck> Do we not want to provide convenience? ;)

<manu> @markbirbeck: yes, but it's not too inconvenient to wrap the content in HTML/HEAD/BODY ?

<manu> @markbirbeck: Let's push off this decision until a bit later, until we can have a conversation w/ you about it.

<webr3> @markbirbeck: libraries can easily without requiring this, especially if we just set the type to 'any' which we have to - the issue is half moot

<markbirbeck> The RDFa parsing model is recursive, so once you get past the root node, then you are always parsing elements anyway.

<markbirbeck> The context is passed down the calls, by accumulating from what's above.
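
A minimal sketch of the recursive model Mark describes, assuming a simplified element made of plain objects (not DOM nodes) and a context limited to prefix mappings and language. The function `walk` and the object shapes are hypothetical. It shows both sides of the discussion: starting at an inner element with an empty context loses the inherited prefixes and language, while a caller who supplies the context explicitly gets the same result as parsing from the root.

```javascript
// Hypothetical sketch; real RDFa processing tracks far more state than this.
function walk(element, inherited, out) {
  // The context for this element accumulates from what is above it.
  const context = {
    prefixes: { ...inherited.prefixes, ...(element.prefixes || {}) },
    lang: element.lang || inherited.lang,
  };
  if (element.property) {
    const [prefix, local] = element.property.split(":");
    const namespace = context.prefixes[prefix];
    out.push({
      predicate: namespace ? namespace + local : null, // null: prefix unknown in this context
      object: element.content,
      lang: context.lang,
    });
  }
  for (const child of element.children || []) {
    walk(child, context, out);
  }
  return out;
}

const emptyContext = { prefixes: {}, lang: null };
const span = { property: "dc:title", content: "Minutes" };
const root = {
  prefixes: { dc: "http://purl.org/dc/terms/" },
  lang: "en",
  children: [{ children: [span] }],
};

const fromRoot = walk(root, emptyContext, []);     // context inherited from the document
const fromFragment = walk(span, emptyContext, []); // same element, no context supplied
const withSuppliedContext = walk(
  span,
  { prefixes: { dc: "http://purl.org/dc/terms/" }, lang: "en" },
  []
);
```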

Nathan: If we're supporting more than Document, then we have to change the interface to ANY
... we can't prevent that

Manu: Right, but we'd use spec text to prevent it

<markbirbeck> So it's not really an argument to say that you need the context from the host document, because once you get past the root you're using a contrived context anyway.

<markbirbeck> @Manu: Sure. Sorry that I've missed yet another call. :(

Manu: We have to make the requirements clear

Nathan: I agree

Manu: We will have a future discussion with Mark about fragments

AOB

Steven: Next Thursday is Thanksgiving, are we having a call?

<markbirbeck> I'll also ping Stephane because I think it's his use-case that has lodged in my mind.

<markbirbeck> (And if it's not, I'll see if I can work out where it came from.)

Manu: I can't make the call. Please feel free to have a telcon without me

Shane: I'm available

Manu: I'll send an agenda for the rest of you

[ADJOURN]

<markbirbeck> Bye everyone.
