W3C

Web Annotation Working Group Teleconference

18 Feb 2015

Agenda

See also: IRC log

Attendees

Present
Rob Sanderson (azaroth), Frederick Hirsch (fjh), Dave Cramer (dauwhe), Kyrce Swenson (Kyrce), Ray Denenberg (Rayd), Tim Cole (TimCole), Bill Kasdorf (Bill_Kasdorf), Ivan Herman (Ivan), Matt Haas (Matt_Haas), T.B. Dinesh (tbdinesh), Maxence Guesdon (MGU), Benjamin Young (bigbluehat), Kristóf Csillag (csillag), Doug Schepers (Shepazu), Dan Whaley (dwhly), Jacob Jett (Jacob)
Regrets
Paolo Ciccarese, Raphaël Troncy, Davis Salisbury
Chair
Frederick Hirsch, Rob Sanderson
Scribe
Kyrce

Agenda Review

fjh: Approve minutes, go through data model issues; no protocol discussions scheduled. Robust anchoring: not much we can do right now. Anything else?

Minutes Approval

<fjh> proposed RESOLUTION: 11 February 2015 minutes are approved: http://www.w3.org/2015/02/11-annotation-minutes.html

<fjh> RESOLUTION: 11 February 2015 minutes are approved: http://www.w3.org/2015/02/11-annotation-minutes.html

Use Cases

fjh: Paolo cannot attend for the use case issues on CSV. Not up to date; we want to add a column, but we are not sure how that would work. Is there more on use cases that we can discuss on this call?

<fjh> fjh: saw that we have new material related to CSV use cases

<tbdinesh> i submitted the social semantic web as a use case

<tbdinesh> but it would be good to discuss after some offline looking at it

azaroth: Progress with protocol use cases: look at existing products that use some kind of protocol and see if we can get common use cases out of them. How iAnnotator works, or how any third-party annotation tool works.
... Promote, if not user stories, a synthesis of the current playing field.

fjh: .. needs action item but not sure who to.

<tbdinesh> (sorry, my phone with headphone has been a problem often)

azaroth: brief summary—paolo for annotopia, does anyone have other products they want to describe?

<azaroth> ACTION: azaroth to request summary of protocols from Paolo (Domeo) and Nick (Annotator) [recorded in http://www.w3.org/2015/02/18-annotation-minutes.html#action01]

<trackbot> Created ACTION-5 - Request summary of protocols from paolo (domeo) and nick (annotator) [on Robert Sanderson - due 2015-02-25].

fjh: azaroth to ask Nick. Nick will indicate if he is the right person. One issue is the linkage between protocol and data model. Decouple as much as possible. Zip files, etc. Decouple deliverables as much as we can.
... Might want to capture how use cases relate to our deliverables.
... Issues now, anchoring when Doug comes later.

Data Model issue review

<azaroth> * Split punning body to body / bodyValue?

azaroth: Did some cleaning, which prompted some issues. Run through them at a high level. Pick one or two to work on, whatever we think is most interesting to people on the call.

fjh: Is there a way to enter issues into chat?

azaroth: The body property takes either an object or a literal string as its value. That seemed to fulfill most of the requirements. After the first public working draft there was further discussion. Need to be able to do inferencing, etc. Suggestion is that we split body into body, which is always a resource, and bodyValue, which is always a string.

<azaroth> * How to do a graph as the body?

azaroth: Next is how to have a graph as the body of an annotation. Instead of having a human-readable comment, we have machine-readable content. It would be valuable to be able to express the value by using a small graph. In JSON-LD this is pretty straightforward. So that would solve the problem; TriG does not have the concept. Some discussion on what had been the recommendation or what had not.
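A minimal sketch of what a graph-valued body might look like in JSON-LD, written as a Python dictionary. It relies on the JSON-LD `@graph` keyword together with `@id` to name the graph. All `example.org` URIs, the choice of `dcterms:creator`, and the helper `graph_body_triples` are illustrative assumptions, not something agreed on the call.

```python
OA = "http://www.w3.org/ns/oa#"

# Hypothetical annotation whose body is a small named graph (assumed URIs).
annotation = {
    "@id": "http://example.org/anno1",
    "@type": OA + "Annotation",
    OA + "hasBody": {
        "@id": "http://example.org/graph1",  # the graph's own URI
        "@graph": [
            {
                "@id": "http://example.org/paper1",
                "http://purl.org/dc/terms/creator": {
                    "@id": "http://example.org/person1"
                },
            }
        ],
    },
    OA + "hasTarget": {"@id": "http://example.org/page1"},
}


def graph_body_triples(anno):
    """Return (graph URI, list of triples) for a graph-valued body.

    Only handles the flat node objects used in this sketch.
    """
    body = anno[OA + "hasBody"]
    triples = []
    for node in body.get("@graph", []):
        subject = node["@id"]
        for predicate, value in node.items():
            if predicate.startswith("@"):
                continue  # skip JSON-LD keywords such as @id
            obj = value["@id"] if isinstance(value, dict) else value
            triples.append((subject, predicate, obj))
    return body.get("@id"), triples
```

A consumer that does not care about the graph can ignore the `@graph` member and treat the body as an opaque resource identified by its `@id`.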

<azaroth> * PROV inference rules are broken by serializedBy/annotatedBy (and ...At); drop serialized*? change subPropertyOf?

<bigbluehat> http://www.w3.org/2004/03/trix/#consynt

<bigbluehat> TriG's in the list.

<ivan> http://www.w3.org/TR/trig/

<bigbluehat> tnx ivan

<fjh> s/.*consynt//

azaroth: In the community group model we have annotatedBy and annotatedAt for some ... TriG is a serialization of RDF that supports graphs.
... … my thought was to go through all of them and then pick ones to go over with but I am also happy to stop on the ones where people have interest.

<azaroth> +1 to Ivan

ivan: The problem with the named graph thing is that we bring in yet another abstraction that even fewer people know about than RDF. That creates problems. You and I have background; we know this. But there are few people who know this. I'd like to see alternatives and more on the original problem. It's a source of discussion even among experts.

azaroth: … paolo has the use case. When you have comments that are generated through machines, where the machine understands the semantics, it would be nice for systems to be able to have a machine-readable comment or body so you can include the graphs for processing.

TimCole: The community group wanted to use real use cases; this one is only interesting to a few people. One option: allow it, and let people who want to use it, use it. The other option: not allow it, and wait until there is pressure from those who want it to put it in. Not sure which is better. Can put it in a way that most people can ignore.

<tbdinesh> (sorry bad skype connection)

<Jacob> tuples (i.e. triples)

<fjh> graph is collection of triples with URI

<azaroth> Collection of RDF triples that have a URI

ivan: Apologies to those who don't understand. A graph is a collection of triples (RDF statements), and this collection has a URI of its own. In some sense it can be addressed today without complications. I think we should postpone for now.

fjh: I agree. the use case in LD is a non-issue. Can we just require JSON-LD and be done with it?

<shepazu> (BTW, here's an explanation of an RDF Triple: http://stackoverflow.com/questions/273218/what-is-an-rdf-triple)

<shepazu> (more detail: http://webdesign.about.com/od/rdf/a/rdf-triples.htm)

<shepazu> (basically, Subject, Predicate, Object)

shepazu: The issue was in supporting something in one serialization that cannot be said in another. Then it becomes problematic. One of the options, from the community group or before: we already have a way to say "here is a body and it has this format." You can still embed the triples; they would not be part of the serialization. I think that is what Ivan is suggesting? If it does not have the URI it cannot be embedded. I would be happy with that solution in the meantime.
... Talk about how we could achieve what was suggested. Main spec + module for those users, extending the data model.

ivan: … you don't have to extend the data model.

shepazu: … not sure that would satisfy the use case
... if we find that it does not, I am in favor of a simpler spec. If we need to extend it—extended feature is what you want.

<Jacob> +1 for postponing and addressing via a note or extension

RayD: Directed at Rob: this is related to BIBFRAME, and you understand that. Is this the same issue? Bodies that have properties. Is that the same?

<fjh> proposed RESOLUTION: postpone graph as body issue and address via note or extension

<TimCole> +1

<fjh> subsequently

Rob: similar. Body would describe a set of triples. Question is if the triples enter the body resource.

RayD: will send as part of the mailing list.

<fjh> proposed RESOLUTION: postpone graph as body issue and address via note or extension if current serialization isn’t sufficient, subsequently

<MGU> +1

<azaroth> +1

<shepazu> +1

<ivan> +1

<fjh> RESOLUTION: postpone graph as body issue and address via note or extension if current serialization isn’t sufficient, subsequently

<fjh> +1

<tbdinesh> +1

ivan: Trying to understand the issue. The issue came up when we had the discussion with the CSV people. They want to have an annotation of one string and a reference to the target. I would like that finalized so we can finalize the CSV doc.

azaroth: The current model is that you can have either a string or an object in JSON-LD. The revision proposed for the next draft is to add a second property, bodyValue. The rationale behind it: instead of checking the type of the resource, you could check for two different properties. You could still do inferencing more easily.
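A minimal sketch contrasting the two checks azaroth describes, assuming annotations arrive as parsed JSON dictionaries. The function names and the returned shape are hypothetical; only the property names `body` and `bodyValue` come from the discussion.

```python
from typing import Any, Dict, Optional


def read_punned_body(anno: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """FPWD style: `body` is either a plain string or an object."""
    body = anno.get("body")
    if isinstance(body, str):
        # Literal body: the string is the comment text itself.
        return {"text": body, "resource": None}
    if isinstance(body, dict):
        # Resource body: identified by its @id.
        return {"text": None, "resource": body.get("@id")}
    return {"text": None, "resource": None}


def read_split_body(anno: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Proposed split: `bodyValue` is always a string, `body` always a resource."""
    resource = anno.get("body")
    return {
        "text": anno.get("bodyValue"),
        "resource": resource.get("@id") if isinstance(resource, dict) else None,
    }
```

With the punned property the consumer branches on the value's type; with the split the consumer looks up two keys. Both yield the same information for simple cases, which is the trade-off debated in the exchange that follows.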

<Jacob> Can we just handle this by assigning a sub-type to the body rather than minting some new predicate?

TimCole: I thought the argument for not having a second predicate was compelling. I think that checking whether it's a string would be easier than checking for a second property.

Ivan: … I agree with everything Tim says today. It depends on who we want to optimize for. How many people in our community really want to do serious inferencing? for most users, simplicity is more important.

azaroth: it was antoine who was in favor of us splitting the properties. I forget what his motivation was. It seemed convincing to me at the time.

Ivan: So the point is that, in the JSON world, having a property value that can be either a string or an object is a widely used pattern. There is nothing surprising or off-putting about this for that community. Those are our users, in my view. For RDF users, perhaps they will need other tricks.

azaroth: One point: if the body can be either a resource or a literal, then if it's a resource you have to give it a JSON object construction. That is asymmetric with target, which will almost always be a URI. And that's confusing.

Ivan: I can envisage that the target may be more complicated than a URI for some media. We may have an object for target as well (e.g., a bnode). I wouldn't bet on target being simple.

azaroth: … not always simple. But if 99% use case is simple tagging, then that's the situation we should optimize for and the target would be a string and you would get the asymmetry. We can postpone the issue or we can have the consensus to leave as is.

RayD: Prior to this discussion, I wanted two properties. But this discussion made me change my mind. Antoine was arguing a year or so ago when inferencing was thought to be more important. This is no longer true today. We should rethink the two properties.

<TimCole> if we go 2 property route should we go body and bodyResource

<fjh> should we record in the use cases or elsewhere, the relative importance of inferencing to the design?

azaroth: … leave the model as it stands in the first public working draft?

<ivan> +1

<fjh> proposed RESOLUTION: resolve body property issue by keeping as in FPWD

<RayD> +1

<TimCole> +1

<ivan> +1

<Jacob> +1

<MGU> +1

<azaroth> proposed RESOLUTION: leave the model as it stands in the FPWD with a punning property, pending further discussion from interested parties

fjh: … suggest that we record in the use cases our view that inferencing is a lower priority.

azaroth: … in the principles and data model document.

fjh: … then the question won't come up as much. So it's necessary.

<azaroth> ACTION: azaroth to add that inferencing is not an important design criterion to data model principles [recorded in http://www.w3.org/2015/02/18-annotation-minutes.html#action02]

<trackbot> Created ACTION-6 - Add that inferencing is not an important design criterion to data model principles [on Robert Sanderson - due 2015-02-25].

<fjh> I would say that it is important but lower priority than usability in other contexts

azaroth: good progress on those two issues. Shall we cut over or do one more?

<azaroth> * PROV inference rules are broken by serializedBy/annotatedBy (and ...At); drop serialized*? change subPropertyOf?

azaroth: The next one is about inferencing rules. We inherit these from the provenance ontology. They concern the agent that creates the annotation: annotatedBy/annotatedAt, serializedBy/serializedAt.
... There are two agents for these. For generation, there can only be one reference.
... Two possible approaches. We have to change this because we are breaking some rules. We could split up the annotation into two separate resources, artifact and conceptual, so they can use the same term. We had decided against this in the past because it makes the model more complex.
... We could drop serializedBy/serializedAt and say we don't need this info, and the annotation would always just be the concept.

<Jacob> Is it possible to shift annotatedAt/By to be a property of the body?

azaroth: Simplest: we can change it so it is not part of the provenance ontology, and then we would avoid the inferencing problem. This does not change our model but seems the least pure. We would have some things that are provenance and some that are not, but they look like they come from the same place.

<fjh> +1 to importance of keeping information

ivan: … keeping this information somewhere is important. e.g., in scientific environments this information is part of the discourse. Let's postpone this and talk to Luc Moreau. See what the issue is from him and perhaps he has a better idea to solve this.

azaroth: Luc brought up the issue

<Jacob> +1 for Ivan's suggestion

<azaroth> https://github.com/w3c/web-annotation/issues/10

ivan: … without a solution? then we need him to propose a solution

<fjh> +1 ask for proposed solution in conjunction with raising issue

fjh: … I agree with Ivan. This brings inferencing back. Luc should provide a proposed solution.

azaroth: +1 from me as well. Suggested: change the mapping. Not sure what this would entail. Making it not part of the mapping, but I'm not certain.
... ask Luc to come up with a proposal that leaves things simple but doesn't fall into the problem.
... … Thank you for going through the issues. Let's cut over to anchoring.

<azaroth> ACTION: azaroth to ask Luc to propose a solution for #10 / #7 [recorded in http://www.w3.org/2015/02/18-annotation-minutes.html#action03]

<trackbot> Created ACTION-7 - Ask luc to propose a solution for #10 / #7 [on Robert Sanderson - due 2015-02-25].

Robust Anchoring

fjh: Is there a way to mark the issues we postpone so we don't go over them endlessly?

azaroth: We will postpone on GitHub.

<azaroth> ACTION: azaroth to update github issues to avoid repetition [recorded in http://www.w3.org/2015/02/18-annotation-minutes.html#action04]

<trackbot> Created ACTION-8 - Update github issues to avoid repetition [on Robert Sanderson - due 2015-02-25].

shepazu: There is some disagreement about whether or not this should be done at all. Some people are uncomfortable with standardizing. Next step is to carve out a week where I can draw up a strawman for the API.

fjh: do we know what the concern is and what the root cause is?

<csillag> shall I wait for my place in the queue, or should I just jump in?

Shepazu: I'll go ahead and try to characterize it.

<fjh> fjh: thanks for reminding me of the issue, now I remember the desire for more information

<csillag> I'll exit and re-join my voice connection

<fjh> shepazu: selectors input, return is best match, can have a variety of inputs for selection

Shepazu: Not sure where csillag sits on this, please correct me. Some of us at hypothes.is feel like we don't know enough to put forward a solution, that is, an algorithm for how things would behave. You put in a set of selectors, including the string you are looking for, and the API would return the best match. Selectors: text string, maybe distance on string, XPath, 32 characters before or after, etc. A find-in-page API would take these things and find the closest match.
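A rough sketch of the "selectors in, best match out" idea, assuming only a quoted string plus optional text before and after it. The function `find_best_match`, its parameters, and the scoring weights are hypothetical; this is not the algorithm used by hypothes.is or proposed for any API, and it uses Python's standard `difflib` purely for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1]; two empty strings count as a perfect match."""
    if not a and not b:
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def find_best_match(text, exact, prefix="", suffix=""):
    """Return (start, end, score) of the window that best fits the selectors.

    Slides a window the length of `exact` over `text` and scores each
    position by how well the window, the preceding text, and the following
    text match `exact`, `prefix`, and `suffix`. Weights are arbitrary.
    """
    if not exact or len(exact) > len(text):
        return None
    n = len(exact)
    best = None
    for start in range(len(text) - n + 1):
        end = start + n
        score = (
            2.0 * similarity(text[start:end], exact)
            + similarity(text[max(0, start - len(prefix)):start], prefix)
            + similarity(text[end:end + len(suffix)], suffix)
        ) / 4.0
        if best is None or score > best[2]:
            best = (start, end, score)  # ties keep the earliest position
    return best
```

When the quoted string occurs more than once, the surrounding context decides which occurrence wins; when the document has changed slightly, the closest window is returned instead of nothing. Whether every implementation must return the same window is exactly the interoperability question raised in the discussion.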

<fjh> so issue is getting deterministic response for selector algorithm, which might not be so easy, right?

shepazu: The concern from hypothes.is is that the goal would be to specify the algorithm, and implementations should follow this algorithm. Same return value on any document in all browsers. Same behavior. Same range. To do that we would need to 1. specify the algorithm, 2. test behavior exhaustively.

<csillag> zakin, +??p7 is csillag

<fjh> seems like an important concern. Can we narrow the scope/concern?

shepazu: Question whether we know enough right now to say what the algorithm should be. It has been said that it should be up to the implementation so they could optimize, but as a dev, I don't want inconsistencies across browsers.

<fjh> can we start with a very simple algorithm?

shepazu: we want interoperability over optimization. interoperability means that people can reliably build apps. Second: I think we can specify algorithms so we can extend in the future. First pass today; allow different ordering of selectors and let new algorithms be introduced or improved.
... … do this in a way that would not disrupt even past implementations. Still would be most reliable result.

<fjh> thanks Doug for clear overview

<azaroth> Just to note that we can't punt on the /model/ side of things -- we need to have some selectors.

csillag: Add two things. There are several different levels to standardize. 1. The API: how do we access text content and run searches. Concrete things we can plug into the API. We can standardize the API without standardizing the algorithm.

<azaroth> Thanks all :)

csillag: The other thing I wanted to note: there are two opinions, one to standardize and one not to.

<dwhly> +q

<fjh> cannot join

<ivan> +1 to that 'sort' option

<fjh> suggest we Adjourn

shepazu: Agree about the idea of letting authors provide their own algorithms. Example: the sort method on JS arrays, where you give a specific algorithm for the sort. Find out in the wild what the most optimized or popular algorithm would be.
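A small sketch of the pluggable idea, by analogy with passing a function to a sort routine: the API shape stays fixed while the matching strategy is supplied by the author. The names `anchor`, `exact_strategy`, and `case_insensitive_strategy` are hypothetical and not part of any proposed API.

```python
from typing import Callable, Optional, Tuple

Span = Tuple[int, int]
Strategy = Callable[[str, str], Optional[Span]]


def exact_strategy(text: str, quote: str) -> Optional[Span]:
    """Default: first exact occurrence of the quote."""
    start = text.find(quote)
    return None if start < 0 else (start, start + len(quote))


def case_insensitive_strategy(text: str, quote: str) -> Optional[Span]:
    """Author-supplied alternative: ignore case (assumes ASCII text)."""
    start = text.lower().find(quote.lower())
    return None if start < 0 else (start, start + len(quote))


def anchor(text: str, quote: str, strategy: Strategy = exact_strategy) -> Optional[Span]:
    """Fixed entry point; the matching algorithm is a parameter."""
    return strategy(text, quote)
```

Calling `anchor(doc, quote)` uses the default, while `anchor(doc, quote, strategy=case_insensitive_strategy)` swaps in the author's algorithm without changing the call site.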

<RayD> gotta go

<fjh> Kyrce, thank you very much for scribing!

<fjh> Thanks all for a very productive call

shepazu: A robust anchoring API is about more than text. Not everything you want to robustly anchor will be text. You search for other DOM nodes, including a portion of an image, a map, any other resource.

Ivan: … dwhly urgent? we need to pick this up later.

dwhly: … don't want to characterize things as minority or majority. Do we have enough information to standardize is the question.
... … put it on the agenda again and discuss

ivan: … adjourn now.
... … clearly something for another week. Thank you all. We are adjourned.

<ivan> trackbot, end telcon

Summary of Action Items

[NEW] ACTION: azaroth to add that inferencing is not an important design criterion to data model principles [recorded in http://www.w3.org/2015/02/18-annotation-minutes.html#action02]
[NEW] ACTION: azaroth to ask Luc to propose a solution for #10 / #7 [recorded in http://www.w3.org/2015/02/18-annotation-minutes.html#action03]
[NEW] ACTION: azaroth to request summary of protocols from Paolo (Domeo) and Nick (Annotator) [recorded in http://www.w3.org/2015/02/18-annotation-minutes.html#action01]
[NEW] ACTION: azaroth to update github issues to avoid repetition [recorded in http://www.w3.org/2015/02/18-annotation-minutes.html#action04]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.140 (CVS log)
$Date: 2015/02/25 18:23:56 $