IRC log of prov on 2012-01-26

Timestamps are in UTC.

Meeting: Provenance Working Group Teleconference
Date: 26 January 2012
16:00:22 [Curt]
scribe: Curt
Chair: Paul Groth
16:01:30 [Mike]
Mike has joined #prov
16:01:44 [Curt]
Regrets: Graham Klyne, Paolo Missier, Khalid Belhajjame, Daniel Garijo
16:03:08 [jcheney]
jcheney has joined #prov
16:05:52 [pgroth]
PROPOSED to accept the minutes of the Jan. 19 telecon
16:05:54 [satya]
16:05:57 [davidschaengold]
16:05:58 [Curt]
0 (not present)
16:06:13 [Christine]
0 (not present)
16:06:13 [kai]
0 (not present)
16:06:15 [smiles]
16:06:16 [jcheney]
16:08:27 [Curt]
pgroth: actions: satya reviewing issues
16:08:50 [Curt]
satya: will try to respond to each on list, but time is short, progress on many of them
16:09:13 [Curt]
... many already addressed, satya just needs to review and make proper recommendations
16:09:18 [pgroth]
16:09:30 [pgroth]
Topic: F2F prep document updates
16:09:48 [Curt]
pgroth: going through documents to determine status and if changes are needed before F2F
16:10:04 [Curt]
... prov-primer
16:11:15 [satya]
16:11:27 [Curt]
working out updates needed, not changed since last editors version
16:11:58 [Curt]
satya: rdfs already provides way to do annotations, not currently modeled like that
16:12:29 [pgroth]
ack satya
16:13:11 [Curt]
satya: trying to bring everything into sync with prov-o and prov-dm in primer,
16:13:28 [Curt]
pgroth: prov-aq
16:14:03 [Curt]
... Graham has made changes responding to most of the issues; a few issues need discussion at F2F and after
16:14:04 [pgroth]
16:14:11 [Curt]
... in good shape for F2F
16:14:19 [Curt]
pgroth: prov-dm
16:14:29 [Curt]
luc: third working draft to release today for F2F
16:14:36 [pgroth]
16:14:40 [Curt]
pgroth: prov-o
16:15:23 [Curt]
many issues addressed at prov-o working group level, some still need whole WG to discuss
16:15:24 [Luc]
16:15:28 [pgroth]
ack Luc
16:15:33 [Curt]
current version has edits
16:15:55 [Curt]
luc: no update for precise/imprecise derivations
16:16:08 [Curt]
satya: still under discussion, consensus not yet determined
16:16:29 [Curt]
luc: some decisions made
16:16:53 [pgroth]
16:16:57 [Curt]
satya: progress has been made, but some things still unclear, need more discussion
16:17:02 [Curt]
pgroth: prov-sem
16:17:32 [Curt]
jcheney: not much changed recently, watching prov-o domain of discourse discussion, which may have an impact
16:17:44 [Curt]
jcheney: waiting for final determination to incorporate
16:17:56 [Curt]
jcheney: a few more things to flesh out that will happen prior to F2F
16:18:14 [Curt]
pgroth: most documents in reasonable sync, given work that has been done
16:18:37 [pgroth]
Topic: Prov-dm for the 3rd working draft
16:19:22 [Luc]
16:20:30 [Curt]
luc: work on complement, specialization, examples, derivation, collections, restructuring, new section 7 with constraints on data model
16:20:53 [Curt]
... agent and hadPlan
16:21:11 [pgroth]
Proposed: Release Prov-dm as a third working draft
16:21:19 [smiles]
16:21:24 [satya]
16:21:24 [jcheney]
16:21:25 [MacTed]
16:21:28 [Curt]
16:21:32 [kai]
16:21:53 [Curt]
satya: is the 3rd WD to reflect universe of discourse discussion identifiers?
16:22:05 [pgroth]
ack satya
16:22:30 [Curt]
luc: no, those aren't incorporated yet, those will go into the 4th WD, identifiers and accounts
16:23:12 [Curt]
... too many changes to incorporate, still determining final agreement on identifiers/accounts, may take a while
16:23:37 [satya]
16:23:38 [pgroth]
16:23:40 [Curt]
satya: yes, those may have broad impact
16:24:03 [pgroth]
Accepted: Release Prov-dm as a third working draft
16:24:21 [satya]
16:24:48 [Curt]
satya: good to freeze changes at a defined point and release a good draft
16:25:01 [Curt]
... we should follow that model for prov-o
16:25:07 [pgroth]
ack satya
16:25:13 [Curt]
pgroth: required by W3C to release each 3 months
16:25:21 [Curt]
luc: good to have well-defined goals for each release
16:25:31 [pgroth]
Topic: Identifiers in Prov-dm
16:25:40 [Luc]
16:26:06 [Luc]
I hope I included all the votes (I just added James')
16:26:06 [pgroth]
"*All* objects of discourse ("entities") MUST be identifiable by all
16:26:07 [pgroth]
participants in discourse. Object descriptions ("entity records" and
16:26:07 [pgroth]
otherwise) SHOULD use an unambiguous identifier (either reusing an
16:26:07 [pgroth]
existing identifier, or introducing a new identifier) for the objects
16:26:07 [pgroth]
described." (intent)
16:27:07 [pgroth]
16:27:18 [Curt]
pgroth: a series of items were considered to determine what should be part of the universe of discourse
16:27:28 [pgroth]
Proposal 1: Entities and Activities belong to the universe of discourse.
16:27:48 [Luc]
all votes were positive
16:28:34 [MacTed]
I have failed to keep up with the list this week, and see argument with several of these proposals...
16:28:43 [Curt]
(many who voted are not present)
16:28:57 [MacTed]
Zakim, unmute me
16:28:57 [Zakim]
MacTed should no longer be muted
16:29:08 [Curt]
luc/pgroth: record previous vote for minutes rather than re-voting here
16:29:42 [Luc]
ACCEPTED: Proposal 1. Entities and Activities belong to the universe of discourse.
16:30:01 [pgroth]
Proposal 2: Events (Entity Usage event, Entity Generation Event,
16:30:01 [pgroth]
Activity Start Event, Activity End event) belong to the universe of
16:30:02 [pgroth]
discourse.
16:30:06 [Luc]
16:30:27 [MacTed]
I accept Proposals 1-4, and have concerns or issues with 5-9
16:30:32 [Luc]
ACCEPTED: Proposal 2: Events (Entity Usage event, Entity Generation Event, Activity Start Event, Activity End event) belong to the universe of discourse
16:30:48 [satya]
16:31:21 [satya]
16:31:24 [pgroth]
ack satya
16:31:33 [Curt]
satya: with respect to prov-o, those were included
16:31:37 [Luc]
Proposal 3: Derivation, Association, Responsibility chains, Traceability, Activity Ordering, Revision, Attribution, Quotation, Summary, Original Source, CollectionAfterInsertion/Collection After removal belong to the universe of discourse.
16:32:11 [Curt]
luc: Stian voted -1 (for all but associations)
16:32:36 [Curt]
... not sure of his rationale
16:33:35 [Curt]
tim: laundry list is long, a concern to determine how each should be modeled in prov-o
16:34:06 [Curt]
luc: satya supported derivation, association and activity ordering, do you support those?
16:34:07 [Curt]
tim: yes
16:34:31 [pgroth]
16:34:43 [Curt]
luc: why doesn't stian think association should not be part of universe of discourse?
16:34:57 [Curt]
pgroth: possibly rephrase proposal 3 and re-vote?
16:35:17 [Curt]
luc: association belongs, since stian and tim do support those
16:35:17 [Luc]
Proposal: 3a: Association belongs to the universe of discourse
16:35:44 [Curt]
luc: we'll discuss with stian further and rephrase rest of proposal 3
16:36:17 [Curt]
tim: accepts association
16:36:26 [pgroth]
16:36:35 [Luc]
ACCEPTED: Proposal: 3a: Association belongs to the universe of discourse
16:36:40 [pgroth]
Proposal 4: AlternateOf and SpecializationOf belong to the universe of
16:36:40 [pgroth]
discourse.
16:37:13 [tlebo]
tlebo has joined #prov
16:37:20 [Curt]
pgroth: may need more discussion of proposal 4, postpone for now
16:37:20 [pgroth]
16:37:33 [Luc]
Proposal 5: Records do not belong to the Universe of discourse. This includes Account Record.
16:38:02 [pgroth]
16:38:09 [Curt]
pgroth: satya and macted disagree
16:38:44 [Curt]
satya: we need a construct to aggregate prov. assertions, if we remove records/accounts, we won't have a good way to do that
16:39:21 [Curt]
macted: is this to differentiate data/metadata in a given context?
16:39:23 [Luc]
16:39:27 [pgroth]
16:39:45 [Curt]
... in a database world, the fields are filled with data, the table has the metadata
16:39:58 [zednik]
16:39:58 [Curt]
luc: we're trying to establish that
16:40:07 [Curt]
macted: we need to make that distinction
16:40:23 [pgroth]
ack Luc
16:40:41 [Curt]
luc: we are talking about different levels, the world where things happen; level 2 descriptions of what happened in the world
16:40:54 [Curt]
... account records are at that second level
16:41:05 [Curt]
... we can go even higher to talk about provenance of provenance
16:41:31 [Curt]
macted: that isn't clear in these proposals
16:41:38 [Curt]
luc: we're trying to represent that intent
16:42:10 [Curt]
macted: things/entities are interchangeable, the proposals aren't clear
16:42:34 [Curt]
luc: we're trying to determine how to represent our intent into the documents
16:42:46 [Curt]
macted: difficult with text alone
16:42:47 [jcheney]
See also ISSUE-212
16:42:52 [tlebo]
tlebo has joined #prov
16:42:56 [Curt]
luc: yes, more graphics would help explain the concepts
16:43:26 [Curt]
zednik: yes, confusing, perhaps graphics or ASN could help explain this better, esp. things like prov. of prov.
16:43:26 [Luc]
16:43:29 [jcheney]
Is prov of prov on the critical path? I agree it's important but perhaps we should table it until one-layer prov is stable
16:43:32 [pgroth]
ack zednik
16:43:42 [satya]
16:43:44 [Curt]
pgroth: there is some demand for prov. of prov. from the group
16:44:22 [pgroth]
16:44:25 [Curt]
macted: this is a perpetual problem in graphs, the recursion. These levels can be better described graphically
16:44:36 [Curt]
luc: we haven't determined how to express prov. of prov. yet
16:45:09 [zednik]
@jcheney from "Recommendation # 4: A provenance framework should include a standard way to express the provenance of provenance assertions, as there can be several accounts of provenance and with different granularity and that may possibly conflict"
16:45:21 [Curt]
... for some account records aren't part of discourse, but if you do want to talk about them, then you will have to identify them
16:45:34 [satya]
16:45:38 [zednik]
16:45:40 [pgroth]
ack Luc
16:45:44 [Curt]
... do we want to have prov. of prov.? is that part of the scope we should cover?
16:45:47 [pgroth]
ack zednik
16:46:06 [Curt]
zednik: we don't want to preclude describing prov. of prov.
16:46:47 [Curt]
luc: the term 'thing' -- if we use an account record, we need to make the 'thing' an entity so we can describe it
16:47:06 [Curt]
... looking for guidelines/recommendations of where we are going with this
16:47:12 [pgroth]
16:47:41 [Curt]
pgroth: if we remove notion of account record from proposal 5, would that be in line with our thinking?
16:47:47 [tlebo]
+1 luc: the way to talk about things is by introducing entities. (we get provenance of provenance by making entities about the records - we effectively have shifted the two levels.)
16:47:57 [stephenc]
We have a use case for provenance-of-provenance on legislation
16:47:58 [pgroth]
16:48:10 [pgroth]
Proposal 5: Records do not belong to the Universe of discourse
16:48:35 [pgroth]
16:49:07 [Curt]
macted: this is the recursion problem. prov. of a thing is itself a thing (an entity) when asserting provenance about it
16:49:19 [satya]
16:49:20 [Curt]
macted: difficult to express without a picture
16:49:35 [Curt]
luc: we need more guidance to even draw the picture
16:49:50 [tlebo]
+1 (if i want to talk about Records, I make an entity about it)
16:50:02 [pgroth]
i agree with you tlebo
16:50:05 [Curt]
... if all records have an identity, that is a different direction than if records are not part of the universe of discourse
16:50:33 [pgroth]
16:50:40 [Curt]
macted: example - i have a table, built 1727, joe smith, sold on jan 19, 1728, sold again, again, again
16:50:50 [Curt]
... we track that journey through the world -- the provenance
16:50:58 [Curt]
... the records of that provenance are a distinct entity
16:51:11 [Curt]
... the provenance of the provenance is that I said it was built in 1727
16:51:22 [Curt]
... that shifts the perspective up a level
16:51:30 [kai]
+1 for provenance on provenance.
16:51:42 [Curt]
... one level talks about the table, one about the provenance, one about the provenance of the records of the provenance.
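Editorial sketch of the three levels in macted's table example, assuming records are made describable by introducing an entity that stands for them. The names Entity, Record, Store, shift and level, and the ex: identifiers, are hypothetical and do not come from PROV-DM or from this discussion.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entity:
    """Something in the universe of discourse; it always has an identifier."""
    id: str


@dataclass(frozen=True)
class Record:
    """A provenance statement about an entity (hypothetical structure)."""
    subject: Entity
    statement: str


class Store:
    """Holds records and remembers which entities stand for which records."""

    def __init__(self) -> None:
        self.records: List[Record] = []
        self.reified: Dict[str, Record] = {}

    def assert_record(self, subject: Entity, statement: str) -> Record:
        record = Record(subject, statement)
        self.records.append(record)
        return record

    def shift(self, record: Record, new_id: str) -> Entity:
        """Introduce an entity that stands for a record, so it can be described."""
        self.reified[new_id] = record
        return Entity(new_id)

    def level(self, entity: Entity) -> int:
        """0 for a plain thing, plus 1 for each record layer the entity stands for."""
        depth = 0
        while entity.id in self.reified:
            entity = self.reified[entity.id].subject
            depth += 1
        return depth


store = Store()
table = Entity("ex:table")                               # the thing in the world
built = store.assert_record(table, "built in 1727")      # provenance of the table
built_entity = store.shift(built, "ex:record-built")     # the record, made describable
said = store.assert_record(built_entity, "asserted by macted")  # provenance of provenance
said_entity = store.shift(said, "ex:record-said")
```

In this sketch nothing stops the shift from being applied again, which is the recursion macted describes: each further level is just another entity with its own records.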
16:51:45 [kai]
That's metadata provenance
16:51:59 [tlebo]
(so Records are outside of DM's "current" macted:Shift)
16:52:03 [Curt]
macted: this can be difficult to follow
16:52:25 [tlebo]
@macted, good example
16:52:35 [Curt]
pgroth: that use case is clear, but how do we best communicate that? what construct should prov-dm have?
16:52:56 [Curt]
macted: use a concrete example to figure that out, rather than trying to solve in the abstract
16:53:14 [Curt]
... have to look at both sides to make sure it all works
16:53:24 [pgroth]
16:53:24 [Curt]
... doing the abstract first makes this harder
16:53:26 [pgroth]
16:53:32 [pgroth]
ack satya
16:53:49 [zednik]
+1 to use concrete example before deciding on abstract model restrictions
16:53:52 [Curt]
satya: the way to talk about things is to introduce entities
16:54:13 [Curt]
... when we want to talk about prov-of-prov, we need to have a universal construct for that
16:54:38 [pgroth]
16:54:41 [Curt]
... we have been discussing this notion already. records should be part of the universe of discourse
16:54:49 [jcheney]
16:55:00 [pgroth]
ack jcheney
16:55:10 [tlebo]
@satya, did you say that you need Account Records AND Accounts in UOD?
16:55:25 [Curt]
jcheney: I said I agree there is a difference between saying all records are part of the UofD, or if some could be
16:55:44 [Curt]
... some ambiguity. Some entities might contain information about provenance records contained elsewhere
16:55:53 [Curt]
... in order to express prov-of-prov
16:56:04 [kai]
16:56:41 [Curt]
... this isn't something we have to decide now to make progress, could we say "by default records aren't necessarily identified entities in the UofD, but they might be"
16:56:42 [pgroth]
16:57:07 [tlebo]
+1 james: by default records are not in domain of discourse, but can be if entities are used to discuss them (this shifts the perspective)
16:57:29 [Curt]
kai: we have a similar problem in dublin core, we can describe everything, but then we have to describe the description
16:57:29 [Zakim]
16:58:06 [tlebo]
+1 "it's nothing special"!
16:58:07 [Curt]
... we need to be able to describe prov-of-prov, need to consider the prov itself as an entity.
16:58:17 [Curt]
... if we do that, then we don't have a problem
16:58:46 [Curt]
... keep it simple, just say that prov. itself can be an entity, then you can describe it just like you describe the prov. of any entity
16:58:48 [tlebo]
+1 keep it simple (knowing that it can be shifted)
16:58:48 [pgroth]
16:58:51 [pgroth]
ack kai
16:58:53 [Curt]
... simply handles the recursion
16:59:12 [pgroth]
by default records are not in domain of discourse, but can be if entities are used to discuss them
16:59:33 [smiles]
16:59:47 [tlebo]
records are only a means of transmission. We only care about the content of the transmission.
16:59:50 [Curt]
pgroth: trying to capture this -- james' proposal allows us to shift perspective, is that ok? is that sufficient guidance for luc?
16:59:53 [MacTed]
see SKOS - containers of entities, which are containers of entities, which are containers...
17:00:03 [Curt]
luc: yes, that and the emails
17:00:16 [Zakim]
17:00:24 [tlebo]
I'm at the top of the hour
17:00:26 [jcheney]
OK with me (that's actually tlebo's wording, but I like it)
17:00:27 [MacTed]
er, sorry, SIOC not SKOS
17:00:28 [kai]
Don't make the mistake that in the end you can describe the provenance of everything, the only exception would be the provenance (records).
17:00:40 [Zakim]
17:00:51 [Curt]
pgroth: next few proposals need even more discussion
17:01:27 [pgroth]
Proposal: by default records are not in domain of discourse, but can be if entities are used to discuss them
17:01:38 [tlebo]
17:01:42 [jcheney]
17:01:44 [trackbot]
trackbot has joined #prov
17:02:03 [Curt]
satya: what does "by default" mean?
17:02:10 [tlebo]
"the current layers of the shift"
17:02:31 [Curt]
pgroth: when you describe provenance, you use things like entities, derivations, etc. not records
17:02:38 [jcheney]
I think it means that you can't infer that a record is in the domain of discourse. You have to assert it.
17:02:40 [Zakim]
17:02:56 [Curt]
... but if you want to describe prov-of-prov, you would (in some fashion) make the records into entities and use those
17:03:31 [satya]
17:03:35 [tlebo]
If we argue for a third layer, we are not being compact and eloquent. And we could argue for the fourth, and fifth. It won't end.
17:03:35 [Curt]
satya: decision not critical to move on
17:03:46 [Curt]
pgroth: this is important for modeling
17:03:54 [pgroth]
17:03:56 [pgroth]
17:04:05 [jcheney]
@satya: There is a difference between saying records "MAY" be in the domain of discourse and records MUST be in the domain of discourse.
17:04:05 [kai]
17:04:10 [Luc]
@tlebo: i don't think we would introduce more layers, but a "shift operator"
17:04:32 [Curt]
kai: I can describe the provenance of data, not just things
17:04:54 [Curt]
kai: provenance of data is itself data, so we can describe it the same way
17:05:11 [tlebo]
@ speaker, because we already have what we need to discuss provenance (Entities)
17:05:25 [zednik]
-1 (show concrete example before making modeling decision, not other way around)
17:05:29 [Curt]
pgroth: we have "provenance records". last week we said things in the UofD are identified
17:05:53 [Curt]
... if we say records are part of the UofD, then we have to give them identifiers -- that affects the modeling
17:06:04 [Curt]
kai: what is the problem giving them an identifier?
17:06:16 [Curt]
pgroth: sometimes, we might not want to assign them identifiers
17:06:32 [pgroth]
17:06:55 [tlebo]
17:06:59 [Curt]
pgroth: is that in our UofD?
17:07:00 [Zakim]
17:07:26 [satya]
Sorry, I have to leave.
17:07:34 [Curt]
kai: I can only describe identifiable things, so if we want to describe them, we have to identify them
17:07:57 [Curt]
... just a collection of statements might not have an identifier, so we'll have to identify them if we want to describe them
17:07:58 [jcheney]
alternative wording: "records MAY be in the domain of discourse, but we don't assume that all records are in the domain of discourse" ???
17:08:03 [Zakim]
17:08:25 [Curt]
pgroth: some agreement, but try different wording
17:08:27 [pgroth]
records MAY be in the domain of discourse, but we don't assume that all records are in the domain of discourse
17:08:52 [jcheney]
is that at least clearer than "by default"?
17:09:11 [Curt]
kai: I think records are in the UofD, but only if they have an identity
17:09:42 [Curt]
kai: "every record that has its own identity is in the UofD"
17:10:06 [Curt]
luc: we were using accounts to handle this, not every single record
17:10:25 [Curt]
... we weren't going to have provenance of other records
17:11:01 [Curt]
... if we revisit this, we need to change more of the data model. we were previously only using accounts as a way to describe prov-of-prov
17:11:13 [Curt]
... are we questioning those decisions made 6 months ago?
17:11:39 [jcheney]
It may not have been clear to everyone whether "records" included or excluded accounts in this discussion (it wasn't to me)
17:11:42 [Curt]
... the latest draft still says the only way to describe provenance itself is through accounts
17:12:07 [Curt]
kai: something that has a URI, an identity, is something that exists. why restrict how you can describe that thing?
17:12:34 [Curt]
luc: we aren't considering resources in general, just the way we model those things in prov-dm
17:12:46 [MacTed]
SIOC Ontology -- -- may save us reinventing many wheels....
17:12:57 [Curt]
luc: are we making provenance records part of the UofD. Can we represent prov. of accounts?
17:13:11 [MacTed]
of particular use --
17:13:28 [Curt]
... are account records part of the UofD?
17:13:42 [Curt]
kai: Is there a problem if they are not in the UofD?
17:14:24 [Curt]
luc: we are breaking early design decisions. saying they are part of UofD, we say that all records have to have identifiers
17:14:43 [Curt]
... the implication is that every prov. record would have to have a named graph to give the set an identifier
17:15:02 [Curt]
... this is a radical departure from current work
17:15:08 [Curt]
17:15:25 [Curt]
luc: we need guidance on this
17:15:37 [Curt]
kai: we can discuss at F2F
17:15:50 [Curt]
... we don't want to destroy current work
17:16:04 [Curt]
... we should be able to figure out something that works next week
17:16:50 [Curt]
pgroth: kai isn't saying we have to have identifiers for everything, we don't have to mint identifiers for every prov. record
17:17:03 [Curt]
... we can use that as preliminary guidance
17:17:33 [Curt]
kai: yes, that is what I think, they CAN have an identifier, with that you can describe the records' provenance
17:17:43 [jcheney]
That sounds like what I was trying to say.
17:17:47 [Curt]
... we should indicate that it is possible to describe prov-of-prov
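Editorial sketch of the preliminary guidance above, assuming a record is in the universe of discourse only when an identifier has been given to it. The Record fields, the helper names and the ex:r2 identifier are hypothetical illustrations, not PROV-DM constructs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    statement: str
    identifier: Optional[str] = None  # hypothetical: None means no identifier was minted
    about: Optional[str] = None       # identifier of the record this one describes, if any


def in_universe_of_discourse(record: Record) -> bool:
    """A record MAY be in the UofD; here that holds only when it is identified."""
    return record.identifier is not None


def describe(record: Record, statement: str) -> Record:
    """Return a new record about an identified record (prov-of-prov)."""
    if not in_universe_of_discourse(record):
        raise ValueError("record has no identifier; mint one before describing it")
    return Record(statement=statement, about=record.identifier)


records = [
    Record("table built in 1727"),                           # not identified
    Record("table sold on 1728-01-19", identifier="ex:r2"),  # identified
]
describable = [r for r in records if in_universe_of_discourse(r)]
meta = describe(records[1], "recorded by a hypothetical curator")
```

Under this reading nobody is forced to mint identifiers for every record, but any record that does get one can have its own provenance described.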
17:18:02 [jcheney]
Might be good to give a small meta-prov example like MacTed's in PROV-DM?
17:18:12 [Curt]
kai: we are mostly in agreement -- just need to detail
17:18:13 [pgroth]
pgroth has joined #prov
17:19:01 [pgroth]
17:19:04 [pgroth]
I'll take care of it
17:19:06 [Curt]
17:19:07 [Curt]
17:19:35 [pgroth]
rrsagent, set log public
17:19:41 [pgroth]
rrsagent, draft minutes
