HCLSIG/SWANSIOC/Nanopublications-Subtask/110216-Skype

From W3C Wiki

Chat History: Nanopublications

Created on 2011-02-16 17:05:11.

David R Newman
14:40:27
Hi Sean.  I assume this nanopublication/swan/research objects chat is at 16:00 GMT?
Sean Bechhofer
15:05:56
yes -- I'll call you.
David R Newman
15:06:29
Ok, just so I can be somewhere quiet for 4pm
Sean Bechhofer
15:54:43
Will call you all in a couple of minutes.
Paolo Ciccarese
15:59:01
ok
Sean Bechhofer
16:03:02
are you there david?
Paolo Ciccarese
16:09:57
i lost you
Sean Bechhofer
16:10:07
restarting
Sean Bechhofer
16:13:40
What are the success criteria?
Sean Bechhofer
16:14:15
Paul: Should be able to take two nano-publications that talk about the same thing and remove redundancy (but maintain provenance).
Sean Bechhofer
16:14:46
Paul: If it's the same triple but with different provenance.
Sean Bechhofer
16:15:23
Paul: Once we remove redundancy can say there are 10,000 instances of this nano-pub
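
[Editor's sketch] One way to read "remove redundancy but maintain provenance" is to collapse identical assertions into one entry that keeps every distinct source plus an instance count. This is a minimal Python sketch under that assumption, not something agreed in the call; the `merge_nanopubs` helper, the tuple layout, and the `paper:`/`db:` identifiers are hypothetical, and the two statements are the examples pasted later in this chat.

```python
from collections import defaultdict

# Hypothetical nanopublications: one (subject, predicate, object) assertion
# paired with one provenance source. Source identifiers are invented.
nanopubs = [
    (("tumor protein p53", "has phenotype", "Colon_cancer"), "paper:A"),
    (("tumor protein p53", "has phenotype", "Colon_cancer"), "paper:B"),
    (("tumor protein p53", "has phenotype", "Colon_cancer"), "paper:A"),
    (("tumor protein p53", "has bio marker", "UniSTS:158212"), "db:X"),
]


def merge_nanopubs(pubs):
    """Collapse identical assertions, keeping each distinct source and a count."""
    merged = defaultdict(lambda: {"sources": set(), "instances": 0})
    for triple, source in pubs:
        merged[triple]["sources"].add(source)   # provenance is never discarded
        merged[triple]["instances"] += 1        # how often the assertion was seen
    return dict(merged)


merged = merge_nanopubs(nanopubs)
for triple, info in merged.items():
    print(triple, sorted(info["sources"]), info["instances"])
```

Equality of triples here is plain string equality, which sidesteps the harder question raised below of when two differently phrased claims count as the same.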
Sean Bechhofer
16:15:35
Paolo: Lots of inconsistency in the database.
Sean Bechhofer
16:16:01
Paolo: Identifying conflicting statements
Sean Bechhofer
16:16:27
Sean: What's conflict?
David R Newman
16:16:42
Hi Sean
David R Newman
16:16:48
ready now
Sean Bechhofer
16:17:53
Sean: How to identify conflicting triples
Sean Bechhofer
16:18:09
Paul: Avoid context, but encode statements with provenance.
Sean Bechhofer
16:18:57
Paul: Can then detect statements with more, less or joint provenance. Don't model conflict or context. Expose things with lots of provenance.
Sean Bechhofer
16:19:24
Sean: Don't try and answer the question. Punt on it. "Here's a bunch of stuff, do what you want with it".
Sean Bechhofer
16:20:01
Paolo: Facts taken out of context don't mean anything.
Sean Bechhofer
16:20:21
Paul: Where do we embed the context? It's given by its provenance.
Sean Bechhofer
16:20:56
Paul: Provenance includes a source document.
Sean Bechhofer
16:21:00
Paul: Or a source db
Sean Bechhofer
16:21:22
Paolo: Don't encode the context, point at it and allow the consumer to interpret it.
Sean Bechhofer
16:22:00
Paul: Would rather have someone else worry about context. Anita de Waard tries to do that. With nanopubs we avoid that.
Sean Bechhofer
16:22:31
Paolo: In Alzheimer's you never have a statement that's a triple.
Sean Bechhofer
16:22:51
Paolo: Things like "a large amount of <some stuff> is toxic for the brain".
Sean Bechhofer
16:22:57
So it's not at a triple level.
Sean Bechhofer
16:23:08
Paul: Can encode that as a mini-graph.
Sean Bechhofer
16:23:23
That's reasonable. Want to avoid mini-graphs that become large.
Sean Bechhofer
16:24:05
Sean: When is too big too big?
Sean Bechhofer
16:24:54
Paolo: Rule of thumb. Curators create statements. As simple as they can. But more than triples.
Sean Bechhofer
16:25:11
Paul: Can encode more into the URLs.
Sean Bechhofer
16:26:44
Sean: What do you need for claims other than an identifier?
Sean Bechhofer
16:31:51
Sean: So claims need to have 1/ a means of identification and determining equivalence; and 2/ some means of search/discovery that may require looking inside the claim.
Sean Bechhofer
16:32:26
and 3/ not too big/complex
Sean Bechhofer
16:35:07
Paul: Can't you split the claim into a triple?
Sean Bechhofer
16:35:18
Paolo: Need to make a load of stuff up.
Sean Bechhofer
16:35:39
Paolo: Can agree on provenance, but not on content.
Sean Bechhofer
16:36:02
Paolo: Could ask Nigam Shah to encode this using existing ontologies.
Sean Bechhofer
16:37:30
Sean: Worried about forcing claims into triples.
Sean Bechhofer
16:37:45
Paolo: Hard to define a self-contained claim and then hard to formalise.
Sean Bechhofer
16:38:30
David: What's important is finding concepts so that things can be reused. Also when a new one is created, need to provide sufficient description so that someone else can find it.
Sean Bechhofer
16:38:46
David: End up with multiple creation of similar things.
Sean Bechhofer
16:39:21
Sean: Don't you end up with lots of stuff not interlinked?
Sean Bechhofer
16:39:40
Paul: People making claims, and claims from data that's there.
Sean Bechhofer
16:39:56
Paul: Would be interesting to look at nano-pubs from data sets. Would be cleaner.
Sean Bechhofer
16:40:08
Paolo: Not so sure. Quantitative.
Sean Bechhofer
16:40:18
Paul: Can use RDF for that but it's a pain.
Sean Bechhofer
16:41:02
Paul: Will try and find a nano-pub based on data. Use the claim in Paolo's example.
Sean Bechhofer
16:41:23
Sean: Doesn't that involve interpretation?
Sean Bechhofer
16:42:11
Paulo: Let me try and select something simpler that can more easily be found in data sets.
Paul Groth
16:43:44
tumor protein p53 has phenotype Colon_cancer
Paul Groth
16:44:12
tumor protein p53 has bio marker UniSTS:158212
Paul Groth
16:44:28
http://linkedlifedata.com/resource/entrezgene/id/7157
Sean Bechhofer
16:45:29
Paul: statements made here don't have provenance.
Sean Bechhofer
16:46:03
Paolo: Notion of canonical claim. Lots of claims that were slightly similar but phrased differently. Biologists would say "they're the same thing".
Sean Bechhofer
16:46:13
Paolo: Idea of canonical statement.
Sean Bechhofer
16:46:50
Paolo: Statement tumor protein.... is the tip of the iceberg.
Sean Bechhofer
16:47:34
Paul: Like idea of the canonical statement.
Sean Bechhofer
16:48:12
Paul: Just saying things without provenance is less interesting.
Sean Bechhofer
16:48:49
ACTION: Paolo to talk to Nigem (sp?)
Sean Bechhofer
16:48:58
ACTION: Paul to look at data nano-pubs
Sean Bechhofer
16:49:12
ACTION: Get this coded up in a triplestore. [Paul?]
Paolo Ciccarese
16:49:28
Nigam H. Shah
Sean Bechhofer
16:50:33
Paolo: Can produce RDF triples but not named graphs.
Paul Groth
16:50:58
http://www4.wiwiss.fu-berlin.de/bizer/TriG/
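
[Editor's sketch] TriG, linked above, writes named graphs as `graph-name { triples }`. As a rough illustration of how triples that Paolo can already produce might be wrapped so that an assertion and its provenance sit in separate named graphs, here is a small Python sketch that only builds the text. The `to_trig` helper, the `ex:` prefix, `ex:derivedFrom`, and all local names are assumptions made for illustration, not an agreed nanopublication vocabulary.

```python
def to_trig(name, triple, source, prefix="http://example.org/"):
    """Render one assertion graph and one provenance graph as TriG text.

    All names are placed under a single hypothetical `ex:` prefix and must
    already be valid prefixed local names (no spaces).
    """
    s, p, o = triple
    return (
        f"@prefix ex: <{prefix}> .\n\n"
        # Named graph holding the claim itself.
        f"ex:{name}_assertion {{\n"
        f"    ex:{s} ex:{p} ex:{o} .\n"
        f"}}\n\n"
        # Named graph stating where the assertion graph came from.
        f"ex:{name}_provenance {{\n"
        f"    ex:{name}_assertion ex:derivedFrom ex:{source} .\n"
        f"}}\n"
    )


doc = to_trig("np1", ("TP53", "hasPhenotype", "Colon_cancer"), "paperA")
print(doc)
```

Because the provenance graph talks about the assertion graph by name, the same assertion text can be paired with several provenance graphs without repeating or altering the claim.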
Sean Bechhofer
16:52:24
Sean: What do we ask once we've got it in a t-store?
Sean Bechhofer
16:52:38
Paul: Find me all the information about myelination (and provenance)
Paolo Ciccarese
16:52:56
HyBrow: a prototype system for computer-aided hypothesis evaluation - http://www.hybrow.org/
Sean Bechhofer
16:53:00
Paul: Removing redundancy, but none in there right now. Could fake some.
Sean Bechhofer
16:53:40
Paolo: Could ask someone from Alzheimers Knowledge Management.
Sean Bechhofer
16:53:51
Paolo: Would give deeper queries.
Sean Bechhofer
16:54:26
Paul: Find me all things confirmed in two different publications. Query about facts related to their provenance.
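
[Editor's sketch] The "confirmed in two different publications" query can be illustrated without a triplestore by grouping statements by their distinct sources and filtering on the count. This is a minimal Python sketch under that reading; the `confirmed_by` helper, the `minimum` parameter, and the `pub:` identifiers are hypothetical, and the statements reuse the examples pasted earlier in the chat.

```python
from collections import defaultdict

# Hypothetical (statement, publication) pairs; publication ids are invented.
assertions = [
    ("tumor protein p53 has phenotype Colon_cancer", "pub:1"),
    ("tumor protein p53 has phenotype Colon_cancer", "pub:2"),
    ("tumor protein p53 has bio marker UniSTS:158212", "pub:1"),
    ("tumor protein p53 has bio marker UniSTS:158212", "pub:1"),  # same source twice
]


def confirmed_by(pairs, minimum=2):
    """Return statements backed by at least `minimum` distinct publications."""
    sources = defaultdict(set)
    for statement, publication in pairs:
        sources[statement].add(publication)  # a set ignores repeated sources
    return {s: pubs for s, pubs in sources.items() if len(pubs) >= minimum}


confirmed = confirmed_by(assertions)
for statement, pubs in confirmed.items():
    print(statement, sorted(pubs))
```

Counting distinct sources rather than raw occurrences matters here: a statement repeated within one publication is not treated as independently confirmed.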
Paolo Ciccarese
17:01:18
http://www.w3.org/wiki/HCLSIG/SWANSIOC
Sean Bechhofer
17:02:03
ACTION: Paolo to create a page on the HCLS wiki.
Sean Bechhofer
17:03:33
Paul: Can these be used as stand-off annotations for ROs to improve search?
Paolo Ciccarese
17:03:53
I am out again :/
Paul Groth
17:04:03
we're just wrapping up
Paolo Ciccarese
17:04:10
what is RO?
Sean Bechhofer
17:04:21
Research Object
Paolo Ciccarese
17:04:25
oh ok
Sean Bechhofer
17:04:33
Bye all!
Paolo Ciccarese
17:04:36
bye :)
Paolo Ciccarese
17:04:51
Sean, are you going to send all of us the notes?