See also: IRC log
See also the slides of the report of the breakout group, as given by Ivan
topic 1. vocab hosting and persistence
re hosting there is the vocab hosting best practices document from swdbp
re vocabulary alignments there are various developments...
sorry, missed the details of the mentioned alignment work
OMV: ontology metadata vocabulary
re URI schemes...
<danbri> danbri: I have a specific proposal on federating independent namespaces, DNS risk reduction & long term persistence
sharedname.org
purl.org
(and associated purl software)
danbri++ re dns risk reduction
also the google vocab namespace
let's not enumerate all of these right now
question: is physical hosting an issue?
consensus: generally yes
guus: are we just going to provide a list of existing services or develop some guidelines?
now covering the state of the art wrt directory services for vocabs
examples, swoogle, watson, falcons, talis schemacache
guus: distinction: concept search (e.g watson/swoogle) vs schema search (schemacache)
schemaweb.info is outdated and not maintained. schemacache needs to be pushed live
schemacache current uri: http://schemacache.test.talis.com/
crawling vs curation
ivan points out there are some well used offerings in specific domains
distinction: general offering versus domain dependent
topic 3. versioning
publishing versioning histories for vocabs
(says danbri)
danbri mentions tom baker from DC
"when did dc:audience appear?" all these aspects are recorded
also cf the w3c document versioning scheme
danbri would like a mechanism for people to declare downstream dependencies on vocabularies
ivan: its a hairy ie research problem, technical but also sociological
<ivan> Open provenance model
also a need to revisit some of the named graphs work and provide some guidance
scribe: to the community at large on usage of NG
this was topic 4 (provenance) btw
preventing man in the middle attacks is a good use case for signing vocabs with pgp
research issue: robustness of vocabs and class hierarchies
<ivan> chris: falcon and others already have heuristics built in, ie, they reject statements that would define a superclass of foaf:Person
and trust mechanisms to address this
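The kind of heuristic chris describes might look roughly like the sketch below. This is not how Falcons or any other crawler actually implements it; it is an assumed, simplified rule: a statement giving a guarded term a superclass is dropped unless it comes from a source under that vocabulary's own home. The `AUTHORITATIVE` table, the function names and the example.org source are all hypothetical.

```python
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
FOAF = "http://xmlns.com/foaf/0.1/"

# Assumption: each guarded namespace maps to the URL prefix allowed to define it.
AUTHORITATIVE = {FOAF: "http://xmlns.com/foaf/"}

def is_hijack(triple, source_url, authoritative=AUTHORITATIVE):
    """True if a non-authoritative source asserts a superclass for a guarded term."""
    subject, predicate, _obj = triple
    if predicate != RDFS_SUBCLASS_OF:
        return False
    for namespace, home in authoritative.items():
        if subject.startswith(namespace):
            return not source_url.startswith(home)
    return False

def filter_statements(triples, source_url, authoritative=AUTHORITATIVE):
    """Keep only the statements that pass the hijack check."""
    return [t for t in triples if not is_hijack(t, source_url, authoritative)]

crawled = [
    # A third party trying to put a superclass above foaf:Person: rejected.
    (FOAF + "Person", RDFS_SUBCLASS_OF, "http://example.org/shop#Customer"),
    # A third party specialising foaf:Person: kept.
    ("http://example.org/shop#Customer", RDFS_SUBCLASS_OF, FOAF + "Person"),
]

kept = filter_statements(crawled, "http://example.org/shop.rdf")
print(kept)
```

Matching on the source URL is the weakest possible trust mechanism; signing vocabularies, as mentioned above, would be one way to strengthen it.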
<ivan> cerealtom: for rights, the legal work has been done at Open Data Commons and, for CC0, at Science Commons
<ivan> ... active usage of this should be pushed
<ivan> danbri: those are mostly data issues
<danbri> tom: many diff jurisdictions have diff approaches
<danbri> ...whether there's some notion of a db right
<danbri> ...diff legal frameworks
<danbri> ...for making legal claims, of ownerships over sets of data, collections etc
<danbri> ... see work from Jordan Hatcher
<danbri> ... one strategy: put all in public domain
<danbri> ... layer on that social norms
<danbri> ... statements and requests for how it is used
<danbri> ... most ppl want attribution for what they do
<danbri> ... typically don't want to chase that through the law courts, but want some ack
<danbri> ... cc lets you waive formal rights but express social norms
<danbri> ... open data commons
<danbri> stefan: eg i'd like to make my data public, but not have it aggregated
<ivan> -> www.opendatacommons.org Open data commons' site
<danbri> danbri: see TAMI and respect my privacy work from MIT/DIG (note : Oshani is in SocialWeb XG ...)
<danbri> see http://dig.csail.mit.edu/2009/SocialWebPrivacy/
<danbri> tom: talis community license > odc
<danbri> http://www.w3.org/2009/07/01-swdag2009-irc
Federation of Independent Namespaces
mutual back-watching for namespace hosts
proposal from danbri and tom baker
<hhalpin> http://dublincore.org/documents/singapore-framework/
<hhalpin> Application profiles
<aldogan> whole range of practices from bare reuse of ontology entities from other namespaces, to owl:import
<aldogan> one intermediate practice is declaring a module of an ontology to be imported (this works for reasoning purposes)
<hhalpin> so you can think about this in two ways
<hhalpin> abstract space of possible combinations
<hhalpin> which subset of that on some level that can be characterized as graph patterns
<hhalpin> are good practices
<hhalpin> on a practical level:
<hhalpin> let's think about practical examples
<hhalpin> on an instance level
<hhalpin> that can be put in tutorials
<hhalpin> guus: alignments are just vocabularies that link.
<hhalpin> bizer: but you can have transformation rules rather than alignments via a common concept, a la SKOS
<hhalpin> transformation rules = data fusion
<ivan> ivan: there is also the issue of the granularity of terms within vocabularies, if I refer to a specific term, what else do I pull in
<hhalpin> how can alignments that are manually made
<hhalpin> rather than automatically made
<aldogan> is intensional vs. extensional alignment an issue here? or are we just discussing publishing alignments?
<hhalpin> be found and used, and validated.
<hhalpin> I think intensional vs. extensional is vocabulary-dependent.
<hhalpin> research challenge: lack of semantics and semantics of alignment
<hhalpin> so you can figure out how to reason properly over aligned vocabularies (again, how much of another vocabulary do you pull in?)
<aldogan> owl:sameAs often used to map any kind of similarity
<aldogan> but this is on the social side, since owl:sameAs has a proper (extensional) semantics
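To make the point about owl:sameAs concrete: because its semantics is identity, it is symmetric and transitive, so a few loose "similar enough" links collapse distinct resources into one individual. The sketch below computes the resulting equivalence classes with a small union-find; the identifiers are hypothetical and the function name `same_as_classes` is invented for this example.

```python
def same_as_classes(pairs):
    """Group identifiers into equivalence classes implied by owl:sameAs pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())

# Hypothetical links, each asserted as owl:sameAs although only "similar".
links = [
    ("ex1:Paris_city", "ex2:Paris_region"),
    ("ex2:Paris_region", "ex3:Paris_metro_area"),
]

classes = same_as_classes(links)
print(classes)
```

A reasoner following the formal semantics would treat all three identifiers as the same individual, which is rarely what the publishers of the individual links intended.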
<hhalpin> directory services make it hard to find alignments
<hhalpin> we need directory services for alignments and make it easier to find them, standard way to host, as well as who said what alignment and provenance of alignment.
<hhalpin> guus: named graphs may potentially be a way to get alignments off the ground
<hhalpin> guus: is this a w3c challenge for named graphs?
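One way to picture the named-graph idea: each set of alignment statements lives in its own graph, and provenance ("who said what", manual or automatic) is attached to the graph rather than to individual statements. The sketch below models this with plain tuples and a dictionary. It is an assumption-laden illustration only: the graph names, creators, the `method` field and the `alignments_by` helper are hypothetical, and only the two SKOS mapping property URIs are real.

```python
SKOS_EXACT = "http://www.w3.org/2004/02/skos/core#exactMatch"
SKOS_CLOSE = "http://www.w3.org/2004/02/skos/core#closeMatch"

# Quads: (subject, predicate, object, graph name). All example.org URIs are made up.
quads = [
    ("http://example.org/a#Fish", SKOS_EXACT, "http://example.org/b#Fish",
     "http://example.org/graphs/curated"),
    ("http://example.org/a#Boat", SKOS_CLOSE, "http://example.org/b#Ship",
     "http://example.org/graphs/matcher-run-1"),
]

# Provenance is recorded once per graph.
graph_info = {
    "http://example.org/graphs/curated": {"creator": "alice", "method": "manual"},
    "http://example.org/graphs/matcher-run-1": {"creator": "tool-x", "method": "automatic"},
}

def alignments_by(quads, graph_info, **criteria):
    """Return the triples whose graph's provenance matches every given criterion."""
    return [
        (s, p, o)
        for s, p, o, g in quads
        if all(graph_info.get(g, {}).get(k) == v for k, v in criteria.items())
    ]

manual = alignments_by(quads, graph_info, method="manual")
print(manual)
```

With this shape, a directory service for alignments would index graphs and their provenance, and a consumer could choose to trust only manually made alignments or only those from a given creator.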
<hhalpin> ivan: how can I make ANY inferences in a multi-namespace document?
<hhalpin> ivan: does current RDF-based semantics work in multi-namespace documents? Does DL?
<hhalpin> ivan: Probably for RDF(S), not for DL.
<ivan> -------
<ivan> Start of session again after lunch
<ivan> guus Trying to summarize the research challenges and relations to other groups
<ivan> * Semantics of alignment
<ivan> * language for publishing alignments
<ivan> * characterizing graph patterns
<ivan> * internal and external dependencies across vocabularies
<ivan> * ranking and characterizing directory services
<ivan> * inference consequences of cherry-picking
<aldogan> isn't the last a special case of the previous one?
<ivan> * quality criteria for ranking deployed vocabularies and alignments
<ivan> (forget the last one, rephrase:)
<ivan> * metrics for selecting deployed vocabularies and vocabulary elements
<hhalpin> asun: not every language expresses the same concepts
<ivan> * problems of multilingual vocabulary alignments
<hhalpin> asun: like different kinds of fishes exist in say Greek, not English.
<hhalpin> harry: but we need to encourage schemas to have multiple languages.
<hhalpin> TAG's latest versioning:
<hhalpin> http://www.w3.org/2001/tag/doc/versioning-html/
<ivan> * vocabulary versioning in distributed environments
<ivan> Slogan: do not delay the future
<ivan> * SW specific provenance models
<danbri> 3 or 4 strands that must be brought together: hacker/internet/crypto provenance, database provenance, logical formalism / kr scene (OWL, RIF, ...) .... and then the RDF/dataweb/web2 scene where the best we have is SPARQL graphs
<danbri> my personal ranking of topics:
<danbri> (this isn't private voting is it?)
<danbri> n2. graph patterns.
<danbri> n4. provenance models.
<danbri> n5. quality metrics for selecting vocabs.
<danbri> n3. vocab dependencies and cherry picking.
<danbri> n1. semantics of alignments / alignment language.
<danbri> n6. multilingual vocabs.
<ivan> After due deliberations and semi-secret voting the result is as follows:
<natasha> Final list of research challenges
<natasha> 1. Quality metrics for selecting vocabularies and vocabulary elements
<natasha> 2. Logical consequences from cherry picking vocabulary elements from distributed vocabularies
<natasha> 3. SW-specific provenance models
<natasha> 4. Semantics of alignments, including languages and models for publishing alignments
<natasha> 5. Characterizing graph patterns in published web data
<aldogan> 1. Semantics of alignments, including languages and models for publishing alignments
<aldogan> 2. vocab dependencies and cherry picking
<aldogan> 3. Quality metrics for selecting vocabularies and vocabulary elements
<aldogan> 4. provenance models
<aldogan> multilinguality and graph patterns are very important as well, uhm
<aldogan> unsure how much overlap exists among the topics that emerged, however
<ivan> Subtopic: White paper
<ivan> guus outlining a structure
<ivan> - URI Scheme (PURL, etc)
<ivan> - Physical hosting, backup policies
<ivan> - Versioning strategy (current practice like W3C practice, DC practice)
<ivan> - DNS redirect advice, examples, templates
<ivan> - minimal attribution metadata
<ivan> - link & advice with respect to Open Data Commons
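For the URI scheme and redirect items in the outline above, the core of a PURL-style service can be sketched as a table from persistent URI prefixes to wherever the vocabulary is currently hosted. This is only a minimal sketch under assumptions: the domains are hypothetical, the choice of a 302 response is an assumption rather than a recommendation from the session, and `resolve` is an invented helper, not part of the PURL software.

```python
def resolve(uri, table):
    """Return (status, location) using the longest matching persistent prefix."""
    for prefix in sorted(table, key=len, reverse=True):
        if uri.startswith(prefix):
            return 302, table[prefix] + uri[len(prefix):]
    return 404, None

# Hypothetical mapping: persistent prefix -> current physical host.
redirects = {
    "http://purl.example.org/vocab/": "http://hosting.example.net/vocab/2009/",
    "http://purl.example.org/vocab/archive/": "http://backup.example.net/old-vocab/",
}

print(resolve("http://purl.example.org/vocab/Person", redirects))
print(resolve("http://purl.example.org/vocab/archive/Person", redirects))
print(resolve("http://purl.example.org/other/Thing", redirects))
```

If the physical host changes, only the table entry changes; the URIs that downstream data refers to stay the same, which is the persistence property the white paper outline is after.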
<hhalpin> i am listening in IRC
<hhalpin> "listening" via text