Government Linked Data Working Group Teleconference

25 Jan 2012


See also: IRC log


Hadley, Beeman
BenediktKaempgen, PhilA


<mhausenblas> ok, I guess we're set and ready!

<cygri> member:zakim, who is on the phone?

<cygri> member:zakim who is on the phone?

<cygri> zakim BartvanLeeuwen is with Galway

<sandro> I'm thinking skype is the best bet.

<sandro> (with sound turned off)

<mhausenblas> the only thing we now need (for both Zakim and skype) is ....

<mhausenblas> Washington! :)

<sandro> ( who is dvilasuero ? )

<cygri> sandro, dvilasuero is daniel vila from madrid

<cygri> it's richard.cyganiak

<mhausenblas> trackbot, start telecon

<trackbot> Date: 25 January 2012

<mhausenblas> scribenick: mhausenblas

<bhyland> ping

<sandro> pong bhyland

<sandro> GUEST: Gofran Shakair

<sandro> GUEST: Deirdre Lee

Introduction and welcome. Agenda review

PhilA: I'm in Galway, W3C staff
... I've been working on vocabularies, ADMS, DCAT, organisation ontology

dvilasuero: In Galway, Master student with Boris (UPM) from Spain

fadi: In Galway, finished my MSc on Publishing Linked Gov Data, Google Refine, DCAT

boris: In Galway, UPM, we do Linked Government Data in Spain, will help facilitate the vocab work

<PhilA> mhausenblas: I'm co-hosting here. Head Linked Data section here at DERI

cygri: In Galway, LiDRC at DERI as well, I'm focusing on vocabs (DCAT, DataCube) and also other WG (RDF, RDB2RDF)
... I'd like to learn about requirement for DataCube vocab, also DCAT

BartvanLeeuwen: In Galway - I'm a Semantic Fire Fighter from A'dam
... doing Linked Open Data, looking for advice on best practices and to share in return

<PhilA> Note to self, need to talk to fadi about Dan Smith's work on Refine Extensions http://wiki.linkedgov.org/index.php/Extension

csarven: In Galway, MSc in LiDRC, with Michael and Richard, working on data-gov.ie, tooling around this

GofranS: In Galway, MSc students in eGov unit at DERI, focusing on metadata i18y, ADMS

DeirdreLee: In Galway, heading the eGov unit in DERI, working with Vassilios of the EC
... we're doing Open Data, policies, etc.
... for example, DCAT is of interest

BenediktKaempgen: In Galway, from FZI in Karlsruhe, Germany
... into business intelligence, interested to provide feedback for DataCube and other related efforts
... such as SKOS extension for hierarchy
... as well as versioning input
... we've published Eurostat and XBRL data

Spyros: In Galway, IBM SCTC in Dublin, we are into Linked Open Data publishing (dublinked.ie)
... we do data management for Smart Cities using Linked Data

<sandro> first/last name spelling?

<sandro> I think I have everyone but Spyros on http://www.w3.org/2011/gld/meeting/2012-01-25

<sandro> got it.

<rreck> sheesh, it took 4 tries and then dropped

<cygri> cmusialek: Chris Musialek, working on data.gov

<George> cmusialek: GSA Data.gov lead, working on vocab.data.gov and other related GLD for Data.gov

<George> t_gheen: One World Law Library

<rreck> whew, i called in 9 times

<bhyland> ping


<George> t_gheen: Library of Congress

<bhyland> Introducing George Thomas from US Health & Human Services

<George> bhyland: 3RoundStones, GLD co-chair, US Gov LD initiatives (EPA), strong open source product orientation, more on Web Arch, Data Mgmt

<George> ... better tooling for Web2.0 app dev's for using RDF stack tech

<George> ...

<George> objectives for F2F2 - focus on enabling aspects for GLD publishers, how to roll out LD projects

<George> ... value add to augment tech chops with mgmt understanding

<cygri> me sandro, it's Gofran Shukair and Benedikt Kämpgen

<George> Yigal (not in IRC) - working on Gov Grant vocab -

<George> ... been working with Gov Data for a long time, worked with Dan Gillman (BLS)

<George> Mike Pendleton (not on IRC) - EPA - doing LD projects, new approaches to data warehousing and publishing using LD

<George> ... interest and contribution in Procurement

<George> Anne Washington (not on IRC) from George Mason University, Professor Public Policy, bkgrnd CS and IS

<George> ... interest and bckgrnd in Dig Archives, preservation incl metadata

<sandro> Okay, http://www.w3.org/2011/gld/meeting/2012-01-25 has everyone correctly listed (I think).

<George> ... need for external 'non-branded' info in determining scope and direction of GLD projects

<George> ... part of the W3C eGov IG

<sandro> http://www.w3.org/2011/gld/meeting/2012-01-25

<George> Dan Gillman (not yet on IRC)

<George> BLS, DC F2F2 host

<rreck> where is the video ?

<olyerickson> @bhyland I have been on in car, just arrived at TWCRPI

<George> ... involved with metadata standards and requirements for access to statistical data (for 'quite some time' :)

<George> ... got involved with GLD through chair role of Open Gov Vocab WG (part of Fed CIO Council Data Arch Subcmt)

<George> ... interest in synergy and application of W3C/GLD to BLS data

<bhyland> w?

<bhyland> q/

<George> olyerickson: Dir of Web Science Ops at TWC RPI

<George> ... project lead for logd.twc.rpi.edu - int gov cat search demo, govpedia.org project, others

<George> ... interest in firming up international BP guidance for GLD, co-leading URI construction session later today, also vocab rec's esp DCAT, (other good collab mojo)

<George> sandro: W3C primary staff contact with PhilA, key interest is making SemWeb work, GLD all ++, QB??, more :)

<George> GeraldSteeman: NASA S&T Info Prg Office, deliverable reviewer from lay-person persp, general interest in GLD

<George> ... bhyland adds contributions from Gerald incl outreach at high levels

<George> DaveReynolds: SW/LD long timer, CTO Epimorphics, UK Pub Sector - data.gov.uk (variety of offices/agencies), vocab work - Org Ont (UK Organogram with cygri and JT), QB, LDA co-developer (great stuff!), variety of edu/env publishing

scribe: Linked Data API see http://code.google.com/p/linked-data-api/

<George> ... interests - mostly vocab with cygri etal

<cygri> simonWall?

<PhilA> Picking up on DaveReynolds comments about the org ontology being used for organograms - here's an example http://data.gov.uk/organogram/department-for-business-innovation-and-skills

<George> simonWall: morning! Dir of Data Mgmt Australian Bu of Stats - working on standardizing statistical data/metadata, statistics/statistics/statistics

<sandro> simonWall: I lead the Data Management Section at the Australian Bureau of Statistics (http://abs.gov.au). (in Canberra)

<George> ... unlike (sandro ;), most interested in QB vocab, role as influencer of international stat community, interest in LD, and W3 membership

<George> ... is alive and well :)

<George> rreck: consultant in Wash DC, masters in comp linguistics, textual data & RDF thesis, published, working in law enforcement, working with vocabs, 3rd W3C (GRDDL, other?) group

<George> ... review props that influence stability of GLD, collab with AnneW

<cgueret_work> I'm here but you don't hear me

<cgueret_work> it's christophe gueret

<cygri> christophe gueret

<cgueret_work> from the VU

<cgueret_work> yep

<cgueret_work> should be

<cgueret_work> http://www.few.vu.nl/~cgueret

<cgueret_work> :/

<cgueret_work> that's me :)

<cgueret_work> thx :)

Agenda Review

<bhyland> Please look at http://www.w3.org/2011/gld/wiki/F2F2

DataCube vocab discussion update

<cygri> http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/cube.html

<sandro> cygri: we have a draft spec

<BenediktKaempgen> Wiki page: http://www.w3.org/2011/gld/wiki/Statistical_Cube_Data

<sandro> cygri: started in 2010, that's the current status : http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/cube.html

cygri: recently not that much activity re consumption
... working on a generic client for any kind of DataCube data
... we have quite some issues in the queue raised by people that have been using DataCube
... suggestions for improvements and extensions (incl. from BenediktKaempgen)
... next steps are
... transferring the issues to GLD tracker

<cygri> http://code.google.com/p/publishing-statistical-data/issues/list

cygri: as well as discuss extensions in GLD

<Zakim> mhausenblas, you wanted to update on Ghislain status

cygri: additionally we want to publish the current spec as a FPWD in the GLD
... I do have an action on it anyways
... in order to improve DataCube we should take into consideration all the valuable feedback
... need to find a balance between quickly getting out a FPWD on it vs. incorporating the feedback

<BenediktKaempgen> +q

<simonWall> +q

<Zakim> mhausenblas, you wanted to still update on G.'s status ;)

<Zakim> DaveReynolds, you wanted to agree with Richard :)

Michael: Seems Ghislain will join G'way later today

DaveReynolds: Agree with what cygri said
... need a canonical issues list and a FPWD of DataCube spec
... need to remove ambiguities in the spec
... folding in the experience from practice
... DataCube has been used by a number of groups now, already
... some co-ordination is needed
... esp. re aggregation there has been quite some development in the SDMX world

<cygri> DaveReynolds: coordination with standards from the Observations&Measurements area

bhyland: Are people happy enough to move to the W3C space?

cygri: I think GLD is the appropriate space for this, yes
... so far the work has happened in an informal space (cf. Google code repo)

<t_gheen> bhyland: what do we need to do? are the documents in good shape? How do we move forward and raise awareness?

PhilA: From a process point of view we need to create a product for DataCube in the GLD tracker
... and also for future products (DCAT, etc.) as currently there is only one product

cygri: Positive

<scribe> ACTION: PhilA to add products on issue tracker [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action01]

<trackbot> Created ACTION-29 - Add products on issue tracker [on Phil Archer - due 2012-02-01].

<cygri> ACTION: cygri to produce editor's draft of Data Cube spec [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action02]

<trackbot> Created ACTION-30 - Produce editor's draft of Data Cube spec [on Richard Cyganiak - due 2012-02-01].

Michael: I assume FPWD of DataCube will be available together with the other FPWD on BP, etc.?
... which would mean: soon

<DaveReynolds> Agree with cygri that once we have a first working draft, that is a good time to seek feedback.

BenediktKaempgen: Question re issues
... how will they be grouped
... Also, questions regarding consumption side

<DaveReynolds> We do see groups consuming live Data Cube data, including iPhone apps. So there is some active use to learn from.

cygri: My take on this is that the scope is backed up by the charter
... so we need to be careful regarding how far we go, can't stray too far from this
... but it's important that we're compatible with other related works such as DDI
... agree with co-ordination with others, yes

<DaveReynolds> Agree with Richard, keep scope narrow as currently defined, but do a good job of co-ordination.

cygri: regarding consumption tools - we're producing a vocab, not a processor

<sandro> the charter says: "Statistical "Cube" Data. The group will produce a vocabulary, compatible with SDMX, for expressing some kinds of statistical data. This need not be as expressive as all of SDMX, but may provide a subset as in the RDF Data Cube vocabulary. It may also include ways to annotate data to indicate its assumptions and comparability."

<sandro> preceded by: "The group will also produce documentation, examples, and, optionally, test cases and OWL ontologies for these vocabularies."

cygri: though, feedback from DataCube consumers would be beneficial

<Zakim> cygri, you wanted to talk about use cases

<sandro> on http://www.w3.org/2011/gld/charter

<sandro> mhausenblas, it's optional.

<bhyland> Question: Is this correct wiki page for updating the WG on progress, http://www.w3.org/2011/gld/wiki/Statistical_Cube_Data

<DaveReynolds> mhausenblas: it currently is OWL (for values of "OWL" that are basically RDFS :))

simonWall: we're very active in the SDMX and DataCube space

cygri: Would like to raise one more issue - does it make sense to also document use cases?

Michael: +1 to use cases

<bhyland> cygri: Does it make sense to document use cases for vocabularies?

<PhilA> use cases are always good...

Michael: Yes, we're backed up by charter (cf. 'examples')

<bhyland> cygri: I think it is a good reality check to resolve design criteria issues. Helps with clarity

<PhilA> Although UCS are probably best recorded in a separate document

cygri: Matter of resources in the working group

<bhyland> cygri: Do we have the resources in the group to document use cases.

<BenediktKaempgen> +1 to use cases

<simonWall> +1 for use cases

<dvilasuero> +1 for use cases

<t_gheen> bhyland: we are writing docs for real working people, so mapping to real world is important

<cgueret_work> +1 to bhyland

<t_gheen> bhyland: where is the wiki page to update progress for this?

<BenediktKaempgen> eg., http://www.w3.org/2011/gld/wiki/Statistical_Cube_Data

<bhyland> Is this correct wiki page for updating the WG on progress, http://www.w3.org/2011/gld/wiki/Statistical_Cube_Data

cygri: AFAIK there is no single page that captures the current status
... not really updated, also

<BenediktKaempgen> I updated it recently a bit.

<t_gheen> bhyland: recommend creating a high level page to organize the information

<t_gheen> ... can people devote time to this effort over the next few months?

<Zakim> DaveReynolds, you wanted to talk about examples

DaveReynolds: Yes, I can commit some time in the next 5 months, more towards the end
... re UC, there are different needs
... real data samples

<George> +1 DaveReynolds

+1 as well

<simonWall> +1 too

<rreck> oh too bad we didnt do it in google+ so others could have joined

DaveReynolds: UC in the sense cygri was talking about vs. real world samples

<sandro> rreck, maybe during a break we can experiment with other vid tech.

Michael: both would be good!

<sandro> rreck, also, I gather this is a commercial skype account that can do multi-way.

<rreck> oh?

DaveReynolds: documenting the usage is also valuable for evangelising
... but not in the spec but a separate doc

<George> DaveReynolds: additional note on how case-study/examples realize use cases

Michael: Agree with DaveReynolds to have a separate non-REC-Track doc on UC


<PhilA> Sounds to me as if bhyland is talking about usage guidelines?

<sandro> or tutorials? not sure.

<t_gheen> bhyland: who can work on this?

<bhyland> Commitment for DataCube/SDMX work offered by DaveReynolds, Richard, others?

<simonWall> +q

<bhyland> Add SimonWall to the list.

simonWall: count me in re UC

<PhilA> So the people working on the qb data are DaveReynolds, cygri, simonWall

<BenediktKaempgen> You can count me in also.

<t_gheen> bhyland: simonWall, DaveReynolds, cygri

Vocabulary Selection discussion

<olyerickson> is there a Skype call that one could be included in for video?

<BenediktKaempgen> PhilA, can you add me, too?

boris: Please look at the slides at http://www.w3.org/2011/gld/wiki/images/6/65/VocabularySelection.pdf

<olyerickson> ...or is it only DC/Galway>?

<PhilA> So the people working on the qb data are DaveReynolds, cygri, simonWall, mhausenblas, BenediktKaempgen

<boris> http://www.w3.org/2011/gld/wiki/images/6/65/VocabularySelection.pdf

<cygri> boris: starting presentation on vocabulary selection

<cygri> ... charter says we need to provide guidelines to governments

<cygri> ... RDF requires specific domain terms in order to describe a certain domain

<cygri> ... modelling is important phase in data lifecycle

<bhyland> ping


<cygri> ... (showing different data lifecycle models, they all have a modelling phase)

<cygri> ... big picture: 1. search for existing vocabularies in various search engines/repositories

<cygri> ... 2. if suitable is found, re-use it

<cygri> ... 3. otherwise, search for suitable thesauri etc

<cygri> ... 4. if those exist, build a vocabulary by transforming these resources into RDFS

<cygri> ... 5. otherwise, build from scratch. this happens if the domain is very new or complex. but doesn't happen so often
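A minimal sketch of the five steps boris lists above, in Python. The function name `choose_strategy` and its two boolean inputs are hypothetical, introduced here only for illustration; the presentation does not prescribe any implementation, and judging what counts as "suitable" is left to the caller.

```python
def choose_strategy(suitable_vocab_found: bool, thesaurus_found: bool) -> str:
    """Return the modelling strategy suggested by the five steps.

    Both arguments are the (hypothetical) outcome of searches that a
    person would carry out in vocabulary search engines/repositories.
    """
    # Steps 1-2: search for existing vocabularies; re-use one if suitable.
    if suitable_vocab_found:
        return "reuse"
    # Steps 3-4: otherwise look for thesauri or similar resources and
    # build a vocabulary by transforming them into RDFS.
    if thesaurus_found:
        return "transform"
    # Step 5: otherwise build from scratch (new or complex domains).
    return "build-from-scratch"


print(choose_strategy(suitable_vocab_found=False, thesaurus_found=True))
```

The order of the checks mirrors the order of the steps: building from scratch is the fallback only when neither search succeeds.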

<bhyland> Encourage interactivity IMO

<bhyland> lol, co-chairs differ ;-)

<George> :)

<cygri> boris: there are multiple repositories for searching vocabularies, but no one definitive

<bhyland> No one central place to find a vocab is a feature, not a bug :-)

<cygri> ... (summary table of available repositories)

Michael: ontologi.es is in fact Toby Inkster ;)

<cygri> boris: there are no guidelines to help developers to decide which engine/repo to use

<olyerickson> Note that DataFAQs <https://github.com/timrdf/DataFAQs/wiki> will soon provide a statistical vocabulary ranking service based on use in the LOD cloud. I've asked them for a statement as to how this will work. Note also that they will provide a community vocab ranking service soon as well

Michael: re relevant vocabs - ORG seems to be missing?

<cygri> boris: (summary list of gov-relevant vocabs)

<bhyland> @Michael, noted. I'll add.

<bhyland> This is a partial list

<cygri> ... probably need to include a few more

Michael: re vocabulary prefixes - my advice is simple - use prefix.cc

<cygri> ... there is a list of popular prefixes from the RDFa group

<DaveReynolds> Vocabulary list: would like to see org on there :)

<GofranShukair> also DOAP is missing

<DaveReynolds> DOAP is on the previous list

<GofranShukair> yeah sorry now i see it :(

<DaveReynolds> But wasn't on the prefix list I don't think.

<cygri> boris: (demo of LOV - http://labs.mondeca.com/dataset/lov/suggest/ )

<DaveReynolds> Not sure about BIBO as the one and only vocab in that area to single out, but not my field.

<cygri> boris: criteria for selecting a particular vocab/ontology

<dvilasuero> definitely, if we have BIBO there we should have some other important library vocabs

<George> LOV search - nice

<cygri> ... usage, maintenance, coverage, etc etc

<cygri> ... tools for building vocabularies: neologism, protege, ...

<cygri> (LOV people are: Bernard Vatant and Pierre-Yves Vandenbussche)

<BenediktKaempgen> +q

<Zakim> mhausenblas, you wanted to note re suitability

<cygri> mhausenblas: what is “suitable”? how do you define this?

<George> mhausenblas: what is 'suitable'?

<cygri> ... give concrete advice how to figure out which competing vocabulary to use

<cygri> ... and advice when it makes sense to build your own

<George> ime suitability is often a vocab combo - ie org + vcard

<cygri> ... also important for suitability: does my data sparql well if expressed in this vocab?

<cygri> ... existence of multiple repos/engines not a problem. they do different things

<cygri> ... some crawl, some are curated

<olyerickson> I've asked DataFAQs people to compare/contrast their vocab ranking capability with LOV vocab ranking.

<cygri> ... if we had the resources: meta search engine?

<George> +1 vocab metacrawler at w3

<cygri> ... would have value if run at W3C

<cygri> ... our best practice document will be frozen in time, so static lists will go out of date

<olyerickson> +1 to more than vocab search; need ranking "vocabRank" or "schemaRank"

<cygri> ... perishable info should maybe not go in there

<bhyland> The Best Practices Recommendation document will be almost "frozen" as of the publication date. The way we'll add flexibility to the Vocabs is through the community driven LOD Cookbook.

cygri: Agree with Michael

<bhyland> +1 to Michael

<olyerickson> GLD recommendation for "high quality" linked data is to use widely-used, relevant vocabularies *correctly*

cygri: we should avoid making concrete suggestions about which vocabs are suitable - it's arbitrary

<George> these are two diff gld deliverables tho - selection, and recommended

<olyerickson> The question is, how to find vocabs (a) in wide use (b) whether they are relevant

<DaveReynolds> +1 to Michael and Richard, W3C shouldn't be maintainer of such lists, especially not if that has implications on procurement

cygri: I'd like to see guidance on how to use the tools (check lists) to determine what is relevant, quality, etc,

<bhyland> Cygri: For this WG, suggest that we have a basic set of questions we ask the maintainer. We don't want to arbitrarily add vocabs.

<olyerickson> +1 to DaveReynolds ( by default ;) )

boris: I agree with what both Richard and Michael said

<olyerickson> I think surveys/questionnaires/etc don't scale

<George> recommendation for domain agnostic - cross cutting vocabs for all GLD publishers...

<cygri> bhyland: what do you mean by implications on procurement?

<George> snapshot problem regarding procurement and inclusion on some 'list' like a gld deliverable

<olyerickson> I think we should leverage the presence of vocabs "in the wild" (ie LOD Cloud) to assist selection

<Zakim> PhilA, you wanted to agree with mhausenblas and add a bit more

<cygri> mhausenblas: it's a competitive advantage if my vocab is w3c-listed and yours isn't. best practice document will be frozen in time

<cygri> PhilA: here are some criteria:

<cygri> ... 1. permanence of domain name

<cygri> ... for example, LOV service URL looks not permanent. that's bad.

<George> gov consortium mandates are nice ...

<rreck> I think we should call URLs URLs not URIs

<cygri> ... 2. change control. who's in charge of changing it?

<cygri> ... dublin core has a large committee in charge, so changing it is hard. that's good

<cygri> ... 3. is it actually used in the wild?

<cygri> ... we should point out these criteria even if it may be very hard to evaluate in practice
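PhilA's three criteria above could be recorded as a simple checklist. The sketch below is only an illustration in Python: the `VocabAssessment` structure, the `unmet_criteria` helper and the sample values are assumptions made here, not something the group defined, and the yes/no answers still have to come from a human evaluation (which, as noted, may be hard in practice).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VocabAssessment:
    """Answers to the three criteria for one vocabulary (hypothetical structure)."""
    name: str
    permanent_domain: bool   # 1. does the namespace domain look permanent?
    change_control: bool     # 2. is it clear who is in charge of changes?
    used_in_the_wild: bool   # 3. is it actually used in the wild?


def unmet_criteria(a: VocabAssessment) -> List[str]:
    """Return the names of the criteria that the assessment marks as not met."""
    checks = [
        ("permanence of domain name", a.permanent_domain),
        ("change control", a.change_control),
        ("used in the wild", a.used_in_the_wild),
    ]
    return [label for label, ok in checks if not ok]


# Made-up assessment, for illustration only.
example = VocabAssessment("example-vocab", permanent_domain=False,
                          change_control=True, used_in_the_wild=True)
print(unmet_criteria(example))
```

Listing the unmet criteria, rather than computing a score, keeps the output in line with the "criteria, not rankings" direction of the discussion.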

<cygri> +1 to all PhilA said

<cygri> BartvanLeeuwen: should also point out that local language documentation is important

<sandro> +1 BartvanLeeuwen -- another criterion is support for multiple languages, eg in the documentation for the vocabulary

<cgueret_work> +1 too

<GofranShukair> +1 too

<cygri> ... ideally, vocabularies should have documentation in multiple languages

<cgueret_work> vocabs should be properly described in several languages

Michael: We need to distinguish between vocab discovery and vocab creation guidelines, I believe

<cygri> boris: most vocabs are english but governments speak all sorts of languages. we have work in progress on how to express multilingual vocabs on the web of data

<George> GofranShukair: ADMS

<cygri> GofranShukair: ADMS describes semantic assets. that includes vocabularies

<GofranShukair> http://joinup.ec.europa.eu/asset/adms/home

<cygri> ... we describe metadata, incl language

<cygri> ... ready for review

<boris> I can take the action

<cgueret_work> will be pleased to contribute with French concerns

<scribe> ACTION: boris to create a Wiki page on multi-lingualism of vocabs [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action03]

<trackbot> Created ACTION-31 - Create a Wiki page on multi-lingualism of vocabs [on Boris Villazón-Terrazas - due 2012-02-01].

<cygri> bhyland: multilingual issues are important. awareness should be raised. please, write a blurb on this

<GofranShukair> http://joinup.ec.europa.eu/asset/adms/home

<olyerickson> In addition to the multilingual vocab issue, there is the multilingual instance data issue --- english predicates but literals in other languages.

<cygri> bhyland: having criteria for inclusion of vocabularies is important. let us draft a list of vocabularies.

<cygri> ... where is it hosted? university? production system? what's the institution's commitment to maintenance?

<cygri> ... we should work on such a checklist over the next two days

<sandro> +1000

<cgueret_work> +1

Michael: we should maybe also talk about vocab management (what is the process to add new terms? who owns the namespace? hit-by-truck scenario)

<Zakim> mhausenblas, you wanted to discuss change control and vocab ownership

<PhilA> +1 to bernadette's suggestions for capturing criteria for vocab selection

<cygri> mhausenblas: there can be issues around ownership of namespace, hit by bus risk etc

<rreck> +1 namespace ownership problems

<George> mhausenblas: namespace ownership, distinguish btw discovery, management, creation advice - more will discover than create -

<cygri> ... need to distinguish between vocabulary search and vocabulary creation. different issues

<sandro> +1 bhyland: during these two days let's start the checklist of things people need to look for in deciding whether a vocab is good enough, such as stability, domain name, point of contact, etc.

<rreck> I have had commercial clients unwilling to use existing namespaces because of copyright exposure

<cygri> ... experience shows that something can start informally and move to something more formal, e.g. story of VoID

<PhilA> PhilA: notes that danbri has solved the "what happens if I go under a bus" issue through an agreement with DCMI (so FOAF is as stable as DC)

<sandro> ( I don't think bhyland said we should produce a list of vocabs. )

<cygri> ... so we can say there's a process that can take you from informal work to something permanent and fit for purpose

<cygri> mhausenblas: i like checklists

<rreck> +1

+1 to sandro's ' fears/nightmare-scenarios'

<sandro> charter quote: "Vocabulary Selection. The group will provide advice on how governments should select RDF vocabulary terms (URIs), including advice as to when they should mint their own. This advice will take into account issues of stability, security, and long-term maintenance commitment, as well as other factors that may arise during the group's work."

<sandro> +1 cygri: don't list vocabs, just list how to evaluate vocabs

Michael: Does the WG interpret this in the sense of 'we provide checklist how to' or rather 'list concrete vocabs'?
... I'd very much prefer the former

<George> cygri: lists of recommended vocabs in bp vocab selection? instead, criteria list for selection - then there's std vocabs for cross cutting GLD publisher concerns - nice delineation


<simonWall> +1

<cygri> sandro: i agree. arbitrary lists would be a problem

<DaveReynolds> +1 to cygri, criteria not lists

<cygri> sandro: we might explain that criteria list in terms of "nightmare scenarios"

<George> sandro: how to write this 'checklist' - nice to explain in terms of issues/challenges (fears/nightmare-scenarios)

<George> +1 cygri

<cygri> ... "here are possible things that could go wrong. check how the vocabulary or its maintainers deal with that"

<cygri> ... this would bring it to life

<cygri> bhyland: i agree but can we put a positive spin on it?

<Zakim> sandro, you wanted to suggest the document include fears/nightmare-scenarios

<cygri> PhilA: for the record: it would be horrible if danbri was hit by a bus.

<Zakim> PhilA, you wanted to make 2 suggestions for vocab selection if you want me to, or I'll park it if time is short

<cygri> PhilA: national part of domains matter

<cygri> ... but you can use .us in .ie

<cgueret_work> +1 to PhilA

<cygri> ... multilingual: want to use dublin core in finnish? don't reinvent it. provide a translation with finnish labels

<BartvanLeeuwen> +1 to PhilA

<George> cygri: +1 provide labels for existing vocab/namespace

<rreck> we should mention Z39.19?

<George> ... common issue/problem/mistake

<simonWall> The Finnish National Library maintains the Finnish version of Dublin Core...

<rreck> skos

<George> mhausenblas: label as 'quality requirement'

<cygri> mhausenblas: some quality criteria can be expressed as sparql queries

<cygri> ... for example presence of labels
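As an illustration of the "presence of labels" check mhausenblas mentions, here is a minimal sketch. It uses plain Python over in-memory triples rather than an actual SPARQL engine, which is a substitution made here so the example runs without extra libraries; the `terms_missing_labels` helper and the `http://example.org/` terms are hypothetical. A roughly corresponding SPARQL 1.1 query is given in a comment.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Roughly the same check for classes, written as SPARQL 1.1:
#   SELECT ?term WHERE {
#     ?term a rdfs:Class .
#     FILTER NOT EXISTS { ?term rdfs:label ?label }
#   }


def terms_missing_labels(triples):
    """Return a sorted list of declared classes/properties without rdfs:label.

    `triples` is an iterable of (subject, predicate, object) string tuples.
    """
    triples = list(triples)
    declared = {s for s, p, o in triples
                if p == RDF_TYPE and o in (RDFS_CLASS, RDF_PROPERTY)}
    labelled = {s for s, p, o in triples if p == RDFS_LABEL}
    return sorted(declared - labelled)


# Hypothetical vocabulary: one labelled class, one unlabelled property.
example = [
    ("http://example.org/Agency", RDF_TYPE, RDFS_CLASS),
    ("http://example.org/Agency", RDFS_LABEL, "Agency"),
    ("http://example.org/budget", RDF_TYPE, RDF_PROPERTY),
]
print(terms_missing_labels(example))
```

The same pattern extends to other mechanically checkable criteria, such as requiring a label in a given language, though which criteria to include was left open in the discussion.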

<PhilA> PhilA: chose Finnish at random - but good to see that my entirely random choice is ahead of the game, simonWall

<cygri> ACTION: mhausenblas to compile first version of vocabulary selection quality checklist [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action04]

<trackbot> Created ACTION-32 - Compile first version of vocabulary selection quality checklist [on Michael Hausenblas - due 2012-02-01].

<DaveReynolds> Having the label in the URI for vocab terms is a multi-language issue for some folks. There is genuine argument on both sides whether opaque URIs + labels in all languages is better than having one preferred language reflected in the URIs.

Michael: Z39.19 sounds interesting indeed, thanks rreck!

<simonWall> Point taken (I googled that one; I do know that the New Zealand National Library maintains the Maori version of DC though.)

<cygri> ACTION-31?

<trackbot> ACTION-31 -- Boris Villazón-Terrazas to create a Wiki page on multi-lingualism of vocabs -- due 2012-02-01 -- OPEN

<trackbot> http://www.w3.org/2011/gld/track/actions/31

<dvilasuero> *mhausenblas: i could help with that action

<rreck> i have done a lot of work with z39.19 and multi-lingual representation

<cygri> DaveReynolds, do you have some pointers re multilingual URIs? would be good to include the debate in that wiki page

Legacy Data

<SpyrosKotoulas> http://www.w3.org/2011/gld/wiki/File:LegacyData.pdf

<cygri> scribenick: BenediktKaempgen

<DaveReynolds> cygri: would have to dig, the OBO world has best practice advice on using opaque URIs which might be relevant. Also I have anecdotal evidence, though I would need to be circumspect about how to phrase that in public :)

Spyros: On Dublin data rdfized to RDF
... what is legacy data? Is gov supposed to transform all data (e.g., pdfs, scans, xsl)?
... most data from relational db

<dvilasuero> cygri: we also have a paper for the last dc conf on multilingual URIs

<dvilasuero> where we review obo and others

Spyros: often also: geo data, temporal data (statistics), record oriented relational data (e.g., about citizens)

<PhilA> Spyros' slides are now linked from the agenda

<PhilA> scribe: BenediktKaempgen

Spyros: concerns: privacy issues (who can assess whether something is privacy sensitive?), how much to publish (efficiently, considering the costs), ...

<bhyland> ping

Spyros: considering risks with opening up data; how about institutions that are not quite government

<bhyland> Sorry if this is a repeat, per the charter on legacy data: "Legacy Data. The group will produce specific advice concerning how to expose legacy data, data which is being maintained in pre-existing (non-linked-data) systems."

Spyros: also technical issues: architecture, what visualizations (applications consuming data), how to facilitate use by non-experts
... how to automate such processes
... how to provide guidance/template/references/cookbook for processes
... transforming data into RDF often possible but might be awkward

<Zakim> mhausenblas, you wanted to comment on the term 'legacy data' and to discuss prioritisation of data sources - demand driven

<olyerickson> I think we should consider referring to "data life cycle" ala http://www.ddialliance.org/what (DDI Alliance)

mhausenblas: two reactions: term legacy data, maybe we should use a different term (e.g., raw data)

<davidwood_> bhyland, please check your phone for immediate text message requiring your action. Sorry to interrupt.

<olyerickson> I think the core question is, what best practices for data life cycle management should this group make that pertain to GLD?

mhausenblas: Secondly, question always: where to start publishing data? Uptake then further drives publishing process. User-pull rather than publisher-push.

<Zakim> DaveReynolds, you wanted to ask what makes this 'legacy'

<George> me thinks we're talking about exposing RDB's ergo R2RML

<mhausenblas> Michael: re multimedia interlinking see http://events.linkeddata.org/ldow2009/papers/ldow2009_paper17.pdf

DaveReynolds: RDF can walk along "legacy"/raw data
... Representing key parts in raw data/legacy is difficult.

<George> +1 - exposing these existing/emerging W3 works for this

Richard: We should list related work (R2RML, M, GRDDL, XSLT...)

<mhausenblas> cygri: There are a number of existing W3C standards that already address the transformation part (R2RML, GRDDL, etc.)

<olyerickson> I think the real issue is how to integrate LD best practices with your existing data life cycle management infrastructure

<mhausenblas> +1 to what olyerickson

<t_gheen> bhyland: Spyros points out how broad the description of legacy data is in the charter

<t_gheen> ... we should set some boundaries

<dvilasuero> +1 olyericksson

Bhyland: How to bound this topic?

<mhausenblas> s/what olyerickson/what olyerickson said

<cygri> +1 to byhland. bounding is important

<mhausenblas> 1+

<mhausenblas> Michael: Scope should be on W3C standards and then expand

<olyerickson> @bhyland please re-state what to take a stab at...

Legacy Data discussion

<bhyland> bhyland: the charter is very broad in the description of what is to be included in the "Legacy" section of the BP Recommendation.

mhausenblas: What resources are available?

<bhyland> We need to bound it. Suggest we put some lines in the sand as to what is "in" and we'll be able to reasonably do within the next 6 mos in this WG.

<George> mhausenblas: IBM Biplav/Spyros resource commitment to drive expected 'legacy' contribution

<bhyland> Spyros is here on behalf of IBM and is an invited guest of the F2F. Thus, he cannot make commitments for IBM to this WG.

<cygri> mhausenblas: if we go for a broad interpretation of this topic, then we need people and volunteers

cygri: Agree with boundaries. Good starting point would be W3C standards.

<rreck> better arbitrary than nothing?

cygri: E.g., it would be helpful to describe tools. Risk to be arbitrary with inclusion.

<George> cygri: standards, tools, approaches

cygri: Also useful to describe approaches, e.g., for modelling.

<olyerickson> Hmmm...this is the first time I realized we were talking about CONVERSION

cygri: There should be experiences in WG to give recommendations on such processes.

<mhausenblas> Michael: Against listing tools explicit, but rather provide examples of tool catalogs such as found http://www4.wiwiss.fu-berlin.de/latc/toollibrary/ and a http://www.planet-data.eu/results/datasets-and-tools

<Zakim> mhausenblas, you wanted to discuss tools

<bhyland_> We have some of the content Cygri is describing in the current LOD cookbook, especially as it relates to the auto conversion vs. human-involved modeling.

<rreck> +1 point at the wiki makes good sense

mhausenblas: Problem with tools is that they can get outdated.

<cygri> olyerickson: i converted a price from dollars to pounds recently

mhausenblas: Similar to Vocabulary case, have a checklist.

<bhyland_> MichaelH: His bias is on describing checklist approach rather than a specific list of tools which will become dated over time.

<olyerickson> @cygri that's the "right" direction, isn't it?

DeirdreLee: Agrees with not describing tools. But in case of vocabularies makes sense.
... Users demands would help with legacy issues.

cygri: Agrees with seeing the transformation of legacy data as a process that needs to be a compromise between effort and benefit. Start with metadata and concept schemes, and later go on to the actual raw data. Looking at users will be really useful.

<stasinos> cf. http://users.iit.demokritos.gr/~konstant/

<bhyland_> cygri: Handling legacy AKA "raw data" has some logical starting points and (could go on infinitely). Address misconceptions about converting to RDF as an "augmentation" to existing system. Others convert to RDF and that is it.

<olyerickson> +1 to cygri

cygri: Important w.r.t. legacy data: What does it actually mean? What does it imply? Depending on the situation, specific approaches may make more sense than others (e.g., transforming most data into RDF).

<mhausenblas> Michael: Suggest to think along TimBL's 5 star scheme http://5stardata.info/

olyerickson: Discussion about legacy is not useful unless seen from the perspective of a specific scenario. The best practice publishers need is how to continuously manage their data.

<George> Refine+DERI_extensions and R2RML covers 80% of the GLD publisher waterfront afaic - i'd love to see standards, tools, approaches covering spreadsheets and RDB's

<cygri> olyerickson, i didn't see the link you mentioned?

<rreck> +1 concrete examples are essential

<DaveReynolds> +1 to olyerikson - focus on Linked Data as an access approach and how it ties in to existing data management practice, avoid terms like "legacy"

olyerickson: Live examples of tools, showing how to get specific issues done, might be useful.

<olyerickson> link to DDI Alliance http://www.ddialliance.org/what

<bhyland_> Olyerickson: Feels we walk a line between describing checklists to evaluate vs. associating specific tools to "get the job done."

<olyerickson> Link to ANDS recommendations http://ands.org.au/guides/index.html

<cygri> olyerickson, are you aware of https://github.com/FranckCo/DDIOnto ?

<Zakim> mhausenblas, you wanted to ask if we can agree on a term now, please? should we use original data? source data?

<bhyland_> Time check: 3 minutes until tea break

<Zakim> cygri, you wanted to suggest talking about standards instead of tools when possible

DeirdreLee: Concrete example: EU INSPIRE data publishing is very cumbersome. Selling the approaches to government may be very difficult.

<DaveReynolds> Aside: INSPIRE can be met via linked data, e.g. UK has proposed URI guidelines for naming INSPIRE spatial objects.

<George> +1 source data (although I don't have any 'legacy' heartburn...)

<bhyland_> Agreed: Legacy data to be recast as "raw data"

<olyerickson> @cygri No I wasn't, thanks!

<DanG> Legacy? How about "metadata-challenged"

mhausenblas: Shall we use a different term than legacy? Suggestion: see it in terms of TimBL's 5 star scheme.

<DeirdreLee> legacy/raw data is 'existing' data. Linked Data is simply an extra way to represent 'existing' data

<George> +1 to hopping TBL's 'raw data' bandwagon, however 'raw data' is a misnomer in my gov experience

mhausenblas: rename legacy to raw data.

<DaveReynolds> -1 to "raw data" that caused problems when TBL used it

<mhausenblas> PROPOSAL: To use 'raw data' rather than 'legacy data' along TimBL's 5 star scheme

<cygri> kind of -1 to "raw data". statisticians hate that

<PhilA> Proposal: To use the term 'Raw Data' to refer to existing data

<olyerickson> +1 to "source data" over "raw data"...

<cygri> "non-RDF data"?

<dvilasuero> +1 to source data

<PhilA> Proposal: Not carried

<cygri> "spreadsheets"

<mhausenblas> PROPOSAL: To use 'source data' rather than 'legacy data' along TimBL's 5 star scheme

???: Mainly about spreadsheets and relational data.

<cygri> +2 to sandro

<stasinos> "pre-formal"?

<olyerickson> +1 to exposing...what? ;)

<bhyland_> OK, chairs have conferred and we agree ... "Source Data"

<olyerickson> mhausenblas' proposal seconded...

<mhausenblas> PROPOSAL: To use 'non-RDF data' rather than 'legacy data' along TimBL's 5 star scheme

mhausenblas: What to call that in the First Public Working Draft?

<simonWall> unlinked data!

<cgueret_work> -1 to non RDF

<rreck> non-RDF is stilted

<PhilA> -1 to non-RDF

<rreck> +1 bio

<cgueret_work> @simonWall some RDF is also unlinked

<boris> no open data?

<olyerickson> +1 to bladder relief...

<PhilA> +1 to source data

<DaveReynolds> +0 on "source data", it means something different but in a less harmful way than "raw data"

<cgueret_work> +1 to source data

<rreck> +1 source data

<cygri> +0.5 to source data. not my favourite but could work well enough.

+1 source data

<mhausenblas> bhyland_: we resume at 10:30am/3:30pm

<olyerickson> are we hanging up?

<olyerickson> do we have to dial in again?

<olyerickson> ...or is everyone on mute?

<bhyland_> ping, is the Galway team ready to resume?

<mhausenblas> yes

<mhausenblas> sorry

<sandro> "Interfacing to Existing Data System"

<sandro> "Providing an RDF Interface"

Galway is coming...

<sandro> "RDF Interfaces"

<sandro> galway ping

<sandro> cygri, mhausenblas ...

<sandro> "Providing RDF Interfaces"

<bhyland_> -1 to non-RDF. I prefer "Source Data"

<mhausenblas> PROPOSAL: To use 'source data' rather than 'legacy data' along TimBL's 5 star scheme

<bhyland_> +1 PhilA

<cygri> +0.5

<cgueret_work> +1 to "source" too

<cygri> +0.5 to source data

<DaveReynolds> +0 on "source data", it means something different but in a less harmful way than "raw data"

+1 source data (although domain specific/original data would be more clear)

<cgueret_work> and what about "genuine data" ? :)

<bhyland_> Proposal for replacement name, it has a "use by date" of at least the FPWD

<sandro> sandro: the default is we don't revisit decisions.

<boris> +0.98 to "source data"

<bhyland_> Agreed: "Source Data"

<cgueret_work> cool

<mhausenblas> RESOLUTION: To use 'source data' rather than 'legacy data' along TimBL's 5 star scheme

<sandro> +0 source data okay as long as it's open to revisiting before LC. ( -1 to this term forever)

<boris> http://logd.tw.rpi.edu/sites/default/files/w3c_gld_uri_construction_25jan12.pdf

URI Construction discussion

olyerickson: we have general URI recommendations, e.g., data patterns.

dvilasuero: Agrees.

<bhyland_> What do you mean missing?

olyerickson: instance-hub-uri-design makes it possible to re-host URIs
... re-host, i.e. move to a different architecture after testing
... requirements for the URI creation approach: no need to make URIs self-describing, non-domain-specific

<bhyland_> yes, Michael, it must be. I see legacy discussion in two parts in fact.

olyerickson: major parts: id, org, category/token

<DanG> What about using subject matter categories rather than agency based ones? They won't die if the agency does.

olyerickson: explanations of examples are linked from the wiki, e.g. in best practices document

<DaveReynolds> DanG - UK recommendation is def to use subject matter and avoid agencies

olyerickson: room for discussion.

<olyerickson> This is NOT a recommendation; it's simply what we are ucing

<olyerickson> s/ucing/using/

<bhyland_> @Michael - we're planning to break in 15-20 minutes, when we've completed or at least come to natural break point in URI discussion. We have to walk to get our lunch.

<simonWall> I was planning to be gone by now, good night all.

Richard: Good handle on what the section should say. Small concern: some guidelines are applicable everywhere, e.g., slashes, stability; other aspects apply only to specific use cases. UK gov guidelines mostly apply only to specific environments.
... E.g., re-hosting is something quite specific.

<olyerickson> @cygri good point; that was "merely" RPI's requirement ;)

<bhyland_> cygri: The main focus should be on stuff that is "true everywhere".

<bhyland_> What always applies vs. more specific example that could be better described as use cases.

Richard: Recommendations should be more generic. Needed: To abstract from the use cases of TWC or UK gov to have a less complicated design.

<bhyland_> Example from DERI, re: data.gov.ie project ...

<mhausenblas> s/data.gov.ie/http://data-gov.ie

Richard: The approach was too complicated, but this was realized only afterwards.

<dvilasuero> +1 cygri

<Zakim> PhilA, you wanted to caution against using the org component, slide 3

PhilA: Concern: Names of government departments change very often and should not be included in URIs. The same goes for locations.

<bhyland_> How about if we provide 1) background on the importance of URI strategy; 2) the value of persistence strategy; 3) detail the issues involved to evaluate a URI scheme

olyerickson: Good point. But there is always the question of whether to create URIs from the concepts or from the actual data. If modelled from the data, then even if the concepts change later, the data was valid at the time of modelling and so the URIs remain valid too.

<olyerickson> _dammit or /dammit ?;)

<mhausenblas> Michael: I don't see much of a point in criticising RPI's work now - he made it clear it's an example, not the recommendation

<olyerickson> +1 to sandro's point

sandro: important to have a plan in case a name changes.

<bhyland_> NB: We aren't criticizing RPIs URI draft ... it gave us something in black & white to discuss. Therefore it is good & useful IMO.

<George> {sector}.data.gov.*/id/{thing-type}/{instance}/natural/instance/hierarchy

<cygri> really good point

<mhausenblas> +1 to Dave's wise words re scalability of URI spaces via sub-domains

<bhyland_> DaveReynolds: Describe constants: 1) the constants (e.g., sectors for the UK). 2) use of sub-domains to allow for autonomy within gov't authorities. 3) explain scalability implications involved depending on URI structure. Explain URI construction and allude to performance issues ...

<mhausenblas> +1

<bhyland_> ... Separating the advice of what to do vs. if you don't do it, you'll get bitten in the bum
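A minimal sketch of the URI pattern George pasted above ({sector}.data.gov.*/id/{thing-type}/{instance}), assuming the "*" stands for a country-level domain. The function name, parameter names, and the example values are hypothetical and not part of any WG guidance; the sector sits in the sub-domain, which is the part DaveReynolds suggests can be delegated to autonomous authorities.

```python
from urllib.parse import quote


def build_id_uri(sector, tld, thing_type, instance):
    """Build an identifier URI of the form
    http://{sector}.data.gov.{tld}/id/{thing-type}/{instance}.

    Assumption: the sector is carried by the sub-domain, so each
    sector can run its own host without changing the path layout.
    """
    host = f"{sector}.data.gov.{tld}"
    # Percent-encode each path segment so spaces or slashes inside a
    # value cannot change the structure of the path.
    segments = ["id", thing_type, instance]
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"http://{host}/{path}"


if __name__ == "__main__":
    # Hypothetical example values.
    print(build_id_uri("transport", "uk", "station", "kings cross"))
```

Keeping the pattern in one function means a later change (for example dropping the sector) touches a single place, which is one way to have "a plan in case a name changes".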

<stasinos> +1

<dvilasuero> +1

<DeirdreLee> Stale URIs (from non-existent depts) will make the data look stale, even if it's brand new....

DaveReynolds: Depends on the use of URIs: stabilized, architecture-dependent. Separate the tools that allow URIs to be created from the methods for dealing with issues afterwards.

<mhausenblas> Michael: We're implicitly assuming transparent URIs now

Yigal: Responsibilities also change. We need to think about temporal issues, such as what a URI represented at a given time.

<mhausenblas> Michael: also known as hackable URIs

<George> Yigal: temporal aspect in URI? which HHS? responsibilities change even if/when orgs don't

<DaveReynolds> For those who may not be aware ... as well as the original UK recommendations http://www.cabinetoffice.gov.uk/media/308995/public_sector_uri.pdf there are recommendations about spatial objects (as relates to the EU INSPIRE directive) http://location.defra.gov.uk/wp-content/uploads/2011/09/Designing_URI_Sets_for_Location-V1.0.pdf useful example of patterns of things beyond "id" and "def"

<cygri> +1 to point out ways in which things can/will break

bhyland: We need to bound URI construction topic.
... On the one hand best practices should be valid as long as possible. On the other hand it should also include more specific issues.
... cannot tell Google, Yahoo which vocabularies to use.

:-) thanks PhilA. Can someone scribe?

<PhilA> scribe: PhilA

<BenediktKaempgen> Thanks.

<bhyland> ping

<cmusialek> sorry about that!

olyerickson: I'd like to propose that the guidance we're getting -> we should transform what we have so far into a check list or decision tree

<cmusialek> thanks!

<bhyland> sorry from all of us in DC .. we seem to get aperiodically dropped from our guest network and there is no explicit notification ...

olyerickson: highlight the issues. What we're saying is what we did, what we thought about and why we did it

<bhyland> and worst, we loose the IRC history :-(

<bhyland> s/worst/worse of all/

<George> UK guidance also talks about /def (controversy!) and /dataset among other topics

bhyland: I would say, not having read the UK guidance in 8 months or so - that's more comprehensive and thought out. We should consider others (and strip out the UK-specific stuff)

sandro: I like the decision tree idea a lot

<olyerickson> +1 to "decision tree" idea

sandro: "Don't do this" or it will cause problems later and "you probably don't want to do this but you may have reasons not to" and so on
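One way the check-list idea with sandro's two tiers could be sketched. The tier names, the function check_uri, and the second rule (technology-specific file extensions) are assumptions made for illustration and were not agreed in the meeting; the first rule reflects the concern raised earlier that department names change often, and placing it in the softer tier is also an assumption.

```python
from urllib.parse import urlparse

# Two tiers paraphrasing the kinds of advice described in the discussion.
AVOID = "avoid: likely to cause problems later"
CAUTION = "caution: acceptable if you have a reason"

# Hypothetical rule data, not taken from the minutes.
TECH_EXTENSIONS = (".php", ".asp", ".jsp")


def check_uri(uri, org_names):
    """Return a list of (tier, message) findings for a candidate URI."""
    parsed = urlparse(uri)
    segments = [s.lower() for s in parsed.path.split("/") if s]
    findings = []

    # Concern from the meeting: organisation names change over time.
    for name in org_names:
        if name.lower() in segments:
            findings.append(
                (CAUTION, f"path contains organisation name '{name}'")
            )

    # Hypothetical extra rule: implementation detail leaking into the URI.
    if segments and segments[-1].endswith(TECH_EXTENSIONS):
        findings.append((AVOID, "path ends in a technology-specific extension"))

    return findings


if __name__ == "__main__":
    for finding in check_uri("http://data.example.gov/id/hhs/grant/42", ["HHS"]):
        print(finding)
```

Returning findings rather than a pass/fail flag keeps the "what to do" advice separate from the "what breaks if you don't" explanation.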

bhyland: cmusialekhas a mission to do today. I don't think the guidance is ready for him. The RPI draft is a good input - needs to be discussed further

<olyerickson> "RPI thing" is not a draft...it's what we did and why ;)

<olyerickson> Wait a minute...I think ChrisM is a test subject and should actually

cmusialek: I'm less familiar with the intricacies of URI design. But I'm hearing that it's time to act from the US gov and maybe get 80% right

<bhyland> s/cmusialekhas/cmusialek/

olyerickson: I'm going to disagree with you, bhyland
... We're not saying to Chris, go ahead and use this. I'm saying "try it, see what breaks and let us know"

<bhyland> Olyerickson is saying the draft RPI URI guidance is a proposal ... try it and give us feedback.

olyerickson: I'd also say take a look at the UK advice and tell us what the problem is

<DeirdreLee> +1 Olyerickson

<bhyland> RPI has used the RPI version for a very specific case.

olyerickson: We've used ours for a very specific case

<DeirdreLee> community drives standards or standards drive community?

The former DeirdreLee (if it's to be used)

cygri: I wanted to say that in terms of structuring these BP Recommendations, I agree with sandro and mhausenblas to structure these as a list
... Seems a good way to teach/inform

<olyerickson> +1 to cautioning about what might go wrong...BUT it needs to be informed advice

cygri: Have you thought about future change? Is there 'cruft' in there (scribe doesn't recognise the term cruft but that's life)

<boris> +1 to richard

<mhausenblas> +10000 to cygri

bhyland: I appreciate John's request for data.gov to take the RPI advice and see how it works. That might be the RPI state, but I'm not sure it's the W3C position as it hasn't been sanctioned by the WG

<cygri> olyerickson, who suggested giving uninformed advice? ;-)

<olyerickson> +1 to keeping things separate

<cygri> ack then,

<olyerickson> @cygri I didn't mean...hmmm...what did I mean ;)

<bhyland> AnneW: How do we iterate through a suggested set of guidelines & recommendations?

<olyerickson> PROPOSAL: URI sub-team work on a check-list for URI construction

sandro: The WG is supposed to iterate on the doc until everyone agrees with it
... then it goes to the outside world
... etc.

<mhausenblas> +1 to olyerickson proposal

<boris> +1 to oleyrickson

<bhyland> sandro: The normal W3C process is that the group reviews and once they don't have any problems with it, then goes to last candidate review for feedback.

<dvilasuero> +1 to oleyrickson

bhyland: What RPI has provided is a draft. But let's encourage cmusialek to be part of the discussion as it continues to evolve

<olyerickson> +1 to cmusialek et.al. be part of the conversation

bhyland: Are we at a natural breaking point?

<Yigal> In reference to using Congressional Districts as example: Is everyone aware that these are redrawn every 10 years?

Galway: YES

<olyerickson> are we hanging up?

<mhausenblas> reconvene at 12:25 and 5:15pm

<olyerickson> I can also stay on bridge...

<mhausenblas> reconvene at 12:15 and 5:15pm

Wiki record of these minutes is up to date at this point

<olyerickson> Note for the "Vocabulary Selection" team: check out the recent addition to DataFAQs re: the role of vocabulary selection in Linked data quality https://github.com/timrdf/DataFAQs/wiki/Assisting-vocabulary-selection

<mhausenblas> I have to go now, unfortunately, Richard is taking over Galway. Literally. :)

sandro: the Washington room is still empty (the video link is showing us that). I can ping you when we're about to reconvene if you like?

<cygri> zkim, code

Discussion on Best Practices for Publishing Government Linked Data (FPWD)

bern: Did a big restructuring of the wiki page yesterday

<cygri> http://www.w3.org/2011/gld/charter

BernHyland: Two questions - how do we move from a wiki to a FPWD, and how do we reflect future changes


sandro: We can publish directly from the wiki using a transformation script we have
... It's called RevDoc. It's only my WGs that have used it
... so far
... code is not polished
... alternative is to convert to respec which a lot of folk prefer

bernHyland: does it require your help to use RevDoc?

sandro: yes - incantations and bones are involved
... it could be useful but there are alternatives

bernHyland: Respec is the alternative

bh: I'm familiar with Respec so I'd rather use that
... I'll need help from people to make sure that they remember to record who changes what and when

<bhyland> ping

<boris> http://dev.w3.org/2009/respec2/

<Zakim> cygri, you wanted to mention ReSpec 2

bh: There seemed to be a lot of activity last September in terms of formatting that we can look at

sandro: One month off is OK. But we can put changes on the front page of the wiki

<Zakim> cygri, you wanted to ask what documents the group is going to publish

cygri: Do we have something like a complete list of the documents that the WG is going to produce (Rec and non-Rec)

<DaveReynolds> +1 a clear list of docs and intended status would be helpful

bhyland: The community directory is published, BPs will be a Rec,

<BenediktKaempgen> +1 to list of documents that will be produced

sandro: The WG should do what it thinks is best, there are no rules as such

bhyland: We should put things that logically go together in a single doc. E.g., we might have a lot of stuff about URI construction that could be separate
... The Cookbook isn't a Rec - it could become part of the directory

George_: The milestones section of the charter says that the directory and cookbook are separate

sandro: I think we should remain open to splitting docs as we see fit

<mhausenblas> Michael: Regarding publishing the BP FPWD, I think boris and I already had a chat, no? Boris, can you share our proposal on the call, please?

<mhausenblas> Michael: Essentially, the idea was to manually transfer the content from the Wiki - we're three Editors, so workload-wise this should work

cygri: I thought the Recommended vocabs were going to be in separate docs (DCAT and Data Cube) but I don't know about the other areas

<bhyland> cygri: suggested re: recommended vocabs, have one doc for DCAT, another for DataCube

<mhausenblas> Michael: Just to make it clear - I'm against the script-based version from the Wiki as we have a rather messy structure there and I don't wanna play guinea pig. sorry sandro, no offence meant

sandro: If we're just going to endorse someone else's vocab we don't need a big doc for that

bhyland: How much of the data cube spec is already written?

cygri: We have a spec that is pretty much ready. We might want to add things and improve things but in principle there is an existing spec that covers what you would expect it to
... we will need to write more if we decide that there are issues that need to be addressed

bhyland: Is there any benefit for having this as a separate doc?

<DaveReynolds> +1 that's what I thought

cygri: Yes, that makes sense and I already have an action item to create it
... with help from DaveReynolds et al

<BenediktKaempgen> +1 QB should be an own spec

<mhausenblas> +1

<boris> +1 separate spec for qb

boris: wrt the draft of the BP spec - we (Michael, Bern and I) will create the doc; people only need to update the wiki

bhyland: Agreed

Community Directory

Slides are at http://www.w3.org/2011/gld/wiki/images/c/c6/BHyland_W3C_GLD_WG_F2F2_Directory.pdf

bhyland: The idea is that the CD (Community Directory) is a place where people not necessarily familiar with LD can get some guidance
... The initial CD was put together with some loose requirements from the June f2f
... Talks through her slides
... Haven't had a lot of feedback - need and would like more
... So where do we go? semanticweb.org? SWEO?
... Now that we have a working site, we can seek feedback, maybe open it up
... I think the first thing is to make sure that people think it's a good idea

<olyerickson> RE UI, actually simple is good --- priority should be *useful*

bhyland: Biplav asked what a company like IBM should put in? What's the (relevant) address for IBM?

<George_> bhyland: addresses for global/multi-national concern is good topic for vocab rec tomorrow

cygri: DERI is listed in there. We did that because we were asked to do it
... But I was thinking about why I should want to return to it to make sure our data is up to date?

<George_> cygri: what's the incentive for data freshness on the CD?

cygri: If people come here to find info about expertise then obviously we'd want to be properly represented
... We have an interest in being found if people are looking for LD expertise in Ireland

<DeirdreLee> we need a vocabulary to describe Linked Data domain :)

cygri: I'd be interested to know who else is working on the kind of thing we do

<George_> cygri: LD Communities of Interest/Practice query where?

bhyland: We used the W3C CSS and then made changes - we'd like to make the side panel better

olyerickson: Don't be too hard on yourself. It looks good and it's hard to do faceted browsing
... There seems to be some interlinking that is not linking up. If you choose a company, then look at the topics, then try and click on those, what you expect to see is a re-listing of relevant companies

bhyland: Agree it would be useful to add tool tips around different terms
... I agree with cygri that if you know people are finding you through the DIR then you'll be more careful about keeping it up to date

ack sandro

sandro: I'm super picky about sites as a user. But I do have to wonder whether a bit of usability testing wouldn't be a bad thing. Unless it delivers a good experience on attempt 1 you might lose people
... Are there ways that other people could contribute improvements? Fork?
... I'm not sure how Callimachus puts things together. Are there grad students that could do stuff with it?

bhyland: They're welcome to download the code and work on it. This is built on v.12 - we're now on v.16 which now includes import/export of apps
... updating the instance doesn't take a lot of work
... There's not a large technical hurdle to overcome. Just a bit of CSS and JS
... what I'd like is a list of features that we can fix
... especially if they're trivial!


bhyland: We went to a lot of trouble to get it on a w3 domain for reasons of permanence etc. Got to be easy to use

sandro: Do you have an issues list?

<olyerickson> @sandro VERY good point!!

<cygri> +1 to sandro

<olyerickson> is there a github wiki?

bhyland: I'll ask James how he wants to queue up issues
... Things like needing a login is surprising

<olyerickson> What is the code host? github? Google Code? each have built-in issues trackers

bhyland: but maybe that's a good thing to prevent the spam

<sandro> ACTION: bhyland to set up an issues list for dir.w3.org [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action05]

<trackbot> Created ACTION-33 - Set up an issues list for dir.w3.org [on Bernadette Hyland - due 2012-02-01].

<olyerickson> Problem solved: http://code.google.com/p/callimachus/issues/list

bhyland: Then we can see what is easy and what needs more work to implement, prioritise etc.

<olyerickson> Ah okay

sandro: There's a Callimachus issues list, what we need is a dir.w3.org issue list

<olyerickson> @sandro thanks for the clarification

bhyland: Obviously James and I are best placed to decide if it's a Callimachus or dir.w3.org issue

cygri: I wanted to give an armchair view of usability but not sure if that's the best use of our time?

<sandro> now you can submit issues. :-)

bhyland: I have a bias towards action - I was expecting some philosophical issues to deal with

<cygri> on the philosophical side, i just want to know whether it's httpRange-14 compliant

I'm going to need to ask other people in the government space and see what they expect and compare it with what there is

<t_gheen> Mike_Pendleton: bugs in Firefox display

Mike_Pendleton: The left-hand side has a list of things that may or may not mean anything to people. ... Conversation then found a bug
... continues to give thoughts to bhyland who takes notes...

<t_gheen> bhyland: how about visualizations?

<bhyland> http://dir.w3.org/page/number-of-organizations-by-country.xhtml?view

<t_gheen> ... names queries?

<t_gheen> s/names/named

<sandro> Hmmm. When I'm looking at an "area of expertise", like http://dir.w3.org/scheme/organizational+categories/rdf+store?view ... I don't see who has that expertise.

<t_gheen> ... are there other ways to view the information that are more meaningful?

<olyerickson> Not sure what you're looking at...

<olyerickson> Okay, bhyland was referring to visualizations on http://dir.w3.org/page/number-of-organizations-by-country.xhtml?view

<t_gheen> bhyland: any suggestions for linking up with egov interest group?

<t_gheen> ... open knowledge foundation

PhilA: That's best achieved by a personal conversation

cygri: We work with OKFN and can tell them about it. The DIR isn't quite there yet though

<t_gheen> ACTION: bhyland convene a meeting on armchair usability for community directory [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action06]

<trackbot> Created ACTION-34 - Convene a meeting on armchair usability for community directory [on Bernadette Hyland - due 2012-02-01].

BartvanLeeuwen: Backing up a bit... if we think about being able to pull some data directly from the Web, perhaps through gr: data?

<George_> BartvanLeeuwen: GR for company products services (Deirdre - vocab for LD domain?)

<t_gheen> bhyland: how does GR pull/update company info?

bhyland: I think it's a really good suggestion.

<George_> ... and then pull that from where ever into the CD

Lots of red faces around the table looking at the large pile of uneaten dog food

bhyland: We could offer guidance on what RDFa to include on your site, then we could accept a URL of a page to parse and then that could be added to the directory

<t_gheen> bhyland: if there was basic RDFa on someone's site, how can we automatically update their info in the directory?

bhyland: It's tiresome to have to enter that by hand in 2012

sandro: I think we'd want to support the system being able to import data from a given location

BartvanLeeuwen: and preferably auto-updating too

<George_> sandro: auto slurping high on list of to do's in general

<cygri> washington seems to have dropped off skype?

<DeirdreLee> Core business vocabulary?

<t_gheen> bhyland: what is the state of the art for scraping a page, RDFa?

bhyland: What's the state of the art for being able to scrape a site for RDFa?

sandro: It doesn't have to be RDFa, it can be any RDF format

<DeirdreLee> https://joinup.ec.europa.eu/asset/core_business/home

BartvanLeeuwen: I'm willing to take a look at it

bhyland: Take our site - say we had a book that we'd published, and we marked up the page with data. How do we say look at this and this but not that?

BartvanLeeuwen: If you look at GR you can say what your service offerings are

<George_> ACTION: BartvanLeeuwen to investigate GR ingest from CD provided page [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action07]

<trackbot> Sorry, couldn't find user - BartvanLeeuwen

<scribe> ACTION: BartvanLeeuwen to investigate how Good Relations etc could assist with automatically filling up the directory [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action08]

<trackbot> Sorry, couldn't find user - BartvanLeeuwen

<olyerickson> can we please not make this more complicated than necessary

<scribe> ACTION: Leeuwen to investigate how Good Relations etc could assist with automatically filling up the directory [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action09]

<trackbot> Sorry, couldn't find user - Leeuwen

<scribe> ACTION: van Leeuwen to investigate how Good Relations etc could assist with automatically filling up the directory [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action10]

<trackbot> Created ACTION-35 - Leeuwen to investigate how Good Relations etc could assist with automatically filling up the directory [on Bart van Leeuwen - due 2012-02-01].

<Zakim> cygri, you wanted to say this is not on the critical path for the community directory

cygri: I'm all for eating our own dog food. At the same time, to make the CD a success, the question of whether it can slurp in RDFa is not necessarily the most important

<George_> but it does speak to the freshness and updating issue :)

cygri: I don't want to discourage people looking at it, but it's not priority number 1
... So this is a vendor directory for LD organisations etc, yes?
... Are there similar examples of sites that do the same for other areas?

<bhyland> It is broader than a vendor directory.

<bhyland> Cygri: is there an analogous site to this one?

cygri: Can we find an example of something that achieves what we want to do in a different domain?

bhyland: The library community likes directories
... It's not just about vendors

<George_> agree with Mike_Pendleton wrt being aligned with Procurement

bhyland: It's about finding expertise, whether commercial, academic or whatever

<t_gheen> bhyland: there are many examples of these kinds of directories - ex. travel sites

<boris> biomedical directories

olyerickson: I'll reinforce what others have said about KISS
... It's hard to get people to add their data, even harder to get them to recode their websites
... if we want to be able to slurp in pre-cooked RDF then great, but maybe that should be a separate file
... an option for GR is having a separate location of the RDF info
... If I add my company info into the CD then it would be nice if the CD made an RDF file available that I could then add to my site

bhyland: Love that suggestion
... It's a Foafomatic tool - great

<George_> me thinks that's what BartvanLeeuwen meant in the first place, + the idea that callimachus could/should also serve as a RDFa template for those that can/will publish that

bhyland: You get something back for your effort

<BartvanLeeuwen> George_, ack

<olyerickson> ;)

DeirdreLee: The CD seems to be taking a centralised approach. We want people to put their data out there so that third-party tools can use it
... And the CD is a third party tool in this context

<George_> otherwise we'll pull it from dbpedia :)

bhyland: Yep, think distributed, think linked data
... summarises what she's taken down so far (and Sandro reminds her he's on the q)

Linked Data Cookbook

<boris> http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook

bhyland: It uses a linking gov data chapter I wrote from last November
... we got permission to keep the copyright
... some of it probably belongs in the best practices
... useful if you've had a chance to review it of course

boris: The content looks the same as the BP working draft - is it not the same?

cygri: Refers to the charter...

<cygri> The group will produce a collection of advice on smaller, more specific issues, where known solutions exist to problems collected for the Community Directory. This document is to be published as a Working Group Note, or website, rather than a Recommendation. It may, instead, become part of the Community Directory site.

<BenediktKaempgen> +q

BenediktKaempgen: We have been talking about the BP as a static document and it shouldn't be too specific as it will go out of date. The cookbook is more of a live document/resource

BartvanLeeuwen: I see it as a more specific document and yes, a living one

BenediktKaempgen: For example, a list of the current, most important vocabularies - that's a useful start for individuals

<George_> BenediktKaempgen: posits list of vocabs as example of 'smaller more specific'

<sandro> how about: when to use RDF/XML vs Turtle vs RDFa vs SPARQL ?

BenediktKaempgen: So the criteria go in the BP doc, and the ones that meet the criteria go in the cookbook

DeirdreLee: What's the government element of the cookbook?

<George_> DeirdreLee: what's the Gov angle?

DeirdreLee: It seems as if it could cover life sciences etc. ...

bhyland: There is a lot of overlap, and it may overlap with the Linked Data Platform WG too

bhyland: I write the various entries with gov in mind even though things can be used elsewhere too
... 80%+ can apply to any LD project, yes - but people from gov will gravitate to it on w3.org

<GofranShukair> Sorry ..I have to go ..bye everyone see you tomorrow

<cygri> http://www.w3.org/TR/swbp-vocab-pub/

<sandro> http://answers.semanticweb.com/

sandro: I picture the cookbook as an FAQ, a Stack Overflow type thing
... There are 30-40 questions that gov people will ask when asked to consider implementing LD

bhyland: Mike_Pendleton gave me a bunch of questions when we began working with the EPA - yes, that makes sense

<cygri> +1 to sandro. that made sense to me.


bhyland: Thanks for the feedback - that helps me see what needs to be done

<olyerickson> +1 to stack overflow-like functionality (but that's not free anymore)

cygri: So how can we collect those questions?

<olyerickson> @bhyland I have to sign off now...apologies. Have a great day, everyone!

<t_gheen> ACTION: bhyland gather top 30-40 questions for the FAQ [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action11]

<trackbot> Created ACTION-36 - Gather top 30-40 questions for the FAQ [on Bernadette Hyland - due 2012-02-01].

bhyland: Dare I suggest an action item to collect the questions?

<cygri> mhausenblas, i'm not in charge of the agenda, but it says we stop at 8

<sandro> list of stackoverflow clones. we could install an instance of one of these.... http://meta.stackoverflow.com/questions/2267/stack-overflow-clones

<cygri> sandro, why? answers.semanticweb.com is already there. don't fragment

<BartvanLeeuwen> cygri, +1

bhyland: Considers the day, whether we have achieved our targets
... reviews tomorrow's agenda

<sandro> cygri, not sure, just brainstorming. are there tags there we can use to help get GLD folks started in the right direction there?

bhyland: anyone not here tomorrow? t_gheen has to meet someone very senior in the West Wing

<cygri> sandro, not really. it's for asking questions and getting them answered, not really for reading old answers

<rreck> yes. i have posted the slides

bhyland: We'll talk about stability tomorrow

PhilA: Anne W might want to look at the outcome from the workshop on stability held last month http://www.w3.org/2001/tag/2011/12/dnap-workshop/notes.html

<George_> cygri: more DCAT tomorrow

cygri: Would like to talk about DCAT

<George_> +1 cygri

+1 on DCAT as the hope is to resolve to go to FPWD

<boris> http://www.w3.org/2011/gld/wiki/F2F2#Agenda

<George_> +1 more ADMS tomorrow morning too

Current static version of DCAT is at https://www.w3.org/2011/gld/group/WD-DCAT-20120106.html

<George_> agreed

<George_> with a mandate!

<DeirdreLee> Interoperability Solutions for European Public Administrations http://ec.europa.eu/isa/

<DeirdreLee> Join up https://joinup.ec.europa.eu/

Thanks all round

Meeting adjourned

<bhyland> ping

<bhyland> is someone in Galway publishing the minutes for today??

Summary of Action Items

[NEW] ACTION: BartvanLeeuwen to investigate GR ingest from CD provided page [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action07]
[NEW] ACTION: bhyland convene a meeting on armchair usability for community directory [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action06]
[NEW] ACTION: bhyland gather top 30-40 questions for the FAQ [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action11]
[NEW] ACTION: bhyland to set up an issues list for dir.w3.org [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action05]
[NEW] ACTION: boris to create a Wiki page on multi-lingualism of vocabs [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action03]
[NEW] ACTION: cygri to produce editor's draft of Data Cube spec [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action02]
[NEW] ACTION: mhausenblas to compile first version of vocabulary selection quality checklist [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action04]
[NEW] ACTION: PhilA to add products on issue tracker [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action01]
[NEW] ACTION: van Leeuwen to investigate how Good Relations etc could assist with automatically filling up the directory [recorded in http://www.w3.org/2012/01/25-gld-minutes.html#action10]
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2012/01/25 19:09:24 $
