SV_MEETING_TITLE -- 02 Nov 2010

<sandro> People in room, in order:

<sandro> Karen Myers

<sandro> Sandro Hawke

<sandro> Jeni Tennison

<sandro> Daniel Dardailler

<sandro> Roger Cutler

<sandro> Phil Archer, W3T + Talis

<sandro> Martín Álvarez, Fundación CTIC

<sandro> Thomas Bandholtz Germany -- LD Environment Data

<sandro> Antonio Sergio Cangiano, SERPRO

<sandro> Jose Leocadio, SERPRO

<sandro> Yosuke Funahashi, Tomo-Digi Corporation

<sandro> Vagner Diniz

<sandro> Karen Burns

<scribe> scribe: PhilA

<scribe> scribeNick:PhilA

<karen> Daniel: Are we going to talk about the creation of a task force to look at education and outreach?

DD: Raises issue of non-tech education & outreach
... (as possible agenda item later)

Karen: 1.5 yrs ago we had an active task force
... comm team etc. planning lots of stuff
... then there was a shift in priorities, and Josema left. It was very effective when we were doing it
... I have a personal interest. We have some support from our PR firm that has a knowledge base in this area.
... but it's US-based. Need a more global view

DD: There is funding from the EU to help PSI

Karen: I like the idea of a TF.
... if there's a need for an IG just looking at that, all well and good

DD: we could spin off other groups from the IG

Karen: We need high level messages - what is open data, what is Linked data etc.

PhilA: Talis is interested in this ;-)

Sandro: Others?

Interest in the room from New Zealand, Brazil and more

<sandro> Karen B, Vagner

<karen> Vagner Diniz, Karen Burns

<scribe> ACTION: Daniel D to set up task Force on EO [recorded in http://www.w3.org/2010/11/02-egov-minutes.html#action01]

<trackbot> Created ACTION-118 - D to set up task Force on EO [on Daniel Bennett - due 2010-11-09].

Linked data at data.gov.uk

Jeni: Takes the floor...
... shows data.gov.uk Web site

<sandro> ACTION-118 is really on Daniel Dardailler not Daniel Bennett.

(http://data.gov.uk)

Jeni: Most data is in CSV or XML. Some, but not much, LD
... explains the term 'organogram' to mean org chart, organisational info etc.
... an edict from gov said that all departments should publish their organograms on data.gov.uk, and it specified what info had to be included
... about 62 on d.g.u now
... majority published a PDF of their organisational structure
... pretty pictures with tables that don't help a lot as there's no data to pull out
... some of the org charts use headings defined centrally
... some published as PowerPoint
... senior post data includes reporting structures
... shows a CSV file
... talks about 'Gridworks', now renamed 'Google Refine'
... see http://code.google.com/p/google-refine/

Roger C: That's cool!

Jeni: yes it is!
... important tool for cleaning up data
... something that non-specialists can use
... Demos Google Refine
... You can see the facets for a column, edit values that have gone wrong, edit column names
... the key point is that civil servants can use this tool
... we have gone round training a bunch of civil servants. Lots of good feedback. People have begun using it
... extremely nice features for reconciling data against already published data
... you can ask the tool to reconcile a column

<sandro> gridworks "reconcile" to link to web data. nice!

Jeni: turns strings into links (if it finds relevant data)
... You can do a bit of manual work to produce clean RDF without actually handling RDF (knowingly)
... you can apply scripts
... shows adding a column for, in this case, provenance
... data.gov.uk tries to keep track of where we get data from
... and if I open up a script (in this case, a bit of JSON), paste that in and apply those instructions, it will perform various tasks, creating extra columns etc.
... my script adds in lots of URIs in this case
... (URIs central to linked data)
... DERI has created a plug in that describes the data
... now can export the data as turtle or RDF/XML
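
The CSV-to-RDF step Jeni demonstrates can be sketched roughly as follows. This is a minimal illustration, not the Google Refine or DERI plug-in implementation: the base URI, the `ex:` property names, the `Post`/`Grade` column headings and the sample row are all assumptions made up for the example.

```python
import csv
import io

# Hypothetical base URI and column names, chosen only for illustration.
BASE = "http://example.org/id/department/bis/post/"
SOURCE_COLUMN = "source"  # the added provenance column


def slug(text):
    """Lower-case a label and join its alphanumeric words with hyphens."""
    words = "".join(c if c.isalnum() else " " for c in text.lower()).split()
    return "-".join(words)


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_turtle(csv_text, source_url):
    """Turn each CSV row into a few Turtle triples about one post."""
    lines = ["@prefix ex: <http://example.org/def/> ."]
    for row in csv.DictReader(io.StringIO(csv_text)):
        row[SOURCE_COLUMN] = source_url  # record where the row came from
        subject = "<" + BASE + slug(row["Post"]) + ">"
        lines.append(subject + ' ex:label "' + escape(row["Post"]) + '" ;')
        lines.append('    ex:grade "' + escape(row["Grade"]) + '" ;')
        lines.append("    ex:source <" + row[SOURCE_COLUMN] + "> .")
    return "\n".join(lines)


sample = "Post,Grade\nDirector General,SCS3\n"
print(rows_to_turtle(sample, "http://example.org/organogram.csv"))
```

Minting the subject URI from a column value is the part that makes the output linkable; the provenance column simply travels with each row.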

(Shows RDF generated from the CSV)

Jeni: You can see the different posts within 'BIS' (Department for Business, Innovation and Skills)
... we run several stores, mostly hosted by Talis

<sandro> (I wonder if there's a way to simplify that script application process....)

Jeni: they bring together data sets for, say, transport
... then one on education and so on
... organogram data is "reference data"

i.e. http://reference.data.gov.uk

(Shows SPARQL queries against BIS organogram data). Live. No safety net

Voila! Some results

Jeni: Most people, including developers, don't react well to being asked to write SPARQL queries
... so we have added a layer on top of the SPARQL to provide a simpler API
... I'll show you the basic Linked data API first of all

http://reference.data.gov.uk/id/department/bis

Shows how this gives a 303 to http://reference.data.gov.uk/doc/department/bis
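
The redirect shown here can be sketched as a pure function, assuming the convention is simply that `/id/` paths redirect to the matching `/doc/` path. The function names and the 404 fallback are assumptions for illustration; this is not the data.gov.uk server code.

```python
from urllib.parse import urlsplit, urlunsplit


def document_uri(identifier_uri):
    """Return the /doc/ URI that an /id/ URI would redirect to, or None."""
    parts = urlsplit(identifier_uri)
    if not parts.path.startswith("/id/"):
        return None  # not an identifier URI under this assumed convention
    doc_path = "/doc/" + parts.path[len("/id/"):]
    return urlunsplit(
        (parts.scheme, parts.netloc, doc_path, parts.query, parts.fragment)
    )


def respond(identifier_uri):
    """Build the (status, headers) pair a server could send for the request."""
    target = document_uri(identifier_uri)
    if target is None:
        return 404, {}
    return 303, {"Location": target}


print(respond("http://reference.data.gov.uk/id/department/bis"))
```

The 303 status tells the client that the identifier names a thing (the department) and that a document describing it lives at the other URI.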

demos exploring the data

Karen: This is fabulous
... practical question - who is updating the data?

Jeni: the generic answer is "how long is a piece of string". Some data changes daily, some changes much less frequently
... for organogram work, the stipulation was that data should be valid on 30/6/10 and should be updated every 6 months

Karen: How did the departments react?

Jeni: It was hard. it took a big stick from the Cabinet office to get it done
... Most departments have generated just a PDF or a PowerPoint
... some generated a CSV (prob by HR with help from IT dept)
... generation of RDF was done by me (x 6). One dept has done it themselves

Karen: And they can navigate this UI?

Jeni: This UI is designed to show them the benefit of doing it as LD. Showing that people can navigate around the data
... you can see the different sources of the data

Sandro: Is everyone's salary info public by law?

Jeni: Top civil servants - although it's not by law, it's the culture

Roger: Who wants to do this and why?

Jeni: We have a strong developer community in the UK. They want to get hold of gov data, package it and so on
... they usually want to pursue this for lobbying or political ends
... person X is claiming ABC on their expenses, is this right?
... personally I don't find that the most interesting data that governments can put out but it is where the current political drive is in the UK

Karen: it is tangible though

Jeni: School performance is something that people can relate to as well
... completes demo
... this helps people explore the data and find out where the data came from

<sandro> exploring http://reference.data.gov.uk/doc/department/bis

Jeni: shows XML data, or JSON data - the interface allows you to access the data in various formats. Just add .xml or .json to the URI. That's what the Linked data API is about
... So much for the data, available as an explorer and as data for developers. But it's not especially pretty
... so let's see if we can find a pretty output
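
The "just add .xml or .json" convention amounts to a one-line URI transformation. A small sketch, with `format_uri` as a hypothetical helper name and only the two formats mentioned in the demo handled:

```python
def format_uri(doc_uri, fmt):
    """Append a format suffix such as 'json' or 'xml' to a document URI."""
    if fmt not in ("xml", "json"):
        raise ValueError("only the formats named in the minutes are handled here")
    return doc_uri.rstrip("/") + "." + fmt


doc = "http://reference.data.gov.uk/doc/department/bis"
for fmt in ("xml", "json"):
    print(format_uri(doc, fmt))
```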

Demos BIS organogram visualisation

http://danpaulsmith.com/gov/orgvis/?dept=bis

Sandro: Does it go all the way up to the prime minister?

<sandro> sandro: if you want to get on this giant org chart, give us your RDF

Jeni: Not yet, but that would be cool, especially if it came out of all the different data sets created by different departments. The overall org chart comes out of its component parts - if you use linked data
... Getting to an end to end story like this has taken several months
... we had to work out what URIs should look like for different departments. This is a department within a department, a unit etc.
... we had to create some vocabularies for organisational structures generally and then specifically for UK
... provenance data is very important
... statistics around salary costs - needed a vocabulary for talking about statistical data
... and those fundamental design choices etc. had to be done to support the kind of end to end story we've been looking at here

Sandro: Those vocabularies sound like candidates for standardisation

Roger: AIUI you started by defining standardised things. In the tool you had a reconciliation step that knew what to do with the data. So it must have been pretty close for automated reconciliation
... what does that depend on?

Jeni: Gridworks takes the values that it finds in the column. Takes a sample, sends it to a reconciliation service - an API for this kind of thing
... the Rec service looks at the values, looks at the data it has, and works out what it looks like and what vocab is appropriate

<sandro> sandro: I've put those vocabularies, as best I understood, into the GLD WG Work Items list, http://www.w3.org/2001/sw/wiki/GLD_Work_Items

Jeni: recognising "John Smith" as a name cf. "Smith, John" is something the reconciliation service does. Leigh Dodds (Talis) created a good example of this

See http://www.ldodds.com/blog/2010/08/gridworks-reconciliation-api-implementation/
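
A toy version of what a reconciliation service does with a column of names might look like the sketch below. It is not the Gridworks reconciliation API protocol and not Leigh Dodds's implementation; the lookup table, the example URIs and the normalisation rule are assumptions chosen to show the "John Smith" versus "Smith, John" case.

```python
# Hypothetical reference data: canonical names mapped to made-up URIs.
KNOWN_PEOPLE = {
    "john smith": "http://example.org/id/person/john-smith",
    "jane doe": "http://example.org/id/person/jane-doe",
}


def normalise(name):
    """Reduce 'Smith, John' and 'John  Smith' to the same key, 'john smith'."""
    if "," in name:
        family, given = name.split(",", 1)
        name = given + " " + family
    return " ".join(name.lower().split())


def reconcile(values):
    """Map each cell value to a URI, or to None when nothing matches."""
    return {value: KNOWN_PEOPLE.get(normalise(value)) for value in values}


column = ["John Smith", "Smith, John", "J. Smith"]
for value, uri in reconcile(column).items():
    print(value, "->", uri)
```

A real service would score several candidates per value and leave ambiguous ones for a person to confirm; exact lookup after normalisation is the simplest possible stand-in.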

Further discussion on how this works between Roger, Jeni and Sandro

Roger: How does a salary get recognised as a salary?

Jeni: That's in the RDF schema

<sandro> sandro: so reconciliation takes strings which are intended to be identifiers and turns them into proper URI identifiers

Jeni: and we can use the CSV column name as the hook

Roger: so "salary" might match and "sal" won't

Jeni: yes

Roger: So there's a certain amount of fixing up of "user data"
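
Using the CSV column name as the hook can be pictured as a plain lookup, which is why "salary" matches and "sal" does not. The mapping and property URIs below are invented for illustration; only trimming and case-folding are applied before the lookup.

```python
# Hypothetical mapping from CSV column headings to property URIs.
COLUMN_PROPERTIES = {
    "salary": "http://example.org/def/salary",
    "post": "http://example.org/def/post",
}


def property_for(column_name):
    """Return the property URI for a heading, matching on the exact name only."""
    return COLUMN_PROPERTIES.get(column_name.strip().lower())


for heading in ("salary", "Salary ", "sal"):
    print(repr(heading), "->", property_for(heading))
```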

karen: What is the UK gov policy towards the APIs?

jeni: The Linked data API is on Google Code and anyone can use it http://code.google.com/p/linked-data-api/

Talis implementation in PHP http://code.google.com/p/puelia-php/

Karen: Are you shouting about this?

Jeni: yes

Sandro: The Linked data API work was presented at various meetings

karen: so it needs more outreach

Roger: I'd like to hear more about how politicians talk about this?

Jeni: Gordon Brown got it and understood it

<sandro> It's a possible item for GLD WG: http://www.w3.org/2001/sw/wiki/GLD_Work_Items#Developer-Friendly_API_and_Serialization

Jeni: current government see it as part of the transparency agenda. Making data available in a machine readable format

Roger: Made the point that organisational capability, software support etc. is really important for commercial companies etc.

Vagner: Jeni talks about visualisation
... has W3C done any work on visualisation?

Sandro: beyond CSS and SVG, I'm not aware of any

Vagner: You added this item on the WG. I wonder if it's something we need to discuss?

Sandro: That's coming out of the data.gov.uk work but I don't know any more detail

PhilA: I think an RDF-SVG link would be cool

Fabien: Talks about an existing effort to link CSS, SPARQL and more

Open data initiatives in Spain

Martin: Shows map drawn in SVG, lots of tables in RDF so we're already linking RDF and SVG

Sandro: What's the linkage?

Martin: I think it's Java
... begins talk

<FabGandon> http://www.cytoscape.org/ Cytoscape combines SPARQL CONSTRUCT with a graph interface to allow the user to select and render RDF data

Martin: Open Data act 2007
... aim to create catalogue of open gov data

<FabGandon> more precisely the S*QL plugin for cytoscape http://semtech2010.semanticuniverse.com/sessionPop.cfm?confid=42&proposalid=2932

Martin: 3 regional governments (Asturias, Basque Country and Catalonia)

Shows Asturias catalogue

Martin: only 4 data sets but all linked
... uses things like the organisation vocab, iCal etc.
... project is 100% linked data, hosted on our own Oracle triple store
... we've added some Jena modules (and our own) to create SPARQL endpoints etc.
... metadata modelled using VoID
... HTML view generated dynamically
... Basque country is similar but is focussed on raw data. They have some RDF links but they're static files to describe the data sets
... more than 1,000 data sets in raw formats
... useful info for citizens and industry
... translation memories (Euskara -> Espanol etc.)
... Catalonia is a new one
... this will be similar to Basque country initiative
... most data will be in raw formats (CSV, XML etc.)
... they will provide some info in RDF
... as well as RDF static files
... we're also creating a catalogue using DCAT
... See http://vocab.deri.ie/dcat

<FabGandon> S*QL plugin for Cytoscape presentation here https://connect.umms.med.umich.edu/p79605689/?launcher=false&fcsContent=true&pbMode=normal

Martin: they will present > 26K vCards in RDF, describing public centres, using linked data approach
... as well as the regional governments we also have Saragossa, Gijon and Barcelona as cities in the project
... hope to have more info next month
... Saragossa will be using linked data. Should be first city to adopt this
... they're also using DCAT
... Gijon is a simple project
... adapting a CMS to provide RDF content representations in parallel by adding RDFa to pages
... we conclude that most governments are interested in publishing many data sets quickly
... they want to release the data 'now'
... they want good headlines
... they are neutral on idea of linked data
... they 'know' that semantic modelling is hard and they don't want to spend more time and money on it
... for us it's easy to create examples. It's not so easy for the developers
... maybe we need more examples like the Linked data API to help
... this would help to foster the use of the linked data info

Jeni: It is the case that making data useable and reusable is hard. Linked data is no harder. It's making the data clean that's hard.

<sandro> jeni: Making data reusable is hard (not semantics, per se)

Martin: we are trying to convince them using these examples. we gather their spreadsheets and show the linked data examples

<martin> Example of linked data representation http://datos.fundacionctic.org/sandbox/ineasturias/viviendas.do

Karen: Can you say a little more about the time and money. What levels of government are you working with?

Martin: The best example is Catalonia. They called us 15 days ago and said they wanted to have an open data site within a month. Can you help?
... speed was major concern
... get the data we have out there
... they are convinced that linked data is a good solution, they want to follow it, but they prefer spending their resources in developing open data site, specifying the licence
... LD comes later?

Karen: So who called you?

Martin: Not an IT person. More close to the citizenry
... not sure of actual department
... some are closer to the IT departments

Sandro: What was their motivation

Martin: They know about linked data because we told them about it
... most of them haven't heard about LD before
... they know open data initiatives, not linked data
... we managed to convince them ;-)

Sandro: Any other questions?
... then we'll take our break now

<sandro> http://www.w3.org/egov/wiki/TPAC_2010

<FabGandon> Datalift: http://datalift.org

Fabien Gandon on France's DataLift Project

<sandro> fabien: I'm not speaking for all of France! This is just one accepted project. Accepted in June, kickoff was at end of September.

<sandro> ... Not a lot to show yet, but now is a good time to give us feedback.

<sandro> ... Last year at ISWC we had a meetup, and discussed the lack of data.gov project in France

<sandro> ... We considered a prototype in Talis; first question --- where will the data physically be stored? In the UK or France?

<sandro> ... considered cloud in Europe

<sandro> ... datalift == lifting from raw data to rdf in France

<sandro> ... Atos Origin will be integrator, building an open source integrated platform. As side effect they'll be ready to offer services

<sandro> ... Mondeca is a KR firm in Paris, doing industrial knowledge modeling

<sandro> ... Academics: INRIA at Grenoble (aligning schemas), Eurecom, LIRMM (the guy from INRIA got promoted there).

<sandro> ... I'll be using DataLift as a scenario for pushing Named Graphs

<sandro> ... INSEE has all the national statistics for France, IGN has all the maps

<sandro> ... Fing is new generation of tools - use cases and business models

<sandro> ... Phase 1: an easy open end of data, open platform

<sandro> ... not re-inventing wheel. We'll re-use existing solutions if they pass our benchmarks; all dev will be open source.

<sandro> ... - Assist the selection of data

<sandro> ... (every thing must be proven on INSEE and IGN data)

<sandro> ... - identify appropriate schemas

<Zakim> PhilA, you wanted to ask about licensing/openness of IGN data

<sandro> phil: glad to see IGN in there (we have Ordnance Survey, in UK, which does the mapping); OS wants money for some of the data.

<sandro> fabien: IGN has to get half their budget from sales, so that is a concern

<sandro> roger: RDF only, or OWL too?

<sandro> fabien: We'll use OWL when the scenario calls for it

<sandro> fabien: We are concerned about speed of reasoning, so we'll have to strike the balance

<sandro> sandro: you don't have to do the reasoning

<sandro> fabien: it depends on the scenario

<sandro> roger: Is this typical in eGov, to do this tradeoff?

<sandro> fabien: Everyone needs to make this kind of tradeoff

<sandro> jeni: we're mostly staying away from OWL

<sandro> fabien: if we need some bit of OWL, then we can use it.

<sandro> fabien: With Atos, we'll benchmark every solution and see what scales well enough.

<sandro> roger: in HCLS, I saw a very elaborate authentication scheme, that was depending on just being in RDF [[S?]].

<sandro> fabien: I heard of this in Freebase

<sandro> ... not even using RDFS because it was deemed too expensive.

<sandro> ... - format conversion & connectors

<Roger> I believe it was RDFS

<sandro> ... (eg csv to rdf)

<Roger> The system was called S3. It's pretty interesting.

<sandro> ... - data publication itself (led by Atos)

<sandro> ... - interconnecting data

<sandro> ... eg URI for Paris connected to other URIs for Paris

<sandro> ... (or re-used)

<sandro> phil: What about talking to developers about using the data?

<sandro> fabien: Yes, we should have raised that topic more.

<sandro> ... it's one thing to show how to publish data; it's another thing to maintain it

<Roger> Sorry -- S3DB

<sandro> ... our developers mostly don't speak SPARQL and many don't speak English.

<sandro> ... so a cookbook in English won't be enough

<sandro> TB: (missed question)

<sandro> fabien: As soon as possible

<sandro> fabien: Other topics: visualize, API for mobile, clouds, legal advice, cookbook

<sandro> fabien: can you legally protect a URI?

<sandro> phil: *boom* TimBL exploding on hearing that [imagined]

<sandro> fabien: R&D challenges:

<sandro> ... methods and metrics for schema selection

<sandro> ... balance of specific needs & reusability (I think there is a tradeoff between usability and reusability)

<sandro> ... data conversion & identifiers generation

<sandro> ... automation of dataset interconnection (via Jerome Euzenat)

<sandro> ... named graphs (hopefully aligned with RDF 1.1), provenance, licenses and rights

<Roger> S3DB Permissioning: http://s3db.org/documentation/installation

<sandro> ... First 18 months get platform running by www2012 in this building in Lyon!, then 18 more months.

<sandro> ... user's club -- folks who want to use it

<sandro> ... includes City of Bordeaux

<sandro> ... Various Liaisons

<sandro> sandro: how much money is the funding?

<sandro> fabien: 3 years, about 2-3k per year, some more for leader.

<sandro> fabien: may create related sub-projects.

<sandro> fabien: we're trying to disturb the environment to create bubbles. :-)

Linked Environment Data in Germany (Thomas Bandholtz)

<sandro> tb: Open Environment Data in the 90s

<sandro> ... Aarhus Convention 1998

<sandro> ... European Env. Agency (EEA)

<sandro> ... Environmental Agencies in Germany

<sandro> ... (slide 4)

<sandro> ... INSPIRE based on Open Geospatial Consortium, not RDF yet

<sandro> ... access to raw data in OGC feature service

<sandro> ... many public sector portals about water, soil, etc --- web pages, pdf, csv, xml or web services --- exhausting harmonization process

<sandro> ... sub-clouds like Linked Open Drug Data, linked to dbpedia; we probably won't use dbpedia as the central ref point, but it looks like they will map to us.

<sandro> ... (slide 13)

<sandro> ... (slide 14)

<sandro> ... GEMET and EUNIS published as Linked Data by EEA

<sandro> ... (slide 15)

<sandro> ... (slide 17 has involved rdf vocabs)

<sandro> ... SKOS, SKOS(XL) -- only stable/w3c

<sandro> ... Dublin Core

<sandro> ... geonames

<sandro> ... linked events ontology, for the chronicle

<sandro> ... Darwin Core (for species)

<sandro> ... SCOVO

<sandro> fabien: I think there's a commercial version of geonames for more/better/current data

<timbl> With a different ontology?

<sandro> tb: German govt has their own data, and the agency that owns the data wants to sell it. There's a free version, but it doesn't include the polygons.

<sandro> tb: We use geographic names; we don't use maps; this river flows through these cities, one by one.

<sandro> tb: sensor web, many developments to come

<sandro> tb: Darwin Core seemed to like the version I did of their work using SKOS.

<sandro> timbl: Have you looked at Open Street Map as a source of geospatial data?

<sandro> timbl: linkedgeodata.org is a LD mirror of it.

<sandro> tb: I'll take a look at that.

<sandro> timbl: I'm told Open Street Map is a better source of data than geonames

<sandro> tb: We use SCOVO for env. specimen bank, and some extensions. SDMX data came along.

<sandro> Jeni: We've looked at using SDMX -- just using the datacube part looks good, as a midpoint between SCOVO and SDMX.

<sandro> tb: We used the specialized subproperties of dimension

<sandro> jeni: Yes

<sandro> tb: In skos-xl, class literals, so you can link labels.

<sandro> tb: inflectional forms of one word, extended properties of label class.

<timbl> For an RDF mapping see LinkedGeoData.org http://linkedgeodata.org/About

<sandro> tb: you could talk about this for years, we never came to an end.

<sandro> tb: (slide 18)

<sandro> tb: SPARQL end points -- can easily give accidental Denial of Service attack. :-)

<sandro> tb: but providing SPARQL would be nice.

<sandro> tb: authentication and access control would be good.

<timbl> (or a default limit =1000 for non-authenticated users)

<sandro> sandro: 4store includes a built-in resource limit, by default

<sandro> fabien: we built in a default limit, although that can confuse users who don't know about it.

<sandro> tb: This is good advice
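
The default-limit idea (1000 results for non-authenticated users) can be sketched as a query rewrite. This is a text-level illustration under stated assumptions: `apply_limit` is a hypothetical helper, it only looks at a LIMIT clause at the very end of the query, and it is not how 4store or any other store enforces its limits. A real endpoint would enforce the cap inside the query engine.

```python
import re

DEFAULT_LIMIT = 1000  # the figure suggested in the discussion

# Matches a trailing "LIMIT n", optionally followed by "OFFSET m".
_TRAILING_LIMIT = re.compile(
    r"\bLIMIT\s+(\d+)(\s+OFFSET\s+\d+)?\s*$", re.IGNORECASE
)


def apply_limit(query, authenticated=False, cap=DEFAULT_LIMIT):
    """Cap the result size of a SELECT query for non-authenticated users."""
    if authenticated:
        return query
    query = query.rstrip()
    match = _TRAILING_LIMIT.search(query)
    if match is None:
        # No explicit limit: add the default one.
        return query + "\nLIMIT " + str(cap)
    if int(match.group(1)) <= cap:
        # The user's own limit is already within the cap.
        return query
    offset = match.group(2) or ""
    return query[:match.start()] + "LIMIT " + str(cap) + offset


print(apply_limit("SELECT ?s WHERE { ?s ?p ?o }"))
```

Because a silently applied limit can confuse users, as Fabien notes, an endpoint using this approach would probably also want to tell the client that the cap was applied.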

http://www.w3.org/2001/sw/wiki/GLD_Work_Items

<JeniT> sandro: using same model as RDF Core Work Items list

Sandro: inspiration for methodology here is the RDF Core http://www.w3.org/2001/sw/wiki/RDF_Core_Work_Items

<JeniT> sandro: four categories for the work items

<JeniT> sandro: 1. helping deployment happen

<JeniT> sandro: 2. liaison items such as provenance & named graphs

<JeniT> sandro: 3. vocabulary items

<JeniT> sandro: 4. other technical development work items such as design patterns for URIs

<JeniT> sandro: promised charter by end of January

<JeniT> sandro: would mean start in April, running for two years

<JeniT> sandro: expect F2F meetings to be useful but hard for people to travel, so may try split F2F meetings

<JeniT> ... video conferencing between two places

<JeniT> ... to specific work items:

<JeniT> ... 2.1 Procurement Definitions

<JeniT> ... @johnlsheridan mentioned that this is an issue

<JeniT> ... having standardised definitions of terms/products to include this in ITTs etc

<JeniT> PhilA: something that is very important for government procurement

<JeniT> ... similar to WCAG guidelines, governments can point to them and say 'you must produce according to these standards'

<JeniT> FabGandon: would this include success stories?

<JeniT> ... real scenarios?

<JeniT> Sandro: not in this piece

<JeniT> Sandro: beautiful license out of UK

<JeniT> ... could be understood by a human

<JeniT> ... is there something we can do internationally?

<JeniT> ... having a list of licenses used in different countries?

<JeniT> FabGandon: I've been using double licensing

<JeniT> ... RDFa/GRDDL profile was licensed LGPL and a French license

<JeniT> Sandro: yesterday Daniel talking about getting bicycle accident data

<JeniT> ... had to sign a paper license

<JeniT> ... included things to say that he had to keep his application up to date

<sandro> vocab for describing licenses

<sandro> sandro: let me query for datasources I'm allowed to use for my app

<JeniT> FabGandon: something to indicate where licenses are roughly equivalent

<sandro> jeni: maybe some recommendations about what makes a good license for gov data --- allowing reuse

<sandro> jeni: guidance for licenses which enable the right kind of use
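
If licenses were described with a shared vocabulary, Sandro's query for usable data sources could reduce to a simple filter. The sketch below is purely illustrative: the catalogue entries and the permission names ("reuse", "commercial") are invented and do not come from any existing license vocabulary.

```python
# Hypothetical catalogue entries; the permission names are invented for this sketch.
DATASETS = [
    {"title": "Bus stops", "license": {"reuse", "commercial"}},
    {"title": "Bicycle accidents", "license": {"reuse"}},
    {"title": "Internal register", "license": set()},
]


def usable_for(datasets, needed):
    """Return titles of datasets whose license grants every needed permission."""
    return [d["title"] for d in datasets if set(needed) <= d["license"]]


print(usable_for(DATASETS, {"reuse", "commercial"}))
```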

<JeniT> Sandro: 5-10 page note maybe?

<JeniT> Sandro: is this W3C says this or just the working group says this?

<JeniT> PhilA: be hard to have a recommendation for licenses

<JeniT> ... but a recommendation carries more weight

<JeniT> ... how would you include two independent implementations?

<JeniT> Sandro: two governments that follow the practices

<JeniT> ... might make sense to have it as one of several points within a recommendation

<JeniT> ... need the WG to work out what granularity of documents they want

<JeniT> Sandro: 2.3 Community Survey

<JeniT> ... self-sustaining database of vendors

<JeniT> PhilA: would this include apps that use the data?

<JeniT> Sandro: wasn't thinking so but data consuming systems would be good

<JeniT> ... the hardest part is to make it self-sustaining

<JeniT> FabGandon: only example that comes to mind is Semantic Web Tool Wiki page

<JeniT> ... but you're talking about a real database

<JeniT> Sandro: it could be a wiki page, but there are some people who aren't happy with that

<JeniT> ... would give WG freedom to decide how to do it

<JeniT> PhilA: why do you care that this gets done?

<sandro> sandro: it's more important that this is done than that it be a demo.

<JeniT> PhilA: about the whole government linked data thing

<JeniT> Sandro: got a very enthusiastic yes from the AC

<JeniT> PhilA: building community is very important

<JeniT> ... how far does it go?

<JeniT> ... it's hard to keep it coherent and up to date

<JeniT> ... high hurdle for WGs

<JeniT> Sandro: these lists tend to atrophy

<JeniT> FabGandon: only successful example is this wiki page, because it survived the group that started it

<JeniT> Sandro: even if it doesn't survive the group, the list working for a year or two would be very useful

<JeniT> PhilA: certainly as the group is going

<sandro> jeni: make it be a resource for the WG as it's running.

<JeniT> Sandro: would hope that it could aim to be potentially self-sustaining

<sandro> jeni: It should be a success just to have it run during the life of the WG.

<JeniT> PhilA: would hope that at the end someone would want to pick it up and continue with it, but it would not be a failure of the WG if that didn't happen

<JeniT> Sandro: maybe the mediawiki solution is good enough in that case

<JeniT> ... fairly dogfoody, even if RDF is not very linked data

<JeniT> ... helps us make sure that we know who to ping to try to get public review of our specs

<JeniT> ... and is useful to the communities

<JeniT> Sandro: 2.4 Cookbook or Storybook

<JeniT> FabGandon: yes, scenarios and success stories

<JeniT> ... when I talk to people in public sector, as a researcher they think everything I say is science fiction

<JeniT> ... I want a place to point them

<JeniT> PhilA: would that be the equivalent of a use cases document?

<JeniT> FabGandon: use cases aren't always implemented, scenarios are things that are already deployed

<JeniT> ... using UK a lot for this

<JeniT> PhilA: but this would be early input to the group

<JeniT> FabGandon: making them visible in a document gives me something to point to

<JeniT> ... there are best practices

<JeniT> Sandro: use cases tend to abstract from scenarios

<JeniT> FabGandon: GRDDL use cases were a fiction

<JeniT> PhilA: I'm expecting WG to come up with best practices and recommendations

<JeniT> ... need to have scenarios as input for that

<JeniT> ... same function as use cases

<JeniT> Sandro: a product of WG is to have gathered a collection

<JeniT> ... could be written by people associated with scenarios, if we can get them to do it

<JeniT> ... not sure about stories about failures

<JeniT> PhilA: having stories about failure is really useful

<JeniT> ... being able to talk about failures in a constructive way

<JeniT> Sandro: may be hard to do that in published writing

<JeniT> ... but worth a try

<JeniT> Sandro: 3.1 Provenance

<JeniT> ... an incubator group has been running for a year

<JeniT> ... its final report is going to recommend a WG

<JeniT> ... suspect that there will be one in the next 6 months

<JeniT> ... this group interacting with that group would be useful

<JeniT> Sandro: 3.2 Named Graphs

<JeniT> ... similarly, this interacts with provenance

<JeniT> Sandro: 3.3 POI WG

<JeniT> ... not sure how much government geography is addressed by this

<JeniT> ... think it's just going to be lat/long + polygons

<JeniT> PhilA: I ran workshop that led to POI WG

<JeniT> ... going to be struggle to get them to acknowledge linked data exists

<JeniT> ... one guy from DERI trying to get them to think about it

<JeniT> ... augmented reality main group

<JeniT> ... will need active steering to ensure liaison

<JeniT> Sandro: need a person in both groups

<JeniT> ... I was being optimistic about RDF vocabulary

<JeniT> PhilA: yes, very

<JeniT> ... as interested in moving objects as static

<JeniT> ... and motion in relative direction

<JeniT> Sandro: in worst case, someone could take formal model and map to RDF

<JeniT> Sandro: probably other liaisons I've forgotten

<JeniT> ... SPARQL?

<JeniT> ... don't know exactly what dependency looks like

<JeniT> ... are there any outside of W3C?

<JeniT> ... organisations doing something close to GLD?

<JeniT> PhilA: need people from data.gov from different countries

<JeniT> Sandro: hoping that they get involved in the working group

<JeniT> ... thinking about peer organisations

<JeniT> ... normally have standards, vendors & other standards bodies

<JeniT> FabGandon: wonder if relying on local offices to synchronise locally

<JeniT> ... W3C office in Paris will be good point of synchronisation

<JeniT> ... of communicating, diffusing, making sure right people are aware

<JeniT> PhilA: not just national governments

<JeniT> ... colleague talking to Helsinki, Berlin, city authorities

<JeniT> ... not just national governments, but local ones as well

<JeniT> Sandro: check with OASIS and OMG and usual suspects

<JeniT> Thomas: INSPIRE and OGC?

<JeniT> ... they are doing something not so different, but with URNs and XML

<JeniT> ... someone would have to write a technical spec for RDF

<JeniT> Sandro: is there funding available if someone has the skills to do it?

<JeniT> Thomas: it's an EU directive, and each government has people who are working on it

<JeniT> Sandro: seems like the kind of thing that a university might do

<JeniT> Sandro: is it a good model that anyone else might be interested in?

Jeni: Stuart Williams is working with the UK end of INSPIRE to do some mapping of the object models into RDF

<JeniT> Thomas: harmonising on what each member should provide on each topic

<JeniT> ... they have a dozen themes

<JeniT> ... mandatory data items on each theme

Stuart Williams, formerly of HP, TAG member, now at Epimorphics, Bristol-based Sem Web consultancy

<JeniT> ... we shouldn't care about domain-specific things

<JeniT> ... we could get a huge mass of more data if we mapped into RDF

<JeniT> ... get a lot of benefits from organisational power of INSPIRE

<JeniT> PhilA: the one bit of data that sticks in my head

<JeniT> ... is target for implementation is 2018

<JeniT> ... so don't want to depend on INSPIRE

<JeniT> ... this group would inspire INSPIRE

<JeniT> ... W3C is known to be slow, but we're faster than that!

<sandro> OGC

<JeniT> Thomas: there are many agencies publishing data using OGC services

<JeniT> ... maybe better to talk about SDI

<sandro> spatial data infrastructure

<JeniT> ... they have a G (Global) SDI conference every year

<JeniT> ... have questions about how to publish this in RDF

<JeniT> ... all fragmentary contributions

<JeniT> ... would be a different level

<JeniT> ... they have a catalogue service web, like DCAT

<JeniT> Sandro: is OGC a reasonable way to interact with them?

<JeniT> ... they are W3C members

<JeniT> ... we might be able to get them to participate in a liaison capacity

<JeniT> Thomas: geoSPARQL is one of these topics

<JeniT> ... encoding of sensor observation services in RDF

<JeniT> ... these are ongoing activities

<JeniT> ... not specific for government, but INSPIRE is

<JeniT> Sandro: every nation has a lot of legal issues around geographical information

<JeniT> Thomas: this is one of the things, that you describe the data that you will sell

<JeniT> ... I used to talk about linked data

<JeniT> ... not talking about LOD any more, because we shouldn't exclude non-open data

<JeniT> FabGandon: And accessing the data from my company I have access to things on the intranet

<JeniT> Sandro: these are good pointers, but I'm not sure what it makes sense to do in this charter

<JeniT> ... my thought was that POI would take care of it, but I guess not

<JeniT> JeniT: feels like a rat hole

<MacTed> (I'm concerned every time I see a line like "I used to talk about linked data, but I don't talk about LOD anymore" because "Linked Data" is bigger than "LOD" ... so I hope you [Thomas] still talk about "Linked Data" which absolutely includes non-open data)

<JeniT> Sandro: we can make it in scope, out of scope, or get the WG to decide

<JeniT> FabGandon: think it's difficult to rule out geographic data in a government data charter

<JeniT> ... so many scenarios where you need geographical data

<JeniT> PhilA: some liaison would be useful

<JeniT> ... 'we will liaise with POI WG, and be aware of other work going on in this area, but not core duty of GLD WG to codify'

<JeniT> FabGandon: going to be the same with temporal data representation

<JeniT> ... want to say that 'this data is only valid for this financial year'

<JeniT> ... another rat hole

<sandro> PhilA: "We think this is important, and we'll liaise, but we won't develop a vocab for geo"

<JeniT> ... good part is that you don't have proprietary aspects

<JeniT> ... again needs liaison with people in time data

<JeniT> PhilA: this is relevant for POI, because important in crisis management

<JeniT> FabGandon: we have someone who may be involved in this aspect

<JeniT> Sandro: The next two groups were vocabulary and non-vocabulary technical items

<JeniT> ... I had some idea of doing vocabularies later, but let's proceed in order

<JeniT> ... TimBL at dinner last night said something...

<JeniT> ... I had always envisioned that W3C would write the vocabulary, document it and so on

<JeniT> ... but TimBL said that if foaf:name is what people should use, we can say in the W3C Recommendation that that's what people should use

<JeniT> ... but we could set a bar for what we mean for a 3rd party vocabulary

<JeniT> ... and if FOAF can get over that bar

<JeniT> ... then that's fine

<JeniT> PhilA: we wanted to use FOAF

<JeniT> ... and if DanBrickley goes under a bus, the server goes with him

<JeniT> ... (this is in POWDER)

<JeniT> ... got around it by using Dublin Core

<JeniT> ... we had conversations for ages about this, about how FOAF could become more stable

<JeniT> ... doesn't have an organisation behind it

<JeniT> ... could W3C manage it? no

<sandro> jeni: I think there are some important things here, around check boxes for what vocabs we will trust.

<sandro> ... lots of stuff around the org behind it, documented policy on change control, ... it would be useful to document these up front. THESE ARE THE THINGS WE EXPECT A GOOD VOCAB TO DO.

<JeniT> Sandro: going meta, aside from the terms that we recommend...

<JeniT> ... this is going to be useful for Governments as well

<JeniT> ... to help Governments to identify which vocabularies they can use

<JeniT> ... could be GLD or could come from somewhere else

<sandro> jeni: Wider LD cloud might not care so much about stability. Academic projects don't mind so much.

<sandro> fabien: France won't use schemas of the UK.

<JeniT> PhilA: going to be a problem all over

<JeniT> ... W3C isn't designed to manage vocabularies

<JeniT> FabGandon: scalability problems as well

<JeniT> ... only standardise what's domain independent

<JeniT> ... can standardise provenance

<JeniT> ... cannot standardise biology ontology

<JeniT> ... this changes things a little bit

<JeniT> ... here we're crossing that line a bit

<JeniT> Thomas: we don't have to standardise geographical vocabulary, just specifying serialisation

<JeniT> FabGandon: there could be a well-known XML vocabulary, just provide RDFS version

<JeniT> PhilA: I think purls provide the way out of this

<JeniT> ... if it can't be on w3.org

<JeniT> Sandro: I wouldn't say it can't be on w3.org

<JeniT> ... there's the organisation vocabulary

<JeniT> ... @der42 approached TimBL to host it

<JeniT> ... there's a maintenance headache that comes with that

<JeniT> ... this is something TimBL's been pushing a long time

<JeniT> ... I've been pushing this for a long time too

<JeniT> PhilA: the person to convince is Ted Guild

<JeniT> Sandro: vocabulary hosting in general is a huge issue for governments

<JeniT> FabGandon: more important than in any other domain

<JeniT> Sandro: I've been advocating that someone like IBM should get into the vocabulary hosting business

<JeniT> PhilA: same issue with Talis hosting it: we're a commercial company!

<JeniT> Sandro: so you get what you pay for

<JeniT> ... could pay a company to host it for a period of time

<JeniT> PhilA: we would host the stuff with a purl pointing to it

<JeniT> ... the purl points somewhere else if Talis goes under a bus
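The purl indirection PhilA describes can be sketched roughly as follows. This is a minimal illustration only: the `PurlResolver` class, its methods and the example.org / example.com hosts are all hypothetical, and a real PURL service would answer with HTTP redirects rather than return strings.

```python
class PurlResolver:
    """Toy redirect table: persistent prefix -> prefix of the current host (hypothetical)."""

    def __init__(self):
        self._targets = {}

    def register(self, purl_prefix, target_prefix):
        # Re-registering the same persistent prefix repoints it at a new host.
        self._targets[purl_prefix] = target_prefix

    def resolve(self, uri):
        # Use the longest registered prefix that matches the requested URI.
        matches = [p for p in self._targets if uri.startswith(p)]
        if not matches:
            return None
        prefix = max(matches, key=len)
        return self._targets[prefix] + uri[len(prefix):]


resolver = PurlResolver()
resolver.register("http://purl.example.org/vocab/", "http://host-a.example.com/v/")
before = resolver.resolve("http://purl.example.org/vocab/org#Organization")

# If the first host disappears, only the table entry changes; published URIs stay the same.
resolver.register("http://purl.example.org/vocab/", "http://host-b.example.com/v/")
after = resolver.resolve("http://purl.example.org/vocab/org#Organization")
```

The point of the design is that data keeps citing the persistent URI, so a change of hosting company never requires rewriting published data.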

<JeniT> Sandro: I would say domain name per vocabulary

<JeniT> ... foaf.org rather than xmlns whatever it is

<JeniT> ... that gives the most flexibility

<JeniT> PhilA: govvocabulary.org/2010 or whatever

<JeniT> Sandro: but then you bind together several vocabularies in one organisation

<JeniT> ... if they are controlled by different people then you don't want them on the same domain name

<JeniT> PhilA: it's an issue because of neutrality

<JeniT> ... FOAF is a good example

<JeniT> Sandro: were you serious, Fabien, when you said that France wouldn't use any UK vocabularies?

<JeniT> FabGandon: I haven't checked, I know the reaction about hosting the data

<JeniT> ... wouldn't be surprised if French objected

<JeniT> ... issue with internationalisation as well

<JeniT> Sandro: would hope that any vocabulary provider would accept translations

<JeniT> PhilA: but who guarantees translation is accurate

<JeniT> FabGandon: in EU, have whole process of maintaining translation of different documents

<JeniT> PhilA: if you had a vocabulary that had anything but a .com, .org ending...

<JeniT> ... no way Americans would accept that

<JeniT> Sandro: end up using .com, .org or .net for the vocabularies

<JeniT> Sandro: 4.1 Metadata for Data Catalogs

<JeniT> ... no brainer that we want to move along DCAT in this group

<JeniT> ... had an interest group telecon with @cygri

<JeniT> ... wanted to spin off taskforce to do it

<JeniT> ... had large group that quickly dwindled

<JeniT> ... stopped entirely when Semtech came around, and didn't start up again

<JeniT> ... lots of interest there

<JeniT> ... big question is does it end up as WG Note, as a Recommendation, as a pointer to something else?

<JeniT> FabGandon: how specific is it to eGov?

<JeniT> Sandro: right now taskforce in eGov IG

<JeniT> ... in doing taskforce charter

<JeniT> ... said clearly applicable beyond government

<JeniT> ... but let's take narrower scope for now

<JeniT> ... can see that it could be broader

<JeniT> FabGandon: it could even be a task of the new RDF WG

<JeniT> Sandro: I think it's too late to go there now

<JeniT> ... or in the provenance WG

<JeniT> ... someone asked what's the difference between provenance and DCAT

Looking at http://vocab.deri.ie/dcat

<JeniT> PhilA: one thing that is missing is refresh rate

<JeniT> JeniT: think that's part of VoiD

<JeniT> PhilA: ah right

<sandro> http://www.w3.org/egov/wiki/Data_Catalog_Vocabulary

<JeniT> ... Alex Tucker has done RDF dump of CKAN data

<JeniT> Sandro: Wiki page includes use cases, deliverables, minutes and participants: 28 participants

<JeniT> ... huge amount of interest

<JeniT> ... reminded that Thomas was listening

<JeniT> Thomas: got a little bit bored...

<JeniT> ... did so much work on data catalogs in Germany...

<JeniT> ... ended up disappointing because no one used it

<JeniT> ... idea of having one data catalog as an access point is not a priority

<JeniT> ... in linked data domain discovery is following links

<JeniT> ... not looking at catalogs

<JeniT> ... it's OK, we need it, but...

<JeniT> Sandro: you don't need 28 people to design a vocabulary

<JeniT> ... you want 3 people to do the work, and wide review

<JeniT> ... you don't want big telecons with everyone who cares

<JeniT> ... in general that's going to be true

<JeniT> ... sometimes there will be issues that you want discussion for, but a lot is design by a small group

<JeniT> PhilA: I still think in terms of best practice document

<JeniT> ... say 'use DCAT and VoiD to describe your catalog'

<JeniT> Thomas: how is VoiD involved?

<JeniT> Sandro: DCAT can be for non-RDF data, VoiD for RDF data

<sandro> jeni: Some vocab (dcat) is about EVERY data set, and then some other vocabs are for certain kinds of data (eg geo about geo data, void about RDF data)

<sandro> PhilA: Neither void nor dcat covers refresh rate.

<JeniT> Sandro: I've never heard anyone assess quality or suitability of VoiD

<JeniT> ... only game in town

<JeniT> ... if we're going to recommend a vocabulary, in a recommendation

<JeniT> ... then we need implementation experience

<JeniT> ... which includes going through to consumers

<JeniT> Thomas: VoiD has been designed without DCAT in mind

<JeniT> ... so didn't care about separation of concerns

<JeniT> ... I think someone has to make a new version of VoiD, to fit in

<JeniT> Sandro: we could ask @cygri whether he thinks a new version of VoiD is needed

<JeniT> ... another thing on DCAT is I don't know how it relates to CKAN

<JeniT> ... I don't know how happy CKAN were with it

<JeniT> ... another force in play is the Sunlight Foundation in the US

<JeniT> ... they have done a national data catalog that combines Federal, State and Local levels

<JeniT> JeniT: do you need input about what to put in the charter?

<JeniT> Sandro: I feel we should say a W3C Recommended vocabulary along the lines of DCAT

<JeniT> PhilA: so the group would create and maintain the vocabulary

<JeniT> Sandro: I think DCAT should enable multiple catalogs, for a decentralised system

<JeniT> ... each data source should describe itself using DCAT

<JeniT> JeniT: there's the set of terms (Dublin Core + DCAT + VoiD etc) and the namespace for DCAT

<JeniT> FabGandon: when you look at FOAF, FOAFomatic really helped encourage its use

<JeniT> Sandro: OKFN has a form where they're asking people to fill out questionnaire about their government data

<JeniT> ... be nice if it gave back RDF

<JeniT> PhilA: keen to do outreach as well

<JeniT> ... ideally as part of this working group

<JeniT> ... important part of the implementation

<JeniT> Sandro: OK, add under Procurement Assistance

<JeniT> BREAK TIME UNTIL 16:00

<FabGandon> scribe: FabGandon

resuming on vocabularies.

JeniT: what are the next stages?

Sandro: next stage is identify what can be done within the WG charter/timespan/force
... avoid shooting for too little or too much.
... identify what can be done in other TF / WG.
... for vocs we could work on the basis of having an identified editor for each voc.

<sandro> sandro: maybe the vocabs will each be time-permitting / nice-to-have

<JeniT> http://www.epimorphics.com/public/vocabulary/org.html

JeniT: Organization Ontology

<sandro> JeniT: foaf and vcard exist

JeniT: Dave Reynolds put that together because nothing was putting together what we needed about Org.
... so we took that and extended that for UK gov.

Sandro: this is reusable in other organizations.

PhilA: very UK centric.

Sandro: this should be blessed by W3C for others to use

PhilA: an Org.org schema :-)

<martin> In Spain, we use it, and it was OK for our purpose (city council and departments)

JeniT: change event is used to capture a change in an Organization, it is a hook

<sandro> JeniT: changeEvent hook for saying org1+org2 => org3
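As a rough illustration of the changeEvent hook described above (org1+org2 => org3), the sketch below models change events in plain Python and follows them forward to find what an organization has become. The `ChangeEvent` class, its field names and `current_successors` are hypothetical and are not terms taken from the Organization Ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical record of one restructuring: some organizations in, some out."""
    label: str
    originals: tuple
    results: tuple


def current_successors(org, events):
    """Follow change events forward until reaching organizations no event has replaced."""
    frontier, seen, current = [org], set(), set()
    while frontier:
        candidate = frontier.pop()
        if candidate in seen:
            continue
        seen.add(candidate)
        successors = [r for e in events if candidate in e.originals for r in e.results]
        if successors:
            frontier.extend(successors)
        else:
            current.add(candidate)
    return current


events = [
    ChangeEvent("merger", ("org1", "org2"), ("org3",)),
    ChangeEvent("rename", ("org3",), ("org4",)),
]
```

With these two events, asking for the successors of `org1` walks through the merger and the rename, which is the kind of tracking of structures, names and acronyms mentioned in the discussion.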

Vagner-br: very useful to follow changes in structures and names, acronyms, etc.

PhilA: does your national library archive web sites?

<JeniT> FabGandon: In France, we have law that says we must archive every French official media channel

<JeniT> ... and we don't know how to do that

Sandro: question of ontology engineering process and the way to go for a new voc.

tban: I wouldn't use UML, this is not object-oriented work
... I use TopBraid composer
... nice figures.
... Richard came up with SDMX but not enough sem. web oriented.

JeniT: we work with Richard on that because SDMX is important in the statistician community

<sandro> jeni: ONS used SDMX already, so it was opportunistic for us to use it.

JeniT: SDMX is hard but may be necessary.

Sandro: we haven't solved the evolution story of how we move from a voc to the next.

JeniT: also hard to know when a voc is stable enough to be really used.
... check list of what you expect from a voc.

<sandro> jeni: checklist item: have documentation which is good, have ref guide, examples, etc

JeniT: e.g. it must have ref guide, examples, managed by an org with a longevity, etc.

PhilA: for FOAF for instance the longevity of the domain is a problem.

<JeniT> FabGandon: reading through the minutes yesterday, there's a good thing happening in eGov in that we have very stable bodies involved

<JeniT> ... INRIA is a government institute

<JeniT> ... so we have hosting that is very stable

<JeniT> ... people believe we will continue to exist

<JeniT> ... won't want to use a namespace hosted by the UK

<JeniT> ... but one hosted by a government would have longevity

<JeniT> ... We tried several things, including knowledge engineering approach

<JeniT> ... tried VoCamp approach, where people come with a need for a vocabulary

<JeniT> ... break up in small groups and hack

<JeniT> ... some of these were successful

<sandro> FabGandon: We tried Knowledge Engineering - limits, VoCamp fairly successful, ...

<JeniT> ... depends on scope of vocabulary

Sandro: this is a question for the chairs and the group.
... any other org ontology.

JeniT: there is a blog post from Dave

<sandro> sandro: I'll just link to DER's blog post, with its references

<JeniT> http://www.epimorphics.com/web/wiki/organization-ontology-requirements

<sandro> tb: what about sameAs inflation?

tban: the inflation of sameAs, and misuse of sameAs.
... I wouldn't use sameAs, but what else?

<sandro> tb: mapping vocab like skos but without inferring it's a skos concept.

<JeniT> FabGandon: subClassOf subPropertyOf also used in alignment

tban: provide a mapping voc with only properties and no classes to avoid inferences
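The concern about unwanted inferences can be illustrated with a toy RDFS domain rule. This is only a sketch: the `ex:` terms are hypothetical, triples are plain Python tuples, and only the single rule "if P has domain C and S P O, then S is a C" is applied.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_types(triples):
    """Apply the RDFS domain rule once and return only the newly inferred type triples."""
    domains = [(s, o) for s, p, o in triples if p == RDFS_DOMAIN]
    inferred = set()
    for s, p, o in triples:
        for prop, cls in domains:
            if p == prop:
                inferred.add((s, RDF_TYPE, cls))
    return inferred


# Hypothetical schema: one mapping property declares a domain, the other does not.
schema = {("ex:conceptMatch", RDFS_DOMAIN, "ex:Concept")}

with_domain = schema | {("ex:school1", "ex:conceptMatch", "ex:school2")}
without_domain = schema | {("ex:school1", "ex:plainMatch", "ex:school2")}
```

Linking two schools with the domain-bearing property makes a reasoner conclude that the school is an `ex:Concept`, while the property without a declared domain yields no such conclusion.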

<sandro> sandro: bad sameAs is just bad data

<sandro> FabGandon: in datalift, we are thinking about how to do mapping, from sameAs onto procedural declaration.

<sandro> FabGandon: okkam huge eu project on this -- efficient sameAs resolution for semweb. give uri, it gives back ones which might be equivalent.

<sandro> FabGandon: (let's stay away from this...)

http://www.okkam.org/

tban: when we try to link e.g GEMET and German Thesaurus we need the same in SKOS without domain and range.

<sandro> jeni: a school is not a skos:Concept according to the SKOS spec

<sandro> sandro: skos is just broken. :-(

JeniT: same name for a local authority vs. the area

Sandro: you need to formalize properly.

tban: we should include the problems about alignment to be discussed in the charter

JeniT: if RDF 1.1 doesn't want to do it we have to come up with a convincing scenario

Sandro: the key thing for people is to see if we can stabilize FOAF.

<sandro> jeni: important to understand how foaf works with vcard

tban: what about foaf+ssl?

JeniT: I wondered if we should include something about identity in the eGov WG.

<sandro> 4.4 Statistical/Data Cube Datasets

sandro: statistical, so far there is a sub-set of SDMX

<sandro> sandro: I'm hearing there's a subset of SDMX, cube, that's pretty good.

<JeniT> http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/cube.html

<sandro> PhilA: It's good for describing what you see in CSVs.

PhilA: the cube ontology is good to describe the sort of data you find in CSV file.

JeniT: Cube comes from the hypercube structure of the data.
... an observation is a cell in the cube
... each dataset is described by a dataset def
... for statistical data, payment data, etc. anything you put in a Spreadsheet
... we use it a lot
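To make the "an observation is a cell in the cube" idea concrete, here is a minimal sketch that turns spreadsheet-style CSV rows into observation triples. It assumes the `qb:` prefix stands for `http://purl.org/linked-data/cube#`; the `http://example.org/` URIs, the column names and the function names are hypothetical, and values are emitted as plain string literals for simplicity.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB = "http://purl.org/linked-data/cube#"  # assumed Data Cube namespace
EX = "http://example.org/def/"            # hypothetical namespace for columns


def literal(value):
    """Quote a value as a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def csv_to_observations(csv_text, dataset_uri, dimension_cols, measure_col):
    """Emit one observation per CSV row as N-Triples-style lines."""
    lines = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        obs = f"<{dataset_uri}/obs/{i}>"
        lines.append(f"{obs} <{RDF_TYPE}> <{QB}Observation> .")
        lines.append(f"{obs} <{QB}dataSet> <{dataset_uri}> .")
        for col in dimension_cols:  # the dimensions locate the cell in the cube
            lines.append(f"{obs} <{EX}{col}> {literal(row[col])} .")
        lines.append(f"{obs} <{EX}{measure_col}> {literal(row[measure_col])} .")
    return lines


sample = "area,year,payments\nBristol,2009,120\nLeeds,2009,95\n"
triples = csv_to_observations(
    sample, "http://example.org/data/payments", ["area", "year"], "payments"
)
```

Each row yields a typed observation, a link to its dataset, one triple per dimension and one for the measured value; a fuller treatment would also publish the dataset definition that declares those dimensions and measures.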

sandro: how can we be sure this meets most needs?

<sandro> sandro: if we make this a Rec, who might object? Among people who buy into SDMX & RDF already....

<sandro> PhilA: Statisticians might find this reduces too much.

Sandro: if we need more of SDMX can we extend it?

<sandro> Jeni: that was the goal, yes.

JeniT: yes it was designed to be extended

tban: we use it for measurement data

JeniT: we wanted to publish statistics for a larger audience than the statistician community

sandro: if we want to change these schema, how do we do that? what would be the process?

JeniT: feel free to take it !

sandro: it is rare that somebody does this kind of work and does follow it as an editor of the Rec.
... Data Cube seems important.

<sandro> 4.5 Data Quality, Timeliness, Status

JeniT: I am sure that voiD has something about temporal validity

<sandro> jeni: we use dc:temporal for expressing the temporal range for which the data is true

JeniT: we have our own small voc for that

<sandro> jeni: we use our own data.gov.uk for draft-ness

JeniT: nothing on data quality at the moment

PhilA: can't find this in voiD

JeniT: it must be in RSS then

<sandro> PhilA: Who is responsible for cleaning it up? Who will update it, and when?

PhilA: need to know if the data I am using now will be here tomorrow

<sandro> PhilA: Often the data comes from screen-scraping!

PhilA: need to know how often data updated

<sandro> FabGandon: This is in Provenance -- an expiration

JeniT: this is new work probably

<sandro> JeniT: I think this is new work, much less baked than data cube

sandro: the WG could provide such voc.

JeniT: it fits under dcat

<sandro> jeni: This goes under dcat -- it applies to data sets.

<sandro> FabGandon: Granularity might be small -- some bit of the data changes often, some bit doesn't.

<sandro> FabGandon: this might not be about the dataset, it might be about one subgraph within the dataset.

<sandro> tb: In the Gazetteer, when we have changes in communities, merging, the official service just drops the old communities. We don't drop them, we mark them expired.

<sandro> tb: dcat should describe your policies about such things.

tban: the policy should be also described on the dcat level.
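The practice of marking merged communities as expired rather than dropping them could look roughly like the sketch below. The `Community` record, its field names and the two helper functions are hypothetical and only illustrate the policy, not any actual gazetteer schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Community:
    """Hypothetical gazetteer entry with an explicit validity period."""
    name: str
    valid_from: date
    expired_on: Optional[date] = None
    replaced_by: Optional[str] = None


def merge(register, old_names, new_name, when):
    """Record a merger: keep the old entries, mark them expired, add the new one."""
    for name in old_names:
        entry = register[name]
        entry.expired_on = when
        entry.replaced_by = new_name
    register[new_name] = Community(new_name, valid_from=when)


def is_current(entry, on):
    """True if the entry is valid on the given day."""
    return entry.valid_from <= on and (entry.expired_on is None or on < entry.expired_on)


register = {
    "A": Community("A", date(1990, 1, 1)),
    "B": Community("B", date(1990, 1, 1)),
}
merge(register, ["A", "B"], "C", date(2010, 1, 1))
```

Because the old entries survive, data that still refers to community A remains resolvable, and a consumer can see both when it expired and what replaced it.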

sandro: the granularity problem might be more general with dcat and dataset.
... granularity can be a political game.

<sandro> sandro: so if dcat can handle the gran. then this can be folded in.

<sandro> 4.6 Assumptions/Basis/Comparability of Data

JeniT: we need to know if we can compare two values.

<sandro> jeni: In statistical data they really care if you can compare two values, because defn of some bit in your data changed.

JeniT: e.g. after a policy change.

<sandro> JeniT: annotate a qb:observation to say this is not comparable, etc.

<sandro> JeniT: Vocab for classifying these kinds of annotations

PhilA: we have a 10 month data vs. an 11 month data

tban: different methods in different countries.

<sandro> tb: lining maps up between country, INSPIRE Harmonization effort.

<sandro> FabGandon: The notion of an unemployed person in France is totally different than in some other countries -- not comparable.

JeniT: encourage people to use different terms when they use different notions

<sandro> JeniT: Sometimes you just mean datafr:unemployment has a different URI than datauk:unemployment

JeniT: there may be some matches but when we use the same URI it IS the same thing

<sandro> JeniT: this is more about same vocab, same dimension, ... this is to annotate where it's different.

JeniT: at least we should be able to say "this is a statement about comparability".

<sandro> JeniT: This is for categories of ways to annotate observations.

tban: using different URIs is different from using different terms.

sandro: no candidate voc on that right now?

<JeniT> http://sdmx.org/wp-content/uploads/2009/01/01_sdmx_cog_annex_1_cdc_2009.pdf

JeniT: some of the SDMX voc may be relevant
... Dave has mapped those onto a voc which could be a candidate

<JeniT> http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/vocab/sdmx-concept.ttl

<sandro> 4.7 Describing Visualization and Presentation

<sandro> fresnel

http://www.w3.org/2005/04/fresnel-info/

<sandro> sandro: not hearing a lot of interest/experience on this one.

<sandro> FabGandon: Fresnel has a huge potential

sandro: design pattern for URIs

<sandro> 5.1 Design Patterns for URIs

JeniT: updated version:

<JeniT> http://data.gov.uk/resources/uris

JeniT: it takes a different kind of angle.

sandro: huge design space, how much we want to expand or focus the design space
... should we give all the options or prescribe some good practices?

<sandro> PhilA: use of id, 303 to doc, SHOULD be in LD

<PhilA> I mean - the pattern http://reference.data.gov.uk/id/department/co breaks down as

<sandro> JeniT: saying do 4.2 from coolURIs

<PhilA> {sector}.data.gov.uk/id/{department}/unique_identifier

JeniT: sometimes the pattern does work well

<PhilA> If you dereference that, the /id/ gets replaced by /doc/ as part of the HTTP 303 (see other) response, and that leads to a document that describes the original identified thing
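The /id/ to /doc/ redirection PhilA describes can be sketched as a small function. This is an illustration under assumptions: the function name `see_other_for` is hypothetical, it only computes the response a server might give, and it treats the first path segment as the `id` marker as in the example URI above.

```python
from urllib.parse import urlsplit, urlunsplit


def see_other_for(id_uri):
    """Return (303, document URI) for an /id/ URI, or None for any other URI."""
    parts = urlsplit(id_uri)
    segments = parts.path.split("/")  # "/id/department/co" -> ["", "id", "department", "co"]
    if len(segments) < 2 or segments[1] != "id":
        return None
    segments[1] = "doc"  # the identifier of the thing becomes the document about it
    location = urlunsplit(
        (parts.scheme, parts.netloc, "/".join(segments), parts.query, "")
    )
    return 303, location


response = see_other_for("http://reference.data.gov.uk/id/department/co")
```

The client then fetches the returned location to get a document describing the department, while the original /id/ URI continues to name the department itself.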

<sandro> JeniT: Although that pattern works really well in some circumstances, it doesn't for others. eg for the people in the org structures, we don't have a good URI pattern. so we end up using hash URIs in the datasets, thinking they might be linked up later.

JeniT: we used # URIs depending on the dataset.

<sandro> JeniT: just using pattern 4.2 doesn't always work well.

JeniT: not simple to just say use that pattern.

<sandro> FabGandon: need keys :-)

<sandro> sandro: Just get everyone to mint URIs for themselves :-)

tban: the original URL of TimBL also described what you should not do.

<sandro> tb: '98 cool uris, don't put classifications/datatypes into URI, or other things that would make them change.

<sandro> FabGandon: Don't forget there are scenarios where you want to do the opposite -- to anonymise people.

<sandro> tb: I've come to prefer totally opaque URIs.

tban: generally I prefer URI that don't tell anything by themselves

sandro: what should we do?

<JeniT> FabGandon: it could be 'follow the guidelines of the LOD group'

sandro: one output could be follow 4.2

<sandro> sandro: maybe a flowchart, even!

<sandro> JeniT: I found we needed design patterns not just for schools, but also for vocabs, concept schemes, datasets.

JeniT: we also need design patterns for URIs for schemas

sandro: versioning of dataset crosses with the temporal point before.

<sandro> sandro: should I fold this into designing-URI, or timeliness vocab ?

tban: what does versioning mean here, e.g. statistical data changes every year

<sandro> tb: Every year has year more --- discussion of versioning.

<sandro> tb: versioning of vocab, too.

<sandro> Jeni: how you design URIs, how you design the data....

<sandro> 5.3 Change Propagation and Notification

<sandro> dady -- dataset dynamic

<sandro> I think of this as protocol,

<sandro> FabGandon: RSS feed of changes -- Talis changeset vocab

<sandro> JeniT: Sparql push

<JeniT> http://code.google.com/p/sparqlpush/

<JeniT> http://esw.w3.org/DatasetDynamics

<sandro> seems out of scope

<sandro> JeniT: we need to do it anyway

<sandro> JeniT: (we = data.gov.uk)

PhilA: also about SPARQL Push

JeniT: we need that for data that we are publishing every week

<sandro> JeniT: We'll see data published on a weekly basis, so we need

JeniT: we need a design pattern for that

sandro: just publishing the new data is not enough?

JeniT: no
... links back to the named graphs.

<sandro> FabGandon: It's too big for this....

<sandro> 5.4 Distributed Query

sandro: too big to be handled here.

<sandro> same as above -- needs to be done, too big for us.

SPARQL 1.1 has some elements of answer.

<sandro> JeniT: Maybe it goes into procurement guidelines, eg Sparql 1.1 service descriptions suitable for this

<sandro> 5.5 Developer-Friendly API and Serialization

<sandro> linked-data api

sandro: JSON syntax for RDF should be part of the charter of RDF 1.1

<sandro> PhilA: should be relatively easy to get out the door

<sandro> JeniT: Yes, 3 impls, could be fast, but does need wider review -- eg for implementations.

PhilA: it is manageable and we should pursue this

<sandro> PhilA: this is really important, and doable.

PhilA: important in terms of deployment

sandro: will still exist even if we don't do anything within W3C

PhilA: from a visibility point of view this is important

<sandro> sandro: I'm worried about arbitrary decisions in the design coming back to be a problem in the WG

<sandro> JeniT: the JSON might be a problem.

<sandro> JeniT: I think we're a lot of the way there, but leaning towards its own WG.

PhilA: need to talk about outreach

<sandro> 2.5 Outreach

PhilA: it needs to happen somehow

<sandro> PhilA: somehow this has to happen, perhaps via EU funding

PhilA: some way to distribute the output of the group among the governments

sandro: counter argument: the focus of the WG is the how not the why.
... the demos of the "how" will make the job of the people doing the "why" easier

<sandro> robin: In general, WGs are pretty bad at selling their own stuff, being so involved in the technical work.

<sandro> ... people who were writing great blogs went silent when they joined the WG.

robin: maybe outreach should happen outside the WG

sandro: could still be included in the charter.

<sandro> PhilA: marketing is important in making markets

sandro: I don't have any exact data about the number of members for the WG.

<sandro> JeniT: great value to have new folks in WG, so people experience having to explain this stuff
