08:00:56 RRSAgent has joined #egov
08:00:56 logging to http://www.w3.org/2010/11/02-egov-irc
08:01:13 RRSAgent, make logs public'
08:01:14 RRSAgent, make logs public
08:01:20 People in room, in order:
08:01:20 Karen Myers
08:01:20 Sandro Hawke
08:01:20 Jeni Tennison
08:01:20 Daniel Dardailler
08:01:21 Roger Cutler
08:01:23 Phil Archer, W3T + Talis
08:01:24 karen has joined #egov
08:01:27 Martín Álvarez, Fundación CTIC
08:01:29 Thomas Bandholtz Germany -- LD Environment Data
08:01:31 Antonio Sergio Cangiano, SERPRO
08:01:33 Jose Leocadio, SERPRO
08:01:35 Yosuke Funahashi, Tomo-Digi Corporation
08:01:37 bandholtz has joined #egov
08:01:37 Vagner Diniz
08:01:39 Karen Burns
08:03:16 PhilA has joined #egov
08:03:30 scribe: PhilA
08:03:35 scribeNick:PhilA
08:03:39 Daniel: Are we going to talk about the creation of a task force to look at education and outreach?
08:03:51 martin has joined #egov
08:03:55 DD: Raises issue of non-tech education & outreach
08:04:04 .. (as possible agenda item later)
08:04:33 Karen: 1.5 yrs ago we had an active task force
08:04:40 .. comm team etc. planning lots of stuff
08:04:59 .. then there was a shift in priorities, and Josema left. It was very effective when we were doing it
08:05:27 .. I have a personal interest. We have some support from our PR firm that has a knowledge base in this area.
08:05:37 .. but it's US-based. Need a more global view
08:06:03 DD: There is funding from the EU to help PSI
08:06:11 Karen: I like the idea of a TF.
08:06:31 .. if there's a need for an IG just looking at that, all well and good
08:06:40 DD: we could spin off other groups from the IG
08:07:08 Karen: We need high level messages - what is open data, what is Linked data etc.
08:07:17 PhilA: Talis is interested in this ;-)
08:07:29 Sandro: Others?
08:07:46 Interest in the room from New Zealand, Brazil and more
08:08:02 Karen B, Vagner
08:08:05 Vagner Diniz, Karen Burns
08:09:16 action: Daniel D to set up task Force on EO
08:09:17 Created ACTION-118 - D to set up task Force on EO [on Daniel Bennett - due 2010-11-09].
08:10:14 Topic: Linked data at data.gov.uk
08:10:25 Jeni: Takes the floor...
08:10:40 .. shows data.gov.uk website
08:10:50 s/website/Web site/g
08:10:59 ACTION-118 is really on Daniel Dardailler not Daniel Bennett.
08:11:03 (http://data.gov.uk)
08:11:31 Jeni: Most data is in CSV or XML. Some, but not much, LD
08:11:54 .. explains the term 'organogram' to mean org chart, organisational info etc.
08:12:26 .. an edict from gov said that all departments should publish their organograms on data.gov.uk, and it specified what info had to be included
08:12:38 .. about 62 on d.g.u now
08:13:04 .. majority published a PDF of their organisational structure
08:13:28 .. pretty pictures with tables that don't help a lot as there's no data to pull out
08:13:45 rrsagent, make logs public
08:14:10 Jeni: some of the org charts use headings defined centrally
08:14:16 .. some published as Power Point
08:14:27 .. senior post data includes reporting structures
08:14:32 rrsagent, draft minutes
08:14:32 I have made the request to generate http://www.w3.org/2010/11/02-egov-minutes.html PhilA
08:15:02 Jeni: shows a CSV file
08:15:24 .. talks about 'Gridworks', now renamed 'Google Refine'
08:15:45 .. see http://code.google.com/p/google-refine/
08:15:51 Roger C: That's cool!
08:15:56 Jeni: yes it is!
08:16:13 .. important tool for cleaning up data
08:16:23 .. something that non-specialists can use
08:16:37 .. Demos Google Refine
08:16:55 .. You can see the facets for a column, edit values that have gone wrong, edit column names
08:17:46 .. the key point is that civil servants can use this tool
08:18:05 .. we have gone round training a bunch of civil servants. Lots of good feedback. People have begun using it
08:18:32 .. extremely nice features around reconciling data around already published data
08:18:57 .. you can ask the tool to reconcile a column
08:19:11 FabGandon has joined #egov
08:19:15 gridworks "reconcile" to link to web data. nice!
08:19:17 .. turns strings into links (if it finds relevant data)
08:19:41 Jeni: You can do a bit of manual work to produce clean RDF without actually handling RDF (knowingly)
08:19:45 .. you can apply scripts
08:19:56 .. shows adding a column for, in this case, provenance
08:20:08 data.gov.uk tries to keep track of where we get data from
08:20:24 s/data.gov.uk tries to keep track of where we get data from/..data.gov.uk tries to keep track of where we get data from/
08:21:14 .. and if I open up a script (in this case, a bit of JSON). Paste that in and apply those instructions - it will perform various tasks, creating extra columns etc.
08:21:25 .. my script adds in lots of URIs in this case
08:21:34 .. (URIs central to linked data)
08:22:15 .. DERI has created a plug in that describes the data
08:22:40 .. now can export the data as turtle or RDF/XML
08:22:58 (Shows RDF generated from the CSV)
08:23:23 Jeni: You can see the different posts within 'BIS' (Department of Business Innovation and Skills)
08:23:37 .. we run several stores, mostly hosted by Talis
08:23:39 (I wonder if there's a way to simplify that script application process....)
08:23:53 .. they bring together data sets for, say, transport
08:23:58 .. then one on education and so on
08:24:06 .. organogram data is "reference data"
08:24:17 i.e. http://reference.data.gov.uk
08:24:49 (Shows SPARQL queries against BIS organogram data). Live. No safety net
08:25:09 Voila! Some results
08:25:38 Jeni: Most people, including developers, don't react well to being asked to write SPARQL queries
08:25:56 .. so we have added a layer on top of the SPARQL to provide a simpler API
08:26:06 .. I'll show you the basic Linked data API first of all
08:27:31 http://reference.data.gov.uk/id/department/bis
08:27:42 Shows how this gives a 303 to http://reference.data.gov.uk/doc/department/bis
08:27:56 demos exploring the data
08:28:21 q+
08:28:49 ack ka
08:28:56 Karen: This is fabulous
08:29:05 .. practical question - who is updating the data?
08:29:33 Jeni: the generic answer is "how long is a piece of string". Some data changes daily, some changes much less frequently
08:29:58 .. for organogram work, the stipulation was that data should be valid on 30/6/10 and should be updated every 6 months
08:30:08 Karen: How did the departments react?
08:30:22 Jeni: It was hard. It took a big stick from the Cabinet office to get it done
08:30:37 Jeni: Most departments have generated just a PDF or a Power Point
08:30:48 .. some generated a CSV (prob by HR with help from IT dept)
08:31:06 .. generation of RDF was done by me (x 6). One dept has done it themselves
08:31:16 Karen: And they can navigate this UI?
08:31:41 Jeni: This UI is designed to show them the benefit of doing it as LD. Showing that people can navigate around the data
08:31:49 .. you can see the different sources of the data
08:32:02 Sandro: Is everyone's salary info public by law?
08:32:15 Jeni: Top civil servants - although it's not by law, it's the culture
08:32:40 Roger: Who wants to do this and why?
08:33:01 Jeni: We have a strong developer community in the UK. They want to get hold of gov data, package it and so on
08:33:22 .. they usually want to pursue this for lobbying or political ends
08:33:45 Jeni: person X is claiming ABC on their expenses, is this right?
08:34:14 .. personally I don't find that the most interesting data that governments can put out but it is where the current political drive is in the UK
08:34:22 Karen: it is tangible though
08:34:34 Jeni: School performance is something that people can relate to as well
08:35:15 Jeni: completes demo
08:35:37 .. this helps people explore the data and find out where the data came from
08:36:34 exploring http://reference.data.gov.uk/doc/department/bis
08:36:40 .. shows XML data, or JSON data - the interface allows you to access the data in various formats. Just add .xml or .json to the URI. That's what the Linked data API is about
08:37:16 Jeni: So much for the data, available as an explorer and as data for developers. But it's not especially pretty
08:37:25 Jeni: so let's see if we can find a pretty output
08:37:39 Demos BIS organogram visualisation
08:38:12 http://danpaulsmith.com/gov/orgvis/?dept=bis
08:38:51 Sandro: Does it go all the way up to the prime minister?
08:39:17 sandro: if you want to get on this giant org chart, give us your RDF
08:39:24 Jeni: Not yet, but that would be cool, especially if it came out of all the different data sets created by different departments. The overall org chart comes out of its component parts - if you use linked data
08:39:45 Jeni: Getting to an end to end story like this has taken several months
08:40:04 .. we had to work out what URIs should look like for different departments. This is a department within a department, a unit etc.
08:40:28 .. we had to create some vocabularies for organisational structures generally and then specifically for UK
08:40:35 .. provenance data is very important
08:40:58 .. statistics around salary costs - needed a vocabulary for talking about statistical data
08:41:17 .. and those fundamnetla design choices etc. had to be done to support the kind of end to end story we've been looking at here
08:41:36 q+
08:41:40 Sandro: Those vocabularies sound like candidates for standardisation
08:42:32 s/fundamnetla/fundamental/
08:42:38 Roger: AIUI you started by defining standardised things. In the tool you had a reconciliation step that knew what to do with the data. So it must have been pretty close for automated reconciliation
08:42:44 .. what does that depend on?
08:43:21 Jeni: Gridworks takes the values that it finds in the column. Takes a sample, sends it to a reconciliation service - an API for this kind of thing
08:43:47 .. the Rec service looks at the values, looks at the data it has, and works out what it looks like and what vocab is appropriate
08:43:50 sandro: I've put those vocabularies, as best I understood, into the GLD WG Work Items list, http://www.w3.org/2001/sw/wiki/GLD_Work_Items
08:45:38 .. recognising "John Smith" as a name cf. "Smith, John" is something the reconciliation service does. Leigh Dodds (Talis) created a good example of this
08:46:08 See http://www.ldodds.com/blog/2010/08/gridworks-reconciliation-api-implementation/
08:46:48 Further discussion on how this works between Roger, Jeni and Sandro
08:47:11 Roger: How does a salary get recognised as a salary
08:47:17 Jeni: That's in the RDF schema
08:47:24 sandro: so reconciliation takes strings which are intended to be identifiers and turns them into proper URI identifiers
08:47:32 .. and we can use the CSV column name as the hook
08:47:49 Roger: so "salary" might match and "sal" won't
08:47:51 Jeni: yes
08:48:07 Roger: So there's a certain amount of fixing up of "user data"
08:48:08 q?
08:48:09 ack kar
08:48:24 karen: What is the UK gov policy towards the APIs?
08:49:26 jeni: The Linked data API is on Google Code and anyone can use it http://code.google.com/p/linked-data-api/
08:50:09 Talis implementation in PHP http://code.google.com/p/puelia-php/
08:50:39 Karen: Are you shouting about this?
08:50:42 Jeni: yes
08:51:06 Sandro: The Linked data API work was presented at various meetings
08:51:20 karen: so it needs more outreach
08:51:33 Roger: I'd like to hear more about how politicians talk about this?
08:51:41 Jeni: Gordon Brown got it and understood it
08:52:02 It's a possible item for GLD WG: http://www.w3.org/2001/sw/wiki/GLD_Work_Items#Developer-Friendly_API_and_Serialization
08:52:43 .. current government see it as part of the transparency agenda. Making data available in a machine readable format
08:59:45 Roger: Made the point that organisational capability, software support etc. is really important for commercial companies etc.
08:59:54 Vagner: Jeni talks about visualisation
09:00:09 .. has W3C done any work on visualisation?
09:00:20 Sandro: beyond CSS and SVG, I'm not aware of any
09:00:37 Vagner: You added this item on the WG. I wonder if it's something we need to discuss?
09:00:57 Sandro: That's coming out of the data.gov.uk work but I don't know any more detail
09:02:53 PhilA: I think an RDF-SVG link would be cool
09:03:21 Fabien: Talks about an existing effort to link CSS, SPARQL and more
09:03:45 Topic: Martin Alvarez, CTIC
09:04:08 Martin: Shows map drawn in SVG, lots of tables in RDF so we're already linking RDF and SVG
09:04:16 Sandro: What's the linkage?
09:04:25 Martin: I think it's Java
09:04:32 Martin: begins talk
09:04:42 http://www.cytoscape.org/ Cytoscape combines SPARQL CONSTRUCT with a graph interface to allow the user to select and render RDF data
09:05:09 s/Topic: Martin Alvarez, CTIC/Topic: Open data initiatives in Spain/
09:05:21 Martin: Open Data act 2007
09:05:30 JeniT has joined #egov
09:05:53 .. aim to create catalogue of open gov data
09:06:05 more precisely the S*QL plugin for cytoscape http://semtech2010.semanticuniverse.com/sessionPop.cfm?confid=42&proposalid=2932
09:06:26 .. 3 regional governments (Asturias, Basque and Catalonia)
09:06:50 Shows Asturias catalogue
09:06:58 Martin: only 4 data sets but all linked
09:07:12 John has joined #egov
09:07:36 .. uses things like the organisation vocab, iCal etc.
09:07:49 .. project is 100% linked data, hosted on our won Oracle triple store
09:08:15 s/won/own
09:08:16 .. we've added some Jena modules (and our own) to create SPARQL endpoints etc.
09:08:27 .. metadata modelled using VoID
09:08:41 .. HTML view generated dynamically
09:09:19 .. Basque country is similar but is focussed on raw data. They have some RDF links but they're static files to describe the data sets
09:09:30 .. more than 1,000 data sets in raw formats
09:09:37 .. useful info for citizens and industry
09:10:05 .. translation memories (Euskara -> Espanol etc.)
09:10:18 .. Catalonia is a new one
09:10:33 .. this will be similar to Basque country initiative
09:10:42 .. most data will be in raw formats (CSV, XML etc.)
09:10:47 .. they will provide some info in RDF
09:11:01 .. as well as RDF static files
09:11:15 .. we're also creating a catalogue using DCAT
09:11:33 .. See http://vocab.deri.ie/dcat
09:11:41 rrsagent, draft minutes
09:11:41 I have made the request to generate http://www.w3.org/2010/11/02-egov-minutes.html PhilA
09:11:44 S*QL plugin for Cytoscape presentation here https://connect.umms.med.umich.edu/p79605689/?launcher=false&fcsContent=true&pbMode=normal
09:12:12 Martin: they will present > 26K vCards in RDF, describing public centres, using linked data approach
09:12:59 .. as well as the regional governments we also have Zaragossa, Gijon and Barcelona as cities in the project
09:13:04 .. hope to have more info next month
09:13:18 s/Zaragossa/Saragossa/
09:13:38 .. Saragossa will be using linked data. Should be first city to adopt this
09:13:44 .. they're also using DCAT
09:14:06 .. Gijon is a simple project
09:14:26 .. adapting a CMS to provide RDF content representations in parallel by adding RDFa to pages
09:14:45 .. we conclude that most governments are interested in publishing many data sets quickly
09:14:54 .. they want to release the data 'now'
09:15:00 .. they want good headlines
09:15:18 .. they are neutral on idea of linked data
09:15:38 .. they 'know' that semantic modelling is hard and they don't want to spend more time and money on it
09:15:48 .. for us it's easy to create examples. It's not so easy for the developers
09:15:58 .. maybe we need more examples like the Linked data API to help
09:16:15 .. this would help to foster the use of the linked data info
09:17:05 Jeni: It is the case that making data useable and reusable is hard. Linked data is no harder. It's making the data clean that's hard.
09:17:19 jeni: Making data reusable is hard (not semantics, per se)
09:17:33 q+
09:17:33 Martin: we are trying to convince them using these examples. we gather their spreadsheets and show the linked data examples
09:17:47 Ibrahima__ has joined #egov
09:18:13 ack kar
09:18:28 Example of linked data representation http://datos.fundacionctic.org/sandbox/ineasturias/viviendas.do
09:18:31 Karen: Can you say a little more about the time and money. What levels of government are you working with?
09:19:12 Martin: The best example is Catalonia. They called us 15 days ago and said they wanted to have an open data site within a month. Can you help?
09:19:20 .. speed was major concern
09:19:33 .. get the data we have out there
09:20:09 .. they are convinced that linked data is a good solution, they want to follow it, but they prefer spending their resources in developing open data site, specifying the licence
09:20:17 .. LD comes later?
09:20:22 Karen: So who called you?
09:20:53 Martin: Not an IT person. More close to the citizenry
09:21:04 .. not sure of actual department
09:21:17 .. some are closer to the IT departments
09:21:23 Sandro: What was their motivation
09:21:32 Martin: They know about linked data because we told them about it
09:21:46 .. most of them haven't heard about LD before
09:22:05 .. they know open data initiatives, not linked data
09:22:14 .. we managed to convince them ;-)
09:22:39 Sandro: Any other questions?
09:22:46 .. then we'll take our break now
09:23:01 rrsagent, draft minutes
09:23:01 I have made the request to generate http://www.w3.org/2010/11/02-egov-minutes.html PhilA
09:23:36 http://www.w3.org/egov/wiki/TPAC_2010
09:25:11 John has joined #egov
09:30:24 John has joined #egov
09:42:46 John has joined #egov
09:55:54 bandholtz has joined #egov
09:58:43 Datalift: http://datalift.org
09:58:49 FabGandon has left #egov
10:06:03 topic: Fabien Gandon on France's DataLift Project
10:06:08 karen has joined #egov
10:06:53 fabien: I'm not speaking for all of France! This is just one accepted project. Accepted in June, kickoff was at end of September.
10:07:11 ... Not a lot to show yet, but now is a good time to give us feedback.
10:07:46 leocadio has joined #egov
10:07:49 ... Last year at ISWC we had a meetup, and discussed the lack of data.gov project in France
10:08:23 ... We considered a prototype in Talis; first question --- where will the data physically be stored? In the UK or France?
10:08:32 ... considered cloud in Europe
10:08:48 ... datalift == lifting from raw data to rdf in France
10:09:31 ... Atos Origin will be integrator, building an open source integrated platform. As side effect they'll be ready to offer services
10:09:50 ... Mondeca is a KR firm in Paris, doing industrial knowledge modeling
10:10:41 ... Academics: INRIA at Grenoble (aligning schemas), Eurecom, Lirmm (the guy from INRIA got promoted there).
10:10:59 ... I'll be using DataLift as a scenario for pushing Named Graphs
10:11:21 ... INSEE has all the national statistics for France, IGN has all the maps
10:11:47 ... Fing is new generation of tools - use cases and business models
10:12:03 ... Phase 1: an easy open end of data, open platform
10:12:28 ... not re-inventing wheel. We'll re-use existing solutions if they pass our benchmarks; all dev will be open source.
10:12:49 ... - Assist the selection of data
10:13:17 ... (every thing must be proven on INSEE and IGN data)
10:13:26 ... - identify appropriate schemas
10:13:39 q+ to ask about licensing/openness of IGN data
10:14:33 ack PhilA
10:14:33 PhilA, you wanted to ask about licensing/openness of IGN data
10:15:11 phil: glad to see IGN in there (we have Ordnance Survey, in UK, which does the mapping); OS wants money for some of the data.
10:15:59 fabien: IGN has to get half their budget from sales, so that is a concern
10:16:03 David has joined #egov
10:16:20 roger: RDF only, or OWL too?
10:16:34 fabien: We'll use OWL when the scenario calls for it
10:17:01 fabien: We are concerned about speed of reasoning, so we'll have to strike the balance
10:17:07 MoZ has joined #egov
10:17:34 sandro: you don't have to do the reasoning
10:17:49 fabien: it depends on the scenario
10:18:16 roger: Is this typical in eGov, to do this tradeoff?
10:18:57 fabien: Everyone needs to make this kind of tradeoff
10:19:06 jeni: we're mostly staying away from OWL
10:19:17 fabien: if we need some bit of OWL, then we can use it.
10:19:36 fabien: With Atos, we'll benchmark every solution and see what scales well enough.
10:20:10 roger: in HCLS, I saw a very elaborate authentication scheme, that was depending on just being in RDF [[S?]].
10:20:21 fabien: I heard of this in Freebase
10:20:52 emma has joined #egov
10:21:29 ... not even using RDFS because it was deemed too expensive.
10:21:42 ... - format conversion & connectors
10:21:47 I believe it was RDFS
10:22:00 ... (eg csv to rdf)
10:22:10 The system was called S3. It's pretty interesting.
10:22:46 ... - data publication itself (led by Atos)
10:22:59 ... - interconnecting data
10:23:28 .. eg URI for Paris connected to other URIs for Paris
10:23:33 ... (or re-used)
10:24:01 phil: What about talking to developers about using the data?
10:24:12 fabien: Yes, we should have raised that topic more.
10:24:27 ... it's one thing to show how to publish data; it's another thing to maintain it
10:24:31 Sorry -- S3DB
10:24:59 ... our developers mostly don't speak SPARQL and many don't speak English.
10:25:15 ... so a cookbook in English won't be enough
10:25:32 TB: (missed question)
10:25:37 fabien: As soon as possible
10:26:15 fabien: Other topics: visualize, API for mobile, clouds, legal advice, cookbook
10:27:16 fabien: can you legally protect a URI?
10:27:32 phil: *boom* TimBL exploding on hearing that [imagined]
10:27:57 fabien: R&D challenges:
10:28:05 ... methods and metrics for schema selection
10:28:27 ... balance of specific needs & reusability (I think there is a tradeoff between usability and reusability)
10:28:43 ... data conversion & identifiers generation
10:29:03 ... automation of dataset interconnection (via Jerome Euzenat)
10:29:32 ... named graphs (hopefully aligned with RDF 1.1), provenance, licenses and rights
10:29:34 S3DB Permissioning: http://s3db.org/documentation/installation
10:30:33 ... First 18 months get platform running by www2012 in this building in Lyon!, then 18 more months.
10:30:47 ... user's club -- folks who want to use it
10:31:19 ... includes City of Bordeaux
10:31:28 ... Various Liaisons
10:31:56 sandro: how much money is the funding?
10:32:22 fabien: 3 years, about 2-3k per year, some more for leader.
10:32:45 fabien: may create related sub-projects.
10:33:05 fabien: we're trying to disturb the environment to create bubbles. :-)
10:33:32 Topic: Linked Environment Data in Germany (Thomas Bandholtz)
10:38:42 tb: Open Environment Data in the 90s
10:38:52 ... Aarhus Convention 1998
10:39:06 ... European Env. Agency (EEA) until 2002
10:39:25 s/until 2002/
10:39:38 FabGandon has joined #egov
10:39:57 ... Environmental Agencies in Germany
10:41:12 ... (slide 4)
10:42:09 ... INSPIRE based on Open Geospatial Consortium, not RDF yet
10:42:21 ... access to raw data in OGC feature service
10:43:26 ... many public sector portals about water, soil, etc --- web pages, pdf, csv, xml of web services --- exhausting harmonization process
10:44:31 ... sub-clouds like Linked Open Drug Data, linked to dbpedia; we probably won't use dbpedia as the central ref point, but it looks like they will map to us.
10:45:03 ... (slide 13)
10:45:57 ... (slide 14)
10:46:51 ... GEMET and EUNIS published as Linked Data by EEA
10:47:26 tlr has joined #egov
10:47:41 ... (slide 15)
10:51:05 ... (slide 17 has involved rdf vocabs)
10:51:20 ... SKOS, SKOS(XL) -- only stable/w3c
10:51:24 ... Dublin Core
10:51:27 ... geonames
10:51:36 ... linked events ontology, for the chronicle
10:51:44 ... Darwin Core (for species)
10:51:53 ... SCOVO
10:52:16 timbl has joined #egov
10:52:20 fabien: I think there's a commercial version of geonames for more/better/current data
10:52:30 With a different ontology?
10:53:06 tb: German govt has their own data, and the agency that owns the data wants to sell it. There's a free version, but it doesn't include the polygons.
10:53:14 q+ to OSM
10:53:30 tb: We use geographic names; we don't use maps; this river flows through these cities, one by one.
10:53:59 tb: sensor web, many developments to come
10:54:01 David has joined #egov
10:54:19 tb: Darwin Core seemed to like the version I did of their work using SKOS.
10:54:56 timbl: Have you looked at Open Street Map as a source of geospatial?
10:55:09 timbl: linkedgeodata.org is a LD mirror of it.
10:55:15 tb: I'll take a look at that.
10:55:36 timbl: I'm told open streetmap is a better source of data than geonames
10:56:17 tb: We use SCOVO or env. specimen bank, and some extensions. SDMX data came along.
10:56:54 Jeni: We've looked at using SDMX -- just using the datacube part looks good, as a midpoint between SCOVO and SDMX.
10:57:05 tb: We used the specialized subproperties of dimension
10:57:09 jeni: Yes
10:58:00 tb: In skox-xl, class literals, so you can link labels.
10:58:24 tb: inflectional forms of one word, extended properties of label class.
10:58:29 For an RDF mapping see LinkedGeoData.org http://linkedgeodata.org/About 10:58:38 tb: you could talk about this for years, we never came to an end. 10:58:38 s/skox-xl/skos-xl/ 10:58:55 tb: (slide 18) 10:59:26 tb: SPARQL end points -- can easily give accidental Denial of Service attack. :-) 10:59:57 tb: but providing SPARQL would be nice. 11:00:14 tb: authentication and access control would be good. 11:00:17 (or a default limit =1000 for non-authenticated users) 11:01:07 sandro: 4store includes a built-in resource limit, but default 11:01:32 fabien: we built in a default limit, although that can confuse users who dont know about it. 11:01:46 tb: This is good advice 11:02:53 RRSAgent, pointer? 11:02:53 See http://www.w3.org/2010/11/02-egov-irc#T11-02-53 11:03:26 RRSAgent, draft minutes 11:03:26 I have made the request to generate http://www.w3.org/2010/11/02-egov-minutes.html sandro 11:03:55 PhilA has left #egov 11:07:09 martin has left #egov 11:18:06 johnlsheridan has joined #egov 11:19:00 bandholtz has joined #egov 11:39:00 johnlsheridan has joined #egov 11:56:44 johnlsheridan has joined #egov 12:01:22 john_ has joined #egov 12:16:21 johnlsheridan has joined #egov 12:25:07 David has joined #egov 12:37:17 Vagner-br has joined #egov 12:38:16 karen has joined #egov 12:39:02 timbl has joined #egov 12:40:21 bandholtz has joined #egov 12:50:08 darobin has joined #egov 12:53:55 tlr has joined #egov 12:58:44 leocadio has joined #egov 13:00:28 JeniT has joined #egov 13:00:29 tlr has joined #egov 13:06:44 FabGandon has joined #egov 13:06:45 jaeyeollim has joined #egov 13:09:03 PhilA has joined #egov 13:09:22 topic: http://www.w3.org/2001/sw/wiki/GLD_Work_Items 13:09:54 tban has joined #egov 13:10:05 sandro: using same model as RDF Core Work Items list 13:10:28 Sandro: inspiration for methodology here is the RDF Core http://www.w3.org/2001/sw/wiki/RDF_Core_Work_Items 13:10:36 gautier has joined #egov 13:10:46 sandro: four categories for the work items 13:10:57 sandro: 
1. helping deployment happen 13:11:11 sandro: 2. liaison items such as provenance & named graphs 13:11:30 sandro: 3. vocabulary items 13:11:54 sandro: 4. other technical development work items such as design patterns for URIs 13:12:22 sandro: promised charter by end of January 13:12:31 sandro: would mean start in April, running for two years 13:12:55 sandro: expect F2F meetings to be useful but hard for people to travel, so may try split F2F meetings 13:13:30 ... video conferencing between two places 13:13:57 ... to specific work items: 13:14:07 ... 2.1 Procurement Definitions 13:14:17 ... @johnlsheridan mentioned that this is an issue 13:14:58 ... having standardised definitions of terms/products to include this in ITTs etc 13:15:35 timbl_ has joined #egov 13:15:47 PhilA: something that is very important for government procurement 13:16:15 ... similar to WCAG guidelines, governments can point to them and say 'you must produce according to these standards' 13:16:57 FabGandon: would this include success stories? 13:17:02 ... real scenarios? 13:17:13 Sandro: not in this piece 13:17:22 Sandro: beautiful license out of UK 13:17:35 ... could be understood as a human 13:17:44 ... is there something we can do internationally? 13:17:53 ... having a list of licenses used in different countries? 13:18:03 FabGandon: I've been using double licensing 13:18:38 ... RDFa/GRDDL profile was licensed LGPL and a french license 13:19:10 Sandro: yesterday Daniel talking about getting bicycle accident data 13:19:18 ... had to sign a paper license 13:19:31 ... 
included things to say that he had to keep his application up to date 13:19:54 vocab for describing licenses 13:20:24 sandro: let me query for datasources I;m allowed to use for my app 13:20:45 FabGandon: something to indicate where licenses are roughly equivalent 13:21:44 jeni: maybe some recommendations about what makes a good license for gov data --- allowing reuse 13:22:08 jeni: guidance for licenses which enable the right kind of use 13:22:27 Sandro: 5-10 page note maybe? 13:22:47 Sandro: is this W3C says this or just the working group says this? 13:23:01 PhilA: be hard to have a recommendation for licenses 13:23:14 ... but a recommendation carries more weight 13:23:48 ... how would you include two independent implementations? 13:23:58 Sandro: two governments that follow the practices 13:24:40 ... might make sense to have it as one of several points within a recommendation 13:24:57 ... need the WG to work out what granularity of documents they want 13:25:03 Sandro: 2.3 Community Survey 13:25:14 ... self-sustaining database of vendors 13:25:25 PhilA: would this include apps that use the data? 13:25:38 Sandro: wasn't thinking so but data consuming systems would be good 13:25:47 ... the hardest part is to make it self-sustaining 13:26:11 FabGandon: only example that comes to mind is Semantic Web Tool Wiki page 13:26:19 ... but you're talking about a real database 13:26:31 Sandro: it could be a wiki page, but there are some people who aren't happy with that 13:26:39 ... would give WG freedom to decide how to do it 13:27:38 PhilA: why do you care that this gets done? 13:27:39 sandro: it's more important that this is done than that ie be a demo. 13:28:00 PhilA: about the whole government linked data thing 13:28:12 Sandro: got a very enthusiastic yes from the AC 13:28:38 PhilA: building community is very important 13:28:42 ... how far does it go? 13:28:51 ... it's hard to keep it coherent and up to date 13:29:01 ... 
high hurdle for WGs 13:29:06 Sandro: these lists tend to atrophy 13:29:23 FabGandon: only successful example is this wiki page, because it survived the group that started it 13:29:43 Sandro: even if it doesn't survive the group, the list working for a year or two would be very useful 13:29:49 PhilA: certainly as the group is going 13:30:10 jeni: make it be a resource for the WG as it's runnig. 13:30:26 Sandro: would hope that it could aim to be potentially self-sustaining 13:31:05 jeni: It should be a success just to have it run during the live of the wg. 13:31:15 PhilA: would hope that at the end someone would want to pick it up and continue with it, but it would not be a failure of the WG if that didn't happen 13:31:28 Sandro: maybe the mediawiki solution is good enough in that case 13:31:42 ... fairly dogfoody, even if RDF is not very linked data 13:32:19 ... helps us make sure that we know who to ping to try to get public review of our specs 13:32:28 ... and is useful to the communities 13:32:36 Sandro: 2.4 Cookbook or Storybook 13:32:48 FabGandon: yes, scenarios and success stories 13:33:09 ... when I talk to people in public sector, as a researcher they think everything I say is science fiction 13:33:22 ... I want a place to point them 13:33:34 PhilA: would that be the equivalent of a use cases document? 13:33:50 FabGandon: use cases aren't always implemented, scenarios are things that are already deployed 13:33:59 ... using UK a lot for this 13:34:13 PhilA: but this would be early input to the group 13:34:33 tlr has joined #egov 13:34:35 FabGandon: making them visible in a document gives me something to point to 13:34:45 ... 
there are best practices
13:34:50 rrsagent, draft minutes
13:34:50 I have made the request to generate http://www.w3.org/2010/11/02-egov-minutes.html PhilA
13:35:03 Sandro: use cases tend to abstract from scenarios
13:35:20 FabGandon: GRDDL use cases were a fiction
13:35:33 PhilA: I'm expecting WG to come up with best practices and recommendations
13:35:43 ... need to have scenarios as input for that
13:35:50 ... same function as use cases
13:36:04 Sandro: a product of WG is to have gathered a collection
13:36:17 ... could be written by people associated with scenarios, if we can get them to do it
13:36:33 ... not sure about stories about failures
13:36:57 PhilA: having stories about failure is really useful
13:37:12 ... being able to talk about failures in a constructive way
13:37:20 Sandro: may be hard to do that in published writing
13:37:24 ... but worth a try
13:37:32 Sandro: 3.1 Provenance
13:37:43 ... been incubator running for a year
13:37:48 ... final year is going to recommend WG
13:37:54 ... suspect that there will be one in the next 6 months
13:38:04 ... this group interacting with that group would be useful
13:38:12 Sandro: 3.2 Named Graphs
13:38:18 ... similarly, this interacts with provenance
13:38:28 Sandro: 3.3 POI WG
13:38:39 ... not sure how much government geography is addressed by this
13:38:48 ... think it's just going to be lat/long + polygons
13:38:57 PhilA: I ran workshop that led to POI WG
13:39:10 ... going to be struggle to get them to acknowledge linked data exists
13:39:27 ... one guy from DERI trying to get them to think about it
13:39:38 ... augmented reality main group
13:39:51 ... will need active steering to ensure liaison
13:39:58 Sandro: need a person in both groups
13:40:08 ... I was being optimistic about RDF vocabulary
13:40:14 PhilA: yes, very
13:40:31 ... as interested in moving objects as static
13:40:43 ...
and motion in relative direction
13:41:09 Sandro: in worst case, someone could take formal model and map to RDF
13:41:34 Sandro: probably other liaisons I've forgotten
13:41:38 ... SPARQL?
13:41:51 ... don't know exactly what dependency looks like
13:42:07 ... are there any outside of W3C?
13:42:09 Andre has joined #egov
13:42:16 ... organisations doing something close to GLD?
13:42:47 PhilA: need people from data.gov from different countries
13:42:54 Sandro: hoping that they get involved in the working group
13:43:01 ... thinking about peer organisations
13:43:16 ... normally have standards, vendors & other standards bodies
13:43:43 FabGandon: wonder if relying on local offices to synchronise locally
13:43:48 MacTed has joined #egov
13:43:54 ... W3C office in Paris will be good point of synchronisation
13:44:11 ... of communicating, diffusing, making sure right people are aware
13:44:17 PhilA: not just national governments
13:44:30 ... colleague talking to Helsinki, Berlin, city authorities
13:44:37 ... not just national governments, but local ones as well
13:45:34 Sandro: check with OASIS and OMG and usual suspects
13:45:51 Thomas: INSPIRE and OGC?
13:46:07 ... they are doing something not so different, but with URNs and XML
13:46:30 ... someone would have to write a technical spec for RDF
13:46:39 Sandro: is there funding available if someone has the skills to do it?
13:46:55 Thomas: it's a EU directive, and each government has people who are working on it
13:47:07 Sandro: seems like the kind of thing that a university might do
13:48:02 Sandro: is it a good model that anyone else might be interested in?
13:48:09 Jeni: Stuart Williams is working with the UK end of INSPIRE to do some mapping of the object models into RDF
13:48:17 Thomas: harmonising on what each member should provide on each topic
13:48:22 ... they have a dozen themes
13:48:32 ...
mandatory data items on each theme
13:48:39 Stuart Williams, formerly of HP, TAG member, now at Epimorphics, Bristol-based Sem Web consultancy
13:48:43 ... we shouldn't care about domain-specific things
13:49:01 ... we could get a huge mass of more data if we mapped into RDF
13:49:08 ... get a lot of benefits from organisational power of INSPIRE
13:49:17 PhilA: the one bit of data that sticks in my head
13:49:25 ... is target for implementation is 2018
13:49:31 ... so don't want to depend on INSPIRE
13:49:36 ... this group would inspire INSPIRE
13:49:49 ... W3C is known to be slow, but we're faster than that!
13:50:01 OGC
13:50:01 Thomas: there are many agencies publishing data using OGC services
13:50:21 ... maybe better to talk about SDI
13:50:30 spatial data infrastructure
13:50:41 ... they have a G (Global) SDI conference every year
13:50:50 ... have questions about how to publish this in RDF
13:50:56 ... all fragmentary contributions
13:51:04 ... would be a different level
13:51:12 ... they have a catalogue service web, like DCAT
13:51:23 Sandro: is OGC a reasonable way to interact with them?
13:51:27 ... they are W3C members
13:51:38 ... we might be able to get them to participate in a liaison capacity
13:52:00 Thomas: geoSPARQL is one of these topics
13:52:09 ... encoding of sensor observation services in RDF
13:52:15 ... these are ongoing activities
13:52:22 ... not specific for government, but INSPIRE is
13:52:39 Sandro: every nation has a lot of legal issues around geographical information
13:52:52 Thomas: this is one of the things, that you describe the data that you will sell
13:53:04 ... I used to talk about linked data
13:53:26 ... not talking about LOD any more, because we shouldn't exclude non-open data
13:53:47 FabGandon: And accessing the data from my company I have access to things on the intranet
13:54:07 Sandro: these are good pointers, but I'm not sure what it makes sense to do in this charter
13:54:17 ...
my thought was that POI would take care of it, but I guess not
13:55:31 JeniT: feels like a rat hole
13:55:40 (I'm concerned every time I see a line like "I used to talk about linked data, but I don't talk about LOD anymore" because "Linked Data" is bigger than "LOD" ... so I hope you [Thomas] still talk about "Linked Data" which absolutely includes non-open data)
13:55:46 Sandro: we can make it in scope, out of scope, or get the WG to decide
13:56:11 FabGandon: think it's difficult to rule out geographic data in a government data charter
13:56:19 ... so many scenarios where you need geographical data
13:56:27 PhilA: some liaison would be useful
13:56:57 ... 'we will liaise with POI WG, and be aware of other work going on in this area, but not core duty of GLD WG to codify'
13:57:13 FabGandon: going to be the same with temporal data representation
13:57:29 ... want to say that 'this data is only valid for this financial year'
13:57:32 ... another rat hole
13:57:40 PhilA: "We think this is important, and we'll liaise, but we won't develop a vocab for geo"
13:57:51 ... good part is that you don't have proprietary aspects
13:58:06 ... again needs liaison with people in time data
13:58:23 PhilA: this is relevant for POI, because important in crisis management
13:58:45 FabGandon: we have someone who may be involved in this aspect
13:59:23 Sandro: The next two groups were vocabulary and non-vocabulary technical items
13:59:34 ... I had some idea of doing vocabularies later, but let's proceed in order
14:00:02 ... TimBL at dinner last night said something...
14:00:21 ... I had always envisioned that W3C would write the vocabulary, document it and so on
14:00:43 ... but TimBL said that if foaf:name is what people should use, we can say in the W3C Recommendation that that's what people should use
14:00:56 ... but we could set a bar for what we mean for a 3rd party vocabulary
14:01:02 ... and if FOAF can get over that bar
14:01:04 ...
then that's fine
14:01:11 PhilA: we wanted to use FOAF
14:01:26 ... and if DanBrickley goes under a bus, the server goes with him
14:01:32 ... (this is in POWDER)
14:01:43 ... got around it by using Dublin Core
14:01:57 ... we had conversations for ages about this, about how FOAF could become more stable
14:02:02 ... doesn't have an organisation behind it
14:02:07 ... could W3C manage it? no
14:02:29 jeni: I think there are some important things here, around check boxes for what vocabs we will trust.
14:03:06 ... lots of stuff around the org behind it, documented policy on change control, ... it would be useful to document these up front. THESE ARE THE THINGS WE EXPECT A GOOD VOCAB TO DO.
14:03:23 Sandro: going meta, aside from the terms that we recommend...
14:03:30 ... this is going to be useful for Governments as well
14:03:41 ... to help Governments to identify which vocabularies they can use
14:03:56 ... could be GLD or could come from somewhere else
14:04:43 jeni: Wider LD cloud might not care so much about stability. Academic projects don't mind so much.
14:04:56 fabien: France won't use schemas of the UK.
14:05:07 PhilA: going to be a problem all over
14:05:13 ... W3C isn't designed to manage vocabularies
14:05:19 FabGandon: scalability problems as well
14:05:25 ... only standardise what's domain independent
14:05:29 ... can standardise provenance
14:05:37 ... cannot standardise biology ontology
14:05:43 ... this changes things a little bit
14:05:58 ... here we're crossing that line a bit
14:06:17 Thomas: we don't have to standardise geographical vocabulary, just specifying serialisation
14:06:34 FabGandon: there could be a well-known XML vocabulary, just provide RDFS version
14:06:41 PhilA: I think purls provide the way out of this
14:06:49 ... if it can't be on w3.org
14:06:58 Sandro: I wouldn't say it can't be on w3.org
14:07:05 ... there's the organisation vocabulary
14:07:51 ...
@der42 approached TimBL to host it
14:08:01 John has joined #egov
14:08:04 ... there's a maintenance headache that comes with that
14:08:14 ... this is something TimBL's been pushing a long time
14:08:25 ... I've been pushing this for a long time too
14:09:12 PhilA: the person to convince is Ted Gild
14:10:36 s/Gild/Guild/
14:10:39 Sandro: vocabulary hosting in general is a huge issue for governments
14:10:48 FabGandon: more important than in any other domain
14:11:12 Sandro: I've been advocating that someone like IBM should get into the vocabulary hosting business
14:11:27 PhilA: same issue with Talis hosting it: we're a commercial company!
14:11:33 Sandro: so you get what you pay for
14:12:19 ... could pay a company to host it for a period of time
14:12:30 PhilA: we would host the stuff with a purl pointing to it
14:12:47 ... the purl points somewhere else if Talis goes under a bus
14:12:53 Sandro: I would say domain name per vocabulary
14:13:01 ... foaf.org rather than xmlns whatever it is
14:13:17 ... that gives the most flexibility
14:13:28 PhilA: govvocabulary.org/2010 or whatever
14:13:39 Sandro: but then you bind together several vocabularies in one organisation
14:13:52 ... if they are controlled by different people then you don't want them on the same domain name
14:14:01 PhilA: it's an issue because of neutrality
14:14:05 ... FOAF is a good example
14:14:19 Sandro: were you serious, Fabien, when you said that France wouldn't use any UK vocabularies?
14:14:29 FabGandon: I haven't checked, I know the reaction about hosting the data
14:14:33 leocadio has joined #egov
14:14:50 ... wouldn't be surprised if French objected
14:15:02 ...
issue with internationalisation as well
14:15:16 Sandro: would hope that any vocabulary provider would accept translations
14:15:26 PhilA: but who guarantees translation is accurate
14:15:42 FabGandon: in EU, have whole process of maintaining translation of different documents
14:16:09 PhilA: if you had a vocabulary that had anything but a .com, .org ending...
14:16:16 ... no way Americans would accept that
14:16:49 Sandro: end up using .com, .org or .net for the vocabularies
14:17:06 Sandro: 4.1 Metadata for Data Catalogs
14:17:14 ... no brainer that we want to move along DCAT in this group
14:17:29 ... had an interest group telecon with @cygri
14:17:40 ... wanted to spin off taskforce to do it
14:18:00 ... had large group that quickly dwindled
14:18:11 ... stopped entirely when Semtech came around, and didn't start up again
14:18:18 ... lots of interest there
14:18:38 ... bit question is does it end up as WG Note, as a Recommendation, as a pointer to something else?
14:18:42 s/bit/big
14:18:49 FabGandon: how specific is it to eGov?
14:18:59 Sandro: right now taskforce in eGov IG
14:19:23 ... in doing taskforce charter
14:19:34 ... said clearly applicable beyond government
14:19:42 ... but let's take narrower scope for now
14:20:01 ... can see that it could be broader
14:20:07 FabGandon: it could even be a task of the new RDF WG
14:20:14 Sandro: I think it's too late to go there now
14:20:20 ... or in the provenance WG
14:20:35 ... someone asked what's the difference between provenance and DCAT
14:20:49 Looking at http://vocab.deri.ie/dcat
14:21:01 PhilA: one thing that is missing is refresh rate
14:21:21 JeniT: think that's part of VoiD
14:21:29 PhilA: ah right
14:21:37 http://www.w3.org/egov/wiki/Data_Catalog_Vocabulary
14:21:43 ... Alex Tucker has done RDF dump of CKAN data
14:22:02 Sandro: Wiki page includes use cases, deliverables, minutes and participants: 28 participants
14:22:12 ... huge amount of interest
14:22:29 ...
reminded that Thomas was listening
14:22:38 Thomas: got a little bit bored...
14:22:49 ... did so much work on data catalogs in Germany...
14:23:01 ... ended up disappointing because no one used it
14:23:15 ... idea of having one data catalog as an access point is not a priority
14:23:24 ... in linked data domain discovery is following links
14:23:30 ... not looking at catalogs
14:23:40 ... it's OK, we need it, but...
14:23:52 Sandro: you don't need 28 people to design a vocabulary
14:23:59 ... you want 3 people to do the work, and wide review
14:24:09 ... you don't want big telecons with everyone who cares
14:24:16 John has joined #egov
14:24:18 ... in general that's going to be true
14:24:35 ... sometimes there will be issues that you want discussion for, but a lot is design by a small group
14:24:48 PhilA: I still think in terms of best practice document
14:24:59 ... say 'use DCAT and VoiD to describe your catalog'
14:25:08 Thomas: how is VoiD involved?
14:25:20 Sandro: DCAT can be for non-RDF data, VoiD for RDF data
14:26:11 jeni: Some vocab (dcat) is about EVERY data set, and then some other vocabs are for certain kinds of data (eg geo about geo data, void about RDF data)
14:26:45 PhilA: Neither void nor dcat covers refresh rate.
14:27:07 Sandro: I've never heard anyone assess quality or suitability of VoiD
14:27:11 ... only game in town
14:27:58 ... if we're going to recommend a vocabulary, in a recommendation
14:28:05 ... then we need implementation experience
14:28:10 ... which includes going through to consumers
14:28:19 Thomas: VoiD has been designed without DCAT in mind
14:28:27 ... so didn't care about separation of concerns
14:28:36 ... I think someone has to make a new version of VoiD, to fit in
14:29:16 Sandro: we could ask @cygri whether he thinks a new version of VoiD is needed
14:29:27 ... another thing on DCAT is I don't know how it relates to CKAN
14:29:40 ... I don't know how happy CKAN were with it
14:30:04 ...
another force in play is the Sunlight Foundation in the US
14:30:18 ... they have done national data catalog that combines Federal, State and Local levels
14:30:48 JeniT: do you need input about what to put in the charter?
14:31:01 Sandro: I feel we should say a W3C Recommended vocabulary along the lines of DCAT
14:31:13 John has joined #egov
14:31:27 PhilA: so the group would create and maintain the vocabulary
14:31:48 Sandro: I think DCAT should enable multiple catalogs, for a decentralised system
14:32:15 ... each catalog should describe itself using DCAT
14:32:24 s/catalog/data source/
14:33:49 JeniT: there's the set of terms (Dublin Core + DCAT + VoiD etc) and the namespace for DCAT
14:34:19 FabGandon: when you look at FOAF, FOAFomatic really helped encourage its use
14:34:36 Sandro: OKFN has a form where they're asking people to fill out questionnaire about their government data
14:34:43 ... be nice if it gave back RDF
14:35:14 PhilA: keen to do outreach as well
14:35:20 ... ideally as part of this working group
14:35:26 ... important part of the implementation
14:35:58 Sandro: OK, add under Procurement Assistance
14:36:10 BREAK TIME UNTIL 16:00
14:36:59 John has joined #egov
14:47:32 martin has joined #egov
14:47:38 timbl has joined #egov
14:52:19 karen has joined #egov
14:53:36 tlr has joined #egov
14:54:23 David has joined #egov
15:01:06 leocadio has joined #egov
15:02:46 bandholtz has joined #egov
15:02:50 tban has joined #egov
15:05:09 scribe: FabGandon
15:05:50 resuming on vocabularies.
15:06:06 ... JeniT: what are the next stages?
15:06:54 Sandro: next stage is identify what can be done within the WG charter/timespan/force
15:07:44 ... avoid shooting for too little or too much.
15:08:05 ... identify what can be done in other TF / WG.
15:08:37 ... for vocs we could work on the basis of having an identified editor for each voc.
15:08:50 sandro: maybe the vocabs will each be time-permitting / nice-to-have
15:09:04 http://www.epimorphics.com/public/vocabulary/org.html
15:09:06 JeniT: Organization Ontology
15:10:00 JeniT: foaf and vcard exist
15:10:10 ... Dave Reynolds put that together because nothing was putting together what we needed about Org.
15:10:30 ... so we took that and extended that for UK gov.
15:11:03 Sandro: this is reusable in other organizations.
15:11:32 PhilA: very UK centric.
15:11:57 Sandro: this should be blessed by W3C for others to use
15:12:13 PhilA: an Org.org schema :-)
15:13:20 In Spain, we use it, and it was OK for our purpose (city council and departments)
15:13:21 JeniT: change event is used to capture a change in an Organization, it is a hook
15:13:31 JeniT: changeEvent hook for saying org1+org2 => org3
15:14:22 Vagner-br: very useful to follow changes in structures and names, acronyms, etc.
15:14:37 s/Vagner-br/TB/
15:14:49 PhilA: does your national library archive web sites?
15:15:46 FabGandon: In France, we have law that says we must archive every French official media channel
15:15:51 ... and we don't know how to do that
15:16:45 Sandro: question of ontology engineering process and the way to go for a new voc.
15:17:22 tban: I wouldn't use UML, this is not object-oriented work
15:17:33 ... I use TopBraid composer
15:17:57 ... nice figures.
15:18:29 darobin has joined #egov
15:18:38 ... Richard came up with SDMX but not enough sem. web oriented.
15:19:38 JeniT: we work with Richard on that because SDMX is important in the statistician community
15:19:57 jeni: ONS used SDMX already, so it was opportunistic for us to use it.
15:20:27 ... SDMX is hard but may be necessary.
15:21:02 Sandro: we haven't solved the evolution story of how we move from a voc to the next.
15:21:49 JeniT: also hard to know when a voc is stable enough to be really used.
15:22:13 Vagner-br_ has joined #egov
15:22:17 ... check list of what you expect from a voc.
15:22:26 jeni: checklist item: have documentation which is good, have ref guide, examples, etc
15:22:48 ... e.g. it must have ref guide, examples, managed by an org with a longevity, etc.
15:23:19 PhilA: for FOAF for instance the longevity of the domain is a problem.
15:24:17 FabGandon: reading through the minutes yesterday, there's a good thing happening in eGov in that we have very stable bodies involved
15:24:24 ... INRIA is a government institute
15:24:35 ... so we have hosting that is very stable
15:24:42 ... people believe we will continue to exist
15:25:03 ... won't want to use a namespace hosted by the UK
15:25:20 ... but one hosted by a government would have longevity
15:25:46 ... We tried several things, including knowledge engineering approach
15:25:55 ... tried VoCamp approach, where people come with a need for a vocabulary
15:26:01 ... break up in small groups and hack
15:26:07 ... some of these were successful
15:26:20 FabGandon: We tried Knowledge Engineering - limits, VoCamp fairly successful, ...
15:26:28 ... depends on scope of vocabulary
15:26:45 Sandro: this is a question for the chairs and the group.
15:26:54 ... any other org ontology.
15:27:04 JeniT: there is a blog post from Dave
15:27:11 sandro: I'll just link to DER's blog post, with its references
15:27:25 http://www.epimorphics.com/web/wiki/organization-ontology-requirements
15:28:00 tb: what about sameAs inflation?
15:28:09 tban: the inflation of sameAs, and misuse of sameAs.
15:28:34 ... I wouldn't use sameAs, but what else?
15:28:59 tb: mapping vocab like skos but without inferring it's a skos concept.
15:29:45 FabGandon: subClassOf subPropertyOf also used in alignment
15:29:53 ... provide a mapping voc with only properties and no classes to avoid inferences
15:30:17 sandro: bad sameAs is just bad data
15:30:54 FabGandon: in datalift, we are thinking about how to do mapping, from sameAs onto procedural declaration.
15:31:31 FabGandon: okkam huge eu project on this -- efficient sameAs resolution for semweb. give uri, it gives back ones which might be equivalent.
15:31:59 FabGandon: (let's stay away from this...)
15:32:21 http://www.okkam.org/
15:33:40 tban: when we try to link e.g GEMET and German Thesaurus we need the same in SKOS without domain and range.
15:35:29 jeni: a school is not a skos:Concept according to the SKOS spec
15:35:36 sandro: skos is just broken. :-(
15:36:10 JeniT: same name for a local authority vs. the area
15:36:41 Sandro: you need to formalize properly.
15:37:31 tban: we should include the problems about alignment to be discussed in the charter
15:39:28 JeniT: if RDF 1.1 doesn't want to do it we have to come up with a convincing scenario
15:40:22 Sandro: the key thing for people is to see if we can stabilize FOAF.
15:40:26 jeni: important to understand how foaf works with vcard
15:40:53 tban: what about foaf+ssl?
15:41:20 JeniT: I wondered if we should include something about identity in the eGov WG.
15:42:04 4.4 Statistical/Data Cube Datasets
15:42:23 sandro: statistical, so far there is a sub-set of SDMX
15:42:32 sandro: I'm hearing there's a subset of SDMX, cube, that's pretty good.
15:42:38 http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/cube.html
15:42:48 PhilA: It's good for describing what you see in CSVs.
15:43:02 PhilA: the cube ontology is good to describe the sort of data you find in CSV file.
15:43:55 JeniT: Cube comes from the hypercube structure of the data.
15:44:09 JeniT: an observation is a cell in the cube
15:44:27 ... each dataset is described by a dataset def
15:45:38 ... for statistical data, payment data, etc. any thing you put in a Spreadsheet
15:45:49 ... we use it a lot
15:47:03 sandro: how can we be sure this meets most needs?
15:47:33 sandro: if we make this a Rec, who might object? Among people who buy into SDMX & RDF already....
15:48:03 PhilA: Statisticians might find this reduces too much.
15:48:32 Sandro: if we need more of SDMX can we extend it?
15:48:44 Jeni: that was the goal, yes.
15:48:48 JeniT: yes it was designed to be extended
15:49:07 tban: we use it for measurement data
15:50:11 JeniT: we wanted to publish statistics for a larger audience than the statistician community
15:50:37 sandro: if we want to change these schema, how do we do that? what would be the process?
15:51:02 JeniT: feel free to take it !
15:51:43 sandro: it's rare that somebody does this kind of work and does follow it as an editor of the Rec.
15:52:12 sandro: Data Cube seems important.
15:52:24 John has joined #egov
15:52:58 4.5 Data Quality, Timeliness, Status
15:53:48 JeniT: I am sure that voiD has something about temporal validity
15:53:50 jeni: we use dc:temporal for expressing the temporal range for which the data is true
15:54:03 ... we have our own small voc for that
15:54:09 jeni: we use our own data.gov.uk for draft-ness
15:54:15 ... nothing on data quality at the moment
15:54:28 PhilA: can't find this in voiD
15:54:38 JeniT: it must be in RSS then
15:54:49 David has joined #egov
15:55:04 PhilA: Who is responsible for cleaning it up? Who will update it, and when?
15:55:11 PhilA: need to know if the data I am using now will be here tomorrow
15:55:19 PhilA: Often the data comes from screen-scraping!
15:55:27 ... need to know how often data updated
15:55:52 FabGandon: This is in Provenance -- an expiration
15:56:23 JeniT: this is new work probably
15:56:27 JeniT: I think this is new work, much less baked than data cube
15:56:50 sandro: the WG could provide such voc.
15:57:04 JeniT: it fits under dcat
15:57:05 jeni: This goes under dcat -- it applies to data sets.
15:57:42 FabGandon: Granularity might be small -- some bit of the data changes often, some bit doesn't.
15:59:10 FabGandon: this might not be about the dataset, it might be about one subgraph within the dataset.
15:59:49 rrsagent, draft minutes
15:59:49 I have made the request to generate http://www.w3.org/2010/11/02-egov-minutes.html PhilA
16:00:05 tb: In the Gazetteer, when we have changes in communities, merging, the official service just drops the old communities. We don't drop them, we mark them expired.
16:00:16 tb: dcat should describe your policies about such things.
16:00:22 tban: the policy should be also described on the dcat level.
16:01:18 sandro: the granularity problem might be more general with dcat and dataset.
16:01:37 ... granularity can be a political game.
16:02:36 sandro: so if dcat can handle the gran. then this can be folded in.
16:02:52 4.6 Assumptions/Basis/Comparability of Data
16:03:17 JeniT: we need to know if we can compare two values.
16:03:22 jeni: In statistical data they really care if you can compare two values, because defn of some bit in your data changed.
16:03:26 ... e.g. after a policy change.
16:04:09 JeniT: annotate a qb:observation to say this is not comparable, etc.
16:04:28 JeniT: Vocab for classifying these kinds of annotations
16:04:36 PhilA: we have a 10 month data vs. an 11 month data
16:04:53 tban: different methods in different countries.
16:04:55 tb: lining maps up between country, INSPIRE Harmonization effort.
16:05:33 FabGandon: The notion of an unemployed person in France is totally different than in some other countries -- not comparable.
16:05:55 JeniT: encourage people to use different terms when they use different notions
16:06:01 JeniT: Sometimes you just mean datafr:unemployment has a different URI than datauk:unemployment
16:06:27 ... there may be some matches but when we use the same URI it IS the same thing
16:06:50 JeniT: this is more about same vocab, same dimension, ... this is to annotate where it's different.
16:07:45 JeniT: at least we should be able to say "this is a statement about comparability".
16:08:03 JeniT: This is for categories of ways to annotate observations.
16:08:22 tban: using different URIs is different from using different terms.
16:09:15 sandro: no candidate voc on that right now?
16:09:16 http://sdmx.org/wp-content/uploads/2009/01/01_sdmx_cog_annex_1_cdc_2009.pdf
16:09:33 JeniT: some of the SDMX voc may be relevant
16:09:55 ... Dave has mapped those onto a voc which could be a candidate
16:10:15 http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/vocab/sdmx-concept.ttl
16:10:56 4.7 Describing Visualization and Presentation
16:11:10 fresnel
16:11:40 http://www.w3.org/2005/04/fresnel-info/
16:12:36 sandro: not hearing a lot of interest/experience on this one.
16:12:45 FabGandon: Fresnel has a huge potential
16:13:46 sandro: design pattern for URIs
16:13:53 5.1 Design Patterns for URIs
16:14:12 JeniT: updated version:
16:14:22 http://data.gov.uk/resources/uris
16:14:41 ... it takes a different kind of angle.
16:16:02 sandro: huge design space, how much we want to expand or focus the design space
16:16:33 ... should we give all the options or prescribe some good practices?
16:16:45 PhilA: use of id, 303 to doc, SHOULD be in LD
16:17:07 I mean - the pattern http://reference.data.gov.uk/id/department/co breaks down as
16:17:11 JeniT: saying do 4.2 from coolURIs
16:17:39 {sector}.data.gov.uk/id/{department}/unique_identifier
16:18:02 JeniT: sometimes the pattern does work well
16:18:23 If you dereference that, the /id/ gets replaced by /doc/ as part of the HTTP 303 (see other) response, and that leads to a document that describes the original identified thing
16:18:27 JeniT: Although that pattern works really well in some circumstances, it doesn't for others. eg for the people in the org structures, we don't have a good URI pattern. so we end up using hash URIs in the datasets, thinking they might be linked up later.
16:18:32 ... we used # URIs depending on the dataset.
16:18:44 JeniT: just using pattern 4.2 doesn't always work well.
16:18:46 ... not simple to just say use that pattern.
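The /id/ to /doc/ redirect described at 16:18:23 can be sketched roughly as follows. This is a minimal illustration only, assuming the identifier path always begins with an /id/ segment; the function names `see_other_location` and `respond` are hypothetical and are not taken from data.gov.uk or any real server code.

```python
from urllib.parse import urlsplit, urlunsplit


def see_other_location(id_uri: str) -> str:
    """Map an identifier URI to the document URL that describes it.

    Assumption: the first path segment is "id" and the matching
    document lives at the same path with "doc" in its place.
    """
    parts = urlsplit(id_uri)
    segments = parts.path.split("/")  # "/id/x" -> ["", "id", "x"]
    if len(segments) < 2 or segments[1] != "id":
        raise ValueError(f"not an /id/ URI: {id_uri}")
    segments[1] = "doc"
    return urlunsplit(parts._replace(path="/".join(segments)))


def respond(id_uri: str) -> tuple[int, dict[str, str]]:
    """Build the (status, headers) pair for a 303 See Other response."""
    return 303, {"Location": see_other_location(id_uri)}


if __name__ == "__main__":
    print(respond("http://reference.data.gov.uk/id/department/co"))
```

As JeniT notes, the pattern does not fit every case: a hash URI never reaches the server with its fragment, so a redirect of this kind would not apply to it.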
16:20:02 FabGandon: need keys :-
16:20:06 FabGandon: need keys :-)
16:20:19 sandro: Just get everyone to mint URIs for themselves :-)
16:20:40 tban: the original URL of TimBL also described what you should not do.
16:21:15 tb: '98 cool uris, don't put classifications/datatypes into URI, or other things that would make them change.
16:21:23 David has joined #egov
16:22:20 FabGandon: Don't forget there are scenarios where you want to do the opposite -- to anonymize people.
16:23:56 tb: I've come to prefer totally opaque URIs.
16:23:59 tban: generally I prefer URI that don't tell anything by themselves
16:24:54 sandro: what should we do?
16:25:11 FabGandon: it could be 'follow the guidelines of the LOD group'
16:25:27 sandro: one output could be follow 4.2
16:25:59 sandro: maybe a flowchart, even!
16:26:28 JeniT: I found we needed design patterns not just for schools, but also for vocabs, concept schemes, datasets.
16:26:34 JeniT: we also need design patterns for URIs for schemas
16:27:47 sandro: versioning of dataset crosses with the temporal point before.
16:29:12 sandro: should I fold this into designing-URI, or timeliness vocab ?
16:29:32 tban: what does versioning mean here, e.g. statistical data changes every year
16:29:33 tb: Every year has year more --- discussion of versioning.
16:29:44 tb: versioning of vocab, too.
16:29:59 Jeni: how you design URIs, how you design the data....
16:30:37 5.3 Change Propagation and Notification
16:30:50 dady -- dataset dynamic
16:31:16 I think of this as protocol,
16:31:40 FabGandon: RSS feed of changes -- talis changeset vocab
16:31:48 JeniT: Sparql push
16:31:52 http://code.google.com/p/sparqlpush/
16:32:08 cygri has joined #egov
16:32:56 http://esw.w3.org/DatasetDynamics
16:33:04 seems out of scope
16:33:09 JeniT: we need to do it anyway
16:33:27 JeniT: (we = data.gov.uk)
16:33:28 PhilA: also about SPARQL Push
16:34:01 JeniT: we need that for data that we are publishing every week
16:34:13 JeniT: We'll see data published on a weekly basis, so we need
16:34:13 ... we need a design pattern for that
16:34:39 sandro: just publishing the new data is not enough?
16:34:44 JeniT: no
16:35:24 ... links back to the named graphs.
16:36:34 FabGandon: It's too big for this....
16:36:57 5.4 Distributed Query
16:37:22 sandro: too big to be handled here.
16:37:29 same as above -- needs to be done, too big for us.
16:37:54 SPARQL 1.1 has some elements of answer.
16:38:09 JeniT: Maybe it goes into procurement guidelines, eg Sparql 1.1 service descriptions suitable for this
16:38:16 5.5 Developer-Friendly API and Serialization
16:38:23 linked-data api
16:39:01 sandro: JSON syntax for RDF should be part of the charter of RDF 1.1
16:39:25 PhilA: should be relatively easy to get out the door
16:39:57 JeniT: Yes, 3 impls, could be fast, but does need wider review -- eg for implementations.
16:40:46 PhilA: it is manageable and we should pursue this
16:40:47 PhilA: this is really important, and doable.
16:40:59 ... important in terms of deployment
16:41:20 sandro: will still exist even if we don't do anything within W3C
16:41:44 PhilA: from a visibility point of view this is important
16:42:48 sandro: I'm worried about arbitrary decisions in the design coming back to be a problem in the WG
16:42:59 JeniT: the JSON might be a problem.
16:43:39 JeniT: I think we're a lot of the way there, but leaning towards its own WG.
16:44:26 PhilA: need to talk about outreach
16:44:31 2.5 Outreach
16:44:41 ... it needs to happen somehow
16:44:50 PhilA: somehow this has to happen, perhaps via EU funding
16:45:10 ... some way to distribute the output of the group among the governments
16:46:05 sandro: counter argument: the focus of the WG is the how not the why.
16:46:44 ... the demos of the "how" will make the job of the people doing the "why" easier
16:47:10 robin: In general, WGs are pretty bad at selling their own stuff, being so involved in the technical work.
16:47:50 ... people who were writing great blogs went silent when they joined the WG.
16:47:55 robin: maybe outreach should happen outside the WG
16:49:21 sandro: could still be included in the charter.
16:50:03 PhilA: marketing is important in making markets
16:52:05 sandro: I don't have any exact data about the number of members for the WG.
16:55:24 JeniT: great value to have new folks in WG, so people experience having to explain this stuff
16:56:48 John_ has joined #egov
16:57:04 martin has left #egov
16:57:42 FabGandon has left #egov
17:09:04 rrsagent, generate minutes
17:09:04 I have made the request to generate http://www.w3.org/2010/11/02-egov-minutes.html PhilA
17:10:29 PhilA has left #egov
17:29:25 timbl has joined #egov
17:30:02 tlr has joined #egov