Using Open Data: policy modeling, citizen empowerment, data journalism Day 1

19 Jun 2012

See also: IRC log


See attendee list

scribes: PhilA, cgueret, Jeanne, Martin Kaltenböck, olyerickson


<AlbertoCottica> Context: Crossover is a project that is trying to come up with research directions for modeling policy in the connected age. Today we try to think around this issue in the context of Open Data.

<PhilA2> DavidO: Highlights Crossover project web site http://crossover-project.eu/

<josema> wow, it's been a long while since I've seen this channel so crowded

hi there!

<AlbertoCottica> Hello Josema :-)

<PhilA2> Session 1 starts on time :-)

<PhilA2> Nikos Loutas starts things off. Slides to follow

<PhilA2> Nikos: Lots of discussion about the data, but who knows it's there?

<PhilA2> ... Assumption it means things on a Google map

<PhilA2> Slides are informative for recording substance of Nikos' talk

<AlbertoCottica> Interesting: "Singapore has become a new player in the OD space"

<AlbertoCottica> Most OD apps developed by INDIVIDUALS, not organizations: freelancers and researchers. Business is unconvinced.

Interesting/worrying to see business not developing apps using Open Data

<PhilA2> Nikos: Researchers and individuals are interested. Big business is not so interested yet - why?

<Dirdigeng> Over 90% of apps are free

<PhilA2> ... most apps are free

<PhilA2> ... so business model for those that do charge is interesting

<Julianlstar> Businesses are interested but open data is just data to them

<Dirdigeng> However some app devs (eg Malcolm Barclay) have fer

<Dirdigeng> Freemium models

<PhilA2> Nikos: Most apps are Web based, some platform-specific

<MartinKaltenboeck> Regarding business: neither the needed data nor the SLAs etc. are there to attract e.g. enterprises - in my opinion...

<AlbertoCottica> Individuals, not organizations; free apps. You are looking at labor of love. Conclusion (as usual): better not to oversell the job creation potential of OD.

90% of apps put things on a map

<Julianlstar> We are starting to see a layer of infomediaries Placr, Geowise, Talis etc

<PhilA2> This IRC channel is the record of the day, yes and is public. Twitter with #pmod is for telling the world what's going on now.

<AlbertoCottica> Interesting! OD Apps are geodata-centered, with 90% using maps.

<PhilA2> AlbertoCottica: See ^ :-)

<Dirdigeng> Some UK data is available as either open data or as a service with an SLA; businesses tend to use the SLA-based service

<Dirdigeng> @phila2 will papers and presentations be available online later please?

<PhilA2> Nikos: Emphasis on single, static data set, not integrating multiple or using real time data (due to availability I wonder?)

<Dirdigeng> Is there any data about how much each app is used?

<PhilA2> Nikos: Surprisingly - it's hard to find the apps. It took a lot of effort

<olivier> Dirdigeng - all papers linked from http://www.w3.org/2012/06/pmod/agenda, and I think slides will be too

<MartinKaltenboeck> Is it important for the users what apps are based on open data?

<TimDavies> Nikos highlighting the importance of having better ways of discovering applications - lightweight semantics and common meta-model

<Dirdigeng> Nikos says it's hard to find data on app usage

<AlbertoCottica> Moving onto Twitter, I am better at commenting than note taking

<TimDavies> From @Julianlstar on twitter: We need to think of what the definition of an #opendata app is. There are apps/services that exist that use #opendata if available #pmod / Companies such as Parkopedia, Geowise, Placr, Spikes Clavell all use #opendata it's just they aren't based solely on open data #pmod

<PhilA2> Noel Van Herreweghe from Citadel (paper -> http://www.w3.org/2012/06/pmod/pmod2012_submission_1.pdf)

<PhilA2> noel: Emphasises importance of local data

<deirdrelee> @Julianlstar and @TimDavies - I would think integrating Open Data into an existing product is a more realistic business model than trying to build an entire new enterprise around an Open Data app. What do you think?

<PhilA2> Data often in silos - still...

<PhilA2> Noel: Local gov really interested. Event last Friday very successful

<PhilA2> Noel: Citizen needs - demand side!

<PhilA2> Noel: Links open data to Living Labs concept

<pub> Example of usage of living labs concept for linked open data: http://mvoices.eu

<PhilA2> Noel: describes iPhone apps that use city data

<Dirdigeng> @deirdrelee Open Data sometimes provides the core reference data (company register, geospatial refs) on which private adds value. That's the model for many existing uses of Public Sector Information

<PhilA2> Deirdre: Shows various stages along the way from getting data out there and then... need more

<AlbertoCottica> Any link to Deirdre Lee's slides?

<Girts> Q to Noel - any application in Flanders which would create data by citizen feedback (e.g. reports of communal problems in the neighborhood or so)?

<PhilA2> Deirdre: Shows national impact of OGD work

<CaptSolo> mentions http://www.dublinked.ie/ - has embedded rdf

<PhilA2> Volvo race is a big event

<PhilA2> OGD and apps important part of the outreach

<PhilA2> Deirdre: Needed to get machine readable data from Galway City. Managed to get that and then triplified

<CaptSolo> using _linked_ open data

<CaptSolo> converted all data (csv, xml, xls, ...) to rdf

<CaptSolo> loaded into Virtuoso RDF server

<CaptSolo> http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/

<PhilA2> Deirdre: Highlights DERI's Google Refine extension that exports RDF
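
The conversion described above (CSV and similar formats turned into RDF) can be sketched roughly as follows. This is only an illustration using the Python standard library, not the Google Refine extension or the Galway project's actual pipeline. The base URI, the column names (`id`, `name`, `lat`, `long`) and the sample row are all assumptions made for the example; the Dublin Core and WGS84 Geo properties are the real vocabulary terms.

```python
import csv
import io

# Real vocabulary terms (Dublin Core Terms and W3C Basic Geo).
DCT_TITLE = "http://purl.org/dc/terms/title"
GEO_LAT = "http://www.w3.org/2003/01/geo/wgs84_pos#lat"
GEO_LONG = "http://www.w3.org/2003/01/geo/wgs84_pos#long"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

# Hypothetical base URI for minted resources (an assumption, not from the talk).
BASE = "http://example.org/galway/amenity/"


def escape(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def csv_to_ntriples(csv_text):
    """Turn CSV rows with id, name, lat, long columns into N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = "<" + BASE + row["id"] + ">"
        name = escape(row["name"])
        lat = row["lat"]
        lon = row["long"]
        triples.append(f'{subject} <{DCT_TITLE}> "{name}" .')
        triples.append(f'{subject} <{GEO_LAT}> "{lat}"^^<{XSD_DECIMAL}> .')
        triples.append(f'{subject} <{GEO_LONG}> "{lon}"^^<{XSD_DECIMAL}> .')
    return triples


# Illustrative input only; the coordinates are approximate.
sample = "id,name,lat,long\n1,Eyre Square,53.2743,-9.0490\n"
for line in csv_to_ntriples(sample):
    print(line)
```

The resulting N-Triples lines could then be loaded into a triple store such as the Virtuoso server mentioned above.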

<PhilA2> Daniel: What was the total length of the project?

<Dirdigeng> Enriching open government data with crowd sourced data

<PhilA2> Deirdre: It's part of a bigger project which started around Christmas

<PhilA2> ... some issues: getting into app stores takes time

<CaptSolo> takes time to get apps onto Android store and Apple Store

<PhilA2> PhilipD: Do you convert from KML, then into RDF then back to KML to be in Google maps - seems lossy?

<PhilA2> Deirdre: Wanted to show potential of LOD so it gets enriched along the way

<katleen_> @dirdigeng: enriching the data with crowdsourced data opens up a big pool of legal issues re quality, ownership and data protection

<PhilA2> osimod: How can local authorities use the power of this to improve their planning and policy making?

<PhilA2> Deirdre: It's not being used in that way by Galway... not aware of LOD and crowd sourced data being used in that way but there's a lot of potential

<PhilA2> ... harvesting what people are saying in social media and integrating that

<PhilA2> MartinKaltenboeck: Do you offer the triples for re-use?

<PhilA2> Deirdre: Haven't had the discussion with the council about licences.

<PhilA2> ... data sets are quite small

<PhilA2> ... want to extend the collaboration

<PhilA2> MartinKaltenboeck: I was hoping that the work you've done is then available to others as data

<PhilA2> Deirdre: Dublin core, basic Geo etc.

<PhilA2> Deirdre: Within each data set, it will have its own namespace. Looking for common elements, esp terms in DBpedia - the problem is that the data is small and doesn't necessarily have many links to make there or to GeoNames etc.

<PhilA2> Vagner_br: Joins the discussion. Makes comments about what we've heard so far

<PhilA2> Vagner_br: We saw simple data and not much integration

<PhilA2> Vagner_br: Not being used to improve government decision making

<PhilA2> Vagner_br: Noel told us that OGD is often difficult to use by end users

<PhilA2> Vagner_br: We've seen lots of initiatives. Many of them, though, are missing a use case and don't necessarily present real value to the public

<PhilA2> Vagner_br: Common idea - OGD is not a trivial task

<PhilA2> Vagner_br: It has become fashionable, but there are many countries with scarce resources opening their data without any concern about it being used or being useful

<PhilA2> Vagner_br: Not concerned whether the community has the skill to make use of the data

<PhilA2> Vagner_br: We need to look at the value chain

<PhilA2> Vagner_br: In particular, citizens as protagonist of open e-government and lawyers as protagonists of license issue

<PhilA2> Vagner_br: Need to get different actors in the chain together

<PhilA2> Vagner_br: Need to understand what kind of data can be used and establish a common agenda

<PhilA2> Vagner_br: Putting data in an open format and publishing is not trivial. Need expert help, esp. with triplification

<PhilA2> Vagner_br: Another lesson - timing is different for both sides of the value chain - the guardians of the data and the users of it

<PhilA2> Vagner_br: Maybe dialogue is asynchronous - we need to understand the value chain better

<PhilA2> Noel: I agree fully

<PhilA2> ... we called a round table and asked people what they wanted in terms of transparency, and OGD was the main factor

<PhilA2> Vagner_br: (Answering osimod) some govs just put the data out without much concern about its use

<PhilA2> Vagner_br: So they have different view of timing

<TimDavies> PhilA2 asking "If governments don't care about re-use, why do they do it?"

<Dirdigeng> Does the value chain look something like http://pic.twitter.com/fixc8P53

<PhilA2> Julianlstar: Any evidence of the government re-using their own data?

<PhilA2> Vagner_br: I can't see any re-use of the data by the government. One benefit is to foster more cooperation between govs - but I've not seen it used by itself

<PhilA2> Vagner_br: But I'd like to see it

<PhilA2> Julianlstar: What we found in Manchester - they analysed how data was being used within their own departments. Having made the data more open, it's easier for the staff themselves to access it and they're working on making that better

<PhilA2> Yannis: OGD is a hot topic in academia

<Vagner_br> Open data value chain by Janet Hughes http://www.slideshare.net/janet-hughes/open-data-value-chain

<PhilA2> Yannis: We make apps so you can know things. We need more in the way of services so users can interact with the government

<PhilA2> katleen_: In the Flemish community we've looked at data sharing within the government. We keep looking at this from the supply side. We should include the developers in the discussion

<PhilA2> ... terms like OGD prob won't reach them

<PhilA2> AndrewL: We have an audience at the BBC that is not fascinated by data. We have a challenge to create something that is interesting

<PhilA2> ... focussing on the needs of the citizen is a good way of approaching this

<CaptSolo> PhilA2: I wince every time someone says "we will open this data by creating an iPhone app"

<CaptSolo> ... iPhone is all nice if you are rich enough to own one

<Vagner_br> s /http;/http:/

<CaptSolo> coffee break now

<Dirdigeng> I understand the concern about the deluge, but important not to let public agencies slip back into "we will judge what data to release when and how" mode of thinking

<PhilA2> Gwyneth: Example from the US - immigration data isn't linked across each set. So crowdsourcing work can be useful for making those links. And I see lawyers using it rather than asking the government directly.

<CaptSolo> there is a spare lightning talk slot

<Jeanne> AlbertoCottica: Works on Wikitalia. Chairing session on public engagement.

<Jeanne> AndreaP: The demand side of open data is taking center stage after working on getting open data out as fast as possible.

<Dirdigeng> Native English speakers: do not use cricketing metaphors either...

<Jeanne> AlbertoCottica: The pessimists among us worry about looking at an open data bubble. The growth and supply from cities is not going as fast.

<Jeanne> AlbertoCottica: If you are judges in Apps contests, we are seeing the same people and apps in each contest. As the civil society fails to deliver, government might back off if they feel they have overpromised.

<Dirdigeng> Growth in number of civic hackers is not matching growth in #opendata

<Jeanne> AlbertoCottica: Speakers will be looking at technical, legal, and social solutions to the problem. Starting with Tim Davies--human solutions.

<Jeanne> TimDavies: Supporting open data use through active engagement: http://www.w3.org/2012/06/pmod/pmod2012_submission_5.pdf

<Jeanne> TimDavies: The conventional narrative is that we get open data and that leads to applications. That's not always true. We often go from data to interfaces to ways of manipulating data.

<Jeanne> TimDavies: We might go to fixing data, data journalism, and analysis. Many uses initially go from one dataset to create other datasets. The highlighter pen is the most often used visualizer tool.

<CaptSolo> ... highlighter pen (+ a spreadsheet) is the most common open data visualization tool

<PhilA2> love the idea that most common tool for analysing open data is a highlighter pen

<CaptSolo> ... people print data out, highlight parts of it, talk about

<Jeanne> TimDavies: All the ways we use data actually involve many different technical steps. Whether making presentations or sites. We have to dig deeply into the creation of individual apps.

<Jeanne> TimDavies: With George Cook in Nottingham, we looked at what is involved in the open data sets. Example: release of COINS, the public spending data set in the UK. A lot of work goes into turning a tab-separated file into a clean, interactive version. People provide APIs, add context and make the data linkable with common identifiers.

<CaptSolo> ... people clean up the data, put it out for others to use

<CaptSolo> ... next: add context to data

<Jeanne> TimDavies: Shared source code, configurations of tools, that leads to a stock of information about how to use the data. Most of these steps are invisible to others. All these stages can become unstable.

<CaptSolo> Open Data -> Cleaned Data -> Linkable Data -> Mashups and Apps ->
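
The first two steps of that chain (raw tab-separated data to cleaned data, then cleaned data to linkable data) can be sketched roughly as follows. This is a hedged illustration only, not the tooling used on COINS: the column names (`department`, `amount`), the sample rows and the identifier URI are all assumptions made for the example.

```python
import csv
import io


def clean_rows(tsv_text):
    """Cleaning step: trim whitespace, drop rows with no amount, parse numbers."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        department = row["department"].strip()
        amount = row["amount"].strip().replace(",", "")
        if not amount:
            continue  # skip rows where the amount is missing
        cleaned.append({"department": department, "amount": float(amount)})
    return cleaned


def make_linkable(rows, identifiers):
    """Linking step: attach a shared identifier to each row where one is known."""
    return [dict(row, id=identifiers.get(row["department"])) for row in rows]


# Hypothetical raw data and identifier table, for illustration only.
raw = "department\tamount\n Health \t1,200.50\nTransport\t\n"
ids = {"Health": "http://example.org/id/department/health"}

linked = make_linkable(clean_rows(raw), ids)
print(linked)
```

Each later stage (mashups, apps) depends on the output of these earlier ones, which is one way to read the point that a failure at any stage can make an app fail.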

<Jeanne> TimDavies: Open data and democracy project was the basis for the study. Many of the apps fail because one or more of the stages of cleansing, linking, or visualizing fails. This creates a challenge in the open data ecosystem. We publish the raw data because that's the mandate for the project. Different end users build from different places on the system. The data stewards have a big challenge.

<PhilA2> TimDavies: Openness of the data is important - but we need to link the people using the data as well

<Jeanne> TimDavies: Challenge is hearing about the uses of the data to let them connect and document the value chain and allow interactions. Some of the data in project is crowdsourced, and not government data. If we architect our ecosystems around gov vs. other data we create an artificial gulf.

<Jeanne> TimDavies: Users of data have diverse needs. Effective use of open data requires more than the dataset. Open data for policy making for governance comes from both government and from other data providers.

<PhilA2> TimDavies: 5 stars of engagement http://www.opendataimpacts.net/engagement/

<Jeanne> TimDavies: Some of the best responses to this challenge involve deep engagement of publishers with data users. Looked at the 5 stars of open engagement: 1) be demand driven, 2) put data in context, 3) support conversations around data...

<Jeanne> TimDavies: People are challenged because when gov releases data it is often after gov publishes an analysis.

<Jeanne> TimDavies: 3 stars of engagement: Engage with hack-a-thons, online and offline engagements around datasets

<Jeanne> TimDavies: 4 stars: Invite people in to shadow in organizations, "School of Data" for online learning to working with data, run skill-building sessions in community

<Jeanne> TimDavies: 5 stars: Build in feedback loops, collaborate with community for new resources, provide support to sustain and build new apps and resources

<CaptSolo> slides: http://www.slideshare.net/timdavies/open-data-engagement-using-open-data-w3c-workshop

<Jeanne> TimDavies: Supporting citizen empowerment through open data needs greater attention to open data policy and practice; research needs to detail practices and architect open data technologies

<Jeanne> Question: How does survey relate to findings?

<Jeanne> TimDavies: Survey was in 2010 around Data.gov.UK--findings based on survey, and followed up in depth with 10-12 cases documenting journey. Opendata.net/reports

<Jeanne> TimDavies: Questions are also available on that page: http://opendata.net/reports

<Jeanne> AlbertoCottica: Miel Vander Sande from Ghent on the legal solutions

<Jeanne> MielVDS: Challenges for Open Data Usage: Open Derivatives and Licensing: http://www.w3.org/2012/06/pmod/pmod2012_submission_4.pdf

<Jeanne> MielVDS: Research group on Multimedia Lab around semantic web and linked data solutions. With local governments we are trying to do a bottom up approach to make local governments open more data.

<Jeanne> Phil--this would be a great topic for a future eGov IG on the findings and outcomes of Tim's survey.

<Jeanne> MielVDS: Neelie Kroes noted that open data increases transparency and makes evidence-based decisions possible. So we all asked for the data and many governments responded with open data portals.

<PhilA2> Nice to see examples of open licences other than UK!

<Jeanne> MielVDS: Published now under an open data license. Shows the UK Open Government Licence--do what you want with the data including making money, but you need to give attribution. Same in France and Vienna. Licenses are needed. But there are concerns as well.

<Jeanne> MielVDS: Governments are giving open data to the public and want the benefits to reach the public via open data. But do they really? Perhaps extensions should be added to existing licenses.

<Jeanne> MielVDS: Two evolutions: data co-ownership and feedback loop of applications. In most open data models, the government opens an open data portal on a platform; if data is wrong, the public has to use a different channel to notify the government. Government then changes the data if we are lucky.

<Jeanne> MielVDS: Evolution should be working towards one channel where public reads and writes and also notifies the government about what needs to be changed--mono-channel system.

<Jeanne> MielVDS: The model changes from government feeding data to the public to government creating a platform where others can change the data. This allows for data co-ownership.

<Dirdigeng> Restrictions to open create opportunities for bureaucrats to use remote control - cf first version of Canada's licence

<Jeanne> MielVDS: This would allow additional information to be added, crowdsourced, etc. This lets people improve the quality of the data and takes care of it.

<Dirdigeng> Need to find ways of avoiding IP ownership mess as data is corrected/enriched by numerous people

<Jeanne> MielVDS: Should use a standard protocol to communicate--OpenID, full feedback, well-described information. Data owners need to invest in infrastructure rather than create reports and visualizations. Servers need same communication protocol, flexible technologies, linked data applications, version management, conflict management, trust and provenance management.

<Jeanne> MielVDS: Relates back to Linked Gov Data group in W3C.

<PhilA2> Details of (new) W3C Linked Data Platform WG are at http://www.w3.org/2012/ldp/charter.html

<TimDavies> MielVDS presentation reminding me of discussion of Open Data Ecosystems (http://blog.okfn.org/2011/03/31/building-the-open-data-ecosystem/)

<Jeanne> MielVDS: Feedback on the communications have to go back into the story of the data itself. Usage statistics need to connect. Perhaps Google (or other dominant force) has the primary statistics on usage and need to be open as well. Mashups and aggregations should show sources and provenance.

<Jeanne> MielVDS: Results should contribute to the original. Legal framework: Defining domains of society concerning openness; stimulating governmental investment in recalculation platform, well thought licensing.

<Jeanne> MielVDS: Licenses need to enforce restrictions and engagement--who is obligated to participate in open data: governments, users, suppliers (private and public), monopoly with wide public coverage.

<TimDavies> Miel_VDS talking about the need for a legal framework. Also need to articulate norms of good behaviour on the web...

<Jeanne> The new Open Government Platform is architected to connect engagement and feedback in a single place: http://www.opengovplatform.org/ This is free, open-source software for anyone to download for an open data portal: https://github.com/opengovplatform/opengovplatform

<Jeanne> Question: How do you see the ownership/custodianship idea?

<Jeanne> MielVDS: A restriction might be that a government has to invest in the improvement of the data. If you get co-ownership of updated versions, then the license should state that everything is open.

<Jeanne> Question: Symmetry--open data is a push in one direction toward citizens, but there needs to be feedback?

<Jeanne> MielVDS: Homogeneous--the channel the government and public are on should be the same.

<Jeanne> Michalis Vafopoulos: We need public datasets with public ownership, and we also need infrastructure. We need big infrastructure for big data.

<Jeanne> Question: The information system is going to be open, but we are focusing on the data only right now. Soon it will be about an open bus--all the specifications of the bus will be published. You can apply and be issued a passport. Millions of small players with a few big publishers. In private world, many channel masters opened up to their partners to allow response just in time and benefitted from it. Then we can design good apps.

<Jeanne> Thanks!

<MartinKaltenboeck> here we go

<MartinKaltenboeck> David Osimo starts presentation

<MartinKaltenboeck> Title: Opinion Mining and Sentiment Analysis

<CaptSolo> slides: http://www.slideshare.net/osimod/osimo-crossoveropinionmining-pdf

<MartinKaltenboeck> David Osimo: challenges are manifold, such as making sense of thousands of voices and identifying the good ideas

<MartinKaltenboeck> David Osimo: state of the art technology is used to manage content sentiments - content analysis

<MartinKaltenboeck> David Osimo: therefore social networks are evaluated using sentiments

<MartinKaltenboeck> David Osimo: argument mapping tools are used

<MartinKaltenboeck> David Osimo: furthermore, similarity patterns across the opinions are used, as well as crowdsourcing mechanisms where people vote on opinions (what is important for the crowd), to cluster and identify ideas

<MartinKaltenboeck> David Osimo: all of these tools are freely available or very low cost - which is very important, so that they can be used by everyone...

<PhilA2> David seems to be saying that tools like Twitterart are easy to use and popular but unreliable. Also, simple tools to gather opinions from Web users are easy to use but unreliable... Hm.. that's a theme then!

<MartinKaltenboeck> David Osimo: mentioned http://www.regulations.gov as opinion mining tool example

<Jeanne> Interesting comments on democratization on content production and visualization, but not on the analysis of the data.

<CaptSolo> - let people filter content by voting (e.g., UserVoice service)

<CaptSolo> -- ineffective for identifying good ideas, generating innovation (people vote on simple things)

<MartinKaltenboeck> David Osimo: democratisation of content analysis is missing because the respective software is too expensive

<MartinKaltenboeck> Question: there are different tools available, such as http://sentistrength.wlv.ac.uk/

<MartinKaltenboeck> David Osimo: only freely available for R&D projects

<MartinKaltenboeck> David Osimo: for real opinion mining there is a lack of tools available for everyone - that is different from sentiment analysis.

<MartinKaltenboeck> David Osimo: future challenges for opinion mining: reduce human effort; identify GOOD ideas (not only ideas); question of investment because of the cost of the available software; usability: tools are very technical, lack usability and are thereby less inclusive;

<MartinKaltenboeck> David Osimo: more information available in the slides - available soon

<MartinKaltenboeck> Question: also 'big elephants' such as Facebook, Google etc see things that govs do not see...

<MartinKaltenboeck> David Osimo: this should be taken into account for future research - to include these opportunities

<MartinKaltenboeck> PhilA2: you cannot extract meaningful information from tweets because of the length of the text

<MartinKaltenboeck> PhilA2: free tools use generic algorithms - producing good tools takes a lot of effort
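
To make the point about generic algorithms concrete, here is a minimal sketch of the simplest kind of sentiment scoring: counting words from fixed positive and negative lists. It is not the method of SentiStrength or of any tool named in the session, and the word lists are tiny, hypothetical examples chosen for illustration.

```python
import re

# Tiny, hypothetical word lists; real lexicons are far larger.
POSITIVE = {"good", "great", "useful", "love", "helpful"}
NEGATIVE = {"bad", "poor", "useless", "hate", "broken"}


def sentiment_score(text):
    """Return (positive word count) minus (negative word count)."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    return pos - neg


print(sentiment_score("I love this useful open data portal"))
print(sentiment_score("The bus data is broken and useless"))
# A generic word count ignores negation, so this clearly negative
# sentence still gets a positive score.
print(sentiment_score("Not good at all"))
```

The last call shows why such an approach is cheap but unreliable on short texts: a single missed negation flips the result, and a tweet offers little other context to correct it.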

<MartinKaltenboeck> Remark by Yannis: we can put specific opinions into several places by pressing 'one button' and can also collect all reactions on this with pressing another 'button' - this is dangerous.

<TimDavies> David Osimo has highlighted inequality of access to analytic tools between citizens and governments. Can address this by promoting greater availability of open tools to an extent. But John Erickson has highlighted that Corporations may always have more access to data and tools than citizens or governments that they can use for influencing policy. Can this be solved technically? Or does it require regulatory response on how corporations can input into policy?

<MartinKaltenboeck> Katleen: are you using personal data for this research? Farida: riot tweet analysis: learned that using only machines is dangerous - human beings are needed for this.

<MartinKaltenboeck> David Osimo: we only deal with public opinion mining; but this data, collected at scale, also brings different output; loudest voices: there are methods to avoid hearing and taking into account only the loudest voices...

<deirdrelee> insightful vs inciteful......interesting use of words in deliberation sphere :)

<MartinKaltenboeck> Efthimios Tambouris, University of Macedonia - starts presentation: Augmenting Open Government Data with Social Media Data

<MartinKaltenboeck> Efthimios Tambouris: OGD characteristics as an introduction (various formats, domains, large numbers, ...)

<MartinKaltenboeck> Efthimios Tambouris: OGD and SMD (social media data) comparison. Aim of research: integration of OGD and SMD

<MartinKaltenboeck> Efthimios Tambouris: proof of concepts done by elections 2012 (Greece)

<MartinKaltenboeck> Efthimios Tambouris: showcases a classification scheme of OGD for better understanding of OGD - the project also evaluated ~ 60 publications on SMD

<MartinKaltenboeck> Efthimios Tambouris: Integration of OGD and SMD - challenges: e.g. identify relevant OGD data sets, transforming noisy SMD into structured data, integration issues, how to show results on a GUI (dashboard), AND: meaningful CASE STUDIES!!

<MartinKaltenboeck> Efthimios Tambouris: stage model for OGD that more and more integrates over data

<MartinKaltenboeck> Proof of Concept: UK election 2010 - 2 views to integrate: objective OGD view and subjective SMD view

<MartinKaltenboeck> Example: children in poverty: integrated view of OGD and SMD - single views available - integrated view in progress by project team

<MartinKaltenboeck> Aim of the project is to provide the infrastructure for OGD and SMD integration soon

<MartinKaltenboeck> Anneke Zuiderwijk, TU Delft introduces herself

<MartinKaltenboeck> Points out that meta data is very important in the field of OGD - asks presenters about their opinion

<MartinKaltenboeck> David Osimo: in opinion mining e.g. the labels of different systems including opinions are compared

<MartinKaltenboeck> Federico Remiti, ETNA announces his demo at 05.00pm at the workshop

<MartinKaltenboeck> Question: to David Osimo & audience: industry products could be used for policy issues also?

<MartinKaltenboeck> David Osimo: it is a question of algorithms - the available products come from marketing areas, politics field (not policy making), etc...

<MartinKaltenboeck> David Osimo: process design is crucial to set up such an opinion mining system

<MartinKaltenboeck> LUNCH break ;-)

<CaptSolo> PhilA2: a lightning talk slot is still open. do volunteer.

Open Data and the Media

<olyerickson> program note: Oluseun couldn't make it in time; please see his paper

<olyerickson> Paper 1: Christophe Gueret

<olyerickson> ...Decentralized Open Data

<olyerickson> ...focused on European countries

<olyerickson> ..."Common facets of Open Data"

<olyerickson> ...We think of Open Data Portals

<olyerickson> ...Single point-of-entry

<olyerickson> ... usually in @en

<olyerickson> ...txt oriented

<olyerickson> ...Facet: Open Data Users

<olyerickson> ...(several assumptions about users)

<olyerickson> ...facet: Open Data Producer Hardware

<olyerickson> ... datacentres, max uptime, high speed, reliability

<olyerickson> ...Putting the facets together: Who are we targeting?

<olyerickson> ...Data consumers and producers that are a SMALL FRACTION of reality

<olyerickson> ...So: Data sharing need and issues

<olyerickson> ...(rural farmer example)

<olyerickson> ...issues with sharing from this example: distances, limited infrastructure, language/dialect, etc

<olyerickson> ..."A simple question: How can everyone benefit from Open Data?"

<olyerickson> ...They organized workshop on that theme

<olyerickson> ...aspects of the solution: voice interfaces, "down-scaled" infrastructure

<olyerickson> ...Why voice: everyone can at least speak; good mobile penetration in sub-Saharan Africa

<olyerickson> ...How: T2S, ASR technologies need to be adopted and further developed

<olyerickson> ...Downscaling Infrastructure? Adopt swarm model of micro-servers instead of larger central one

<olyerickson> ...u-servers less costly, increase robustness, de-centralized/closer to users (consumers, producers)

<olyerickson> ...<image> Target hardware, portal and users

<PhilA2> Can't help pointing out the lack of iPhones in that picture...

<olyerickson> @phila and lots of those nokias...

<olyerickson> ...Some of the ongoing projects: RadioMarche, Foroba Blon, semanticXO

<PhilA2> Agenda updated http://www.w3.org/2012/06/pmod/agenda

<olyerickson> ...radioMarche: sharing market prices (NGO is in the center)

<olyerickson> ...<image> RadioMarche node

<olyerickson> ...radio, tiny server, power strip

<olyerickson> ...learning: with voice command, must have familiar voices for prompts (people the farmers know)

<olyerickson> ...Foroba Blon: Timely Data Sharing

<olyerickson> ...Tweet-like status updates on map interface

<olyerickson> ...enables/encourages citizen journalists

<olyerickson> ...users call service, leave message

<olyerickson> ...SemanticXO: sharing data 6 to 12 using mesh based on OLPC

<olyerickson> ...designed to be very easy to use for kids

<olyerickson> ...data sharing stack for XO (aka OLPC)

<olyerickson> ...every XO is data publisher and consumer

<olyerickson> ...does not rely on central host

<olyerickson> ...Q: Do you have a linked data application (e.g. mashup) framework

<olyerickson> ...Answer to John's question: Data is added/consumed via "Sugar activities" (which consume the API)

<olyerickson> paper 2: How Open Data is Redefining the Role of Journalist (et al) (Andrew Leimdorfer (BBC))

<olyerickson> ...will be a little off topic; they don't work with "large" datasets

<olyerickson> ...but relevant wrt meeting the "audience" needs

<olyerickson> ...their team: "BBC specials" ... lots of multimedia interactives

<olyerickson> ... developers, designers, etc

<olyerickson> ...BBC: Reach, Quality, Impact, Value

<olyerickson> ...They have various tangible metrics --- how many clicks, time spent on page, etc

<olyerickson> ... Example: Infographic for Eurozone crisis

<olyerickson> ...positive feedback, BUT in user feedback, "too much maths..."

<olyerickson> ...User feedback can be an eye-opener. Remember the vacuum in which you operate...

<olyerickson> ...Another example: "user friendly" Eurozone crisis infographic (the one with circles)

<olyerickson> ...Q: What was the reaction from the dataviz experts?

<olyerickson> ...Another learning: audience loves infographics/apps that help them

<Jeanne> Interesting BBC perspective on how readership changes with the ability for people to be drawn to the graphics, personalization of data like how much tax do I pay based on my behavior?

<olyerickson> ...example: personal finance app that people wanted to share

<olyerickson> ...Answer to John's Q: "Junk charts" review; BBC made some changes, explanations, but held strong on other issues.

<olyerickson> ...Another example: "Deaths on Every Road" --- fed by FOI request

<olyerickson> ...Partnered with Talis

<olyerickson> ...hit a dead-end w.r.t. back-end implementation

<olyerickson> ...example: eerie, artistic "roads lit up" visualization of accidents

<olyerickson> ...re-stressed: BBC does what it thinks its audience will get the most value out of

<olyerickson> ...BBC well-known for exposing data from its core domains

<olyerickson> ...BBC music --- rich data because it has the program's own data

<olyerickson> ...BBC falls in the spectrum between just users and just publishers

<olyerickson> ...The media has an important role

<olyerickson> ...Example: public sector pay (application)

<olyerickson> ...enables the extraction of the "real story" from the raw data

<olyerickson> ...looking at BBC's work as not publishing, not consuming, but a mix

<olyerickson> Q: Are comments to BBC web being analyzed?

<olyerickson> ...BBC comment system is not high quality; attracts lots of trolls

<olyerickson> ......still trying to work out a better way to do beak....

<olyerickson> ...Q: What is BBC doing with different kinds of visualization

<olyerickson> ...Answer: The analysis is still largely textual

<olyerickson> ...hasn't gone over to mostly visual (still an area of work)

<olyerickson> ...and gets ported to alternative languages...

<olyerickson> ...Q: Cool dataviz are effective at the front end, but do they lead to long-term action?

<olyerickson> ...Q: What does it mean in terms of technology and methods? Open-sourced, etc?

<olyerickson> ...Answer: doing follow-up research to see if coverage is having an effect.

<olyerickson> ...Also: in regular contact with NYTimes, Guardian, etc. on methods

<olyerickson> Paper 3: Anneke Zuiderwijk (TU Delft) excerpt from Workshop on Linked data: "Challenges and Solutions"

PhilA2, I'm not very good at it but I could try...

<olyerickson> ...ENGAGE project (FP7)

<olyerickson> ...ENGAGE Questionnaire

<olyerickson> ...distributed to email lists, ENGAGE web site, LinkedIn contacts, etc

<olyerickson> ...target return: 246 respondents; by Mar 9, 192

<olyerickson> ...(early results presented)

<olyerickson> ...Survey included "What types of data used," "How often?" etc

<olyerickson> ...Results: breakdown of where they got the data from...

<olyerickson> ...Results: Purposes for which the users used the data

<olyerickson> ...Results: Does the data meet the users requirements?

<olyerickson> ...RE: meeting requirements, difference between meets requirements ("able") and simply making it easier

<olyerickson> ...Results: metadata --- 79% said they "used metadata..."

<olyerickson> ...Results: Which metadata would be useful?

Lightning Talks (Jeanne Holm)

<olyerickson> Talk 1: Farida Vis (Univ of Leicester) "Allotment (publics)"

<olyerickson> ...combines several topics we've heard about today

<olyerickson> ...allotment is land rented from town govt

<olyerickson> ...standard allotment: 10 "rods"

<olyerickson> ...why? Interested in understanding user participation, producing data people care about, etc

<olyerickson> ...Clause 23 of Allotment Act applies

<olyerickson> ...allotments are the responsibility of local councils

<olyerickson> ...monies are specified to be returned to local councils

<olyerickson> ...uproar over potential loss; "threat to the good life"

<Jeanne> Farida Vis' analysis of the UK Allotment Act data shows influence of citizen feedback driving gov policy.

<olyerickson> ...major response to govt (over half of all responses)

<olyerickson> ...history: dates to WWII/"Dig for Victory"

<olyerickson> ...enormous demand, tiny supply, cycles of interest/popularity

<olyerickson> ...who's done research on allotment data?

<Jeanne> Farida's paper is at http://www.w3.org/2012/06/pmod/pmod2012_submission_28.pdf

<olyerickson> ...TTWK, Guerrilla gardening movement, etc

<olyerickson> ...What data is available? Example: Greater Manchester open data

<olyerickson> ...locations, waiting list data, councils that have closed waiting lists

<olyerickson> ...FOIA results for specific key data

<olyerickson> ...Published on Guardian Datablog

<olyerickson> ...picked up by UK press

<CaptSolo> http://www.guardian.co.uk/news/datablog/interactive/2011/nov/10/allotments-rents-waiting-list-england

<olyerickson> ...Ginormous waiting lists...

<olyerickson> ...Now what? Building a national allotment data hub

<olyerickson> ...working with local councils

<olyerickson> ...locating who actually has this data

<olyerickson> ...much of it is the local allotment holders

<olyerickson> allotmentdata.org and @allotmentdata

<olyerickson> Talk 2: "BuitenBeter" ???

<olyerickson> ... http://www.buitenbeter.nl/

<olyerickson> ...(technical difficulties...)

<olyerickson> ..."BuitenBeter: a free mobile app"

<olyerickson> ...Take photo...GPS location...problem description...report problem to correct city

<olyerickson> ...Citizen needs, Government needs

<olyerickson> ..."citizen-centric solution...direct/relevant communication channel.."

<olyerickson> ...Impact: Sustainable...100K users, 60K real issues reported

<olyerickson> ..."Good simplicity hides necessary complexity"

<olyerickson> ...<architecture diagram>

<olyerickson> ...Twitter and blog activity demonstrating impact

<olyerickson> ...<collage of BuitenBeter images>

<olyerickson> ...emphasis on active engagement

<olyerickson> ...helps to create "smart cities"

<olyerickson> ... partner with http://liveandgov.eu

<olyerickson> Q: Do problems "fix themselves" because of twitter activity about them?

<olyerickson> ... See also http://fixmystreet.com

Full Break, return at 3:50p

<PhilA2> scribe: cgueret

Impact of Open Data on Policy Modeling (panel)

Andrew Stott, Public Sector Transparency Board, former UK Government Director of Transparency & Digital Engagement

unit dealing with geospatial information

scribe: Joint Research Center (JRC)
... the INSPIRE directive is a law to establish an infra to share geospatial information
... based on SDIs (Spatial Data Infrastructures)
... does not affect property rights, started in May 2007
... November 2011: EU member states had to provide INSPIRE-compliant discovery and view services
... 200000 resources from 15 members now
... (still in testing phase)
... purpose of INSPIRE was to allow interoperability
... through harmonization
... inspire is based on XML
... INSPIRE is also an architecture: contributors can register their services to it
... INSPIRE is not only about geospatial information, also buildings, administrative units, etc
... 20 different objects - list agreed across all the members
... metadata in INSPIRE describe the services that are available through it
... in machine readable format
... make use of SKOS thesauri
... the metadata is not necessarily in English
... can be in any of the official EU languages
... highlights the importance of thesauri
... INSPIRE is already using standards, why should it move to Linked Data?
... would allow cross-domain search and data aggregation by joining other eGov frameworks
... this integration needs a domain-independent data model => RDF/LOD provides this
... what needs to be done to make INSPIRE LOD-compliant, without having to rebuild it from scratch?


scribe: inspire ok up to three stars on Tim's scale
... 4 and 5 partially
... no commitment to use URIs for identifiers, some may be using them, some may not
... no links
... need to define mappings and build an abstraction layer
... issues: agree on minimal common terminology (vocabularies) and foster semantic multilingual search
... for vocabularies, c.f. ISA program
... core vocabularies define a minimal but extensible set of terms to describe eGov resources
... but not enough to represent cross-domain temporal and geographical entities
... spatial / temporal queries: should allow for expressing geometries
... several approaches available
... still an issue
... pioneering work by Ian Davis on placetime.com
... ADMS + eGov core vocabularies are to enable cross-domain integration
... two other new initiatives: European Union Location Framework (EULF) and
... LOD enabled INSPIRE prototype: http://inspire-geoportal.ec.europa.eu
... exposes 200,000 resources
... metadata served in XML, JSON and HTML
... conclusion, important points: vocabularies and encoding of geospatial things and coordinates
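
As an illustration of the mapping step mentioned in the talk ("need to define mappings and build an abstraction layer"), here is a minimal sketch of how one simplified metadata record could be turned into RDF triples with URIs as identifiers and links. The record layout, the example.org URI patterns and the function names are assumptions made for this sketch; real INSPIRE metadata is considerably richer, and nothing below was shown in the talk.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified metadata record (NOT the real INSPIRE schema).
RECORD_XML = """
<record id="abc-123">
  <title lang="it">Unità amministrative</title>
  <keyword>Administrative units</keyword>
</record>
"""

# Assumed URI patterns, for illustration only.
BASE_URI = "http://example.org/inspire/record/"
THEME_URI = "http://example.org/inspire/theme/"
DCT = "http://purl.org/dc/terms/"  # Dublin Core terms namespace


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def slug(text):
    """Turn a keyword into a URI-friendly token."""
    return text.strip().lower().replace(" ", "-")


def record_to_ntriples(xml_text):
    """Map one simplified record to a list of N-Triples lines."""
    root = ET.fromstring(xml_text)
    # Star 4: give the record a URI as its identifier.
    subject = f"<{BASE_URI}{root.get('id')}>"
    triples = []

    title = root.find("title")
    if title is not None:
        # Keep the language tag, since metadata may be in any EU language.
        lang = title.get("lang", "en")
        triples.append(
            f'{subject} <{DCT}title> "{escape_literal(title.text)}"@{lang} .'
        )

    for keyword in root.findall("keyword"):
        # Star 5: link to a concept URI instead of repeating a plain string.
        triples.append(
            f"{subject} <{DCT}subject> <{THEME_URI}{slug(keyword.text)}> ."
        )
    return triples


if __name__ == "__main__":
    for line in record_to_ntriples(RECORD_XML):
        print(line)
```

Pointing keywords at concept URIs rather than strings is what would let a multilingual thesaurus (for example one expressed in SKOS) carry the labels in each language, which is the role the talk gives to thesauri.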

Q: indeed no agreement for geo in RDF but you are using geosparql anyway?

A: use whatever is available to compensate for the lack of standard

Q: metadata is three stars, what about the data itself?

A: INSPIRE says that metadata and discovery services have to be open, nothing is imposed on the data

scribe: should not be imposed by law, it's up to the data owners to see the interest they have in opening it

Panel Session: Impact of Open Data on Policy Modeling Chair: Gianluca Misuraca, JRC-IPTS/EC

<PhilA2> Panel members: Jeanne Holm (data.gov evangelist); Andrew Stott UK Cabinet office Transparency Board; Simona de Luca Ministry of Economic Development, Department for Cohesion, Italy; Franco Accordino, Head of the European Commission's Task Force Digital Futures

Jeanne Holm: background in space, joined open data a few years ago

scribe: Obama signed a memo for transparency in his early days of presidency
... even before, transparency driven by popular demand
... have a feedback mechanism
... evolved over time, from rating to conversations on Twitter
... initiative "We the People" in the White House
... petition based data driven
... everyone can suggest to change something in the gov and get support for it
... most petitions have many signatures
... move policy to being driven by data and transparency

Andrew Stott: background in Gov IT and large scale systems + policy making

scribe: big data movement close to the open data movement
... now govs have to deal with increasing amount of data in order to make their policies

<PhilA2> Interesting - used to be a 6 month wait for a policy maker to be able to scan the UK Social Security database (which took a weekend). Now it's on your laptop

scribe: did some work on reviewing data from New Zealand: a lot of feedback from users

<PhilA2> National UK crime-maps at http://police.uk

<PhilA2> Police complaining about all the feedback :-)

scribe: police complained they have to revise their priorities to cope with the feedback
... gov staff spend a lot of time looking for info internally
... people in govs use open data sites to get the data they need
... improve dialog within government
... no systematic measure of it but visibly happening

<PhilA2> UK MPs more likely to use http://www.theyworkforyou.com/ than http://www.parliament.uk/ to get info about parliament

Simona de Luca: recently published study on situation

scribe: not a lot of data, no central point of access
... data re-use has been shown to go with a central point of access
... in Italy, experiments to speed up publication process

<PhilA2> Interesting that Italian 'technical government' is accelerating progress on open data

scribe: at the moment, working on opening information on cohesion policies
... opening data means cleaning the data
... choose criteria to publish
... means need to work on metadata. Open data is a first step, next comes visualisation and tools to collect feedback
... not thinking about modelling and linking now

<PhilA2> Focussing on mapping 'interventions' and assessing impact

Franco Accordino: digital future task force

scribe: start July 4th
... by 2013, deliver inspiring ideas (with evidence) for policy making for 2020
... c.f. Europe 2020
... digital agenda is one of the 7 pillars of this document
... goal of -20% carbon footprint by 2020
... for instance, several other targets established
... hard to assess if goals are realistic or not, but important is to find what is needed to reach them

<PhilA> Digital Futures brings in scientific data together with opinions "we want to eliminate poverty by..." etc.

<PhilA> Then mixes in contradictory opinions and see what comes out

scribe: there could be competing visions to achieve a given goal
... we want to engage more and more the citizens
... make every european a policy maker

<PhilA> Digital Futures: To Make Every European A Policy Maker

scribe: to do that, will develop a platform
... open to everyone with data access and also reasoning on the data
... goal is to offer the same experience to everyone
... important to work closely with the US because it's all global problems
... "data has no borders"
... the whole Commission is moving towards Open Data

Q: petitioning mechanism sounds like a rough means, which is also easy to manipulate. What are the limits to filter?

A: (Jeanne) Petitions are proposed by people - most popular is to legalise Marijuana, was against many local laws

scribe: some less controversial
... the government is clear about why something is possible or not

(Andrew) first petition in UK in 2007, had an up and down history

scribe: one famous one was against the taxes for congestion
... in 2009 act required every council to have a petition system
... if more than 200000 signatures, there is a debate in parliament
... "spending challenge" crowd-sourced ideas for saving public money
... UK gov has hired Wikipedia founder to think about ways to get more feedback from public consultations

(Franco) 1500 responses to a call about horizon2020 ideas

scribe: 29 questions, big challenge was to process the data
... everything had to be done manually because responses contained free text like "you should do this and that"
... 6 months of processing time => too much
... writing a directive takes 2 years from consultation to end
... aim at making it more evidence based and participatory (less reactive)

(Simona) start by opening the data to encourage collaboration among citizens

Q: are we at the point where a country can no longer avoid publishing its data?

Q: Open data emphasises the supply side more than the demand side, does that reflect some difficulty on the gov side in talking with consumers?

Q: how can the participatory approaches be brought forward? How can we redefine the policy making cycle to consider this?

(Jeanne) whatever entity you represent, it's important we all share our information

scribe: it's more important than to worry about the number of stars (but that's legitimate too)

(Simona) governments have a short time in front of them so need to act quickly

(Andrew) have already seen the policy making process redesigned twice

scribe: one for ISO compliance
... one for having not only quality standards but also predictive results
... lot of policy makers don't have good data scale
... a bit like "minority report" now: it is possible to predict which kids will be at risk before they are even born, how do we deal with that for policy making?

(Franco) that's the beauty of opening

scribe: we are one big crowd that can reach consensus on things
... we're moving onto having a way to influence a system that has been working for years in some way
... swiss already have the right culture for it
... and they can afford it (referendums)
... now with ICT, new systems will be developed
... govs have to adapt faster to an ever changing world
... data is about the past but can be used to extrapolate to people's desires
... it's important that the Open Data community makes sure it can reach non-specialists
... politicians that are not in the US, UK or Italy have to understand the whereabouts too

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2012/07/12 11:35:50 $