Timestamps are in UTC.

meeting: ODW Day 2
Has anyone seen hadley
scribenick: BartvanLeeuwen_
Topic: Perspectives & Experience
scribenick: BartvanLeeuwen
08:04:19 [BartvanLeeuwen]
Topic: Perspectives & Experience
08:05:02 [BartvanLeeuwen]
DNA Digest by Fiona Nielsen
08:05:38 [StevenPemberton]
08:06:03 [BartvanLeeuwen]
DNADigest: not-for-profit org. with the purpose of opening up genome sequences
08:06:53 [BartvanLeeuwen]
DNA sequencing, purpose: cancer research, heritable traits and illness, and rare diseases
08:07:19 [StevenPemberton]
08:08:41 [StevenPemberton]
08:09:23 [StevenPemberton]
Chair: Phil Archer
08:09:44 [PhilA]
Apologies to Fiona who is saying seriously interesting stuff but fighting technology problems. Facilities are on the case but so far without success
08:09:49 [StevenPemberton]
08:10:29 [StevenPemberton]
scribenick: StevenPemberton
08:10:51 [StevenPemberton]
Fiona: researchers need databases of genomic variation
08:11:06 [StevenPemberton]
... more data is needed to validate results
08:11:24 [StevenPemberton]
... they want to learn from other data
08:11:29 [StevenPemberton]
... but are not sharing their own
08:11:40 [StevenPemberton]
... why not?
08:11:47 [StevenPemberton]
... - confidential data
08:12:07 [StevenPemberton]
... - easy to identify individuals from genome data
08:12:17 [StevenPemberton]
... - BULKY
08:12:38 [StevenPemberton]
... so data has to be de-identified and aggregated
08:12:46 [StevenPemberton]
scribenick: BartvanLeeuwen
08:12:50 [BartvanLeeuwen]
no open sharing of raw data
08:13:22 [BartvanLeeuwen]
only defects for described diseases are shared
08:13:45 [BartvanLeeuwen]
DNA Digest's solution: not all research needs access to the full data
08:14:28 [BartvanLeeuwen]
e.g. does this mutation occur at a higher frequency?
08:14:48 [stressindikator]
this result is already de-identified
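The aggregate question noted above ("does this mutation occur at higher frequency") could be sketched roughly as follows. This is only an illustration under stated assumptions: the `CohortCounts` layout, the function names and the small-cell threshold `min_cell` are hypothetical and are not taken from the talk or from DNA Digest's system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CohortCounts:
    """Aggregate counts only; no per-individual genotypes are held here."""
    carriers: int  # people in the cohort who carry the mutation
    total: int     # people in the cohort


def carrier_frequency(counts: CohortCounts, min_cell: int = 5) -> Optional[float]:
    """Return the carrier frequency, or None when the counts are too small to report.

    min_cell is an assumed small-cell suppression rule, not a documented policy.
    """
    if counts.total <= 0 or counts.total < min_cell:
        return None
    if 0 < counts.carriers < min_cell:
        return None
    return counts.carriers / counts.total


def occurs_more_often(cases: CohortCounts, controls: CohortCounts,
                      min_cell: int = 5) -> Optional[bool]:
    """Answer the yes/no question without exposing any raw genome data."""
    f_cases = carrier_frequency(cases, min_cell)
    f_controls = carrier_frequency(controls, min_cell)
    if f_cases is None or f_controls is None:
        return None  # refuse to answer rather than risk identifying someone
    return f_cases > f_controls
```

The caller only ever sees a frequency or a yes/no answer, which is the sense in which the result is "already de-identified".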
08:16:18 [BartvanLeeuwen]
challenges: connecting existing datasets, incentives for sharing and commercial model
08:16:34 [BartvanLeeuwen]
08:17:27 [BartvanLeeuwen]
Next speaker: Florian Bauer
08:17:58 [BartvanLeeuwen]
REEEP helps with renewable energy in developing countries
08:18:57 [BartvanLeeuwen]
website after website is launched, "portal proliferation syndrome", that needs to be fixed
08:19:31 [BartvanLeeuwen]
in current projects, data silos are created
08:22:09 [StevenPemberton]
scribenick: StevenPemberton
08:22:36 [StevenPemberton]
Florian: the system has to deal with disambiguation
08:22:41 [StevenPemberton]
08:22:45 [BartvanLeeuwen]
it seems very hard to stop this
08:22:46 [BartvanLeeuwen]
but it's because project deliverables state information dissemination, not how
08:22:46 [BartvanLeeuwen]
We need to break silos and link information to help them do it differently
08:22:47 [BartvanLeeuwen]
Can we support the linking of information with an automated system?
08:22:47 [BartvanLeeuwen]
we have to!
08:22:52 [StevenPemberton]
... and use linked open data
08:23:16 [StevenPemberton]
... we supply thesauri
08:23:30 [StevenPemberton]
... our system helps tag unstructured documents
08:23:53 [StevenPemberton]
... a file is sent, in whatever format, we analyse it, extract common terms
08:24:05 [StevenPemberton]
... based on thesaurus and vocab
08:24:16 [StevenPemberton]
... and deliver lots of metadata
08:24:25 [StevenPemberton]
... with definitions etc.
08:24:44 [StevenPemberton]
... and we can store the location of the document with tags (if the user wants)
08:25:05 [StevenPemberton]
... [gives an example]
08:25:42 [StevenPemberton]
... please look at
08:25:57 [StevenPemberton]
... we are a non-profit
08:26:02 [StevenPemberton]
... so no charge
08:26:24 [StevenPemberton]
... we consider this a first step
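The tagging flow Florian describes (a document comes in, terms are matched against a thesaurus, metadata comes back) might look something like this in miniature. The thesaurus entries, the `example.org` URIs and the function name are made up for illustration; the real service's API is not described in these notes.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mini-thesaurus: preferred label -> (concept URI, alternative labels).
THESAURUS: Dict[str, Tuple[str, List[str]]] = {
    "solar energy": ("http://example.org/concept/solar-energy",
                     ["solar power", "photovoltaics"]),
    "wind energy": ("http://example.org/concept/wind-energy",
                    ["wind power"]),
}


def tag_document(text: str, thesaurus=THESAURUS) -> List[dict]:
    """Return one tag per thesaurus concept found in the text, most frequent first."""
    lowered = text.lower()
    tags = []
    for preferred, (uri, alternatives) in thesaurus.items():
        hits = 0
        for label in [preferred, *alternatives]:
            # Whole-word match so that "wind power" does not match inside other words.
            pattern = r"\b" + re.escape(label) + r"\b"
            hits += len(re.findall(pattern, lowered))
        if hits:
            tags.append({"label": preferred, "uri": uri, "hits": hits})
    return sorted(tags, key=lambda tag: tag["hits"], reverse=True)
```

Mapping alternative labels onto one preferred concept is a very small-scale version of the disambiguation step mentioned earlier in the talk.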
08:27:13 [StevenPemberton]
Next speaker: Dan Brickley
08:27:21 [BartvanLeeuwen]
next speaker: Dan Brickley
08:27:28 [JeniT]
08:27:37 [StevenPemberton]
scribenick: BartvanLeeuwen
08:28:10 [StevenPemberton]
08:28:13 [BartvanLeeuwen]
08:28:38 [BartvanLeeuwen]
search engines decided to collaborate, even though they are competitors
08:29:39 [BartvanLeeuwen]
putting triples on the web was only done by a limited number of people
08:30:00 [BartvanLeeuwen]
is not SEO, it's semantic technology gone mainstream
08:30:44 [BartvanLeeuwen]
is HTML with markup, to generate triples
08:31:12 [BartvanLeeuwen]
creates rich snippets for enhanced search results
08:31:31 [BartvanLeeuwen]
allows custom search engine
08:32:22 [BartvanLeeuwen]
working on integration and discovery of datasets
08:32:45 [BartvanLeeuwen]
we are driven by being able to answer questions, that's why datasets are interesting
08:33:05 [BartvanLeeuwen]
[ shows examples ]
08:34:02 [Florian]
08:34:39 [BartvanLeeuwen]
vocabulary is not designed by the people, it's community driven
08:35:06 [BartvanLeeuwen]
the team only integrates community efforts
08:35:40 [BartvanLeeuwen]
this community is driven by a W3C community group
08:36:08 [BartvanLeeuwen]
datasets are added based on work by, among others, W3C GLD
08:37:13 [BartvanLeeuwen]
where do graphs stop and tables start?
08:37:49 [BartvanLeeuwen]
is about finding datasets, not making all data RDF
08:38:19 [PhilA]
If I get a chance I'll talk about what we're doing at W3C on vocabs - the management model is instructive
08:38:25 [BartvanLeeuwen]
Next speaker: Licensing Library and Authority Data Under CC0: The DNB Experience, Lars Svensson
08:40:09 [BartvanLeeuwen]
today I talk about the change of business model
08:41:35 [BartvanLeeuwen]
mixed business model right now: which dataset / which format / access type
08:42:31 [BartvanLeeuwen]
3 methods for harvesting metadata: SRU, OAI and HTTP
08:43:02 [BartvanLeeuwen]
SRU and OAI are free services after registration to keep customers up to date
08:43:25 [BartvanLeeuwen]
HTTP is free, no registration or tracking
08:43:42 [BartvanLeeuwen]
formats, RDF, CSV and MARC
08:43:49 [BartvanLeeuwen]
MARC is a custom library format
08:44:16 [StevenPemberton]
08:44:52 [BartvanLeeuwen]
by end of this Quarter also as PDF ( nice PDF )
08:45:17 [BartvanLeeuwen]
Authority data is available under CC0 and metadata as well
08:45:56 [BartvanLeeuwen]
metadata about books is mostly CC0, RDF / CSV always CC0
08:46:32 [BartvanLeeuwen]
MARC data from the last 2 years is available for a fee; older data is under CC0
08:46:48 [BartvanLeeuwen]
process of transition
08:47:05 [BartvanLeeuwen]
discussion about loss of revenue and spillover
08:47:50 [BartvanLeeuwen]
scared of people running away with our data and earning lots of money with it.
08:48:04 [BartvanLeeuwen]
no known examples of this actually happening
08:48:32 [BartvanLeeuwen]
lessons learned: Finding right licenses can be tricky
08:48:49 [BartvanLeeuwen]
CC-BY license was too complicated
08:49:22 [BartvanLeeuwen]
make it easy for people to use your data!
08:49:29 [BartvanLeeuwen]
we ended up with CC0
08:49:58 [BartvanLeeuwen]
Next speaker: Open Government Data Projects in Japan, Shuichi Tashiro, IPA
08:50:06 [rjw]
08:50:24 [PhilA]
IPA is the government IT standards body.
08:51:23 [BartvanLeeuwen]
brief intro of opendata in Japan
08:52:09 [BartvanLeeuwen]
October 2008: CIO Forum started discussion about open data in Japan
08:52:20 [StevenPemberton]
08:52:24 [BartvanLeeuwen]
in 2001 there was already an e-gov portal site
08:53:06 [BartvanLeeuwen]
multiple administrative procedures available in one portal, but with non-standard UI
08:53:30 [BartvanLeeuwen]
public private collaboration with a realtime datalink
08:53:48 [BartvanLeeuwen]
between train company and seismic center
08:54:07 [BartvanLeeuwen]
in 2011 all trains stopped 10 seconds before earthquake wave reached mainland
08:54:52 [BartvanLeeuwen]
October 2010 an Open Gov Lab was started with various items
08:55:07 [BartvanLeeuwen]
before 2012 open gov strategy
08:55:28 [BartvanLeeuwen]
a lot of the data put out had low quality
08:55:40 [BartvanLeeuwen]
Earthquake of 2011 was a turning point
08:55:54 [BartvanLeeuwen]
demand for data:
08:55:57 [BartvanLeeuwen]
-- who is where
08:56:06 [BartvanLeeuwen]
-- who needs what / who can provide what
08:56:20 [BartvanLeeuwen]
-- availability of general infrastructure
08:56:27 [BartvanLeeuwen]
-- pollution level
08:56:35 [BartvanLeeuwen]
Multi tier collaboration is needed
08:56:49 [StevenPemberton]
08:57:07 [StevenPemberton]
08:57:17 [BartvanLeeuwen]
[ shows examples ]
08:58:29 [BartvanLeeuwen]
Recovery and reconstruction support program database, one stop shop for support programs
08:59:11 [BartvanLeeuwen]
this resulted in a vocabulary problem, all local govs used their own terminology
09:00:53 [BartvanLeeuwen]
Japan has a problem with its character code
09:01:02 [BartvanLeeuwen]
very local, and very specific to identity
09:02:25 [BartvanLeeuwen]
[ shows architectural diagram ]
09:02:51 [BartvanLeeuwen]
contains, among others, a vocabulary database to solve problems with local terminology
09:04:02 [BartvanLeeuwen]
also an open-license character database, with RDF support to find relationships between characters and legislation about character sets?
09:04:33 [BartvanLeeuwen]
Panel: Previous speakers
09:06:48 [BartvanLeeuwen]
hadley: calls out to panel: if you had a 10 billion budget, where would you be in 10 years?
09:07:30 [BartvanLeeuwen]
Danbri: we should finally start integrating
09:07:37 [BartvanLeeuwen]
Lars: Digitizing
09:08:06 [BartvanLeeuwen]
solve rights issues, working out who owns rights on publications is somewhat hard
09:09:01 [BartvanLeeuwen]
Florian: governments need to give us the data we need to answer the important questions. Smart grids are important
09:09:05 [daveL]
in relation to IPA presentation on linked data in Japan, there is a new W3C community that aims to develop best practice in multilingual linked open data:
09:09:47 [BartvanLeeuwen]
Fiona: Money is not an issue, nor is technology; standardization is the current issue
09:10:06 [BartvanLeeuwen]
Questions from room
09:10:33 [BartvanLeeuwen]
ivan, remark: we should generalize a bit not just government but also science data
09:11:02 [BartvanLeeuwen]
they produce data; we should be able to access that as well.
09:12:09 [BartvanLeeuwen]
Dave Lewis: remark, focus on LOD in countries where main language is English, there is multilingual Comm group [ please insert uri ]
09:13:03 [daveL]
09:13:50 [BartvanLeeuwen]
Hans Overbeek: questions about standards, what is the best advice? should we use or DCAT profile for integrating datasets
09:13:59 [PhilA]
DCAT Application profile work through EC's ISA Programme. Details at
09:14:27 [BartvanLeeuwen]
danbri: Schema.org is an agile approach, which could spin off into a standardization body
09:14:52 [ldodds]
scribenick: ldodds
09:15:05 [ldodds]
Topic: Product Data
09:15:28 [ldodds]
Chair: PhilArcher
09:17:12 [ldodds]
PhilArcher: I think product data is going to be very important
09:17:32 [ldodds]
first speaker is John Walker, paper:
09:17:58 [ldodds]
John is talking about Open Data in the electronics industry
09:18:25 [HadleyBeeman]
09:18:27 [ldodds]
John: @NXPData is our twitter, please share what we're doing
09:18:57 [ldodds]
... NXP is a semiconductor company in the Netherlands, lots of large customers and large portfolio of products
09:19:14 [ldodds]
... Content is the product, Product data is part of the content => data is the product
09:19:29 [ldodds]
... Why make data Open?
09:19:53 [ldodds]
... In bad old days, very doc centred information, lots of content silos, content reuse was copy and paste or (worse) re-keying
09:20:07 [ldodds]
... Consequences: inconsistency, cost, errors, complex to manage
09:20:35 [ldodds]
... Consequences (for customers): confusing, inconsistent, manual effort to gather/re-use information, difficult to find new products
09:20:50 [ldodds]
... Vision: unified content strategy. Create once, approve once, re-use many times
09:21:04 [ldodds]
... ISO 13584 -- data model for describing products
09:21:14 [ldodds]
... DITA -- for natural language content
09:21:36 [ldodds]
... variety of outputs, including flyers, data sheets, online, etc
09:21:47 [ldodds]
... data sheets are mixture of text and data
09:22:07 [ldodds]
... parametric search interface for finding products
09:22:35 [ldodds]
... dictionary of properties of products
09:23:33 [ldodds]
... NXP want to be canonical source of information which is re-published by others, e.g. distributors, aggregators, etc
09:23:43 [ldodds]
... want Web of Data to help collaboration
09:23:58 [ldodds]
... and
09:24:19 [ldodds]
... that is a work in progress
09:24:38 [ldodds]
... give us feedback on what is there
09:24:46 [ldodds]
... Open Data challenges
09:24:55 [ldodds]
... how do we convince others to use (linked) open data?
09:25:10 [ldodds]
... how do we justify business case (ROI) -- argue that it's simpler
09:25:26 [ldodds]
... what formats should we use? (rate of adoption of RDF/Linked Data)
09:25:39 [ldodds]
... How do we ensure quality, security, enable access
09:25:51 [ldodds]
... How do we combine semi- and unstructured content in publications?
09:25:56 [ldodds]
... Are we giving away a key asset?
09:26:07 [ldodds]
... How do we standardise in industry?
09:27:08 [ldodds]
Next speakers are Andy Hedges & Richard McKeating, Tesco
09:27:16 [StevenPemberton]
09:28:06 [ldodds]
Andy: big numbers about Tesco, but what is interesting is how our data connects us
09:28:23 [ldodds]
... some of that data is ours, some belongs to others: customers, suppliers, manufacturers
09:28:36 [ldodds]
... may not be able to share all of it: rights, commercial sensitivity
09:28:57 [ldodds]
... need to understand where information comes from, and how it can be best combined for customer benefit
09:29:09 [ldodds]
... offer customers best service/prices and compare data
09:29:36 [ldodds]
... perhaps not intuitive to allow cross-supermarket price comparison, but want to be a good brand and have a good contract with community
09:30:06 [ldodds]
Richard: I'm passionate about how to use open data, microformats, etc to help customers
09:30:34 [ldodds]
... Tesco Open Data is about our customers
09:30:44 [ldodds]
... Places where you can buy things, incl. online
09:30:52 [ldodds]
... Products that we offer (across range of brands)
09:31:03 [ldodds]
... Orders: what's in your basket, or buy on the web
09:31:21 [ldodds]
... Journeys: we make to fulfill orders, e.g. deliveries, logistics
09:31:36 [ldodds]
... Rewards: what do you get as a customer, Clubcard points, offers, price promise
09:32:07 [ldodds]
... at Tesco we are on journey towards exposing this data, to allow app providers to access it
09:32:16 [ldodds]
... would be great to link to open data sources
09:32:26 [ldodds]
... really key area is how we share data with trading partners
09:32:41 [ldodds]
... important for brands to share product information, esp. accurate data
09:33:01 [ldodds]
... customer access to data, e.g. purchase history, allergen information, etc
09:33:11 [ldodds]
... Tesco sell NXP products :)
09:33:42 [ldodds]
... but they have little information on it (opportunity for sharing)
09:34:06 [ldodds]
Andy: customers don't just shop at Tesco
09:34:24 [ldodds]
... standards make our life easier
09:34:49 [ldodds]
... expose price promise, cross retailer product comparison, delivery choices
09:35:08 [ldodds]
... can compare products using GTIN (same EAN)
09:35:24 [ldodds]
... but suppliers might have different sizes, etc. Some products can vary
09:35:45 [ldodds]
... Keen to start dialog with standards makers
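Comparing products by GTIN, as Andy mentions, depends on both sides holding valid, normalised codes. A minimal sketch of the standard GS1 check-digit calculation and zero-padding to 14 digits follows; the function names are illustrative, and this only compares identifiers (it says nothing about the size or variant differences noted above).

```python
def gtin_check_digit(payload: str) -> int:
    """Compute the GS1 check digit for the digits that precede it.

    Weights alternate 3, 1, 3, ... starting from the rightmost payload digit.
    """
    total = 0
    for position, char in enumerate(reversed(payload)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """True for a GTIN-8, -12, -13 or -14 whose last digit is the right check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 12, 13, 14):
        return False
    return gtin_check_digit(code[:-1]) == int(code[-1])


def to_gtin14(code: str) -> str:
    """Left-pad a valid GTIN with zeros so different lengths compare equal."""
    if not is_valid_gtin(code):
        raise ValueError(f"not a valid GTIN: {code!r}")
    return code.zfill(14)


def same_product(code_a: str, code_b: str) -> bool:
    """Compare two codes (e.g. a UPC-A and an EAN-13) as GTIN-14s."""
    return to_gtin14(code_a) == to_gtin14(code_b)
```

Leading zeros do not change the weighted sum, which is why a 12-digit UPC and its zero-padded 13-digit form validate identically.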
09:36:12 [ldodds]
Next speaker is Mark Harrison, GS1
09:36:16 [ldodds]
09:36:23 [ldodds]
GS1 is a standards body, assigns GTINs
09:36:53 [ldodds]
Mark: GS1 is global stds org; 1m companies working on standards, e.g. for barcodes, logistics, rfid, etc
09:37:11 [ldodds]
... new initiative started in Feb, GS1 digital: putting identification into the web
09:37:49 [ldodds]
... B2C example: map human-readable keywords ("milk") to Product category identifier (GPC), search has user constraints (price, distance, urgency)
09:38:08 [ldodds]
... contextual filters for product category, e.g. organic, skimmed milk
09:38:37 [ldodds]
... refine search to find products and services that match needs, e.g local store offering the product
09:38:53 [ldodds]
... Achieve this using data linkages
09:39:17 [ldodds]
... start with keyword which is mapped to category; category has attributes/criteria
09:39:54 [ldodds]
... imagine using contextual searching across all suppliers across the web, not just facet search within single website
09:40:04 [ldodds]
... and Good Relations are key vocabularies
09:40:26 [ldodds]
... + GS1 : Global Product Classification (GPC)
09:40:45 [ldodds]
... easy for them to open that, already available as XML, in process of creating multi-lingual RDF
09:41:18 [ldodds]
... GDSN: Global Data Synchronisation Network has additional data about manufacturers, etc
09:41:29 [ldodds]
... GPC as Linked Open Data
09:42:05 [ldodds]
Slide shows simple example graph of GTIN and facets
09:42:19 [ldodds]
Mark: please collaborate with us: email
09:42:38 [ldodds]
... I'm working on project as researcher, others handling industry engagement
09:43:12 [ldodds]
Next speaker is Philippe Plagnol
09:43:16 [ldodds]
Product Open Data
09:43:18 [ldodds]
09:43:46 [ldodds]
Philippe: Product data is critical for open data movement
09:44:10 [ldodds]
... everything around us is a product, has one GTIN code which is unique identifier
09:44:27 [ldodds]
... products used by everybody, every day
09:44:39 [ldodds]
... lots of contextual information, e.g. product packaging, nutrition
09:44:58 [ldodds]
... products are fundamental for trade, economics
09:45:15 [ldodds]
... objective is to create big repository of data about products, based on barcode
09:45:31 [ldodds]
... include ecological impacts, sources, support responsible consumers
09:46:02 [ldodds]
... lots of apps to support product barcode scanning, then can use that to access data
09:46:12 [ldodds]
... BUT: currently no public database containing this data
09:46:54 [ldodds]
... manufacturers have all this information in databases to support printing, but don't share it
09:47:04 [ldodds]
... largely question of access
09:47:30 [ldodds]
... give us what is already printed on the packaging, using GTIN as a key
09:48:11 [ldodds]
... need to have a product schema for manufacturers to support their publishing
09:48:39 [ldodds]
... asking manufacturers for nutritional data
09:48:53 [ldodds]
... working in France, hoping to get traction in other countries
09:49:19 [ldodds]
... incredible possibilities of using this product data
09:49:48 [ldodds]
... GTIN is a new communication channel. More easily support product annotation, product mapping, etc
09:50:23 [ldodds]
... apps can support product reviews, consumer recommendations/decision support
09:50:39 [ldodds]
... only thing stopping us is an open catalog of the data
09:50:56 [ldodds]
... imagine a "google maps for products"
09:52:01 [ldodds]
Example of consumer buying decisions based on using a third-party app that uses ecological data
09:52:31 [ldodds]
Now moving to discussion with all speakers
09:52:59 [ldodds]
PhilArcher: (to Andy) does Philippe's talk scare/please you?
09:53:18 [ldodds]
Andy: pleases us, we want to see this data opened too, helps us be better retailer
09:53:32 [ldodds]
... an awesome challenge
09:53:59 [ldodds]
Richard: legislation is important too, "contains nuts" is a life/death decision
09:54:32 [ldodds]
John: v. interesting. In semiconductor industry, people will take the data if it's available
09:54:50 [ldodds]
... want to make sure products are accurately described, whether it's strawberries or microchips
09:55:00 [ldodds]
... help consumers find what they need
09:55:18 [ldodds]
PhilArcher: Mark, what are members saying?
09:55:42 [ldodds]
Mark: enthusiasm from many members, great opportunity for all of us to be more responsible consumers
09:55:55 [ldodds]
... how do we spend our money, what choices do we make?
09:56:37 [ldodds]
Jim King (Adobe): isn't there a large product db of RFID data?
09:56:58 [ldodds]
Mark: not quite the same thing, that is likely EPC data, movement of stock through supply chain
09:57:04 [ldodds]
... we're discussing the product master data
09:57:19 [ldodds]
... opening EPC data is more commercially sensitive
09:57:41 [ldodds]
TomHeath: this is great session, what are the concrete steps towards openly licensed data?
09:58:06 [ldodds]
Mark: different kinds of product data; product categories/values can be openly licensed, largely format shift
09:58:22 [ldodds]
... from manufacturers, requires discussions with them, v. early stage
09:58:34 [ldodds]
... they need confidence in benefits for themselves and others and licensing discussions
09:58:40 [ldodds]
... lots of good will, enthusiasm
09:58:59 [ldodds]
Richard: Tesco are translating desire to action by working with GS1 and brands
09:59:11 [ldodds]
... make it easy and not disadvantage suppliers
09:59:58 [ldodds]
ZachBeauvais: heard lots of positive noises, but what are the business cases on building on available open data? Any concrete examples?
10:00:15 [ldodds]
Richard: one bus. case is legislative changes and ensuring that products are accurately described
10:00:28 [ldodds]
... opportunities within our enterprise, we have silos
10:00:40 [ldodds]
... really only just starting to understand wider bus. case
10:00:56 [ldodds]
Martin ? (IBM): are competitors trying to catch up?
10:01:18 [ldodds]
Richard: plenty of retailers working in this space, but no co-ordination yet
10:01:47 [ldodds]
John: companies are already scraping and reselling data outside of our control
10:01:58 [ldodds]
... if we can clearly license it, then can make it easier
10:02:06 [ldodds]
Closing statements
10:02:39 [ldodds]
Philippe: originally saw GS1 as "enemy", but can now see they're embracing open data
10:03:04 [ldodds]
... GTIN code is often hidden by lots of ecommerce web sites, needs to be published and clearly available
10:03:27 [ldodds]
... come and download a dump of our own data to see how it is constructed
10:03:38 [ldodds]
... I'm following GS1 stds work, give me feedback
10:03:57 [ldodds]
Mark: huge opportunity to make a difference, make world better place
10:04:15 [ldodds]
John: in B2B industry, focus needs to be on streamlining business integration
10:04:23 [ldodds]
Richard: keep focus on benefits for customer
10:04:37 [ldodds]
Andy: echo that, need to look at customer needs.
10:04:51 [ldodds]
That's the end of the session!
10:23:07 [naomi_]
Topic: Dumb Strings That Mean So Much
10:23:14 [naomi_]
Chair: Hideaki Takeda
10:23:26 [naomi_]
scribenick: naomi
10:28:17 [naomi_]
Next speaker: Ministry of the Interior of the Netherlands, Geonovum, Hans Overbeek
10:28:39 [naomi_]
Hans: Concept URI Strategy for the NL Public Sector
10:29:09 [naomi_]
... Thijs gives you geographic information
10:29:20 [ivan]
10:30:43 [StevenPemberton]
10:31:19 [PhilA]
scribe: PhilA
10:31:26 [PhilA]
scribeNick: PhilA
10:31:46 [PhilA]
topic: Draft URI Strategy for the NL Public Sector, Hans Overbeek
10:31:57 [PhilA]
10:32:29 [PhilA]
Hans: Talking about Designing URI Sets for Public Sector, ISA programme study on URI Persistence etc
10:32:48 [PhilA]
Hans: Why do we need a URI strategy - it's about trust, provenance
10:32:57 [PhilA]
... hard to do in the LOD Cloud
10:33:18 [PhilA]
Hans: We want our URIs to be recognisable and trustworthy
10:34:17 [PhilA]
Hans: We have kept registers - buildings, railways etc. for hundreds of years, mostly to define identifiers
10:34:23 [PhilA]
... join points for linked data
10:35:13 [PhilA]
Hans: we develop a model and a vocabulary for it
10:35:22 [PhilA]
... a register is a list of things that you want to reference
10:35:58 [PhilA]
Hans: then there's all the sensor data etc. What we think of as the big data
10:36:38 [PhilA]
... re-use of things like reference objects is what we want to re-use when we write our URI strategy
10:37:02 [PhilA]
... we struggled a little as we have to mint URIs, but we have a lot of identifiers that we can re-use
10:37:08 [PhilA]
... but we don't have a register for everything
10:37:20 [PhilA]
... there was no register for all our municipalities
10:37:33 [PhilA]
... so we had to mint URIs for them... which means making a register
10:37:41 [PhilA]
... you can only have URIs if you have a register
10:37:53 [PhilA]
... No register? No identifier
10:38:02 [PhilA]
... so we were convinced that we needed a URI strategy
10:38:32 [PhilA]
... the pattern that we used was, not surprisingly the one developed in the UK/backed by ISA
10:39:23 [PhilA]
Hans: The domain should identify the register in a persistent way, {register}
10:40:24 [PhilA]
Hans: The UK pattern has a {sector} in the pattern which sounds nice but it's hard to find someone to govern the sectors. Some will overlap etc.
10:40:38 [PhilA]
... so we thought we might not need {sector} and left it out
10:40:51 [StevenPemberton]
10:41:06 [StevenPemberton]
10:41:07 [PhilA]
... with no strategy, you can use any URI, but it's less recognisable and less trustworthy
10:41:20 [PhilA]
hans: That means we end up having to have a register of registers
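As a rough illustration of "no register, no identifier" and the register of registers, URI minting could be sketched as below. The actual pattern shown in the talk is not captured in these notes, so `URI_PATTERN`, the `data.example.nl` domain and the register names are all assumptions made for the example; the only points taken from the talk are that the register is named in the URI and that there is no {sector} component.

```python
from urllib.parse import quote

# Hypothetical "register of registers": only registers listed here may mint URIs.
REGISTERS = {
    "gemeente": "Municipalities",
    "gebouw": "Buildings",
}

# Assumed pattern, for illustration only. Note there is no {sector} component.
URI_PATTERN = "http://{domain}/id/{register}/{reference}"


def mint_uri(register: str, reference: str, domain: str = "data.example.nl") -> str:
    """Build a URI for an item, refusing registers that are not themselves registered."""
    if register not in REGISTERS:
        raise KeyError(f"no register named {register!r}: no register, no identifier")
    # Percent-encode the local reference so it stays a single path segment.
    return URI_PATTERN.format(
        domain=domain,
        register=register,
        reference=quote(str(reference), safe=""),
    )
```

Keeping the list of registers in one governed place is what makes the resulting URIs recognisable, at the cost of having to maintain that list.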
10:41:38 [StevenPemberton]
10:41:40 [PhilA]
Hans: What infrastructure is needed?
10:41:55 [StevenPemberton]
10:42:04 [PhilA]
Hans: which apps use the resolvers and how frequently
10:42:13 [PhilA]
Hans: There's more in the presentation and the paper
10:42:19 [StevenPemberton]
10:42:27 [PhilA]
... are we heading the right way?
10:42:48 [StevenPemberton]
10:42:57 [PhilA]
Topic: Shared understanding = shared foreign keys (and more), Richard Light
10:43:03 [StevenPemberton]
10:43:05 [PhilA]
10:43:15 [PhilA]
10:44:03 [StevenPemberton]
10:44:11 [PhilA]
Richard: Want to talk about a pragmatic approach to getting URIs into the cultural heritage sector
10:45:14 [PhilA]
Richard: gives brief history of museum identifiers
10:45:41 [PhilA]
... work was done on vocabularies, controlled vocabularies
10:45:55 [naomi_]
scribenick: PhilA
10:47:17 [PhilA]
Richard: shows some examples of collections described in RDF
10:49:49 [PhilA]
Richard: There are good discussions in progress across the sectors...
10:49:49 [PhilA]
... but although there is more RDF coming out, when you look in detail, a lot of values are given as strings
10:50:21 [PhilA]
Richard: Modes is software used in most UK museums. Not free but you become a shareholder
10:50:51 [PhilA]
Richard: Modes includes standard term lists etc., that become standards across users
10:51:02 [PhilA]
... now starting to use Web to get the terms
10:52:00 [PhilA]
Richard: Modes includes a live search of geonames as source of URLs for geographic places. Conversion happens in the software
10:52:26 [PhilA]
Richard: Can we use SPARQL endpoints as a term list? Yes...
10:53:17 [PhilA]
Richard: Curators won't do any LD publishing themselves. All done in Modes
10:53:59 [PhilA]
Richard: uses XSLT to transform data from original XML data. handles the conneg etc
10:54:19 [StevenPemberton]
10:54:52 [PhilA]
Richard: Shows work that gave a URI to every word Shakespeare ever wrote
10:55:20 [StevenPemberton]
s/scribenick: naomi/scribenick: naomi_
10:55:25 [StevenPemberton]
10:55:26 [PhilA]
Richard: Adlib and CALM also looking at generating/using linked data
10:55:56 [PhilA]
Richard: gives example of dog food eating
10:56:04 [PhilA]
10:56:20 [PhilA]
Topic: Aggregating media fragments into collaborative mashups: standards and a prototype, Philippe Duchesne
10:56:30 [PhilA]
10:56:53 [PhilA]
pd: on Media Fragments
10:57:17 [PhilA]
pd: lists issues faced
10:58:31 [PhilA]
scribe note - slides are expressive/detailed
10:59:30 [PhilA]
pd: points to earlier work that is all media specific
11:00:02 [PhilA]
pd: No harmonised definition of the fragments
11:00:17 [PhilA]
pd: Wanted to decouple fragment from media
11:00:45 [PhilA]
pd: geospatial and tree paths not part of any previous work AFAIK
11:02:07 [PhilA]
pd: project done mostly using HTML/JSON developers
11:02:18 [PhilA]
... but we have a SPARQL endpoint as well
11:02:35 [PhilA]
JSON on the screen a few minutes after RDF/XML...
11:04:18 [PhilA]
Sorry folks, no time for questions on this session
11:05:32 [PhilA]
pd: gives a quick demo
11:07:28 [PhilA]
topic: Digital Archiving 3.0, Christophe Guéret
11:07:36 [PhilA]
11:07:44 [PhilA]
11:08:23 [PhilA]
scribe note - slides are expressive
11:10:02 [PhilA]
cgueret: we need to treat the data and metadata differently
11:10:09 [PhilA]
... we find LD the best format for this
11:10:16 [PhilA]
11:10:40 [PhilA]
cgueret: Many formats for data itself
11:11:49 [PhilA]
cgueret: rather than force people to transform their data, they should just get the data in the repository - it's up to the latter to sort out formats
11:12:07 [PhilA]
cgueret: Forget about URIs as data
11:12:11 [PhilA]
PhilA: Grrrr
11:12:48 [PhilA]
cgueret: we have new formats every 5 years. Use conneg to handle format evolution
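The idea of using conneg to absorb format churn can be sketched with a small Accept-header chooser: the resource URI stays the same and only the list of available serialisations grows. This is a simplified illustration (it ignores partial wildcards such as `text/*` and media-type parameters other than `q`), and the function names are not from any particular framework.

```python
from typing import List, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep the order given in the header.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_format(accept_header: str, available: List[str], default: str) -> str:
    """Pick the client's most preferred format that the repository can serve."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
    return default
```

When a new format comes along, it is appended to `available`; existing clients keep getting what they asked for and new clients get the new format from the same URI.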
11:14:01 [ivan]
-> a related pointer: how to cite data for scholarly purposes, the Amsterdam Manifesto,
11:15:21 [CaptSolo]
topic: Discovery panel
11:15:29 [CaptSolo]
scribenick: CaptSolo
11:15:40 [CaptSolo]
chair: danbri
11:16:00 [CaptSolo]
danbri: we will be discovering discovery, as everything brings back to this topic
11:16:12 [CaptSolo]
next speaker: Richard Wallis, OCLC
11:16:34 [CaptSolo]
rjw: working for OCLC
11:16:55 [CaptSolo]
... WorldCat (stats about number of libraries, books)
11:17:10 [CaptSolo]
... integrating linked data,
11:17:38 [CaptSolo]
... the other hat: chaired ... [could you add detail here, missed it?]
11:18:13 [CaptSolo]
... need to publicize links to resources
11:18:31 [CaptSolo]
... generic vocabs = generic "glue" that helps link resources
11:19:11 [CaptSolo]
... you have to demonstrate the benefits -- use the data to drive the services
11:19:30 [CaptSolo]
next speaker: Chris Metcalf, Socrata
11:19:36 [CaptSolo]
[ abstract, paper]
11:19:50 [CaptSolo]
Chris: ~60 customers who want to use open data
11:20:10 [CaptSolo]
... how can we use, etc. to help solve the discovery problem
11:20:23 [CaptSolo]
... catalogs that need to speak to each other (, ...)
11:20:39 [CaptSolo]
... how can we encourage people and industry
11:20:54 [CaptSolo]
note: if i miss things scribing, please add them :)
11:21:47 [CaptSolo]
(technical pause)
11:22:01 [CaptSolo]
next speaker: Steven Pemberton, CWI
11:22:01 [CaptSolo]
11:22:02 [CaptSolo]
[ abstract, paper]
11:22:13 [CaptSolo]
Steve: don't have slides, can speak now :)
11:22:24 [CaptSolo]
Steve: involved with W3C from day 1
11:22:36 [CaptSolo]
... name on number of standards, incl. RDFa
11:22:51 [CaptSolo]
... point of research: make computers easier for people to use
11:22:59 [CaptSolo]
... small data is important
11:23:07 [CaptSolo]
... e.g., website for this conference
11:23:29 [CaptSolo]
... you look for data on airports, lodging, agenda, ... (and you enter same info again and again)
11:23:53 [CaptSolo]
... if the info were in RDFa you could automatically add this info to your calendar, find best flights, ...
11:24:08 [CaptSolo]
... if your browser helped you here, people's lives would be better
11:24:23 [CaptSolo]
... and browsers would win by providing services that help use this data
11:24:26 [CaptSolo]
... win-win for all
11:24:30 [CaptSolo]
... use RDFa
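To make the "small data" point concrete, here is a rough sketch of how a tool could pull event details out of RDFa-style markup using only the Python standard library. It is deliberately simplified and is not a conforming RDFa processor: it reads only `property` and `content` attributes and ignores `vocab`, `typeof`, prefixes and nesting. The sample page and its values are invented for the example.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from elements carrying a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open = None  # (property, tag, text chunks) of the element being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        if "content" in attributes:
            # A machine-readable value overrides the human-readable text.
            self.pairs.append((prop, attributes["content"]))
        else:
            self._open = (prop, tag, [])

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[1]:
            prop, _, chunks = self._open
            self.pairs.append((prop, "".join(chunks).strip()))
            self._open = None


# Invented sample page; the names and values are placeholders.
SAMPLE = """
<div vocab="http://schema.org/" typeof="Event">
  <span property="name">Example Workshop</span>
  <time property="startDate" content="2030-01-15">15 January</time>
  <span property="location">Example City</span>
</div>
"""

collector = PropertyCollector()
collector.feed(SAMPLE)
event = dict(collector.pairs)
```

With the pairs in a dictionary, a browser add-on or script could hand `event["startDate"]` to a calendar without the user retyping it.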
11:25:02 [CaptSolo]
next speaker: Pascal Romain and Elie Sloïm, Conseil général de la Gironde/Temesis
11:25:12 [CaptSolo]
... Pascal from local council
11:25:20 [CaptSolo]
... Elie from a company, W3C member
11:25:26 [CaptSolo]
[ abstract]
11:26:01 [CaptSolo]
Elie: checklist of 72 good practices (fr, en)
11:26:20 [CaptSolo]
... every good practice has to be available online, international, usable, realistic
11:26:35 [CaptSolo]
... OPQuast - Open quality standards
11:26:47 [CaptSolo]
... if you are open data producer, go and check the guidelines
11:27:05 [markbirbeck]
s/Steve: don't have slides/Steven: don't have slides/
11:27:12 [markbirbeck]
11:27:34 [CaptSolo]
Pascal: open data checklist : a tool for LOD?
11:27:37 [markbirbeck]
s/Steve: involved with W3C from day 1/Steven: involved with W3C from day 1/
11:27:50 [CaptSolo]
markbirbeck: thanks
11:28:07 [CaptSolo]
next: Madi Solomon, Pearson
11:28:13 [CaptSolo]
[ abstract, paper, slides]
11:28:26 [CaptSolo]
Madi: need to find new ways of doing business
11:28:42 [CaptSolo]
... provided a solution for publishing open linked data (?)
11:28:54 [CaptSolo]
... but never used the words "open" or "linked"
11:29:07 [CaptSolo]
... termed it resource enrichment (?)
11:29:11 [jpcs1]
... textbook can be broken down into a large number of assets
11:29:36 [CaptSolo]
... each requires its metadata, ...
11:29:51 [CaptSolo]
... concept extraction, keywords, faceted exploration
11:30:15 [CaptSolo]
... built a rule to match with wikipedia, automated metadata generation as quickly as possible
11:30:48 [CaptSolo]
... astrophysics textbook -- had to do filtering, though, to keep out sci-fi topics :)
11:31:02 [CaptSolo]
... found out we can make taxonomies on-the-fly
11:31:26 [CaptSolo]
... creates a baseline, curated taxonomy
11:31:41 [CaptSolo]
... virtuous circle -- put it back into community, maintain, update, ...
11:31:53 [CaptSolo]
Dan: question time
11:32:03 [CaptSolo]
... back to W3C aspects
11:32:27 [CaptSolo]
... (question to the whole panel)
11:32:37 [CaptSolo]
... if not making standards, what should we be doing instead
11:32:40 [CaptSolo]
... ?
11:32:49 [CaptSolo]
Steven: small data is important.
11:33:15 [CaptSolo]
... see benefits from including data on websites
11:33:53 [CaptSolo]
Pascal?: standardisation is a good effort
11:33:55 [DeirdreLee]
rjw: not standards work, it is nurturing communities that have emerged
11:34:20 [CaptSolo]
... sharing experiences, ...
11:34:50 [CaptSolo]
Madi: as new co-chair of Digital Publishing IG - the question is: what can I do for you?
11:35:10 [CaptSolo]
Chris: i love simple tools that do powerful things
11:35:34 [CaptSolo]
... w3c working on these things, but lot of people don't know re them
11:35:39 [CaptSolo]
... need to reach out, inform
11:36:00 [CaptSolo]
Elie: need to produce standards + guidelines for implementing those standards
11:36:20 [CaptSolo]
... have nice specs but not always simple to implement
11:36:25 [CaptSolo]
questions from the audience
11:37:20 [CaptSolo]
Hadley: back to geocities era - we made lists
11:37:32 [CaptSolo]
... feel we are doing that now = making data catalogs, lists of resources
11:37:56 [CaptSolo]
... what do we need to do to make it worthwhile for search engines to index the metadata
11:38:07 [CaptSolo]
... so average person can participate
11:38:24 [CaptSolo]
Chris: that's RDFa stuff
11:38:34 [CaptSolo]
... build it into catalogs, so data gets crawled
11:38:52 [CaptSolo]
... not just list datasets, also make metadata schemas more practical for people
11:39:04 [CaptSolo]
... want to type in my ZIP-code and find what's relevant to me
11:39:22 [CaptSolo]
rjw: we (non-search-engines) need to talk their (search engine) language
11:39:45 [CaptSolo]
... need to put up resources in front of people
11:39:59 [CaptSolo]
Steven: i marked my homepage w RDFa
11:40:13 [CaptSolo]
... involved me ending up on Google Maps w the location where I live
11:40:21 [CaptSolo]
danbri: getting more visitors? :)
11:40:39 [CaptSolo]
Bob Schloss (IBM): by analogy to SEO - let's implement
11:41:00 [CaptSolo]
... fabulous implementation -- go to cloud's website, enter a URL from where the dataset can be fetched
11:41:15 [CaptSolo]
... they'd "suck in" the dataset and give it back w enriched metadata
11:41:37 [CaptSolo]
... let's externalize it. if nobody starts, we won't have it
11:41:53 [CaptSolo]
Chris: we are doing cataloging well
11:42:01 [CaptSolo]
... re discovery by search engines
11:42:13 [CaptSolo]
... data by itself not so useful to everyday people
11:42:22 [CaptSolo]
... people need to *use* those datasets
11:42:33 [CaptSolo]
... we need to allow the data to be more useful
11:43:11 [CaptSolo]
Bernadette: as someone who has spent years on vocabs for linked data
11:43:21 [CaptSolo]
... need communication, mentoring
11:43:31 [CaptSolo]
... info that simply and quickly explains what it is about
11:43:46 [CaptSolo]
... to those who make decisions
11:44:02 [CaptSolo]
... people in this room should write books, make videos, organize seminars with the stakeholders
11:44:20 [CaptSolo]
... there's so many standards, implementations. can be overwhelming
11:45:02 [CaptSolo]
(missed this one re google glass, RDFa, ...)
11:45:20 [CaptSolo]
Martin, CTIC: Spanish experience re economy, corruption
11:45:39 [CaptSolo]
... Spanish government launched a technical standard for all public bodies
11:45:54 [CaptSolo]
... all have to use these guidelines when exposing open data
11:46:03 [CaptSolo]
... have to use linked data
11:46:20 [CaptSolo]
... URI scheme for catalogs, datasets, ...
11:47:01 [CaptSolo]
now the last round of comments from the panel
11:47:08 [CaptSolo]
danbri: 10 words or 30 syllables now
11:47:24 [CaptSolo]
Steven: if you got information, it should be on the web + machine readable
11:47:50 [CaptSolo]
?: as data producers think of objects and entities -- instead of datasets
11:48:21 [CaptSolo]
rjw: you're the only people in the domain who know the benefits -- demonstrate them to everyone!
11:48:38 [CaptSolo]
Madi: middle place b/w where data is released and ppl access it. data-driven businesses
11:48:55 [CaptSolo]
Chris: we as LOD advocates need to reach out to ppl outside the LOD community
11:49:06 [CaptSolo]
... many are tired of SemWeb cause they don't see the benefits
11:49:14 [CaptSolo]
...,RDFa, simple tools
11:49:26 [CaptSolo]
Elie: need more metadata for search engines, end-users
11:49:36 [CaptSolo]
... metadata quality -- checks needed
11:49:42 [CaptSolo]
... make website for end-users
11:49:47 [CaptSolo]
... websites
11:49:59 [CaptSolo]
... need to work on quality
11:50:08 [CaptSolo]
danbri: thanks to the panel
11:50:31 [CaptSolo]
panel discussion finished
12:55:41 [CaptSolo]
anyone scribing?
12:57:46 [CaptSolo]
could someone take over if Agis can not do it now?
12:58:09 [JeniT]
ScribeNick: JeniT
12:58:17 [bhyland]
Michalis speaking now re
12:59:01 [JeniT]
Topic: "Storytelling" in the economic LOD: the case of
12:59:38 [JeniT]
Michalis: visualising open spending data
13:00:17 [JeniT]
... linked open data enables these kinds of analysis
13:00:30 [JeniT]
... this has prompted new projects on LOD
13:00:41 [JeniT]
... network analysis is a useful tool suited to linked data & semantic web
13:00:46 [JeniT]
... particular interest in real-time open data
13:01:14 [JeniT]
question: what tools did you use to generate network analysis graphs?
13:01:28 [JeniT]
Michalis: visualisations were done by Gephi, processed by R & other mathematical tools
13:01:48 [JeniT]
... all done with open source tools
13:02:15 [JeniT]
Topic: Bottom up Activities for linked open data, open government in Japan
13:02:36 [JeniT]
Takumi from OKF Japan
13:02:38 [bschloss]
Reminder to PhilA -- put RDA lightning talk slides on ODW Workshop website, please!
13:03:11 [JeniT]
... and from Keio University
13:03:33 [JeniT]
Takumi: International Open Data Day in Japan, Feb 23rd 2013, 300 participants in 8 cities
13:04:06 [JeniT]
... bottom-up activities from stakeholders are driving LOD in Japan
13:04:16 [JeniT]
... academic institutions try to engage with local government & communities
13:04:34 [JeniT]
... community members have same goals for LOD, which promotes collaboration
13:04:49 [JeniT]
... neutral intermediary coordinates activities & shares best practices
13:05:05 [JeniT]
... LODAC (LOD for ACademia)
13:05:30 [JeniT]
... develops lots of datasets & builds dictionaries
13:05:36 [JeniT]
... engages with local communities
13:05:52 [JeniT]
... Yokohama city, one of the biggest cities in Japan, large LOD community
13:06:09 [JeniT]
... Sabae city is the first local government in Japan publishing LOD on its website
13:06:24 [JeniT]
... LODAC & Yokohama community collaborate
13:06:45 [JeniT]
... create mashup around museum & event information, demonstrating value of combining datasets from different communities
13:06:59 [JeniT]
... private companies have question/answer datasets for Yokohama city
13:07:07 [bschloss]
I will e-mail you slides, for now, add links please to and to (their 1-page flyer)
13:07:27 [JeniT]
... community generated new consortium, including government, citizen, academic members
13:07:43 [JeniT]
... this led to Yokohama city becoming big LOD community
13:07:55 [JeniT]
... Sabae city first publisher of LOD on its website
13:08:03 [JeniT]
... 2011 published XML datasets with CC licences
13:08:09 [JeniT]
... 2012 published RDF
13:08:23 [JeniT]
... ATR Creative (private company) used LOD for their own product
13:08:26 [yoshiaki]
Yokohama Art Spot:
13:08:34 [JeniT]
... iPhone application
13:09:01 [JeniT]
... Sabae city became most advanced open data city in Japan, because of collaboration between stakeholders
13:09:23 [JeniT]
... government publishes datasets, but other organisations gather, aggregate, make available LOD
13:09:38 [yoshiaki]
Sabae Burari, an application of POI and maps mashup for local sightseeing spots:
13:09:45 [JeniT]
... OKF Japan organises bottom-up activities in Japan
13:10:11 [JeniT]
... organised 300 participants in open data day, some doing hackathons, some editing Wikipedia
13:10:18 [yoshiaki]
International Open Data Day in Japan:
13:10:19 [JeniT]
... over 90% participants were satisfied
13:10:40 [JeniT]
... biggest benefits around networking, sharing ideas, and learning about open data & improving engineering skills
13:10:53 [JeniT]
... those involved from different sectors
13:11:17 [JeniT]
... OKF Japan helps to share best practices with each area, by providing toolkits & tutorials
13:11:33 [JeniT]
... eg Where Does My Money Go? originally developed in UK, localised for Japanese usage
13:11:44 [JeniT]
... used in Yokohama city, with tutorials for other cities
13:12:02 [JeniT]
... Conclusion:
13:12:14 [JeniT]
... bottom-up activities have driven engagement
13:12:23 [JeniT]
... collaboration by academia is key
13:12:36 [JeniT]
... neutral intermediary (OKF Japan) coordinates & helps share activities
13:12:56 [JeniT]
Topic: Utilising Linked Social Media Data for Tracking Public Policy and Services
13:13:05 [JeniT]
Deirdre from DERI
13:14:00 [JeniT]
Deirdre: relationship between open data & public policy
13:14:17 [JeniT]
... can be used to influence public policy & services, to lobby, influence policy makers
13:14:29 [JeniT]
... eg use statistics to guide new schools
13:14:40 [JeniT]
... also used to justify policy decisions
13:14:47 [JeniT]
... and to evaluate policies
13:15:17 [JeniT]
... eg environmental measurements to see whether regulations have had an effect
13:15:28 [JeniT]
... still a lot of research to do about how this all works
13:15:43 [JeniT]
... what about combining open data with social media data?
13:16:06 [JeniT]
... could give evidence-based policy evaluation
13:16:27 [JeniT]
... social media data is already being used for business intelligence, trend analysis, opinions on brand etc
13:16:34 [JeniT]
... lots of activity from industry
13:16:50 [JeniT]
... government is coming around to this, but using them in limited ways
13:17:11 [JeniT]
... social media used for dissemination & limited engagement, but not to full potential
13:17:17 [JeniT]
... not being used to get information from social media
13:17:28 [JeniT]
... government is only a publisher, not a consumer of social media
13:18:09 [JeniT]
... government should be harnessing information from social media
13:18:16 [JeniT]
... proposed to do this using linked data
13:18:36 [JeniT]
... extract data from social media, express as linked data, analysis on it
13:18:43 [JeniT]
... challenges:
13:18:58 [JeniT]
... wide variety of sources, each with its own API
13:19:03 [JeniT]
... wide variety of formats
13:19:05 [JeniT]
... privacy concerns
13:19:14 [JeniT]
... can be noisy, difficult to process
13:19:38 [JeniT]
... this is all based on solid research funded under EU FP7
13:19:59 [JeniT]
... Linked2media to provide SMEs with tooling
13:20:13 [JeniT]
... DERI developed Social Media Linked Data Space
13:20:31 [JeniT]
... now trying to apply this (designed for SMEs) to government
13:20:56 [JeniT]
... has a triplestore, crawlers, integrating 25 different sites including review sites
13:21:25 [JeniT]
... there are restrictions on different social media APIs which limit which data you can access
13:21:50 [JeniT]
... once we have data, we model in common linked data format, reusing existing vocabularies
13:21:56 [JeniT]
... using SIOC for review data
13:21:58 [ldodds__]
PhilA: can I add a barcamp discussion proposal?
13:22:06 [JeniT]
..., rev, Marl etc
13:22:39 [ldodds__]
PhilA: How should we attribute open datasets?
13:22:39 [JeniT]
... next steps are to look into integration of social media data with other linked data
13:22:58 [JeniT]
... also using data to influence, justify & evaluate public policy
13:23:16 [JeniT]
... not just technical aspects to this research, also social, political etc
13:23:26 [JeniT]
Topic: Panel discussion
13:23:47 [JeniT]
Christian: run a small web design company in London
13:24:06 [JeniT]
... running tool to monitor corruption
13:24:38 [JeniT]
Uldis: if you had one lesson learnt, what would it be?
13:24:53 [JeniT]
Christian: we've had a lack of collaboration, despite open source tools
13:25:16 [JeniT]
Deirdre: use of social media data is limited by the restrictions that they place on it
13:25:52 [JeniT]
Michalis: big lesson is that critical mass for open data is lower than the web itself
13:26:01 [JeniT]
... you build on the existing web
13:26:19 [JeniT]
... make an application work, and everything else will fall into place
13:26:36 [JeniT]
Takumi: engage with local community and local government
13:26:49 [JeniT]
... local community has a diversity of needs
13:27:02 [JeniT]
... need cross-relationships to tackle the real problems
13:27:54 [JeniT]
Bart: for Deirdre: did you investigate engaging with the public rather than just reading twitter? asking specific questions rather than just listening?
13:28:16 [JeniT]
Deirdre: that's something we are looking at
13:28:27 [JeniT]
... a lot of citizen engagement platforms ask on specific topics
13:28:41 [JeniT]
... but it's hard to find participants that care enough to give that feedback
13:28:48 [JeniT]
... when you just listen you can see the trends
13:29:02 [JeniT]
... see what they *do* care about: maybe they just care about environment, not transport
13:29:18 [JeniT]
... these are different approaches for different goals
13:29:54 [JeniT]
Uldis: what do you expect to get from crowd-sourcing?
13:30:08 [JeniT]
Christian: we kept hearing stories about corruption, but we didn't write them down or map them
13:30:16 [JeniT]
... wanted to build something
13:30:24 [JeniT]
... doesn't work for people who don't have internet access
13:30:33 [cerealtom]
yvesr: surely its time for a game of crack attack!
13:30:33 [JeniT]
... they do have mobile phones, we have SMS number
13:30:49 [JeniT]
... we want to tell people they have something wrong in their country
13:31:27 [JeniT]
Kal: are you concerned about the demographic of people who contribute open data & participate on social media, and how that's different from the demographic of the general population?
13:31:45 [JeniT]
Takumi: in Japan, we don't have much difference
13:32:06 [JeniT]
... tends to be young people, lots of men, but otherwise not so much difference
13:32:40 [CaptSolo]
Takumi's coauthor is speaking
13:32:54 [JeniT]
panelmember: in both Yokohama & Sabae, there are lots of knowledge workers
13:33:21 [JeniT]
... lots of data providers
13:33:42 [JeniT]
... many students took part in open data day, to compose articles on Wikipedia
13:34:10 [JeniT]
Michalis: researchers & journalists are different
13:34:17 [JeniT]
... because of how they're funded
13:34:27 [JeniT]
... there are both types of users in each demographic
13:34:52 [JeniT]
... we're seeing a spread in access around Greece, among users who just want to find relevant information
13:35:04 [JeniT]
Deirdre: I'm not worried about the demographic as long as we're aware of it
13:35:10 [JeniT]
... we're not claiming that it's representative
13:35:18 [JeniT]
... you can build in SMSs or having real workshops
13:35:20 [CaptSolo]
s/panelmember: in/Yoshiaki Fukami: in/
13:35:31 [JeniT]
... if you need something that's representative, build in other demographics
13:35:44 [JeniT]
Christian: new technology goes hand-in-hand with old technology, don't forget radio
13:36:02 [JeniT]
BobSchloss: political parties are getting clever at extracting features from social media
13:36:24 [JeniT]
... some politicians have dashboards in which the weight of each statement is modified by Facebook friends or twitter followers
13:36:48 [JeniT]
... the rumour is that they can identify whether people are influential in their communities
13:37:08 [JeniT]
... will we see comments being weighted?
13:37:29 [JeniT]
Christian: like A/B testing politics: it's no longer politics, just the most popular person wins
13:37:39 [JeniT]
Deirdre: is that something that should or can be stopped? probably not
13:37:51 [JeniT]
... if we just have opinions then it's biased & subjective
13:38:00 [JeniT]
... if we just have open data, it's not tied into human aspect
13:38:09 [JeniT]
... we need to combine the two to get the balance
13:38:22 [JeniT]
Michalis: if you can make objective information more attractive
13:38:36 [JeniT]
... can you relate election area to spending, for example
13:38:50 [JeniT]
... tagging the spatial location of the payment
13:39:00 [JeniT]
... you can find objective information in a subjective way
13:39:36 [JeniT]
Uldis: Concluding comments?
13:40:11 [JeniT]
Michalis: we believe economic LOD should be nucleus of LOD
13:40:16 [JeniT]
... need money to go around
13:40:36 [JeniT]
... need to say clearly to policy makers which data is data infrastructure
13:40:47 [JeniT]
... eg in economics, all public spending, prices
13:40:59 [JeniT]
... theory and application together
13:41:24 [JeniT]
Takumi: relationship between local community & other communities very important
13:41:59 [JeniT]
Deirdre: as a community, it's great to see our progress, but I'd love to see more interdisciplinary talks & sessions at these events
13:42:12 [JeniT]
... bringing real use cases to complement the technical skills we bring
13:42:27 [JeniT]
Christian: we mustn't forget that there are parts of the world where things aren't moving at this speed
13:42:41 [JeniT]
ScribeNick: AndreaP
13:42:51 [AndreaP]
Topic: Lightning Talks - eGovernment and multilingualism (chair: Yaso Córdova)
13:43:57 [AndreaP]
Topic: A Brief Report on the Research Data Alliance Plenary in March 2013 (Bob Schloss, IBM)
13:44:25 [AndreaP]
Bob: Huge amount of data from scientists in the next years
13:44:49 [AndreaP]
... How will such data be accessible?
13:45:40 [AndreaP]
... RDA wants to move as IETF
13:46:09 [AndreaP]
... to accelerate and facilitate research data exchange.
13:46:52 [AndreaP]
... e.g., on Persistent Identifiers, Metadata
13:47:05 [AndreaP]
... They want to look at existing standards.
13:47:23 [AndreaP]
... Provenance and quality are other key issues for RDA.
13:47:52 [AndreaP]
... They want metadata to be searchable, in a cross-disciplinary way.
13:48:23 [AndreaP]
... Another issue: how to handle big datasets.
13:48:58 [AndreaP]
... Again: datasets for peer review.
13:49:21 [AndreaP]
... Real work starts in September. You are all encouraged to join.
13:50:03 [AndreaP]
... About how to join:
13:50:13 [AndreaP]
Topic: Open Data in Data Journalists' Workflow (Uldis Bojārs, University of Latvia)
13:51:17 [AndreaP]
Uldis: National Library of Latvia opening up data.
13:51:40 [AndreaP]
... [interrupted]
13:52:48 [AndreaP]
... [technical issues]
13:53:27 [AndreaP]
Topic: Empowering the E-government data life cycle (Edoardo Colombo, Politecnico di Milano)
13:53:56 [AndreaP]
Edoardo: The project is multidisciplinary.
13:54:14 [AndreaP]
... computer scientists, engineers, ...
13:54:47 [AndreaP]
... Goal to set up a publishing protocol for open data that can be used by PAs.
13:56:11 [AndreaP]
... General goal is to set up an eGov system, enabling PAs to publish data and citizens to discover them.
13:56:45 [AndreaP]
... Why? To have facts, not opinions - open data are facts.
13:56:52 [bschloss]
See , consider coming to their plenary in Washington DC in September, think what W3C standards and Open Data best practices (such as DCAT) can be extended for their needs.
13:57:12 [AndreaP]
... Presenting the system "search computing architecture".
13:57:29 [AndreaP]
... Use case: money given to hospitals.
13:57:46 [AndreaP]
... The high level query is translated into low level ones.
13:58:06 [AndreaP]
... Result is presented to the user.
13:59:09 [AndreaP]
Topic: Open Data in Data Journalists' Workflow (Uldis Bojārs, University of Latvia)
13:59:18 [AndreaP]
Uldis: I'm back.
13:59:43 [AndreaP]
... Interested in the area of open data.
14:00:05 [AndreaP]
... We need to make it easier to work with the data, to make them more re-usable.
14:00:34 [AndreaP]
... Working with data should be frictionless from the start.
14:00:55 [AndreaP]
... We should be able to use it for building stories.
14:01:13 [AndreaP]
... Presenting workflow on data-driven journalism process.
14:02:11 [AndreaP]
... The idea is to have a set of tools able to cover the whole process.
14:02:25 [AndreaP]
... Data journalism is just one of t he use cases.
14:03:03 [AndreaP]
... Stressing the need to get stories from data.
14:03:15 [AndreaP]
s/t he/the/
14:03:55 [AndreaP]
... Most important part is data discovery and publishing.
14:04:48 [AndreaP]
... Journalists must have information useful to assess the quality of the data they are going to use, first of all information on data provenance.
14:05:37 [AndreaP]
Topic: Lessons learned (and questions raised) from an interdisciplinary Machine Translation approach (Timm Heuss, University of Plymouth)
14:06:14 [AndreaP]
Timm: Motivation is that ambiguity is an issue in Natural Language Processing.
14:06:29 [AndreaP]
... Ambiguity may result in incorrect translations.
14:06:50 [AndreaP]
... Disambiguation is carried out based on dictionaries.
14:07:10 [AndreaP]
... Not like the approach.
14:07:25 [AndreaP]
... Rather: use what is in the LOD cloud.
14:08:27 [AndreaP]
... Statistics for LOD is key for NLP.
14:09:11 [AndreaP]
... Number of issues.
14:09:23 [AndreaP]
... Can LOD really model natural language?
14:09:49 [AndreaP]
... How can we simply access LOD datasets? Some of the relevant ones are not easily accessible.
14:10:40 [AndreaP]
Topic: Interoperability Challenges for Linguistic Linked Data (David Lewis, Trinity College Dublin)
14:11:28 [AndreaP]
David: Would like to talk on a number of issues: content management, NLP technologies, localisation.
14:11:58 [AndreaP]
... Presenting localisation's value chain.
14:12:30 [AndreaP]
... A lot of work is outsourced.
14:12:39 [AndreaP]
... Re-use is also a big market.
14:12:52 [AndreaP]
... Statistical machine translation is also used.
14:13:11 [AndreaP]
... The value chain is quite long.
14:13:39 [AndreaP]
... Support for interoperability is there (XML-based), but interoperability is expensive.
14:14:57 [AndreaP]
... W3C ITS IG is trying to address some of these issues.
14:15:25 [AndreaP]
... The idea is to address all the localisation process workflow.
14:15:52 [AndreaP]
... This is done by using existing formats.
14:16:34 [AndreaP]
.. Interest in using Linked Data to disambiguate terms and to introduce confidence.
14:16:51 [AndreaP]
... Also: how can we use Linked Open Data in the process?
14:17:08 [AndreaP]
s/.. I/... I/
14:17:46 [AndreaP]
... Provenance ontology is very relevant here.
14:18:24 [AndreaP]
... RDF used also for process monitoring.
14:19:26 [AndreaP]
... Another opportunity is to use multilingual LOD datasets to train machine translation.
14:20:47 [AndreaP]
Topic: Bar Camp Pitches
14:22:49 [AndreaP]
Eric: Project on reference implementation for LOD supporting data quality and provenance
14:23:23 [AndreaP]
Bart: How can we actually say that OD is successful from a business perspective?
14:24:23 [AndreaP]
James: Which are the barriers to using OD?
14:25:10 [CaptSolo]
anyone ready to scribe? (as AndreaP is leaving soon)
14:25:25 [AndreaP]
Leigh: Attribution and OD - can we have best practices on this?
14:25:46 [AndreaP]
Bernadette: Best practices for Persistent URIs?
14:27:20 [AndreaP]
Christopher: Want to show what I did by aggregating data from different Universities
14:27:26 [CaptSolo]
Wolfgang Orthuber
14:27:50 [CaptSolo]
numeric feature spaces
14:28:43 [CaptSolo]
JeniT, Omar: linked CSV
14:28:51 [CaptSolo]
Mark Harrison: ?
14:29:13 [CaptSolo]
AndreaP: can't manage to scribe everything, but i can add some detail
14:29:28 [CaptSolo]
Michael Lutz: ?
14:29:43 [CaptSolo]
ok, barcamp pitches finished
14:30:00 [CaptSolo]
if you pitched barcamp ideas, add more detail here
14:31:44 [AndreaP]
Michael: What would you like to have as a contribution from the European Commission on open data? e.g., concerning legislation, regulation, reference data and services
