IRC log of odw on 2013-04-23

Timestamps are in UTC.

07:50:05 [RRSAgent]
RRSAgent has joined #odw
07:50:05 [RRSAgent]
logging to http://www.w3.org/2013/04/23-odw-irc
07:50:38 [PhilA]
PhilA has changed the topic to: ODW13 Day 1
07:50:48 [PhilA]
meeting: Open Data on the Web Day 1
07:50:52 [PhilA]
chair:PhilA
07:51:36 [jpcs1]
jpcs1 has joined #odw
08:19:22 [jpcs1]
jpcs1 has joined #odw
08:25:33 [jpcs1]
jpcs1 has joined #odw
08:25:39 [ivan]
ivan has joined #odw
08:25:45 [timdavies]
timdavies has joined #odw
08:25:47 [daveL]
daveL has joined #odw
08:25:49 [JeniT]
JeniT has joined #odw
08:26:01 [Steven]
Steven has joined #odw
08:26:29 [yvesr]
yvesr has joined #odw
08:27:23 [markbirbeck]
markbirbeck has joined #odw
08:27:43 [mig_garcia]
mig_garcia has joined #odw
08:28:01 [danbri]
danbri has joined #odw
08:28:05 [floppy]
floppy has joined #odw
08:28:12 [cjg]
cjg has joined #odw
08:28:18 [bschloss]
bschloss has joined #odw
08:29:02 [PhilA]
scribe: PhilA
08:29:05 [StevenPemberton]
Agenda: http://www.w3.org/2013/04/odw/agenda
08:29:08 [PhilA]
scribeNick: PhilA
08:29:27 [PhilA]
Topic: John Sheridan - Building our houses on rock
08:29:37 [PhilA]
paper http://www.w3.org/2013/04/odw/odw13_submission_25.pdf
08:29:38 [laurent_au]
laurent_au has joined #odw
08:30:04 [bschloss]
John Sheridan from the National Archives begins talking - move from 'ephemeral, temporary' world to a world where there is more confidence in our data
08:30:06 [PhilA]
JohnS: Talking philosophically about the need for longevity
08:30:21 [PhilA]
... how do I discover data that I can trust and rely on, use etc.
08:30:25 [bschloss]
How to discover open data I can trust and rely on
08:30:46 [PhilA]
JohnS: How do we firm up our open data so we can begin to use it and re-use it confidently and well
08:30:58 [PhilA]
JohnS: Sustaining our open data. How do we do that.
08:31:15 [bhyland]
bhyland has joined #odw
08:31:16 [PhilA]
... our budgets are declining. how do we sustain our publishing activity
08:31:41 [ldodds]
ldodds has joined #odw
08:31:48 [AndyS]
AndyS has joined #odw
08:31:58 [PhilA]
JohnS: Share the responsibility of supporting and curating open data
08:32:11 [libby]
libby has joined #odw
08:32:17 [PhilA]
... open data community is good at coming up with the rock to build on
08:32:40 [yoshiaki]
yoshiaki has joined #odw
08:32:45 [PhilA]
JohnS: I work for a reputable institution. You'll trust the data if you trust the institution
08:32:51 [PhilA]
johnS: Adds solidity
08:33:10 [cerealtom]
cerealtom has joined #odw
08:33:19 [PhilA]
JohnS: Extreme end is legislation. e.g. INSPIRE regulations that demand certain data norms
08:34:08 [PhilA]
JohnS: How can we know if policies like INSPIRE will work? Should we be asking for more of that or going to people like the National Archives and asking them for commitments
08:34:25 [PhilA]
JohnS: There's a lot to do to build our data on rock
08:34:26 [edsu]
edsu has joined #odw
08:34:33 [lottebelice]
lottebelice has joined #odw
08:34:45 [PhilA]
JohnS: The ODI certificate may be one of the most important things for the community to work on this year
08:34:49 [jpcs1]
jpcs1 has joined #odw
08:34:58 [yvesr]
http://theodi.github.io/open-data-certificate/
08:34:58 [fumi]
fumi has joined #odw
08:35:04 [PhilA]
JohnS: it would be good to discuss here what role things like the ODI certificate can have
08:35:23 [PhilA]
JohnS: Talking about the Gazettes (London Belfast etc.)
08:35:50 [PhilA]
JohnS: This is about putting things on the public record, where data is available, provenance and authenticity supported and availability guaranteed
08:35:56 [PhilA]
... service will be completed by September
08:36:04 [PhilA]
... how do we see more services like this come into existence
08:36:15 [PhilA]
... it's about devising tracks
08:36:40 [PhilA]
... the way forward to make all this happen with a solid basis, that we can build on
08:37:05 [cgueret]
cgueret has joined #odw
08:37:19 [PhilA]
JohnS: No one organisation can do this on its own, we need to act as a community to solidify our efforts
08:37:52 [lottebelice]
The ODI certificate is really interesting, something to consider for things like http://openglam.org/principles/ and http://www.opencultuurdata.nl/about/
08:37:52 [jpcs1]
jpcs1 has joined #odw
08:37:53 [PhilA]
topic: Can open data (and big data) be used to improve the operations of development organisations?, Millie Begovic Radojevic
08:38:07 [PhilA]
paper http://www.w3.org/2013/04/odw/odw13_submission_3.pdf
08:38:36 [PhilA]
Millie: UNDP spends about $5bn a year that generates a lot of data. Have we improved things? What effect have we had?
08:38:44 [PhilA]
... we also generate procurement data
08:38:56 [PhilA]
... we use that data mostly for accountability purposes
08:39:13 [PhilA]
.... we've been wondering what other insights might be accessible from that data
08:39:30 [PhilA]
... can we work out which projects will be most effective
08:39:46 [Alexrcoley]
Alexrcoley has joined #Odw
08:39:47 [PhilA]
... what about the companies we pay, who is most effective, who do they employ etc
08:39:52 [ivan]
rrsagent, set log public
08:40:12 [rjw]
rjw has joined #odw
08:40:17 [PhilA]
Millie: We started a series of events called Data Dives where we worked with people we don't normally work with
08:40:21 [rjw]
rjw has left #odw
08:40:26 [PhilA]
... data analysts, programmers etc
08:40:35 [PhilA]
... are there problems that we're not asking that we should be asking
08:40:56 [PhilA]
Millie: We'll be opening a new challenge prize shortly for the best algorithm
08:40:59 [StevenPemberton]
StevenPemberton has joined #odw
08:41:13 [stressindikator]
stressindikator has joined #odw
08:41:41 [PhilA]
Millie: We took data from the World Bank on major contracts in 2007. We were interested in the suppliers and the relationships between those companies
08:42:00 [PhilA]
PhilA: As an aside - must introduce Millie to Chris Taggart this evening
08:42:17 [PhilA]
Millie: Certain companies tend to win contracts in particular sectors
08:42:50 [PhilA]
... two companies dominate this sub network of projects. What happens to the sub contractors if something goes wrong with the main contractor - few points of failure
08:43:00 [PhilA]
... are there certain clusters of companies that tend to bid together?
08:43:21 [PhilA]
... we see clusters. Are these people really good or is there something else going on?
08:43:34 [PhilA]
... do contracts go to home countries or from the more developed world
08:43:40 [StevenPemberton]
rrsagent, here?
08:43:40 [RRSAgent]
See http://www.w3.org/2013/04/23-odw-irc#T08-43-40
08:44:20 [PhilA]
Millie: A few hours' work produced these insights
08:44:30 [trc]
trc has joined #odw
08:44:42 [PhilA]
... the World Bank folks had the data but not the insights, which actually didn't take a huge time to create
08:44:56 [cjg]
This analysis might be interesting (and easy) to apply to http://gtr.rcuk.ac.uk/ ...
08:44:57 [richardm]
richardm has joined #odw
08:45:07 [edsu]
http://www.w3.org/2013/04/odw/odw13_submission_3.pdf is a 404 for me btw
08:45:09 [PhilA]
Millie: shows visualisation of projects and performance
08:45:32 [danbri]
edsu, i think the whole paper is in the 'abstract'
08:45:47 [PhilA]
PhilA: Thanks edsu - I'll fix that when I'm done scribing
08:45:59 [jpcs1]
jpcs1 has joined #odw
08:46:08 [PhilA]
Millie: It's not big data, it's lots of little data scattered around
08:46:35 [edsu]
danbri: thanks I found http://www.w3.org/2013/04/odw/papers now :)
08:46:39 [PhilA]
... global challenges coming up. We need help, people in orgs who can help open more data sets and help us get more insights out of that data
08:46:46 [ldodds]
danbri: would make interesting reading, although I've not seen any open data on that?
08:47:15 [danbri]
re eu, I think you'd need a temporal view... some partners sorta dominate, then EU notice that and punish them in later rounds
08:47:17 [cerealtom]
this was the link from the final slide of the talk: http://europeandcis.undp.org/
08:47:32 [AndreaP]
AndreaP has joined #odw
08:48:17 [PhilA]
Topic: Researching the emerging impacts of open data in developing countries (ODDC) Tim Davies
08:48:22 [rjw]
rjw has joined #odw
08:48:37 [PhilA]
paper http://www.w3.org/2013/04/odw/odw13_submission_19.pdf
08:49:16 [PhilA]
TimD: Poses questions - why people are interested in open data - transparency, innovation, inclusion and empowerment
08:49:30 [MLutz]
MLutz has joined #odw
08:49:32 [PhilA]
... the way we do open data can make it easier to realise these different aspects
08:49:57 [richardm]
richardm has left #odw
08:50:47 [StevenPemberton]
s/differnet/different
08:50:58 [PhilA]
TimD: Talking about the launch (tomorrow) of ODDC
08:51:13 [PhilA]
... Web Foundation and OGP are behind it
08:52:08 [bhyland]
bhyland has joined #odw
08:52:13 [StevenPemberton]
s/inshights/insights
08:52:18 [chrismetcalf]
chrismetcalf has joined #odw
08:52:41 [PhilA]
TimD: Slides are expressive and contain the gist of the talk
08:52:56 [PhilA]
TimD: Draw out some key points
08:53:12 [PhilA]
TimD: As we've seen, supply needs to be built on solid foundations
08:53:36 [naomi]
naomi has joined #odw
08:53:50 [PhilA]
TimD: Are we building platforms that rely on always-on high capacity systems in rural areas of the developing world?
08:54:13 [PhilA]
... are the standards right? We articulate standards but are the right people in the room
08:54:39 [PhilA]
... loads of standards being specified - but do they work in all contexts? Does a London-based system work in Kenya?
08:54:51 [StevenPemberton]
s/stahn/stan/
08:55:23 [PhilA]
TimD: Are the licensing arrangements correct? Are first movers keeping others out?
08:56:08 [PhilA]
TimD: We have opendataresearch.org and more - see slides
08:56:40 [PhilA]
Topic: Open Data NEXT: a strategy for social & economic value from Linked Open Data Hayo Schreijer
08:56:57 [PhilA]
paper http://www.w3.org/2013/04/odw/odw13_submission_50.pdf
08:58:16 [PhilA]
Hayo: Talking about the Dutch linked data project in NL
08:58:31 [PhilA]
hayo: We started our open data programme 2 years ago
08:58:43 [PhilA]
... want to help government depts open their data
08:58:44 [cerealtom]
good collection of questions there
08:58:46 [cerealtom]
...
08:58:52 [cerealtom]
what problem are we solving?
08:58:56 [PhilA]
... now 6K data sets from national and local administrations.
08:59:04 [cerealtom]
why spend money on opening data?
08:59:07 [PhilA]
... some great apps but not really solving real problems
08:59:13 [cerealtom]
why is nobody using our data?
08:59:25 [cerealtom]
why don't they build an app like...?
08:59:27 [PhilA]
... what actual problem does it solve? Where are the apps that do clever stuff?
08:59:43 [cerealtom]
hayo: we've reached a kind of impasse; governments are losing enthusiasm
08:59:44 [StevenPemberton]
s/sldies/slides
08:59:51 [PhilA]
Hayo: We need to look at how OD is being used to solve real problems?
09:00:05 [cerealtom]
hayo: our approach: focus on real-life problems
09:00:18 [PhilA]
Hayo: Purple areas on shown map are where population is declining, orange it's growing
09:00:29 [cerealtom]
e.g. disadvantaged and depopulated areas
09:00:42 [PhilA]
Hayo: we want to help those people with the real problems, disadvantaged areas etc.
09:01:02 [PhilA]
Hayo: trying to bring companies together, working on the problem
09:01:25 [PhilA]
... There's a problem of continuity. data is opened once and not updated
09:01:37 [PhilA]
... produced for one hackathon and then stopped
09:01:46 [PhilA]
... we're tackling that with linked data
09:02:14 [PhilA]
... NL has a lot of open data around legislation, case law etc. Gov not using it, they're buying it from people who put wrapper around our data and sell it back
09:02:36 [Lieke]
Lieke has joined #odw
09:02:39 [PhilA]
... can we reduce the amount of money we spend on getting our own data and maybe we can profit from it ourselves
09:03:32 [PhilA]
Hayo: We notice that policy makers often say "I base my policy on law x" - people make comments or annotations - we can use those in linked data and make the data more useful
09:03:42 [PhilA]
... shows nice labelled directed graph
09:04:31 [PhilA]
Hayo: We're allowing people to make real links between laws, policies, their text or whatever
09:04:37 [PhilA]
... what marketeers call deep linking
09:05:03 [PhilA]
... we reward people for linking to laws. We contact people and say, Ok you link to the law, how about linking to this policy
09:05:16 [PhilA]
... we can notify people that link to a law as it's clearly important to them
09:05:25 [PhilA]
... laws have versions
09:05:38 [PhilA]
... need to be able to point to a law as it was in 2010 etc.
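The versioning point above (pointing to a law as it stood on a given date) can be sketched in a few lines. This is only an illustration, assuming each version is recorded with the date it came into force; the identifiers, dates and the function name `version_in_force` are hypothetical and not taken from the Dutch system described in the talk.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical version history, sorted by the date each version came into force.
VERSIONS = [
    (date(2008, 1, 1), "law-x/2008-01-01"),
    (date(2010, 7, 1), "law-x/2010-07-01"),
    (date(2012, 3, 15), "law-x/2012-03-15"),
]


def version_in_force(versions, on):
    """Return the identifier of the version in force on `on`, or None if none existed yet."""
    starts = [start for start, _ in versions]
    # bisect_right counts the versions whose start date is <= `on`.
    i = bisect_right(starts, on)
    return versions[i - 1][1] if i else None


print(version_in_force(VERSIONS, date(2010, 12, 31)))
```

A link that carries the date as well as the law can then keep resolving to the same text even after the law is amended.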
09:06:34 [PhilA]
Hayo: System will be available in September - getting government people enthusiastic about using their open data. This is a good example of showing govs how they can use their data
09:06:44 [PhilA]
... of course others can use it too.
09:07:19 [PhilA]
Topic: Open Data on the Web: 3 Principles For Maximum Participation, Bob Schloss, IBM
09:07:31 [PhilA]
paper http://www.w3.org/2013/04/odw/odw13_submission_54.pdf
09:07:44 [PhilA]
slides (already!) http://www.w3.org/2013/04/odw/W3COpenDataBriefingMaximumParticipation2013Apr19.pdf
09:07:51 [cjg_]
cjg_ has joined #odw
09:08:41 [PhilA]
BobS: We put together what we've put together when considering what we think might be missing
09:08:52 [PhilA]
BobS: I think it's great when we get lots of open PSI
09:09:04 [PhilA]
... we need it in educational, arts and business worlds too
09:09:13 [PhilA]
... we need to get a virtuous circle where value is created
09:09:38 [amp]
amp has joined #odw
09:09:39 [PhilA]
... looking at an Irish linked data front end
09:09:41 [HadleyBeeman]
HadleyBeeman has joined #odw
09:10:14 [PhilA]
BobS: we started in Oct 2011 with 4 Irish authorities (Dublin + 3)
09:10:32 [PhilA]
BobS: Looked at the cost/benefits of uploading open data
09:10:53 [PhilA]
... this issue that the people who publish trust that their effort will deliver a return
09:11:05 [PhilA]
... people have to want your data and they want it in their format
09:11:07 [yaso]
yaso has joined #odw
09:11:10 [PhilA]
... (not yours)
09:11:24 [PhilA]
... you need to be able to state how complete is the data, when and where does it cover etc.
09:11:44 [PhilA]
... whole cluster of ideas
09:11:59 [PhilA]
... you can synthesise this open data with yours and do good stuff
09:12:02 [StevenPemberton]
s/forma/format/
09:12:07 [PhilA]
BobS: The three principles
09:12:25 [PhilA]
bobS: (see slides)
09:12:45 [PhilA]
s/slides/ slide 6/
09:12:48 [cjg]
cjg has joined #odw
09:12:54 [StevenPemberton]
slide 6
09:12:59 [StevenPemberton]
s/slide 6//
09:14:07 [PhilA]
BobS: Slide 7 for the second principle
09:14:25 [PhilA]
... talking about things like showing logos for limited time, potentially contacting data users
09:14:48 [PhilA]
... need to be able to log if there's a new version of the data
09:16:26 [JeniT]
disturbed a bit about the additional limitations bschloss is suggesting for "open" data
09:16:43 [JeniT]
seems to be stretching what "open" means beyond the usual definitions
09:16:45 [cjg]
Bingo!
09:16:46 [StevenPemberton]
+1
09:16:59 [cjg]
"What if terrorists use our data" is on my bingo card: http://is.gd/gXDEaG
09:18:35 [cjg]
(but to be fair, hazardous materials is actually a reasonable dataset to keep limited access. )
09:19:32 [PhilA]
Topic: Q&A session
09:19:32 [StevenPemberton]
Except if you want to see if there is hazardous material stored near your school. #west
09:19:45 [yaso_]
yaso_ has joined #odw
09:19:58 [edsu]
cjg: puts a new spin on the JISC's ‘The coolest thing to do with your data will be thought of by someone else.’
09:20:18 [PhilA]
JohnS: We make institutional commitments
09:20:36 [PhilA]
Hayo: Our governments trust third parties more than our open data
09:20:39 [markbirbeck]
markbirbeck has joined #odw
09:20:42 [takumi]
takumi has joined #odw
09:20:43 [PhilA]
... we're trying to educate them
09:21:09 [PhilA]
TimD: We're trying to talk about purposes and use of data more than you need to publish in a given format etc.
09:21:42 [PhilA]
Millie: This is a room full of evangelists, the shift in thinking needed is enormous, don't underestimate that
09:22:01 [masao]
masao has joined #odw
09:22:27 [PhilA]
TomHeath: I like John's quotes. I don't like "if you agree with me you're wise if not you're a fool"
09:22:40 [PhilA]
TomHeath: How do we convince others of the wisdom
09:23:15 [PhilA]
BobS: What we're doing in Dublin - we capture the identity of the app, program and org that downloads everything and there's an offline process for assessing the value of that
09:23:30 [PhilA]
... then go back to the data publisher and tell them what's going on, what people are doing with your data
09:23:50 [cjg_]
cjg_ has joined #odw
09:24:28 [edsu]
aside: best way to convince people is to show them the utility of it, not appealing to their better (wiser) nature imho
09:24:46 [pascalRomain]
pascalRomain has joined #odw
09:25:17 [cjg_]
cjg_ has joined #odw
09:26:34 [edsu]
&coffee;
09:26:56 [cjg_]
edsu: I swear that we've had people suggest that if terrorists got access to the live bus times they could use it… there's a wear and tear on my desk from banging my head on it.
09:27:34 [PhilA]
PhilA has joined #odw
09:27:59 [PhilA]
PhilA: grrr dropped off IRC, sorry, missed some comments and questions
09:28:02 [cjg_]
yeah, I've got a talk at IWMW this year about how open data can get better value for money -- seems a good way to think about it in these tightened times.
09:28:28 [PhilA]
BobS: IBM has been looking at specific cities. We don't push up hill - we find the people that want to do open data
09:28:38 [PhilA]
BobS: We also need to find the person in the street
09:28:51 [PhilA]
... we don't have 'how open data can improve your life' days
09:29:04 [PhilA]
Hayo: Yes, talk about problems, not open data
09:29:25 [PhilA]
TimD: Yes, we want data you can build upon in gov and society
09:29:41 [PhilA]
TimD: Lots of great examples from places like Sao Paulo
09:29:59 [PhilA]
... talking about accountability and capacity not open data
09:30:08 [danbri_]
danbri_ has joined #odw
09:30:08 [cjg]
We have a policy of always putting a front end on our open data; even if it's as simple as a basic HTML page. 99% of the users are just using that and not the underlying data, but that's OK.
09:30:14 [PhilA]
... so the new research project will include lots of case studies from Brazil.
09:30:37 [PhilA]
BobS: In Africa, the knowledge of prices for their farming goods is transforming farming
09:30:50 [edsu]
cjg: :-D
09:30:59 [PhilA]
BobS: So we've been working on projects for people who can't read - working on spoken web in India
09:31:13 [StevenPemberton]
rrsagent, make minutes
09:31:13 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
09:31:27 [PhilA]
Millie: In the Balkans we have an issue of forest fires and consequent air quality
09:31:35 [PhilA]
... I want to know if my child can go out on the street
09:31:42 [PhilA]
... we have kids building air quality monitors
09:31:51 [PhilA]
... we move to solutions too quickly
09:32:33 [edsu]
cjg: it's hard to develop all the apps/visualizations people want ; giving them the data and empowering them to do it seems like a no brainer -- except to people who don't want new interesting visualizations of their data :)
09:32:45 [PhilA]
JeniT: For Bob - you spoke about the need for collecting data about people using the data and restricting terrorists' access - that's not the usual definition of open data
09:32:56 [PhilA]
BobS: I see a spectrum, not a point
09:33:17 [cjg]
I generally tell people that "open" means removing as many barriers as possible
09:33:30 [PhilA]
... we're going to have rock solid stuff - it will be there and accurate for 9 years. Then there's softer and softer - we need to cover the spectrum
09:33:31 [cjg]
the barriers can be technical, social or legal.
09:33:58 [cjg]
"as open as possible" can still be used to describe data which is confidential.
09:34:03 [HadleyBeeman]
For reference, I think JeniT is referring to the Open Definition http://opendefinition.org/
09:34:09 [markbirbeck]
markbirbeck has joined #odw
09:34:35 [HadleyBeeman]
Great question… I've been wondering as well if we're still having the same discussions (as we were a year or two ago).
09:34:37 [PhilA]
bhyland: Yes. we're all evangelists but we're not working in a vacuum. There are people in gov who are not minded to hand data over to a bunch of smart people they don't trust
09:35:21 [PhilA]
Hayo: It takes patience. We have to change contracts occasionally. We changed our legislation publishing contractor 5 years ago - that made a big difference
09:35:56 [StevenPemberton]
I think he said that it took 5 years to change the contract
09:36:03 [tomag]
tomag has joined #odw
09:36:21 [StevenPemberton]
and only then could they use their own data
09:36:34 [PhilA]
Millie: Scribe note - sorry, I missed Millie's comment about Pulse??
09:37:42 [markbirbeck]
PhilA: Yes, my pencils are sharpened.
09:38:00 [yaso_]
yaso_ has joined #odw
09:38:27 [PhilA]
BillR: My experience as a private sector person working for gov - see that some of the bigger people are only just picking up the potential for open data. Some early birds are winning
09:38:40 [StevenPemberton]
s/PhilA: Yes, my pencils are sharpened.//
09:38:46 [PhilA]
Last thoughts...
09:39:04 [PhilA]
JohnS: Spend more time talking to people not involved with open data about fixing problems
09:39:15 [PhilA]
BobS: OD is a means, not an end. Talk about the ends
09:39:25 [bhyland]
bhyland has joined #odw
09:39:25 [PhilA]
Hayo: OD will take time and money. Maybe 5 years +
09:39:32 [PhilA]
BillR: +1
09:39:34 [floppy]
floppy has joined #odw
09:39:53 [PhilA]
Millie: UNDP uses taxpayers' money to change people's lives - we need help
09:40:08 [PhilA]
TimD: Think about who's in the room when we define standards
09:40:46 [markbirbeck]
Topic: The Role of PDF and Open Data (Jim King, Adobe)
09:41:17 [StevenPemberton]
scribenick: markbirbeck
09:41:19 [timdavies]
timdavies has joined #odw
09:41:23 [StevenPemberton]
Scribe: Mark Birbeck
09:41:34 [bhyland]
bhyland has joined #odw
09:41:47 [StevenPemberton]
Paper: http://www.w3.org/2013/04/odw/odw13_submission_52.pdf
09:42:17 [bhyland]
Concluding remark from first session: "Open data is a means, not an end. Come at it from what real world problems it will solve."
09:42:24 [cjg]
"
09:42:46 [HadleyBeeman]
HadleyBeeman has joined #odw
09:42:49 [markbirbeck]
Paul Davidson introducing James King — senior principal scientist at Adobe — to talk about how PDF is more open than we all think it is.
09:42:57 [edsu]
BibS++ concur
09:44:27 [markbirbeck]
Structure of talk: open data paradigm, PDF itself, and then its role in open data.
09:44:42 [bhyland]
bhyland has joined #odw
09:44:44 [StevenPemberton]
s/:/-
09:44:46 [jpcs1]
jpcs1 has joined #odw
09:45:08 [markbirbeck]
Organisations taking data, shaping it and presenting it.
09:45:33 [markbirbeck]
…but others — the "processors" — would prefer to deal with the raw data...
09:45:58 [markbirbeck]
…they might present that too, but also use the data to draw new conclusions, or use it for advocacy.
09:46:17 [markbirbeck]
…A further group is that of the tool providers, who will help us process this data.
09:46:40 [markbirbeck]
…About 30% of the room are providers...
09:46:53 [markbirbeck]
…80% are processors...
09:47:23 [markbirbeck]
…most are consumers, and some are tool providers.
09:48:11 [markbirbeck]
…PDF will be 20 years old this June.
09:48:23 [cjg]
cjg has joined #odw
09:48:24 [markbirbeck]
…PDF and Acrobat are different beasts.
09:48:57 [markbirbeck]
…The internals of PDF have always been published, and it became an ISO Standard in 2008.
09:49:06 [PhilA]
PhilA: Nice approach to backwards compatibility from Adobe for PDF
09:49:18 [markbirbeck]
…A PDF 1.0 doc is also a 1.7 doc — always backwards compatible.
09:49:27 [bhyland]
Jim King: PDF will be 20 years old this June. PDF 1.7 became an ISO Standard in July 2008. ISO work on PDF is ongoing.
09:49:40 [edsu]
hopefully mozilla's pdf.js will get a mention ...
09:50:55 [markbirbeck]
…To make the PDF spec into a 'proper' ISO Standard the team at Adobe had to go through the entire document…very thoroughly…
09:51:12 [amp]
amp has joined #odw
09:51:31 [markbirbeck]
…PDFs are abundant, containing lots of useful information.
09:51:50 [cjg]
I had surprisingly good results converting our student union committee minutes from PDF to RDF: http://lemur.ecs.soton.ac.uk/~cjg/TheyWorkForSUSU -- just looking at where on the page text appears gives more semantics than the naive pdf2utf8 (or 2html) approach.
09:52:12 [markbirbeck]
…It's a format that distinguishes between text and graphics, and can be used to produce good looking documents.
09:52:22 [markbirbeck]
…But it's not a data format.
09:52:27 [edsu]
cjg: i think that's roughly what google scholar does when it scrapes pdfs
09:52:55 [bschloss]
bschloss has joined #odw
09:53:15 [markbirbeck]
…Billions of documents out there, but difficult to extract any data that's in there.
09:53:38 [edsu]
cjg: grabbing the largest text at the top of the first page as the title
09:53:51 [markbirbeck]
…If pages *contain* graphics then extract that with something like Illustrator.
09:54:09 [markbirbeck]
…If pages are text then there's a bunch of software that can process the text.
09:54:23 [markbirbeck]
…(A big list is on Wikipedia.)
09:54:37 [bschloss]
There is a 'spectrum of open data' -- totally free, available forever, no recording of downloader is one end of that spectrum, but airlines, investment markets, sports leagues, available job listing websites, retailers are all doing open data on a slightly different point on the spectrum.
09:54:50 [markbirbeck]
…And if the pages are images (i.e., rather than *containing* images) then need to go the OCR route.
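The three routes just described (graphics, text, OCR) amount to a small decision rule per page. The sketch below is only an illustration: `PageSummary` and its counts are hypothetical, and a real PDF library would have to supply them.

```python
from dataclasses import dataclass


@dataclass
class PageSummary:
    """Hypothetical per-page summary; the counts are assumed to come from a PDF parser."""
    text_chars: int   # characters of extractable text on the page
    vector_ops: int   # vector drawing operations (graphics contained in the page)
    images: int       # embedded raster images


def extraction_routes(page):
    """Pick the extraction routes for one page, following the three cases from the talk."""
    routes = []
    if page.text_chars:
        routes.append("text")      # text pages: use text-processing tools
    if page.vector_ops:
        routes.append("graphics")  # pages containing graphics: extract the drawings
    if not routes and page.images:
        routes.append("ocr")       # the page *is* an image: OCR is the only option
    return routes


print(extraction_routes(PageSummary(text_chars=0, vector_ops=0, images=1)))
```

Treating OCR as the fallback only when no text or vector content is present mirrors the distinction between pages that are images and pages that merely contain them.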
09:54:57 [trc]
trc has joined #odw
09:55:07 [StevenPemberton]
http://en.wikipedia.org/wiki/Comparison_of_optical_character_recognition_software
09:55:30 [cjg]
We found a nice command line tool which converts PDF to an XML representation of the data structure inside and that gets it into our 'hacking comfort zone'
09:55:46 [ivan]
-> http://en.wikipedia.org/wiki/List_of_PDF_software wikipedia list for pdf tools
09:56:11 [markbirbeck]
http://en.wikipedia.org/wiki/Comparison_of_optical_character_recognition_software
09:56:19 [markbirbeck]
http://en.wikipedia.org/wiki/List_of_PDF_software wikipedia list for pdf tools
09:56:25 [markbirbeck]
(Thanks Ivan and Steven!)
09:56:58 [markbirbeck]
…If you're making PDFs, here's what you could do to make things easier.
09:57:37 [markbirbeck]
…Making files that both contain raw data and look good is difficult.
09:58:32 [markbirbeck]
…There *is* software around that can embed metadata to provide structural information.
09:58:40 [AndreaP]
AndreaP has joined #odw
09:59:09 [bschloss]
Seems to me that any producer of a PDF who wants it to be available to people with no sight is hopefully providing a table or textual alternative rendering in the PDF for any diagram or image in the PDF, yes?
09:59:17 [markbirbeck]
…The structural information would be stuff like reading order, tags such as headers, footnotes, figures, maths, and so on.
09:59:42 [markbirbeck]
…Tools can make use of this extra data which will make the extraction process much more reliable.
10:00:13 [markbirbeck1]
markbirbeck1 has joined #odw
10:00:25 [markbirbeck1]
…A second thing to do is make use of the attachment facility.
10:00:37 [markbirbeck1]
scribenick: markbirbeck1
10:00:45 [markbirbeck1]
…A second thing to do is make use of the attachment facility.
10:01:08 [lottebelice]
lottebelice has joined #odw
10:01:27 [markbirbeck1]
…Raw data on its own is probably insufficient for doing something useful.
10:01:48 [StevenPemberton]
rrsagent, make minutes
10:01:48 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
10:01:53 [markbirbeck1]
…For example, what's the currency? the data format? the semantics of the fields? provenance?
10:02:00 [alex]
the attachments-in-PDFs thing might actually be useful for scholarly publications, so that the data doesn't get divorced from the paper
10:02:27 [markbirbeck1]
…So we create a PDF file that contains raw data with a schema, giving the end-user everything they need.
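The idea of shipping raw data together with a description of it can be sketched without any PDF tooling. In the sketch below the CSV content, the descriptor keys and the function name `missing_descriptions` are all hypothetical; the talk did not specify a schema format, and attaching the two files to a PDF is not shown.

```python
import csv
import io

# Hypothetical raw data that might be attached to a PDF report.
RAW_CSV = "year,amount\n2011,1200\n2012,1350\n"

# Hypothetical descriptor answering the questions raised in the talk.
DESCRIPTOR = {
    "currency": "EUR",
    "data_format": "text/csv",
    "fields": {
        "year": "calendar year",
        "amount": "spend in the stated currency",
    },
    "provenance": "illustrative values only",
}


def missing_descriptions(raw_csv, descriptor):
    """Return the column names in the data that the descriptor does not explain."""
    header = next(csv.reader(io.StringIO(raw_csv)))
    return [name for name in header if name not in descriptor["fields"]]


print(missing_descriptions(RAW_CSV, DESCRIPTOR))
```

A check like this could run before publication so that no column reaches the end-user without its meaning stated.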
10:02:51 [StevenPemberton]
s/(Thanks Ivan and Steven!)//
10:03:04 [alex]
bhyland: yeah, presumably there's not the tools support beyond what Adobe sells
10:03:06 [markbirbeck1]
…Can then make use of all the nice PDF features that have evolved over the last 20 years, such as digital signing.
10:04:03 [markbirbeck1]
…There are some examples in the slides.
10:04:23 [AndyS]
AndyS has joined #odw
10:04:23 [edsu]
bhyland: same could be said of most metadata on the web
10:04:48 [markbirbeck1]
Peter Murray-Rust: Spent years hacking PDFs in the wild.
10:05:15 [markbirbeck1]
…Trying to write software that will process them, but they are generally pretty bad.
10:06:08 [markbirbeck1]
…If anyone else is trying to hack on this then please talk to me; there's hundreds of billions of dollars worth of information out there that is simply unusable at the moment.
10:06:31 [cjg]
I had a bit of a rant about PDFs as a way of communicating data to a reporter from the register, which resulted in them publishing this: http://lemur.ecs.soton.ac.uk/~cjg/Archive/Photos/2011/cjg-boffin.png (I'm quite proud of that)
10:06:39 [markbirbeck1]
Dan Brickley: Is this thing loud enough?
10:07:04 [markbirbeck1]
…PDF can be used well and powerfully, and of course it's clear that some people aren't using it well.
10:07:08 [edsu]
heh, re: billions of dollars worth of information that's unusable, you have to wonder if that's by design, not by accident ...
10:07:14 [markbirbeck1]
…You didn't mention XMP, though, which includes RDF.
10:07:24 [markbirbeck1]
…You also didn't mention accessibility.
10:07:44 [bhyland]
Peter Murray-Rust - Scientific publishers are paid $10B/yr worldwide to lock up scholarly publishing, that is after governments spend $100B/yr globally on scientific funding for R&D in the first place. He is looking for people to help him in his mission to unlock the enormous value locked in PDFs.
10:08:45 [serena_v]
serena_v has joined #odw
10:08:46 [bhyland]
s/Murry-Rust/Murray-Rust
10:09:08 [markbirbeck1]
James: The accessibility aspects are quite mature in PDF, and the structured aspects help that.
10:09:13 [roger]
roger has joined #odw
10:09:15 [StevenPemberton]
PDF is a page description language, so not in a reading order necessarily
10:10:04 [markbirbeck1]
…We don't have much control over what people produce, although things have improved in the last 5 years.
10:10:45 [bhyland]
@edsu - perhaps re: your comment above. My experience suggests that we're more thoughtful publishing structured data about data sets (metadata) because they are fewer in quantity whereas PDFs are like water, they are everywhere and almost "too easy" to create by the mere click of "Print —> PDF" …
10:11:03 [bhyland]
s/but the/by the
10:11:07 [markbirbeck1]
speaker: For many people PDF data is closed data.
10:11:31 [yaso]
yaso has joined #odw
10:11:55 [markbirbeck1]
HadleyBeeman: You've outlined many things I didn't know were possible, so why is there not the uptake on these features?
10:12:45 [bhyland]
@hadleybeeman - because the tools are proprietary, complex to use … at least harder than clicking "Print —> PDF" and well let's face it, people are lazy and hand entered metadata has been proven to be *very* challenging and highly inconsistent.
10:13:36 [markbirbeck1]
James: Not sure if it's our fault. In some areas there have been successes, perhaps where there's industry interest or our sales people have promoted a feature.
10:13:52 [StevenPemberton]
s/speaker2/hadleybeeman/
10:14:07 [markbirbeck1]
s/speaker2/HadleyBeeman/
10:14:21 [markbirbeck]
markbirbeck has joined #odw
10:15:08 [alex]
If they want stuff like metadata to be adopted, then surely they need to encourage support in tools other than their own (OpenOffice; Word)
10:16:19 [hideaki]
hideaki has joined #odw
10:17:16 [yoshiaki]
yoshiaki has joined #odw
10:17:33 [hideaki]
hideaki has left #odw
10:18:26 [JeniT]
JeniT has joined #odw
10:26:39 [jpcs1]
jpcs1 has joined #odw
10:29:19 [StevenPemberton]
StevenPemberton has joined #odw
10:31:16 [floppy]
floppy has joined #odw
10:31:39 [fumi]
fumi has joined #odw
10:32:40 [rjw]
rjw has joined #odw
10:33:05 [bhyland]
bhyland has joined #odw
10:33:16 [cjg]
cjg has joined #odw
10:33:18 [stressindikator]
stressindikator has joined #odw
10:33:26 [cgueret]
cgueret has joined #odw
10:33:47 [st]
st has joined #odw
10:33:56 [markbirbeck]
markbirbeck has joined #odw
10:34:15 [ldodds]
ldodds has joined #odw
10:34:17 [hideaki]
hideaki has joined #odw
10:34:17 [yoshiaki]
yoshiaki has joined #odw
10:34:44 [StevenPemberton]
StevenPemberton has joined #odw
10:34:46 [yaso]
yaso has joined #odw
10:34:48 [rtroncy]
rtroncy has joined #odw
10:35:06 [takumi]
takumi has joined #odw
10:35:12 [StevenPemberton]
rrsagent, here?
10:35:13 [RRSAgent]
See http://www.w3.org/2013/04/23-odw-irc#T10-35-12
10:35:30 [bhyland]
Topic: Panel: Tabular Data Formats and Packages Chair: Jeni Tennison
10:36:17 [bhyland]
Jeni: sets the tone around different formats for tabular data, advantages & disadvantages of various approaches.
10:36:33 [bhyland]
… NB: Special allowance for Rufus who has been known from time to time to go on ...
10:36:44 [markbirbeck1]
markbirbeck1 has joined #odw
10:36:50 [serena]
serena has joined #odw
10:36:54 [bhyland]
Rufus: Intro on OKFN and their mission to liberate data
10:37:03 [HadleyBeeman]
HadleyBeeman has joined #odw
10:37:09 [cjg_]
cjg_ has joined #odw
10:37:12 [bhyland]
… Proposed "Our Mission": to make it radically easier for data to be used & useful
10:37:16 [ivan]
scribenick: bhyland
10:38:26 [bhyland]
Rufus: Stated problem of data on the Web in many different formats & issues that poses.
10:38:38 [cjg_]
http://data.okfn.org/
10:38:45 [floppy]
floppy has joined #odw
10:38:46 [johnlsheridan]
johnlsheridan has joined #odw
10:39:22 [naomi]
naomi has joined #odw
10:39:31 [bhyland]
… propose 3 minor innovations involving " borrowing" approaches others have used before us.
10:40:31 [bhyland]
In this model, there are the usual suspects … data creators & packagers, consumers and the effort in the middle to do "data packaging"
10:41:03 [bhyland]
… Linked Data effort has been knowledge APIs and has been successful [to varying degrees]
10:42:40 [bhyland]
… Packaging has to be done as a distinct step that minor packaging effort, agnostic about the data, its packaging is designed specifically ...
10:42:52 [masao]
masao has joined #odw
10:43:18 [pieterc]
pieterc has joined #odw
10:43:50 [bhyland]
… Today, there is a huge amount of friction on getting & using data on the Web. We want to build for the Web. Rufus said RDF is not Web native … he has been laughed at when he proposed its use ...
10:44:31 [bhyland]
Proposal: 1 - One (small) part of the data chain; 2- Build for the Web; 3 − 4 − 5 [too fast to record]
10:44:53 [AndyS]
AndyS has joined #odw
10:44:58 [PhilA]
rrsagent, make logs public
10:45:04 [bhyland]
Concluding remark: Package data more effectively and produce one killer tool to make data more accessible.
10:45:06 [PhilA]
rrsagent, draft minutes
10:45:06 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html PhilA
10:45:22 [bhyland]
Speaker: Omar Benjelloun, Google, DSPL
10:46:12 [bhyland]
Omar highlighted Google public search feature, Knowledge Graph capability and origins.
10:46:14 [HadleyBeeman]
s/DSPL/GPDE
10:46:44 [StevenPemberton]
StevenPemberton has joined #odw
10:46:55 [bhyland]
… Highlighted the Public Data Explorer, using Data Cube representation. Anyone can upload & share data using RDF.
10:47:44 [HadleyBeeman]
Oh. ignore my s/DSPL/GPDE
10:47:52 [bhyland]
DSPL = Dataset Publishing Language, describes tabular data + semantic description including concepts describing re-useable data types. All packaged in a zip file. Visualizations can be shared.
10:48:09 [StevenPemberton]
rrsagent, hre?
10:48:09 [RRSAgent]
I'm logging. Sorry, nothing found for 'hre'
10:48:15 [StevenPemberton]
rrsagent, here?
10:48:15 [RRSAgent]
See http://www.w3.org/2013/04/23-odw-irc#T10-48-15
10:48:54 [bhyland]
Omar's Propositions: Datasets need good Web pages with a stable, official, up-to-date canonical location. Also, add good markup for reasonable SEO.
10:49:07 [StevenPemberton]
s;s/DSPL/GPDE;;
10:49:29 [AndyS]
AndyS has joined #odw
10:49:54 [bhyland]
… Let tables be tables. [Let it be… ] Relational data & schema are well understood. Better than triples: tables naturally capture relations. Better than APIs: no access patterns, scalability issues.
10:50:28 [bhyland]
… Add semantic annotations to tables. Leverage EXISTING approaches (RDF, schema.org) [emphasis is scribe's :-)]
10:50:47 [bhyland]
… Better to follow this approach than create custom data models (SDMX, DSPL).
10:51:10 [bhyland]
Next speaker: Stuart Williams, Epimorphics
10:51:49 [bhyland]
Overview of Epimorphics, doing services and LD design work. Working with data.gov.uk. Helped to lay down some of the sand that John Sheridan previously described.
10:52:05 [bhyland]
Working to publish bathing water quality, now expanding to the river network in the UK.
10:52:54 [bhyland]
… Thinking about getting 'beyond the data', we feel that we need to get beyond the 4 & 5 Star Data attribute, and evolve the message to solving a real world problem.
10:53:23 [bhyland]
… Works with the UK Environment Agency to make publication of valuable data … easy!
10:53:44 [bhyland]
… Think about how to allow publishers to add simple bits of markup.
10:54:09 [bhyland]
… Think about how to contribute to the virtuous circle of making it easy to contribute something valuable & receiving something valuable.
10:54:16 [pieterc]
pieterc has joined #odw
10:54:23 [bhyland]
Next speaker: John Snelson from MarkLogic
10:54:42 [bhyland]
Describes himself as an XML-guy and actively involved in W3C around those recommendations.
10:55:01 [hideaki]
hideaki has joined #odw
10:55:04 [bhyland]
… MarkLogic helps its customers use data effectively using XML.
10:55:26 [timdavies]
timdavies has joined #odw
10:55:26 [bhyland]
… John is a data pragmatist. We must look beyond those formats.
10:55:53 [bhyland]
Next Speaker: Tyng-Ruey Chuang, from Academic Sinica in Taiwan
10:56:23 [bhyland]
Involved in Taiwan's culture heritage efforts.
10:56:25 [PhilA]
Tyng-Ruey Chuang, Academia Sinica (Taipei) see http://www.iis.sinica.edu.tw/pages/trc/
10:56:40 [StevenPemberton]
s/Academic/Academia/
10:56:49 [cjg]
I can't ask our catering department to provide menus in a well structured RDF format :-)
10:56:56 [cjg]
(much as I wish they would)
10:57:29 [bhyland]
… dealing with heterogeneous collections of content including media files, documentation. His focus is on sharing & making cultural heritage content usable for the long term.
10:57:55 [StevenPemberton]
rrsagent, make minutes
10:57:55 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
10:57:58 [bhyland]
… Putting data on the Web itself does not guarantee longevity.
10:58:53 [StevenPemberton]
i/Jeni: sets the tone/scribenick: bhyland
10:58:56 [StevenPemberton]
rrsagent, make minutes
10:58:56 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
10:59:23 [bhyland]
… We can & should learn from the Free Software Foundation. Supports giving people the ability to make copies of content. Highlighted the importance of content being portable to many other computer systems, both on & off the Web, for it to be considered truly open.
11:00:34 [bhyland]
Panel Convener is Jeni … She puts the following question to Rufus. Q) There is debate on how to manage metadata, to embed or not ...
11:00:47 [StevenPemberton]
i/Jeni: sets the tone/scribe: Bernadette Hyland
11:01:08 [StevenPemberton]
rrsagent, make minutes
11:01:08 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
11:02:51 [bhyland]
Rufus: Regarding embedding, it almost becomes an AI project to figure out metadata that is embedded. It can be a nightmare. The beauty of keeping it separate is it is easier on tools & therefore treatment by tools. He is supportive of graceful degradation.
11:03:44 [DeirdreLee]
DeirdreLee has joined #odw
11:04:09 [bhyland]
Tyng-Ruey Chuang: Prefers to have structured schema as part of the data (?)
11:04:59 [bhyland]
Omar: Mainly, the important thing is to get agreement on format, then all kinds of good things can happen. Linking tables & metadata to Web pages (authoritative) is really important.
11:05:37 [bhyland]
Stuart: We've been using this word "metadata" which leads us to schema information. In RDF world, we can click through to it & immediately see it.
11:06:14 [bhyland]
… Using RDF model, you don't have to scramble all over the Web, rather, you get bits of schema info back because it is carried *with* the data.
11:07:15 [bhyland]
… Highlighted the perils of carrying so much provenance information that it drowns out the important data itself.
11:07:17 [cjg]
Quite simply, tabular data requires a lower cognitive load to work with. Most people can't be bothered to learn to think in graphs. So tabular is more open because it's easier to comprehend.
11:07:34 [edsu]
aside: embedded metadata (facebook opengraph, schema.org) is getting published because it is getting used
11:07:47 [HadleyBeeman]
cjg I wonder how much of that is because our computer science training wasn't very graph-focused. Next generation might be different?
11:07:54 [edsu]
i don't buy the argument that it needs to be separate ...
11:08:13 [bhyland]
Questions from the audience ...
11:08:59 [bhyland]
Ivan: When we speak of metadata, my biggest issue is what vocabularies to use. It is the biggest problem we have to solve, even more important than the data format/model … if we had widely available vocabularies, it would solve many problems.
11:09:13 [cjg]
HadleyBeeman: I'm talking about the people who maintain my data. They are *not* computer scientists… they are in finance, buildings & estates, catering...
11:10:03 [cjg]
http://data.southampton.ac.uk/dataset/catering.html
11:10:18 [bhyland]
Rufus: If you meet most developers, and start talking about vocabularies, "they'll run for the hills." Been part of countless long fights over which vocab to use. Suggested a new site called http://GiveMeTheDamnSchema.org as a joint project of cygri and Rufus ;-)
11:10:23 [HadleyBeeman]
cjg: Ah, I see. Yes, different user base there.
11:10:55 [cjg]
I went to see what they already had, tidied it all up in excel and moved it to google spreadsheets so it was easy to grab automatically.
11:11:15 [andyhedges]
andyhedges has joined #odw
11:11:15 [bhyland]
… What is the minimum to make CSV files useful. Just give me the basics, string, integer. This is *our* problem, not publishers. I'm all about 'reducing the time' … open vs. closed data.
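A minimal sketch of the "just give me the basics, string, integer" idea, assuming the type information lives in a small schema kept separate from the CSV file. The column names, the sample rows and the `read_typed` helper are hypothetical and only illustrate the point; they are not taken from any published specification.

```python
import csv
import io

# Hypothetical minimal schema: column name -> basic type name.
SCHEMA = {"name": "string", "count": "integer"}

# Only the two basic types mentioned in the discussion are supported.
CASTS = {"string": str, "integer": int}


def read_typed(csv_text, schema):
    """Parse CSV text and cast each column using the schema's basic types."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {column: CASTS[schema[column]](value) for column, value in row.items()}
        for row in reader
    ]


# Made-up sample data, for illustration only.
rows = read_typed("name,count\nalpha,10\nbeta,20\n", SCHEMA)
```

The CSV file itself stays a plain CSV file; a consumer that ignores the schema still gets strings, which is one way to read the "graceful degradation" mentioned earlier in the panel.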
11:11:25 [edsu]
problem hasn't been schemas per se, as much as it has been schemas divorced from their actual use
11:11:28 [bhyland]
… Licensing is a lower priority for many.
11:11:40 [bhyland]
… Ease of publishing is king
11:12:00 [cjg]
Also, I want to create a collection of SPARQL queries which produce useful spreadsheet downloads for humans to consume. Secretaries are a whizz with Excel, but only if the file loads first time. Telling them TSV can be "easily imported" is already outside their comfort zone.
11:12:01 [bhyland]
… Our mission is to reduce the cost & RDF, at the moment, is not doing that.
11:12:59 [bhyland]
Omar: If we want to bring data together, we have to harmonize into a common model. I don't know whether developers should have to be encumbered with that responsibility. But it is a real problem to solve.
11:13:51 [bhyland]
Bhyland notes, (not in a comment), there is a wide spectrum of opinions in the room & that is good to stimulate that discussion. Deepening understanding is key to all of this.
11:14:13 [AndyS]
AndyS has joined #odw
11:14:34 [bhyland]
Stuart: Finding the stuff in the first place, with schematic markup answering provenance information, is critical to solving the hurdles we face with better use of open data on the Web.
11:15:18 [alex]
cjg: I played with SharePoint/Excel integration yesterday, and it looks like you can get Excel to live-update from SharePoint lists; I suppose something similar could be done with s/SharePoint/SPARQL endpoint/
11:15:35 [bhyland]
John Snelson: Vocabularies have their place, but search is a great way to find data that is not expressed perhaps as nicely as we'd like...
11:15:36 [alex]
then SPARQL would be truly Enterprise™
11:15:43 [masao]
masao has joined #odw
11:16:42 [alex]
it would also be possible to embed the metadata for a table in a second sheet of an XLSX/ODS file, instead of prepending it to a CSV file
11:16:46 [cjg_]
cjg_ has joined #odw
11:16:52 [bhyland]
Questions from the mob: You've got to help represent/model data, but that is not the entire story. It is a "horses for courses" kind of thing. Please be careful not to reinvent RDF with JSON glasses on.
11:16:55 [pascalRomain_]
pascalRomain_ has joined #odw
11:18:17 [bhyland]
IBM guy - Dealing with data is hard. It is harder than process. We won't solve problems with data exchange standards alone. One thing we haven't heard about today is Best Practices and Architectural processes. We need to rise above data formats and really focus on data patterns, best practices.
11:18:22 [cjg]
I have this horrific image of people creating n-triples documents in Excel...
11:18:42 [yvesr]
cjg, i saw that being done *a lot* at the bbc
11:18:46 [cjg]
gah!
11:18:49 [pieterc]
cjg: why would that be horrific?
11:19:14 [cjg]
for one thing, excel plays silly buggers with certain values.
11:19:24 [bhyland]
Bhyland to 'IBM guy' - let's talk real soon — there is Best Practices work, albeit nascent, underway within W3C Gov't Linked Data working group & we'd welcome your input.
11:19:47 [cjg]
We have real trouble getting people to enter phone numbers without it getting muddled. 079671234567 gets converted to an integer as does +44....
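The phone number problem can be reproduced outside a spreadsheet. The sketch below, which uses only Python's standard `csv` module, contrasts keeping the field as text with guessing that it is a number; the "Front desk" row is a made-up example, and the second number is hypothetical.

```python
import csv
import io

# Made-up sample row; the first number is the one quoted in the discussion.
SAMPLE = "name,phone\nFront desk,079671234567\n"

row = next(csv.DictReader(io.StringIO(SAMPLE)))

# The csv module leaves every field as a string, so nothing is lost.
as_text = row["phone"]

# A spreadsheet-style numeric guess drops the leading zero.
as_number = int(row["phone"])

# The same happens to an international prefix (hypothetical number).
international = int("+441234567890")
```

Treating the column as a string, whether through a schema or through the tool's import settings, is what keeps the value intact.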
11:20:53 [bhyland]
Rufus: Described state of the world that is very fragmented, messy & dirty and urges us to not look for [utopian data model] that everyone is required to use.
11:21:58 [bhyland]
Tyng-Ruey: RE: Best Practices, validators would be helpful to check data representation is correct. Need: Better validators (Note to PhilA).
11:22:24 [StevenPemberton]
rrsagent, make minutes
11:22:24 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
11:22:32 [cjg]
Shout out for http://app.easyopendata.com/ - for converting live google spreadsheets into XML or RDF/XML etc.
11:22:48 [HadleyBeeman]
Oo, fun, cjg. Thanks.
11:22:50 [bhyland]
Omar: "I think we've been spoiled by the Web" because search engines have done a good job. The question is, can we make this Web of Data thing work such that we publish our metadata & data and have it easily found. This is the question.
11:22:57 [pieterc]
cjg: spreadsheets are for calculations, not data. CSV is a format which people use with spreadsheet programs, thus not suited for the job. Got your point?
11:23:43 [bhyland]
Peter Murray-Rust: To Omar - what do you do with things that are labelled as tables but really are not tables?
11:23:50 [cjg]
yeah, maybe we need a nice "CSV" editor?
11:24:04 [cjg]
Or even a "table" editor, using PMR's description.
11:24:07 [bhyland]
Omar: Smart people are working on it … it's complicated.
11:24:22 [pieterc]
cjg: thought of it as well already
11:24:36 [cjg]
basically a cut-down google docs.
11:24:46 [pieterc]
cjg: open refine? ;)
11:24:47 [BartvanLeeuwe]
BartvanLeeuwe has joined #odw
11:24:50 [bhyland]
John Snelson: Need to be able to break out & work with data in a schema-less fashion.
11:25:04 [cjg]
with a magic table heading
11:25:39 [bhyland]
John Sheridan asked, in the world of tables & CSVs and [screw the metadata], how are you prepared to deal with the license matter?
11:26:39 [bhyland]
Rufus: I didn't say, 'screw the metadata'. Rather, we need simplicity and innovation about process. He suggested having multiple parties be part of the "packaging process".
11:28:02 [bhyland]
… Clearly a license has to come from an authoritative source. Gave example about data from Bank of England. Two important points, we need minimal metadata and … [some one else augment please, scribe missed second point]
11:28:12 [cjg]
*if* the source of the metadata is the same website as the data then that's probably good enough for me.
11:28:47 [bhyland]
Wrap up from panelists - 'wear your data on the outside, use HTTP URIs to describe things if putting on the Web.'
11:29:00 [alex]
s/data/schemas/
11:29:16 [bhyland]
John: Great opportunity for tool developers to liberate data.
11:29:23 [StevenPemberton]
Topic: Lightning Talks with a linked data theme
11:29:28 [StevenPemberton]
Scribe: Steven Pemberton
11:29:30 [bhyland]
End of panel facilitated by Jeni. Thanks all.
11:29:35 [StevenPemberton]
scribenick: StevenPemberton
11:29:36 [pieterc]
I have a problem with the fact that the data are/is being able to be processed through quick bash scripts, or other low barrier scripting languages, but the meta-data needs a json parser
11:30:07 [bhyland]
Someone else able to scribe, please? Pretty please??
11:30:13 [StevenPemberton]
Topic: Linked Data, Open Data and Big Data: Understanding the need for all three, Mark Birbeck, Sidewinder Labs
11:30:34 [StevenPemberton]
rrsagent, make minutes
11:30:34 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
11:30:35 [jpcs1]
jpcs1 has joined #odw
11:31:46 [StevenPemberton]
markbirbeck1: I am from a semweb background
11:32:12 [StevenPemberton]
... software developer for decades
11:32:23 [johnlsheridan]
johnlsheridan has joined #odw
11:32:34 [StevenPemberton]
... [lists examples of RDF-based software project he has worked on]
11:32:45 [StevenPemberton]
s/project/projects/
11:33:00 [StevenPemberton]
... also involved with RDFa at W3C
11:33:11 [StevenPemberton]
... you can tell I'm setting things up to have a good moan
11:33:30 [StevenPemberton]
... usually data not available, or in inconvenient formats
11:33:35 [trc]
trc has joined #odw
11:33:38 [alex]
markbirbeck1: ooh, a jobs ontology. we wrote our own having found nothing in the wild (https://data.ox.ac.uk/id/dataset/vacancies)
11:33:40 [StevenPemberton]
... or not linked
11:34:20 [StevenPemberton]
... Lessons -
11:34:37 [StevenPemberton]
... - need a big cultural change to get open data
11:35:04 [StevenPemberton]
... - spreadsheets aren't that bad, don't need to wait for RDF
11:35:35 [edsu]
alex: http://schema.org/JobPosting
11:35:38 [StevenPemberton]
... but the timeframes were a big issue
11:35:58 [alex]
edsu: ah, cool; thanks :-)
11:36:14 [StevenPemberton]
... - Join question. Linked data would be great, but consistent code would be enough
11:36:27 [cjg]
hmm, is there a schema.org->RDF mapping? there must be...
11:36:50 [yvesr]
cjg, http://schema.rdfs.org/
11:36:59 [edsu]
cjg: there is, but really who cares?
11:37:06 [StevenPemberton]
... Big data is relevant, lessons learned from that are useful.
11:37:34 [floppy]
floppy has joined #odw
11:37:50 [CaptSolo]
CaptSolo has joined #odw
11:37:51 [StevenPemberton]
... Open data doesn't need to be RDF, use context
11:38:37 [StevenPemberton]
... only when you cross (company) boundaries, do things like schemas become importnant
11:38:42 [StevenPemberton]
s/nant/ant/
11:39:03 [StevenPemberton]
Topic: Publishing Linked Data Requires More than Just Using a Tool, Raphaël Troncy, Serena Villata & François Scharffe, Eurecom, INRIA wimmics, LIRMM University of Montpellier
11:39:33 [markbirbeck]
markbirbeck has joined #odw
11:40:13 [cjg]
edsu; me as we've just started publishing vacancy data last week! Making it Linked Data is useful as it can cross-reference to our URIs for various departments & faculties.
11:40:25 [StevenPemberton]
timbl: when you mention experience you've had, please say who you are/were working for, was it big or small project, public or private, etc.
11:40:32 [bhyland]
There goes TimBL again about context, context, context! ;-)
11:40:46 [HadleyBeeman]
Metadata for our conversations. :)
11:41:13 [PhilA]
who'd have thought context mattered for data eh bhyland?
11:41:16 [StevenPemberton]
markbirbeck: There was a layered approach to it in my case, people who had bought in but didn't know enough, which was worse
11:41:39 [StevenPemberton]
... but NHS in my case was an example, timing was bad because of looming cuts
11:42:08 [StevenPemberton]
... but I was naive too about the issues involved about publishing certain types of data and aggregation
11:43:29 [bhyland]
TimBL: Context is important. Users in intelligence community won't consider using data without provenance, won't even start the conversation or analysis.
11:44:03 [StevenPemberton]
Raphael: Most are tool builders here, but we need more than tools
11:44:27 [StevenPemberton]
... this is a report of what we have done at a "datalift data Camp" last year
11:44:42 [StevenPemberton]
... lifting data to 5 star status
11:45:10 [StevenPemberton]
... It worked a bit, but was a good learning experience
11:45:42 [StevenPemberton]
... varied data source types
11:45:54 [StevenPemberton]
... and varied companies, with different needs
11:46:40 [StevenPemberton]
... Datalift is a package with single click download
11:46:55 [StevenPemberton]
... cross-platform
11:47:03 [StevenPemberton]
... [shows workflow]
11:47:09 [yoshiaki_]
yoshiaki_ has joined #odw
11:47:18 [StevenPemberton]
... converts to RDF
11:47:27 [StevenPemberton]
... and then the interlinking
11:47:44 [StevenPemberton]
... used for two large data collections in France
11:48:00 [StevenPemberton]
... Difficulties are how to choose the right vocabulary
11:48:17 [StevenPemberton]
... rdf conversion, URI schemes to adopt
11:48:47 [StevenPemberton]
... automatic detection of datasets to link to
11:49:04 [StevenPemberton]
... LOV initiative, 260+ vocabs
11:49:20 [StevenPemberton]
... now open sourcee!
11:49:33 [StevenPemberton]
... http://lov.okfn.org
11:49:42 [StevenPemberton]
s/ee!/e!/
11:49:59 [bhyland]
I love how a French speaker says "LOV bot" as love boat.
11:50:19 [StevenPemberton]
... Conclusion - multilingual vocabs important
11:50:27 [StevenPemberton]
... hide complexity of sparql
11:50:37 [StevenPemberton]
... eg QAKIS
11:50:47 [StevenPemberton]
... Shape files are important
11:51:08 [masao]
masao has joined #odw
11:51:11 [StevenPemberton]
... INSPIRE directive and W3C GLD vocabs need to be covered
11:51:22 [bschloss]
Since Open Data is a means to several valuable ends, IBM is talking to our clients about thoughts of "becoming a Contextual Enterprise" and we emphasize the critical need to dynamically assemble context for every key input and output of their work, including the context of external data they import. See http://www.research.ibm.com/files/pdfs/gto_booklet_executive_review_march_12.pdf for very high-level summary of our recently released Global Technology Outlook.
11:51:26 [StevenPemberton]
... GTFS/DSPL formats
11:51:43 [StevenPemberton]
Topic: Linked Data at the Science Museum, Tristan Roddis, Cogapp
11:52:04 [StevenPemberton]
Tristan: We work with cultural heritage. Will talk about science museum now
11:52:11 [rtroncy]
rtroncy has joined #odw
11:52:11 [StevenPemberton]
... also a plea for help
11:52:36 [StevenPemberton]
... Science Museum is august and venerable, with loads of internal systems, we are trying to consolidate them
11:52:55 [rtroncy]
rtroncy has joined #odw
11:53:11 [StevenPemberton]
... we extract, and convert to linked data
11:53:20 [StevenPemberton]
... triple store
11:53:45 [pieterc]
rtroncy: how active is the development of Datalift? I haven't seen a lot of activity on the SCM
11:53:57 [StevenPemberton]
... built a data model, in cooperation with British Library, British Museum [others], see the paper
11:54:09 [StevenPemberton]
... use that to drive the website
11:54:29 [StevenPemberton]
... my plea for help is what should be the next steps
11:54:45 [StevenPemberton]
... how can we make it more open?
11:55:02 [StevenPemberton]
... Publication strategies, stable URIs, dereferencable etc
11:55:15 [StevenPemberton]
... IS the data model interoperable
11:55:21 [StevenPemberton]
s/IS/Is/
11:56:02 [StevenPemberton]
Topic: Open Linked Education: a new Community Group, Madi Solomon, Pearson
11:56:23 [StevenPemberton]
Madi: I am new to W3C, and open linked data devotee
11:56:51 [StevenPemberton]
... Pearson is a publishing company, owns Financial Times and some Penguin books.
11:57:04 [StevenPemberton]
... I think we are the first W3C publisher member
11:57:15 [StevenPemberton]
[applause]
11:57:21 [edsu]
that says a lot
11:57:39 [StevenPemberton]
Madi: There is a new Community Group at W3C with 23 members
11:58:09 [StevenPemberton]
[link here to CG please]
11:58:19 [HadleyBeeman]
http://www.w3.org/community/opened/ https://twitter.com/search?q=%23ODW13https://twitter.com/search?q=%23ODW13
11:58:21 [StevenPemberton]
Topic: Questions
11:58:42 [StevenPemberton]
Ivana: Raphael, what were the outcomes?
11:58:43 [HadleyBeeman]
Eek, sorry. Try this: http://www.w3.org/community/opened/
11:58:48 [bhyland]
Madi: Data + education is a natural fit. Whatever we can do to make it easy for students + instructors + open data advocates will together make the world a better place.
11:58:51 [bhyland]
+10
11:59:18 [StevenPemberton]
Raphael: It was part one of a two part process. We wanted clean data, the next step will happen later this year, to reuse the data to build apps.
11:59:37 [bhyland]
s/advocates will/advocates to get
12:00:02 [StevenPemberton]
Raphael: Some of data sets are just data dumps
12:00:20 [StevenPemberton]
s/Ivana/Irina/
12:01:27 [StevenPemberton]
q1: is there automatic linking between data possible?
12:01:31 [ivan]
s/[link here to CG please]/-> http://www.w3.org/community/opened/ Open Linked Education Community Group/
12:01:58 [StevenPemberton]
MarkBirbeck: It is not just topics
12:02:26 [StevenPemberton]
.... do you mean just numerics?
12:02:37 [StevenPemberton]
q1: Not necessarily,
12:03:13 [StevenPemberton]
MarkBirbeck: This is what I was referring to earlier, for instance trying to identify a company from different versions of its name
12:03:26 [StevenPemberton]
... URIs are a great goal, but you can get there earlier
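One way to read "you can get there earlier" is a consistent matching key used before any URIs are assigned. The sketch below is an assumption about what such a key could look like, not the approach described in the talk; the suffix list and the `company_key` helper are hypothetical.

```python
import re

# Assumed list of legal-form suffixes to ignore; not from the talk.
SUFFIXES = {"ltd", "limited", "plc", "inc", "llc", "co", "corp", "corporation"}


def company_key(name):
    """Return a normalised key so that name variants compare equal."""
    # Lower-case, then replace punctuation with spaces and split into words.
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    # Strip trailing legal-form words such as "Ltd" or "Co".
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)
```

Two records whose keys match are candidates for the same company; a key like this is much weaker than a shared identifier, so matches would still need checking.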
12:03:56 [StevenPemberton]
[SESSION ENDS]
12:05:13 [yoshiaki]
yoshiaki has joined #odw
12:07:21 [naomi]
naomi has joined #odw
12:12:09 [AndyS]
AndyS has joined #odw
12:28:03 [craig552uk]
craig552uk has joined #odw
12:32:15 [craig552uk]
craig552uk has joined #odw
12:35:07 [yoshiaki]
yoshiaki has joined #odw
12:52:02 [cjg]
cjg has joined #odw
12:58:45 [StevenPemberton]
StevenPemberton has joined #odw
12:59:05 [floppy]
floppy has joined #odw
12:59:40 [floppy1]
floppy1 has joined #odw
13:00:30 [rjw]
rjw has joined #odw
13:02:04 [HadleyBeeman]
HadleyBeeman has joined #odw
13:02:11 [masao]
masao has joined #odw
13:02:21 [cjg]
cjg has joined #odw
13:02:25 [HadleyBeeman]
scribenick: hadleybeeman
13:02:33 [JeniT]
JeniT has joined #odw
13:02:43 [fumi]
fumi has joined #odw
13:02:50 [HadleyBeeman]
Chair: LeighDodds
13:03:13 [rtroncy]
rtroncy has joined #odw
13:03:13 [HadleyBeeman]
topic: Data Interoperability
13:03:37 [pieterc]
pieterc has joined #odw
13:04:11 [yaso_]
yaso_ has joined #odw
13:04:17 [pieterc]
pieterc has joined #odw
13:05:11 [daveL]
daveL has joined #odw
13:05:41 [HadleyBeeman]
Kal Ahmed: Intro to talk on OData
13:05:58 [cerealtom]
cerealtom has joined #odw
13:06:01 [HadleyBeeman]
… OData is a standardised protocol for consuming and creating data APIs. -odata.org
13:06:20 [HadleyBeeman]
… originally conceived by Microsoft, this is bringing it into being a common protocol.
13:07:11 [HadleyBeeman]
… OData is entity-centric. Comes from .NET developers with tables of data. Standard itself defines how you publish your metadata: service metadata and schema.
13:07:13 [ivan]
scribenick: HadleyBeeman
13:07:33 [yoshiaki]
yoshiaki has joined #odw
13:07:37 [HadleyBeeman]
… OData has a URL-based syntax for access.
13:08:03 [HadleyBeeman]
… Includes inline expansion between entities
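A rough sketch of what the URL-based access syntax looks like in practice, using the `$filter`, `$top` and `$expand` system query options from the OData specification. The service root, the `Films` entity set and the `Director` navigation property are hypothetical, as is the `odata_url` helper.

```python
from urllib.parse import quote, urlencode

# Hypothetical service root, for illustration only.
SERVICE_ROOT = "http://example.org/odata"


def odata_url(entity_set, key=None, **options):
    """Build an OData-style URL; each option becomes a $-prefixed query option."""
    url = f"{SERVICE_ROOT}/{entity_set}"
    if key is not None:
        # A key in parentheses addresses a single entity in the set.
        url += f"({key})"
    if options:
        query = urlencode(
            {f"${name}": value for name, value in options.items()},
            safe="$,",          # keep the $ prefix and commas readable
            quote_via=quote,    # encode spaces as %20 rather than +
        )
        url += "?" + query
    return url


filtered = odata_url("Films", filter="Year gt 2000", top=10)
expanded = odata_url("Films", key=1, expand="Director")
```

Because the whole request is just a URL, it can be pasted into a browser to experiment with, which is the ease of experimentation the talk goes on to mention.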
13:08:08 [StevenPemberton]
rrsagent, make minutes
13:08:08 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
13:08:18 [yoshiaki]
yoshiaki has joined #odw
13:08:38 [HadleyBeeman]
… POST a representation to an entity set's URL. PUT, PATCH, MERGE, or DELETE.
13:09:04 [cjg]
I've never heard of MERGE or PATCH before…
13:09:17 [alex]
PATCH is only just a Thing, isn't it?
13:09:24 [johnlsheridan]
johnlsheridan has joined #odw
13:09:28 [HadleyBeeman]
… Other nice features: combines metadata properties with a special media source URL. Named streams. Ability to embed your own custom actions and functions and expose them as URLs
13:09:35 [JeniT]
PATCH is a proper thing, haven't heard of MERGE
13:09:47 [bschloss]
bschloss has joined #odw
13:09:54 [alex]
only just> March 2010, according to http://tools.ietf.org/html/rfc5789
13:10:06 [HadleyBeeman]
… There are a lot of reasons to like OData. You can reliably discover the schema. Clients are all linked. Easy to experiment using those URLs.
13:10:13 [pieterc]
alex: The DataTank supports PATCH
13:10:21 [HadleyBeeman]
… There is a javascript serialization format
13:10:39 [pieterc]
alex: (tdt is a RESTful data adapter project in PHP)
13:10:42 [HadleyBeeman]
… There is a growing set of OData consumers. GUI controls and libraries.
13:10:45 [pieterc]
alex: (it sounds worse than it is)
13:11:07 [alex]
http://msdn.microsoft.com/en-us/library/dd541276.aspx "The remainder of this section defines a custom HTTP MERGE method"
13:11:09 [HadleyBeeman]
… Criticisms of OData: Service definitions tend to be siloed. Links don't tend to go outside the data service. Don't use any shared ontologies.
13:11:49 [HadleyBeeman]
… Another slight criticism: because of its history as being pushed by Microsoft, it's seen as being vendor specific. Not true; standardisation now under OASIS, other contributors
13:12:18 [stressindikator]
stressindikator has joined #odw
13:12:27 [HadleyBeeman]
… Why do developers use it? We like the features and the flexibility of RDF/SPARQL. We were disappointed with the Linked Data Platform proposals and the flexibility it would give.
13:12:50 [HadleyBeeman]
… We wanted it to be a declarative configuration only, ultimately to do that config automatically.
13:13:11 [HadleyBeeman]
… Previous attempt: LINQ - to - SPARQL, hand crafted as C#
13:13:20 [mig_garcia]
mig_garcia has joined #odw
13:13:38 [HadleyBeeman]
… This implementation: Proxy service for a SPARQL endpoint. http://github.com/brightstardb/odata-sparql
13:13:39 [lechatpito]
lechatpito has joined #odw
13:13:44 [AndyS]
AndyS has joined #odw
13:13:56 [pieterc]
pieterc has joined #odw
13:14:34 [HadleyBeeman]
… Key part of this: the annotations. They're in the OData spec. Defined for: URI namespace for entity primary keys, URIs for entity types, properties and directionality of links
13:14:35 [naomi]
naomi has joined #odw
13:14:46 [bhyland]
bhyland has joined #odw
13:15:03 [HadleyBeeman]
… Annotations are visible to the consumer, mappings done against the SPARQL endpoint are visible
13:15:30 [HadleyBeeman]
… Allows you to reconstruct the source triples you've just queried, if you'd ever want to.
13:16:24 [HadleyBeeman]
… Implementation issues: Our naive approach: if you ask for an entity, a DESCRIBE will give you what you want. It was too unspecified, so you have to use CONSTRUCT, which led to sorting and identification issues.
13:16:31 [roger]
roger has joined #odw
13:17:03 [HadleyBeeman]
… OData allows the server to do paging. If there's been a server-side limit imposed, you don't know that.
13:17:43 [HadleyBeeman]
… Biggest implementation issue: because we're turning primary keys into URI identifiers, every entity in the entity set has to have the same base URI. Not a problem in most cases, but potentially.
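A minimal sketch of the key-to-URI constraint just described, under the assumption that an entity set has one configured base URI. The base URI and function names are hypothetical, not taken from the odata-sparql project.

```python
from urllib.parse import quote, unquote

# Hypothetical base URI shared by every entity in one entity set.
BASE_URI = "http://example.org/film/"


def key_to_uri(key: str, base: str = BASE_URI) -> str:
    """Turn an OData-style primary key into a resource URI."""
    return base + quote(key, safe="")


def uri_to_key(uri: str, base: str = BASE_URI) -> str:
    """Recover the primary key; fails for resources outside the base URI."""
    if not uri.startswith(base):
        raise ValueError(f"{uri!r} is not under base {base!r}")
    return unquote(uri[len(base):])


if __name__ == "__main__":
    uri = key_to_uri("blade runner")
    print(uri)
    print(uri_to_key(uri))
```

The `ValueError` branch is where the limitation shows up: a resource minted under a different namespace has no key in this entity set.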
13:17:58 [HadleyBeeman]
… [Example query to select a simple film]
13:18:07 [pieterc`]
pieterc` has joined #odw
13:18:10 [HadleyBeeman]
… [Example query to enumerate films]
13:18:49 [HadleyBeeman]
… [example query to show property navigation]
13:19:09 [jpcs1]
jpcs1 has joined #odw
13:19:49 [HadleyBeeman]
… That's all leading up to a bunch of questions. First, and the one I'm most interested in discussing here: how important does the group see standards as being for interoperability? Do standards need to interoperate? Do different standards bodies' standards need to interoperate? Whose responsibility is it?
13:20:15 [francois]
francois has joined #odw
13:20:49 [AndreaP]
AndreaP has joined #odw
13:20:51 [HadleyBeeman]
… More questions: what could the W3C LDP WG learn from OData and vice versa. OData changed in response to feedback/requirements. Now on third iteration… Should these requirements and use cases be shared between groups?
13:22:01 [HadleyBeeman]
… Finally, is there a shared meta-model for entity-oriented view of data resources between the two?
13:22:14 [HadleyBeeman]
LeighDodds: Do you have a sense of uptake?
13:22:36 [JeniT]
(uptake of OData)
13:22:48 [HadleyBeeman]
Kal: hard to tell because search discovery of OData endpoints is hard. Probably more endpoints are invisible to the Web than visible.
13:23:09 [bschloss]
[I think the SAP ERP platform, recent version, has APIs to get information as ODATA]
13:23:41 [pieterc`]
pieterc` has joined #odw
13:23:46 [HadleyBeeman]
ivan: There have been several attempts to get these groups together. For all kinds of personal reasons, it did not work out. There is a community group at W3C on OData vs RDF; the group is silent, empty.
13:24:05 [HadleyBeeman]
Kal: It shouldn't be "OData vs RDF". They should be coexistent and work together.
13:24:42 [yaso]
yaso has joined #odw
13:24:47 [bhyland]
My question is (and I'm not being snarky or flip), why OData? Isn't this MS trying to redo RDF? RDF has matured and is well-documented. It is not perfect & use is far from ubiquitous however, why fragment?
13:24:58 [HadleyBeeman]
subtopic: Neil Benn, Fujitsu. LOD approach to engineering health-sensory datasets
13:25:49 [HadleyBeeman]
Neil: I'll focus more specifically on health and health sensor data. I've recently joined this group, and this is one of the projects we're working on.
13:26:37 [HadleyBeeman]
… We're working on a cloud platform for large-scale graph storage. Public and private data. That seems to be a tension that is coming across throughout today. Therefore, Linked (Open|Closed) (Big) Data
13:26:54 [bschloss]
Mentions Linked (Open|Closed) (Big) Data and mentions Fujitsu and DERI Collaboration on Linked Data Global Repository
13:27:03 [HadleyBeeman]
… We've been working with DERI on a CKAN-like Linked Data Global Repository. Faster and more searchable.
13:27:30 [HadleyBeeman]
… We're also involved in the W3C LDP WG
13:28:43 [HadleyBeeman]
… With the University of Singapore, we've been working on health care sensors. Temperature monitor, heart rate monitor, establish patient history. Challenge: how to combine sensor data with patient specific data from their health record, which might be different to medical best practice, clinical recommendations, etc?
13:29:16 [HadleyBeeman]
… We're making this sensor data linkable - 10m triples per person per week, for example - standardise, and link to data about effective drugs.
13:29:45 [HadleyBeeman]
… Announced in Nov, just working out how to do this. Open, closed and anonymisable data involved.
13:30:51 [HadleyBeeman]
… We are handling temporal data and binary data. Do we want to convert binary sensory data, with an established community of tools, into RDF? Maybe not. If not, how to work with the binary and the (other) linked data?
13:31:05 [HadleyBeeman]
… These things keep me… well, not awake at night, but certainly busy during the coffee break.
13:31:19 [floppy]
floppy has joined #odw
13:32:03 [masao]
masao has joined #odw
13:32:10 [HadleyBeeman]
… Non-technical challenges: main motivator for this paper: most open health data is on hospital numbers, costs of services, etc. But these are questions for policy makers; not as much emphasis on medical research.
13:32:42 [yaso_]
yaso_ has joined #odw
13:32:57 [HadleyBeeman]
… Found data on ECG and HBR stuff… but not as much emphasis on having a "broad church" of open medical health care data to generate further epidemiological and clinical research.
13:33:30 [HadleyBeeman]
… Generating these datasets is labour-intensive. One researcher said teams of researchers working on a dataset would be useful… How to do on the Web?
13:33:35 [floppy1]
floppy1 has joined #odw
13:33:57 [HadleyBeeman]
… Could be that we have more administrative hospital data than clinical data because it's easier to lobby governments than universities and researchers?
13:34:50 [HadleyBeeman]
… There still isn't much best practice on this. Vocabularies, dataset engineering patterns. We have patterns for building modular software… is there an equivalent here?
13:35:19 [HadleyBeeman]
… Ex: There is an ECG ontology I came across… should I use it?
13:35:44 [HadleyBeeman]
Questions
13:36:09 [markbirbeck]
markbirbeck has joined #odw
13:36:14 [HadleyBeeman]
BillR: You should look at Linked Data Patterns, LeighDodds is one of the authors
13:36:47 [HadleyBeeman]
Discussion with panel, including Albert Meroño-Peñuela
13:37:43 [CaptSolo]
http://patterns.dataincubator.org/book/
13:37:46 [HadleyBeeman]
Albert: We work with historical censuses, encoded in thousands of .xls spreadsheets. We would like to uniformly query them, but they are extremely messy. We'd like to transform them into RDF Data Cube and other vocabularies using SPARQL queries?
13:38:44 [HadleyBeeman]
Question: Bob Schloss: The value we seem to be talking about is mashups between datasets with unexpected results. Mapping was one of the first join points. What other join points do you see and do you agree this is critical?
13:38:51 [BartvanLeeuwen]
BartvanLeeuwen has joined #odw
13:39:15 [markbirbeck1]
markbirbeck1 has joined #odw
13:39:55 [HadleyBeeman]
Kal: Yes, I agree. Increasingly, I see a lot of time-series value type data, sets combined in a way to expose latent knowledge. Biggest problem is vocabulary interoperability. OData doesn't have shared vocabularies, so we can't do conceptual joins with data tagged with different systems.
13:40:07 [rszeno]
rszeno has joined #odw
13:40:18 [HadleyBeeman]
Bob: Let's reuse the requirements gathered from XBRL in the financial industry. They do have publicly listed businesses.
13:41:21 [HadleyBeeman]
Neil: Open data is administrative, government-driven. People want to answer local questions, so that has driven a lot of the applications. But in that healthcare example, it's not geographically-specific. New disease patterns may not be tied to parts of a city.
13:41:43 [lottebelice]
lottebelice has joined #odw
13:42:26 [HadleyBeeman]
… With regard to the vocabularies question… I don't want to learn about all the vocabs out there. In the same way I can modularly take a bit of a software library to see what's in it, I'd like to do the same with a vocabulary. I want to conceptualise my data first, and modularly pick a vocabulary.
13:42:47 [HadleyBeeman]
Kal: The individual is an interesting join-point. For governments and otherwise.
13:42:59 [rszeno]
rszeno has left #odw
13:43:22 [roger]
roger has joined #odw
13:43:24 [HadleyBeeman]
Albert: In some domains, historical data is so badly degraded… and it may not have been intended to be comparable.
13:43:58 [rtroncy]
rtroncy has joined #odw
13:44:29 [HadleyBeeman]
TomHeath: Re data engineering patterns: we do need to go further than Leigh's book. Hack-y stuff (download, GREP, etc), ad-hoc processes. Things going on in the Hadoop community to describe these processes
13:44:59 [HadleyBeeman]
Neil: The term dataset engineering patterns… [coining a new phrase]
13:45:01 [markbirbeck]
markbirbeck has joined #odw
13:45:47 [HadleyBeeman]
Michael (from the EC): to Neil: re the link between closed/sensitive/open data… Are you looking at aggregated personal data that then can be opened? As in other areas of sensitive public data
13:46:48 [HadleyBeeman]
Neil: we don't quite have a generic process for anonymising sensitive data. Some organisations do that… I'm just in the early stages of learning the issues around that.
13:47:33 [HadleyBeeman]
questionasker?: concerned about applying the label of "open data" to data that's locked behind a query API. Do you share my concerns?
13:48:33 [HadleyBeeman]
Kal: OData entity set that conforms to the standard is enumerable… It's an ATOM feed with Next links in it. You can download it. Also, a data dump isn't any better — you're relying on the server's capacity to provide the data and the data being up to date.
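A minimal sketch of enumerating an Atom feed by following its "next" links, as described above. The feed documents are inlined and the `fetch` callable is a hypothetical stand-in for an HTTP GET; a real OData client would also resolve relative links and read the entry content, which is omitted here.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator

ATOM = "{http://www.w3.org/2005/Atom}"

# Two hypothetical feed pages, keyed by the URL they would be served from.
PAGES = {
    "page1": (
        '<feed xmlns="http://www.w3.org/2005/Atom">'
        "<entry><id>a</id></entry><entry><id>b</id></entry>"
        '<link rel="next" href="page2"/></feed>'
    ),
    "page2": (
        '<feed xmlns="http://www.w3.org/2005/Atom">'
        "<entry><id>c</id></entry></feed>"
    ),
}


def iter_entry_ids(first_url: str, fetch: Callable[[str], str]) -> Iterator[str]:
    """Yield every entry id, following rel="next" links until none is left."""
    url, seen = first_url, set()
    while url and url not in seen:  # the seen set guards against link loops
        seen.add(url)
        feed = ET.fromstring(fetch(url))
        for entry in feed.findall(ATOM + "entry"):
            yield entry.findtext(ATOM + "id")
        url = next(
            (link.get("href") for link in feed.findall(ATOM + "link")
             if link.get("rel") == "next"),
            None,
        )


if __name__ == "__main__":
    print(list(iter_entry_ids("page1", PAGES.__getitem__)))
```

The client never needs to know the server's page size; it simply keeps following links until a page carries no "next" link.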
13:48:44 [HadleyBeeman]
… I can see your point but I think it applies to all open data.
13:49:04 [HadleyBeeman]
questionasker?: If I were going to mortgage my house to fund a startup on this data, I would see this as a problem.
13:49:21 [HadleyBeeman]
Kal: Of course, there are different applications.
13:49:27 [HadleyBeeman]
[Closing session]
13:49:55 [rtroncy]
Topic: Lightning Talks with a geospatial theme
13:50:03 [rtroncy]
scribenick: rtroncy
13:50:11 [rtroncy]
scribe: rtroncy
13:50:24 [rtroncy]
Chair:Alex Coley
13:50:51 [jpcs1]
jpcs1 has joined #odw
13:51:39 [rtroncy]
Alex introducing the session, composed of three talks
13:52:35 [rtroncy]
Jon Jay Le Grange - GeoKnow: Leveraging Geospatial Data in the Web of Data
13:52:46 [rtroncy]
[http://www.w3.org/2013/04/odw/odw13_submission_15.pdf paper]
13:53:20 [rtroncy]
EU Project GeoKnow: http://geoknow.eu/Welcome.html
13:53:51 [markbirbeck1]
markbirbeck1 has joined #odw
13:53:53 [rtroncy]
... inspired by earlier work on transforming OSM into Linked Data
13:54:03 [rtroncy]
s/OSM/Open Street Map
13:54:20 [rtroncy]
... 3 major sources of open geospatial data
13:54:50 [yaso_]
yaso_ has joined #odw
13:54:50 [hideaki]
hideaki has joined #odw
13:55:05 [rtroncy]
... spatial data infrastructures (compatible with almost all GIS), open data catalogue (SHP, KML files), crowdsourced geospatial data
13:55:08 [ldodds]
ldodds has joined #odw
13:55:28 [rtroncy]
... ontologies: basic geo vocabulary, GeoOWL ... and GeoSPARQL
13:56:26 [rtroncy]
... efficient GeoSPARQL RDF querying, fusion and aggregation of geospatial RDF data, visualization and authoring, public-private geospatial data (sync workflows)
13:56:47 [rtroncy]
... aim to provide a suite of GeoKnow Generator tools
13:57:10 [rtroncy]
... two use case scenarios: e-commerce and supply chain
13:57:22 [rtroncy]
... the GeoKnow generator is expected by December 2013
13:57:49 [rtroncy]
RRSAgent: draft minutes
13:57:49 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html rtroncy
13:59:19 [rtroncy]
... see also: http://blog.geoknow.eu/
14:00:48 [rtroncy]
Michael Lutz - Interoperability of (open) geospatial data – INSPIRE and beyond
14:00:57 [rtroncy]
[http://www.w3.org/2013/04/odw/odw13_submission_58.pdf paper]
14:01:58 [rtroncy]
Michael: INSPIRE in a nutshell
14:02:26 [rtroncy]
... legal framework for establishing an infrastructure for spatial information in Europe
14:02:33 [rtroncy]
... 34 spatial themes
14:02:57 [rtroncy]
... implementation 2009-2020
14:03:32 [rtroncy]
... there is a growing interest in creating innovative products and services based on INSPIRE and other data
14:05:07 [rtroncy]
... we realize that with INSPIRE we cover a lot of topics of this workshop
14:05:33 [rtroncy]
... key issues with INSPIRE: enriching INSPIRE data models with application specific business data
14:06:27 [rtroncy]
... example: urban planning, waste management plans, environmental impact assessment, risk management on top of geo data
14:07:43 [rtroncy]
... beyond INSPIRE, traditionally linked with GIS formats and XML ... how do we move towards RDF
14:08:00 [rtroncy]
... how to create and manage persistent identifiers
14:08:30 [rtroncy]
... implications of opening up data for the organisations: governance, long term commitments, etc.
14:08:36 [Albert]
Albert has joined #odw
14:09:07 [rtroncy]
... how to address those issues? ISA = Interoperability Solutions for European Public Administrations program
14:09:43 [rtroncy]
... see also: ARe3NA (INSPIRE reference platform), EULF (EU Location Framework)
14:09:58 [rtroncy]
... W3C LOCADD community group
14:10:31 [rtroncy]
... advertisement for the INSPIRE conference in Florence 23-27 June 2013
14:11:57 [rtroncy]
... ISA program http://ec.europa.eu/isa/
14:14:16 [rtroncy]
Mark Herringer - Open Data on the Web and how to publish it within the context of Primary health care
14:14:24 [rtroncy]
[http://www.w3.org/2013/04/odw/odw13_submission_51.pdf paper]
14:14:31 [rtroncy]
Panel opened
14:15:06 [bhyland]
bhyland has joined #odw
14:15:25 [rtroncy]
unknown: question about identifiers, can we expect a better framework, e.g. URI in INSPIRE ?
14:16:00 [rtroncy]
Michael: in INSPIRE, there are 2 types of identifiers
14:16:13 [rtroncy]
... for data objects and for real-world things
14:16:47 [StevenPemberton]
StevenPemberton has joined #odw
14:17:17 [rtroncy]
... we recently relaxed the rules on how to write those identifiers and enabled HTTP identifiers
14:17:30 [PhilA]
Thank you Michael Lutz on URIs
14:17:36 [st]
st has joined #odw
14:17:59 [markbirbeck]
markbirbeck has joined #odw
14:18:15 [roger_]
roger_ has joined #odw
14:19:09 [rtroncy]
Raphael: there are a number of initiatives that try to take part of UML diagrams of INSPIRE and build RDF schema, see e.g. efforts from Laurent Lefort and others
14:19:41 [rtroncy]
... are there plans to have an official schema in RDF for INSPIRE ?
14:20:02 [rtroncy]
Michael: yes, we will organize a workshop where everybody presents their modelling ... and we wish to arrive at an agreed-upon model
14:20:28 [rtroncy]
RRSAgent: generate minutes
14:20:28 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html rtroncy
14:21:09 [HadleyBeeman]
HadleyBeeman has joined #odw
14:22:01 [naomi]
naomi has joined #odw
14:29:14 [cjg]
cjg has joined #odw
14:30:05 [laurent_au]
laurent_au has left #odw
14:32:19 [cjg]
cjg has joined #odw
14:39:42 [johnlsheridan]
johnlsheridan has joined #odw
14:39:57 [rjw]
rjw has joined #odw
14:40:26 [fumi]
fumi has joined #odw
14:40:52 [HadleyBeeman]
HadleyBeeman has joined #odw
14:40:55 [yoshiaki]
yoshiaki has joined #odw
14:40:59 [cjg]
cjg has joined #odw
14:40:59 [JeniT]
JeniT has joined #odw
14:41:00 [StevenPemberton]
StevenPemberton has joined #odw
14:41:07 [jpcs1]
jpcs1 has joined #odw
14:41:23 [yaso]
yaso has joined #odw
14:42:02 [StevenPemberton]
rrsagent, here?
14:42:02 [RRSAgent]
See http://www.w3.org/2013/04/23-odw-irc#T14-42-02
14:42:14 [yoshiaki_]
yoshiaki_ has joined #odw
14:42:14 [yaso]
Lotte Belice about Open Culture Data
14:42:27 [JeniT]
Scribe: yaso
14:44:57 [AndreaP]
AndreaP has joined #odw
14:44:58 [stressindikator]
stressindikator has joined #odw
14:45:38 [stressindikator]
stressindikator has left #odw
14:47:17 [MLutz]
MLutz has joined #odw
14:47:39 [bhyland]
bhyland has joined #odw
14:48:25 [naomi]
naomi has joined #odw
14:48:35 [albertm]
albertm has joined #odw
14:49:09 [yaso]
yaso has joined #odw
14:50:10 [bhyland]
bhyland has joined #odw
14:51:55 [HadleyBeeman]
scribenick: hadleybeeman
14:52:06 [HadleyBeeman]
Topic: Panel: The Business of Open Data
14:52:33 [HadleyBeeman]
Johnlsheridan: It's 2020 and we've seen the failure of the world's first multibillion dollar open data corporation. How did this happen?
14:52:36 [yaso]
Yes, I'm having connection problems
14:53:08 [HadleyBeeman]
Conor Riffle: We've been looking at lots of business models. Sponsorship would be hard to scale to that level.
14:53:15 [yaso_]
yaso_ has joined #odw
14:53:31 [HadleyBeeman]
… Also look at people like Google who make tons of apps and sell ads on that.
14:54:01 [HadleyBeeman]
JohnLsheridan: which of the eight business models Michele has identified could scale to that level?
14:54:50 [yaso__]
yaso__ has joined #odw
14:55:07 [HadleyBeeman]
Miguel: Usually, all the four actors are able to manage a huge amount of data. We have some enablers - usually they are scalable - but they do not serve end users. They're in a wholesale position in the value chain. Examples: Microsoft, Socrata.
14:55:27 [HadleyBeeman]
… Many of them have other business lines, even outside the boundary of public sector information.
14:56:31 [HadleyBeeman]
Irina: I think you'd want lots and lots of smaller companies, not one big one. As small music app companies are threatening the big distributors, a big company doesn't fit.
14:57:10 [yaso]
yaso has joined #odw
14:57:16 [HadleyBeeman]
Bart: The Fire Department wants to be the authoritative source of information. They won't make a business out of it, but they will engage to have usable data.
14:57:50 [HadleyBeeman]
Michele: Risk to opening up data… fear of losing control. But benefit: they will be seen as the authoritative source. We see both.
14:58:19 [HadleyBeeman]
Lotte: open data can bring big benefits to companies.
14:58:45 [yaso_]
yaso_ has joined #odw
14:59:39 [HadleyBeeman]
questionasker?: Do we all agree that we should build public infrastructure, basic datasets to build business models on top of… If we don't do it fast, a big multi-billion company maybe wants to become a public infrastructure provider? Or the market will collapse and transform in another way. We, as a community, need to identify the basic datasets which will be the "streets" of open data.
14:59:59 [HadleyBeeman]
JohnLsheridan: What are the basic datasets of interest for fire services?
15:00:21 [HadleyBeeman]
Bart: Address data. Real streets. We don't have "highways" for open data yet; we have "rural roads."
15:00:49 [HadleyBeeman]
… Large companies taking over scares the Fire departments as well. "What if a company over in America is holding our data?" An important discussion to have.
15:01:05 [HadleyBeeman]
Johnlsheridan: Do you see CDP becoming that sort of infrastructure provider?
15:02:02 [HadleyBeeman]
Conor: I think we are. Especially where companies are contributing pollutants to the atmosphere, it impacts all of us. But we see it's useful where people can make money out of it. Investors will use it. But there's more to do with it. We need a hybrid model: some monetisable, some open.
15:03:11 [HadleyBeeman]
Bernadette: I'd recast the question: It would give me great joy if, next year, there are 20 companies 10-100 people with $2-20m in gross revenues who are using this technology to share information, for-profits (not grant-funded). We don't need yet another social network or cow-tipping site.
15:03:47 [HadleyBeeman]
… If they are venture-funded, it would be with a social enterprise angle.
15:04:31 [HadleyBeeman]
Chris Metcalf: In the US, I feel like we're seeing the steam come out of pure open data. We need to show the benefits, which are often business. We work with small businesses to do that. We need to focus on that in the community.
15:05:44 [HadleyBeeman]
Bob: Infrastructure isn't always provided by regulators, grant makers and hackers/coders. It's sometimes created by lawyers and judges. I think some orgs and agencies are hesitating to publish open data because they're afraid of inaccurate records and resulting harm and subsequent lawsuits. We may need some case law to determine this.
15:05:51 [bhyland]
bhyland has joined #odw
15:05:53 [HadleyBeeman]
… To Conor: because your data can impact stock price, do you have T&Cs to cover that?
15:06:48 [yaso]
yaso has joined #odw
15:07:02 [HadleyBeeman]
Conor: We do have cleverly-written T&Cs. Many, many companies agree to them. Other orgs can learn from our lessons: we don't own the data submitted to us.
15:07:44 [HadleyBeeman]
… To Chris: Yes, we need to create value from things built on public data, but also as a provider: how can we increase the value all along the chain?
15:08:44 [HadleyBeeman]
Michele: What we see: one of the benefits is people correcting data and pushing it back to the publisher. Enhancing it, geotagging, improving our metadata.
15:09:32 [HadleyBeeman]
… There was a company who wanted to make money out of the data, and we want them to succeed. But this is a public sector answer, I realise.
15:09:44 [DeirdreLee]
DeirdreLee has joined #odw
15:10:16 [HadleyBeeman]
Lotte: Do not forget SMEs like ours: manufacturers, consulting services, pharmacies… they are the ones who will recreate the value in the data.
15:10:21 [albertm`]
albertm` has joined #odw
15:10:30 [HadleyBeeman]
… New standards, new protocols, new releases, new things.
15:11:15 [HadleyBeeman]
questionasker?: This isn't a level playing field. In the development of the Web, it's a case of survival of the fittest, driven by quality, quantity and cost.
15:11:55 [HadleyBeeman]
… Chances are high that whoever that company is in the future, they are here today. I'm hearing that open data should be a communal type where everyone has a chance. Those at the front will probably stay there; this is a call to them to maintain the lead.
15:12:10 [AndreaP]
AndreaP has joined #odw
15:12:31 [HadleyBeeman]
s/questionasker/phil tetlow
15:12:33 [AndyS]
AndyS has joined #odw
15:12:52 [HadleyBeeman]
questionasker?: Can we learn from the open source business models?
15:13:02 [JeniT]
s/questionasker/Thijs/
15:13:11 [HadleyBeeman]
Miguel: Yes, one of our models is called "open source like".
15:13:22 [JeniT]
s/Miguel/Michele/
15:14:33 [HadleyBeeman]
… where reusers do not pay. As with Open Corporates, Licenses allowing non-commercial reuse.
15:14:56 [HadleyBeeman]
Conor: Ask: How did the open source software people monitise it? A lot of them got burned.
15:15:19 [HadleyBeeman]
Thijs: Training, consultancy,
15:15:56 [HadleyBeeman]
Bart: In the Netherlands, the interesting datasets are often 3GB downloads. They will pay someone to maintain it in a usable form for them. That's the added value.
15:15:58 [StevenPemberton]
rrsagent, make minutes
15:15:58 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
15:16:22 [bhyland]
Bart: Services model similar to what RedHat does — good packaging and great support for enterprises.
15:17:11 [StevenPemberton]
s/tetlow?:/tetlow:
15:17:24 [StevenPemberton]
s/Thijs?:/Thijs:/
15:17:48 [HadleyBeeman]
Irina: CKAN is both open source and open data. How do you make it sustainable for businesses who publish data? Isn't that only an issue for businesses who only sell data? If it's a by-product of something else, it may drive more traffic
15:18:52 [StevenPemberton]
s/monitise/monetise/
15:18:57 [HadleyBeeman]
John: final thoughts
15:18:58 [JeniT_]
JeniT_ has joined #odw
15:19:40 [HadleyBeeman]
Lotte: We're seeing a shift from the fear of publishing to the network of data and content. Besides data, I look forward to opening more videos and content.
15:20:17 [atlets]
atlets has joined #odw
15:20:51 [HadleyBeeman]
Michele: The first enabler is the government itself. Gov has to build the governmental infrastructure. Inspiring motto from Federal CIO of USA: Everything should be an API.
15:21:26 [ci]
ci has joined #odw
15:21:32 [HadleyBeeman]
… 1st step: publish open data, 2nd step: bring gov into the business model.
15:22:26 [albertm`]
albertm` has joined #odw
15:22:28 [HadleyBeeman]
… data reuse. A shared data model across agencies.
15:22:31 [yaso]
yaso has joined #odw
15:22:51 [HadleyBeeman]
Miguel: SMEs need data to create value and generate new business lines.
15:23:31 [HadleyBeeman]
Bart: Fire fighting data work is 20% technology and 80% people and politics. I'd like to see this reversed.
15:23:52 [HadleyBeeman]
Conor: We need to get the business model right both for the providers and users.
15:23:56 [HadleyBeeman]
[Session ends]
15:24:01 [HadleyBeeman]
rrsagent, draft minutes
15:24:01 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html HadleyBeeman
15:25:30 [StevenPemberton]
Scribe: Deirdre Lee
15:25:39 [StevenPemberton]
scribenick: DeirdreLee
15:26:17 [markbirbeck]
markbirbeck has joined #odw
15:26:33 [HadleyBeeman]
Scribenick: hadleybeeman
15:26:46 [HadleyBeeman]
Topic: The Exhibitionists
15:26:50 [HadleyBeeman]
Chair: Julian Tate
15:26:54 [StevenPemberton]
Topic: Opening up the BBC's data to the Web, Sofia Angeletou
15:27:01 [masao]
masao has joined #odw
15:28:37 [ivan]
scribe: ivan
15:28:52 [ivan]
Opening up the BBC's data to the Web, Olivier Thereaux, Sofia Angeletou, Jeremy Tarling and Michael Smethurst
15:29:13 [ivan]
http://www.w3.org/2013/04/odw/odw13_submission_22.pdf
15:29:36 [ivan]
Sofia: The problem with the older approaches was that the material was not ours:
15:29:48 [ivan]
… we have only certain freedom to use it for some purposes
15:30:24 [ivan]
… another thing we were doing is to use MusicBrainz for the music website
15:30:24 [ivan]
… we do the same thing for the weather website
15:30:35 [ivan]
… we reuse a lot from open datasets
15:30:43 [ivan]
… also from Wikipedia for nature and wildlife
15:30:50 [ivan]
… we reuse the Wikipedia IDs
15:31:02 [ivan]
… because the URIs are not static, the service breaks
15:31:07 [ivan]
… this is a big deal for the BBC
15:31:19 [ivan]
… we cannot blindly rely on datasets and we need editorial control
15:31:28 [ivan]
… these were the first efforts with using LOD
15:31:43 [ivan]
… all of these experiences convinced BBC to invest more into the SW stuff
15:31:48 [DeirdreLee]
DeirdreLee has joined #odw
15:31:52 [ivan]
… e.g. for the Olympic web site
15:32:08 [ivan]
Sofia: the sport web site has about 4 million users a day
15:32:25 [pieterc]
pieterc has joined #odw
15:33:04 [ivan]
scribe: DeirdreLee
15:33:19 [DeirdreLee]
Sofia: next step for the BBC is to roll out the approach beyond sport
15:33:45 [DeirdreLee]
... currently working on linking content together on news site
15:33:59 [StevenPemberton]
rrsagent, make minutes
15:33:59 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
15:34:22 [DeirdreLee]
... trial from birmingham an black country will be rolled out nationwide in coming months
15:34:28 [DeirdreLee]
s/an/and
15:35:26 [DeirdreLee]
... will annotate news items with other pieces of related content
15:35:47 [DeirdreLee]
... would like to roll this out with archival content also
15:36:47 [bhyland]
Appreciate Sofia's choice of headline at Google London office, "Google boss defends UK tax record to BBC" with byline "Eric Schmidt defends Google just paying 6M GBP in UK corporation taxes"
15:36:47 [DeirdreLee]
... diagram from presentation shows content from archives, BBC hope to use Linked Data to expose their data in interesting ways
15:37:28 [DeirdreLee]
... BBC have identified some challenges with publishing Linked Data (listed in presentation)
15:38:36 [DeirdreLee]
... what are the drivers for opening up their LD datasets, how to select good quality datasets, and how to measure success
15:39:15 [DeirdreLee]
Alvaro Graves from RPI up next on Democratizing Open Data
15:40:09 [DeirdreLee]
Alvaro: Good news, there are millions of Open Datasets on the Web, billions of triples in the LOD cloud
15:40:22 [jpcs1]
jpcs1 has joined #odw
15:40:34 [DeirdreLee]
... Bad news, there is a lot of inconsistent, noisy data out there
15:40:46 [DeirdreLee]
... but this can be solved with standards, etc
15:41:01 [DeirdreLee]
... other bad news is that many of the datasets out there are boring!
15:41:27 [DeirdreLee]
... for example, stale data
15:41:50 [StevenPemberton]
StevenPemberton has joined #odw
15:42:03 [StevenPemberton]
rrsagent, here?
15:42:03 [RRSAgent]
See http://www.w3.org/2013/04/23-odw-irc#T15-42-03
15:42:29 [DeirdreLee]
... there is also 'unusable' data, which the majority of the general public cannot use
15:42:53 [DeirdreLee]
... how can those without access to technical skills & expertise make use of Open Data?
15:43:20 [DeirdreLee]
.. small-scale communities or journalists?
15:44:36 [DeirdreLee]
... If we look at the Web, in the beginning there was a need for a webmaster to develop web-pages, but then tools like wikis, blogs came along that helped everyone to create web-content
15:44:58 [DeirdreLee]
... this should be possible with Open Data too, to encourage use
15:45:23 [DeirdreLee]
... visualisations are an easy win to get people to make use of Open Data
15:46:28 [DeirdreLee]
... Visualbox, a tool for creating visualisations based on LD, used in workshop
15:47:24 [DeirdreLee]
... feedback was positive, and people learned quickly. however SPARQL was deemed difficult by workshop participants
15:47:44 [DeirdreLee]
... another complaint was about the quality of the data
15:48:10 [DeirdreLee]
... Call to arms: we need better tools - libraries and APIs for geeks are not enough
15:48:50 [DeirdreLee]
... general public usually have different needs. citizens need to be empowered to use Open Data, so they don't need a PhD in Semantic Web to get started!
15:49:00 [DeirdreLee]
... visualisations are a good way to start
15:49:05 [fumi]
fumi has joined #odw
15:49:13 [jpcs1]
jpcs1 has joined #odw
15:49:20 [bhyland]
bhyland has joined #odw
15:49:53 [JeniT]
seems like http://www.tableausoftware.com/products/public is relevant re tools
15:49:56 [DeirdreLee]
subtopic: Andreas Koller from Royal College of Art, talking about Opening Open Data
15:49:59 [cjg]
cjg has joined #odw
15:50:28 [StevenPemberton]
rrsagent, make minutes
15:50:28 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
15:50:30 [DeirdreLee]
Andreas: background in graphic design
15:51:09 [DeirdreLee]
... wants to discuss graphic design and coding, and tools that allow ordinary people to use Open Data
15:51:24 [alex]
JeniT: I once asked them whether they had documentation for an API for whatever software Oxford had bought, and they pointed me back at our own people
15:51:43 [alex]
(they didn't seem to do 'open' at that time)
15:52:20 [alex]
but stuff like http://public.tableausoftware.com/views/Studentstatistics-UniversityofOxford/Yearlysnapshotsummary?:embed=yes&:tabs=yes&:toolbar=yes is cool
15:52:57 [JeniT]
alex: you still have to upload your data to them, I think, to use it, so not for everyone, but in terms of interface it's something to look at
15:53:07 [DeirdreLee]
... designers could help with data ownership and data ethics
15:53:54 [bhyland]
RE: reference to the saying, "Data is the new oil!", see http://blogs.hbr.org/cs/2012/11/data_humans_and_the_new_oil.html
15:54:09 [StevenPemberton]
s;[link here to CG please];-> http://www.w3.org/community/opened/ Open Linked Education Community Group;
15:54:23 [DeirdreLee]
... When teaching students to code, they may have a fear of tools
15:54:27 [bhyland]
Jer Thorp, "Any kind of data reserve that exists has not been lying in wait beneath the surface; data are being created, in vast quantities, every day. Finding value from data is much more a process of cultivation than it is one of extraction or refinement."
15:54:41 [BartvanLeeuwen]
BartvanLeeuwen has joined #odw
15:55:09 [DeirdreLee]
... having libraries for existing designers' tools would enable easy access to Open Data
15:55:28 [DeirdreLee]
... as would low-level examples and list of data catalogues
15:55:51 [DeirdreLee]
... This is an example of how Open Data could be opened up to another community
15:56:22 [DeirdreLee]
... small effort for Open Data practitioners, but would be of great benefit to other communities
15:56:40 [StevenPemberton_]
StevenPemberton_ has joined #odw
15:57:22 [DeirdreLee]
... easy access to Open Data would enable designers (and other communities) to see the value within the data and enable them to use it and extract knowledge from it
15:57:48 [DeirdreLee]
subtopic: Benedikt Groß, Royal College of Art, Large Scale Data & Speculative Maps
15:57:57 [StevenPemberton_]
rrsagent, here?
15:57:57 [RRSAgent]
See http://www.w3.org/2013/04/23-odw-irc#T15-57-57
15:58:35 [DeirdreLee]
Benedikt shows Data Viz Pipeline
15:58:58 [DeirdreLee]
Benedikt: most of what we have been talking about today focuses on the left side of the pipeline
15:59:05 [StevenPemberton_]
rrsagent, make minutes
15:59:05 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton_
15:59:22 [DeirdreLee]
... will show some projects that use Open Data
15:59:54 [bhyland]
The HBR article by Jer Thorp nicely supports the thoughts of the speakers, (I think), "As we proceed towards profit and progress with data, let us encourage artists, novelists, performers and poets to take an active role in the conversation. In doing so we may avoid some of the mistakes that we made with the old oil."
16:02:48 [DeirdreLee]
... Metrography visualises the London tube map with Open Street Map data as a mental map, by mapping actual locations to the tube map using mathematical models
16:03:35 [StevenPemberton_]
He showed the mapping from true life to the tube map, and then reversed the process to make a real map with the same distortions
16:06:09 [DeirdreLee]
... Speculative Sea Level Explorer project combines NASA data on sea level with map visualisations to show the effects of sea levels rising and falling
16:07:10 [bschloss]
bschloss has joined #odw
16:07:53 [DeirdreLee]
... sneak preview to m3ta.js, a visual programming language with metaphor to lego-blocks
16:08:04 [bschloss]
Fascinating to see what Royal College of Art people can do for visualizations. Can less skilled people do something nearly as good? My IBM colleagues are experimenting with a site called Many Eyes 2.0 (beta) at http://www-958.ibm.com/software/analytics/labs/manyeyes/
16:08:16 [DeirdreLee]
suptopic: panel discussion
16:08:28 [HadleyBeeman]
s/suptopic/subtopic
16:09:38 [DeirdreLee]
julian: do you see yourself creating a toolbox for visualising open data?
16:09:46 [albertm`]
albertm` has joined #odw
16:10:24 [DeirdreLee]
benedikt: great to release tools, but you can't just release source-code but need documentation and examples too, which is time-consuming
16:11:02 [st]
st has joined #odw
16:11:28 [DeirdreLee]
Alvaro: you can't just release code/tools/projects, but you are responsible for maintaining it (like kids :) )
16:11:37 [yvesr]
had very good experiences with http://d3js.org/ for data visualisation - very powerful toolkit
16:12:13 [DeirdreLee]
Question from audience
16:12:24 [bhyland]
@Alvaro, Interesting analogy, Open Source is like a marriage, 'it comes back and you have to answer questions… it is also like children, you cannot let them out into the wild [without guidance]' ;-)
16:12:58 [DeirdreLee]
Aivan: if you have to convince CNN in an elevator pitch to use the same approach as the BBC, how would you do it?
16:13:35 [DeirdreLee]
Olivier (BBC, from audience): focus on your own data, and use Open Data where possible to fill the gaps
16:13:59 [DeirdreLee]
TimBL: Who publlishes data about their own products?
16:14:42 [yvesr]
s/Olivier/yvesr :)
16:15:11 [ivan]
s/Aivan/Ivan/
16:15:18 [DeirdreLee]
... if people publich data about their own products, there won't be a need for CNN to publish data
16:15:22 [albertm``]
albertm`` has joined #odw
16:15:59 [StevenPemberton]
s/publich/publish
16:16:24 [bhyland]
I invite everyone to publish information about their organization, project, product and/or service on the Web today using http://dir.w3.org.
16:16:25 [bhyland]
If you care, it is an entirely Linked Data app. If you don't care, just fill out the form, publish the dir.ttl file produced for you automagically (like FOAF-a-Matic) on the public Web and submit it for harvesting.
16:16:33 [StevenPemberton]
s/publlishes/publishes
16:16:40 [DeirdreLee]
sofia: so much in archives, not just about publishing data, but reusing data
16:16:56 [DeirdreLee]
Comment from audience: metadata is advertising for your data
16:17:12 [bhyland]
RE: dir.w3.org, if you want to read an FAQ, see http://dir.w3.org/directory/pages/faq.docbook?view
16:17:53 [DeirdreLee]
Neil Benn (Fujitsu): in 2020, what have the political arguments been to convince governments to publish Open Data?
16:18:53 [DeirdreLee]
Alvaro: it's socially beneficial for everyone, Open Data enables people to solve more problems
16:19:57 [DeirdreLee]
... in Chile, a lot of money is being invested in start-ups and entrepreneur programmes; is it not fair to ask for similar spend on democratising data?
16:20:32 [DeirdreLee]
Benedikt: in the future, there mightn't be an open data debate, it will just be the standard
16:21:39 [DeirdreLee]
Bschloss: TimBL alluded to a key thing, CNN will have to put out metadata on related content
16:22:57 [DeirdreLee]
... uses the example of airlines putting out ticket information because they wanted to be listed
16:23:27 [DeirdreLee]
Andreas: key is that entry level for using Open Data is very low
16:23:46 [cgueret]
cgueret has joined #odw
16:24:37 [bhyland]
bhyland has joined #odw
16:24:37 [DeirdreLee]
bhyland: there is now a community directory online dir.w3.org/
16:24:39 [LarsG]
LarsG has joined #odw
16:24:48 [DeirdreLee]
http://dir.w3.org
16:24:57 [timbl]
timbl has joined #odw
16:25:10 [timbl]
logger, pointer?
16:25:15 [bschloss]
CNN will have to put out metadata or risk losing sales or eyeballs. Let's learn from history where first movers got value (like Airlines that listed their schedules and prices on GDS', then other Airlines followed rapidly to not be at a disadvantage)
16:25:16 [DeirdreLee]
to list Linked Data products, services and projects
16:25:17 [alex]
bhyland: the "Create an entry" link at http://dir.w3.org/directory/pages/faq.docbook?view doesn't work, and there's a missing stylesheet error when one goes where you'd think it should have pointed
16:25:25 [timbl]
RRSAgent, pointer?
16:25:25 [RRSAgent]
See http://www.w3.org/2013/04/23-odw-irc#T16-25-25
16:25:26 [bhyland]
On behalf of the W3C Gov't Linked Data Working Group, I encourage everyone attending this workshop to add their organization to dir.w3.org today or tomorrow.
16:26:00 [alex]
(ah; I'd missed the '?view' off the end of my guessed URL)
16:26:03 [DeirdreLee]
sofia: important to show the value to publishers of opening up data
16:26:05 [StevenPemberton]
http://readwrite.com/2010/06/30/how_best_buy_is_using_the_semantic_web
16:26:11 [bhyland]
It is simple to do, fast and gets more valuable Linked Data on the public web … plus it builds community & helps us all help one another.
16:26:39 [StevenPemberton]
Best Buy reports a 30% increase in page views, and 15% increase in click throughs
16:26:52 [DeirdreLee]
Alvaro: if a major part of the population cannot access the data, the technical discussions are irrelevant. general public needs to be empowered to access and use Open Data
16:27:04 [bhyland]
@alex, what browser are you using? I see it ok on FF & Chrome
16:27:16 [DeirdreLee]
Andreas: agrees, general public should realise Open Data is THEIR data
16:27:51 [rjw]
bhyland: the Create an entry link on http://dir.w3.org/directory/pages/faq.docbook?view fails :-(
16:27:56 [DeirdreLee]
Benedikt: things are looking positive, let's hope to implement even 30% of what we have been discussing here tday
16:27:59 [bhyland]
Ah Alex, I see the problem, try this http://dir.w3.org/directory/pages/create-entry.xhtml?view
16:28:10 [bhyland]
Thanks for pointing out that incorrect link, will fix now.
16:28:33 [ivan]
rrsagent, draft minutes
16:28:33 [RRSAgent]
I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html ivan
16:28:56 [alex]
bhyland: thanks :-)
16:29:06 [jpcs1]
jpcs1 has joined #odw
16:30:10 [DeirdreLee]
s/tday/today
16:31:28 [yoshiaki]
yoshiaki has joined #odw
16:31:39 [bhyland]
Hats off to Phil for keeping us on military time & getting us to the pub on time! Awesome job program chairs, thank you Phil, Jeni, Rufus and DanBri
16:32:20 [yoshiaki]
yoshiaki has joined #odw
16:35:02 [cjg]
cjg has joined #odw
16:37:06 [cjg]
cjg has joined #odw