07:50:05 RRSAgent has joined #odw
07:50:05 logging to http://www.w3.org/2013/04/23-odw-irc
07:50:38 PhilA has changed the topic to: ODW13 Day 1
07:50:48 meeting: Open Data on the Web Day 1
07:50:52 chair: PhilA
07:51:36 jpcs1 has joined #odw
08:19:22 jpcs1 has joined #odw
08:25:33 jpcs1 has joined #odw
08:25:39 ivan has joined #odw
08:25:45 timdavies has joined #odw
08:25:47 daveL has joined #odw
08:25:49 JeniT has joined #odw
08:26:01 Steven has joined #odw
08:26:29 yvesr has joined #odw
08:27:23 markbirbeck has joined #odw
08:27:43 mig_garcia has joined #odw
08:28:01 danbri has joined #odw
08:28:05 floppy has joined #odw
08:28:12 cjg has joined #odw
08:28:18 bschloss has joined #odw
08:29:02 scribe: PhilA
08:29:05 Agenda: http://www.w3.org/2013/04/odw/agenda
08:29:08 scribeNick: PhilA
08:29:27 Topic: John Sheridan - Building our houses on rock
08:29:37 paper http://www.w3.org/2013/04/odw/odw13_submission_25.pdf
08:29:38 laurent_au has joined #odw
08:30:04 John Sheridan from the National Archives begins talking - move from 'ephemeral, temporary' world to a world where there is more confidence in our data
08:30:06 JohnS: Talking philosophically about the need for longevity
08:30:21 ... how do I discover data that I can trust and rely on, use etc.
08:30:25 How to discover open data I can trust and rely on
08:30:46 JohnS: How do we firm up our open data so we can begin to use it and re-use it confidently and well
08:30:58 JohnS: Sustaining our open data. How do we do that.
08:31:15 bhyland has joined #odw
08:31:16 ... our budgets are declining. how do we sustain our publishing activity
08:31:41 ldodds has joined #odw
08:31:48 AndyS has joined #odw
08:31:58 JohnS: Share the responsibility of supporting and curating open data
08:32:11 libby has joined #odw
08:32:17 ... open data community is good at coming up with the rock to build on
08:32:40 yoshiaki has joined #odw
08:32:45 JohnS: I work for a reputable institution.
You'll trust the data if you trust the institution
08:32:51 JohnS: Adds solidity
08:33:10 cerealtom has joined #odw
08:33:19 JohnS: Extreme end is legislation. e.g. INSPIRE regulations that demand certain data norms
08:34:08 JohnS: How can we know if policies like INSPIRE will work? Should we be asking for more of that or going to people like the National Archives and asking them for commitments
08:34:25 JohnS: There's a lot to do to build our data on rock
08:34:26 edsu has joined #odw
08:34:33 lottebelice has joined #odw
08:34:45 JohnS: The ODI certificate may be one of the most important things for the community to work on this year
08:34:49 jpcs1 has joined #odw
08:34:58 http://theodi.github.io/open-data-certificate/
08:34:58 fumi has joined #odw
08:35:04 JohnS: it would be good to discuss here what role things like the ODI certificate can have
08:35:23 JohnS: Talking about the Gazettes (London, Belfast etc.)
08:35:50 JohnS: This is about putting things on the public record, where data is available, provenance and authenticity supported and availability guaranteed
08:35:56 ... service will be completed by September
08:36:04 ... how do we see more services like this come into existence
08:36:15 ... it's about devising tracks
08:36:40 ... the way forward to make all this happen with a solid basis, that we can build on
08:37:05 cgueret has joined #odw
08:37:19 JohnS: No one organisation can do this on its own, we need to act as a community to solidify our efforts
08:37:52 The ODI certificate is really interesting, something to consider for things like http://openglam.org/principles/ and http://www.opencultuurdata.nl/about/
08:37:52 jpcs1 has joined #odw
08:37:53 Topic: Can open data (and big data) be used to improve the operations of development organisations?, Millie Begovic Radojevic
08:38:07 paper http://www.w3.org/2013/04/odw/odw13_submission_3.pdf
08:38:36 Millie: UNDP spends about $5bn a year that generates a lot of data. Have we improved things?
What effect have we had?
08:38:44 ... we also generate procurement data
08:38:56 ... we use that data mostly for accountability purposes
08:39:13 ... we've been wondering what other insights might be accessible from that data
08:39:30 ... can we work out which projects will be most effective
08:39:46 Alexrcoley has joined #Odw
08:39:47 ... what about the companies we pay, who is most effective, who do they employ etc
08:39:52 rrsagent, set log public
08:40:12 rjw has joined #odw
08:40:17 Millie: We started a series of events called Data Dives where we worked with people we don't normally work with
08:40:21 rjw has left #odw
08:40:26 ... data analysts, programmers etc
08:40:35 ... are there problems that we're not asking that we should be asking
08:40:56 Millie: We'll be opening a new challenge prize shortly for the best algorithm
08:40:59 StevenPemberton has joined #odw
08:41:13 stressindikator has joined #odw
08:41:41 Millie: We took data from the World Bank on major contracts in 2007. We were interested in the suppliers and the relationships between those companies
08:42:00 PhilA: As an aside - must introduce Millie to Chris Taggart this evening
08:42:17 Millie: Certain companies tend to win contracts in particular sectors
08:42:50 ... two companies dominate this sub network of projects. What happens to the sub contractors if something goes wrong with the main contractor - few points of failure
08:43:00 ... do certain clusters of companies tend to bid together
08:43:21 ... we see clusters. Are these people really good or is there something else going on?
08:43:34 ... do contracts go to home countries or from the more developed world
08:43:40 rrsagent, here?
08:43:40 See http://www.w3.org/2013/04/23-odw-irc#T08-43-40
08:44:20 Millie: A few hours' work produced these insights
08:44:30 trc has joined #odw
08:44:42 ...
the World Bank folks had the data but not the insights which actually didn't take a huge time to create
08:44:56 This analysis might be interesting (and easy) to apply to http://gtr.rcuk.ac.uk/ ...
08:44:57 richardm has joined #odw
08:45:07 http://www.w3.org/2013/04/odw/odw13_submission_3.pdf is a 404 for me btw
08:45:09 Millie: shows visualisation of projects and performance
08:45:32 edsu, i think the whole paper is in the 'abstract'
08:45:47 PhilA: Thanks edsu - I'll fix that when I'm done scribing
08:45:59 jpcs1 has joined #odw
08:46:08 Millie: It's not big data, it's lots of little data scattered around
08:46:35 danbri: thanks I found http://www.w3.org/2013/04/odw/papers now :)
08:46:39 ... global challenges coming up. We need help, people in orgs who can help open more data sets and help us get more inshights out of that data
08:46:46 danbri: would make interesting reading, although I've not seen any open data on that?
08:47:15 re eu, I think you'd need a temporal view... some partners sorta dominate, then EU notice that and punish them in later rounds
08:47:17 this was the link from the final slide of the talk: http://europeandcis.undp.org/
08:47:32 AndreaP has joined #odw
08:48:17 Topic: Researching the emerging impacts of open data in developing countries (ODDC), Tim Davies
08:48:22 rjw has joined #odw
08:48:37 paper http://www.w3.org/2013/04/odw/odw13_submission_19.pdf
08:49:16 TimD: Poses questions - why people are interested in open data - transparency, innovation, inclusion and empowerment
08:49:30 MLutz has joined #odw
08:49:32 ... the way we do open data can make it easier to realise these differnet aspects
08:49:57 richardm has left #odw
08:50:47 s/differnet/different
08:50:58 TimD: Talking about the launch (tomorrow) of ODDC
08:51:13 ...
Web Foundation and OGP are behind it
08:52:08 bhyland has joined #odw
08:52:13 s/inshights/insights
08:52:18 chrismetcalf has joined #odw
08:52:41 TimD: Slides are expressive and contain the gist of the talk
08:52:56 TimD: Draw out some key points
08:53:12 TimD: As we've seen, supply needs to be built on solid foundations
08:53:36 naomi has joined #odw
08:53:50 TimD: Are we building platforms that rely on always-on, high-capacity systems in rural areas of the developing world
08:54:13 ... are the standards right? We articulate standards but are the right people in the room
08:54:39 ... loads of stahndards being specified - but do they work in all contexts? Does a London-based system work in Kenya?
08:54:51 s/stahn/stan/
08:55:23 TimD: Are the licensing arrangements correct? Are first movers keeping others out?
08:56:08 TimD: We have opendataresearch.org and more - see sldies
08:56:40 Topic: Open Data NEXT: a strategy for social & economic value from Linked Open Data, Hayo Schreijer
08:56:57 paper http://www.w3.org/2013/04/odw/odw13_submission_50.pdf
08:58:16 Hayo: Talking about the Dutch linked data project in NL
08:58:31 Hayo: We started our open data programme 2 years ago
08:58:43 ... want to help government depts open their data
08:58:44 good collection of questions there
08:58:46 ...
08:58:52 what problem are we solving?
08:58:56 ... now 6K data sets from national and local administrations.
08:59:04 why spend money on opening data?
08:59:07 ... some great apps but not really solving real problems
08:59:13 why is nobody using our data?
08:59:25 why don't they build an app like...?
08:59:27 ... what actual problem does it solve? Where are the apps that do clever stuff?
08:59:43 Hayo: we've reached a kind of impasse; governments are losing enthusiasm
08:59:44 s/sldies/slides
08:59:51 Hayo: We need to look at how OD is being used to solve real problems?
09:00:05 Hayo: our approach: focus on real-life problems
09:00:18 Hayo: Purple areas on the map shown are where population is declining; in orange areas it's growing
09:00:29 e.g. disadvantaged and depopulated areas
09:00:42 Hayo: we want to help those people with the real problems, disadvantaged areas etc.
09:01:02 Hayo: trying to bring companies together, working on the problem
09:01:25 ... There's a problem of continuity. data is opened once and not updated
09:01:37 ... produced for one hackathon and then stopped
09:01:46 ... we're tackling that with linked data
09:02:14 ... NL has a lot of open data around legislation, case law etc. Gov not using it, they're buying it from people who put a wrapper around our data and sell it back
09:02:36 Lieke has joined #odw
09:02:39 ... can we reduce the amount of money we spend on getting our own data and maybe we can profit from it ourselves
09:03:32 Hayo: We notice that policy makers often say "I base my policy on law x" - people make comments or annotations - we can use those in linked data and make the data more useful
09:03:42 ... shows nice labelled directed graph
09:04:31 Hayo: We're allowing people to make real links between laws, policies, their text or whatever
09:04:37 ... what marketeers call deep linking
09:05:03 ... we reward people for linking to laws. We contact people and say, OK you link to the law, how about linking to this policy
09:05:16 ... we can notify people that link to a law as it's clearly important to them
09:05:25 ... laws have versions
09:05:38 ... need to be able to point to a law as it was in 2010 etc.
09:06:34 Hayo: System will be available in September - getting government people enthusiastic about using their open data. This is a good example of showing govs how they can use their data
09:06:44 ... of course others can use it too.
09:07:19 Topic: Open Data on the Web: 3 Principles For Maximum Participation, Bob Schloss, IBM
09:07:31 paper http://www.w3.org/2013/04/odw/odw13_submission_54.pdf
09:07:44 slides (already!) http://www.w3.org/2013/04/odw/W3COpenDataBriefingMaximumParticipation2013Apr19.pdf
09:07:51 cjg_ has joined #odw
09:08:41 BobS: We put together what we've put together when considering what we think might be missing
09:08:52 BobS: I think it's great when we get lots of open PSI
09:09:04 ... we need it in educational, arts and business worlds too
09:09:13 ... we need to get a virtuous circle where value is created
09:09:38 amp has joined #odw
09:09:39 ... looking at an Irish linked data front end
09:09:41 HadleyBeeman has joined #odw
09:10:14 BobS: we started in Oct 2011 with 4 Irish authorities (Dublin + 3)
09:10:32 BobS: Looked at the cost/benefits of uploading open data
09:10:53 ... this issue that the people who publish trust that their effort will deliver a return
09:11:05 ... people have to want your data and they want it in their forma
09:11:07 yaso has joined #odw
09:11:10 ... (not yours)
09:11:24 ... you need to be able to state how complete the data is, what time and place it covers etc.
09:11:44 ... whole cluster of ideas
09:11:59 ... you can synthesise this open data with yours and do good stuff
09:12:02 s/forma/format/
09:12:07 BobS: The three principles
09:12:25 BobS: (see slides)
09:12:45 s/slides/ slide 6/
09:12:48 cjg has joined #odw
09:12:54 slide 6
09:12:59 s/slide 6//
09:14:07 BobS: Slide 7 for the second principle
09:14:25 ... talking about things like showing logos for limited time, potentially contacting data users
09:14:48 ... need to be able to log if there's a new version of the data
09:16:26 disturbed a bit about the additional limitations bschloss is suggesting for "open" data
09:16:43 seems to be stretching what "open" means beyond the usual definitions
09:16:45 Bingo!
09:16:46 +1
09:16:59 "What if terrorists use our data" is on my bingo card: http://is.gd/gXDEaG
09:18:35 (but to be fair, hazardous materials is actually a reasonable dataset to keep limited access.)
09:19:32 Topic: Q&A session
09:19:32 Except if you want to see if there is hazardous material stored near your school. #west
09:19:45 yaso_ has joined #odw
09:19:58 cjg: puts a new spin on the JISC's ‘The coolest thing to do with your data will be thought of by someone else.’
09:20:18 JohnS: We make institutional commitments
09:20:36 Hayo: Our governments trust third parties more than our open data
09:20:39 markbirbeck has joined #odw
09:20:42 takumi has joined #odw
09:20:43 ... we're trying to educate them
09:21:09 TimD: We're trying to talk about purposes and use of data more than you need to publish in a given format etc.
09:21:42 Millie: This is a room full of evangelists, the shift in thinking needed is enormous, don't underestimate that
09:22:01 masao has joined #odw
09:22:27 TomHeath: I like John's quotes. I don't like "if you agree with me you're wise if not you're a fool"
09:22:40 TomHeath: How do we convince others of the wisdom
09:23:15 BobS: What we're doing in Dublin - we capture the identity of the app, program and org that downloads everything and there's an offline process for assessing the value of that
09:23:30 ... then go back to the data publisher and tell them what's going on, what people are doing with your data
09:23:50 cjg_ has joined #odw
09:24:28 aside: best way to convince people is to show them the utility of it, not appealing to their better (wiser) nature imho
09:24:46 pascalRomain has joined #odw
09:25:17 cjg_ has joined #odw
09:26:34 &coffee;
09:26:56 edsu: I swear that we've had people suggest that if terrorists got access to the live bus times they could use it… there's a wear and tear on my desk from banging my head on it.
09:27:34 PhilA has joined #odw
09:27:59 PhilA: grrr dropped off IRC, sorry, missed some comments and questions
09:28:02 yeah, I've got a talk at IWMW this year about how open data can get better value for money -- seems a good way to think about it in these tightened times.
09:28:28 BobS: IBM has been looking at specific cities. We don't push up hill - we find the people that want to do open data
09:28:38 BobS: We also need to find the person in the street
09:28:51 ... we don't have 'how open data can improve your life' days
09:29:04 Hayo: Yes, talk about problems, not open data
09:29:25 TimD: Yes, we want data you can build upon in gov and society
09:29:41 TimD: Lots of great examples from places like Sao Paulo
09:29:59 ... talking about accountability and capacity not open data
09:30:08 danbri_ has joined #odw
09:30:08 We have a policy of always putting a front end on our open data; even if it's as simple as a basic HTML page. 99% of the users are just using that and not the underlying data, but that's OK.
09:30:14 ... so the new research project will include lots of case studies from Brazil.
09:30:37 BobS: In Africa, the knowledge of prices for their farming goods is transforming farming
09:30:50 cjg: :-D
09:30:59 BobS: So we've been working on projects for people who can't read - working on spoken web in India
09:31:13 rrsagent, make minutes
09:31:13 I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
09:31:27 Millie: In the Balkans we have an issue of forest fires and consequent air quality
09:31:35 ... I want to know if my child can go out on the street
09:31:42 ... we have kids building air quality monitors
09:31:51 ...
we move to solutions too quickly
09:32:33 cjg: it's hard to develop all the apps/visualizations people want; giving them the data and empowering them to do it seems like a no brainer -- except to people who don't want new interesting visualizations of their data :)
09:32:45 JeniT: For Bob - you spoke about the need for collecting data about people using the data and restricting terrorists' access - that's not the usual definition of open data
09:32:56 BobS: I see a spectrum, not a point
09:33:17 I generally tell people that "open" means removing as many barriers as possible
09:33:30 ... we're going to have rock solid stuff - it will be there and accurate for 9 years. Then there's softer and softer - we need to cover the spectrum
09:33:31 the barriers can be technical, social or legal.
09:33:58 "as open as possible" can still be used to describe data which is confidential.
09:34:03 For reference, I think JeniT is referring to the Open Definition http://opendefinition.org/
09:34:09 markbirbeck has joined #odw
09:34:35 Great question… I've been wondering as well if we're still having the same discussions (as we were a year or two ago).
09:34:37 bhyland: Yes. we're all evangelists but we're not working in a vacuum. There are people in gov who are not minded to hand data over to a bunch of smart people they don't trust
09:35:21 Hayo: It takes patience. We have to change contracts occasionally. We changed our legislation publishing contractor 5 years ago - that made a big difference
09:35:56 I think he said that it took 5 years to change the contract
09:36:03 tomag has joined #odw
09:36:21 and only then could they use their own data
09:36:34 Millie: (Scribe note - sorry, I missed Millie's comment about Pulse??)
09:37:42 PhilA: Yes, my pencils are sharpened.
09:38:00 yaso_ has joined #odw
09:38:27 BillR: My experience as a private sector person working for gov - see that some of the bigger people are only just picking up the potential for open data.
Some early birds are winning
09:38:40 s/PhilA: Yes, my pencils are sharpened.//
09:38:46 Last thoughts...
09:39:04 JohnS: Spend more time talking to people not involved with open data about fixing problems
09:39:15 BobS: OD is a means, not an end. talk about the ends
09:39:25 bhyland has joined #odw
09:39:25 Hayo: OD will take time and money. Maybe 5 years +
09:39:32 BillR: +1
09:39:34 floppy has joined #odw
09:39:53 Millie: UNDP uses taxpayers' money to change people's lives - we need help
09:40:08 TimD: Think about who's in the room when we define standards
09:40:46 Topic: The Role of PDF and Open Data (Jim King, Adobe)
09:41:17 scribenick: markbirbeck
09:41:19 timdavies has joined #odw
09:41:23 Scribe: Mark Birbeck
09:41:34 bhyland has joined #odw
09:41:47 Paper: http://www.w3.org/2013/04/odw/odw13_submission_52.pdf
09:42:17 Concluding remark from first session: "Open data is a means, not an end. Come at it from what real world problems it will solve."
09:42:24 "
09:42:46 HadleyBeeman has joined #odw
09:42:49 Paul Davidson introducing James King - senior principal scientist at Adobe - to talk about how PDF is more open than we all think it is.
09:42:57 BobS++ concur
09:44:27 Structure of talk: open data paradigm, PDF itself, and then its role in open data.
09:44:42 bhyland has joined #odw
09:44:44 s/:/-
09:44:46 jpcs1 has joined #odw
09:45:08 Organisations taking data, shaping it and presenting it.
09:45:33 …but others - the "processors" - would prefer to deal with the raw data...
09:45:58 …they might present that too, but also use the data to draw new conclusions, or use it for advocacy.
09:46:17 …A further group is that of the tool providers, who will help us process this data.
09:46:40 …About 30% of the room are providers...
09:46:53 …80% are processors...
09:47:23 …most are consumers, and some are tool providers.
09:48:11 …PDF will be 20 years old this June.
09:48:23 cjg has joined #odw
09:48:24 …PDF and Acrobat are different beasts.
09:48:57 …The internals of PDF have always been published, and it became an ISO Standard in 2008.
09:49:06 PhilA: Nice approach to backwards compatibility from Adobe for PDF
09:49:18 …A PDF 1.0 doc is also a 1.7 doc - always backwards compatible.
09:49:27 Jim King: PDF will be 20 years old this June. PDF 1.7 became an ISO Standard in July 2008. ISO work on PDF is ongoing.
09:49:40 hopefully mozilla's pdf.js will get a mention ...
09:50:55 …To make the PDF spec into a 'proper' ISO Standard the team at Adobe had to go through the entire document…very thoroughly…
09:51:12 amp has joined #odw
09:51:31 …PDFs are abundant, containing lots of useful information.
09:51:50 I had surprisingly good results converting our student union committee minutes from PDF to RDF: http://lemur.ecs.soton.ac.uk/~cjg/TheyWorkForSUSU -- just looking at where on the page text appears gives more semantics than the naive pdf2utf8 (or 2html) approach.
09:52:12 …It's a format that distinguishes between text and graphics, and can be used to produce good looking documents.
09:52:22 …But it's not a data format.
09:52:27 cjg: i think that's roughly what google scholar does when it scrapes pdfs
09:52:55 bschloss has joined #odw
09:53:15 …Billions of documents out there, but difficult to extract any data that's in there.
09:53:38 cjg: grabbing the largest text at the top of the first page as the title
09:53:51 …If pages *contain* graphics then extract that with something like Illustrator.
09:54:09 …If pages are text then there's a bunch of software that can process the text.
09:54:23 …(A big list is on Wikipedia.)
09:54:37 There is a 'spectrum of open data' -- totally free, available forever, no recording of downloader is one end of that spectrum, but airlines, investment markets, sports leagues, available job listing websites, retailers are all doing open data on a slightly different point on the spectrum.
09:54:50 …And if the pages are images (i.e., rather than *containing* images) then need to go the OCR route.
09:54:57 trc has joined #odw
09:55:07 http://en.wikipedia.org/wiki/Comparison_of_optical_character_recognition_software
09:55:30 We found a nice command line tool which converts PDF to an XML representation of the data structure inside and that gets it into our 'hacking comfort zone'
09:55:46 -> http://en.wikipedia.org/wiki/List_of_PDF_software wikipedia list for pdf tools
09:56:11 … http://en.wikipedia.org/wiki/Comparison_of_optical_character_recognition_software
09:56:19 … http://en.wikipedia.org/wiki/List_of_PDF_software wikipedia list for pdf tools
09:56:25 (Thanks Ivan and Steven!)
09:56:58 …If you're making PDFs, here's what you could do to make things easier.
09:57:37 …Making files that both contain raw data and look good is difficult.
09:58:32 …There *is* software around that can embed metadata to provide structural information.
09:58:40 AndreaP has joined #odw
09:59:09 Seems to me that any producer of a PDF who wants it to be available to people with no sight is hopefully providing a table or textual alternative rendering in the PDF for any diagram or image in the PDF, yes?
09:59:17 …The structural information would be stuff like reading order, tags such as headers, footnotes, figures, maths, and so on.
09:59:42 …Tools can make use of this extra data which will make the extraction process much more reliable.
10:00:13 markbirbeck1 has joined #odw
10:00:25 …A second thing to do is make use of the attachment facility.
10:00:37 scribenick: markbirbeck1
10:00:45 …A second thing to do is make use of the attachment facility.
10:01:08 lottebelice has joined #odw
10:01:27 …Raw data on its own is probably insufficient for doing something useful.
10:01:48 rrsagent, make minutes
10:01:48 I have made the request to generate http://www.w3.org/2013/04/23-odw-minutes.html StevenPemberton
10:01:53 …For example, what's the currency? the data format?
the semantics of the fields? provenance?
10:02:00 the attachments-in-PDFs thing might actually be useful for scholarly publications, so that the data doesn't get divorced from the paper
10:02:27 …So we create a PDF file that contains raw data with a schema, giving the end-user everything they need.
10:02:51 s/(Thanks Ivan and Steven!)//
10:03:04 bhyland: yeah, presumably there's not the tools support beyond what Adobe sells
10:03:06 …Can then make use of all the nice PDF features that have evolved over the last 20 years, such as digital signing.
10:04:03 …There are some examples in the slides.
10:04:23 AndyS has joined #odw
10:04:23 bhyland: same could be said of most metadata on the web
10:04:48 Peter Murray-Rust: Spent years hacking PDFs in the wild.
10:05:15 …Trying to write software that will process them, but they are generally pretty bad.
10:06:08 …If anyone else is trying to hack on this then please talk to me; there's hundreds of billions of dollars worth of information out there that is simply unusable at the moment.
10:06:31 I had a bit of a rant about PDFs as a way of communicating data to a reporter from the register, which resulted in them publishing this: http://lemur.ecs.soton.ac.uk/~cjg/Archive/Photos/2011/cjg-boffin.png (I'm quite proud of that)
10:06:39 Dan Brickley: Is this thing loud enough?
10:07:04 …PDF can be used well and powerfully, and of course it's clear that some people aren't using it well.
10:07:08 heh, re: billions of dollars worth of information that's unusable, you have to wonder if that's by design, not by accident ...
10:07:14 …You didn't mention XMP, though, which includes RDF.
10:07:24 …You also didn't mention accessibility.
10:07:44 Peter Murry-Rust - Scientific publishers are paid $10B/yr worldwide to lock up scholarly publishing, that is after governments spend $100B/yr globally on scientific funding for R&D in the first place. He is looking for people to help him in his mission to unlock the enormous value locked in PDFs.
10:08:45 serena_v has joined #odw
10:08:46 s/Murry-Rust/Murray-Rust
10:09:08 James: The accessibility aspects are quite mature in PDF, and the structured aspects help that.
10:09:13 roger has joined #odw
10:09:15 PDF is a page description language, so not in a reading order necessarily
10:10:04 …We don't have much control over what people produce, although things have improved in the last 5 years.
10:10:45 @edsu - perhaps re: your comment above. My experience suggests that we're more thoughtful publishing structured data about data sets (metadata) because they are fewer in quantity whereas PDF are like water, they are everywhere and almost "too easy" to create but the mere click of "Print -> PDF" …
10:11:03 s/but the/by the
10:11:07 speaker: For many people PDF data is closed data.
10:11:31 yaso has joined #odw
10:11:55 speaker2: You've outlined many things I didn't know were possible, so why is there not the uptake on these features?
10:12:45 @hadleybeeman - because the tools are proprietary, complex to use … at least harder than clicking "Print -> PDF" and well let's face it, people are lazy and hand entered metadata has been proven to be *very* challenging and highly inconsistent.
10:13:36 James: Not sure if it's our fault. In some areas there have been successes, perhaps where there's industry interest or our sales people have promoted a feature.
10:13:52 s/speaker2/hadleybeeman/
10:14:07 s/speaker2/HadleyBeeman/
10:14:21 markbirbeck has joined #odw
10:15:08 If they want stuff like metadata to be adopted, then surely they need to encourage support in tools other than their own (OpenOffice; Word)
10:16:19 hideaki has joined #odw
10:17:16 yoshiaki has joined #odw
10:17:33 hideaki has left #odw
10:18:26 JeniT has joined #odw
10:26:39 jpcs1 has joined #odw
10:29:19 StevenPemberton has joined #odw
10:31:16 floppy has joined #odw
10:31:39 fumi has joined #odw
10:32:40 rjw has joined #odw
10:33:05 bhyland has joined #odw
10:33:16 cjg has joined #odw
10:33:18 stressindikator has joined #odw
10:33:26 cgueret has joined #odw
10:33:47 st has joined #odw
10:33:56 markbirbeck has joined #odw
10:34:15 ldodds has joined #odw
10:34:17 hideaki has joined #odw
10:34:17 yoshiaki has joined #odw
10:34:44 StevenPemberton has joined #odw
10:34:46 yaso has joined #odw
10:34:48