13:44:56 RRSAgent has joined #diw2012
13:44:56 logging to http://www.w3.org/2012/02/03-diw2012-irc
13:45:33 Lesson 1: User Centered Design is important to this community, and I need to read Steve Sawyer
13:45:55 Pat has joined #diw2012
13:45:55 Lesson 2: New tools: IRC for collaborative work and Mendeley for bib management
13:46:14 Didn't know that the distinction between science and discovery was so important.
13:46:45 Lesson 3: NSF might actually fund my work, since while the applications are biomedical, the computer science is generalizable and may be of interest to them
13:47:01 I found the challenges in social science "discovery" extremely interesting... and would like to follow up on them!
13:47:02 AlexanderSchliep has joined #diw2012
13:47:29 ChrisS_ has joined #diw2012
13:47:47 Strong emphasis on biology, as opposed to other meetings where physical sciences dominate
13:48:22 Pat_ has joined #diw2012
13:48:38 Haym has joined #diw2012
13:48:46 +1 on bio emphasis
13:50:00 PaoloCiccarese has joined #diw2012
13:50:02 HuanLiu_ has joined #diw2012
13:50:25 1) Importance of usability and user centered design
13:50:38 Lesson 1: formal knowledge representation is really important, but as a starting point, not an endpoint
13:50:46 Lesson 2: Don't ask scientists what tools they need because they don't know
13:50:54 3 points from Nigham
13:51:12 Lesson 3: Workflow can be used not only for integrating computational tasks, but human tasks as well
13:51:26 1) tradeoff between formal representations and the effort they take, can we quantify/formalize that?
13:51:41 DavidJensen has joined #diw2012
13:51:43 Demonstrating utility is tricky
13:51:56 2) determining impact of tools is hard -- how do we measure impact?
13:52:34 3) Need to work with scientists to design appropriate tools (TurboTax example)
13:52:47 Kerstin's points
13:52:48 Kerstin has joined #diw2012
13:53:03 1) diversity of backgrounds in meeting (I think that is what he said)
13:53:34 @Huan 1 - Discovery has its many definitions and types
13:53:35 2) Scientists often perceive CS as not research
13:53:50 where DOE program managers invited
13:53:57 3) had not realized the importance of knowledge representation (in climate)
13:54:03 1) I did not realize there had been so many "success stories" in other science domains -- helps counter the misperception that CS is 'just a service'
13:54:06 2) Wasn't aware of the many different working definitions of "discovery" -- first hour felt like trying to define and scope
13:54:10 3) Realized how little attention I have paid to knowledge representation -- we often just "make it work" for the task at hand
13:54:12 Loren's 3 points:
13:54:43 1) many people did not know of success stories
13:54:50 2) -- I missed that
13:54:58 People were far less focused on induction from large data sets, and more balanced in their concerns, than I expected.
13:55:22 3) hard to work together when people have diverse backgrounds
13:55:29 We all have unique knowledge.
13:55:34 3 pints from Evelyn
13:55:44 1) importance of discovery for so many people
13:55:50 2) importance of social computing
13:55:52 +1 to three pints
13:55:53 Loren has joined #diw2012
13:56:01 @Huan 2 Social computing may be able to help discover global knowledge from local findings in a tangible way
13:56:10 +1 to pints (of beer)
13:56:27 Learning: We all have unique knowledge.
13:56:41 Learning: Social computing as a first class computing method.
13:56:46 3) contrast between industry and academia: a) in industry anyone can be an innovator
13:56:52 Analyses of the entire scientific process can serve as organizing frameworks for research on computational discovery.
13:57:00 Learning: Renewed emphasis on working on meaningful issues.
13:57:30 Learning: Difficulty of getting a group of people to work effectively and constructively together in a meeting.
13:57:31 would have liked to define the boundaries of social computing applicability more clearly
13:59:26 Breadth of expertise
13:59:29 breadth of expertise... some essentially addressing same problem
13:59:52 know should be sharing data, have good data formats... but either quick and dirty or formalize
14:00:11 @Huan 3 So much interest in social computing and so much expectation of its power.
14:00:12 are there enough commonalities to formalize and build shareable tools across communities?
14:00:15 sharing data either quick and dirty or fully formulated - makes sense to formalize sharing tools and then adapt them to communities
14:00:29 above is from Alex
14:00:41 dilemma, fancy CS tools not being used. User-centric design approach might be the right thing to do.
14:00:47 Vipin
14:01:11 so many aspects of informatics that I did not know about
14:01:14 so many other aspects than DM/ML in informatics and how CS can help
14:01:31 got to know people in other areas coming from different aspects
14:01:36 DavidJensen has joined #diw2012
14:01:51 commonality of experiences in room
14:01:57 commonality in experiences in different domains
14:02:06 Alex Schliep: 1. breadth of expertise, but have the same type of problems 2. know we *should* use formalism, but usually do it quick and dirty 3. bioinformatics has lots of nice tools, but many of them don't get used very much -- user-centered design could help this.
14:02:13 astronomy a good model for users vs experts: many users per expert
14:02:26 need success stories to sell to colleagues in other disciplines... CS is not a service discipline
14:02:27 Incentives for participation really seem like an important issue. Researchers may not *want* to share workflows, data, and tools. They may see these as their competitive advantage.
14:02:47 Yan
14:02:49 computer science viewed as a service discipline; must work to change that
14:03:19 started working closely with one biologist on single problem, not that many people cared about those results
14:03:51 Vipin Kumar: 1. breadth of interests, not just data mining and machine learning 2. so much in common among people working with very different scientific disciplines 3. we need more success stories to change perception that computer science is a service
14:03:54 Journals are a carefully worked-out balance between cooperation and individual reward. Social computing for DI is a whole new realm where we'll need to work out the incentive structures.
14:04:00 unsatisfied with helping a biologist on a very narrow problem - happier working across disciplines
14:04:10 challenge is to find the right people to work with while thinking of the problem at the same time
14:05:05 the social phenomenon is changing our lives
14:05:48 Need still better methods to connect people - how can we utilize the social perspective?
14:06:18 Yan Liu: 1. not satisfied solving focused problem with narrow focus -- trying to help the broader community 2. difficult to find scientists willing to work with 3. take advantage of the social revolution to connect the right people, define problems
14:06:24 Very interesting point from Yan Liu: Social computing could help crowdsource data gathering for sociology of science.
14:06:50 susan has joined #diw2012
14:06:51 Carla: impressed with preparation of workshop
14:07:03 Carla: learned about new systems
14:07:08 learnt a lot about new systems
14:07:14 QED: quasi experimental design
14:07:15 qed
14:07:30 social computing interesting
14:07:35 emphasize social computing as learning a lot
14:07:53 Carla 1) lots of new information even before workshop
14:07:58 a few people mentioned inference and reasoning - not enough
14:07:59 2) importance of social computing
14:08:13 formal reasoning not mentioned as much as expected
14:08:22 I forgot to say: I think there's still a tension between problem-centered and technology-centered research.
14:08:40 bringing human computation as a way to do reasoning
14:08:43 bringing human computation as a way of doing reasoning, coupling machines with humans
14:08:44 3) human computation as a way to do reasoning (social computing)
14:09:01 Phil
14:09:08 Carla Gomes: 1. Great workshop organization, gather input ahead of time 2. learned a lot about other systems, new definitions -- especially social computing 3. surprised not too many people mentioned reasoning and inference 4. human computation as a way of doing reasoning is an interesting concept, coupling with machines using the strengths of each
14:09:12 (cynic)
14:09:24 One of the outcomes of studying the foundations of discovery informatics may be a far better understanding of scientific reasoning in general.
14:09:35 Phil 1: absurd that we need discovery tools to discover discovery tools
14:09:40 absurd that need discovery tools to create discovery tools
14:10:13 tools are not advancing very quickly
14:10:29 integration of data is the key first step
14:10:52 Emphasis on understanding failure is good
14:11:14 + integration of data, processes and people
14:11:14 Raul: I learned that way more scientists use discovery tools than there are computational scientists who can hand-hold them.
14:11:21 Emphasis on success of cyberinfrastructure overstated
14:11:30 Take home point: we are at a tipping point
14:11:38 Raul: Science needs "TurboTax-style" discovery tools that (1) understand scientists' workflows; (2) are expert, automated guides, and (3) are really, really simple to use.
14:11:40 Phil: success of CI may have been overstated
14:11:44 Catalyst often comes from something that we don't expect
14:11:59 Phil: we are at a tipping point, after this meeting even more convinced
14:12:25 cognition may be changing more than I realized
14:12:26 Phil: role of cognition research
14:12:39 simplicity, usability, reward will always rule what scientists decide to adopt
14:12:40 I was happy to see abductive inference in some of the introductory slides -- it's an under-appreciated and fundamental description of much scientific reasoning.
14:12:50 Simplicity, usability, reward will always rule whether scientists adopt something
14:12:55 TurboTax will not quite do it for scientists.
14:13:15 jls has joined #diw2012
14:13:25 Pirates of Silicon Valley is a great movie
14:13:37 Pat
14:13:55 Pleased that induction from large data sets didn't dominate
14:13:59 Pleasantly surprised that abduction from large data sets was not so dominant
14:14:02 pat: not obsessed about big datasets induction
14:14:17 People not aware of success stories
14:14:23 don't know about long history of work in discovery in science... wrote survey paper on 8 success stories back in 2000
14:14:34 missing: search metaphor
14:14:56 The search metaphor was missing - what is the space and how do you constrain it?
14:15:10 Study discovery in the context of the whole scientific enterprise
14:15:12 You don't want to study discovery by itself but in context of entire scientific enterprise
14:15:19 Chris
14:15:21 Study discovery in the context of the full scientific enterprise
14:16:00 nice to see so many people interested in this
14:16:04 fragmentation in comp sci is worse than thought
14:16:14 Chris: fragmentation in the computational sciences worse than I thought
14:16:34 Chris: NSF truly thinking about the problem
14:16:37 teams should be interdisciplinary
14:17:20 53
14:17:29 5-7 years from a good solution
14:17:30 Hod's point about "garage science" raises an interesting thought for me: Can social computing and DI create a *better* incentive structure for research, so that we can massively broaden who can meaningfully contribute to important science?
14:18:09 domain scientists should be on team to create useful tools
14:19:08 Steve
14:19:13 Work with a scientist not the science in the broader sense
14:19:18 Feels like kid invited to Santa's workshop
14:19:31 Steve: Santa's workshop
14:19:39 ChrisS has joined #diw2012
14:19:54 1. Fragmentation in the computational sciences even worse than I thought
14:20:02 impressed with thoughtful automatization attempt, thoughtful thinking of how to use CS thinking in science
14:20:05 Thoughtful automation and how to use computational ability in science
14:20:14 delighted that didn't turn into BIG science, data
14:20:24 2. I thought all of NSF already knew that domain people need to be on the teams that build things meant to be used.
14:20:32 didn't know the fascination with models
14:20:46 Fascination with models was not expected
14:21:04 study of science: people who best benefit don't get to see it as much
14:21:05 3. Badges (public acknowledgement in labelled form) and computer-supported competitions for data analysis and reviewing are going to be the wave of the future.
14:21:15 What is the takeaway as I move into my craft?
14:21:15 discussion between science and discovery was interesting
14:21:26 TurboScience
14:21:42 could be in a high school in 10 years
14:21:48 TurboScience - likes that idea - could be in high school in 10 years
14:21:58 collaborations, human/non-human, how can we collaborate with tools better
14:22:05 broaden participation in science through social computing
14:22:24 more people = more energy in science, should be more visible
14:22:28 Having a scientist on your project and trying to solve his or her problem is only *one* model of how to do good work in DI. For example, you could work on a problem that has been previously (and well) defined by another community. There may be other successful approaches. Let's not become a monoculture.
14:22:34 Very excited by broadening participation in science - more energy comes from more people - science should be a more visible enterprise
14:22:41 alex
14:22:42 Loren has joined #diw2012
14:22:44 metrics for discovery
14:22:50 discovery is a diverse concept
14:23:05 many issues are more sociological than technological
14:23:26 +1 social aspect much bigger hurdle than technical in many cases
14:23:26 Alex: Metrics for discovery is hard; discovery is a diverse subject; the human factor is still too large; pleasantly surprised by amount of understanding and belief in citizen science
14:23:28 lot to do on the front of models
14:23:38 simplicity is good, e.g. user interfaces
14:23:44 didn't hear a killer app
14:23:47 takehome - simplicity of user interfaces - no killer app
14:23:59 education will dramatically change in next 10 years
14:24:04 PatLangley has joined #diw2012
14:24:04 Cecelia
14:24:11 Education will dramatically change in the next 10 years and we cannot be isolated from that
14:24:15 Many people are unaware of the long history of work on computational discovery in scientific domains, and of the many success stories, so we need to do a better job of advertising.
14:24:19 people approaching people from different angles
14:24:43 importance of human, user-centric design
14:24:51 Just came across my twitter feed: An online social network for "pre-submission" peer review of scientific articles: http://news.sciencemag.org/scienceinsider/2012/01/online-social-network-seeks-to.html
14:24:57 Cecilia - pleasantly surprised about the recognition of the importance of the human design
14:24:57 I was surprised that the metaphor of heuristic search was generally absent from the discussions.
14:25:24 don't know about models, but have built 2 dozen tools for scientists
14:25:25 Learnt about the importance of models
14:25:48 domain scientist should be on a team of builders? usually the other way around
14:26:04 Liz
14:26:09 17, 37, 46
14:26:20 command line is terrible interface
14:26:26 Command line is a worse interface design than I ever thought
14:26:43 Huan
14:26:56 so many definitions of discovery
14:26:58 Huan: so many definitions of discovery
14:27:00 Cecelia says in her world, scientists say, "Hire a computer scientist instead of a graduate student? Why would anyone ever do that?" Biology will eventually become that way, but it isn't yet.
14:27:04 and ways to discover
14:27:13 so much interesting social computing, expectation of social computing
14:27:17 so much expectation of social computing
14:27:31 finding global findings from local knowledge
14:27:38 social computing may be able to help discover global knowledge from local knowledge in a tangible way (learned from neighbor)
14:27:49 More people are interested in this area than he thought
14:27:52 social computing: a way to find global knowledge
14:28:01 What I'll do differently: rather than just go work with scientists, I'll think hard about it first: read some papers, do some planning.
14:28:07 sociology of science, ML, social computing
14:28:20 David: concrete ways to work together were evident
14:28:29 lack of whole scientific areas using primitive tools for discovery informatics
14:28:49 Did not know that so many people cared so much about the different definitions of "discovery," "models," and so on.
14:29:03 Surprised that there are tools for specific examples but not more generic
14:29:11 equivalence classes that may not line up with scientific fields that could be useful for generalization
14:29:18 I was pleased by the general understanding of the importance of user-centered design in this community.
14:29:21 Andrey
14:29:41 surprise on focus on success stories, could learn from the failures
14:29:48 surprised at how different we are
14:30:01 Andrey: surprised how different we are and how we think about research differently
14:30:03 how one discipline thinks about another one
14:30:32 Hod
14:30:33 How group dynamics affects what we come up with - would it be different if done remotely?
14:30:55 Whole topic is a meta scientific problem
14:31:08 this is a meta scientific problem, surprised that there haven't been more of these meetings. Lots of basic things still aren't nailed down.
14:31:31 Leverage to be gained if automate, make this process more efficient. Should be high priority
14:31:32 Hod: so much leverage if we can automate and make processes more efficient
14:31:48 How non-linear the scientific process is.
14:31:57 Hod - the scientific process is soooo non-linear
14:32:00 Having built a couple dozen systems for scientific collaborations without knowing much about the importance of models, I was excited to learn there's so much work in this area, and am looking forward to using models in my own research.
14:32:21 The scientific process is stochastic
14:32:40 process of discovery is not rational
14:32:50 The discovery process is open ended and exploratory
14:33:34 garage science, democratization of discovery: NSF can get in the near future by letting people in garages do research with much smaller budget
14:33:48 The democratization of discovery - people on an island arguing that the tsunami is coming
14:33:55 Hod: we are like people on an island discussing whether a tsunami is coming
14:34:41 The research enterprise is going to change
14:34:45 problem 42
14:35:04 It was interesting to see that people kept saying, "a domain scientist should be on a CS project team." Where I come from, scientists generally need to be persuaded that a computer scientist should be on the team to build the code instead of a grad student.
14:35:11 Finding the interesting non-trivial thing in the data - 42
14:35:20 finding the interesting non-trivial thing in the data
14:35:32 Google search is an extension of how people think
14:35:39 Google search is an extension of how people think -
14:36:16 tools are not used because they do not add value by producing something new
14:36:24 The tools don't add value - they do not allow us to discover something new
14:36:32 distinction between discovery and informatics is really important, should focus on discovery
14:36:53 shouldn't make a TurboTax for science, companies can do this much better
14:37:03 +1 for the point that a successful tool will be one where a scientist enters her data into the tool and it comes back with something interesting to the scientist in the top one or two results. It's a waste of time to make TurboTax for scientists.
14:37:06 trying to compete with industry is a mistake - companies will do a better TurboScience tool
14:37:27 Actually +10 for Hod's point about returning something interesting from the data
14:37:35 I really disagree that a company can make tools for scientists better - there's not a big enough market for them to want to do this.
14:37:37 Kerstin
14:38:05 discovery of new knowledge in what we have, producing something as easy to use as Google but more powerful
14:38:18 Kerstin: As useful as Google but across data and other scientific matter
14:38:35 surprised to see so much focused on the long-tail of science
14:38:59 So much focus on long tail surprising - used to the other way around, focus on big data
14:39:01 maybe due to composition of workshop
14:39:13 It's a mistake to assume that the private sector does everything better and more efficiently than anyone else - that's only true if there's a reasonable chance of profit. For non-profit areas, or ones that have only long-term returns, the public sector does better.
14:39:25 problems getting access to real scientists
14:39:29 Getting access to real scientists is a problem
14:39:52 Cecelia has to fend them off, is there a way of her sharing users and us sharing technologies?
14:39:53 How can views and technologies be better shared?
14:40:01 Miriah
14:40:01 +1 to Cecilia's comment
14:40:05 A marketplace for informatics
14:40:24 visualization is young but growing, nice to hear its importance in discovery informatics
14:40:34 Pleased that viz research brought to the front
14:40:40 it's our job to figure out the prototypes that are useful, only then will industry take it on (and we should let them)
14:40:45 usability, user-centered design
14:41:01 -1 to Kerstin for saying that biologists are focused on small science. Biology has been increasingly focused on larger and more expensive and "grander" challenges. Democratized ("garage") biology is a complement to big biology (not a replacement) as extremely powerful tools become cheap. Note that decades after the introduction of the cheap computer, expensive computer science research has not gone away.
14:41:09 knowledge representation resonates
14:41:10 lots of people, lots of different aspects: knowledge representation
14:41:16 Haym
14:41:24 53: Galaxy Zoo, Foldit
14:41:36 Haym - GalaxyZoo and Foldit - we got it straight away
14:41:58 excited about the thinking of data and models
14:42:04 interplay between data and models, process: routine thing in DM, pleased to see this here
14:42:21 Pleased about the long tail
14:42:24 long-tail is important... complex data is also interesting
14:42:40 dark science
14:42:54 Pleased about the focus on people
14:42:56 PaoloCiccarese has joined #diw2012
14:42:59 people: user-centric design, citizen science, human bottleneck
14:43:27 will
14:43:34 Will: oh
14:43:56 The problem with saying "human bottleneck" is that it goes back to the old model of thinking that humans are the limiting factor or at fault... I prefer the term "impedance mismatch" between human cognition and computation.
14:44:05 multiple iterations of cartoon model of science
14:44:22 prefer the quantum version
14:44:24 Will: multiple iteration of the scientific process - prefers quantum version
14:44:56 Will: many aspects of science do not work with this model
14:45:08 lot of people are interested in incentivizing use of science/tools
14:45:26 have to build in from the front
14:45:39 Will: incentivize use of science and tools - have to build in from the front - user must be in at the beginning
14:45:46 +1 to Cecilia's comment about disagreeing with the "human bottleneck" idea. *All* interesting science comes out of people, and probably will for some time.
14:45:54 Think about what the incentives might be
14:46:10 What is important to the scientist must come first
14:46:41 didn't see a lot of talk about social/political forces in science
14:46:45 Not much talk about the social and political aspects of science
14:47:03 break down political and social barriers?
14:47:03 breaking down the political barriers not discussed much
14:47:27 e.g. problems of peer review - use technology to change the structure
14:48:32 Karsten has left #diw2012
14:48:38 Karsten has joined #diw2012
14:49:10 yolanda & haym -- saw the whole irc stream but had trouble posting, will e-mail notes.
14:55:34 Charge to the breakout groups is to identify exciting research that would be enabled by DI. However, we should focus on things that DI would uniquely enable. What will *uniquely* happen with DI?
14:57:36 andrey_rzhetsky_ has joined #diw2012
15:06:23 Daviv - categories to be discussed in the breakouts: efficiency, cover, participation, education and basic knowledge about the scientific process
15:06:32 s/Daviv/David/
15:08:03 To expand...
15:08:14 Efficiency -- More efficient discovery of what is currently discovered
15:08:20 Coverage -- Extend what is discoverable beyond what can be discovered currently.
15:08:30 Participation -- Increase the number of people who can participate in science because of broad availability of tools, data, and collaboration opportunities
15:08:38 Education -- Increase the availability of actual scientific processes and information to students
15:08:44 Knowledge -- Increase understanding about the process of scientific discovery because of data collection and basic research on the process itself
15:09:12 if we want to make an investment in science, we have to understand how we can change it
15:10:29 .com
15:12:41 Karsten has left #diw2012
15:14:47 rrsagent, [please] create [the] minutes
15:14:47 I'm logging. I don't understand '[please] create [the] minutes', PaoloCiccarese. Try /msg RRSAgent help
15:14:57 rrsagent, create minutes
15:14:57 I have made the request to generate http://www.w3.org/2012/02/03-diw2012-minutes.html PaoloCiccarese
15:15:48 rrsagent, set logs world-visible