From High-Performance Computing Community Group
January 9, 2014, 2:00 PM Pacific time
Notes from Past Meetings
November 19, 2013
Several of us attended the HPC and the Web BOF at SC13. We had a lively and well-attended discussion. Live notes were kept in a Google document. The group identified several needs, which may direct our future activities. One is advice for developers new to this field, which could include guidance for getting started and help identifying appropriate tools. Another is common middleware, something like a jQuery for science, that would enable developers to put something together quickly without reinventing the wheel. Another suggestion was a collection of sample apps in various MVC frameworks, along the lines of TodoMVC, for HPC. There was also a call for more outreach and more opportunities to share successes and lessons learned.
October 3, 2013
Nancy Wilkins-Diehr - the Science Gateway Institute
Nancy mentioned a few of the talks from the XSEDE conference this year, including Kaitlin Thaney's talk about gateways for open science, and one about brain research. The theme was Gateways to Discovery. HPCWire has good writeups about them.
Nancy discussed her work for the Science Gateway Institute. They are concentrating on the back end, not the front end. Interfaces have changed a lot over the years. Gateways obviate the need to learn everything needed to run a job.
Having things running under a shared account pushes responsibility out to gateway developers. It was a change in policy that NSF would even allow people to use supercomputers this way. Before, you had to get an account; there was some vetting, and access was not automatic. Not having to do that was a big win.
40% of XSEDE users come in through gateways. CIPRES (cyberinfrastructure for phylogenetic research) is the most popular gateway in TeraGrid.
Recently, a 15-year-old student used XSEDE and won the Massachusetts state science fair with no help from gateway staff.
Other applications: CyberGIS, analytical ultracentrifugation - UltraScan 9.0
gateways in the marketplace, kids control telescopes and share images (observatories)
NSF vision for cyberinfrastructure in the 21st century: "science is all about connections"
software panel recommendations
13 awards in 2012 to different groups in conceptualization phase
scientific software innovation institutes (S2I2), long term hubs of excellence
short funding cycles can be difficult - need to break this cycle
great video about what they learned so far
staff members available who can help get you off the ground, pool of experts
workforce development - opportunities for students, IT professionals
existing opportunities like Google Summer of Code
campus IT departments (Indiana, Notre Dame)
final focus group session mid-October
second round of conceptualization awards, theirs was delayed a year
they have connections to Europe, Australia
surveys - about stuff they would offer, gauging interest
any projects coming up that involve collaboration between NSF/DOE? OSG would have qualified
there were 13 institute awards - some of them might be.
maybe an x-ray scattering one? they tried SNS, didn't fit into their model
maybe light sources?
Plans for getting together at SC13: we're hoping to organize a BOF
August 1, 2013
David Skinner (NERSC, LBNL) reprised his talk from Globus World, discussing new approaches to building science gateways.
Kieron Mottley and Markus Binsteiner reviewed the New Zealand eScience Infrastructure.
July 11, 2013
Attending: Nick Jones, Joshua Boverhof, Oliver Ruebel, Young Qin, Fabiana Kubke, Nancy Wilkins-Diehr, Markus Binsteiner, Rion Dooley, Charlie Dey, Bob Gunion, David Skinner, Joe Stubbs, Amy Ecclesine, Kaitlin Thaney, and Chee-Hong Wong
Kaitlin Thaney introduced Mozilla Science Lab and discussed the group's efforts thus far. They are still very new, and most of their current effort is going toward getting up and running. They are currently working with Software Carpentry folks to encourage good software engineering practices among scientists. Kaitlin also mentioned some prototyping work with PLOS Computational Biology, a project that involves IPython Notebook and R Notebook. They are planning a large project barn-raising in October. Mozilla reaches out to its communities with weekly or monthly Webmaker community calls, which are open to the public. This is a good way to get plugged into what is happening at Mozilla.
Rion Dooley showed some extensive updates to iPlant and its Agave API. Version 2 is a major upgrade with lots of new features. He showed very nice API documentation using Swagger. iPlant now has widgets available for adding into other web sites. They also offer a turn-key WordPress install. Rion provided links for the API (http://aci-dev.tacc.utexas.edu/dooley/foundation/) and the latest version (https://iplant-dev.tacc.utexas.edu/v2/docs).
Unfortunately, we ran out of time for David Skinner's talk about Globus World, so that will be on the agenda for our next meeting.
April 11, 2013
Attending: Rion Dooley, Shreyas Cholia, David Skinner, Nancy Wilkins-Diehr, Annette Greiner
The group discussed ways to increase participation. Suggestions were as follows:
- pick a regular meeting time
- meet face to face, hold a BOF at SC'13
- find people who have a stake in being involved, something to offer, or a need
We had a broad-ranging discussion about what we each want out of the CG. Annette wants it to be possible to build a web app once and then quickly and easily convert it to run at another center. Rion wants to build relationships and allow us to leverage each other's work. David thinks we're not ready to develop a true spec; the CG is a place to get thinking together and to show people that HPC can make good use of the web. Interoperability would do a lot for distributed computing. We have much to learn in terms of best practices. Even if we don't achieve interoperability, we'll learn a lot from one another. What we have now could change a lot. Nancy has seen lots of projects try to connect over the web, all under one umbrella. She agrees it's premature for something formal, but what we have at least shows that people are talking. It would be nice to have viewpoints from countries other than the U.S. (more grid based).
There was a bit of discussion of the value of web vs ssh - Rion points out the key question: how to get a (science) gateway for under a million dollars? Rion asks, How do we change the paradigm to things that are compatible with the web?
Shreyas thinks a spec would be a nice byproduct. It's good to communicate with people working on the same problems and share ideas. He thinks of what we're developing less as a spec and more as a recommendation. He is less optimistic about interoperability and believes it would be sufficient to be able to pick up the API for another center quickly.
David would bracket success around science outcomes and lowering the barrier to entry. It's not about compliance as much as the spirit.
Nancy is creating a Science Gateway Inst. BOF at SC'13.
David points out that the DOE collaboratories program is coming back, better informed this time, with a "build or buy" strategy (acknowledging when you're using existing tech).
David asked Nancy and Rion about support for HPC on the web in the NSF world. They mentioned XSEDE, iPlant examples, and campus resources, but there is no central place where people are talking about it. There have been workshops and activity driven by interest in cloud computing.
SC'13 BOF - Rion will send out an application.
As for finding new members, we might want to get someone from Globus Online. David will be speaking at Globus World about web APIs. Shreyas is going, too, and will let people there know about the CG. Exposing the unique capabilities of HPC is important (vs. what Amazon can do).
We settled on a time for a recurring meeting: the first Thursday of each month, 2:00. We all agree that show and tells would be good to add back into our meetings. We could link to some of the demos from our W3C page.
File API discussion
How do others implement access control?
Annette looked at DataOne. They have permissions set by uploading an XML file. What we really need to do is let different apps use their own permissions model and just accept a blob of input to be handled per the implementation.
Shreyas agrees. We need to allow people to put up a blob of some sort that describes the permissions.
Rion mentioned that iRODS updates permissions as single rows in a db. For one file, you might need to query 32,000 rows, which is not scalable. A permission has to be its own resource: /file/permissions. Versioning is another thing to think about. What are the things that go into a file object? Timestamps, version stamps?
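The idea of a permissions sub-resource that accepts an opaque blob could be sketched roughly as follows. This is only an illustration, not something the group specified: the `FileStore` class, its method names, the `/file/<path>/permissions` route shape in the comments, and the example blob are all assumptions.

```python
import json


class FileStore:
    """Toy in-memory store (hypothetical); a real service would sit on a file system or object store."""

    def __init__(self):
        self._files = {}        # path -> file metadata
        self._permissions = {}  # path -> opaque permissions blob

    def put_file(self, path, size):
        self._files[path] = {"path": path, "size": size}

    def put_permissions(self, path, blob):
        """Would back PUT /file/<path>/permissions: store the blob without interpreting it."""
        if path not in self._files:
            raise KeyError(path)
        self._permissions[path] = blob

    def get_permissions(self, path):
        """Would back GET /file/<path>/permissions: one lookup per file, not one row per grant."""
        if path not in self._files:
            raise KeyError(path)
        return self._permissions.get(path)


store = FileStore()
store.put_file("/projects/demo/run1.out", size=2048)
# The blob's shape is up to each implementation; this one mimics Unix permissions.
store.put_permissions("/projects/demo/run1.out", {"model": "unix", "mode": "0640"})
print(json.dumps(store.get_permissions("/projects/demo/run1.out")))
```

Because the store never inspects the blob, a center using ACLs could put a completely different structure in the same place.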
March 28, 2013
Attending: Rion Dooley, Shreyas Cholia, Annette Greiner
Very brief discussion in which we agreed to postpone the meeting.
February 28, 2013
Attending: Rion Dooley, Shreyas Cholia, Jeff Long, David Skinner, Annette Greiner
SC13 - we hope to have a workshop there. Rion led an effort to submit a workshop request. We won't know until May whether it is accepted. There should be time to do a BOF instead, if the workshop is not selected. (BOF requests are due in June.) Other conferences that might be of interest: XSEDE and IEEE Big Data (Santa Clara), which has a science/web angle.
The group discussed options for the file resource portion of the API. Self-discovery is a useful thing. Rion mentioned HATEOAS (Hypermedia As The Engine of Application State) compliance. The group agreed it is a good starting point. We are not planning to make it explicitly required by our spec, but we will use the ideas it encompasses where applicable. The group considered how to deal with various formats for encoding data, such as XML. We agreed to use JSON as the default return. The group agreed that version numbers should be included in URLs and in responses. The group agreed that parent resources should list their children. Some examples of APIs that do this well are Northwind and Netflix.
There was much discussion of how to request a file resource versus the metadata for that file resource. This led to a discussion about how to handle setting permissions. NEWT uses Unix permissions only. What if one is accessing an object store or an FTP server? iPlant uses FTP and uses ACLs to enable group sharing. They have fine-grained ACLs on everything, not just files. Is there a standard for implementing permissions? Possible standards include Apache Shiro (shiro.apache.org) and the Grouper project (www.internet2.edu/grouper/software.html). Globus Online has something, too (though it focuses on files). Other places to look are CMSs and Amazon. We might want to have a field in the returned metadata that has the permissions info. The group agreed to study existing implementations for permissions, ACLs, etc. for the next meeting.
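The conventions agreed on in this meeting (JSON by default, a version number in both the URL and the response, and parents that list their children) might look something like the sketch below. The version label `v1`, the resource names in `CHILDREN`, and the field names `self` and `children` are assumptions made for illustration; the notes do not fix any of them.

```python
import json

API_VERSION = "v1"  # hypothetical version label

# Hypothetical resource tree: parent path -> names of its child resources
CHILDREN = {
    "": ["file", "jobs", "status"],
    "file": ["metadata", "permissions"],
}


def url_for(path):
    """Build a URL that carries the API version."""
    return f"/{API_VERSION}/{path}".rstrip("/")


def describe(path):
    """Return a JSON document for a resource, with links to its children."""
    body = {
        "version": API_VERSION,  # version repeated in the response body
        "self": url_for(path),
        "children": [
            url_for(f"{path}/{child}".lstrip("/"))
            for child in CHILDREN.get(path, [])
        ],
    }
    return json.dumps(body)


print(describe(""))
print(describe("file"))
```

A client that starts at the root can follow the `children` links to discover the rest of the API, which is the self-discovery property the HATEOAS discussion was after.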
Attending: Annette Greiner, David Skinner, Jeff Long, Shreyas Cholia
The Google Doc is at https://docs.google.com/document/d/1MCWvDZNCkAhUjt8e50zK6ASy-zb0Om7ZoBn-SWuDz1k/edit?usp=sharing
Note: Cray joined W3C (YarcData). We'd like to involve DoD folks. Annette will ping the one she has contact info for.
David entered into the Google Doc some initial commands for the status branch of the API. The group filled in more during the meeting. /status (top level) shows the status of the API itself. The very top level, /, would be documentation for the entire API. We are using the top level for each command as a list of entities available. All top-level calls should specify info about themselves plus info about their children. There is concern about the overhead of returning a list of all subresources. Should resources document only their immediate children? Or maybe the top level lists everything in the whole API? Anything that can have more specific info but doesn't have it is taken as documentation for what it can have. New book mentioned by David: The RESTful Web Cookbook is good for telling you when to break RESTful constraints. Shreyas will write up a proposed initial solution for what we want to do with data objects, with concrete examples.
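The open question about overhead (immediate children only versus the whole tree) can be made concrete with a depth limit. The sketch below is one possible reading, not a decision the group recorded; the `API_TREE` contents, the `/status/systems` entry, and the `depth` parameter are hypothetical.

```python
# Hypothetical API tree; only / and /status come from the notes.
API_TREE = {
    "info": "Documentation for the entire API",
    "children": {
        "status": {
            "info": "Status of the API itself",
            "children": {
                "systems": {"info": "Status of individual systems", "children": {}},
            },
        },
        "jobs": {"info": "Job submission and monitoring", "children": {}},
    },
}


def describe(node, depth=1):
    """Describe a resource plus its descendants down to `depth` levels.

    depth=1 lists immediate children only; depth=None walks the whole tree.
    """
    result = {"info": node["info"]}
    if depth is None or depth > 0:
        next_depth = None if depth is None else depth - 1
        result["children"] = {
            name: describe(child, next_depth)
            for name, child in node["children"].items()
        }
    return result


shallow = describe(API_TREE)           # cheap: self plus immediate children
full = describe(API_TREE, depth=None)  # expensive: everything in the API
print(sorted(shallow["children"]))
```

With the default depth, the response size grows with the number of direct children rather than with the size of the whole API.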
Attending: Jeff Long, Rion Dooley, Annette Greiner, David Skinner
Annette set up a Google Doc for job commands, where we can set down the commands we agree on for the standard API. Rion added in some suggested commands, and Jeff added info about the Lorenz API for reference. We worked through the Google Doc and Rion updated it as we discussed.
We settled on the word "jobs" for this section of the API. We added the reference to a specific machine for centers that need that. LLNL has multiple endpoints. People can view other people's jobs. There was discussion about the uniqueness of job ids. We favor having unique IDs that can be used to find data about a job later. A uuid generator can be good. Since queuing systems re-use job IDs, we have a job_id and a unique_id. The latter exists just for the API.
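The split between a scheduler's job_id and an API-level unique_id could be sketched as below. The `JobRegistry` class, the `machine` field name, and the machine name `cluster-a` are assumptions for illustration; only the job_id/unique_id distinction and the use of a UUID generator come from the discussion.

```python
import uuid


class JobRegistry:
    """Hypothetical registry mapping API-level unique IDs to scheduler job IDs."""

    def __init__(self):
        self._by_unique_id = {}

    def register(self, machine, job_id):
        """Record a job and return a unique_id that is never re-used."""
        unique_id = str(uuid.uuid4())
        self._by_unique_id[unique_id] = {
            "unique_id": unique_id,  # exists just for the API
            "machine": machine,      # for centers with multiple endpoints
            "job_id": job_id,        # scheduler ID, which may be recycled
        }
        return unique_id

    def lookup(self, unique_id):
        return self._by_unique_id[unique_id]


registry = JobRegistry()
first = registry.register("cluster-a", "12345")
second = registry.register("cluster-a", "12345")  # scheduler re-used the job ID
print(first != second)
```

Both records stay retrievable later, even though the queuing system handed out the same job_id twice.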
We'll need to clarify the inputs and outputs. We agree that they should be arrays. We considered an additional /queues portion of the API, but we decided the functions could be done elsewhere in the API, as attributes. The one example was a list of possible queues. (LLNL calls them partitions, and bases the choice of system on requested compute characteristics.) We also talked about including commands for stat, reservations, step, and customization hooks. We decided to keep these things outside the core API for now.
Next, we'll talk about the system commands.
Attending: Joel Martinez, Annette Greiner, Jeff Long, David Skinner, Rion Dooley, and Shreyas Cholia
The discussion started with follow-up questions from the last meeting. Rion explained the use of events in iPlant. They are following the pub/sub model for long-running tasks, using RabbitMQ and Java Message Service (JMS). Rion asked about how NEWT uses object stores. They are Mongo objects used for session-level info. Binding is to users; users can set permissions for other users. Joel: LLNL uses suexec and Unix permissions. There are some issues with that: you can't have a generic superuser. Shreyas pointed out that the Mongo service is only available through the API, and that it enables powerful searches.
Going back to the discussion of events, iPlant has separated Rabbit from pub/sub. It can post back to a service, send email, post to Twitter, or send IMs (Jabber, XMPP). They rolled their own permissions model. David asked whether Twitter/Jabber support should be in our API. LLNL said it wouldn't use it.
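For readers unfamiliar with the pub/sub model mentioned here, a minimal in-process sketch follows. It is a stand-in for a real message broker and does not use the RabbitMQ or JMS APIs; the `EventBus` class, the topic name `job.finished`, and the two notifier functions are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish/subscribe hub (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        """Deliver the event to every handler for the topic; return how many ran."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


sent = []  # records what each notifier would have sent


def email_notifier(event):
    sent.append(("email", event["job"], event["state"]))


def callback_notifier(event):
    sent.append(("callback", event["job"], event["state"]))


bus = EventBus()
bus.subscribe("job.finished", email_notifier)
bus.subscribe("job.finished", callback_notifier)
delivered = bus.publish("job.finished", {"job": "abc", "state": "FINISHED"})
print(delivered, sent)
```

The publisher of a long-running task only announces the state change; which channels are notified is decided by whoever subscribed, so a center that does not want a given channel simply registers no handler for it.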
There was discussion about the apps portion of Agave. iPlant aims to support people who build production gateways (not commercial). Their app service acts as a registry. It's very commonly used for iPlant. Eventually, apps move to officially supported iPlant (published). They associate a DOI with a specific version. Popularity info helps establish trust. It's still pretty complicated, but they have ~250 apps. (Parameters can get very hairy.) There was some discussion of watched folders and the Action Folders project, which is like Mac Automator for Linux.
We had a discussion about the grammar of the API. Rion pointed out that there should be a consistent point of view (e.g., user, administrator). Our existing APIs take the user perspective, so we'll aim for that with the standard API. Rion mentioned a new auth system like OAuth 2. They are making requests as the actual user. They have to deal with multiple kingdoms, "multiplexing users". The group doesn't favor prescribing a specific authentication scheme, but we see nothing wrong with making recommendations for deployment.
We agreed to start a Google Doc in which we'll record the commands for the first area we want to tackle, jobs.
October 25, 2012
We had a brief comparison of the contents of the Google Doc that shows the existing API calls. It was suggested that we should support JSON and use parameter tags for other I/O formats (like XML). We noted that the iPlant API has a calendar and events. Question: are they events triggered by computational activities or scheduled by time? Should we handle alerts? Reservations?
We realized that we needed input from the iPlant Collaborative to make any decisions, so we will reschedule for a time when they can definitely be there.
October 11, 2012
Can we make a standard for HPC web APIs? Acknowledging that the various centers have different ways of handling many things (accounting, for example), we generally agree that there is probably some subset of operations at HPC centers that can be standardized.
|NEWT has||Lorenz has||iPlant has|
| data conversion|
What belongs in an HPC web API standard? REST calls, specific I/O formats.
How will we agree on one? What is our process? We'll start with looking at how we handle files and jobs, which seem pretty similar across centers. The next thing to attempt may be status information.
What about an implementation? There is general interest in building a stub implementation. This would allow us to test the spec we come up with and could be useful as a starter for setting up new APIs.
Who wants to work on what? To begin, we'll all work together on determining an overall approach, so that we don't end up developing subsets of standards that aren't compatible. We need to have a common grammar, set of verbs, and syntax for how URLs are crafted. Some of us have found the O'Reilly book RESTful Web Services helpful. We agree that the standard for I/O should describe JSON, though we might add support for XML later (we might suggest a decorator that does conversion).
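The idea of a decorator that converts JSON-style output to XML might be sketched as follows. This is one possible interpretation, not an agreed design: the `serialized` decorator, the `fmt` parameter, the `response` root tag, and the `get_status` handler are all hypothetical, and the XML conversion only handles flat dictionaries.

```python
import functools
import json
import xml.etree.ElementTree as ET


def to_xml(data, root_tag="response"):
    """Convert a flat dict to a simple XML string (nested values are not handled)."""
    root = ET.Element(root_tag)
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def serialized(handler):
    """Wrap a handler that returns a dict; emit JSON by default, XML on request."""

    @functools.wraps(handler)
    def wrapper(*args, fmt="json", **kwargs):
        data = handler(*args, **kwargs)
        if fmt == "json":
            return json.dumps(data)
        if fmt == "xml":
            return to_xml(data)
        raise ValueError(f"unsupported format: {fmt}")

    return wrapper


@serialized
def get_status(system):
    # Hypothetical handler; a real one would query the center's status service.
    return {"system": system, "status": "up"}


print(get_status("cluster-a"))
print(get_status("cluster-a", fmt="xml"))
```

Handlers stay written against plain dictionaries, so adding another output format later would only touch the decorator.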
We decided to put together a list of API calls in the three services represented by participants
Annette will set up a Google Doc with a big table
We plan to continue meetings on a two-week schedule
May 15, 2012
Election of chair - we decided to use the polling widget on the community group web site.
Discussion about process for developing an API spec - we'll start by looking at what already exists in the HPC community
Discussion about whether we have a broad enough membership - we'll reach out to DoD folks
Discussion about how we can make our web site more useful
March 7, 2012
This meeting consisted primarily of demos, where participants showed apps created with local APIs
- Rion Dooley: iPlant's RESTful API
- Joel Martinez, LLNL: Lorenz updates (dashboard, job management, and application portal)
- Jack DeSlippe: NERSC Mobile
Progress Reports
- localization of NEWT
- efforts at Sandia
- other groups who are represented
W3C Community Group
- How do we want to elect a chair? Vote at next meeting.
- Do we want to start working on some sort of standard? Yes.