W3C TAG telcon

13 Sep 2005


See also: IRC log


PaulStrong, Vincent, Norm, Ed, DanC, DOrchard
TimBL, HT, NM, Roy




<scribe> Scribe: Norm

<scribe> ScribeNick: Norm


Most of today is for GRID discussions

Accept minutes of last telcon: http://www.w3.org/2001/tag/2005/09/06-minutes.html

Accepted (Vincent will remove "DRAFT").

Discussion of GRID

Thanks to Paul Strong for joining us

This is an informal discussion of GRID and its connection to the Web

Paul: Paul Strong is a Systems Architect at Sun. Works in the N1 product group. N1 is a suite of products that leverage the GRID
... Grid is a somewhat ambiguous term being widely used by vendors
... Within N1, I've been working on products for about five years. Mostly working on data center and enterprise applications
... Recommends July issue of ACM Queue
... GRID is a view of the networking infrastructure
... It's a view of computing resources that are pervasive. It's more about the platform than the end-user applications

<DanC> (hm... http://www.sun.com/software/gridware/index.xml Sun N1 Grid Engine 6 ... seems to be a hunk of hardware. I thought maybe N1 was a service.)

Paul: GRID really is about recognizing two trends: growth in network bandwidth, and network distributed services
... GRID platform offers scalability, redundancy, ...
... Needs services for distributing and managing work loads
... Analogous to an electrical grid, in the sense that it's pervasive and more-or-less uniform

<DanC> (hmm... http://en.wikipedia.org/wiki/Grid_computing "The SETI@home project, launched in 1999, is a widely-known example of a simple grid computing project." )

DanC: Sun N1 Grid seems to be a hunk of hardware...

Paul: The N1 products are a mixture of both hardware and services
... Software is a meta-operating environment. Those products are called N1
... They're closely tied to a set of hardware to run them on at Sun. The result is an integrated set of components. You no longer care about individual servers or OS instances.

DanC: So if I buy a chunk of N1, do I get CPU hours or a box?

Paul: It depends on what you want: you can buy time on our GRID, or buy hardware and set up your own
... An example of a GRID application is SETI@Home
... The use of the term GRID was prevalent initially in scientific and academic community.
... In the commercial space, rendering and simulation applications
... The software that allows that workload to be distributed/managed/aggregated is the middleware: the integration layer that is the meta-operating environment

DanC: Is it a style of computing, or is it technical standards that you could interoperate with?

Paul: It's some of both

DanC: Does SETI@Home conform?

Paul: No, it predates them. The context is still being refined.
... There are a couple of consortia working on this: The Global Grid Forum
... There's The Enterprise GRID Alliance, focused on driving GRID adoption within enterprises

<DanC> (http://en.wikipedia.org/wiki/Grid_computing doesn't seem to mention The Enterprise GRID Alliance )

Paul: To get the GRID used in less compute-intensive environments

<DanC> Enterprise GRID Alliance

Paul: discusses benefits of GRID: ability to manage pools of resources; a mutable, dynamic space
... reiterates the goal of treating these things holistically...

<DanC> (EJB and J2... missed. hmm... I was starting to understand...)

Paul: workload management, mechanisms for monitoring, managing, controlling processes
... Users need to be able to combine a heterogeneous set of products and services together
... Standards are needed to allow each of these components to be managed.
... The term GRID has become very loaded.

[scribe lost thread]

There's lots of marketing in this space: managing complexity, providing agility, etc.

Paul: They're very similar, but they aren't identical. The GRID space is very confusing for many of the end-users and consumers.

Ed: GRID is a very broad term. It covers everything from SETI@Home to shared system resource pools that are more of a real-time virtual-machine type of thing

Paul: Yes, absolutely.
... One of the difficulties we have as an industry is articulating this
... It's going to take a long time to get to the end.
... A lot of the technologies we think about today in the GRID space do the mapping of workload onto resources
... There are also provisioning services
... What we're automating today is the provisioning processes, but that's just the beginning.

DanC: How is provisioning expensive?

Paul: Consider an electronic book store that has a web tier, a web service tier(?), and a database server tier
... There's a set of database servers running on particular Sun hardware with a particular OS
... The services layer might be BEA running on some particular Dell hardware
... Right now there isn't a standardized way to describe all these components
... Not only are the components complex, but there's a relationship with every other component already in the data center
... Today, people manage individual resources
... But those are increasing exponentially
... Because they don't trust management tools, each server is typically dedicated to a single function
... This leads to silos of services that perform single tasks
... This leads to waste and lack of agility
... It's very hard to track relationships between all the components

DanC: Are there any "GRID computing saves the day" stories?

Paul: There are stories that it's leading that way
... A lot of stuff is relatively static today. We have a tool that allows you to provision complete projects, like the bookstore
... It does all the work
... It typically pays for itself in six to twelve months. There are fewer unplanned outages because planned downtime is all automated
... It's more deterministic in production and is more reliable.
... The developers can create the model when they create the application. For provisioning, the test and QA engineers can test with a single button.

DanC: It has a little blinking light that says "you need a new database server"

Paul: Yep.

[Scribe hears something about ad hoc construction that seems at odds with the previous story..]

Paul: When load gets high, the provisioning application will attempt to reconfigure (scribe ?)
... Getting to the point where it all "just works" is going to take a long time. It's very easy to solve problems with regard to concrete things, but it's far more complicated when you're trying to model more abstract components (a tier of servers rather than a single server)

DanC: It's all proprietary things cobbled together, but Sun does have products in this space?

Paul: Yes. It's mapping workload onto resources with respect to policy.

<Ed> HP and IBM do as well. Unfortunately, they don't work together to create one grid, each has its own grid.

Paul: In the GRID world, we're talking about mapping services (a bookstore, SETI@home, etc.) onto a network of resources (servers, firewalls, etc.) with respect to policies
... The first things that get automated are the simple mechanisms.
... There will eventually be a move towards automating higher order problems, such as managing performance and availability.
... Today there are no single products that let you do all of those things
... Instead you get different products to manage different aspects of that. You get something that is more automated, but still has lots of human interaction
... Sun has products that fit into a number of those spaces, but none are integrated together as a whole meta-operating system. No one's products are.
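A minimal sketch of "mapping services onto a network of resources with respect to policies", as described above. Everything in it is an assumption made for illustration: the `Service` and `Resource` types, the abstract capacity units, and the policy itself (higher-priority services are placed first, on the resource with the most free capacity). It is not taken from N1 or from any GRID standard.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Resource:
    name: str
    capacity: int  # abstract capacity units (hypothetical)
    used: int = 0


@dataclass
class Service:
    name: str
    demand: int    # capacity units the service needs (hypothetical)
    priority: int  # policy input: higher priority is placed first


def place(services: List[Service], resources: List[Resource]) -> Dict[str, Optional[str]]:
    """Greedily map each service to a resource; None means it could not be placed."""
    placement: Dict[str, Optional[str]] = {}
    for svc in sorted(services, key=lambda s: -s.priority):
        # Only resources with enough free capacity are candidates.
        candidates = [r for r in resources if r.capacity - r.used >= svc.demand]
        if not candidates:
            placement[svc.name] = None
            continue
        # Assumed policy: prefer the resource with the most free capacity.
        target = max(candidates, key=lambda r: r.capacity - r.used)
        target.used += svc.demand
        placement[svc.name] = target.name
    return placement


pools = [Resource("pool-a", capacity=8), Resource("pool-b", capacity=4)]
workload = [
    Service("bookstore-web", demand=3, priority=2),
    Service("bookstore-db", demand=6, priority=3),
    Service("batch-render", demand=4, priority=1),
]
placement = place(workload, pools)
```

In this toy run the lowest-priority service is left unplaced once the pools fill up, which is the kind of decision that currently falls to a human operator.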

Vincent: What are the consortia doing today, what are the main standards under development?

Paul: Several things are needed
... A way of describing the requirements of the system

The Enterprise Grid Alliance is working on this sort of thing

Paul: And use cases based on that description
... We're working on a standard set of requirements that we can give to other standards organizations
... The Global Grid Forum is working on standards farther downstream
... A service-centric architectural view; the OGSA (Open Grid Services Architecture)
... Because GRID was originally driven by compute-intensive applications, they have a lot of those, but they're working on getting more broad
... A job control language is one example. How do I describe a work load, schedule it, monitor it, etc.
... As you approach the more concrete things, you want to standardize them too. That's where interaction with DMTF occurs.

DMTF = Distributed Management Task Force (www.dmtf.org)

They own the CIM standard (Common Information Model)

There's work to make some of these things more abstract as well (pools of servers instead of single servers)

Paul: There are OASIS GRID/WS standards under development as well
... You can look at GRID as the platform that is the network that is the web
... There are other standards in this space too (for storage, for example)

<Zakim> DanC, you wanted to ask if these enterprise grids have peers grids

DanC: Are enterprise grids mostly their own world, or do they have peers?
... Does my grid talk to other grids?

Paul: We define an enterprise grid as the set of components (from disks to CRM applications) managed by a single enterprise
... But each may have several data centers
... In some sense, they're isolated in terms of management, but they do interact with the Web.
... And one enterprise grid could interact with another (the bookstore grid interacting with the credit card company grid)

DanC: How will these two talk to each other?

Paul: The expectation is that we'd be using standard mechanisms for interaction
... But I as the bookstore owner may have expectations about the speed of service from the credit card company
... I may want to negotiate that quality of service.
... Possibly on a per-transaction basis.

If my customer is a real brick-and-mortar store ordering thousands of books, I may want a faster answer than for Joe Individual User.

Paul: We chose to bound the problem at a single enterprise because it makes authority and control simpler
... When you're working across enterprises, then you have federation rather than hierarchy
... GGF views its charter as everything grid, they see what EGA does as (an important) subset
... They care about viewing the internet as a set of computers controlled by different organizations but on which I could impose a virtual organization
... For example, automobile design is sometimes shared across companies because it's so expensive
... From the GGF perspective, a virtual GRID could be constructed between these companies
... Typically, the shared resources are segregated from the companies' own resources

Ed: It seems like because the GRID is undefined, a lot of work is hindered. If it's more along the lines of a distributed computing environment, then I can see where that comes into play. Is there progress on defining either striations or a clear definition of what GRID is?

Paul: In terms of the word GRID, no
... We're working on this in some sense in EGA by working on requirements. By being able to clearly enumerate and describe problems, we can guide GGF to work on a particular area.
... A big challenge is identifying the set of problems that people care about most and the boundary between the components we care about.

Paul describes a number of things that can be virtualized

Paul: Having a model for these components and the life cycle of those components is critical for the standards bodies to be able to do stuff that isn't unintentionally competitive

Ed: Right, and I guess that's why I think breaking the big problem down into smaller problems seems like something you'd want to do

Paul: GGF takes more of a "boil the ocean" perspective; EGA is about boiling enough water to make a cup of tea
... There is a working group called the SCRUM (scribe wonders about spelling) in GGF that's trying to look at these issues

<Zakim> DanC, you wanted to ask about job migration between, say, sun's and IBM's grid services

DanC: If Amazon rented time on the Sun N1 thingy and some IBM On Demand computing, is it feasible to migrate jobs across those?

Paul: It totally depends.
... There are certain classes of workflow where you can migrate the work today. In a batchable system, you could move them around in stages.
... Rendering would be a good example. I've got 20,000 jobs, I can send 10,000 to each. 3,000 fail on one system so I can migrate them to the other.
... If you have shared infrastructure, you can migrate between transactions
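The rendering example above (20,000 jobs split across two grids, with failures migrated to the other grid) can be sketched as follows. This is only an illustration under stated assumptions: the grid names, the `run_batch` helper, and the failure model are hypothetical, and real grid schedulers expose nothing this simple.

```python
from typing import Callable, Dict, List


def run_batch(
    jobs: List[int],
    grids: List[str],
    succeeds: Callable[[str, int], bool],
) -> Dict[int, str]:
    """Split independent jobs across grids and migrate failures to the next grid.

    Returns a mapping from job id to the grid that completed it. Jobs that
    fail on every grid are left out of the result.
    """
    completed: Dict[int, str] = {}
    # Initial round-robin assignment gives each grid an equal share.
    pending = [(job, i % len(grids)) for i, job in enumerate(jobs)]
    # Each job gets at most one attempt per grid.
    for _ in range(len(grids)):
        failed = []
        for job, g in pending:
            if succeeds(grids[g], job):
                completed[job] = grids[g]
            else:
                # Migrate the failed job to the next grid in the list.
                failed.append((job, (g + 1) % len(grids)))
        pending = failed
        if not pending:
            break
    return completed


# Hypothetical failure model: the first 3,000 jobs sent to "grid-a" fail there.
def demo_succeeds(grid: str, job: int) -> bool:
    return not (grid == "grid-a" and job < 6000)


result = run_batch(list(range(20000)), ["grid-a", "grid-b"], demo_succeeds)
on_a = sum(1 for g in result.values() if g == "grid-a")
on_b = sum(1 for g in result.values() if g == "grid-b")
```

The sketch only works because the jobs are independent and batchable, which is the class of workload Paul says can be migrated today.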

DanC: Across the Sun/IBM boundary?

Paul: Technically, yes.
... Right now a lot of this is really proprietary. It'll become easier after the standards are written.
... People are mainly looking at whole data centers or whole enterprises at the moment.

Vincent: Is there anything important that you feel wasn't addressed?

Paul: I'm not really sure.

Paul recommends ACM Queue Magazine again

Most of the articles will be online soon.


TAG thanks Paul for a great overview.

Vincent: Thanks also to Norm for organizing Sun's participation

Norm: Thanks again, Paul

Edinburgh Face-to-Face

Draft agenda: http://www.w3.org/2001/tag/2005/09/20-agenda.html

Vincent: Some time for issue status, then time for four or five issues to discuss.
... Return to the discussion of new directions.

<Zakim> DanC, you wanted to ask for abstractComponentRefs-37 on the ftf agenda, maybe

DanC feels more prepared to talk about abstractComponentRefs-37

Vincent: Try to review the draft agenda over the next day or so and send feedback so it can be updated before the f2f.
... Any other business?

Next meeting is the f2f on 20 Sep in Edinburgh


Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.127 (CVS log)
$Date: 2005/09/20 08:33:34 $