Overview of Grid Computing
Web Services is starting to cede its pre-eminent position in IT hyperbuzz
to Grid Computing. As always, the reason is the same: the major vendors
are staking out positions on a technology frontier which they believe is
going to unlock the next great killer app or lift one or several of
their product lines out of the doldrums. Cisco is the early adopter,
followed by HP, IBM, Sun and a brace of start ups promising On Demand or
Grid Computing based on the Utility Model. For example, IBM has been
flaunting the US Open website as a perfect example of On Demand
computing: it ramps up in late August for 100 times the demand it will
see throughout the rest of the year. Oracle has released a new version
of its database, Oracle 10g, which is geared to take advantage of on
demand, grid computing.
Make no mistake, there is real technology being developed in the
trenches; but how well it can or should be expected to meet current IT
needs and requirements is an open question. Grid computing has been
hatching for a long time, since the days of fail-safe and time-sharing
processing in the late 1960s and early 1970s; MIT's Multics was designed
to be a computing utility service. The basic idea of sharing processing
loads among several computing systems has long been a target of many IT
shops and vendors. The trick is to match transiently unused or
under-exploited computing power and resources with scattered requesters
of computing over a network or campus of systems. Grid computing has
been done successfully, notably in research institutions, with projects
such as MMS and Neph for weather simulation or Cactus and Tardis for
astrophysics (see Globus website reference).
And commercially successful systems can be found, from the Sun GridEngine
clusters used to design and streamline McLaren racing cars for every F1
race to the Pratt & Whitney grid that helps design aircraft engines.
These successes reflect the fact that vast improvements in hardware are
finally being matched to and managed by better and more standardized
distributed processing software.
Web Services and distributed software for dynamic sharing and recovery
tasks, among others, have helped drive and perfect new grid software. On
many large computational projects such as genome identification, weather
pattern recognition, or complex molecular modeling, marshaling spare
computing resources is the only way tasks can hope to get done. So many
major vendors, recognizing the favorable convergence of resources and
interests, are racing to develop positions in the fast emerging field.
For example, during the summer of 2002 HP and IDC conducted a series of
seminars entitled "Building Grids: Hype Meets Reality" whose theme was
that medium to large scale businesses can do grid computing now. The 451
Group has recently released a 215 page report arguing that the next 18
months will see the inflection point for the growth of grid computing.
The result is that the emergence of grid computing may also be
attracting its unfair share of hype (see articles references).
Grid computing can be divided into four main categories:
1) local clusters, which manage many resources and many tasks but for
only one system/project and on one network;
2) campus grids, which typically add many user systems or projects to the
mix and usually many clusters of computing power to be managed. The
tasks may also be spread over a wider network, but usually within the
same firewall;
3) Net taskers, which are single tasks like the SETI or FightAids
projects that access worldwide Internet accessible resources and marshal
them to their individual projects; and
4) global grids, which have many tasks/projects plus owners, again
accessing and utilizing shared resources anywhere on the Net and/or other
special access networks.
Within the last 5 years the huge amount of research and development done
by universities and research institutes in sharing and effectively
utilizing HPC (High Performance Computing) resources has advanced the
state of the art of Grid computing to such a degree that many vendors
(see references for our extensive list of system vendors) are
commercializing these applications. Several things favor them. First,
projects in local clusters and campus grids are well served by software
from a number of vendors, from the small Info Designs DeskGrid to
Entropia, Platform Computing, Sun and others (see our list of software
vendors in the references). Second, Grid computing is well served by the
new Web Services technologies. Third, a relatively long history of
application in university settings will provide the needed consulting
expertise to launch these projects. So look for a fairly rapid take off
of Grid computing.
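The four categories described earlier in this section can be read as the
answers to two questions: does the grid serve one project or many, and do
its resources stay inside one firewall or reach across the Net? The short
Python sketch below encodes that reading. The names GridKind and
classify_grid, and the reduction to two yes/no questions, are
illustrative assumptions and not part of any grid product or standard.

```python
from enum import Enum


class GridKind(Enum):
    LOCAL_CLUSTER = "local cluster"
    CAMPUS_GRID = "campus grid"
    NET_TASKER = "Net tasker"
    GLOBAL_GRID = "global grid"


def classify_grid(many_projects: bool, crosses_firewall: bool) -> GridKind:
    """Map two simplified yes/no questions onto the four categories."""
    if crosses_firewall:
        # Resources are drawn from the Internet at large.
        return GridKind.GLOBAL_GRID if many_projects else GridKind.NET_TASKER
    # Resources stay on one network or inside one firewall.
    return GridKind.CAMPUS_GRID if many_projects else GridKind.LOCAL_CLUSTER


if __name__ == "__main__":
    # A SETI-style effort: one project, machines anywhere on the Net.
    print(classify_grid(many_projects=False, crosses_firewall=True).value)
```

Real deployments blur these lines, for instance a single project spread
over several networks inside one firewall, so treat the two questions as
a rough guide only.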
Resources
There are a couple of technologies very close to Grid computing which
derive from or share its technologies: autonomic processing and P2P
(Peer to Peer) processing. Autonomic processing is about enabling
programs to be more resilient and self-sufficient in their operation.
Autonomic computing enhances the desktop OS, client programs and server
systems so that they can be more self-optimizing in performance,
self-configuring at start up, self-recovering/healing and
self-protecting against outside attack. Some parts of these functions
are already available in various server and desktop client operating
systems. What grid computing in conjunction with Autonomic computing is
doing is helping to establish common protocols and standards for these
important tasks. Fortunately, the Grid and Autonomic developers have a
strong research and development track record in the university and
institutional arena to draw upon (see www.gridcomputing.com reference).
Peer to Peer processing is about one or more clients on the Internet
sharing primarily data, but also processing, with minimal intermediate
server intervention. Again, the discovery and brokering of services used
in P2P are influencing Grid computing software.
As one might imagine given the nature and rate of change of the
technology, the books and articles on the topic fall into two
categories: 1) buried in the journals of academia, and 2) to be found in
books and white papers but with either a proprietary bias or slightly
out of date. The latter is true of two of the best books on Grid
computing. Foster's The Grid is from April of 1998 but, despite the
rapid advancements since, still manages to collect together articles on
the core issues and opportunities in Grid computing. Pfister's book is
from the same time frame, but its emphasis on the detailed issues of
hardware clustering and sharing stands the test of time, as contemporary
readers give it 5 star ratings. In contrast, the articles on Grid
computing to be found in the trade press are primarily news items and
status reports on who is leading the race to dominate the market. There
is also a website, www.gridcomputingplanet.com, devoted to news, reviews,
and other links; but this reviewer prefers the links only site
www.gridcomputing.com for the best pointers on where on the Grid to go.
Do It Yourself
The best way to get first hand experience with Grid Computing is to do it
yourself. There are a number of P2P grids that users can participate in
by loaning spare CPU minutes over the Net to various causes. Perhaps the
most famous is SETI@home, which is using spare CPU cycles throughout the
world to help analyze extraterrestrial electromagnetic signals for signs
of intelligence. Other Grid projects that you can lend CPU time to
include fighting AIDS, doing genome research, participating in stock
market forecasting using neural networks, or finding the 5 largest prime
numbers. So this is an opportunity to test how unobtrusive sharing CPU
cycles really is. And users can even set up their own Grids by
downloading software such as the small, Windows only InfoDesign's
DeskGrid, some of the gaming experiences to be found at Butterfly.net, or
the full-scale, yet essentially free Sun GridEngine (the complete source
code and copious documentation are available at gridengine.sunsource.net),
which works on Linux and Solaris. As well, MindElectric and Entropia have
downloads available for use on Windows or Mac machines.
What we found very useful at both the commercial and P2P sites is the
strong community available for help and assistance. This is helpful for
setting up clients and even more so when setting up a Grid server. Sun's
documentation and community support for GridEngine were very impressive;
but support for Windows clients requires users to get special add-on
software. In contrast, Entropia and MindElectric make less of their
software innards and documentation available but do support Windows
machines quite easily in their clusters. Finally, we appreciated being
able to share with MoneyBee, where we got some direct benefit from
accessing the stock market forecasting results to which we had
contributed a small part.
Ubero offers a mix of fee or free grid computing projects that users can
sign onto, thus giving individuals and organizations a small ROI choice
in how they share their excess CPU power. Of course a lot of major
vendors like Intel and IBM see this as an opportunity; however, the
global Grid software is still not up to the task of transparently
sharing/trading resources among users who may also be suppliers or
competitors at other times.
Generally, Grid software draws on a wide set of client resources, from
supercomputers through midrange servers to workstations and PC desktops.
This is the challenge of Grid distribution and scheduling software: to
find and allocate tasks among a myriad of heterogeneous machines with
varying capabilities, and to do so while handling the inevitable
exception and failover conditions. Thus the attraction of do-it-yourself
is being able to try the software first as a client and then as a Grid
server in order to get a front line feel for how well Grid software
handles these tasks. And as a Grid server, one can graduate step by step
from local clusters to global grid. It is helpful that Grid computing
has so many entry points.
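To make the scheduling challenge concrete, here is a minimal Python
sketch of allocating tasks among machines of different speeds while
coping with a node that fails. It is a toy simulation under stated
assumptions, not how GridEngine or any other product named here works:
the Machine class, the run_on and schedule functions, the "speed" figure
and the greedy earliest-finish rule are all hypothetical choices made for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    speed: float              # hypothetical relative capability (work units per second)
    healthy: bool = True      # False simulates a node that fails when used
    busy_until: float = 0.0   # simulated time at which the machine is free again


def run_on(machine: Machine, work: float) -> float:
    """Pretend to run a task; raise if the machine turns out to be down."""
    if not machine.healthy:
        raise RuntimeError(f"{machine.name} is not responding")
    machine.busy_until += work / machine.speed
    return machine.busy_until


def schedule(tasks: dict[str, float], machines: list[Machine]) -> dict[str, str]:
    """Greedy placement: largest task first, each on the machine that would
    finish it earliest; a failed machine is dropped and the task retried."""
    available = list(machines)
    placement: dict[str, str] = {}
    for task, work in sorted(tasks.items(), key=lambda kv: kv[1], reverse=True):
        while True:
            if not available:
                raise RuntimeError(f"no machine left to run {task}")
            best = min(available, key=lambda m: m.busy_until + work / m.speed)
            try:
                run_on(best, work)
            except RuntimeError:
                available.remove(best)  # failover: stop using the dead node
                continue
            placement[task] = best.name
            break
    return placement


if __name__ == "__main__":
    pool = [
        Machine("supercomputer", speed=10.0),
        Machine("server", speed=4.0, healthy=False),
        Machine("desktop", speed=1.0),
    ]
    print(schedule({"model": 100.0, "render": 40.0, "report": 10.0}, pool))
```

Even this toy shows why heterogeneity matters: a small job can be better
off on an idle desktop than queued behind larger work on the fastest
machine, and a failure has to be detected and routed around without
losing the task.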
However, a grid sits at the ultimate point of optimization in an
organization's systems. First, a grid presumes there is an adequate set
of hardware and software systems in place meeting the frontline
operational and then the planning needs of the organization. Then a grid
also relies on very high standards of security, reliability and
interoperability in those same systems. Finally, a grid will test the
savvy of both the operations and development staff, who must be able to
customize systems to meet specific grid-based requirements.
In effect, a grid environment and an on demand, utility-like computing
model is where many organizations ultimately want to be, but without the
disastrous, tightly linked, blackout liabilities. And for some
disciplined organizations such goals are reachable. However, the current
reality is several steps away from the necessary and sufficient
conditions. Microsoft will have to deliver substantially more on its
trustworthy, high reliability, and interoperability initiatives.
Database and middleware vendors likewise will have interoperability and
reliability gaps to close, while hardware and network vendors will have
to deliver easily installed and robust security mechanisms. In short,
for many organizations Grid and On Demand computing will be a goal, an
ideal to be attained.
Summary.
During the recent pullback in IT investments, companies are looking hard
at how to get economies in their IT function. And there, sitting on
their desktops and networks, is an enormous resource of idle CPU time
waiting to be tapped and harnessed for more productive return. Some
companies, like Ford, Pratt & Whitney and Nortel, are taking advantage of
this spare capacity, using grid software to do everything from modeling
and simulations to bread and butter compute-bound engineering and market
analysis. But many are doing so quietly, generally keeping mum about
what goes on after hours. And why not? Grid computing offers not only
real and substantial cost savings but also competitive advantage, as it
is a solid development testbed for emerging Web Services, Autonomic
computing and other potentially high pay off computing strategies.
Do nothing CIOs may see a hybrid adage: "idle CPU minds means having the
devil to pay later".
References:
Articles:
Grid Computing by Mitchell Waldrop, MIT Technology Review, May 2002 -
provides a good overview of the major commercial and academic players in
Grid computing
Books:
The Grid: Blueprint for a New Computing Infrastructure by Ian Foster,
Morgan Kaufmann 1998 - has remarkable timeliness defining the nature and
issues of grid computing
In Search of Clusters (2nd Edition) by Gregory F. Pfister, Prentice Hall
1998 - discusses core problems in cluster and grid computing in clear
terms
Grid Computing: A Practical Guide by Ahmar Abbas, Charles River Media,
November 2003 - recent book on grids
Do-it-yourself:
SETI - first of the global grid projects; UC Berkeley looking for signs
of intelligence in radio signals from space
MoneyBee - ever wondered if AI could give you an advantage in predicting
the stock markets? Lend some CPU time and get some answers at MoneyBee
DaliWorld - shows where in the world fish you have "adopted" have
traveled
FightAids - Grid project devoted to fighting AIDS through drug analyses
passed on to the Scripps Research Institute
GIMPS - project trying to find the 5 largest prime numbers in the world
COSM - Stanford needs help in simulating the dynamics of protein folding
in gene research
Sun's GridEngine - get the full Sun GridEngine 5.3 and set up your own
local grid to do compute sharing
Vendors:
Altair - has OpenPBS (Open Portable Batch System), a good test bed for
batched grid apps (Photoshop users take note).
Avaki - has campus and global grid solutions centering on enterprise
information integration.
Butterfly.net - the ultimate
gaming experience powered by clusters of PCs and then only one.
Centrata - data center management software that performs grid-like site
scheduling, backup and recovery functions.
Data Synapse - does clustering with emphasis on ease of conversion of
legacy apps to Grid processing.
Ejasent - provides software to help set up and do policy-based
application control for On Demand computing.
Enigmatec - delivers a self managing grid computing platform.
Entropia - enables local clustering of networks of PCs for non-disruptive
CPU sharing.
Frontier - prepares Java programs for being distributed over a local
cluster for processing.
Gridfrastructure - provides local and campus Grid infrastructure tools
for managing the grid.
Grid Systems - provides local and campus grid processing with specialized
templates for converting tasks to the grid.
Gridiron Software - tools that make it very simple to add parallel
distributed processing to your tasks.
Oracle - had done quite a lot prior to Oracle 10g to enable grid
computing in its products.
Platform Computing - has grid solutions resold by IBM and HP.
Powerllel - provides tools to parallelize applications for use in MPP and
Grid computing.
Sun - has cluster, campus, and global grid solutions using heterogeneous
platforms and OpenSource standards.
Symbiant - does consulting and software in the P2P and Grid Computing
world.
TheMindelectric - Gaia is P2P/Web Services/Grid computing software that
automatically manages tough load balancing, clustering and failover
tasks.
United Devices - MetaProcessor is campus wide grid software.
Standards:
Grid Forum - grid standards making group active in such areas as
architectures, security, P2P processes, scheduling and resource
management, etc.
Distributed Resource Management Application API (DRMAA, pronounced
"drama") - Grid Forum standard supported by IBM, Intel, Platform
Computing, Sun
Globus - The Globus Project's OGSA (Open Grid Services Architecture) will
add Web services and appear in Globus Toolkit 3.0 with other Grid
computing software.
Websites:
www.dsonline.computer.org - IEEE distributed systems online site has info
on all aspects of distributed processing
Gridcomputing.com - all the info on university, research institute work
in Grid computing with scores of solid links
Gridcomputingplanet.com - site devoted to grid computing news, resources,
and commercialization
Globus.org - global Grid applications like MMS and Neph in weather,
Tardis in astrophysics, etc.
Links on the Grid - a very good set of links to all things Grid