Messaging and the Grid, the Perfect Marriage
Chances are high that your organization was one of the many that went through the Enterprise Application Integration (EAI) phase (some would say craze) of the 1990s. Messaging middleware was a primary component of the EAI project stack and was available from a myriad of commercial vendors and open-source projects.
In this article, you look at features of messaging middleware that apply to Grid computing and learn how marrying the two benefits the end user.
There are multiple messaging middleware vendors, ranging from commercial (such as TIBCO, Tervela, and IBM) to open-source projects (the Apache project alone has at least two messaging middleware projects). As such, the middleware domain has evolved into a commodity market, with a large number of vendors offering similar products. Unless you are looking for unique attributes, for example a very large number of messages per second (as in Tervela), you can successfully use one of the many free and open-source middleware frameworks available.
Messaging decouples the various components that need to communicate with each other. This is exactly what you want in a distributed computing environment. In the good old days, CORBA achieved some of this decoupling. However, CORBA had many drawbacks, which led many organizations to eventually replace it with a messaging-based model.
Messaging enables communication based on topics or subjects. A publisher sends messages to a topic. A subscriber registers its interest in a topic and receives any message that is published to it. The concept is very simple, yet powerful. By definition, a messaging solution must be dynamic and able to handle changes in the infrastructure: a new subscription can be added at any time, and a publisher may start publishing to a new topic or other resource. Furthermore, a messaging solution is expected to scale to a very large number of clients and topics. These are some of the reasons that messaging was a crucial component of application integration (EAI), and remains one to this day.
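To make the topic model concrete, here is a minimal in-process sketch in Python. It is not the API of any of the products named above: `TopicBroker`, its methods, and the topic names are hypothetical, and a real middleware product would add network transport, persistence, and delivery guarantees. The sketch only shows that publishers and subscribers know the topic name and nothing about each other, and that a subscription can be added at any time.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the topic name and the message body.
Handler = Callable[[str, str], None]


class TopicBroker:
    """Hypothetical in-process broker that routes messages by topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Subscriptions can be added at any time, even after publishing began.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        # Deliver to every current subscriber; return how many were notified.
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, message)
        return len(handlers)


broker = TopicBroker()
received: List[str] = []

# Subscriber A registers before anything is published.
broker.subscribe("grid.jobs", lambda topic, msg: received.append(f"A:{msg}"))
delivered_first = broker.publish("grid.jobs", "job-1")

# Subscriber B joins later and only sees messages published from then on.
broker.subscribe("grid.jobs", lambda topic, msg: received.append(f"B:{msg}"))
delivered_second = broker.publish("grid.jobs", "job-2")

# Publishing to a topic with no subscribers is not an error.
delivered_none = broker.publish("grid.status", "idle")
```

In this sketch the publisher never holds a reference to a subscriber, which is the decoupling described above; the late subscriber misses `job-1` only because this toy broker does not store messages.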
Grid and Cluster Computing
A Grid is composed of many nodes that are heterogeneous in nature: different types or versions of the OS, different processors, different configurations, and so forth. Any Grid vendor that wants to survive in the Grid market must provide a framework that is adaptable to the needs of the end user and that supports system heterogeneity. In contrast, the cluster model is homogeneous in nature: a number of nodes that are virtually the same and are in very close proximity to each other (the same datacenter). One would expect that a vendor that can support one model, for instance the Grid model, could support the cluster model as well. Surprisingly, the reality is quite the contrary: the dominant cluster and Grid vendors have been unable to "cross-pollinate." Table 1 summarizes the differences between the two models.
Table 1: Grid vs. Cluster
| Grid | Cluster | Optimal Scenario |
|---|---|---|
| Heterogeneous environment | Homogeneous environment | Heterogeneous environment |
| Globally installed | Close proximity | Global install |
| Resources come and go at any time | Resource pool is constant | Dynamic pool of resources |
| Tasks are typically longer running (10 sec and above) | Tasks are typically very short, even in milliseconds | Short and long running tasks |
| Communication overhead is acceptable | Communication overhead is minimized | Minimal communication overhead |
| Coarse-grained sharing of resources or Grid hugging# | Resources are shared and accessed by all | Resources are shared and accessed by all |
# Because sharing of resources is not seamless, you still get into the mode of groups that have "their own" Grid installations.
Reviewing the characteristics above, one can make a case that a cluster is a subset of Grid computing.
Figure 1: Problem space solved by Grid and HPC
Furthermore, as summarized in the Optimal Scenario column of Table 1, if you can combine the features of the two models, you benefit from both the scalability of the Grid model and the performance of the cluster. The idea is to lower the cost of building a compute backbone that is responsive, scalable, and able to handle whatever type of workload is thrown at it.
Combination of Grid and Messaging
As noted above, the inherent benefits of the Grid model form a strong foundation for infrastructure development. You can further exploit the strength of this model by incorporating some of the fundamental attributes of messaging middleware. Why messaging? After all, messaging in its pure form is already part of any Grid framework. More specifically, messaging middleware can serve to minimize communication overhead and, in turn, provide a model that supports both short and long running tasks in a given Grid environment.
Figure 2: Messaging as it relates to Grid and HPC
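The combination described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular Grid product works: a standard-library `queue.Queue` stands in for the messaging middleware, threads stand in for Grid nodes, and the names `worker`, `run_grid`, and the task identifiers are hypothetical. Tasks are posted to a shared queue, and nodes that join at any time pull short or long tasks from it, so the submitter never needs to know how many nodes exist.

```python
import queue
import threading
import time
from typing import Dict, List, Tuple

Task = Tuple[str, float]  # (task id, simulated run time in seconds)


def worker(name: str, tasks: queue.Queue, results: Dict[str, str],
           lock: threading.Lock) -> None:
    """Hypothetical Grid node: pulls tasks until it receives a stop signal."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel: no more work for this node
            tasks.task_done()
            return
        task_id, seconds = task
        time.sleep(seconds)  # stand-in for real work, short or long
        with lock:
            results[task_id] = name  # record which node ran the task
        tasks.task_done()


def run_grid(task_list: List[Task], initial_workers: int = 1,
             late_workers: int = 1) -> Dict[str, str]:
    tasks: queue.Queue = queue.Queue()  # stands in for a message queue
    results: Dict[str, str] = {}
    lock = threading.Lock()
    threads: List[threading.Thread] = []

    def add_worker() -> None:
        node = threading.Thread(
            target=worker,
            args=(f"node-{len(threads)}", tasks, results, lock),
        )
        node.start()
        threads.append(node)

    for _ in range(initial_workers):
        add_worker()
    for task in task_list:  # the submitter only knows about the queue
        tasks.put(task)
    for _ in range(late_workers):  # resources may join after work started
        add_worker()

    tasks.join()  # wait until every submitted task has been processed
    for _ in threads:
        tasks.put(None)  # one stop signal per node
    for node in threads:
        node.join()
    return results


completed = run_grid([("short-1", 0.001), ("long-1", 0.05), ("short-2", 0.001)])
```

Which node runs which task depends on timing, so the sketch makes no claim about the assignment; the point is that a long task occupying one node does not block short tasks, because any free node can take the next message.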