Chances are high that your organization was one of the many that went through the Enterprise Application Integration (EAI) phase (some would say craze) of the 1990s. Messaging middleware was a primary component of the EAI project stack and was available from a myriad of commercial vendors and open-source projects.
In this article, you look at features of messaging middleware that apply to Grid computing and learn how marrying the two benefits the end user.
Messaging Middleware
There are multiple messaging middleware vendors, ranging from commercial (such as TIBCO, Tervela, and IBM) to open-source (the Apache project alone hosts at least two messaging middleware projects). As a result, the middleware domain has evolved into a commodity market, with a large number of vendors offering similar products. Unless you are looking for unique attributes, for example, a very large number of messages per second (as in Tervela), you can successfully use one of the many free and open-source middleware frameworks available.
Messaging decouples the various components that need to communicate with each other. This is exactly what you desire in a distributed computing environment. In the good old days, CORBA achieved some of this decoupling. However, CORBA had many drawbacks, which led many organizations to eventually replace it with a messaging-based model.
Messaging enables communication based on topics or subjects. A publisher sends messages to a topic. A subscriber subscribes to a topic and receives any message published to that topic. Messaging is a very simple, yet enabling and powerful, concept. By definition, a messaging solution must be dynamic and able to handle changes in the infrastructure. A new subscription can be added at any time; a publisher may start publishing to a new topic or other resource. Furthermore, a messaging solution is expected to be scalable and to handle a very large number of clients and topics. These are some of the reasons that messaging was a crucial component of application integration (EAI), and remains one to this very day.
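To make the topic model concrete, here is a minimal sketch of publish/subscribe in Python. It is an in-process toy, not the API of any vendor named above; the `Broker` class, its method names, and the topic names are all assumptions made for illustration. It shows that publishers and subscribers know only the topic, never each other, and that a subscription can be added at any time.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the topic name and the message body.
Handler = Callable[[str, str], None]


class Broker:
    """Hypothetical in-process broker: routes each message to the
    subscribers currently registered for its topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Subscriptions can be added at any time, even after publishing began.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        # The publisher addresses a topic only; it has no reference to receivers.
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, message)
        return len(handlers)  # number of deliveries made


broker = Broker()
received = []

broker.subscribe("grid.tasks", lambda topic, msg: received.append(("worker-1", msg)))

# Nobody has subscribed to this topic yet, so nothing is delivered.
early = broker.publish("grid.results", "nobody is listening yet")

# A new subscriber joins later; no publisher has to change.
broker.subscribe("grid.results", lambda topic, msg: received.append(("collector", msg)))

broker.publish("grid.tasks", "price portfolio 42")
broker.publish("grid.results", "portfolio 42 done")
```

A production messaging product adds what this sketch leaves out: network transport, delivery guarantees, and scaling to a very large number of clients and topics.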
Grid and Cluster Computing
Grid computing is composed of many nodes that are heterogeneous in nature: different types or versions of the OS, different processors, configurations, and so forth. Any Grid vendor that wants to survive in the Grid market must provide a framework that adapts to the needs of the end user and supports system heterogeneity. In contrast, the cluster model is homogeneous in nature: a number of nodes that are virtually the same and are in very close proximity to each other (same datacenter). One would expect that if a vendor can support one model, for instance, the Grid model, it could support the cluster model as well. Surprisingly, the reality is quite the contrary: the dominant cluster and Grid vendors have been unable to “cross-pollinate.” Look at Table 1 for a summary of differences between the two models.
Table 1: Grid vs. Cluster
Grid | Cluster | Optimal Scenario |
---|---|---|
Heterogeneous Environment | Homogeneous Environment | Heterogeneous Environment |
Globally installed | Close proximity | Global install |
Resources come and go any time | Resource pool is constant | Dynamic pool of resources |
Tasks are typically longer running (10 sec and above) | Tasks are typically very short, even in milliseconds | Short and Long running tasks |
Communication overhead is acceptable | Communication overhead is minimized | Minimal communication overhead |
Coarse-grained sharing of resources or Grid hugging# | Resources are shared and accessed by all | Resources are shared and accessed by all |
# Because sharing of resources is not seamless, you still get into the mode of groups that have “their own” Grid installations.
Reviewing the characteristics above, one can make the case that a cluster is a subset of Grid computing.
Figure 1: Problem space solved by Grid and HPC
Furthermore, as summarized in Table 1 (the Optimal Scenario column), if you can combine the features of the two models, you could benefit from the scalability of the Grid model and the performance of the cluster. The idea is to lower the cost of building a compute backbone that is responsive, scalable, and able to handle whatever type of workload is thrown at it.
Combination of Grid and Messaging
As noted above, the inherent benefits of a Grid model form a strong foundation for infrastructure development. You can further exploit the strength of this model by incorporating some of the fundamental attributes of Messaging Middleware. Why messaging? After all, messaging in its pure form is part of any Grid framework. More specifically, messaging can serve to minimize communication overhead and, in turn, provide a model that supports both short- and long-running tasks in a given Grid environment.
Figure 2: Messaging as it relates to Grid and HPC
As mentioned previously, pub/sub can be found in many organizations as part of their EAI strategy. The proposal here is to use that existing infrastructure and expand it to encompass the Grid (see Figure 3). Even though HPC is a small portion of this problem space, it is considered to be the most challenging one.
Figure 3: Problem space applicable to messaging and the rest of the computing realm
The xFactor Factor
In the previous sections, you have examined the relevant attributes of architectural models, such as Grid and Cluster, as well as those of Messaging Middleware. You have also examined a hypothetical (Optimal Scenario) model that combines the best attributes of each, thus forming a hybrid environment. Until now, this model has been strictly hypothetical, with no commercially available offering. The xFactor software offering has taken the hypothetical model and turned it into reality.
xFactor provides enterprise-class solutions to the High Performance Computing (HPC) domain, with the goal of optimizing the client’s datacenter investments. SoftModule’s xFactor Grid management software package provides organizations with a method to optimally run and manage compute-intensive applications across thousands of CPUs. By leveraging xFactor’s distributed architecture and dynamic resource allocation techniques, clients can achieve dramatic improvements in application performance and resource utilization.
To best appreciate the xFactor framework, it’s helpful to characterize the architecture via the software stack as depicted in Figure 4. This stack resolves the environment into three logical functional planes, with all three “glued” together via messaging.
- Control Plane
- Data Plane
- Compute Plane
Figure 4: The xFactor software stack
The Control Plane is the infrastructure itself: the messaging installation along with the network elements. The network elements are included because some messaging vendors make heavy use of underlying elements such as routers and switches to increase performance.
The Data Plane is where the data exists; this layer could also be referred to as the Data Grid. Messages carry the data and provide data transparency to the Compute Plane. This model allows the messaging applications to go untouched and work alongside the Grid-enabled applications. A pool of resources (whether desktops, workstations, blades, servers, or even network resources) is shared amongst all applications. In a sense, messaging applications take precedence in a mixed environment, and the Compute Plane takes advantage of whatever unused and excess resources remain. Furthermore:
- Messaging applications can be “upgraded” or migrated to the Compute Plane by Grid-enablement
- Grid-enabled applications take advantage of the services provided by the Data Plane and the Messaging Infrastructure to ensure SLAs are met
- Resources are shared across the enterprise, and not just within the “Grid group”
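The sharing policy described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not xFactor's actual allocator: the `SharedPool` class, its method names, the application names, and the choice to reclaim CPUs from compute work when a messaging application needs them are all hypothetical. The article says only that messaging applications take precedence and that the Compute Plane uses what is left over.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SharedPool:
    """Hypothetical enterprise-wide CPU pool shared by both kinds of application."""

    total_cpus: int
    messaging: Dict[str, int] = field(default_factory=dict)  # app -> CPUs held
    compute: Dict[str, int] = field(default_factory=dict)    # app -> CPUs held

    def free(self) -> int:
        used = sum(self.messaging.values()) + sum(self.compute.values())
        return self.total_cpus - used

    def request_compute(self, app: str, cpus: int) -> int:
        # Compute Plane work only ever receives CPUs that are currently unused.
        granted = min(cpus, self.free())
        if granted > 0:
            self.compute[app] = self.compute.get(app, 0) + granted
        return granted

    def request_messaging(self, app: str, cpus: int) -> int:
        # Assumption: precedence is modeled as reclaiming CPUs from compute work.
        shortfall = cpus - self.free()
        for name in list(self.compute):
            if shortfall <= 0:
                break
            reclaimed = min(self.compute[name], shortfall)
            self.compute[name] -= reclaimed
            shortfall -= reclaimed
            if self.compute[name] == 0:
                del self.compute[name]
        granted = min(cpus, self.free())
        if granted > 0:
            self.messaging[app] = self.messaging.get(app, 0) + granted
        return granted


pool = SharedPool(total_cpus=8)
feed_cpus = pool.request_messaging("market-data-feed", 3)  # pool is idle, so all 3 are granted
batch_cpus = pool.request_compute("risk-batch", 10)        # asks for 10, gets only the 5 left unused
router_cpus = pool.request_messaging("order-router", 2)    # reclaims 2 CPUs from risk-batch
```

After the last call, `risk-batch` is left holding three CPUs: the messaging applications were served first, and the compute work absorbed the remainder.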
Conclusion
Messaging middleware frameworks have been around for a number of years, and have penetrated EAI applications with great success. Grid and Cluster installations have grown considerably in recent years, but some of the fundamental issues related to performance and scalability have gone unanswered. As noted in Table 1, there are fundamental differences between the two approaches and, consequently, an opportunity to evolve toward an optimal hybrid model. The xFactor product has taken a radically new approach to the environment. Instead of building an infrastructure from the ground up, it adds a layer to the existing infrastructure, making migration, integration, and administration that much simpler. Scalability and performance are decoupled from the Compute Plane and rest on the shoulders of the Messaging Plane. If the Messaging Plane is not able to meet expectations, it can be replaced with higher-performance infrastructure while keeping the top layers intact.
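The claim that the messaging layer can be replaced while the top layers stay intact comes down to coding the upper layers against an interface. The following Python sketch illustrates that idea only; `MessagingPlane`, `InProcessPlane`, `CountingPlane`, and `run_job` are hypothetical names, and the "replacement" here merely counts traffic rather than being faster.

```python
from typing import Callable, Dict, List, Protocol


class MessagingPlane(Protocol):
    """Hypothetical interface: the only thing the upper layers depend on."""

    def send(self, topic: str, payload: str) -> None: ...

    def on_message(self, topic: str, handler: Callable[[str], None]) -> None: ...


class InProcessPlane:
    """Simplest possible implementation: synchronous, in-memory delivery."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], None]]] = {}

    def send(self, topic: str, payload: str) -> None:
        for handler in self._handlers.get(topic, []):
            handler(payload)

    def on_message(self, topic: str, handler: Callable[[str], None]) -> None:
        self._handlers.setdefault(topic, []).append(handler)


class CountingPlane(InProcessPlane):
    """Stand-in for a replacement infrastructure; it only counts messages sent."""

    def __init__(self) -> None:
        super().__init__()
        self.sent = 0

    def send(self, topic: str, payload: str) -> None:
        self.sent += 1
        super().send(topic, payload)


def run_job(plane: MessagingPlane, numbers: List[int]) -> List[int]:
    """Compute-layer code: squares each number, talking only through the plane."""
    results: List[int] = []
    plane.on_message("results", lambda payload: results.append(int(payload)))
    # A "worker" that receives a task and publishes its result.
    plane.on_message("tasks", lambda payload: plane.send("results", str(int(payload) ** 2)))
    for n in numbers:
        plane.send("tasks", str(n))
    return results


first = run_job(InProcessPlane(), [1, 2, 3])

replacement = CountingPlane()
second = run_job(replacement, [1, 2, 3])  # run_job itself is unchanged
```

Both runs produce the same results; only the object passed in as the plane differs.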
About the Author
Art Sedighi is the CTO and founder of SoftModule, a startup company with engineering and development offices in Boston and Tel Aviv and a sales and management office in New York. He is also the Chief Architect for SoftModule’s xFactor product, which arose from a current market need to manage excessive demands for computing power at a lower cost.
Before SoftModule, Mr. Sedighi held a Senior Consulting Engineer position at DataSynapse, where he designed and implemented Grid and Distributed Computing fabrics for the Fortune 500. Before DataSynapse, Mr. Sedighi spent a number of years at TIBCO Software, where he implemented high-speed messaging solutions for organizations such as the New York Stock Exchange, UBS, Credit Suisse, US Department of Energy, US Department of Defense, and many others. Mr. Sedighi received his BS in Electrical Engineering and MS in Computer Science, both from Rensselaer Polytechnic Institute.