The Data Problem
Normally, a task goes through the sequence S→D→E: it is scheduled, its data is delivered, and then it executes (see Figure 2). Abstractly speaking, you want to exchange S and D: you want the data to be received by the resource before the task is even scheduled.
Figure 2: Normal data flow through the scheduler(s)
The xFactor Factor
In the previous sections, you examined the relevant attributes of architectural models as they pertain to managing data across a Grid. You have examined a hypothetical (Optimal Scenario) model that combines the best components, thus forming a hybrid environment. Unfortunately, this model is strictly hypothetical; in other words, there is no commercially available offering—until now. The xFactor software offering has taken the hypothetical model and turned it into reality.
xFactor provides enterprise-class solutions to the High Performance Computing (HPC) domain, with the goal of optimizing clients' datacenter investments. The SoftModule xFactor Grid management software package provides organizations with a method to optimally run and manage compute-intensive applications across thousands of CPUs. By leveraging xFactor's distributed architecture and dynamic resource allocation techniques, clients can achieve dramatic improvements in application performance and resource utilization.
As I mentioned in the previous section, the goal is for the data to be received by the resource before the task is even scheduled. But how can you place the data at the resource before the task is scheduled? xFactor is an intelligent system: it learns from events that have occurred in the system and adjusts its next action accordingly. Here, the relevant "previous" event is the result of the first task that was scheduled on that resource (see Figure 3), which the scheduler can use to determine its next steps.
Figure 3: Event learning in xFactor
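The event-learning idea above can be sketched in a few lines of code. This is purely illustrative, assuming a simple in-memory bookkeeping scheme; the class and method names are mine, not xFactor's actual API:

```python
class Task:
    """A unit of work that needs a particular dataset."""
    def __init__(self, dataset):
        self.dataset = dataset


class LearningScheduler:
    """Records, after each completed task, which resource now holds
    which dataset, so later tasks can be matched to primed resources."""

    def __init__(self):
        # dataset id -> set of resources known to already hold that data
        self.data_locations = {}

    def on_task_complete(self, task, resource):
        # Learn from the "previous" event: the task's data now lives
        # on the resource that just ran it.
        self.data_locations.setdefault(task.dataset, set()).add(resource)

    def pick_resource(self, task, free_resources):
        # Prefer a free resource already primed with the task's data;
        # that task then starts without waiting on a data transfer.
        primed = self.data_locations.get(task.dataset, set())
        for r in free_resources:
            if r in primed:
                return r
        # Otherwise fall back to any free resource (data must be sent).
        return free_resources[0]
```

In this sketch, the first task on a resource pays the full transfer cost, but every later task that shares its dataset can be steered to a resource where the data already sits.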
Figure 4 shows the subsequent steps. Although this article provides only a cursory review, it should give you a good general understanding of how xFactor manages data to achieve efficiency in the Grid. For a subsequent task, the task arrives at the scheduler and its data is sent ahead to the resource. The scheduler, realizing that the resource has already been primed with the proper set of data, places (rather than schedules) the next task on that compute resource, which is ready to process it. Even though this model reduces wait time, it has flaws as well. The time it takes a client to transmit the data to its destination has not changed; that is a physical constraint. You are simply smarter about how you schedule around that wait. In the xFactor case, the scheduler is smart enough to realize that the resource is free (or, better yet, that it is receiving data for the next task coming down the pipe) and can handle a task from another job.
If the system is composed of only one job, this method obviously does not help. In systems (Grids) where there are a number of jobs, the benefit can be easily realized because the resource is still considered free during Step 6 (see Figure 4), and it can still process tasks while the scheduler is making its next decision for Step 7.
Figure 4: Data and task flow through xFactor
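The saving described above can be illustrated with a simple two-stage pipeline model; the formulas here are my own back-of-the-envelope formulation, not taken from xFactor:

```python
def makespan_sequential(n_tasks, transfer, compute):
    """Each task waits for its own data transfer, then computes."""
    return n_tasks * (transfer + compute)


def makespan_pipelined(n_tasks, transfer, compute):
    """Data for the next task is staged while the current task computes.

    The first transfer cannot be hidden; after that, the resource only
    stalls when a transfer outlasts a computation. This is the classic
    two-stage pipeline bound: transfer times have not shrunk, they are
    just overlapped with useful work.
    """
    return transfer + (n_tasks - 1) * max(transfer, compute) + compute
```

For example, with 4 tasks, a 3-second transfer, and a 5-second compute step, the sequential model takes 32 seconds while the pipelined model takes 23, even though every byte still crosses the network at the same speed.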
In this article, you explored the challenges of managing data in a Grid environment. The techniques reviewed here help you appreciate the complexity of the problem and the physical constraints surrounding it: network bandwidth, memory bandwidth, processing speed, I/O speed, and the like. Ultimately, the optimal solution for your environment will depend greatly on the flexibility of the architecture and the willingness of your vendor(s) to work closely with you in adapting to your strategy.
About the Author
Art Sedighi is the CTO and founder of SoftModule, a startup company with engineering and development offices in Boston and Tel-Aviv and a sales and management office in New York. He is also the Chief Architect of SoftModule's xFactor product, which arose from a current market need: managing excessive demands for computing power at lower cost.
Before SoftModule, Mr. Sedighi held a Senior Consulting Engineer position at DataSynapse, where he designed and implemented Grid and Distributed Computing fabrics for the Fortune 500. Before DataSynapse, Mr. Sedighi spent a number of years at TIBCO Software, where he implemented high-speed messaging solutions for organizations such as the New York Stock Exchange, UBS, Credit Suisse, US Department of Energy, US Department of Defense, and many others. Mr. Sedighi received his BS in Electrical Engineering and MS in Computer Science, both from Rensselaer Polytechnic Institute.