With the advent of multicore CPUs in recent years, parallel programming is the way to take full advantage of the new processing work horses. Parallel programming refers to the concurrent execution of processes due to the availability of multiple processing cores. This, in essence, leads to a tremendous boost in performance and efficiency of the programs in contrast to linear single core execution or even multithreading. The Fork/Join framework is a part of the Java concurrency API. This framework enables programmers to parallelize algorithms. This article explores the concept of parallel programming with the help of the Fork/Join Framework available in Java.
An Overview
Parallel programming has a much wider connotation and undoubtedly is a vast area to elaborate in a few lines. The crux of the matter is quite simple, yet operationally much harder to achieve. In simple terms, parallel programming means writing programs that use more than one processor to complete a task, that’s all! Guess what; it sounds familiar, doesn’t it? It almost rhymes with the idea of multithreading. But, note that there are some important distinctions between them. On the surface, they are the same, but the undercurrent is absolutely different. In fact, multithreading was introduced to provide a sort of illusion of parallel processing with no real parallel execution at all. What multithreading does really is that it steals CPU idle time and uses it to its advantage.
In short, multithreading is a collection of discrete logical units of tasks that run to grab their share of CPU time while another thread may be temporarily waiting for, say, some user input. The idle CPU time is optimally shared among competing threads. If there is only one CPU, it is time shared. If there are multiple CPU cores, they also are all time shared. Therefore, an optimal multithreaded program squeezes out the CPU’s performance by the clever mechanism of time sharing. In essence, it is always one thread using one CPU while another thread is waiting. This happens in a subtle manner that the user gets a feel of parallel processing where, in reality, processing actually is happening in quick succession. The greatest advantage of multithreading is that it is a technique to derive the most out of the processing resources. Now, this idea is quite useful and can be used in any set of environments, whether it has a single CPU or multiple CPUs. The idea is the same.
Parallel programming, on the other hand, means that there are multiple dedicated CPUs that are harnessed in parallel by the programmer. This type of programming is optimized for a multicore CPU environment. Most of today’s machines use multicore CPUs. Therefore, parallel programming is quite relevant nowadays. Even the most inexpensive machine is mounted with multicore CPUs. Look at the hand-held devices; even they are multicore. Although everything seems hunky-dory with multicore CPUs, here is another side of the story as well. Do more CPU cores mean faster or efficient computing? Not always! The greedy philosophy of “the more the merrier” does not apply to computing, nor in life. But they are there, unignorably—dual, quad, octa, and so on. They are there mostly because we want them and not because we need them, at least in most cases. In reality, it is relatively difficult to keep even a single CPU busy in daily computing. However, multicores have their uses under special circumstances, such as in servers, gaming, and so on, or solving large problems. The problem of having multiple CPUs is that it requires memory that must match up speed with processing power, along with lightning fast data channels and other accessories. In short, multiple CPU cores in daily computing provide performance improvement that cannot outweigh the amount of resources needed to use it. Consequently, we get an underutilised expensive machine, perhaps meant only to be showcased.
Parallel Programming
Unlike multithreading, where each task is a discrete logical unit of a larger task, parallel programming tasks are independent and their execution order does not matter. The tasks are defined according to the function they perform or data used in processing; this is called functional parallelism or data parallelism, respectively. In functional parallelism, each processor works on its section of the problem whereas in data parallelism, the processor works on its section of the data. Parallel programming is suitable for a larger problem base that does not fit in to a single CPU architecture, or it may be the problem is so large that it cannot be solved in a reasonable estimate of time. As a result, tasks, when distributed among processors, can obtain the result relatively fast.
The Fork/Join Framework
The Fork/Join Framework is defined in the java.util.concurrent package. It includes several classes and interfaces that support parallel programming. What it does primarily is that it simplifies the process of multiple thread creation, their uses, and automates the mechanism of process allocation among multiple processors. The notable difference between multithreading and parallel programming with this framework is very similar to what we mentioned earlier. Here, the processing part is optimised to use multiple processors unlike multithreading, where the idle time of the single CPU is optimised on the basis of shared time. The added advantage with this framework is to use multithreading in a parallel execution environment. No harm there.
There are four core classes in this framework:
- ForkJoinTask<V>: This is an abstract class that defines a task. Typically, a task is created with the help of the fork() method defined in this class. This task is almost similar to a normal thread created with the Thread class, but is lighter than it. The mechanism it applies is that it enables management of a large number of tasks with the help of a small number of actual threads that join the ForkJoinPool. The fork() method enables asynchronous execution of the invoking task. The join() method enables waiting until the task on which it is called is finally terminated. There is an another method, called invoke(), which combines the fork and join operations into a single call.
- ForkJoinPool: This class provides a common pool to manage execution of ForkJoinTask tasks. It basically provides the entry point for submissions from non-ForkJoinTask clients, as well as management and monitoring operations.
- RecursiveAction: This is also an abstract extension of the ForkJoinTask class. Typically, we extend this class to create a task that does not return a result or has a void return type. The compute() method defined in this class is overridden to include computational code of the task.
- RecursiveTask<V>: This is another abstract extension of the ForkJoinTask class. We extend this class to create a task that returns a result. And, similar to ResursiveAction, it also includes a protected abstract compute() method. This method is overridden to include the computation part of the task.
The Fork/Join Framework Strategy
This framework employs a recursive divide-and-conquer strategy to implement parallel processing. It basically divides a task into smaller subtasks; then, each subtask is further divided into sub-subtasks. This process is applied recursively on each task until it is small enough to be handled sequentially. Suppose we are to increment the values of an array of N numbers. This is the task. Now, we can divide the array by two creating two subtasks. Divide each of them again into two more subtasks, and so on. In this way, we can apply a divide-and-conquer strategy recursively until the tasks are singled out into a unit problem. This unit problem then can be executed in parallel by the multiple core processors available. In a non-parallel environment, what we had to do is cycle through the entire array and do the processing in sequence. This is clearly an inefficient approach in view of parallel processing. But, the real question is can every problem be divided and conquered? Definitely NO! But, there are problems that often involve some sort of array, collection, of grouping of data that particularly suits this approach. By the way, there are problems that may not use collection of data yet can be optimised to use the strategy for parallel programming. What type of computational problems are suitable for parallel processing or discussion on parallel algorithm is out of the scope of this article. Let’s see a quick example on the application of the Fork/Join Framework.
A Quick Example
This is a very simple example to give you an idea on how to implement parallelism in Java with the Fork/Join Framework.
package org.mano.example; import java.util.concurrent.RecursiveAction; public class CustomRecursiveAction extends RecursiveAction { final int THRESHOLD = 2; double [] numbers; int indexStart, indexLast; CustomRecursiveAction(double [] n, int s, int l) { numbers = n; indexStart = s; indexLast = l; } @Override protected void compute() { if ((indexLast - indexStart) > THRESHOLD) for (int i = indexStart; i < indexLast; i++) numbers [i] = numbers [i] + Math.random(); else invokeAll (new CustomRecursiveAction(numbers, indexStart, (indexStart - indexLast) / 2), new CustomRecursiveAction(numbers, (indexStart - indexLast) / 2, indexLast)); } } package org.mano.example; import java.util.concurrent.ForkJoinPool; import java.util.concurrent.TimeUnit; public class Main { public static void main(String[] args) { final int SIZE = 10; ForkJoinPool pool = new ForkJoinPool(); double na[] = new double [SIZE]; System.out.println("initialized random values :"); for (int i = 0; i < na.length; i++) { na[i] = (double) i + Math.random(); System.out.format("%.4f ", na[i]); } System.out.println(); CustomRecursiveAction task = new CustomRecursiveAction(na, 0, na.length); pool.invoke(task); System.out.println("Changed values :"); for (inti = 0; i < 10; i++) System.out.format("%.4f ", na[i]); System.out.println(); } }
Conclusion
This is a terse description of parallel programming and how it is supported in Java. It is a well-established fact that having N cores is not going to make everything N times faster. Only a section of Java applications effectively use this feature. Parallel programming code is a difficult frame. Moreover, effective parallel programs must consider issues such as load balancing, communication between parallel tasks, and the like. There are some algorithms that better suit parallel execution but many do not. In any case, the Java API is not short of its support. We can always tinker with the APIs to find out what suits the best. Happy coding 🙂