We live in a technological world, at the heart of which are scientists and engineers. They need programming tools to help make important discoveries and bring the next generation of technology to market.
Part 1 of this article discussed how scientists can benefit from Java. We’ve said that despite a few pitfalls which scientists should watch out for, the future of Java as a scientific programming language looks bright. Here in Part II we examine the structure of a scientific program more closely. We’ll define a few scientific OOP design patterns in Java, and we’ll give you a short style guide that can help scientists write good Java programs (please see the sidebar “A Style Guide for Scientific Programs in Java”).
A Style Guide for Scientific Programs in Java |
The Golden Rule of Programming states that computer programs should be understandable by
Translating Formulas to Source Code
1. Use short variable names like e, m and c, to mirror the variables which appear in
Clearly e = m*c*c; is much easier to compare with the original equation and debug than 2. Hungarian notation in Java can be helpful.
It’s not necessary to precede every variable with a letter which indicates its type: d=double, 3. Use descriptive variable names for controlling loops and all other variables.
Remember when looking through source code, loops are what the eye sees first. It’s important // Eq. (5) Loop over all electrons for (i = 0; i < iNumberOfElectrons; i++) E[i] = m[i]*c*c; 4. When coding an equation which appears in a journal article, always cite both the
For example, at the beginning of the method you may type 5. Break up complicated equations to enhance their readability, but make your
The variable names numer and denom make it instantly clear that you've broken the formula // Eq. (5) Loop over all electrons numer = 1.0 - v*v/c/c; denom = 1.0 - v/c; vp = Math.sqr(numer/denom); Java-Specific Advice
1. It's easier to share numerical methods than numerical classes or packages.
But for future scientists who wish to borrow only small pieces of your numerical ideas, it may be
2. If dedicated classes store global variables, preface their names with Global. If
It's a clever idea because it not only reminds you of what's in the class - it also ensures that the
3. When necessary, port Fortran source code rather than C source code to Java.
4. Interfaces are great for numerical constants.
5. Don't forget System.out.println and the Console.
It's tempting to invoke the Java Virtual Machine with javaw, but don't forget that the standard — K. R.
|
Don't OOP? Don't Worry!
Since the real world is composed of objects related in complex ways, object-oriented strategies are often the best solution to programming problems. One reason for Java's success is that it simplifies OOP development.
However, many scientific problems are not amenable to full-blown object-oriented treatments. The problem may be too simple, such as evaluating an equation. Or sometimes the relationship between objects is just too complex. For example, a very simple equation describes how the moon orbits the earth, but add a third object to the problem (such as the sun or a comet or an asteroid) and the resulting equations can be so intertwined and complex that entirely different calculational techniques become necessary.
Nevertheless, OOP principles can be valuable for scientific programming, and increasing numbers of scientific programmers are now learning to think in OOP terms. For the scientist new to Java, the first lesson to learn is this: the classes are where it's at.
Classes: The Heart of Scientific Programs
Coming from a traditional Fortran or C background, it's easy to look at methods in Java and assume they are like functions or subroutines. They're not. For scientific programming, Java methods are weaker than their Fortran or C counterparts. A scientist expects to invoke a subroutine with a long argument list containing lots of variables, then to have those variables updated and changed upon return.
In Java, variables can be passed into methods (passing by value), but at most only one of them can be changed, such as via a function: x=myMethod(a,b,c,x);. Of course, groups of variables can be stored in arrays, and methods can change the contents of arrays (passing by reference). But cheating with variable-arrays is rarely an elegant approach.
This is not a shortcoming of Java — in fact, it's an improvement over Fortran and C! The Java solution is elegant: construct a class and feed it the necessary variables a, b, and c, then provide either public variables or else public methods which return x, like this:
MyCalc mc = new MyCalc(a,b,c);
x = mc.x; // or, alternatively . . .
x = mc.getX();
Java ends endless parameter lists, in which it is never clear which variables stay fixed and which are changed; and Java prohibits pointers — no comment necessary! And as this example shows, the Java code for such classes is clean and concise, and above all, the programmer's intent is crystal clear (Please see the sidebar "Converting Scientific Subroutines to Scientific Classes").
Converting Scientific Subroutines to Scientific Classes |
Subroutines are the building blocks of scientific programs in Fortran and C. They're portable A typical Fortran subroutine might look like this: SUBROUTINE root (a,b,c,root1,root2) c Which input parameters are changed? real*8 a, b, c, root1, root2, disc disc = b*b - 4.*a*c root1 = (-b + DSQR(disc))/2./a root2 = (-b - DSQR(disc))/2./a return end
Java classes provide a better way. The input and output parameters are easy to see. Variables It's easy to convert this Fortran subroutine to a Java class: /** Solve a quadratic equation, a*x^2+b*c+c=0 public class QuadraticEquationSolver { private double root1, root2; /** Initialize the parameters public void setup (double a, double b, double c) { double disc = b*b - 4.*a*c; root1 = (-b + Math.sqrt(disc))/2./a; root2 = (-b - Math.sqrt(disc))/2./a; } /** Returns the "+" root public double getPosRoot() { return root1; } /** Returns the "-" root public double getNegRoot() { return root2; } }
A few extra lines of code may be necessary, but the programmer's intent is crystal clear, and — K. R. |
Handling Global Variables in Java
A scientific program is about numbers, not (necessarily) about interrelationships between inherited classes or objects. So numerical variables — sometimes many dozens of them — must be easy to group and share between sections of the program. Fortran originally provided a pre- object-oriented tool for this task, the named common statement: global variables were easy to share, but it required significant programmer overhead to ensure appropriate typing and dimensioning.
Java provides better ways. The easiest way is simply to create a separate class for each collection of global variables to be grouped, and to declare these variables as static members of the class. The static keyword ensures that no matter how many instances of the class exist, the variables will all share the same address space — which means there is effectively only one instance of each variable in the class.
A more clever approach is known as the Singleton pattern, a technique which ensures that one — and only one — instance of a class can be created. Each desired collection of variables can be declared in its own Singleton class. Here is an example of these strategies (please see "Interfaces: Where Scientific Bakers Bake their Pi!").
Interfaces: Where Scientific Bakers Bake their Pi! |
One advantage of global variables stored in classes is that, to use these variables, their names
But for scientific constants which never change, such as pi (3.141...) or e (2.718...), it is useful Scientists Like to Think Global
Java offers advantages over traditional Fortran 77 for storing and managing global variables in
Here's an example program, GlobalVariableTest.java, which demonstrates both static public class GlobalVariableTest { public static void main (String args[]) { GlobalVariableTest.init(); GlobalVariableTest.calc(); } public static void init() { GlobalSingletonTest gt = GlobalSingletonTest.getInstance(); gt.x = 1.0; GlobalStaticTest gst = new GlobalStaticTest(); gst.y = 2.0; System.out.println("X,Y: "+gt.x+", "+gst.y); } public static void calc() { GlobalSingletonTest gt = GlobalSingletonTest.getInstance(); GlobalStaticTest gst = new GlobalStaticTest(); System.out.println("X,Y: "+gt.x+", "+gst.y); } }
Here's the accompanying file, GlobalSingletonTest.java, defining global variables using the /** Demonstrates Singleton-style global variables */ public class GlobalSingletonTest { /** Manages SingletonClass */ private GlobalSingletonTest() {} static private GlobalSingletonTest _instance; static public GlobalSingletonTest getInstance() { if ( _instance == null ) _instance = new GlobalSingletonTest(); return _instance; } // List of global variables public double x; }
And here's the accompanying file, GlobalStaticTest.java, defining the global variables using /** Global variables via static members */ public class GlobalStaticTest { // List of global variables static public double y; } Better for Scientific Programs: Public Access not Accessor Methods
In these examples, the programmer handles global variables directly (e.g., gt.x = 1.0;) rather — K. R. |
A useful convention is to preface the name of such global variables classes with the word Global, as in GlobalEnergy or GlobalMaterial. This not only describes the class, but also ensures that all these classes will be grouped together in any javadoc HTML files which are produced.
Design Patterns for Scientific Programs
Modern programming languages are like children's Lego-type block toys, comprising myriads of small components which can build structures of great complexity. There are a few basic substructures used frequently, such as boxes to build houses or chassis to build cars. By clearly defining these structures, known as design patterns, and by recycling them in programs, the programmer can quickly assemble elegant programs of great complexity. The Singleton pattern for global variables is one such design pattern.
Here we present three design patterns which are useful for scientific programming in Java.
The DataTransceiver Pattern
As the sidebar on antimatter illustrates, some number-crunching programs take hours or even days to run. What happens when, ten minutes before such a program is finished executing, the system crashes or the computer is accidentally switched off? Such problems are not just frustrating, they are also expensive. The scientist's time costs money, and on supercomputers each CPU second may recorded and billed.
When Matter and Antimatter Collide |
Any Star Trek fan knows what happens when matter and antimatter collide: annihilation! But
Prof. Kelvin Lynn and Dr. Marc Weber are two physicists at Washington State University who
These so-called Monte Carlo computer programs use probability theory to simulate the
Some of Dr. Lynn's programs use the DataTransceiver design pattern (see separate sidebar below). As the program runs, the — K. R. |
The DataTransceiver pattern is a solution to this problem. Like a radio transceiver, which both transmits and receives, the DataTransceiver pattern regularly reads and writes all the essential information for a calculation to data files. It allows the calculation to be interrupted at any time, either accidentally (such as a system crash), or intentionally (such as to check the intermediate results) — and then to be restarted again.
And by giving a little thought to the data files which are produced, interspersing them liberally with variable descriptions and user-comments, they make a useful archival record which stores the calculation results together with the initial parameters used to generate them.
The DataTransceiver Design Pattern |
The Problem
A numerical calculation without user interaction requires lengthy execution time, perhaps The Solution
At periodic times the calculation is halted and such intermediate results as are required Implementation Details
A single class can be used to perform both the data input and the data output Advantages
Disadvantages
Depending on the calculation (e.g., one involving dozens of large, multidimensional Responsibilities for the Programmer
Example
The DataTransceiver design pattern is implemented in many scientific calculations, although
The scientific programmer will often not know ahead of time how many iterations are — K. R. |
The ScientificLibrary Pattern
We've said in Part 2 that numerical methods are the heart and soul of scientific programs, and that many scientific programs are nothing but well-tested legacy methods patched together in new ways. So there must be good techniques for archiving and maintaining these methods.
The ScientificLibrary pattern provides a good technique. A scientific library is a class which contains numerous independent, static public methods and internal classes, with no global variables. Every scientific programmer has a favorite collection of tools which he/she reuses often, such as for solving an equation, calculating a histogram, writing data to an ASCII file, etc. A scientific library is the ideal repository for these methods. Declaring the methods as static is the ultimate in programmer-friendly strategy: the methods can be used without needing to instantiate the class!
The ScientificLibrary Design Pattern |
The Problem
A scientist needs to have easy access to a collection of many small, functionally The Solution
A class is used as a repository for a "library" of recyclable, interdependent Implementation Details
The methods are declared as static whenever possible, so they can be used without the Advantages
Disadvantages Not well-suited for interdependent methods.
Example
There are several useful ScientificLibrary classes in DataScan, a Java-based software — K. R.
|
As with global variable classes described above, a useful convention is to name the scientific library classes with the word Utils, as in UtilsMath or UtilsFile. This not only describes the class, but also ensures that all these classes will be grouped together in any javadoc HTML files which are produced.
The NoOOP Pattern
For simple scientific programs, such as a calculation requiring little user interaction, the NoOOP pattern (pronounced nope) can be useful. It's a way to implement a Fortran-like procedural program. It may contain methods to initialize the starting data, methods to perform the calculation, and methods to output the results.
This pattern is not the optimal use of Java's powerful OOP resources, as the name cleverly suggests. But traditional (non-scientific) software developers should take note: a scientific programmer may have years of successful software design experience, though none of it object-oriented. The NoOOP pattern provides a perfect place for a scientist to begin with Java.
The NoOOP Design Pattern |
The Problem
A scientist — possibly a scientist new to Java and without experience with classes and The Solution A Fortran- or C-style procedural program can be easily developed using Java. Implementation Details A single class can contain all variables and methods to perform all required tasks. Advantages
Disadvantages
Responsibilities for the Programmer
Example
This example shows a simple procedural program to find when a user-specified function Some Java features this program demonstrates:
public class Procedural { // This function just starts the program public static void main(String[] args) { Procedural myproc = new Procedural(); } // The main procedural program goes here public Procedural() { double d, dStart, dStop, dStepSize; dStart = getDoubleInput("Enter startpoint: "); dStop = getDoubleInput("Enter endpoint: "); dStepSize = getDoubleInput("Enter stepsize: "); for (d=dStart; d<dStop; d+=dStepSize) { if ( myFunction(d)*myFunction(d+dStepSize) < 0 ) break; } if (d>dStop) System.out.println("No zero found. "); else System.out.println("The function is zero when x = " + d); } // Example of how to read double numbers from the keyboard double getDoubleInput(String sMessage) { double d = 0.0; System.out.print(sMessage); try { d = Double.valueOf(new java.io.DataInputStream(System.in).readLine()).doubleValue(); } catch (java.io.IOException e) {} return d; } // Example of a user-defined function double myFunction (double x) { return Math.sin(x); } } — K. R.
|
Scientific Programming for the 21st Century
There's good reason for Java's explosive popularity: it gives business programmers the tools they need to write effective, easy-to-maintain programs. But as we've discussed, Java offers excellent tools for scientific programmers as well. Scientists switching to Java will be rewarded with fast-running, platform-independent programs — which are far easier to modify and maintain than their Fortran or C counterparts!
About the Author
Dr. Kenneth A. Ritley is a consultant with HP Consulting in Sindelfingen, Germany. Until recently, he was a physicist in Department Dosch at the Max-Planck-Institut für Metallforschung (Metals Research) in Stuttgart, Germany. He's made scientific computing contributions in the wide-ranging fields of astronomy, antimatter, high-temperature superconductivity, and magnetism.