Case for Preprocessing Capabilities in the Java Language

The Java language does not have a preprocessor. This article describes what a preprocessor does and what Java offers as a substitute. It takes the features of the C preprocessor and discusses their pros and cons with respect to Java. Finally, a code example illustrates a few powerful features that the Java language could adopt.

What Is a Preprocessor?

A preprocessor is a program that works on the source before compilation. As the name implies, the preprocessor prepares the source for compilation. The notion of the preprocessor has been around since the earliest programming languages. Embedded scripting languages and methodologies derive from the principle of preprocessing; JSP, ASP, and PHP engines are essentially preprocessors. Let us take the example of a JSP page. A JSP page is an HTML page with JSP tags embedded in it.

1: <html>
2: <%@ page import = "java.util.*, java.lang.*" %>
3: <p> The time now is <%= new Date() %>
4: </html>

Line 3 contains Java code within <%= %>; the JSP engine processes this code on the server side, and the content returned to the browser that requested the page is pure HTML. The browser then processes and renders the HTML. The important point is that the JSP engine worked on the source file and modified it. A formal preprocessor such as cpp, much like the JSP engine, looks for preprocessor directives and translates the source file for the compiler. The preprocessor directives work like a scripting language for the compiler, with the preprocessor as the interpreter.

Even though most languages, with the exception of C/C++, do not include a preprocessor in their definition, a preprocessor can be used with any language. For example, the GCC (GNU Compiler Collection) has a -E flag which directs it to only do preprocessing and not compilation. This can be used to preprocess any source file. In this article, we will build a very simple preprocessing program to highlight the need for a small enhancement.

Preprocessor and Java

Java does not have a preprocessor. Having said that, we will now explore what a preprocessor does and how the same results are achieved in Java without one.

  1. File inclusion: The #include directive tells the preprocessor to copy the text of the file specified in the directive into this source file. Java has advanced from the notion of files and modules to classes. A Java class exists with respect to a namespace, which is the package of the class. The packaging of classes maps directly to the directory structure of the operating system, so when a class is used in another class, the compiler knows exactly where the class is located, provided it can be found on the CLASSPATH. The import statement is just an aid in name resolution. Instead of using a fully qualified class name like java.util.HashMap everywhere in the code, we can simply use HashMap if we have the statement import java.util.* at the beginning of the source file.
  2. Defining symbols: public static final is Java's answer to #define. For defining constants, Java's static final is superior because #define declarations carry no type checking. Also, because a final variable is defined in a class (properly packaged), there is no chance of name collision. An equivalent of conditional compilation can be achieved by using public static final variables in conjunction with a modern Java compiler.
  3. Macro substitution: There is no substitute for a macro in Java. M4 (the preprocessor for RATFOR, Rational Fortran compiler) and cpp are capable of very powerful macro constructs. Take a simple macro:
               #define ARR_SIZE(a) (sizeof(a) / sizeof((a)[0]))

    This macro can be used much like a function call, where the preprocessor replaces the macro with the actual C code. It can be argued that macros are inherently type unsafe and lead to hard-to-find bugs, and that a Java method with a powerful compiler does a much better job. Still, a properly written macro is a great convenience.

  4. Predefined macros: The ANSI C standard predefines several macros; the ones significant for this discussion are __FILE__ (the name of the source file) and __LINE__ (the line number within the source). There is no equivalent of these predefined macros in Java. There is no direct way of knowing which class file is being executed, and no direct way of knowing which line of the source is being executed.
  5. Rest of the features: The other features of a preprocessor such as Line Splicing, Stringizing, Token pasting, and so forth are not as conspicuous in their absence as to warrant a discussion.
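
As a rough illustration of item 2, the sketch below shows a typed constant and a compile-time flag standing in for #define and #if. The class, field, and method names (BuildConfig, MAX_RETRIES, DEBUG, describe) are hypothetical and chosen only for this example. Because DEBUG is a compile-time constant, the compiler is free to leave the guarded statement out of the class file, which is about as close as plain Java gets to conditional compilation.

```java
// Hypothetical names throughout; a sketch of the static-final idiom, not a library API.
final class BuildConfig {
    // Typed replacement for: #define MAX_RETRIES 3
    public static final int MAX_RETRIES = 3;

    // Stands in for a DEBUG switch; change the value and recompile to flip it.
    public static final boolean DEBUG = false;

    private BuildConfig() { }

    // Builds a status message; the DEBUG branch is dead code while DEBUG is false.
    static String describe(int attempt) {
        StringBuilder sb = new StringBuilder();
        if (DEBUG) {
            sb.append("[debug] ");
        }
        sb.append("attempt ").append(attempt).append(" of ").append(MAX_RETRIES);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(1));
    }
}
```

One caveat with this idiom: compile-time constants are copied into the classes that use them, so any class that reads DEBUG or MAX_RETRIES has to be recompiled when the value changes.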

Logging Demands

One place where it is very important to know the line number and the class or file name is logging and field debugging. In C/C++, we can use the __LINE__ macro and let the preprocessor translate it to the correct line number. JDK 1.4 introduced the logging API in the java.util.logging package, and with it logging in Java became structured. The LogRecord object carries the source class name and the source method name, which are either set explicitly or inferred by the LogRecord object from the call stack. Another way to obtain the name of the class in question is through the call:

this.getClass().getName().
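
To make the LogRecord inference concrete, here is a minimal sketch that logs one message and reads back what the record believes its origin to be. The class name CallerInfoDemo, the method logAndCapture, and the logger name "callerinfo.demo" are assumptions made for this example; only the java.util.logging calls are real API.

```java
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo class; a sketch, not part of the logging framework.
final class CallerInfoDemo {
    private CallerInfoDemo() { }

    // Logs one message and returns "class#method" as reported by the LogRecord.
    static String logAndCapture() {
        final String[] captured = new String[1];
        Logger logger = Logger.getLogger("callerinfo.demo");
        logger.setUseParentHandlers(false); // keep the console quiet for the demo
        Handler handler = new Handler() {
            @Override public void publish(LogRecord record) {
                // Neither value was set explicitly, so the record infers both
                // from the call stack when they are first requested.
                captured[0] = record.getSourceClassName()
                        + "#" + record.getSourceMethodName();
            }
            @Override public void flush() { }
            @Override public void close() { }
        };
        logger.addHandler(handler);
        try {
            logger.info("hello");
        } finally {
            logger.removeHandler(handler);
        }
        return captured[0];
    }

    public static void main(String[] args) {
        System.out.println(logAndCapture());
    }
}
```

The record exposes the class and method names but not a line number, which is where the next technique comes in.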

Similarly, the line number also can be accessed by:

StackTraceElement ste[] = (new Throwable()).getStackTrace();
int lineNumber = ste[0].getLineNumber();

All these mechanisms rely on either stack frames or reflection, both of which are expensive and awkward to use. Wouldn't it be a lot easier if we had a __LINE__ macro? It could also fit into existing code that does not use the new logging framework.
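
For comparison, the stack-based workaround can be wrapped in a small helper. The sketch below is one possible shape, with the hypothetical names SourceLocation and callerLine; it assumes the classes were compiled with line number information, which is the javac default, and returns a non-positive value otherwise.

```java
// Hypothetical helper; a sketch of the stack-based workaround, not a standard API.
final class SourceLocation {
    private SourceLocation() { }

    // Returns the source line of the statement that called this method.
    static int callerLine() {
        // Frame 0 is callerLine() itself; frame 1 is whoever called it.
        StackTraceElement[] frames = new Throwable().getStackTrace();
        return frames.length > 1 ? frames[1].getLineNumber() : -1;
    }

    public static void main(String[] args) {
        int first = callerLine();
        int second = callerLine();
        System.out.println("lines " + first + " and " + second);
    }
}
```

Every call captures a full stack trace just to read one integer, which is the cost a compile-time __LINE__ substitution would avoid.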

A Simple Line Number Preprocessor

The following is the listing of a simple preprocessor that takes the name of the Java source file as the argument and does __LINE__ macro substitution.

1: import java.io.*;
2: import java.util.regex.*;
3:
4: public class LinePreProcessor {
5:   public static void main (String args[]) {
6:     LinePreProcessor lp = new LinePreProcessor(args[0]);
7:   }
8:
9:   public LinePreProcessor (String filename) {
10:     System.out.println("Name of the file is "+filename);
11:     Pattern p = Pattern.compile("__LINE__");
12:     Matcher m = null;
13:     LineNumberReader lnr = null;
14:     File outputFile = null;
15:     PrintWriter pw = null;
16:     try {
17:       outputFile = File.createTempFile (filename, null);
18:       pw = new PrintWriter (new FileWriter (outputFile), true);
19:     }
20:     catch (IOException ioe) {
21:       ioe.printStackTrace();
22:     }
23:     try {
24:       lnr = new LineNumberReader (new FileReader (filename));
25:     }
26:     catch (FileNotFoundException fnfe) {
27:       fnfe.printStackTrace();
28:     }
29:     String line =  null;
30:     try {
31:      while( (line = lnr.readLine())!=null) {
32:        m = p.matcher(line);       
33:        if (m.find()) {
34:          line = m.replaceAll (""+lnr.getLineNumber());
35:        }
36:        pw.println(line);
37:
38:      }
39:      pw.close();
40:      lnr.close();
41:      // Overwrite the original source file with the preprocessed copy.
42:      cp (outputFile, new File(filename));
43:     }
44:     catch (IOException ioe) {
45:       ioe.printStackTrace();
46:     }
47:   }
48:   
49:   public File cp (File src, File dest) throws FileNotFoundException {
50:     FileInputStream fis = new FileInputStream (src);
51:     FileOutputStream fos = new FileOutputStream (dest);
52:
53:     byte buffer[] = new byte [32];
54:     int b_read = 0;
55:     try {
56:       while ((b_read = fis.read(buffer)) > 0) {
57:         fos.write(buffer,0, b_read);
58:       }
59:       // Close both streams so the file handles are released.
60:       fis.close();
61:       fos.close();
62:     }
63:     catch (IOException ioe) {
64:       ioe.printStackTrace();
65:     }
66:     return dest;
67:   }
68:
69: }

On line 11, the program creates a regular expression pattern “__LINE__”, which will be replaced in the subject source file. It then uses the LineNumberReader class from the standard Java IO library to read the file specified on the command line (line 24). Another interesting thing to note is outputFile = File.createTempFile (filename, null); on line 17. Here we create a temporary file that is guaranteed to have a unique name. The program then reads one line at a time and replaces the __LINE__ macro with the actual line number. Finally, the program uses the cp (copy) method (line 49) to copy the temporary file back over the actual source file (the one given on the command line). Let’s say we have a source file LineNumberTest.java:

public class LineNumberTest {
  public static void main(String args[]) {
    StackTraceElement ste[] = (new Throwable()).getStackTrace();
    System.out.println(ste[0].getLineNumber());
    System.out.println("The line number here is __LINE__");
    LineNumberTest tt = new LineNumberTest();
  }
}

After having compiled our LinePreProcessor.java, we can call the preprocessor on our LineNumberTest.java:

java LinePreProcessor LineNumberTest.java

The resulting LineNumberTest.java becomes:

public class LineNumberTest {
  public static void main(String args[]) {
    StackTraceElement ste[] = (new Throwable()).getStackTrace();
    System.out.println(ste[0].getLineNumber());
    System.out.println("The line number here is 5");
    LineNumberTest tt = new LineNumberTest();
  }
}

The __LINE__ string is replaced by 5, which is the correct line number.
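
The core of the substitution can also be sketched as a pure function over a list of lines, which makes it easy to check in isolation. The names LineMacro and expand are hypothetical, and this version uses a literal string replacement rather than the regular expression in the listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class; a sketch of the same substitution without any file handling.
final class LineMacro {
    private LineMacro() { }

    // Replaces every literal __LINE__ with the 1-based number of the line it appears on.
    static List<String> expand(List<String> lines) {
        List<String> out = new ArrayList<>(lines.size());
        for (int i = 0; i < lines.size(); i++) {
            // String.replace treats both arguments literally, so nothing needs escaping.
            out.add(lines.get(i).replace("__LINE__", Integer.toString(i + 1)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
            "class T {",
            "  int here = __LINE__;",
            "}");
        for (String line : expand(source)) {
            System.out.println(line);
        }
    }
}
```

Like the listing, this sketch replaces the token wherever it occurs, including inside string literals and comments.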

Case for Compiler Enhancement

To use this approach, we must keep a separate staging area for the source files. The version control system (CVS, for example) holds the version that contains the __LINE__ directive, and in the staging area the source files are converted to Java files with __LINE__ replaced by actual line numbers. Only then can the Java files be compiled to class files. Evidently the tedium of the staging area has taken the fun out of our toy program, but that apart, isn’t the __LINE__ macro a powerful feature? What’s more, even with spoofing (as in LogRecord) and code obfuscation, the line number information will always be there because it becomes part of the code. It would not be very difficult for Java compiler writers to predefine a few macros that developers could then use in their code. As discussed above, in most cases Java does not actually need a preprocessor; however, adding a few macro substitution features to the compiler itself could give developers greater flexibility and power.
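
A staging step of this kind might look like the following sketch, which writes the expanded copy into a separate directory and leaves the checked-in source untouched. The class name StagingPreprocessor, the method stage, and the command-line layout are assumptions made for this illustration, not part of the original program.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; a sketch of a staging step, assuming UTF-8 source files.
final class StagingPreprocessor {
    private StagingPreprocessor() { }

    // Reads one source file, expands __LINE__, and writes the result under stagingDir.
    static Path stage(Path source, Path stagingDir) throws IOException {
        List<String> in = Files.readAllLines(source, StandardCharsets.UTF_8);
        List<String> out = new ArrayList<>(in.size());
        for (int i = 0; i < in.size(); i++) {
            out.add(in.get(i).replace("__LINE__", Integer.toString(i + 1)));
        }
        Files.createDirectories(stagingDir);
        Path target = stagingDir.resolve(source.getFileName());
        return Files.write(target, out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java StagingPreprocessor <source.java> <stagingDir>");
            return;
        }
        Path staged = stage(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Staged copy written to " + staged);
    }
}
```

The compiler would then be pointed at the staging directory instead of the original source tree.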

About the Author

Nasir Khan is a Sun Certified Java programmer, with a B.E. in electrical engineering and a master's degree in systems. His areas of interest include Java applications, EJB, RMI, CORBA, JDBC, and JFC. He is presently working with BayPackets Inc. in high technology areas of telecom software.