An Introduction to Java Servlets

written by Hans Bergsten

I assume that you’re familiar with HTTP and CGI or a proprietary server API like NSAPI or ISAPI. I also assume that you are somewhat familiar with Java programming or some other object-oriented language, such as C++. Even if you’re not a Java programmer, you should be able to appreciate the benefits of servlets after reading this article, but before you develop your own servlets I recommend that you first learn the Java basics.

The Dark Ages

Early in the World Wide Web’s history, the Common Gateway Interface (CGI) was defined to allow Web servers to process user input and serve dynamic content. CGI programs can be developed in any script or programming language, but Perl is by far the most common language. CGI is supported by virtually all Web servers and many Perl modules are available as freeware or shareware to handle most tasks.

But CGI is not without drawbacks. Performance and scalability are big problems since a new process is created for each request, quickly draining a busy server of resources. Sharing resources such as database connections between scripts or multiple calls to the same script is far from trivial, leading to repeated execution of expensive operations.

Security is another big concern. Most Perl scripts use the command shell to execute OS commands with user-supplied data, for instance to send mail, search for information in a file, or just leverage OS commands in general. This use of a shell opens up many opportunities for a creative hacker to make the script remove all files on the server, mail the server’s password file to a secret account, or do other bad things that the script writer didn’t anticipate.

The Web server vendors defined APIs to solve some of these problems, notably Microsoft’s ISAPI and Netscape’s NSAPI. But an application written to these proprietary APIs is married to one particular server vendor. If you need to move the application to a server from another vendor, you have to start from scratch. Another problem with this approach is reliability. The APIs typically support C/C++ code executing in the Web server process. If the application crashes, e.g. due to a bad pointer or division by zero, it brings the Web server down with it.

Servlets to the rescue!

The Servlet API was developed to leverage the advantages of the Java platform to solve the issues of CGI and proprietary APIs. It’s a simple API supported by virtually all Web servers and even load-balancing, fault-tolerant Application Servers. It solves the performance problem by executing all requests as threads in one process or, in a load-balanced system, in one process per server in the cluster. Servlets can also easily share resources, as you will see in this article.

Security is improved in many ways. First of all, you rarely need to let a shell execute commands with user-supplied data since the Java APIs provide access to all commonly used functions. You can use JavaMail to read and send email, Java Database Connectivity (JDBC) to access databases, the File class and related classes to access the file system, and RMI, CORBA and Enterprise JavaBeans (EJB) to access legacy systems. The Java security model makes it possible to implement fine-grained access controls, for instance only allowing access to a well-defined part of the file system. Java’s exception handling also makes a servlet more reliable than proprietary C/C++ APIs: a divide by zero is reported as an error instead of crashing the Web server.

The Servlet Run-time Environment

A servlet is a Java class and therefore needs to be executed in a Java VM by a service we call a servlet engine.

The servlet engine loads the servlet class the first time the servlet is requested, or optionally already when the servlet engine is started. The servlet then stays loaded to handle multiple requests until it is explicitly unloaded or the servlet engine is shut down.

Some Web servers, such as Sun’s Java Web Server (JWS), W3C’s Jigsaw and Gefion Software’s LiteWebServer (LWS) are implemented in Java and have a built-in servlet engine. Other Web servers, such as Netscape’s Enterprise Server, Microsoft’s Internet Information Server (IIS) and the Apache Group’s Apache, require a servlet engine add-on module. The add-on intercepts all requests for servlets, executes them and returns the response through the Web server to the client. Examples of servlet engine add-ons are Gefion Software’s WAICoolRunner, IBM’s WebSphere, Live Software’s JRun and New Atlanta’s ServletExec.

All Servlet API classes and a simple servlet-enabled Web server are combined into the Java Servlet Development Kit (JSDK), available for download at Sun’s official Servlet site. To get started with servlets I recommend that you download the JSDK and play around with the sample servlets.

As this article is written (early March 1999), the released version of the JSDK is for the Servlet 2.0 API, with an Early Access version of the JSDK 2.1 available at Java Developer’s Connection. All servlet engines mentioned above support the Servlet 2.0 API, and a few also support the 2.1 API. The examples of 2.1 API features in this article are clearly marked so you won’t be surprised if they don’t work with your 2.0 servlet engine.

Servlet Interface and Life Cycle

Let’s implement our first servlet. A servlet is a Java class that implements the Servlet interface. Three of this interface’s methods define the servlet’s life cycle:

  • public void init(ServletConfig config) throws ServletException
    This method is called once when the servlet is loaded into the servlet engine, before the servlet is asked to process its first request.
  • public void service(ServletRequest request, ServletResponse response) throws ServletException, IOException
    This method is called to process a request. It can be called zero, one or many times until the servlet is unloaded. Multiple threads (one per request) can execute this method in parallel so it must be thread safe.
  • public void destroy()
    This method is called once just before the servlet is unloaded and taken out of service.
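
The thread-safety requirement on service is easy to overlook, so here is a minimal sketch of what it means in practice. It deliberately avoids the Servlet API so that it runs with a plain, modern JDK (8 or later is assumed); HitCounter, handleRequest and simulate are hypothetical names of my own, standing in for a servlet instance, its service method and a servlet engine dispatching requests on several threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a servlet: one instance, many request threads.
class HitCounter {
  // Shared state lives in an instance field, so it must be thread safe.
  private final AtomicInteger hits = new AtomicInteger();

  // Plays the role of service(): may run in many threads at once.
  int handleRequest() {
    return hits.incrementAndGet();
  }

  int getHits() {
    return hits.get();
  }

  // Simulates an engine dispatching requests to one instance on a thread pool.
  static int simulate(int requests, int threads) {
    HitCounter counter = new HitCounter();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < requests; i++) {
      pool.execute(counter::handleRequest);
    }
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the interrupt status
    }
    return counter.getHits();
  }

  public static void main(String[] args) {
    System.out.println("hits: " + simulate(1000, 8));
  }
}
```

Had the counter been a plain int field incremented with hits++, concurrent requests could overwrite each other’s updates and the total could come out too low. In a real servlet the same reasoning applies to any instance variable touched from service.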

The init method has a ServletConfig parameter. The servlet can read its initialization arguments through the ServletConfig object. How the initialization arguments are set is servlet engine dependent but they are usually defined in a configuration file.

A typical example of an initialization argument is a database identifier. A servlet can read this argument from the ServletConfig at initialization and then use it later to open a connection to the database during processing of a request:

...
private String databaseURL;

public void init(ServletConfig config) throws ServletException {
  super.init(config);
  databaseURL = config.getInitParameter("database");
}

The Servlet API is structured to make servlets that use a different protocol than HTTP possible. The javax.servlet package contains interfaces and classes intended to be protocol independent and the javax.servlet.http package contains HTTP specific interfaces and classes. Since this is just an introduction to servlets I will ignore this distinction here and focus on HTTP servlets. Our first servlet, named ReqInfoServlet, will therefore extend a class named HttpServlet. HttpServlet is part of the JSDK and implements the Servlet interface plus a number of convenience methods. We define our class like this:

import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class ReqInfoServlet extends HttpServlet {

  ...

}

An important set of methods in HttpServlet are the ones that specialize the service method in the Servlet interface. The implementation of service in HttpServlet looks at the type of request it’s asked to handle (GET, POST, HEAD, etc.) and calls a specific method for each type. This way the servlet developer is relieved from handling the details about obscure requests like HEAD, TRACE and OPTIONS and can focus on taking care of the more common request types, i.e. GET and POST. In this first example we will only implement the doGet method.

protected void doGet(HttpServletRequest request, HttpServletResponse response) 
    throws ServletException, IOException {

  ...

}
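
To make the dispatching idea concrete, here is a simplified model of a base class that routes a request to a method chosen by the HTTP request type. It is a sketch only and not the actual HttpServlet source: MiniServlet, GetOnlyServlet and the status strings are hypothetical, and the real class also takes care of request types such as HEAD, TRACE and OPTIONS.

```java
// Simplified model of routing by HTTP method.
// MiniServlet is a hypothetical stand-in, not the real HttpServlet.
class MiniServlet {
  // Plays the role of service(): inspects the request type and delegates.
  String service(String method) {
    switch (method) {
      case "GET":  return doGet();
      case "POST": return doPost();
      default:     return "501 Not Implemented: " + method;
    }
  }

  // Defaults reject the request; subclasses override what they support.
  protected String doGet()  { return "405 GET not supported"; }
  protected String doPost() { return "405 POST not supported"; }

  public static void main(String[] args) {
    MiniServlet servlet = new GetOnlyServlet();
    System.out.println(servlet.service("GET"));   // prints "200 OK"
    System.out.println(servlet.service("POST"));  // prints "405 POST not supported"
  }
}

// A subclass that, like ReqInfoServlet, only implements GET.
class GetOnlyServlet extends MiniServlet {
  @Override
  protected String doGet() { return "200 OK"; }
}
```

The point of the design is that the subclass never touches the dispatch logic; it only overrides the methods for the request types it wants to handle.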

Request and Response Objects

The doGet method has two interesting parameters: HttpServletRequest and HttpServletResponse. These two objects give you full access to all information about the request and let you control the output sent to the client as the response to the request.

With CGI you read environment variables and stdin to get information about the request, but the names of the environment variables may vary between implementations and some are not provided by all Web servers. The HttpServletRequest object provides the same information as the CGI environment variables, plus more, in a standardized way. It also provides methods for extracting HTTP parameters from the query string or the request body depending on the type of request (GET or POST). As a servlet developer you access parameters the same way for both types of requests. Other methods give you access to all request headers and help you parse date and cookie headers.

Instead of writing the response to stdout as you do with CGI, you get an OutputStream or a PrintWriter from the HttpServletResponse. The OutputStream is intended for binary data, such as a GIF or JPEG image, and the PrintWriter for text output. You can also set all response headers and the status code, without having to rely on special Web server CGI configurations such as Non Parsed Headers (NPH). This makes your servlet easier to install.

Let’s implement the body of our doGet method and see how we can use these methods. We will read most of the information we can get from the HttpServletRequest (saving some methods for the next example) and send the values as the response to the request.

protected void doGet(HttpServletRequest request, HttpServletResponse response) 
    throws ServletException, IOException {

  response.setContentType("text/html");
  PrintWriter out = response.getWriter();

  // Print the HTML header
  out.println("<HTML><HEAD><TITLE>");
  out.println("Request info");
  out.println("</TITLE></HEAD>");

  // Print the HTML body
  out.println("<BODY><H1>Request info</H1><PRE>");
  out.println("getCharacterEncoding: " + request.getCharacterEncoding());
  out.println("getContentLength: " + request.getContentLength());
  out.println("getContentType: " + request.getContentType());
  out.println("getProtocol: " + request.getProtocol());
  out.println("getRemoteAddr: " + request.getRemoteAddr());
  out.println("getRemoteHost: " + request.getRemoteHost());
  out.println("getScheme: " + request.getScheme());
  out.println("getServerName: " + request.getServerName());
  out.println("getServerPort: " + request.getServerPort());
  out.println("getAuthType: " + request.getAuthType());
  out.println("getMethod: " + request.getMethod());
  out.println("getPathInfo: " + request.getPathInfo());
  out.println("getPathTranslated: " + request.getPathTranslated());
  out.println("getQueryString: " + request.getQueryString());
  out.println("getRemoteUser: " + request.getRemoteUser());
  out.println("getRequestURI: " + request.getRequestURI());
  out.println("getServletPath: " + request.getServletPath());

  out.println();
  out.println("Parameters:");
  Enumeration paramNames = request.getParameterNames();
  while (paramNames.hasMoreElements()) {
    String name = (String) paramNames.nextElement();
    String[] values = request.getParameterValues(name);
    out.println("    " + name + ":");
    for (int i = 0; i < values.length; i++) {
      out.println("      " + values[i]);
    }
  }

  out.println();
  out.println("Request headers:");
  Enumeration headerNames = request.getHeaderNames();
  while (headerNames.hasMoreElements()) {
    String name = (String) headerNames.nextElement();
    String value = request.getHeader(name);
    out.println("  " + name + " : " + value);
  }

  out.println();
  out.println("Cookies:");
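  Cookie[] cookies = request.getCookies();
  // Depending on the API version, getCookies may return null when
  // the request carries no cookies, so guard against that case
  if (cookies != null) {
    for (int i = 0; i < cookies.length; i++) {
      String name = cookies[i].getName();
      String value = cookies[i].getValue();
      out.println("  " + name + " : " + value);
    }
  }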

  // Print the HTML footer
  out.println("</PRE></BODY></HTML>");
  out.close();
}

The doGet method above uses most of the methods in HttpServletRequest that provide information about the request. You can read all about them in the Servlet API documentation so here we’ll just look at the most interesting ones.

getParameterNames and getParameterValues help you access HTTP parameters no matter if the servlet was requested with the GET or the POST method. getParameterValues returns a String array because an HTTP parameter may have multiple values. For instance, if you request the servlet with a URL like http://company.com/servlet/ReqInfoServlet?foo=bar&foo=baz you’ll see that the foo parameter has two values: bar and baz. The same is true if you use the same name for more than one HTML FORM element and specify the POST method in the FORM tag’s METHOD attribute.

If you’re sure that an HTTP parameter can only have one value you can use the getParameter method instead of getParameterValues. It returns a single String and if there are multiple values it returns the first value received with the request.
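
The relationship between the two methods can be sketched with a small stand-alone model. QueryParams below is a hypothetical helper of my own, not part of the Servlet API; it only mimics the behavior described above for a raw query string, and it skips URL decoding for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper modelling multi-valued HTTP parameters.
class QueryParams {
  private final Map<String, List<String>> params = new LinkedHashMap<>();

  // Parses "foo=bar&foo=baz" style strings; URL decoding is omitted.
  QueryParams(String query) {
    if (query == null || query.isEmpty()) return;
    for (String pair : query.split("&")) {
      if (pair.isEmpty()) continue;  // tolerate stray "&&"
      int eq = pair.indexOf('=');
      String name = eq < 0 ? pair : pair.substring(0, eq);
      String value = eq < 0 ? "" : pair.substring(eq + 1);
      params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }
  }

  // Modelled on getParameterValues: all values, or null if the name is unknown.
  String[] getParameterValues(String name) {
    List<String> values = params.get(name);
    return values == null ? null : values.toArray(new String[0]);
  }

  // Modelled on getParameter: the first value, or null if the name is unknown.
  String getParameter(String name) {
    List<String> values = params.get(name);
    return values == null ? null : values.get(0);
  }

  public static void main(String[] args) {
    QueryParams p = new QueryParams("foo=bar&foo=baz");
    System.out.println(String.join(",", p.getParameterValues("foo"))); // prints "bar,baz"
    System.out.println(p.getParameter("foo"));                         // prints "bar"
  }
}
```

In a servlet you never write this parsing yourself; the sketch only illustrates why one method returns an array and the other a single String.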

You have access to all HTTP request headers with the getHeaderNames and getHeader methods. getHeader returns the String value of the header. If you know that the header has a date value or an integer value you can get help converting the header to an appropriate format. getDateHeader returns a date as the number of milliseconds since January 1, 1970, 00:00:00 GMT. This is the standard numeric representation of a timestamp in Java and you can use it to construct a Date object for further manipulation. getIntHeader returns the header value as an int.
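
As an illustration of the millisecond representation, the sketch below converts an HTTP date string to a long and wraps it in a Date. It uses the JDK’s java.time classes (Java 8 or later is assumed) rather than the Servlet API; DateHeaderDemo and toMillis are hypothetical names, and the code only approximates the kind of conversion getDateHeader performs for you.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Date;

class DateHeaderDemo {
  // Converts an RFC 1123 date string, as used in HTTP headers,
  // to milliseconds since January 1, 1970, 00:00:00 GMT.
  static long toMillis(String headerValue) {
    return ZonedDateTime
        .parse(headerValue, DateTimeFormatter.RFC_1123_DATE_TIME)
        .toInstant()
        .toEpochMilli();
  }

  public static void main(String[] args) {
    long millis = toMillis("Thu, 01 Jan 1970 00:00:01 GMT");
    System.out.println(millis);          // prints 1000

    // The long value can be handed straight to the Date constructor.
    Date date = new Date(millis);
    System.out.println(date.getTime());  // prints 1000
  }
}
```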

getCookies parses the Cookie header and returns all cookies as an array of Cookie objects. To add a cookie to a response the HttpServletResponse class provides an addCookie method that takes a Cookie object as its argument. This saves you from dealing with the format for different versions of cookie header strings.
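
To give a feel for the parsing that getCookies saves you from, here is a rough sketch that splits a simple Cookie header value into name/value pairs. CookieHeaderDemo and parse are hypothetical names, the code uses only the JDK, and it handles just the plain "name=value; name=value" form, not quoted values or cookie attributes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CookieHeaderDemo {
  // Splits a simple "name=value; name2=value2" Cookie header value.
  static Map<String, String> parse(String header) {
    Map<String, String> cookies = new LinkedHashMap<>();
    if (header == null) return cookies;
    for (String part : header.split(";")) {
      String pair = part.trim();
      int eq = pair.indexOf('=');
      if (eq <= 0) continue;  // skip empty or nameless entries
      cookies.put(pair.substring(0, eq), pair.substring(eq + 1));
    }
    return cookies;
  }

  public static void main(String[] args) {
    Map<String, String> cookies = parse("user=hans; theme=dark");
    for (Map.Entry<String, String> c : cookies.entrySet()) {
      System.out.println("  " + c.getKey() + " : " + c.getValue());
    }
  }
}
```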

If you compile the ReqInfoServlet and install it in your servlet engine you can now invoke it through a browser with a URL like http://company.com/servlet/ReqInfoServlet/foo/bar?fee=baz. If everything goes as planned you will see something like this in your browser:
