An Introduction to Java Servlets
written by Hans Bergsten
I assume that you're familiar with HTTP and CGI or a proprietary server API like NSAPI or ISAPI. I also assume that you are somewhat familiar with Java programming or some other object-oriented language, such as C++. Even if you're not a Java programmer, you should be able to appreciate the benefits of servlets from reading this article, but before you develop your own servlets I recommend that you first learn the Java basics.
The Dark Ages
Early in the World Wide Web's history, the Common Gateway Interface (CGI) was defined to allow Web servers to process user input and serve dynamic content. CGI programs can be developed in any script or programming language, but Perl is by far the most common language. CGI is supported by virtually all Web servers and many Perl modules are available as freeware or shareware to handle most tasks.
But CGI is not without drawbacks. Performance and scalability are big problems since a new process is created for each request, quickly draining a busy server of resources. Sharing resources such as database connections between scripts or multiple calls to the same script is far from trivial, leading to repeated execution of expensive operations.
Security is another big concern. Most Perl scripts use the command shell to execute OS commands with user-supplied data, for instance to send mail, search for information in a file, or just leverage OS commands in general. This use of a shell opens up many opportunities for a creative hacker to make the script remove all files on the server, mail the server's password file to a secret account, or do other bad things that the script writer didn't anticipate.
The Web server vendors defined APIs to solve some of these problems, notably Microsoft's ISAPI and Netscape's NSAPI. But an application written to these proprietary APIs is married to one particular server vendor. If you need to move the application to a server from another vendor, you have to start from scratch. Another problem with this approach is reliability. The APIs typically support C/C++ code executing in the Web server process. If the application crashes, e.g. due to a bad pointer or division by zero, it brings the Web server down with it.
Servlets to the rescue!
The Servlet API was developed to leverage the advantages of the Java platform to solve the issues of CGI and proprietary APIs. It's a simple API supported by virtually all Web servers and even load-balancing, fault-tolerant Application Servers. It solves the performance problem by executing all requests as threads in one process, or in a load-balanced system, in one process per server in the cluster. Servlets can easily share resources as you will see in this article.
Security is improved in many ways. First of all, you rarely need to let a shell execute commands with user-supplied data since the Java APIs provide access to all commonly used functions. You can use JavaMail to read and send email, Java Database Connectivity (JDBC) to access databases, the File class and related classes to access the file system, and RMI, CORBA and Enterprise JavaBeans (EJB) to access legacy systems. The Java security model makes it possible to implement fine-grained access controls, for instance only allowing access to a well-defined part of the file system. Java's exception handling also makes a servlet more reliable than proprietary C/C++ APIs - a divide by zero is reported as an error instead of crashing the Web server.
The Servlet Run-time Environment
A servlet is a Java class and therefore needs to be executed in a Java VM by a service we call a servlet engine.
The servlet engine loads the servlet class the first time the servlet is requested, or optionally already when the servlet engine is started. The servlet then stays loaded to handle multiple requests until it is explicitly unloaded or the servlet engine is shut down.
Some Web servers, such as Sun's Java Web Server (JWS), W3C's Jigsaw and Gefion Software's LiteWebServer (LWS) are implemented in Java and have a built-in servlet engine. Other Web servers, such as Netscape's Enterprise Server, Microsoft's Internet Information Server (IIS) and the Apache Group's Apache, require a servlet engine add-on module. The add-on intercepts all requests for servlets, executes them and returns the response through the Web server to the client. Examples of servlet engine add-ons are Gefion Software's WAICoolRunner, IBM's WebSphere, Live Software's JRun and New Atlanta's ServletExec.
All Servlet API classes and a simple servlet-enabled Web server are combined into the Java Servlet Development Kit (JSDK), available for download at Sun's official Servlet site. To get started with servlets I recommend that you download the JSDK and play around with the sample servlets.
As this article is written (early March 1999), the released version of the JSDK is for the Servlet 2.0 API, with an Early Access version of the JSDK 2.1 available at Java Developer's Connection. All servlet engines mentioned above support the Servlet 2.0 API, and a few also support the 2.1 API. The examples of 2.1 API features in this article are clearly marked so you don't have to be surprised when they don't work with your 2.0 servlet engine.
Servlet Interface and Life Cycle
Let's implement our first servlet. A servlet is a Java class that implements the Servlet interface. Three of the methods in this interface define the servlet's life cycle:
public void init(ServletConfig config) throws ServletException
This method is called once when the servlet is loaded into the servlet engine, before the servlet is asked to process its first request.
public void service(ServletRequest request, ServletResponse response) throws ServletException, IOException
This method is called to process a request. It can be called zero, one or many times until the servlet is unloaded. Multiple threads (one per request) can execute this method in parallel so it must be thread safe.
public void destroy()
This method is called once just before the servlet is unloaded and taken out of service.
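To make the life cycle concrete, here is a minimal sketch in plain Java that doesn't need a servlet engine. MiniServlet, CountingServlet and MiniEngine are hypothetical stand-ins I made up for illustration; the real Servlet interface works with ServletConfig, ServletRequest and ServletResponse objects, and a real engine does far more than this.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the three life-cycle methods of the Servlet interface.
interface MiniServlet {
    void init(Map<String, String> config);
    String service(String request);
    void destroy();
}

// Records how often each life-cycle method is called.
// The counters are not thread safe; this demo uses a single thread.
class CountingServlet implements MiniServlet {
    int initCalls, serviceCalls, destroyCalls;

    public void init(Map<String, String> config) { initCalls++; }

    public String service(String request) {
        serviceCalls++;
        return "handled " + request;
    }

    public void destroy() { destroyCalls++; }
}

// Hypothetical engine: loads the servlet on the first request, keeps it
// loaded for later requests, and destroys it at shutdown.
class MiniEngine {
    private final MiniServlet servlet;
    private boolean loaded = false;

    MiniEngine(MiniServlet servlet) { this.servlet = servlet; }

    // Calls init exactly once, before the first request is processed.
    private synchronized void ensureLoaded() {
        if (!loaded) {
            servlet.init(new HashMap<String, String>());
            loaded = true;
        }
    }

    // service runs outside the lock, so requests may be processed in parallel.
    String handle(String request) {
        ensureLoaded();
        return servlet.service(request);
    }

    synchronized void shutdown() {
        if (loaded) {
            servlet.destroy();
            loaded = false;
        }
    }
}

class LifeCycleDemo {
    public static void main(String[] args) {
        CountingServlet servlet = new CountingServlet();
        MiniEngine engine = new MiniEngine(servlet);
        engine.handle("/a");
        engine.handle("/b");
        engine.shutdown();
        // prints "1 init, 2 service, 1 destroy"
        System.out.println(servlet.initCalls + " init, " + servlet.serviceCalls
                + " service, " + servlet.destroyCalls + " destroy");
    }
}
```

Note that only the loading step is synchronized in this sketch. The service call itself is deliberately left unsynchronized, which is why the servlet code has to be thread safe.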
The init method has a ServletConfig parameter. The servlet can read its initialization arguments through the ServletConfig object. How the initialization arguments are set is servlet engine dependent but they are usually defined in a configuration file.
A typical example of an initialization argument is a database identifier. A servlet can read this argument from the ServletConfig at initialization and then use it later to open a connection to the database during processing of a request:
...
private String databaseURL;

public void init(ServletConfig config) throws ServletException {
    super.init(config);
    databaseURL = config.getInitParameter("database");
}
The Servlet API is structured to make servlets that use a different protocol than HTTP possible. The javax.servlet package contains interfaces and classes intended to be protocol independent and the javax.servlet.http package contains HTTP specific interfaces and classes. Since this is just an introduction to servlets I will ignore this distinction here and focus on HTTP servlets. Our first servlet, named ReqInfoServlet, will therefore extend a class named HttpServlet. HttpServlet is part of the JSDK and implements the Servlet interface plus a number of convenience methods. We define our class like this:
import javax.servlet.*;
import javax.servlet.http.*;

public class ReqInfoServlet extends HttpServlet {
    ...
}
An important set of methods in HttpServlet are the ones that specialize the service method in the Servlet interface. The implementation of service in HttpServlet looks at the type of request it's asked to handle (GET, POST, HEAD, etc.) and calls a specific method for each type. This way the servlet developer is relieved from handling the details about obscure requests like HEAD, TRACE and OPTIONS and can focus on taking care of the more common request types, i.e. GET and POST. In this first example we will only implement the doGet method.
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    ...
}
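The dispatching done by service can be pictured with the following plain Java sketch. It is not the HttpServlet source code: MiniHttpServlet and HelloServlet are hypothetical names, requests and responses are reduced to strings, and the messages returned for unsupported methods are made up for the example.

```java
// Hypothetical, simplified stand-in for HttpServlet.
abstract class MiniHttpServlet {
    // Looks at the request method and calls the matching handler.
    final String service(String method, String path) {
        if ("GET".equals(method)) {
            return doGet(path);
        } else if ("POST".equals(method)) {
            return doPost(path);
        }
        return method + " not supported";
    }

    // Default handlers just report that the method isn't implemented.
    protected String doGet(String path) { return "GET not supported"; }

    protected String doPost(String path) { return "POST not supported"; }
}

// A subclass only overrides the request types it cares about.
class HelloServlet extends MiniHttpServlet {
    @Override
    protected String doGet(String path) { return "Hello from " + path; }
}

class DispatchDemo {
    public static void main(String[] args) {
        MiniHttpServlet servlet = new HelloServlet();
        System.out.println(servlet.service("GET", "/hello"));   // prints "Hello from /hello"
        System.out.println(servlet.service("POST", "/hello"));  // prints "POST not supported"
    }
}
```

The subclass never touches service itself. It overrides one handler and inherits the default behavior for everything else.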
Request and Response Objects
The doGet method has two interesting parameters: HttpServletRequest and HttpServletResponse. These two objects give you full access to all information about the request and let you control the output sent to the client as the response to the request.
With CGI you read environment variables and stdin to get information about the request, but the names of the environment variables may vary between implementations and some are not provided by all Web servers. The HttpServletRequest object provides the same information as the CGI environment variables, plus more, in a standardized way. It also provides methods for extracting HTTP parameters from the query string or the request body depending on the type of request (GET or POST). As a servlet developer you access parameters the same way for both types of requests. Other methods give you access to all request headers and help you parse date and cookie headers.
Instead of writing the response to stdout as you do with CGI, you get an OutputStream or a PrintWriter from the HttpServletResponse. The OutputStream is intended for binary data, such as a GIF or JPEG image, and the PrintWriter for text output. You can also set all response headers and the status code, without having to rely on special Web server CGI configurations such as Non Parsed Headers (NPH). This makes your servlet easier to install.
Let's implement the body of our doGet method and see how we can use these methods. We will read most of the information we can get from the HttpServletRequest (saving some methods for the next example) and send the values as the response to the request.
// Besides the servlet packages, this method needs
// java.io.PrintWriter and java.util.Enumeration to be imported.
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    response.setContentType("text/html");
    PrintWriter out = response.getWriter();

    // Print the HTML header
    out.println("<HTML><HEAD><TITLE>");
    out.println("Request info");
    out.println("</TITLE></HEAD>");

    // Print the HTML body
    out.println("<BODY><H1>Request info</H1><PRE>");
    out.println("getCharacterEncoding: " + request.getCharacterEncoding());
    out.println("getContentLength: " + request.getContentLength());
    out.println("getContentType: " + request.getContentType());
    out.println("getProtocol: " + request.getProtocol());
    out.println("getRemoteAddr: " + request.getRemoteAddr());
    out.println("getRemoteHost: " + request.getRemoteHost());
    out.println("getScheme: " + request.getScheme());
    out.println("getServerName: " + request.getServerName());
    out.println("getServerPort: " + request.getServerPort());
    out.println("getAuthType: " + request.getAuthType());
    out.println("getMethod: " + request.getMethod());
    out.println("getPathInfo: " + request.getPathInfo());
    out.println("getPathTranslated: " + request.getPathTranslated());
    out.println("getQueryString: " + request.getQueryString());
    out.println("getRemoteUser: " + request.getRemoteUser());
    out.println("getRequestURI: " + request.getRequestURI());
    out.println("getServletPath: " + request.getServletPath());

    // Print all parameters with all their values
    out.println();
    out.println("Parameters:");
    Enumeration paramNames = request.getParameterNames();
    while (paramNames.hasMoreElements()) {
        String name = (String) paramNames.nextElement();
        String[] values = request.getParameterValues(name);
        out.println(" " + name + ":");
        for (int i = 0; i < values.length; i++) {
            out.println(" " + values[i]);
        }
    }

    // Print all request headers
    out.println();
    out.println("Request headers:");
    Enumeration headerNames = request.getHeaderNames();
    while (headerNames.hasMoreElements()) {
        String name = (String) headerNames.nextElement();
        String value = request.getHeader(name);
        out.println(" " + name + " : " + value);
    }

    // Print all cookies; some engines return null when the request has none
    out.println();
    out.println("Cookies:");
    Cookie[] cookies = request.getCookies();
    if (cookies != null) {
        for (int i = 0; i < cookies.length; i++) {
            String name = cookies[i].getName();
            String value = cookies[i].getValue();
            out.println(" " + name + " : " + value);
        }
    }

    // Print the HTML footer
    out.println("</PRE></BODY></HTML>");
    out.close();
}
The doGet method above uses most of the methods in HttpServletRequest that provide information about the request. You can read all about them in the Servlet API documentation so here we'll just look at the most interesting ones.
getParameterNames and getParameterValues help you access HTTP parameters no matter if the servlet was requested with the GET or the POST method. getParameterValues returns a String array because an HTTP parameter may have multiple values. For instance, if you request the servlet with a URL like http://company.com/servlet/ReqInfoServlet?foo=bar&foo=baz you'll see that the foo parameter has two values: bar and baz. The same is true if you use the same name for more than one HTML FORM element and submit the form with the POST method.
If you're sure that an HTTP parameter can only have one value you can use the getParameter method instead of getParameterValues. It returns a single String and if there are multiple values it returns the first value received with the request.
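If you wonder how one parameter name can end up with several values, the sketch below parses a query string by hand in plain Java. This is only an illustration of the idea, not how any particular servlet engine does it; QueryParams and its parse and first methods are names I invented, and UTF-8 decoding is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class QueryParams {
    // Parses "a=1&a=2&b=3" into an ordered map of name -> all values.
    static Map<String, List<String>> parse(String query) {
        Map<String, List<String>> params = new LinkedHashMap<String, List<String>>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = decode(eq < 0 ? pair : pair.substring(0, eq));
            String value = eq < 0 ? "" : decode(pair.substring(eq + 1));
            List<String> values = params.get(name);
            if (values == null) {
                values = new ArrayList<String>();
                params.put(name, values);
            }
            values.add(value);
        }
        return params;
    }

    // Mirrors the idea behind getParameter: the first value, or null if absent.
    static String first(Map<String, List<String>> params, String name) {
        List<String> values = params.get(name);
        return values == null ? null : values.get(0);
    }

    // Assumes UTF-8 encoded parameters.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = parse("foo=bar&foo=baz");
        System.out.println(params.get("foo"));     // prints "[bar, baz]"
        System.out.println(first(params, "foo"));  // prints "bar"
    }
}
```

With the servlet API you never write this code yourself, which is exactly the point. The request object hands you the values whether they arrived in the query string or in the request body.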
You have access to all HTTP request headers with the getHeaderNames and getHeader methods. getHeader returns the String value of the header. If you know that the header has a date value or an integer value you can get help converting the header to an appropriate format. getDateHeader returns a date as the number of milliseconds since January 1, 1970, 00:00:00 GMT. This is the standard numeric representation of a timestamp in Java and you can use it to construct a Date object for further manipulation. getIntHeader returns the header value as an int.
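To show what the millisecond representation looks like, here is a small plain Java sketch that converts a date string in the common HTTP date format to milliseconds and then to a Date object. HttpDates and toMillis are hypothetical names, and the sketch only handles this one date format; it is not the servlet engine's implementation of getDateHeader.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class HttpDates {
    // Converts a date such as "Sun, 06 Nov 1994 08:49:37 GMT" to the number
    // of milliseconds since January 1, 1970, 00:00:00 GMT.
    static long toMillis(String headerValue) {
        SimpleDateFormat format =
                new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        try {
            return format.parse(headerValue).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a valid date: " + headerValue);
        }
    }

    public static void main(String[] args) {
        long millis = toMillis("Sun, 06 Nov 1994 08:49:37 GMT");
        System.out.println(millis);  // prints "784111777000"

        // The millisecond value can be used to construct a Date object
        Date date = new Date(millis);
        System.out.println(date.getTime() == millis);  // prints "true"
    }
}
```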
getCookies parses the Cookie header and returns all cookies as an array of Cookie objects. To add a cookie to a response the HttpServletResponse class provides an addCookie method that takes a Cookie object as its argument. This saves you from dealing with the format for different versions of cookie header strings.
If you compile the ReqInfoServlet and install it in your servlet engine you can now invoke it through a browser with a URL like http://company.com/servlet/ReqInfoServlet/foo/bar?fee=baz. If everything goes as planned you will see something like this in your browser:
Request info
getCharacterEncoding:
getContentLength: -1
getContentType: null
getProtocol: HTTP/1.0
getRemoteAddr: 127.0.0.1
getRemoteHost: localhost
getScheme: http
getServerName: company.com
getServerPort: 80
getAuthType: null
getMethod: GET
getPathInfo: /foo/bar
getPathTranslated: D:\PROGRA~1\jsdk2.1\httproot\servlet\ReqInfoServlet\foo\bar
getQueryString: fee=baz
getRemoteUser: null
getRequestURI: /servlet/ReqInfoServlet/foo/bar
getServletPath: /servlet/ReqInfoServlet

Parameters:
 fee:
 baz

Request headers:
 Connection : Keep-Alive
 User-Agent : Mozilla/4.5 [en] (WinNT; I)
 Host : company.com
 Accept : image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
 Accept-Encoding : gzip
 Accept-Language : en
 Accept-Charset : iso-8859-1,*,utf-8
 Cookie : TOMCATID=TO04695278486734222MC1010AT

Cookies:
 TOMCATID : TO04695278486734222MC1010AT
What if you want this servlet to handle both GET and POST requests? The default implementations of doGet and doPost return a message saying the method is not implemented. So far we have only provided a new implementation of doGet. To handle a POST request the same way we can simply call doGet from doPost:
protected void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    doGet(request, response);
}
Persistent and Shared Data
One of the more interesting features of the Servlet API is the support for persistent data. Since a servlet stays loaded between requests, and all servlets are loaded in the same process, it's easy to remember information from one request to another and to let different servlets share data.
The Servlet API contains a number of mechanisms to support this directly. We'll look at some of them in detail below. Another powerful mechanism is to use a singleton object to handle shared resources. You can read more about this technique in Improved Performance with a Connection Pool.
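As a rough idea of what the singleton technique looks like, here is a minimal plain Java sketch. It is not the code from the connection pool article: ConnectionPoolSketch and FakeConnection are hypothetical names, and FakeConnection merely stands in for an expensive resource such as a database connection.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for an expensive resource such as a database connection.
class FakeConnection {
    final int id;

    FakeConnection(int id) { this.id = id; }
}

// One pool per Java VM, shared by every caller through getInstance().
class ConnectionPoolSketch {
    private static final ConnectionPoolSketch INSTANCE = new ConnectionPoolSketch();

    private final Deque<FakeConnection> idle = new ArrayDeque<FakeConnection>();
    private int created = 0;

    // Private constructor: nobody else can create a second pool.
    private ConnectionPoolSketch() { }

    static ConnectionPoolSketch getInstance() { return INSTANCE; }

    // Reuses an idle connection if there is one, otherwise creates a new one.
    synchronized FakeConnection acquire() {
        FakeConnection con = idle.pollFirst();
        if (con == null) {
            con = new FakeConnection(++created);
        }
        return con;
    }

    // Hands a connection back so another request can reuse it.
    synchronized void release(FakeConnection con) {
        idle.addFirst(con);
    }
}

class PoolDemo {
    public static void main(String[] args) {
        ConnectionPoolSketch pool = ConnectionPoolSketch.getInstance();
        FakeConnection first = pool.acquire();
        pool.release(first);
        FakeConnection second = pool.acquire();
        System.out.println(first == second);  // prints "true"
    }
}
```

Since all servlets run in the same Java VM, any of them can call getInstance and get the same pool, so the expensive work of opening a connection is done once instead of once per request.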
Session Tracking
The HttpSession interface was introduced in the 2.0 version of the Servlet API. Objects of this type can hold information for one user session between requests. You start a new session by requesting an HttpSession object from the HttpServletRequest in your doGet or doPost method:
HttpSession session = request.getSession(true);
This method takes a boolean argument. true means a new session shall be started if none exists, while false only returns an existing session. The HttpSession object is unique for one user session. The Servlet API supports two ways to associate multiple requests with a session: cookies and URL rewriting.
If cookies are used, a cookie with a unique session ID is sent to the client when the session is established. The client then includes the cookie in all subsequent requests so the servlet engine can figure out which session the request is associated with. URL rewriting is intended for clients that don't support cookies or when the user has disabled cookies. With URL rewriting the session ID is encoded in the URLs your servlet sends to the client. When the user clicks on an encoded URL, the session ID is sent to the server where it can be extracted and the request associated with the correct session as above. To use URL rewriting you must make sure all URLs that you send to the client are encoded with the encodeURL or encodeRedirectURL methods in HttpServletResponse.
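The following plain Java sketch shows the general idea behind URL rewriting. How a session ID is actually embedded in a URL is up to the servlet engine, so the ";sessionid=" marker, the UrlRewriteSketch class and its methods are all assumptions made for the example. The sketch also ignores details such as URL fragments.

```java
class UrlRewriteSketch {
    // Hypothetical marker; a real servlet engine chooses its own format.
    static final String SESSION_MARKER = ";sessionid=";

    // Adds the session ID to the path part of the URL, before any query string.
    static String encodeURL(String url, String sessionId) {
        int query = url.indexOf('?');
        if (query < 0) {
            return url + SESSION_MARKER + sessionId;
        }
        return url.substring(0, query) + SESSION_MARKER + sessionId
                + url.substring(query);
    }

    // Extracts the session ID from an encoded URL, or returns null if there is none.
    static String extractSessionId(String url) {
        int start = url.indexOf(SESSION_MARKER);
        if (start < 0) {
            return null;
        }
        start += SESSION_MARKER.length();
        int end = url.indexOf('?', start);
        return end < 0 ? url.substring(start) : url.substring(start, end);
    }

    public static void main(String[] args) {
        String encoded = encodeURL("/servlet/Cart?item=42", "abc123");
        System.out.println(encoded);                    // prints "/servlet/Cart;sessionid=abc123?item=42"
        System.out.println(extractSessionId(encoded));  // prints "abc123"
    }
}
```

A URL that is sent to the client without being encoded carries no session ID, which is why a single forgotten link is enough to lose the session for a user without cookies.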
An HttpSession can store any type of object. A typical example is a database connection allowing multiple requests to be part of the same database transaction, or information about purchased products in a shopping cart application so the user can add items to the cart while browsing through the site. To save an object in an HttpSession you use the putValue method:
...
Connection con = DriverManager.getConnection(databaseURL, user, password);
session.putValue("myappl.connection", con);
...
In another servlet, or the same servlet processing another request, you can get the object with the getValue method:
...
HttpSession session = request.getSession(true);
Connection con = (Connection) session.getValue("myappl.connection");
if (con != null) {
    // Continue the database transaction
    ...
You can explicitly terminate (invalidate) a session with the invalidate method or let it be timed out by the servlet engine. The session times out if no request associated with the session is received within a specified interval. Most servlet engines allow you to specify the length of the interval through a configuration option. In the 2.1 version of the Servlet API there's also a setMaxInactiveInterval method so you can adjust the interval to meet the needs of each individual application.
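The timeout rule can be sketched in a few lines of plain Java. SessionStoreSketch is a hypothetical class, not part of the Servlet API, and a real servlet engine is free to implement sessions differently. To keep the example predictable the current time is passed in as an argument, and the session IDs are simple counters (a real engine must use IDs that are hard to guess).

```java
import java.util.HashMap;
import java.util.Map;

class SessionStoreSketch {
    private static class Entry {
        final Map<String, Object> values = new HashMap<String, Object>();
        long lastAccessed;
    }

    private final Map<String, Entry> sessions = new HashMap<String, Entry>();
    private final long maxInactiveMillis;
    private int nextId = 1;

    SessionStoreSketch(long maxInactiveMillis) {
        this.maxInactiveMillis = maxInactiveMillis;
    }

    // Starts a new session and returns its ID.
    synchronized String create(long now) {
        String id = "S" + nextId++;
        Entry entry = new Entry();
        entry.lastAccessed = now;
        sessions.put(id, entry);
        return id;
    }

    // Returns the session's values, or null if the session is unknown or has
    // timed out. Every successful access restarts the inactivity interval.
    synchronized Map<String, Object> get(String id, long now) {
        Entry entry = sessions.get(id);
        if (entry == null) {
            return null;
        }
        if (now - entry.lastAccessed > maxInactiveMillis) {
            sessions.remove(id);
            return null;
        }
        entry.lastAccessed = now;
        return entry.values;
    }

    // Explicitly terminates a session.
    synchronized void invalidate(String id) {
        sessions.remove(id);
    }

    public static void main(String[] args) {
        SessionStoreSketch store = new SessionStoreSketch(1000);
        String id = store.create(0);
        store.get(id, 500).put("cart", "book");
        System.out.println(store.get(id, 1400).get("cart"));  // prints "book"
        System.out.println(store.get(id, 2500));              // prints "null"
    }
}
```

In the usage above the request at time 1400 arrives 900 milliseconds after the previous one and keeps the session alive, while the request at time 2500 arrives 1100 milliseconds later and finds the session gone.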
ServletContext Attributes
All servlets belong to one servlet context. In implementations of the 1.0 and 2.0 versions of the Servlet API all servlets on one host belong to the same context, but with the 2.1 version of the API the context becomes more powerful and can be seen as the humble beginnings of an Application concept. Future versions of the API will make this even more pronounced.
Many servlet engines implementing the Servlet 2.1 API let you group a set of servlets into one context and support more than one context on the same host. The ServletContext in the 2.1 API is responsible for the state of its servlets and knows about resources and attributes available to the servlets in the context. Here we will only look at how ServletContext attributes can be used to share information among a group of servlets.
There are three ServletContext methods dealing with context attributes: getAttribute, setAttribute and removeAttribute. In addition the servlet engine may provide ways to configure a servlet context with initial attribute values. This serves as a welcome addition to the servlet initialization arguments for configuration information used by a group of servlets, for instance the database identifier we talked about above, a style sheet URL for an application, the name of a mail server, etc.
A servlet gets a reference to its ServletContext object through the ServletConfig object. The HttpServlet actually provides a convenience method (through its superclass GenericServlet) named getServletContext to make it really easy:
...
ServletContext context = getServletContext();
String styleSheet = request.getParameter("stylesheet");
if (styleSheet != null) {
    // Specify a new style sheet for the application
    context.setAttribute("stylesheet", styleSheet);
}
...
The code above could be part of an application configuration servlet, processing the request from an HTML FORM where a new style sheet can be specified for the application. All servlets in the application that generate HTML can then use the style sheet attribute like this:
...
ServletContext context = getServletContext();
// getAttribute returns an Object so the value must be cast
String styleSheet = (String) context.getAttribute("stylesheet");
out.println("<HTML><HEAD>");
out.println("<LINK HREF=" + styleSheet + " TYPE=text/css REL=STYLESHEET>");
...
Request Attributes and Resources
The 2.1 version of the API adds two more mechanisms for sharing data between servlets: request attributes and resources.
The getAttribute, getAttributeNames and setAttribute methods were added to the HttpServletRequest interface (or to be picky, to its ServletRequest superinterface). They are primarily intended to be used in concert with the RequestDispatcher, an object that can be used to forward a request from one servlet to another and to include the output from one servlet in the output from the main servlet.
The getResource and getResourceAsStream methods in the ServletContext class give you access to external resources, such as an application configuration file. You may be familiar with the methods with the same names in the ClassLoader. The ServletContext methods, however, can provide access to resources that are not necessarily files. A resource can be stored in a database, available through an LDAP server, or anything else the servlet engine vendor decides to support. The servlet engine provides a context configuration option where you specify the root for the resource base, be it a directory path, an HTTP URL, a JDBC URL, etc.
Examples of how to use these methods may be the subject of a future article. Until then you can read about them in the Servlet 2.1 specification.
Multithreading
As you have seen above, concurrent requests for a servlet are handled by separate threads executing the corresponding request processing method (e.g. doGet or doPost). It's therefore important that these methods are thread safe.
The easiest way to guarantee that the code is thread safe is to avoid instance variables altogether and instead pass all information needed by a method as arguments. For instance, this code:
private String someParam;

protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    someParam = request.getParameter("someParam");
    processParam();
}

private void processParam() {
    // Do something with someParam
}
is not safe. If the doGet method is executed by two threads it's likely that the value of the someParam instance variable is replaced by the second thread while the first thread is still using it.
A thread safe alternative is:
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    // A local variable is private to the thread executing the method
    String someParam = request.getParameter("someParam");
    processParam(someParam);
}

private void processParam(String someParam) {
    // Do something with someParam
}
Here the processParam method gets all data it needs as arguments instead of relying on instance variables.
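If you want to see the problem with your own eyes, the plain Java sketch below forces the unlucky timing instead of waiting for it to happen by chance. ThreadSafetySketch and its methods are hypothetical names, and the latches exist only to make the second "request" overwrite the instance variable while the first one is still in progress.

```java
import java.util.concurrent.CountDownLatch;

class ThreadSafetySketch {
    // Unsafe: request data kept in an instance variable shared by all threads.
    private String someParam;

    String unsafeHandle(String param, CountDownLatch written, CountDownLatch proceed)
            throws InterruptedException {
        someParam = param;
        written.countDown();  // signal that the variable has been set
        proceed.await();      // pause here, as if busy with slow processing
        return someParam;     // may now hold another request's value
    }

    // Safe: request data travels as arguments and local variables only.
    String safeHandle(String param) {
        return process(param);
    }

    private String process(String param) {
        return "processed " + param;
    }

    // Runs two overlapping requests and returns the value the first one ends up with.
    static String demonstrateRace() {
        ThreadSafetySketch servlet = new ThreadSafetySketch();
        CountDownLatch firstWritten = new CountDownLatch(1);
        CountDownLatch secondWritten = new CountDownLatch(1);
        String[] seenByFirst = new String[1];

        Thread first = new Thread(() -> {
            try {
                seenByFirst[0] = servlet.unsafeHandle("first", firstWritten, secondWritten);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        first.start();
        try {
            firstWritten.await();
            // The second request replaces the value while the first is still working
            servlet.unsafeHandle("second", new CountDownLatch(1), new CountDownLatch(0));
            secondWritten.countDown();
            first.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seenByFirst[0];
    }

    public static void main(String[] args) {
        System.out.println(demonstrateRace());                             // prints "second"
        System.out.println(new ThreadSafetySketch().safeHandle("first"));  // prints "processed first"
    }
}
```

The first request asked for "first" but ends up working with "second". In a real servlet the timing is random, which makes this kind of bug hard to reproduce and even harder to track down.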
Another reason to avoid instance variables is that in a multi-server system, there may be one instance of the servlet for each server and requests for the same servlet may be distributed between the servers. Keeping track of information in instance variables in this scenario doesn't work at all. In this type of system you can instead use the HttpSession object, the ServletContext attributes, or an external data store such as a database or an RMI/CORBA service to maintain the application state. Even if you start out with a small, single-server system it's a good idea to write your servlets so that they can scale to a large, multi-server system the day you strike oil.
Resources
This article barely scratches the surface on the Servlet API and all the things you can do with servlets. You can learn more by visiting some of the Web sites below:
- Sun Microsystems' official Servlet API site
- Servlet enabled Web servers and add-on servlet engines
- The servlet chapter in Sun's Java tutorial
- Novocode's Servlet Essentials, a Servlet programming tutorial
- Servlet Central, articles about servlet technology, success stories, resources and more
- A database with many servlets, both freeware with source code and commercial products
- Information about the O'Reilly Java Servlet Programming book by Jason Hunter and William Crawford
This article originally appeared on WebDevelopersJournal.com.
Originally published on https://www.developer.com.
This article was originally published on May 18, 2000