start page | rating of books | rating of authors | reviews | copyrights

Book Home Java Enterprise in a Nutshell Search this book

5.2. Servlet Basics

The Servlet API consists of two packages, javax.servlet and javax.servlet.http. The javax is there because servlets are a standard extension to Java, rather than a mandatory part of the API. This means that while servlets are official Java, Java virtual machine developers are not required to include the classes for them in their Java development and execution environments.

At one point, servlets were slated to become part of Version 1.2 of the Java 2 platform, and the API was even included with some Java SDK beta releases. However, since the Servlet API is evolving much faster than the core Java SDK, Sun decided to keep distribution separate. This has led to the revival of the Java Servlet Development Kit (JSDK), which is currently available from Sun at http://java.sun.com/products/servlet/. The JSDK includes the necessary servlet classes and a small servletrunner application for development and testing. As of this writing, the latest available implementation is JSDK 2.1, based on Version 2.1 of the Servlet API.

The examples in this chapter were developed using Sun's Java Web Server 1.1.3, unofficially considered the reference implementation for servlets. As of this writing, a number of other products, including O'Reilly's WebSite Pro and the W3C's JigSaw, have incorporated servlet support. Various third-party vendors, including Live Software, New Atlanta, and IBM, have released add-on servlet modules for most other major web server platforms, including the Netscape server family, Apache, and Microsoft IIS. I'm not going to discuss how to load servlets on each server, since the various implementations differ in this regard. What's important is that the servlets themselves are the same for each platform.

The three core elements of the Servlet API are the javax.servlet.Servlet interface, the javax.servlet.GenericServlet class, and the javax.servlet. http.HttpServlet class. Normally, you create a servlet by subclassing one of the two classes, although if you are adding servlet capability to an existing object, you may find it easier to implement the interface.

The GenericServlet class is used for servlets that do not implement any particular communication protocol. Here's a basic servlet that demonstrates servlet structure by printing a short message:

import javax.servlet.*;
import java.io.*;


public class BasicServlet extends GenericServlet {

  public void service(ServletRequest req, ServletResponse resp)
    throws ServletException, IOException {

    resp.setContentType("text/plain"); 
    PrintWriter out = resp.getWriter();

    out.println("Hello.");
  }
}

BasicServlet extends the GenericServlet class and implements one method: service(). Whenever a server wants to use the servlet, it calls this service() method, passing ServletRequest and ServletResponse objects (we'll look at these in more detail shortly). The servlet tells the server what type of response to expect, gets a PrintWriter from the response object, and transmits its output.

The GenericServlet class can also implement a filtering servlet that takes output from an unspecified source and performs some kind of alteration. For example, a filter servlet might be used to prepend a header, scan servlet output or raw HTML files for <DATE> tags and insert the current date, or remove <BLINK> tags. A more advanced filtering servlet might insert content from a database into HTML templates. We'll talk a little more about filtering later in this chapter.

Although most servlets today work with web servers, there's no requirement for that in GenericServlet: the class implements just that, a generic servlet. As we'll see in a moment, the HttpServlet class is a subclass of GenericServlet that is designed to work with the HTTP protocol. It is entirely possible to develop other subclasses of GenericServlet that work with other server types. For example, a Java-based FTP server might use servlets to return files and directory listings or perform other tasks.

5.2.1. HTTP Servlets

The HttpServlet class is an extension of GenericServlet that includes methods for handling HTTP-specific data. HttpServlet defines a number of methods, such as doGet(), doPost(), and doPut(), to handle particular types of HTTP requests (GET, POST, and so on). These methods are called by the default implementation of the service() method, which figures out what kind of request is being made and then invokes the appropriate method. Here's a simple HttpServlet:

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;

public class HelloWorldServlet extends HttpServlet {

  public void doGet(HttpServletRequest req, HttpServletResponse resp)
    throws ServletException, IOException {

    resp.setContentType("text/html");
    PrintWriter out = resp.getWriter();

    out.println("<HTML>");
    out.println("<HEAD><TITLE>Have you seen this before?</TITLE></HEAD>");
    out.println("<BODY><H1>Hello, World!</H1><H6>Again.</H6></BODY></HTML>");
  }
}

HelloWorldServlet demonstrates many essential servlet concepts. The first thing to notice is that HelloWorldServlet extends HttpServlet--standard practice for an HTTP servlet. HelloWorldServlet defines one method, doGet(), which is called whenever anyone requests a URL that points to this servlet.[2] The doGet() method is actually called by the default service() method of HttpServlet. The service() method is called by the web server when a request is made of HelloWorldServlet; the method determines what kind of HTTP request is being made and dispatches the request to the appropriate doXXX() method (in this case, doGet()). doGet() is passed two objects, HttpServletRequest and HttpServletResponse, that contain information about the request and provide a mechanism for the servlet to produce output, respectively.

[2] In a standard Java Web Server installation, with the servlet installed in the standard servlets directory, this URL is http://site:8080/servlet/HelloWorldServlet. Note that the name of the directory (servlets) is unrelated to the use of "servlet" in the URL.

The doGet() method itself does three things. First, it sets the output type to "text/html", which indicates that the servlet produces standard HTML as its output. Second, it calls the getWriter() method of the HttpServletResponse parameter to get a java.io.PrintWriter that points to the client. Finally, it uses the stream to send some HTML back to the client. This isn't really a whole lot different from the BasicServlet example, but it gives us all the tools we'll need later on for more complex web applications.

If you define a doGet() method for a servlet, you may also want to override the getLastModified() method of HttpServlet. The server calls getLastModified() to find out if the content delivered by a servlet has changed. The default implementation of this method returns a negative number, which tells the server that the servlet doesn't know when its content was last updated, so the server is forced to call doGet() and return the servlet's output. If you have a servlet that changes its display data infrequently (such as a servlet that verifies uptime on several server machines once every 15 minutes), you should implement getLastModified() to allow browsers to cache responses. getLastModified() should return a long value that represents the time the content was last modified as the number of milliseconds since midnight, January 1, 1970, GMT.

A servlet should also implement getServletInfo(), which returns a string that contains information about the servlet, such as name, author, and version (just like getAppletInfo() in applets). This method is called by the web server and generally used for logging purposes.

5.2.2. Forms and Interaction

The problem with creating a servlet like HelloWorldServlet is that it doesn't do anything we can't already do with HTML. If we are going to bother with a servlet at all, we should do something dynamic and interactive with it. In many cases, this means processing the results of an HTML form. To make our example less impersonal, let's have it greet the user by name. The HTML form that calls the servlet using a GET request might look like this:

<HTML>
<HEAD><TITLE>Greetings Form</TITLE></HEAD>
<BODY>
<FORM METHOD=GET ACTION="/servlet/HelloServlet">
What is your name?
<INPUT TYPE=TEXT NAME=username SIZE=20>
<INPUT TYPE=SUBMIT VALUE="Introduce Yourself">
</FORM>
</BODY>
</HTML>

This form submits a form variable named username to the URL /servlet/HelloServlet. How does the web server know to load this particular servlet? Most servlet implementations, including the Java Web Server, allow you to place unpackaged servlets into a particular directory, and access them with a URI of /servlet/ServletName. This is similar to the way most web servers support CGI programs.

The HelloServlet itself does little more than create an output stream, read the username form variable, and print out a nice greeting for the user. Here's the code:

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;

public class HelloServlet extends HttpServlet {

  public void doGet(HttpServletRequest req, HttpServletResponse resp)
    throws ServletException, IOException {

    resp.setContentType("text/html");
    PrintWriter out = resp.getWriter();

    out.println("<HTML>");
    out.println("<HEAD><TITLE>Finally, interaction!</TITLE></HEAD>");
    out.println("<BODY><H1>Hello, " + req.getParameter("username") + "!</H1>");
    out.println("</BODY></HTML>");
  }
}

All we've done differently here is use the getParameter() method of HttpServletRequest to retrieve the value of a form variable.[3] When a server calls a servlet, it can also pass a set of request parameters. With HTTP servlets, these parameters come from the HTTP request itself, in this case in the guise of URL-encoded form variables. Note that a GenericServlet running in a web server also has access to these parameters using the simpler SerlvetRequest object. When the HelloServlet runs, it inserts the value of the username form variable into the HTML output, as shown in Figure 5-2.

[3] In the Java Web Server 1.1, the getParameter() method was deprecated in favor of getParameterValues(), which returns a String array rather than a single string. However, after an extensive write-in campaign, Sun took getParameter() off the deprecated list for Version 2.0 of the Servlet API, so you can safely use this method in your servlets.

figure

Figure 5-2. Output from HelloServlet

5.2.3. POST, HEAD, and Other Requests

As I mentioned before, doGet() is just one of a collection of enabling methods for HTTP request types. doPost() is the corresponding method for POST requests. The POST request is designed for posting information to the server, although in practice it is also used for long parameterized requests and larger forms, to get around limitations on the length of URLs.

If your servlet is performing database updates, charging a credit card, or doing anything that takes an explicit client action, you should make sure this activity is happening in a doPost() method. That's because POST requests are not idempotent, which means that they are not safely repeatable, and web browsers treat them specially. For example, a browser cannot bookmark or, in some cases, reload a POST request. On the other hand, GET requests are idempotent, so they can safely be bookmarked, and a browser is free to issue the request repeatedly without necessarily consulting the user. You can see why you don't want to charge a credit card in a GET method!

To create a servlet that can handle POST requests, all you have to do is override the default doPost() method from HttpServlet and implement the necessary functionality in it. If necessary, your application can implement different code in doPost() and doGet(). For instance, the doGet() method might display a postable data entry form that the doPost() method processes. doPost() can even call doGet() at the end to display the form again.

The less common HTTP request types, such as HEAD, PUT, TRACE, and DELETE, are handled by other doXXX() dispatch methods. A HEAD request returns HTTP headers only, PUT and DELETE allow clients to create and remove resources from the web server, and TRACE returns the request headers to the client. Since most servlet programmers don't need to worry about these requests, the HttpServlet class includes a default implementation of each corresponding doXXX() method that either informs the client that the request is unsupported or provides a minimal implementation. You can provide your own versions of these methods, but the details of implementing PUT or DELETE functionality go rather beyond our scope.

5.2.4. Servlet Responses

In order to do anything useful, a servlet must send a response to each request that is made of it. In the case of an HTTP servlet, the response can include three components: a status code, any number of HTTP headers, and a response body.

The ServletResponse and HttpServletResponse interfaces include all the methods needed to create and manipulate a servlet's output. We've already seen that you specify the MIME type for the data returned by a servlet using the setContentType() method of the response object passed into the servlet. With an HTTP servlet, the MIME type is generally "text/html," although some servlets return binary data: a servlet that loads a GIF file from a database and sends it to the web browser should set a content type of "image/gif" while a servlet that returns an Adobe Acrobat file should set it to "application/pdf".

ServletResponse and HttpServletResponse each define two methods for producing output streams, getOutputStream() and getWriter(). The former returns a ServletOutputStream, which can be used for textual or binary data. The latter returns a java.io.PrintWriter object, which is used only for textual output. The getWriter() method examines the content-type to determine which charset to use, so setContentType() should be called before getWriter().

HttpServletResponse also includes a number of methods for handling HTTP responses. Most of these allow you to manipulate the HTTP header fields. For example, setHeader(), setIntHeader(), and setDateHeader() allow you to set the value of a specified HTTP header, while containsHeader() indicates whether a certain header has already been set. You can use either the setStatus() or sendError() method to specify the status code sent back to the server. HttpServletResponse defines a long list of integer constants that represent specific status codes (we'll see some of these shortly). You typically don't need to worry about setting a status code, as the default code is 200 ("OK"), meaning that the servlet sent a normal response. However, a servlet that is part of a complex application structure (such as the file servlet included in the Java Web Server that handles the dispatching of HTML pages) may need to use a variety of status codes. Finally, the sendRedirect() method allows you to issue a page redirect. Calling this method sets the Location header to the specified location and uses the appropriate status code for a redirect.

5.2.5. Servlet Requests

When a servlet is asked to handle a request, it typically needs specific information about the request so that it can process the request appropriately. We've already seen how a servlet can retrieve the value of a form variable and use that value in its output. A servlet may also need access to information about the environment in which it is running. For example, a servlet may need to find out about the actual user who is accessing the servlet, for authentication purposes.

The ServletRequest and HttpServletRequest interfaces provide access to this kind of information. When a servlet is asked to handle a request, the server passes it a request object that implements one of these interfaces. With this object, the servlet can find out about the actual request (e.g., protocol, URL, type), access parts of the raw request (e.g., headers, input stream), and get any client-specific request parameters (e.g., form variables, extra path information). For instance, the getProtocol() method returns the protocol used by the request, while getRemoteHost() returns the name of the client host. The interfaces also provide methods that let a servlet get information about the server (e.g., getServername(), getServerPort()). As we saw earlier, the getParameter() method provides access to request parameters such as form variables. There is also the getParameterValues() method, which returns an array of strings that contains all the values for a particular parameter. This array generally contains only one string, but some HTML form elements (as well as non-HTTP oriented services) do allow multiple selections or options, so the method always returns an array, even if it has a length of one.

HttpServletRequest adds a few more methods for handling HTTP-specific request data. For instance, getHeaderNames() returns an enumeration of the names of all the HTTP headers submitted with a request, while getHeader() returns a particular header value. Other methods exist to handle cookies and sessions, as we'll discuss later.

Example 5-1 shows a servlet that restricts access to users who are connecting via the HTTPS protocol, using Digest style authentication, and coming from a government site (a domain ending in .gov).

Example 5-1. Checking Request Information to Restrict Servlet Access

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;

public class SecureRequestServlet extends HttpServlet {

  public void doGet(HttpServletRequest req, HttpServletResponse resp)
    throws ServletException, IOException {

    resp.setContentType("text/html");
    PrintWriter out = resp.getWriter();

    out.println("<HTML>");
    out.println("<HEAD><TITLE>Semi-Secure Request</TITLE></HEAD>");
    out.println("<BODY>");
    
    String remoteHost = req.getRemoteHost();
    String scheme = req.getScheme();
    String authType = req.getAuthType();
    
    if((remoteHost == null) || (scheme == null) || (authType == null)) {
      out.println("Request Information Was Not Available.");
      return;
    }

    if(scheme.equalsIgnoreCase("https") && remoteHost.endsWith(".gov") 
       && authType.equals("Digest")) {
      out.println("Special, secret information.");
    } 
    else {
      out.println("You are not authorized to view this data.");
    }

    out.println("</BODY></HTML>");
  }
}

5.2.6. Error Handling

Sometimes things just go wrong. When that happens, it's nice to have a clean way out. The Servlet API gives you two ways of to deal with errors: you can manually send an error message back to the client or you can throw a ServletException. The easiest way to handle an error is simply to write an error message to the servlet's output stream. This is the appropriate technique to use when the error is part of a servlet's normal operation, such as when a user forgets to fill in a required form field.

5.2.6.1. Status codes

When an error is a standard HTTP error, you should use the sendError() method of HttpServletResponse to tell the server to send a standard error status code. HttpServletResponse defines integer constants for all the major HTTP status codes. Table 5-1 lists the most common status codes. For example, if a servlet cannot find a file the user has requested, it can send a 404 ("File Not Found") error and let the browser display it in its usual manner. In this case, we can replace the typical setContentType() and getWriter() calls with something like this:

response.sendError(HttpServletResponse.SC_NOT_FOUND);

If you want to specify your own error message (in addition to the web server's default message for a particular error code), you can call sendError() with an extra String parameter:

response.sendError(HttpServletResponse.SC_NOT_FOUND, 
                   "It's dark. I couldn't find anything.");

Table 5-1. Some Common HTTP Error Codes

MnemonicCodeDefaultMeaning
ContentMessage

SC_OK

200

OK

The client's request succeeded, and the server's response contains the requested data. This is the default status code.

SC_NO_CONTENT

204

No Content

The request succeeded, but there is no new response body to return. A servlet may find this code useful when it accepts data from a form, but wants the browser view to stay at the form. It avoids the "Document contains no data" error message.

SC_MOVED_ PERMANENTLY

301

Moved Permanently

The requested resource has permanently moved to a new location. Any future reference should use the new location given by the Location header. Most browsers automatically access the new location.

SC_MOVED_ TEMPORARILY

302

Moved Temporarily

The requested resource has temporarily moved to another location, but future references should still use the original URL to access the resource. The temporary new location is given by the Location header. Most browsers automatically access the new location.

SC_ UNAUTHORIZED

401

Unauthorized

The request lacked proper authorization. Used in conjunction with the WWW-Authenticate and Authorization headers.

SC_NOT_FOUND

404

Not Found

The requested resource is not available.

SC_INTERNAL_ SERVER_ERROR

500

Internal Server Error

An error occurred inside the server that prevented it from fulfilling the request.

SC_NOT_ IMPLEMENTED

501

Not Implemented

The server does not support the functionality needed to fulfill the request.

SC_SERVICE_ UNAVAILABLE

503

Service Unavailable

The server is temporarily unavailable, but service should be restored in the future. If the server knows when it will be available again, a Retry-After header may also be supplied.

5.2.6.2. Servlet exceptions

The Servlet API includes two Exception subclasses, ServletException and its derivative, UnavailableException. A servlet throws a ServletException to indicate a general servlet problem. When a server catches this exception, it can handle the exception however it sees fit.

UnavailableException is a bit more useful, however. When a servlet throws this exception, it is notifying the server that it is unavailable to service requests. You can throw an UnavailableException when some factor beyond your servlet's control prevents it from dealing with requests. To throw an exception that indicates permanent unavailability, use something like this:

throw new UnavailableException(this, "This is why you can't use the servlet.");

UnavailableException has a second constructor to use if the servlet is going to be temporarily unavailable. With this constructor, you specify how many seconds the servlet is going to be unavailable, as follows:

throw new UnavailableException(120, this, "Try back in two minutes");

One caveat: the servlet specification does not mandate that servers actually try again after the specified interval. If you choose to rely on this capability, you should test it first.

5.2.6.3. A file serving servlet

Example 5-2 demonstrates both of these error-handling techniques, along with another method for reading data from the server. FileServlet reads a pathname from a form parameter and returns the associated file. Note that this servlet is designed only to return HTML files. If the file cannot be found, the servlet sends the browser a 404 error. If the servlet lacks sufficient access privileges to load the file, it sends an UnavailableException instead. Keep in mind that this servlet exists as a teaching exercise: you should not deploy it on your web server. (For one thing, any security exception renders the servlet permanently unavailable, and for another, it can serve files from the root of your hard drive.)

Example 5-2. Serving Files

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;

public class FileServlet extends HttpServlet {

  public void doGet(HttpServletRequest req, HttpServletResponse resp)
    throws ServletException, IOException {
  
    File r; 
    FileReader fr;
    BufferedReader br;
    try {
      r = new File(req.getParameter("filename"));
      fr = new FileReader(r);
      br = new BufferedReader(fr);
      if(!r.isFile()) {  // Must be a directory or something else
        resp.sendError(resp.SC_NOT_FOUND);
        return;
      }
    } 
    catch (FileNotFoundException e) {
      resp.sendError(resp.SC_NOT_FOUND);
      return;
    }
    catch (SecurityException se) { // Be unavailable permanently
      throw(new UnavailableException(this, 
        "Servlet lacks appropriate privileges."));
    } 

    resp.setContentType("text/html");
    PrintWriter out = resp.getWriter();
    String text;
    while( (text = br.readLine()) != null)
      out.println(text);
    
    br.close();
  }
}

5.2.7. Security

Servlets don't generally handle their own security arrangements. Instead, they typically rely on the capabilities of the web server to limit access to them. The security capabilities of most web servers are limited to basic on-or-off access to specific resources, controlled by username and password (or digital certificate), with possible encryption-in-transmission using SSL. Most servers are limited to basic authentication, which transmits passwords more or less in the clear, while some (including JWS) support the more advanced digest authentication protocol, which works by transmitting a hash of the user's password and a server-generated value, rather than the password itself. Both of these approaches look the same to the user; the familiar "Enter username and password" window pops up in the web browser.

The HttpServletRequest interface includes a pair of basic methods for retrieving standard HTTP user authentication information from the web server. If your web server is equipped to limit access, a servlet can retrieve the username with getRemoteUser() and the authentication method (basic, digest, or SSL) with getAuthType(). Consult your server documentation for details on using authentication to protect server resources.

Why are these methods useful? Consider a web application that uses the web server's authentication support to restrict access to authorized users, but needs to control access among that set of users. The username returned by getRemoteUser() can be used to look up specific privileges in an access control database. This is similar to what we did in Example 5-1, except access is now controlled by username, instead of hostname.



Library Navigation Links

Copyright © 2001 O'Reilly & Associates. All rights reserved.