
An Introduction to Apache Cocoon 2.1


An open source Apache Software Foundation project based on open standards, Apache Cocoon is rapidly gaining momentum in the Web developer community. This is evidenced by its growing and increasingly diverse developer and user communities, and by the popularity of recent Cocoon meetings, which are attracting large numbers of attendees from many parts of the world.

This article focuses on Apache Cocoon 2.1 and is intended to be an introduction to Cocoon and the concepts behind it.

What is Apache Cocoon?

Essentially, Cocoon is a Web development framework. It provides the developer with the infrastructure needed to quickly build flexible, maintainable, scalable, and robust XML Server Applications.

Cocoon is implemented as an abstract Java engine and, as such, can be run both from within an application server and from the command line. Cocoon introduces the concepts of “component pipelines” and “Cocoon flow” to Web application building, to form the basis of a powerful Web architecture. It allows for easy:

  • Content aggregation
  • Pluggable multi-step XML transformations/augmentations (Typically via XSLT)
  • Pluggable Multi-Channeling
  • Internationalization
  • Form handling and validation (Cocoon Forms/Woody)
  • Centralized application flow logic with Cocoon Flow

At the same time, Cocoon encourages maintainable, robust code and promotes Separation of Concerns (SoC).

The notion of separation of concerns among Logic, Content, and Style sits at the very heart of Cocoon. In fact, this notion was the original inspiration behind Cocoon’s genesis. The Cocoon methodology holds that “individuals should do what they’re good at.” The benefits of this are manifold, ranging from clean, uncluttered code to happier, more productive individuals.

Component Pipelines

As a component-based framework, Cocoon is highly modular. It is distributed with many “prefabricated” pluggable components. Because these components either consume or produce SAX events, they can be chained together to form SAX processing pipelines. Hence, the term “component pipelines” is often used.

The concept of “component pipelines” makes SoC easy. Each component within a pipeline is usually tasked with a particular concern. For example, rendering the same content to a cell phone or to a Web browser becomes a simple matter of plugging the appropriate “style” component into the pipeline, allowing the “styler” to work independently of the content provider.

Individual pipeline components are often compared to XML “Lego Blocks.” This is because, like Lego blocks, pipeline components can be assembled into any number of configurations to quickly tailor the required XML solution. Web developers are not limited to the available component suite, but can write their own enterprise-specific “custom components” by extending existing components or by implementing the appropriate component interface.

A basic Cocoon pipeline aggregates one or more data sources into a single XML SAX stream. That stream is then “piped” through the necessary transformer components (including transformers that can augment content, such as the SOAP or SQL transformers), which transform the XML as required, before it is finally serialized to the desired format.
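
The generate, transform, and serialize stages described above can be pictured in a few lines of plain JavaScript. This is a minimal sketch, not Cocoon's API: real components are Java classes that exchange SAX events, whereas here the events are simple objects held in an array, and the names generate, upperCaseText, addFooter, serializeXml, and runPipeline are all hypothetical.

```javascript
// Hypothetical stand-ins for Cocoon components; events loosely mimic SAX callbacks.

// Generator: produces a stream of SAX-like events from some source.
function generate(title) {
  return [
    { type: "start", name: "page" },
    { type: "start", name: "title" },
    { type: "text", value: title },
    { type: "end", name: "title" },
    { type: "end", name: "page" }
  ];
}

// Transformer: consumes events and produces a modified stream.
function upperCaseText(events) {
  return events.map(function (e) {
    return e.type === "text" ? { type: "text", value: e.value.toUpperCase() } : e;
  });
}

// Transformer that augments content: inserts a footer before the root element closes.
function addFooter(events) {
  var last = events.length - 1;
  return events.slice(0, last).concat(
    [{ type: "start", name: "footer" },
     { type: "text", value: "generated" },
     { type: "end", name: "footer" }],
    events.slice(last)
  );
}

// Serializer: turns the final event stream into text (no escaping, for brevity).
function serializeXml(events) {
  return events.map(function (e) {
    if (e.type === "start") return "<" + e.name + ">";
    if (e.type === "end") return "</" + e.name + ">";
    return e.value;
  }).join("");
}

// Assemble the pipeline: one generator, any number of transformers, one serializer.
function runPipeline(generator, transformers, serializer) {
  var events = transformers.reduce(function (stream, t) { return t(stream); }, generator());
  return serializer(events);
}

var output = runPipeline(
  function () { return generate("hello"); },
  [upperCaseText, addFooter],
  serializeXml
);
console.log(output);
// <page><title>HELLO</title><footer>generated</footer></page>
```

Swapping serializeXml for another serializer, or reordering the transformer list, changes the result without touching the generator, which is the separation the pipeline model is meant to provide.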

There are numerous components at the developer’s disposal. One example is the ServerPages generator component, which implements XSP. eXtensible Server Pages, or XSP, is a technology similar in spirit to JSP that allows the developer to embed Java code in XML markup. There are too many types of generators, transformers, and serializers to list here.

Component pipelines are mapped to URI spaces via the Cocoon sitemap.

The Sitemap—Configuring Pipelines

The sitemap.xmap file is the core configuration file for Cocoon. This is where the Web Application developer defines enterprise-specific component pipelines.

A sitemap basically defines the set of available sitemap components, and provides a mapping between URIs and the corresponding pipeline assembly “blueprints.” Typically, several pipeline definitions are contained within a single sitemap file.

Consider the following pipeline example:

<map:pipeline type="caching">
  <map:match pattern="page/getUserInfo.html">
    <map:generate type="serverpages" src="screens/userInfo.xsp"/>
    <map:transform src="context://samples/common/style/xsl/html/…"/>
    <map:transform type="i18n"/>
    <map:serialize type="html"/>
  </map:match>

  <map:match pattern="page/registrationSuccessful.html">
    <map:generate type="serverpages" src="…"/>
    <map:transform src="context://samples/common/style/xsl/html/…"/>
    <map:transform type="i18n"/>
    <map:serialize type="html"/>
  </map:match>
</map:pipeline>

The preceding pipeline maps the URIs “page/getUserInfo.html” and “page/registrationSuccessful.html” to their corresponding component pipeline definitions. Note that these URIs are relative to Cocoon’s servlet context.

Both pipeline definitions build a pipeline by instantiating a “caching” pipeline implementation and using it to embed a ServerPages generator component, an XSLT transformer component, an i18n transformer component (which translates content into the appropriate language according to the locale), and an HTML serializer component.

What if the developer wanted to serve one of the above pages to a WAP-enabled device instead? To do this, the developer would simply slot in the WML stylesheet and change the serializer component as follows:

      <map:match pattern="page/getUserInfo.wml">
        <map:generate type="serverpages" src="…"/>
        <map:transform src="context://samples/common/style/xsl/…"/>
        <map:transform type="i18n"/>
        <map:serialize type="wap"/>
      </map:match>

The above pipelines are simplified examples for the purpose of this article. There are many classes of component (let alone component types) that are not listed here; these include Action components, Selector components, and so forth. The sitemap allows for calling of sitemap resources, defining of views orthogonal to pipelines, pipeline redirection, defining custom protocols, and so on. The point is that the sitemap language is highly expressive and extensible, allowing the developer to describe pipelines in a powerful and concise fashion.
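
The way a sitemap maps a request URI to a pipeline blueprint can be pictured as a first-match lookup over an ordered list of patterns. The sketch below is only an analogy, under the assumption that a pattern is either an exact string or contains a `*` wildcard that matches within one path segment. The sitemap array, toRegExp, and resolve are hypothetical names, and the XSLT step is labeled simply "xslt" rather than naming a stylesheet.

```javascript
// Hypothetical model of a sitemap: ordered matchers, each with a pipeline "blueprint".
var sitemap = [
  { pattern: "page/getUserInfo.html",
    pipeline: ["serverpages", "xslt", "i18n", "html"] },
  { pattern: "page/registrationSuccessful.html",
    pipeline: ["serverpages", "xslt", "i18n", "html"] },
  { pattern: "page/*.wml",
    pipeline: ["serverpages", "xslt", "i18n", "wap"] }
];

// Convert a pattern to a RegExp; "*" is assumed to match within one path segment.
function toRegExp(pattern) {
  var escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, "([^/]*)") + "$");
}

// Return the blueprint of the first matcher whose pattern fits the URI.
function resolve(uri) {
  for (var i = 0; i < sitemap.length; i++) {
    if (toRegExp(sitemap[i].pattern).test(uri)) {
      return sitemap[i].pipeline;
    }
  }
  return null; // no pipeline is mapped to this URI
}

console.log(resolve("page/getUserInfo.html").join(" -> "));
// serverpages -> xslt -> i18n -> html
console.log(resolve("page/getUserInfo.wml").join(" -> "));
// serverpages -> xslt -> i18n -> wap
console.log(resolve("page/unknown.pdf"));
// null
```

Because the lookup is ordered, a more specific pattern placed earlier in the list takes precedence over a broader wildcard placed later.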

Of course, the developer need not define an entire Web application within a single sitemap. Cocoon allows for sitemaps to be split up into several mountable sub-sitemaps.

Cocoon Flow, MVC+, and Continuations

Cocoon supports the Model View Controller (MVC) pattern in the form of “Cocoon flow” as an added layer of control on top of the sitemap.

Traditionally, interactive Web applications have been modeled after “finite state machines.” In these models, flow logic is fragmented throughout the application to intercept responses from the client and determine the next machine state. Cocoon, with the help of Continuations, breaks through this paradigm, allowing the flow logic to be described sequentially and in a single location.

What are Continuations? Continuations are a means of saving the current state of execution so that execution can be resumed at an arbitrary later time. The concept does not involve suspending server threads or the like; rather, it involves dumping the current state and stack information to memory or disk. This saved state is associated with a Continuation ID. To resume execution from a particular point, a Cocoon application merely needs to quote the Continuation ID.
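
The save-and-resume idea can be imitated in ordinary JavaScript with generator functions. This is an analogy under stated assumptions, not a description of how Cocoon works internally: Cocoon's flow engine captures real continuations inside its JavaScript interpreter, whereas a generator can be resumed only once from each suspension point. The continuations map, sendPageAndWait, advance, startFlow, resume, and greetingFlow below are hypothetical names, not Cocoon's API.

```javascript
// Suspended flows, keyed by a continuation ID (hypothetical in-memory store).
var continuations = new Map();
var nextId = 1;

// Inside a flow, "yield sendPageAndWait(...)" marks the point where execution pauses.
function sendPageAndWait(uri) {
  return { page: uri };
}

// Run the flow until it pauses or finishes; on pause, store it under a fresh ID.
function advance(flow, request) {
  var step = flow.next(request);
  if (step.done) {
    return { page: step.value, continuationId: null };
  }
  var id = "k" + nextId++;
  continuations.set(id, flow);
  return { page: step.value.page, continuationId: id };
}

function startFlow(flowFunction) {
  return advance(flowFunction(), undefined);
}

// The client quotes the continuation ID to pick up where the flow left off.
function resume(id, request) {
  var flow = continuations.get(id);
  if (!flow) throw new Error("Unknown or expired continuation: " + id);
  continuations.delete(id);
  return advance(flow, request);
}

// A sequential flow: ask for a name, then greet.
function* greetingFlow() {
  var request = yield sendPageAndWait("page/askName.html");
  return "page/hello.html?name=" + request.name;
}

var first = startFlow(greetingFlow);
console.log(first.page, first.continuationId);   // page/askName.html k1
var second = resume(first.continuationId, { name: "Ada" });
console.log(second.page, second.continuationId); // page/hello.html?name=Ada null
```

Note that no thread waits between the two requests: the paused flow is just a stored object, looked up again when the ID comes back.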

Continuations describing a single flow are linked together in a child/parent relationship, forming a tree. This means, for example, that the situation where a user hits the browser back button no longer presents a problematic case for the Web developer. This is because the Cocoon flow controller in this case will simply resume from the previous parent continuation, maintaining a consistent state. Any subsequent path the user takes creates a different branch in the continuations tree. Thus, during the life of a Web application, forests of Continuation trees can be formed, growing and receding as dictated. Continuations are automatically expired after a specified time-to-live. They also can be manually expired.
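
The tree structure and the expiry behavior can be sketched with a small data structure. This is an illustrative model only: the ContinuationTree constructor, its method names, and the idea of storing a plain state snapshot per node are assumptions made for the example, and the current time is passed in explicitly so that the behavior is deterministic.

```javascript
// Hypothetical model of a continuation tree with time-to-live expiry.
function ContinuationTree(ttlMillis) {
  this.ttl = ttlMillis;
  this.nodes = new Map();   // continuation ID -> node
  this.counter = 0;
}

// Save a snapshot as a child of parentId (null for the root of a new tree).
ContinuationTree.prototype.save = function (parentId, state, now) {
  var id = "c" + (++this.counter);
  this.nodes.set(id, { id: id, parent: parentId, state: state, created: now });
  return id;
};

// Looking up a snapshot never mutates it, so a parent can be resumed repeatedly.
ContinuationTree.prototype.lookup = function (id) {
  var node = this.nodes.get(id);
  return node ? node.state : null;
};

// Path from a node up to its root, following the child/parent links.
ContinuationTree.prototype.pathToRoot = function (id) {
  var path = [];
  for (var n = this.nodes.get(id); n; n = this.nodes.get(n.parent)) {
    path.push(n.id);
  }
  return path;
};

// Drop every continuation older than the time-to-live.
ContinuationTree.prototype.expire = function (now) {
  var self = this;
  this.nodes.forEach(function (node, id) {
    if (now - node.created > self.ttl) self.nodes.delete(id);
  });
};

var tree = new ContinuationTree(1000);
var form = tree.save(null, { step: "form" }, 0);
var confirmA = tree.save(form, { step: "confirm", choice: "A" }, 100);
// The user hits "back" and resubmits the form: a second branch under the same parent.
var confirmB = tree.save(form, { step: "confirm", choice: "B" }, 900);

console.log(tree.pathToRoot(confirmB).join(" <- ")); // c3 <- c1
tree.expire(1500);                                   // c1 and c2 are older than 1000 ms
console.log(tree.lookup(confirmA));                  // null
console.log(tree.lookup(confirmB).choice);           // B
```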

With Cocoon MVC, the Model is the Java code, the component pipelines represent the Views, and the Controller is the Cocoon flow controller. Currently, the Cocoon flow controller logic is implemented as server-side JavaScript. The advantages of server-side JavaScript include quick prototyping and reloading. (There is talk of also adding support for other languages, including Java, for flow control.)

Perhaps the easiest way to illustrate the concept of Cocoon flow and Continuations is via a simple example. This example is based on a working sample from the Cocoon samples site.

function registerUserFlow() {
  var login, firstName, lastName, user, errorMsg;

  while (true) {
    cocoon.sendPageAndWait("page/getUserInfo.html", { });
    // At this point, a Continuation is generated and saved,
    // "waiting" to be resumed from the same point once a response
    // is returned from the user. Typically, the response is in
    // the shape of an HTML form submit.

    // The following three lines retrieve information from the
    // response
    login = cocoon.request.getParameter("login");
    firstName = cocoon.request.getParameter("firstName");
    lastName = cocoon.request.getParameter("lastName");

    // Here, the flow calls the "Model" to ascertain what the next
    // "View" should be.
    var existingUser = userRegistry.isLoginNameTaken(login);

    if (!existingUser) {
      user = new Packages...flow.prefs.User(login, …, firstName, …);
      break;  // the login name is free, so leave the loop
    } else {
      errorMsg = "Login name '" + login + "' is already in use, please choose another …";
    }
  }

  var session = cocoon.session;
  cocoon.sendPage("page/registrationSuccessful.html", {"user" : … });
  // Here, we send a page to the user but we don't save the state
  // and we don't wait for a response.
}

The flow controller above takes care of the steps necessary to register a new user. As long as the requested login name is already taken, the user information page is redisplayed with an error message. There are two views: page/getUserInfo.html and page/registrationSuccessful.html. These views are generated by the corresponding sitemap component pipelines. The Model here is the Java class behind userRegistry.

Note how quick and easy it would be to change application flow logic by altering the above script.


While perhaps not the best solution for small, one-page Web sites, Cocoon has many advantages over competing technologies when it comes to larger applications. This is especially true for Web operations that employ multiple people for development and maintenance, and that require internationalization, content aggregation, XML transformations, and multi-channeling.

Apache Cocoon’s capabilities are derived from simple but powerful concepts such as Component Pipelines, SoC, and Continuations. Grounded in these solid principles and concepts, and with a strong active developer community behind it, Apache Cocoon is gaining momentum as a premier Web Application framework.


About the Author

Michael Melhem is an experienced Web developer and engineer with strong expertise in building scalable, Java- and XML-based Web solutions for large commercial institutions. Michael is a committer on the Apache Cocoon project and a member of the Cocoon PMC. He is currently employed by ManageSoft Corp.
