
Transforming Flat Files To XML With SAX and XSLT

  • March 11, 2003
  • By Jeff Ryan

Introduction

When we need to transform XML into other formats, XSLT (Extensible Stylesheet Language Transformations) does a wonderful job. However, sometimes we have a flat file or other non-XML data structure that we need to transform into XML or another markup language. Wouldn't it be nice if we could use the power of XSLT to transform these data structures as well?

We can. XSLT can transform non-XML data sources if we feed the data to it through SAX (Simple API for XML). In this article, we'll build a Java class that transforms Java properties files into XML. This working component illustrates the concept and shows how to apply the technique to virtually any data structure.

The following outline is a road map for how we'll cover our topic:

  • SAX Parser and Handler Review
  • Writing Your Own SAX Parser (it's easier than you think)
  • The "Echo" Stylesheet
  • Transforming a SAX Source with TrAX (Transformation API for XML)
  • Summary

SAX Parser and Handler Review

If you've worked with SAX, you know that it is an API for processing XML documents as a stream of events. You may have written a handler class to be the recipient of these events. The handler class is notified of the following events, among others:

  • Start of Document
  • Start of Element
  • Characters
  • End of Element
  • End of Document

The handler class can respond to these events as it wishes. The easiest way to implement the ContentHandler interface is to extend the DefaultHandler class (org.xml.sax.helpers.DefaultHandler), which supplies empty implementations of every callback.

To parse an XML file using a custom handler, we might use the following code:

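import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

File file = new File("test.xml");
// SAXParser.parse() takes a DefaultHandler, so YourCustomHandler must extend it
DefaultHandler handler = new YourCustomHandler();

SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
parser.parse(file, handler);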

The SAXParser will invoke the callback methods on YourCustomHandler.
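As an illustration of the callbacks listed above, here is a minimal sketch of what such a handler might look like. The class name EventLoggingHandler and its eventsFor() helper are hypothetical and not part of the original article; the handler simply records each event it receives so the order of the callbacks is visible.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records each SAX callback as a line of text.
class EventLoggingHandler extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("startDocument"); }
    @Override public void endDocument() { events.add("endDocument"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("startElement " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver text in several chunks; whitespace-only chunks are skipped.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) events.add("characters " + text);
    }

    // Parses an XML string and returns the recorded events in order.
    static List<String> eventsFor(String xml) throws Exception {
        EventLoggingHandler handler = new EventLoggingHandler();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        eventsFor("<a><b>hi</b></a>").forEach(System.out::println);
    }
}
```

Only the callbacks of interest are overridden; DefaultHandler ignores everything else.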

Writing Your Own SAX Parser

To work with non-XML data structures, we need to build a parser that broadcasts SAX events to any registered handler classes. We don't even need to write a handler class. This seems strange at first if you are accustomed to writing handler classes.

A SAXSource object, representing the input to the transformation, is needed to use your parser in conjunction with the TrAX API. A SAXSource object can be constructed from an object that implements the XMLReader interface. This interface consists of several methods, most of which we don't need to be concerned with in our example.
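The wiring between an XMLReader and TrAX can be sketched as follows. This is only an illustration under stated assumptions: the class SaxSourceSketch and its methods are hypothetical, the JDK's stock XMLReader stands in for a custom one, and a stylesheet-less Transformer is used so the events are copied to the output unchanged.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper showing how a SAXSource pairs a reader with its input.
class SaxSourceSketch {
    // Returns the JDK's default namespace-aware XMLReader (a stand-in for a custom one).
    static XMLReader stockReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    // Wraps the reader and input in a SAXSource and serializes the resulting events.
    static String copy(XMLReader reader, InputSource input) throws Exception {
        SAXSource source = new SAXSource(reader, input);
        // newTransformer() with no stylesheet performs an identity transformation.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        InputSource input = new InputSource(new StringReader("<a><b>hi</b></a>"));
        System.out.println(copy(stockReader(), input));
    }
}
```

The transformer registers its own content handler on the reader and then calls the reader's parse() method, which is why any XMLReader implementation can serve as the source.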

We'll create an implementation of an XMLReader that can transform Java properties files into a stream of XML events. It should be a simple enough example to demonstrate how to transform arbitrary data structures into XML.

Here are the contents of the sample properties file we'll be working with.

Font-Family=Arial
Font-Size=12pt
Background-Color=White
Foreground-Color=Black

Note that there can be any number of key-value pairs in such a file. Now, let's take a look at our class that will read the properties file and transmit a series of SAX events.

public class PropertyFileParser implements XMLReader
{
  private ContentHandler contentHandler = null;

  public ContentHandler getContentHandler()
  {
    return contentHandler;
  }
  public void setContentHandler(ContentHandler handler)
  {
    contentHandler = handler;
  }

PropertyFileParser implements the XMLReader interface. Even though we don't have to write a ContentHandler, we do have to provide a mechanism for content handlers to be registered to receive events from our parser. TrAX will provide a content handler in this scenario.

Our main task is implementing the parse() method. The first parse method is the implementation required by the XMLReader interface. Here, we take the InputSource and load a Properties object. Then, we call our custom parse method.

public void parse(InputSource source) throws IOException,
                                             SAXException
{
  InputStream is = source.getByteStream();
  Properties p = new Properties();
  p.load(is);
  parse(p);
}
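One caveat with the method above: InputSource.getByteStream() returns null when the InputSource was created from a character stream or from a system ID alone. A more defensive loader might look like the following sketch. The class InputSourceProperties is hypothetical, the lookup order follows the convention described in the SAX InputSource documentation (character stream, then byte stream, then system ID), and the system ID is assumed to be an absolute URL.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.net.URI;
import java.util.Properties;
import org.xml.sax.InputSource;

// Hypothetical helper: loads Properties from whichever input the InputSource carries.
class InputSourceProperties {
    static Properties load(InputSource source) throws IOException {
        Properties p = new Properties();
        Reader chars = source.getCharacterStream();
        InputStream bytes = source.getByteStream();
        if (chars != null) {
            p.load(chars);                 // Properties.load(Reader) requires Java 6 or later
        } else if (bytes != null) {
            p.load(bytes);
        } else if (source.getSystemId() != null) {
            // Assumption: the system ID is an absolute URL such as file:///tmp/test.properties
            try (InputStream in = URI.create(source.getSystemId()).toURL().openStream()) {
                p.load(in);
            }
        } else {
            throw new IOException("InputSource has no character stream, byte stream, or system ID");
        }
        return p;
    }

    public static void main(String[] args) throws IOException {
        String text = "Font-Family=Arial\nFont-Size=12pt\n";
        Properties p = load(new InputSource(new StringReader(text)));
        System.out.println(p.getProperty("Font-Size"));
    }
}
```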

The custom parse method starts broadcasting the stream of events with the startDocument() and startElement() events for the root element of the "document." It iterates through an enumeration of the properties and generates startElement(), characters(), and endElement() events for each property. Finally, the endElement() for the root element and endDocument() events are sent.

private void parse(Properties p) throws SAXException
{
  // namespaceURI and attribs are assumed to be fields declared elsewhere in
  // the class, for example an empty String and an empty AttributesImpl.
  contentHandler.startDocument();
  contentHandler.startElement(namespaceURI,
                              "Properties",
                              "Properties", attribs);

  Enumeration e = p.propertyNames();

  while (e.hasMoreElements())
  {
    String key = (String)e.nextElement();
    String value = p.getProperty(key);

    // Each property becomes an element named after its key
    contentHandler.startElement(namespaceURI, key, key, attribs);
    contentHandler.characters(value.toCharArray(), 0,
                              value.length());
    contentHandler.endElement(namespaceURI, key, key);
  }

  contentHandler.endElement(namespaceURI, "Properties",
                                          "Properties");
  contentHandler.endDocument();
}

To satisfy the XMLReader interface, we also implement its remaining methods as empty (no-op) methods. The parse() methods shown above, however, are the meat of our SAX parser. The entire class can be viewed here: PropertyFileParser.java.
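For readers without the full listing at hand, the following is a self-contained sketch of how such a reader could be completed. It is not the author's PropertyFileParser.java: the class name MinimalPropertiesReader, the NO_NAMESPACE and noAttributes fields, and the toXml() helper are all assumptions made for illustration, and the sketch reads only from a character stream. It is driven through a stylesheet-less Transformer as a quick check that the events form a well-formed document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Enumeration;
import java.util.Properties;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.DTDHandler;
import org.xml.sax.EntityResolver;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical stand-in for PropertyFileParser; not the author's listing.
class MinimalPropertiesReader implements XMLReader {
    private static final String NO_NAMESPACE = "";
    private final AttributesImpl noAttributes = new AttributesImpl();
    private ContentHandler contentHandler;

    @Override public ContentHandler getContentHandler() { return contentHandler; }
    @Override public void setContentHandler(ContentHandler handler) { contentHandler = handler; }

    @Override
    public void parse(InputSource source) throws IOException, SAXException {
        Properties p = new Properties();
        p.load(source.getCharacterStream()); // assumption: a character stream is set
        contentHandler.startDocument();
        contentHandler.startElement(NO_NAMESPACE, "Properties", "Properties", noAttributes);
        Enumeration<?> names = p.propertyNames();
        while (names.hasMoreElements()) {
            String key = (String) names.nextElement();
            String value = p.getProperty(key);
            contentHandler.startElement(NO_NAMESPACE, key, key, noAttributes);
            contentHandler.characters(value.toCharArray(), 0, value.length());
            contentHandler.endElement(NO_NAMESPACE, key, key);
        }
        contentHandler.endElement(NO_NAMESPACE, "Properties", "Properties");
        contentHandler.endDocument();
    }

    @Override
    public void parse(String systemId) throws SAXException {
        throw new SAXException("parse(String) is not supported in this sketch");
    }

    // The remaining XMLReader methods are no-ops: settings are accepted and ignored.
    @Override public boolean getFeature(String name) { return false; }
    @Override public void setFeature(String name, boolean value) { }
    @Override public Object getProperty(String name) { return null; }
    @Override public void setProperty(String name, Object value) { }
    @Override public EntityResolver getEntityResolver() { return null; }
    @Override public void setEntityResolver(EntityResolver resolver) { }
    @Override public DTDHandler getDTDHandler() { return null; }
    @Override public void setDTDHandler(DTDHandler handler) { }
    @Override public ErrorHandler getErrorHandler() { return null; }
    @Override public void setErrorHandler(ErrorHandler handler) { }

    // Serializes the events produced for the given properties text.
    static String toXml(String propertiesText) throws Exception {
        SAXSource source = new SAXSource(new MinimalPropertiesReader(),
                                         new InputSource(new StringReader(propertiesText)));
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml("Font-Family=Arial\nFont-Size=12pt\n"));
    }
}
```

Because Properties is backed by a hash table, the child elements are not guaranteed to appear in file order. Keys that are not valid XML element names (for example, keys containing spaces) would also need to be mapped to legal names before being emitted.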





Page 1 of 2


