Getting Started with Java JAXP and XSL Transformations (XSLT), Page 2
The Source interface
Source is an interface, not a class. Sun has this to say about the Source interface:
"An object that implements this interface contains the information needed to act as source input (XML source or transformation instructions)."
(Note that the reference to transformation instructions in the above quotation is a reference to the input parameter to the second overloaded version of the newTransformer method discussed earlier. Again, I will show you how to use this version in a future lesson.)
In this program, I will create and use an object of the DOMSource class as the source for the transformation. (The DOMSource class implements the Source interface.)
The DOMSource class
Here is what Sun has to say about an object of the DOMSource class:
"Acts as a holder for a transformation Source tree in the form of a Document Object Model (DOM) tree."
The Result interface
Sun has this to say about the Result interface:
"An object that implements this interface contains the information needed to build a transformation result tree."
In this program, I will transform the DOMSource object into two different Result objects:
- A StreamResult object that points to the Standard Output Device (typically the screen).
- A StreamResult object that points to the output file.
Sun has this to say about the StreamResult class:
"Acts as an holder for a transformation result, which may be XML, plain Text, HTML, or some other form of markup."
Get a DOMSource object
Listing 4 shows the code that gets a DOMSource object, which represents the Document object.
DOMSource source = new DOMSource(document);
The DOMSource object produced in Listing 4 will later be transformed into two different Result objects.
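To make the relationship between the Document object and the DOMSource object concrete, here is a minimal sketch. It assumes the XML arrives as a string rather than as a file, and the class name DomSourceSketch and the helper method parse are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomSourceSketch {
    // Hypothetical helper: parses an XML string into a DOM Document.
    // (The lesson's program parses a file instead.)
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder parser = factory.newDocumentBuilder();
        return parser.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document document = parse("<poem><line>Roses are red</line></poem>");

        // Wrap the Document so that it can serve as transformation input.
        DOMSource source = new DOMSource(document);

        // A DOMSource can be used wherever a Source is required.
        Source asInterface = source;
        System.out.println(asInterface instanceof DOMSource);

        // The DOMSource simply holds a reference to the same Document.
        System.out.println(source.getNode() == document);
    }
}
```

The DOMSource object does not copy the tree. It only holds a reference to the node that was passed to its constructor.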
Get a StreamResult object
The statement in Listing 5 gets a StreamResult object that points to the Standard Output Device.
StreamResult scrResult = |
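Listing 5 is cut off in this copy of the lesson, so the following is only a sketch of one way to get a StreamResult object that points to the Standard Output Device. It assumes that System.out is passed to the constructor. The class name StreamResultSketch and its two helper methods are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.stream.StreamResult;

public class StreamResultSketch {
    // Assumption: standard output is reached by wrapping System.out.
    static StreamResult forScreen() {
        return new StreamResult(System.out);
    }

    // A StreamResult can also wrap a Writer, which is handy for capturing text.
    static StreamResult forWriter(StringWriter writer) {
        return new StreamResult(writer);
    }

    public static void main(String[] args) {
        StreamResult scrResult = forScreen();

        // StreamResult implements the Result interface.
        Result asInterface = scrResult;
        System.out.println(asInterface instanceof StreamResult);

        // The object remembers the stream that it was given.
        System.out.println(scrResult.getOutputStream() == System.out);
    }
}
```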
Transform the DOMSource to text on the screen
The statement in Listing 6 invokes the transform method of the Transformer class to transform the DOMSource object to text on the screen.
transformer.transform(source, scrResult);
Because the DOMSource object represents the Document object, the code in Listing 6 transforms the Document object to the screen. Since the Document object represents the original XML file, this effectively transforms the contents of the original XML file to the screen.
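The chain described above (XML text to Document, Document to DOMSource, DOMSource to Result) can be sketched end to end as follows. This is not the lesson's program. It assumes the XML is supplied as a string and it captures the output in a StringWriter so that the result can be inspected. The class name IdentityTransformSketch and the method toXml are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class IdentityTransformSketch {
    // Runs an identity transform on an XML string and returns the serialized text.
    static String toXml(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // newTransformer() with no arguments produces an identity Transformer.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();

        StringWriter captured = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(captured));
        return captured.toString();
    }

    public static void main(String[] args) throws Exception {
        // Prints the XML declaration followed by the unchanged element content.
        System.out.println(toXml("<poem><line>Roses are red</line></poem>"));
    }
}
```

Replacing the StringWriter with System.out sends the same text to the screen.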
The screen output
The statement shown in Listing 6 produced the screen output shown in Figure 2.
<?xml version="1.0" encoding="UTF-8"?> Figure 2 |
If you compare Figure 2 with the input XML file shown in Listing 10 near the end of the lesson, you will see that it matches in all respects but one. The one line that doesn't match is the XML declaration in the first line of Figure 2 and Listing 10.
The XML declaration
The XML declaration is really not part of the XML data. Rather, the XML declaration provides information to the processor being used to process the XML data. I don't believe that the XML declaration becomes a part of the DOM tree structure.
(Recall that in the previous lesson, I used a separate statement to write the XML declaration into the output file before beginning the process of writing data in the output file based on data in the DOM tree.)
The encoding attribute in the XML declaration shown in Figure 2 is optional. I elected not to include it in the original XML file. The author of the transform method of the Transformer class elected to include it in the transformed output. That is why it appears in Figure 2 and does not appear in Listing 10.
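The following sketch illustrates both points: the XML declaration does not show up as a node in the DOM tree, and the declaration in the output is generated by the Transformer object. The use of OutputKeys.OMIT_XML_DECLARATION goes beyond what this lesson's program does. I include it only to show that the declaration is under the control of the transformer. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XmlDeclarationSketch {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Serializes the Document, optionally asking the transformer to
    // leave out the XML declaration (not done in the lesson's program).
    static String serialize(Document document, boolean omitDeclaration) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        if (omitDeclaration) {
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        }
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document document =
                parse("<?xml version=\"1.0\"?><poem><line>Roses are red</line></poem>");

        // The first child of the Document is the root element,
        // not a node representing the declaration.
        System.out.println(document.getFirstChild().getNodeName());

        // The transformer writes a declaration of its own by default.
        System.out.println(serialize(document, false).startsWith("<?xml"));
        System.out.println(serialize(document, true).startsWith("<?xml"));
    }
}
```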
Write an output XML file
The three statements in Listing 7 perform the following three actions in order:
- Get an output stream for the output XML file.
- Get a StreamResult object that points to the output file.
- Transform the DOMSource object to text in the output file.
PrintWriter outStream = new PrintWriter( |
Figure 3 shows the contents of the output file produced by Listing 7.
<?xml version="1.0" encoding="UTF-8"?> Figure 3 |
As you might have surmised, the contents of the output file shown in Figure 3 match the screen output shown in Figure 2. Also, with the exception of the optional encoding attribute in the XML declaration, the contents of the output file match the contents of the original XML file shown in Listing 10.
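Listing 7 is cut off in this copy, so here is a sketch of the three steps listed above under some assumptions of my own: the XML comes from a string, the output goes to a File passed as a parameter, and a try-with-resources block closes the PrintWriter. The class name FileResultSketch and the method writeXml are hypothetical.

```java
import java.io.File;
import java.io.PrintWriter;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class FileResultSketch {
    static void writeXml(String xml, File outFile) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Transformer transformer = TransformerFactory.newInstance().newTransformer();

        // Step 1: get an output stream for the output XML file.
        // Closing the writer at the end of the block also flushes it.
        try (PrintWriter outStream = new PrintWriter(outFile, "UTF-8")) {
            // Step 2: get a StreamResult object that points to the output file.
            StreamResult fileResult = new StreamResult(outStream);
            // Step 3: transform the DOMSource object to text in the output file.
            transformer.transform(new DOMSource(document), fileResult);
        }
    }

    public static void main(String[] args) throws Exception {
        File outFile = File.createTempFile("xslt-sketch", ".xml");
        writeXml("<poem><line>Roses are red</line></poem>", outFile);
        System.out.println(new String(
                Files.readAllBytes(outFile.toPath()), StandardCharsets.UTF_8));
        outFile.delete();
    }
}
```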
End of the try block
Listing 7 also signals the end of the try block and the end of the code required to apply an identity XSL Transformation to a Document object.
Now you know how to use an identity transform to either display the XML data encapsulated in a Document object, or to cause that XML data to be written into a new XML file.
The remainder of this lesson deals with errors and exceptions, with particular emphasis on providing meaningful output in the event of a parser error.
Potential errors and exceptions
If we scan back through the code, we can identify the following expressions related to XML processing that have the potential of throwing errors and exceptions (I will omit I/O exceptions from this discussion). A review of the Sun documentation reveals that these expressions can throw the errors and exceptions shown.
- parser.parse(new File(argv[0])) throws SAXException if any parse errors occur.
- docBuildFactory.newDocumentBuilder() throws ParserConfigurationException if a DocumentBuilder cannot be created which satisfies the configuration requested.
- xformFactory.newTransformer() throws TransformerConfigurationException. According to Sun, the method "May throw this during the parse when it is constructing the Templates object and fails."
- transformer.transform(source, scrResult) throws TransformerException if an unrecoverable error occurs during the course of the transformation.
- transformer.transform(source, fileResult) throws TransformerException if an unrecoverable error occurs during the course of the transformation.
- TransformerFactory.newInstance() throws TransformerFactoryConfigurationError if the implementation is not available or cannot be instantiated.
- DocumentBuilderFactory.newInstance() throws FactoryConfigurationError if the implementation is not available or cannot be instantiated.
The remaining code in the program provides specific catch blocks for some, but not all of the exceptions and errors listed above.
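The following sketch shows one way that such catch blocks can be arranged. It is not the lesson's program (see Listing 9 for that). The class name CatchOrderSketch and the method process are hypothetical, and the method returns a label instead of printing a report. Note that a subclass must be caught before its superclass. Also note that the two factory configuration errors in the list above extend Error rather than Exception, so a catch block for Exception does not intercept them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class CatchOrderSketch {
    // Returns "ok" on success, otherwise the name of the catch block that ran.
    static String process(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            TransformerFactory.newInstance()
                    .newTransformer()
                    .transform(new DOMSource(document),
                               new StreamResult(new StringWriter()));
            return "ok";
        } catch (SAXParseException e) {          // subclass of SAXException: goes first
            return "SAXParseException";
        } catch (SAXException e) {
            return "SAXException";
        } catch (ParserConfigurationException e) {
            return "ParserConfigurationException";
        } catch (TransformerConfigurationException e) {  // subclass of TransformerException
            return "TransformerConfigurationException";
        } catch (TransformerException e) {
            return "TransformerException";
        } catch (Exception e) {                  // everything else, including IOException
            return "Exception";
        }
    }

    public static void main(String[] args) {
        System.out.println(process("<poem><line>Roses are red</line></poem>"));
        // The parser may also write its own error message to standard error here.
        System.out.println(process("<poem><line>Roses are red</poem>"));
    }
}
```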
(A general Exception catch block is provided to handle those errors and exceptions for which specific catch blocks are not provided.)
The SAXException class
The classes of primary interest in this lesson are the SAXException class and a subclass of that class named SAXParseException. Here is part of what Sun has to say about the SAXException class (boldface added by this author for emphasis):
"Encapsulate a general SAX error or warning. ... This class can contain basic error or warning information from either the XML parser or the application: a parser writer or application writer can subclass it to provide additional functionality. SAX handlers may throw this exception or any exception subclassed from it.
If the application needs to pass through other types of exceptions, it must wrap those exceptions in a SAXException or an exception derived from a SAXException.
If the parser or application needs to include information about a specific location in an XML document, it should use the SAXParseException subclass."
The SAXParseException class
The SAXParseException class is a subclass of SAXException. An object of SAXParseException can
"Encapsulate an XML parse error or warning. ... This exception will include information for locating the error in the original XML document."
The list that I showed you earlier indicated that the parse method of the DocumentBuilder class throws SAXException. That means that it can also throw any exception that is a subclass of SAXException. As it turns out, the parse method actually throws a SAXParseException, for at least some of the possible parsing error types.
The SAXParseException handler
Listing 8 shows the entire catch block for handling an exception of type SAXParseException.
catch(SAXParseException saxEx){
|
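Because Listing 8 is cut off in this copy, here is a sketch of the kind of report that such a catch block can produce. The accessor methods are all defined by SAXParseException or inherited from SAXException, but the selection and order of the lines, the labels, the class name ParseErrorReportSketch, and the method report are my assumptions. The sample values in main are hypothetical as well.

```java
import org.xml.sax.SAXParseException;

public class ParseErrorReportSketch {
    // Builds a multi-line description of a parse error.
    static String report(SAXParseException saxEx) {
        StringBuilder text = new StringBuilder("SAXParseException\n");
        text.append("Public ID: ").append(saxEx.getPublicId()).append('\n');
        text.append("System ID: ").append(saxEx.getSystemId()).append('\n');
        // Line and column are -1 when the parser cannot supply them.
        text.append("Line: ").append(saxEx.getLineNumber()).append('\n');
        text.append("Column: ").append(saxEx.getColumnNumber()).append('\n');
        text.append("Message: ").append(saxEx.getMessage()).append('\n');
        // getException() returns null when no other exception is wrapped.
        text.append("Wrapped exception: ").append(saxEx.getException()).append('\n');
        return text.toString();
    }

    public static void main(String[] args) {
        // Hypothetical values, constructed by hand instead of by a parser.
        SAXParseException sample = new SAXParseException(
                "sample message", null, "file:sample.xml", 7, -1);
        System.out.print(report(sample));
    }
}
```

In a real program, the report method would be called from inside the catch block with the exception object that the parser threw.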
Listing 11 contains an XML file named Xslt01bad.xml for which a right angle bracket was purposely omitted from the end tag on the sixth line of text. As a result, the XML document is not well formed because the line element on the sixth line is malformed.
The screen output
When this program was used to process the corrupt file named Xslt01bad.xml, the code in Listing 8 produced the output shown in Figure 4. (Note that I manually inserted a line break to force some of the output to fit in this narrow publication format.)
SAXParseException Figure 4 |
You should be able to correlate each line of output in Figure 4 with the statements in Listing 8.
The -1 reported for the column number in Figure 4 indicates that the column number was "not available" to the method named getColumnNumber. The reported line number value of 7 is also one line beyond the actual line where the error occurs in the XML document.
(My interpretation of this situation is that the parser considered the error to be before the first character in line 7 instead of at the end of line 6. The error became apparent to the parser when it encountered the left angle bracket for a new start tag without the previous end tag having been properly terminated with a right angle bracket.)
Parsing with Internet Explorer
For comparison purposes, Figure 5 shows the result of attempting to parse the same corrupt XML file using Internet Explorer.
Figure 5 Parsing error as per Internet Explorer
As you can see, the IE parser considered the error to be at the beginning of line 7 instead of at the end of line 6. However, it was able to provide a column number. (It also provides a nice graphic display showing the location of the error.)
Wrapped exceptions
As indicated in the earlier quotations from Sun, objects of the classes SAXException and SAXParseException can wrap other exceptions. The mechanism for getting and displaying the wrapped exception, if any, is shown by the invocation of the getException method on the SAXParseException object in Listing 8. According to Sun, the getException method, which is inherited from SAXException, "returns the embedded exception, if any." The embedded exception is returned as type Exception.
The screen output in Figure 4 indicates that there was no embedded exception in this sample case.
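Here is a small sketch of the wrapping mechanism in isolation. It builds the exceptions by hand instead of getting them from a parser. The class name WrappedExceptionSketch, the method describeEmbedded, and the sample messages are hypothetical.

```java
import java.io.IOException;
import org.xml.sax.SAXException;

public class WrappedExceptionSketch {
    // Describes the embedded exception, or reports that there is none.
    static String describeEmbedded(SAXException saxEx) {
        Exception embedded = saxEx.getException();
        if (embedded == null) {
            return "none";
        }
        return embedded.getClass().getName() + ": " + embedded.getMessage();
    }

    public static void main(String[] args) {
        // A SAXException that wraps another exception.
        SAXException wrapping = new SAXException(
                "sample outer message", new IOException("sample inner message"));
        System.out.println(describeEmbedded(wrapping));

        // A SAXException with nothing embedded, as in the case shown in Figure 4.
        SAXException plain = new SAXException("sample message");
        System.out.println(describeEmbedded(plain));
    }
}
```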
The remaining exception handlers
You can view the remaining exception handlers in Listing 9 near the end of the lesson. There is nothing unusual about any of them. Therefore, I won't discuss them in detail.
Run the Program
I encourage you to copy the code and XML data from Listings 9, 10, and 11 into your text editor. Compile the program and execute it. Experiment with it, making changes, and observing the results of your changes.
Summary
In this second lesson on Java JAXP, I began by providing a brief review of XSL and XSL Transformations (XSLT).
Then I showed you how to create an identity Transformer object, and how to use that object to:
- Display a DOM tree structure on the screen in XML format.
- Write the contents of a DOM tree structure into an output XML file.
Following that, I showed you how to write exception handlers that provide meaningful information in the event of errors and exceptions, with particular emphasis on parser errors and exceptions.
What's Next?
In the next lesson, I will show you how to write a program to display a DOM tree on the screen in a format that is much easier to interpret than raw XML code.
Complete Program Listings
/*File Xslt01.java |
<?xml version="1.0"?> |
A listing of the file named Xslt01bad.xml is provided in Listing 11 below. Note the missing right angle bracket at the end of line 6.
<?xml version="1.0"?> |
Copyright 2003, Richard G. Baldwin. Reproduction in whole or in part in any form or medium without express written permission from Richard Baldwin is prohibited.
About the author
Richard Baldwin is a college professor (at Austin Community College in Austin, TX) and private consultant whose primary focus is a combination of Java, C#, and XML. In addition to the many platform and/or language independent benefits of Java and C# applications, he believes that a combination of Java, C#, and XML will become the primary driving force in the delivery of structured information on the Web.
Richard has participated in numerous consulting projects, and he frequently provides onsite training at the high-tech companies located in and around Austin, Texas. He is the author of Baldwin's Programming Tutorials, which has gained a worldwide following among experienced and aspiring programmers. He has also published articles in JavaPro magazine.
Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.
