Manipulating and transforming data is a common programming task. When data is represented in XML, the usual approach to data transformation is to use XSLT. Under this scheme, a stylesheet is created based on the required transformations. That stylesheet is then linked to the original XML document. An XSLT processor reads both files, applies the transformations, and produces a single output file. This processing can happen on the client (for example, Internet Explorer contains an XSLT processor) or on the server (for example, a JavaBean can handle the transformation inside a Java servlet).
Transformations can change the actual data, modify the hierarchy of the original XML file, or both. A common reason for a transformation is that two disparate systems need to exchange information. Although the actual data requirements are the same in both systems, the structure of the data may differ according to each system’s DTD. Transformation is then a matter of taking the data from one file and putting it in another file with a different structure. XSLT relies on XPath (another W3C standard) to identify the parts of the XML data that must undergo transformation. For example, with XPath you can specify that the <address> element under <customer> is subject to transformation. XSLT then defines a template, which dictates how the data identified by XPath is to be transformed. The task of the processor is then to recursively find each item that is subject to change and apply the changes.
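To make the XPath and template idea concrete, here is a minimal sketch using the standard javax.xml.transform API that ships with Java. The sample document, the stylesheet, the <location> element name, and the class name AddressTransform are all made up for illustration. The stylesheet copies everything unchanged except the <address> element under <customer>, which its second template renames.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class AddressTransform {
    // Hypothetical source document (not taken from any real system).
    static final String XML =
        "<customer><name>Ann</name><address>1 Main St</address></customer>";

    // Hypothetical stylesheet: an identity template copies every node,
    // and a more specific template rewrites customer/address.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      // The XPath pattern selects <address> elements whose parent is <customer>.
      + "<xsl:template match=\"customer/address\">"
      + "<location><xsl:value-of select=\".\"/></location>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet text to the XML text and returns the result as a string. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The <address> element should come back renamed to <location>.
        System.out.println(transform(XML, XSLT));
    }
}
```

The processor, not the stylesheet author, walks the document: each node is matched against the templates, and the most specific matching template decides what is written to the output.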
There are two complaints I often hear regarding XSLT. One is that its syntax is especially hard to learn because of its recursive nature. You really have to think about the transformations, and the order in which you state the various rules becomes important. There are now tools on the market that allow you to interactively apply XSLT to an XML file and instantly see the result. This visual approach should help, but it may not address the complex cases. The other complaint is performance. One way to implement XSLT is to parse both the XML and XSLT files into separate DOM trees and then apply the transformations. Both trees must be held in memory, and the matching is inherently recursive, so the cost of a transformation is a real consideration. The various XSLT processors have improved, but architects should still pay close attention to the performance implications of using XSLT.
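The DOM-based approach described above can be sketched with the standard JAXP classes. This is only an illustration of the idea, not a description of how any particular XSLT processor works internally. The class name DomTransform and the sample documents are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomTransform {
    // Hypothetical input and stylesheet, used only for this sketch.
    static final String XML = "<customer><name>Ann</name></customer>";
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"/customer\">"
      + "<order><xsl:value-of select=\"name\"/></order>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Parses text into an in-memory DOM tree. */
    static Document parse(String text) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is required so the xsl: elements are recognized.
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(text)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    /** Applies a stylesheet held as a DOM tree to a document held as a DOM tree. */
    static Document transform(Document xml, Document xslt) {
        try {
            Transformer transformer =
                TransformerFactory.newInstance().newTransformer(new DOMSource(xslt));
            DOMResult result = new DOMResult(); // the output is a third DOM tree
            transformer.transform(new DOMSource(xml), result);
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        Document out = transform(parse(XML), parse(XSLT));
        System.out.println(out.getDocumentElement().getNodeName());
    }
}
```

Note that three complete trees exist in memory at once here (source, stylesheet, and result), which is one reason the memory cost of this style deserves attention for large documents.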
While XSLT is understood as the de facto standard for XML transformations, I believe there are alternatives that should be considered. For example, version 1.1 of JSP (JavaServer Pages) includes support for tag libraries. The idea is not new, but JSP does a good job of providing a comprehensive and familiar framework for implementing it. The basic idea is to associate Java code with tags (element names). Events are defined that denote the beginning, middle, or end of an XML tag. These events fire the Java code that has been explicitly associated with the tag. That Java code can in effect recreate the original XML document, expanding some tags and eliminating others. The result is the same kind of transformation that XSLT performs, although it is implemented in a different way.
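The event-driven idea can be sketched without a servlet container. The example below does not use the JSP tag library API; it substitutes the SAX parser from the standard Java library, whose start, character, and end callbacks play a similar role to the tag events described above. The TagCode interface, the class name TagEventTransform, and the sample tags are hypothetical, and the sketch ignores attributes and does not escape text.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TagEventTransform {
    /** Hypothetical callbacks tied to one element name: start, body text, end. */
    interface TagCode {
        void start(StringBuilder out);
        void body(String text, StringBuilder out);
        void end(StringBuilder out);
    }

    /** Default behavior: copy the element through unchanged (attributes are ignored). */
    static TagCode copy(final String name) {
        return new TagCode() {
            public void start(StringBuilder out) { out.append('<').append(name).append('>'); }
            public void body(String text, StringBuilder out) { out.append(text); }
            public void end(StringBuilder out) { out.append("</").append(name).append('>'); }
        };
    }

    /** Expands a tag by wrapping its text in an extra <street> element. */
    static final TagCode EXPAND_ADDRESS = new TagCode() {
        public void start(StringBuilder out) { out.append("<address><street>"); }
        public void body(String text, StringBuilder out) { out.append(text); }
        public void end(StringBuilder out) { out.append("</street></address>"); }
    };

    /** Eliminates a tag and its own text; nested child elements would still be copied. */
    static final TagCode DROP = new TagCode() {
        public void start(StringBuilder out) { }
        public void body(String text, StringBuilder out) { }
        public void end(StringBuilder out) { }
    };

    /** Parses the XML and fires the code registered for each element name. */
    static String transform(String xml, final Map<String, TagCode> rules) {
        final StringBuilder out = new StringBuilder();
        final Deque<TagCode> open = new ArrayDeque<TagCode>(); // code for currently open tags
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                TagCode code = rules.containsKey(qName) ? rules.get(qName) : copy(qName);
                open.push(code);
                code.start(out);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                if (!open.isEmpty()) {
                    open.peek().body(new String(ch, start, length), out);
                }
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                open.pop().end(out);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, TagCode> rules = new HashMap<String, TagCode>();
        rules.put("address", EXPAND_ADDRESS);
        rules.put("internal", DROP);
        String xml = "<customer><name>Ann</name><address>1 Main St</address>"
                   + "<internal>x</internal></customer>";
        System.out.println(transform(xml, rules));
    }
}
```

The design choice to notice is that the transformation rules live in ordinary Java code keyed by element name, rather than in a separate stylesheet language. A JSP tag library packages the same association in a form the JSP container can invoke.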
Not many people currently think of JSP taglibs as a transformation framework. They are, however, an alternative that should be explored further.