Ten Lessons Learned From An XML Developer, Page 2
Lesson 6: Know What Type of Transformation to Use
Another form of procedural abuse is using DOM programming, rather than XSLT, to transform XML structures. For relatively simple transformations, DOM programming can be a good fit. This technique programmatically reads from one DOM while creating another DOM node by node. However, it quickly becomes tedious. An XSLT stylesheet is a much easier way to transform complex XML structures. It also results in more maintainable code because the transformation code can be organized into discrete, rules-based templates (see Lesson 7). Using the TrAX API, it is very easy to run XSLT transformations from Java programs. Generally, you obtain a transformer object for a stylesheet, give it a source document, and it writes the outcome of the transformation to a result object.
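As a rough illustration of the TrAX flow described above, here is a minimal sketch. The class name, the stylesheet, and the sample order document are all hypothetical, and the checked exception is wrapped only to keep the call site short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TraxExample {

    // Hypothetical stylesheet: prints the customer name of an <order> as plain text.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/order'>"
        + "Order for <xsl:value-of select='customer'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs one transformation: stylesheet in, source document in, result text out. */
    static String transform(String xml, String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // The transformer is created for a specific stylesheet.
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            // The source document goes in; the result object collects the output.
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // prints "Order for Acme"
        System.out.println(transform("<order><customer>Acme</customer></order>", STYLESHEET));
    }
}
```

In a real program the stylesheet and the document would more likely come from files or URLs, which StreamSource also accepts.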
Lesson 7: Use Rules-Based Stylesheets
An XSLT stylesheet with many <xsl:for-each> statements, and especially nested <xsl:for-each> statements, is a procedural stylesheet. While it is probably functional, a procedural stylesheet is much more difficult to maintain than a rules-based stylesheet and doesn't take full advantage of the features of XSLT.
A rules-based stylesheet consists of several discrete templates that are invoked when their rule is matched. The XSLT engine does the tree walking: it visits the document nodes selected via an <xsl:apply-templates select="..."> statement and, for each node, applies the template whose rule matches it best.
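To make the idea concrete, the sketch below embeds a small rules-based stylesheet in a Java program and runs it with TrAX. The order and item vocabulary, the class name, and the output format are assumptions made up for this illustration. Note that the stylesheet contains no <xsl:for-each>; each kind of node gets its own template, and the more specific pattern takes precedence for taxable items.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class RulesBasedExample {

    // Hypothetical stylesheet with one template per rule.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        // Rule 1: for an order, let the engine walk its items.
        + "<xsl:template match='/order'><xsl:apply-templates select='item'/></xsl:template>"
        // Rule 2: the more specific pattern is chosen for taxable items.
        + "<xsl:template match=\"item[@taxable='yes']\">"
        +   "<xsl:value-of select='@name'/> (taxable);"
        + "</xsl:template>"
        // Rule 3: every other item.
        + "<xsl:template match='item'><xsl:value-of select='@name'/>;</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical input document.
    static final String ORDER =
        "<order><item name='pen'/><item name='laptop' taxable='yes'/></order>";

    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // prints "pen;laptop (taxable);"
        System.out.println(transform(ORDER, STYLESHEET));
    }
}
```

Supporting a new kind of item then means adding one more template rather than editing the body of a loop.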
A previous article, "Following The Rules-Based Approach To Stylesheet Development," solves the same business problem with both procedural and rules-based stylesheets and compares and contrasts the approaches. Another article, "Building Modular Stylesheet Components," dives a little deeper and shows how to design reusable stylesheet components.
It's a good idea for a beginning stylesheet developer to completely forgo <xsl:for-each> statements, which can lead to very procedural code, and concentrate on learning the rules-based approach.
Lesson 8: Optimize XSLT Stylesheet Execution
Compiling an XSLT stylesheet can be an expensive operation, taking a few hundred milliseconds. Improper use of the TrAX API, such as creating a new transformer straight from the stylesheet source for every request, recompiles the stylesheet with each transformation. Fortunately, TrAX allows a client to create a compiled representation of a stylesheet, called a Templates object, that can be cached and used over many transformations. A previous article, "Optimizing Stylesheet Execution with TrAX," discusses how to cache compiled representations of the stylesheets and make them hot deployable. Learn this technique and don't take an unnecessary performance hit when running XSLT transformations from Java programs.
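The caching idea can be sketched roughly as follows. This is not the code from the article mentioned above: the class name, the string keys, and the sample stylesheet are assumptions for illustration, and hot deployment is left out. It relies on two documented TrAX properties: a Templates object may be shared across threads, while each Transformer created from it should be used by one thread at a time.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StylesheetCache {

    // Hypothetical stylesheet used in the usage example below.
    static final String GREETING_XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/name'>Hello, <xsl:value-of select='.'/></xsl:template>"
        + "</xsl:stylesheet>";

    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Templates> cache = new ConcurrentHashMap<>();

    /** Returns the compiled stylesheet for a key, compiling it only on first use. */
    Templates get(String key, String xslt) {
        return cache.computeIfAbsent(key, k -> compile(xslt));
    }

    // Synchronized because TransformerFactory is not guaranteed to be thread-safe.
    private synchronized Templates compile(String xslt) {
        try {
            return factory.newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerException e) {
            throw new IllegalStateException("Stylesheet did not compile", e);
        }
    }

    /** Runs one transformation with a fresh Transformer made from the cached Templates. */
    String transform(String key, String xslt, String xml) {
        try {
            StringWriter out = new StringWriter();
            get(key, xslt).newTransformer()
                .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        StylesheetCache cache = new StylesheetCache();
        // The stylesheet is compiled once; the second call reuses the cached Templates.
        System.out.println(cache.transform("greeting", GREETING_XSLT, "<name>Ann</name>"));
        System.out.println(cache.transform("greeting", GREETING_XSLT, "<name>Bob</name>"));
    }
}
```

A production cache would probably key on the stylesheet's location and also track its modification time so that a changed stylesheet is recompiled.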
Lesson 9: Use DTDs or Schemas
DTDs and Schemas document the structure, content, and semantics of XML documents. However, they are not required by XML; a document essentially describes itself. Often, when programming XML, "strong-typing" and compile-time error checking are sacrificed for "strong-tagging" and run-time errors. Using DTDs and Schemas helps mitigate the risk of run-time errors. They can be used to communicate the "contract" of a document to various clients and to validate documents at run time.
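Run-time validation can be sketched with the standard javax.xml.validation API, as below. The order schema, the class name, and the sample documents are hypothetical; a real project would normally load its Schema from a file rather than from an inline string.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class SchemaCheck {

    // Hypothetical "contract": an order has a customer and a positive quantity.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='customer' type='xs:string'/>"
        + "<xs:element name='quantity' type='xs:positiveInteger'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    /** Compiles the schema once; the resulting Schema object can be reused. */
    static Schema compileSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema did not compile", e);
        }
    }

    /** Returns true if the document satisfies the schema. */
    static boolean isValid(Schema schema, String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not valid (or not well-formed)
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Schema schema = compileSchema();
        // prints "true"
        System.out.println(isValid(schema,
            "<order><customer>Acme</customer><quantity>3</quantity></order>"));
        // prints "false" because "three" is not a positive integer
        System.out.println(isValid(schema,
            "<order><customer>Acme</customer><quantity>three</quantity></order>"));
    }
}
```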
XML IDEs and editors take advantage of DTDs and Schemas by providing drop-down lists of values and other productivity features. Many can generate DTDs and Schemas for you, thereby eliminating a lot of tedious work.
When DTDs and Schemas are not used, a document's structure has not been fully defined. It can be misinterpreted and inadvertently changed. A document without a DTD or Schema cannot be validated except programmatically. In short, not using DTDs or Schemas is a shortcut a developer cannot afford to take.
Lesson 10: Use Document-Oriented Meta Data
Meta data is essentially data that describes the characteristics of other data or objects. Document-oriented meta data is quite simply meta data stored in an XML document.
Why use XML over relational databases or other data storage mechanisms for representing meta data? A single document can store complex data structures of related meta data elements in a cohesive manner. This document can be versioned in a configuration management tool. It can be edited using standard, off-the-shelf tools. If built with a DTD or Schema (see Lesson 9) and edited with an XML editor, the meta data author will be guided to create a valid document. Document-oriented meta data can be made available to an application via an XML database, file system, or URI. It can be loaded in memory as a DOM and easily queried. Contrast this with relational databases, which tend to scatter the meta data over many tables. Meta data stored that way is not cohesive by nature, and it is harder to version, edit, and represent directly in memory.
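As a small illustration of loading meta data into a DOM and querying it, consider the sketch below. The screen and field vocabulary, the class name, and the XPath expression are assumptions invented for this example, not a format taken from the articles cited in this lesson.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ScreenMetaData {

    // Hypothetical meta data document describing one screen and its fields.
    static final String META =
          "<screen id='customerEdit' title='Edit Customer'>"
        + "<field name='firstName' label='First Name' required='true'/>"
        + "<field name='nickname' label='Nickname' required='false'/>"
        + "<field name='lastName' label='Last Name' required='true'/>"
        + "</screen>";

    /** Loads the meta data into a DOM and returns the names of the required fields. */
    static List<String> requiredFields(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // One XPath expression replaces what would be a join across tables.
            NodeList nodes = (NodeList) xpath.evaluate(
                "/screen/field[@required='true']/@name", doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeValue());
            }
            return names;
        } catch (ParserConfigurationException | SAXException
                 | IOException | XPathExpressionException e) {
            throw new IllegalStateException("Could not read meta data", e);
        }
    }

    public static void main(String[] args) {
        // prints "[firstName, lastName]"
        System.out.println(requiredFields(META));
    }
}
```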
A previous article, "Dynamic Screen Generation with XSL," shows how to build an XSLT stylesheet that can dynamically generate a multitude of screens from document-oriented meta data that describes the screen. Another article, "Code Generation with XSL," shows how to generate Java programs from document-oriented meta data. These articles are examples of how to think more abstractly about solving a business problem and how to build meta data-driven frameworks and systems.
In this article, we've reviewed ten short lessons learned from an XML developer. Hopefully, they will help you to adopt some of the best practices and avoid some of the pitfalls in XML development. If you've read my articles, you'll notice that I always end them the same way: "The rest is up to you." This is because I firmly believe that it is in the doing that the learning occurs. It's one thing for me to explain a technique. By itself, this has some value. But it's much better when an article inspires you to want to try it for yourself. The light bulb really goes on in your head when you've got the code working and you are applying it to your problem domain. So, review this list of lessons learned before you start your next XML project. Adopt the best practices and avoid the pitfalls. The rest is up to you!
I would like to thank Tom Sorensen, Mike Stevens, and Jim Linn for taking time out of their busy schedules to review this and other articles prior to publication. Their feedback has been of immeasurable help to me. I would also like to thank all the readers who take the time to e-mail me with feedback and ideas for future articles. Please keep them coming! I appreciate it.
About the Author
Jeff Ryan is an architect for Hartford Financial Services. He has eighteen years of experience designing and developing automated solutions to business problems. His current focus is on Java, XML, and Web Services technology. He may be reached at email@example.com.