February 2000

Use XML Even As It Changes

Here’s how you can tackle application-to-application integration needs while building a migration path to XML Schema

These days you can’t predict what other applications an application will have to integrate with. Yet application-to-application (A2A) interfaces often focus on localized data, without regard for cross-function and cross-domain reuse of common data. The focus is now shifting to recognize architectural practices, standards, and the importance of enterprise metadata. But while these techniques help, you need more to really enable exchanging data between architected and legacy applications.

If the applications are engineered like jigsaw puzzles, each app mates easily only with adjacent apps that are cut to fit. But if you can build them with pieces engineered like Tinker Toys, they become more versatile (see Figure 1). Such apps exchange information through common interfaces, yet retain their internal functional autonomy. The secret lies in building common messages using standards such as XML, rather than the syntactic or positional formats used by traditional EDI.

The current XML specification uses Document Type Definitions (DTDs) to describe the content and structure of an XML document. However, there’s an innovation waiting in the wings: XML Schema. Schema will most likely present the best solution for describing metadata with XML. But current implementations are often based on DTDs. Schema should be adopted rather rapidly, but a number of industry-based XML vocabularies and numerous custom-developed XML DTDs will require a reasonable period of migration.

This article will show you how to address your short-term A2A integration needs with XML DTDs and how to build in a handy migration path to Schema, then preview how Schema will most likely work.

For A2A integration, you can use XML to define the contents of a message used by an interface.
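A message defined this way might look like the following sketch, in which two receivers each pull out only the tags they need. The message layout, the tag names, and the `extract` helper are all assumptions made for illustration; nothing here is a published vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical interface message; tag names are illustrative only.
MESSAGE = """<ProductMessage>
  <Product>
    <ProductName>Industrial Widget</ProductName>
    <ProductFamily>Fasteners</ProductFamily>
    <ListPrice>19.95</ListPrice>
  </Product>
</ProductMessage>"""


def extract(message_xml, wanted_tags):
    """Return {tag: text} for the tags a receiver cares about, skipping the rest."""
    root = ET.fromstring(message_xml)
    found = {}
    for element in root.iter():
        if element.tag in wanted_tags:
            found[element.tag] = (element.text or "").strip()
    return found


# Each receiver names the tags it wants; element order in the message is irrelevant.
sales_view = extract(MESSAGE, {"ProductName", "ListPrice"})
marketing_view = extract(MESSAGE, {"ProductName", "ProductFamily"})
print(sales_view)
print(marketing_view)
```

Because lookup is by tag name, a new element can be added to the message without disturbing either receiver.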
The sending and receiving apps can interrogate, extract, and interpret message contents by the tag, rather than by position or placement. From a broader enterprise architecture view, this lets applications leverage a common interface message framework. Functional message sets can be constructed to support exchange between many applications, rather than use the point-to-point paradigm. You design message content around enterprise domains and functions rather than specific applications. Reusable sets of common data components are clustered and standardized for use by broad sets of applications. Receiving applications extract and interpret the discrete message components they need, ignoring the balance.

True message brokers extend this scenario. They act as traffic cops and navigators between applications. Broker objects could interpret functional content from messages, then route them to desired targets. But message brokers conceal some gotchas.

Use metadata to bridge data islands

Applications that support, say, Sales Order Management, might manage product information in a relational database, with a Product Name element defined by a PROD_NM column. You could also describe this element by the metadata characteristics of data type "character" and length "30". Similarly, applications that support Marketing might also manage product information, only they’re using a different relational database with a PRODUCT_GRP column. This column might be described by a data type "char" and length "15". Further complicating things might be an additional data element in the Marketing database for Product Family. The resulting database column is PROD_FAM_NM, of data type "character" and length "15".

It would surely help to be able to exchange product information between the functional domains of Sales Order Management and Marketing.
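The three column descriptions above can be written down as plain records, which makes the differences between them easy to inspect. This is only a sketch: the `ColumnMeta` record, the `exchange_risks` helper, and the assumption that "char" is a synonym for "character" are illustrative and not part of either database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMeta:
    """Hypothetical record of the metadata quoted in the text."""
    domain: str
    column: str
    data_type: str
    length: int


SALES_NAME = ColumnMeta("Sales Order Management", "PROD_NM", "character", 30)
MARKETING_GROUP = ColumnMeta("Marketing", "PRODUCT_GRP", "char", 15)
MARKETING_FAMILY = ColumnMeta("Marketing", "PROD_FAM_NM", "character", 15)

# Assumption: the two spellings describe the same underlying type.
TYPE_SYNONYMS = {"char": "character"}


def canonical_type(data_type):
    return TYPE_SYNONYMS.get(data_type, data_type)


def exchange_risks(source, target):
    """List what could go wrong when a value moves from source to target."""
    risks = []
    if canonical_type(source.data_type) != canonical_type(target.data_type):
        risks.append("data types differ")
    elif source.data_type != target.data_type:
        risks.append("data type is spelled differently")
    if source.length > target.length:
        risks.append(f"values longer than {target.length} characters would be truncated")
    if source.column != target.column:
        risks.append("column names differ, so the meaning must be confirmed")
    return risks


for risk in exchange_risks(SALES_NAME, MARKETING_GROUP):
    print(risk)
```

Reversing the direction, for instance from `MARKETING_FAMILY` to `SALES_NAME`, removes the truncation risk but not the naming question.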
But if the data were extracted from the two different databases and used for the exchange between the apps, you may well get anomalies related to data type, data value truncation, and semantic description.

In a perfect world you’d identify the disparate data sources, standardize, then re-engineer both data and applications to be fully built and made reusable. (Try selling such a costly project to your management, though.) Or you could identify the valuable data that’s common to both domains, then create metadata characteristics that define a minimum set of rules for describing the data elements in a message context. For your legacy apps this might result in some level of re-engineering or utility wrapper development.

Rife with complexities

As a somewhat simplified and abstract group, these characteristics comprise strong data typing. They’ll generally let you identify a data element by its name, the constraints of its use by data type, and the limitations of its value by the length and decimal scale. And if the marketing and sales order management applications were aware of these characteristics, message exchanges would deliver better data quality.

In native form, an XML DTD can provide some of these important characteristics. With the ability to describe data by element tags and attributes, XML and DTDs become a great candidate solution for describing the content of a message. However, DTDs are not a universal remedy. As you venture into the XML world, you will quickly learn that data type, length, and decimal scale are not intrinsic to DTD specifications. In fact, most XML data content is simply defined as "string" or "character". XML document content is defined as character data regardless of whether the origin was actually numeric.

DTDs today, Schema tomorrow

Bray’s model needs some tweaking to scale up to high-volume production environments. First, including #FIXED attributes in the DTD for the XML document of the message can add some dependencies and overhead.
Data content for #FIXED attributes is defined by default values in the DTD. You can’t instantiate these fixed attributes separately within the content of an XML document. So even though fixed attributes are defined to the DTD and describe data values for elements of the XML document, they aren’t populated in it. When the DTD is used and validation enabled, the attributes’ values describing metadata characteristics are available only to the document and the instantiated Document Object Model (DOM).

For our purposes, think of a DOM as a set of nodes defined within a hierarchical structure. The nodes are populated based on the content and structure of the XML document. An application can then navigate the DOM to extract data, using a set of APIs.

When a high volume of messaging and element content is passing between applications, you might lack the headroom to validate against a DTD as part of the parsing process. In that case the #FIXED attributes used to describe the data type, length, and decimal scale in an externally defined DTD would not be instantiated to either the XML document or the resulting DOM.

As for document content overhead, even if validation weren’t an issue, you still get repetitive data values for every metadata attribute applied to every element occurrence of the XML document (and the instantiated DOM). Your application can navigate the DOM and extract the metadata attribute values from each corresponding node, but the values will be the same for each element instance. If an XML document contains 1,000 instances of "Product", then the data type, length, and decimal scale values are repeated for every instance.

My morph of Bray’s model tackles these complexities where some object, application, or process is being developed to interpret XML metadata characteristics and is triggered to apply the corresponding rules. If not, the metadata attributes are ignored.
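The repetition described above can be seen in a small sketch. It assumes the DTD sits in the document's internal subset so that Python's non-validating expat parser can read the declarations; that placement, the attribute names `datatype` and `length`, and the `collect_product_attributes` helper are choices made for this illustration, not the externally defined DTD the text discusses.

```python
import xml.parsers.expat

# Hypothetical message: the #FIXED metadata attributes are declared in the
# internal DTD subset and never written on the Product elements themselves.
DOCUMENT = """<!DOCTYPE Products [
  <!ELEMENT Products (Product*)>
  <!ELEMENT Product (#PCDATA)>
  <!ATTLIST Product
      datatype CDATA #FIXED "character"
      length   CDATA #FIXED "30">
]>
<Products>
  <Product>Widget</Product>
  <Product>Gadget</Product>
  <Product>Sprocket</Product>
</Products>"""


def collect_product_attributes(document):
    """Return the attribute dict the parser reports for each Product element."""
    seen = []

    def on_start(name, attributes):
        if name == "Product":
            seen.append(dict(attributes))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return seen


reported = collect_product_attributes(DOCUMENT)
for attributes in reported:
    # The same metadata values come back for every element instance.
    print(attributes)
```

The element content never mentions the attributes, yet every `Product` start tag is reported with the same pair of values, which is the overhead at issue once a document holds many instances.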
Scale with metadata templates

My alternative also requires providing an interface management process (or object) to map the content of the metadata template to the elements of the same tag name in the XML message document, and to apply any necessary rules or editing. You could do this by instantiating the DOM using the XML message document. Based on a processing instruction, the interface management application would map the XML document’s elements to a corresponding XML metadata template.

Of course, nothing’s free. Mapping, validation, and anomaly reporting/resolution need to be accomplished by the interface management process. Mapping occurs between elements of the same name as defined in the template and XML message document. So you need to weigh the overhead of this added processing against the resulting improved integration and high-quality data exchange.

As for triggering the process, you might try an XML processing instruction to invoke this supplemental metadata validation process. The process will need to build the DOM, and it will be up to an interrogating application (such as the interface management process) to identify the appropriate instruction, invoke the metadata validation process when necessary, address anomalies, and route the interface message accordingly.

My method delivers a lot of reusability. The XML metadata template document is separate from the actual data of the interface message, and is more static. So you can define the template once, reuse it, and apply it to multiple instances of A2A messages as needed.

And this separation of the interface message document from the metadata template helps you migrate to XML Schema. By separating out the metadata characteristics, the migration process doesn’t have to deal with non-message content. You translate the native XML message document (and a DTD, if you use one) to a simple baseline XML Schema. You can then address the subtler needs of the metadata as an enhancement of the conversion script.
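One way to picture the template approach is the sketch below: a processing instruction in the message triggers validation, and elements are mapped to template entries by tag name. The template layout, the `metadata-check` instruction name, and every function here are hypothetical; the article does not prescribe a format, so treat this as one possible reading rather than the method itself.

```python
from decimal import Decimal, InvalidOperation
from xml.dom import minidom

# Hypothetical metadata template: one entry per message tag name.
TEMPLATE_XML = """<MetadataTemplate>
  <ProductName datatype="character" length="30"/>
  <ProductFamily datatype="character" length="15"/>
  <UnitPrice datatype="decimal" length="7" scale="2"/>
</MetadataTemplate>"""

# Hypothetical message; the processing instruction requests metadata validation.
MESSAGE_XML = """<?metadata-check template="product-v1"?>
<Products>
  <Product>
    <ProductName>Industrial Widget</ProductName>
    <ProductFamily>Heavy Duty Fastening Hardware</ProductFamily>
    <UnitPrice>19.999</UnitPrice>
  </Product>
</Products>"""


def load_template(template_xml):
    """Return {tag name: rule} built from the template document."""
    rules = {}
    root = minidom.parseString(template_xml).documentElement
    for node in root.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            rules[node.tagName] = {
                "datatype": node.getAttribute("datatype"),
                "length": int(node.getAttribute("length")),
                "scale": int(node.getAttribute("scale") or 0),
            }
    return rules


def validation_requested(document):
    """True when the message carries the (assumed) metadata-check instruction."""
    return any(
        node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "metadata-check"
        for node in document.childNodes
    )


def element_text(element):
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    ).strip()


def find_problem(value, rule):
    """Return a short anomaly description, or None when the value fits the rule."""
    if rule["datatype"] == "decimal":
        try:
            number = Decimal(value)
        except InvalidOperation:
            return "not a decimal number"
        if not number.is_finite():
            return "not a decimal number"
        _, digits, exponent = number.as_tuple()
        scale = max(-exponent, 0)
        if scale > rule["scale"]:
            return f"scale {scale} exceeds {rule['scale']}"
        if len(digits) + max(exponent, 0) > rule["length"]:
            return f"{len(digits)} digits exceed {rule['length']}"
        return None
    if len(value) > rule["length"]:
        return f"length {len(value)} exceeds {rule['length']}"
    return None


def validate_message(message_xml, rules):
    """Map message elements to template rules by tag name and collect anomalies."""
    document = minidom.parseString(message_xml)
    if not validation_requested(document):
        return []  # no instruction present: the template is ignored
    anomalies = []
    for tag, rule in rules.items():
        for element in document.getElementsByTagName(tag):
            problem = find_problem(element_text(element), rule)
            if problem:
                anomalies.append((tag, problem))
    return anomalies


anomalies = validate_message(MESSAGE_XML, load_template(TEMPLATE_XML))
for tag, problem in anomalies:
    print(tag, "->", problem)
```

The template is parsed once and can be applied to any number of messages, and a message without the instruction passes through with no metadata work done.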
Here, as before, when the metadata validation process isn’t enabled, the template document and validation process are ignored.

This use of a separate metadata XML template document tackles the problems of high-volume message exchanges and XML message documents with excessive element content. It also addresses some of the more obvious metadata gaps of XML version 1.0 DTDs, though of course XML DTDs may be migrated or replaced at some point after XML Schema is ratified.

XML Schema as a solution

Base datatypes include: string, boolean, real, timeinstant, timeduration, binary, uri, language, decimal, integer, and date.

Another submission to the W3C, "XML-Data", extends the notion of strong datatyping into a more richly defined datatype. This came from Microsoft, ArborText, DataChannel, Inso, and the University of Edinburgh. It proposes using many discrete datatypes, including float (real number, with no limit on digits) and fixed.14.4 (number with up to 14 digits to the left of the decimal point, and up to 4 to the right).

Facets are single defining aspects of concepts or objects. I think of facets as attributive characteristics of other characteristics. For example, fundamental facets described by the W3C draft include: Order, Bounds, Cardinality, Exact and Approximate, and Numeric. Non-fundamental facets include length, maximum length, pattern, and enumeration.

Easier migration

There’s not a lot of maturity in the XML tool and utility market so far. But there are a few tools that can help you migrate from DTDs to Schema, and help recoup your investment in using XML and DTDs for A2A messaging.

My favorite tool for conversions thus far is XML Authority, from Extensibility. It can import an XML document or a DTD, then export XML Schema (based on the current draft recommendations). It also supports some other variations, such as XML-Data and XML frameworks such as BizTalk.
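Returning to the non-fundamental facets named in the Schema discussion (maximum length, pattern, enumeration), each one can be pictured as a simple check applied to a value. The sketch below is only a mental model: the `Facets` record and `violations` helper are hypothetical, and it does not reproduce the syntax of any W3C draft.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Facets:
    """Hypothetical container for a few non-fundamental facets."""
    max_length: Optional[int] = None
    pattern: Optional[str] = None
    enumeration: Optional[Sequence[str]] = None


def violations(value: str, facets: Facets) -> List[str]:
    """Return the names of the facets that the value fails to satisfy."""
    failed = []
    if facets.max_length is not None and len(value) > facets.max_length:
        failed.append("maxLength")
    if facets.pattern is not None and re.fullmatch(facets.pattern, value) is None:
        failed.append("pattern")
    if facets.enumeration is not None and value not in facets.enumeration:
        failed.append("enumeration")
    return failed


# Illustrative rules: a product code shaped like ABC-1234, and a closed list of families.
PRODUCT_CODE = Facets(max_length=8, pattern=r"[A-Z]{3}-\d{4}")
PRODUCT_FAMILY = Facets(enumeration=("Hardware", "Software"))

print(violations("ABC-1234", PRODUCT_CODE))
print(violations("abc-12345", PRODUCT_CODE))
print(violations("Firmware", PRODUCT_FAMILY))
```

Each facet narrows the set of acceptable values for a base datatype, which is the kind of constraint a DTD alone cannot express.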
Regardless of the XML recommendation currently in play, there are tremendous advantages in the A2A space for XML. Enterprise integration can go a long way in the area of metadata characteristics for messages, but you need to be smart about which solutions will benefit your organization the most. In general, it’s time to start using XML to resolve some of your A2A integration and data disparity issues. A common messaging utility, XML message, or interface broker can really help police your application integration and interface dilemmas. Just be sure to validate your concepts for feasibility, impact, performance, and cost/benefit. And remember to capture volume and performance metrics. This will help to document the effectiveness of your solution.
Figure 1. From jigsaw puzzle to Tinker Toy. Traditionally, A2A integration was achieved by building apps like pieces of a jigsaw puzzle, using point-to-point, proprietary interfaces. This worked until you had to change any part of the puzzle, or use your pieces in new puzzles. Far better is to design applications like Tinker Toys, using standardized interfaces that let you plug anything anywhere.
A Tool For Migrating From XML DTDs To Schema
Migration from Document Type Definitions (DTDs) to Schema can be a challenge. But XML tools such as XML Authority from Extensibility can help ease the burden by bringing the benefits of data modeling and design to the XML world. For this you need tools that can provide the graphical representation of an XML document and its structure, especially the metadata-related content. The only products I’ve found to do this thus far are Extensibility’s XML Authority (the subject of this discussion) and Open Text’s DTD editing and modeling tool, Near and Far Designer. But that kind of paucity should be expected in such a new market.

Fortunately, XML Authority v1.1 combines a simple interface with extensive functionality. When paired with the editors and authoring tools in my XML toolbox, the combination lets me model overall design and also deal with the more granular aspects of XML prototyping.
XML Authority is aware only of what’s defined by the structure of the source document, so it doesn’t automatically resolve the obvious metadata gaps of DTDs. Also, I have to review the exported XML source and make any necessary changes using the selected syntax. Still, this takes far less manual effort than coding the XML Schema from scratch. Actual effort will vary depending on the DTDs’ size and complexity.

Help at data server level

In a large A2A environment there are always issues associated with extracting content from the data layer for presentation and exchange at the Web layer. Two examples are element naming and data type conversions. By extracting the metadata from the DBMS layer, XML Authority helps greatly in this arena.

Also of benefit to architects are the extensive help text, tour, schema description, and best practices. If you don’t have time to wade through the World Wide Web Consortium drafts in detail, use XML Authority’s help facility, which summarizes the more important topics.

Pricing is $99.95 for one user, then priced by user thereafter, which is quite reasonable compared with other XML tools. Overall, I highly recommend this product. It may not solve all of your XML and Schema challenges, but it will help.

J.B.
Export DTDs every which way. XML Authority v1.1 lets you export and convert your DTDs to an impressive number of formats, including XML Schema and XML Data.