If you’re a developer and haven’t tripped and fallen face-first into XML yet, it’s only a matter of time. The rapid expansion of Web business, the need for accelerated Web development and flexible data transport, and the diminishing returns of HTML have entrenched XML in IT development, and in Web services development in particular. Although it’s possible to build and implement Web services that don’t use XML as the default I/O data structure, it’s rare. If you’re headed into Web services, you’re headed into XML.
A common misperception held by those who haven’t yet plunged into XML is that it is merely an extension of HTML, and is just about markup. Let those ideas go; XML is a group of technologies, of which the markup language is only one. XML is about far more than data formatting; in the world of Web services, XML doesn’t just define the messages, it can define the services themselves.
We have XML because HTML, for all its utility and ease of use, is static, and the demands of Web applications have, in recent years, required increasingly dynamic functionality. XML meets that demand, and Web services have become a significant high-level expression of it. At this point in the evolution of Web apps, XML is an ideal building block.
And at the most abstract level, the argument for XML as that ideal building block is protocol independence. The Web is, increasingly, the circulatory system of global business, and business processes are evolving faster than any one protocol or platform. Web services, to keep pace and accommodate this evolution, should rightly be as protocol-neutral as possible. XML enables this independence while enhancing dynamism.
Many Technologies, Not One
In addition to marking up actual data, XML is used for modeling content (data typing and formatting), for linking, for managing document element names, and for other essential jobs that are part of defining and implementing a Web service:
XML as a markup language—A model for data that defines data elements and attributes, establishing tags for those elements;
Figure 1: An XML-populated customer record.
XML as schema—An instruction book constraining a specific XML document, defining its data types and structures, content, and rules;
Figure 2: A schema for the XML customer record in Figure 1.
XML as transform: XSLT (Extensible Stylesheet Language Transformations), for remapping the contents of one XML document to a different XML document (or a non-XML format).
Figure 3: An XSLT transformation.
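To make the first item in the list above concrete, here is a minimal sketch in Python, using only the standard library's xml.etree.ElementTree module. The customer record is hypothetical: the element names, the id attribute, and the values are assumptions made for illustration, not the contents of Figure 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical customer record; element and attribute names are illustrative only.
CUSTOMER_XML = """\
<customer id="C-1001">
  <name>Jane Doe</name>
  <email>jane.doe@example.com</email>
  <balance>125.50</balance>
</customer>
"""


def read_customer(xml_text):
    """Parse the record and return its id attribute and child elements as a dict."""
    root = ET.fromstring(xml_text)
    record = {"id": root.get("id")}
    for child in root:
        # Without a schema, every element value is plain text.
        record[child.tag] = child.text
    return record


customer = read_customer(CUSTOMER_XML)
print(customer)
```

Note that every value comes back as a string. The markup names the data, but it is the schema, not the markup, that says what type each element holds.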
XML Liberates Data
XML constrains a Web service in the same ways a development platform or programming language constrains an application. For instance, if I develop an application in C++, the coding of the program is where I manage data within the application with respect to data types and structures. In the world of Web services, you want (and need) to decouple data type and structure from the traditional type-reference-in-memory paradigm that for years has stood in the way of common data, shareable across multiple apps. XML stores type information independently, moving it beyond the applications that want to use it.
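One way to picture type information living outside the application is the rough Python sketch below. It reads element types from a small XML Schema fragment and only then converts the values in an instance document. The schema, the instance, and the CONVERTERS table are all assumptions made for illustration. The sketch is not a schema validator: it handles three built-in types and matches the literal xs: prefix rather than resolving it properly.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

XS = "http://www.w3.org/2001/XMLSchema"  # XML Schema namespace

# Hypothetical schema fragment; it lives outside the application code.
SCHEMA_XML = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="orders" type="xs:integer"/>
        <xs:element name="balance" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Hypothetical instance document described by the schema above.
INSTANCE_XML = """\
<customer>
  <name>Jane Doe</name>
  <orders>3</orders>
  <balance>125.50</balance>
</customer>
"""

# Assumption: only these three built-in types are handled in this sketch.
CONVERTERS = {"xs:string": str, "xs:integer": int, "xs:decimal": Decimal}


def declared_types(schema_text):
    """Map element name -> declared type for elements that carry a type attribute."""
    schema = ET.fromstring(schema_text)
    return {
        el.get("name"): el.get("type")
        for el in schema.iter(f"{{{XS}}}element")
        if el.get("type") is not None
    }


def typed_record(instance_text, types):
    """Convert each child element's text using the type the schema declares."""
    root = ET.fromstring(instance_text)
    return {child.tag: CONVERTERS[types[child.tag]](child.text) for child in root}


types = declared_types(SCHEMA_XML)
record = typed_record(INSTANCE_XML, types)
print(record)
```

The application code never hard-codes that orders is an integer; it learns that from the schema at run time, and any other application could read the same schema and reach the same conclusion.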
There’s a minor inefficiency associated with this. Consider that the type information needed to handle XML-populated data elements isn’t part of the application that’s handling the data, so it isn’t initially in memory; it must be accessed separately. This represents some overhead, and in the case of multiple clients accessing the Web server hosting the app repeatedly, it’s the same overhead invoked over and over.
On the other hand, apps processing inbound XML objects tend to be very focused, and the actual amount of data tends to be small: both the app and the data tend to be low-volume. In other words, this paradigm assumes a fairly high degree of re-instancing. The payoff is that, in exchange for this overhead, the paradigm accommodates very rapid development and dynamic reconfiguration, and that exchange is well worth it.
Defining the Web Service Itself
A typical Web service is, for all practical purposes, built out of XML components.
The fundamental component in the XML Web service paradigm is the data-bearing instance of the XML document that constitutes a message to, or from, the Web service; it’s the reason the Web service exists in the first place. Then there’s the schema that defines (and validates) that message’s format: the SOAP envelope schema.
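As a rough illustration of such a message, the Python sketch below wraps a request in a SOAP 1.1 envelope and then reads it back, as a receiving service might. The envelope namespace is the standard SOAP 1.1 one; the application namespace, the GetCustomer element, and the customerId element are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/customers"  # hypothetical application namespace


def build_request(customer_id):
    """Wrap a hypothetical GetCustomer request in a SOAP Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetCustomer")
    ET.SubElement(request, f"{{{APP_NS}}}customerId").text = customer_id
    return ET.tostring(envelope, encoding="unicode")


def read_customer_id(message):
    """Pull the customer id back out of the envelope."""
    envelope = ET.fromstring(message)
    ns = {"soap": SOAP_ENV, "app": APP_NS}
    return envelope.findtext("soap:Body/app:GetCustomer/app:customerId", namespaces=ns)


message = build_request("C-1001")
customer_id = read_customer_id(message)
print(message)
print(customer_id)
```

The envelope and body come from the SOAP specification, while everything inside the body belongs to the application. That split is what lets generic tooling route and validate messages it knows nothing else about.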
But that’s only the beginning. The Web service itself constitutes an interface, and all interfaces must be unambiguously defined for all parties using them to be effective. In the world of Web services, this is accomplished with WSDL (Web Services Description Language).
WSDL effectively standardizes how Web service interfaces can be described, in terms of formats and protocols used. Because WSDL exists, it is not necessary to create any special components to talk to any of the endless servers that are out there on the Web, hosting Web services. As with the schemas defining XML documents, WSDL enables any two parties using it to share and implement an interface definition, enabling conversation between them.
And this in turn calls for two more XML components in a Web service: an XML schema for WSDL, which validates a Web service’s interface definition, and the XML instance document that actually establishes the interface.
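Because the interface definition is itself an XML document, a client can read it with ordinary XML tooling. The Python sketch below parses a trimmed, hypothetical WSDL 1.1 document and lists the operations it declares. The service name, the port type, and the message names are assumptions for illustration, and the types, binding, and service sections a real WSDL file carries are omitted.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Trimmed, hypothetical interface definition (no types, binding, or service parts).
WSDL_XML = """\
<definitions name="CustomerService"
             targetNamespace="http://example.com/customers"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://example.com/customers">
  <message name="GetCustomerRequest"/>
  <message name="GetCustomerResponse"/>
  <portType name="CustomerPort">
    <operation name="GetCustomer">
      <input message="tns:GetCustomerRequest"/>
      <output message="tns:GetCustomerResponse"/>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return (portType, operation, input message, output message) tuples."""
    ns = {"wsdl": WSDL_NS}
    root = ET.fromstring(wsdl_text)
    operations = []
    for port_type in root.findall("wsdl:portType", ns):
        for op in port_type.findall("wsdl:operation", ns):
            operations.append((
                port_type.get("name"),
                op.get("name"),
                op.find("wsdl:input", ns).get("message"),
                op.find("wsdl:output", ns).get("message"),
            ))
    return operations


operations = list_operations(WSDL_XML)
print(operations)
```

In practice a SOAP toolkit does this reading for you and generates client stubs from the result; the sketch only shows that nothing service-specific is needed to discover what a service offers.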
The Role of the Namespace
Once you’ve jumped into the world of XML Web services, you’ll rapidly accumulate XML documents of various types, and it will become obvious that their organization is a key to the success of both XML as a medium for data storage and definition, and Web services as a medium for processing and transport.
This organization is achieved through the use of the namespace. A namespace qualifies the element names within an XML document. As you might guess, a Web service often handles multiple related documents at the same time (for example, a SOAP envelope schema, a WSDL schema, and the data-bearing instances of both). It is not only possible for such documents to carry elements with the same names, it’s inevitable. Namespaces differentiate those data element names by specific document.
Namespaces have other uses. They can serve as keywords to specify semantic processing; they can also be used as application-specific prefixes in Web services with many application components.
Typically, a namespace is a URI (see Figure 4), although it doesn’t have to be. Part of the rationale for using URIs as namespaces is that they are likely to be unique (note that XML parsers generally will not validate this uniqueness; it is assumed).
Finally, namespaces facilitate the enforcement of desired XML processor functions. If you put the SOAP encoding namespace in a SOAP message, the processor must serialize the message according to the encoding mechanism described in the specification.
Figure 4: Typical XML Namespace attribute.
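The name collision described in this section can be sketched in a few lines of Python. The document below is hypothetical: it carries two elements that are both called name, one from a customer vocabulary and one from a product vocabulary, and the two example.com URIs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies that both define <name>.
DOC = """\
<order xmlns:cust="http://example.com/customer"
       xmlns:prod="http://example.com/product">
  <cust:name>Jane Doe</cust:name>
  <prod:name>Widget</prod:name>
</order>
"""

root = ET.fromstring(DOC)

# ElementTree expands each prefix to its URI, so the two names stay distinct.
tags = [child.tag for child in root]

# The prefixes used for lookup are local to this code; only the URIs must match.
ns = {"c": "http://example.com/customer", "p": "http://example.com/product"}
customer_name = root.findtext("c:name", namespaces=ns)
product_name = root.findtext("p:name", namespaces=ns)

print(tags)
print(customer_name, product_name)
```

The lookup uses the prefixes c and p where the document uses cust and prod, and it still works. The prefix is only shorthand; the URI is the actual identity of the namespace.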
XML as Schema (the *.xsd File)
The XML schema is basically a contract, agreed upon by a Web service and the clients that interact with it, as to what any XML instance submitted to the site (or emanating from it) must look like. Its power is that it is external to the Web service as such.
Why is this so? Consider that a schema is generally an in-house description of one company’s data storage. A database schema, for instance, describes tables used in your particular company, and is internal to your systems. An outside entity will generally not have use of your local schemas, but will have its own; we tend not to share them. Remapping, then, is generally both essential and cumbersome, and specific to a relationship between two companies.
But when an XML schema is external to a Web service, it is available to any and all partners that might use the Web service; remapping is no longer a one-to-one proposition between two parties, but a more economical mapping to a shared central point. It’s far better for your company, implementing the Web service, because you have only to produce the single schema for all to use, rather than a unique remapping for each partnership.
In addition, an XML schema is (as opposed to its database-specific cousin) rapidly configured and even more rapidly modified. These modifications impose little upgrade burden; companion technologies (see XSLT above) enable rapid transition from an old schema to a new one.
Roll Up Your Sleeves
If you’re moving into the world of Web services, you’ll want some hands-on, get-acquainted time with XML. You can get additional information on this at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwebsrv/html/webservbasics.asp.
About the Author
Scott Robinson is an IT consultant to the U.S. manufacturing, brokerage, healthcare and biotech industries. He has managed design teams sponsored by the Department of Defense and the Department of Energy, and has worked with academic research groups. He is vice president of development at Quantumetrics, Inc.