Web Services Tutorial: Understanding XML and XML Schema-Part 1
This Web Services article series covers all the important standards in the Web Services stack and ties them together with real-world examples. The first article in this series discusses XML (Extensible Markup Language). XML provides a significant advance in how data is described and exchanged by Web-based applications, using a simple, flexible, standards-based format. The article focuses on XML Schema, an important part of creating XML documents.
By now, you have probably heard about Web Services, a technology that could change the future of computing and e-commerce. Web Services is a distributed computing technology that enables interaction and collaboration among vendors and customers, with the vision of providing ubiquitous computing.
When you plug an appliance into an electrical socket, you don't worry about how the electricity is generated and distributed. All you care about is uninterrupted power and, of course, the utility bill that you get at the end of the month! Similarly, Web Services aim to make computing resources, both hardware and software, accessible to you through the Internet, just as electricity is made available to you. Web Services aim to do for computing what the Internet did for data. They encourage a pay-per-usage model and make dynamic collaborations possible. One of the key definitions of Web Services is: "Web Services are loosely coupled software components delivered over Internet-standard technologies."
Some of the early products in Web Services started appearing in 1997, when Sun announced its Jini platform and Hewlett-Packard its e-speak. After that, many big players such as IBM and Microsoft joined the race. The Web Services arena picked up steam once the big players were on board, and several small players also joined in on what was perceived as the next Internet wave. Several standards bodies and consortiums were formed, which developed numerous standards on different aspects of Web Services. Some of the key ones are W3C, OASIS, JCP, and OMG, along with several individual efforts by groups of companies.
Two of the key problems solved by Web Services over earlier distributed systems such as CORBA, DCOM, RPC, and so forth were:
- Interoperability: Earlier distributed systems suffered from interoperability issues because each vendor implemented its own on-wire format for distributed object messaging. With XML as the on-wire standard, the two camps of Java/J2EE and .NET/C# could now speak to each other.
- Firewall traversal: Collaboration across corporations was an issue because distributed systems such as CORBA and DCOM used non-standard ports. As a result, collaboration meant punching a hole in your firewall, which was often unacceptable to IT. This ruled out dynamic collaboration, because every new partner required a manual setup process. Web Services use HTTP as a transport protocol, and most firewalls allow access through port 80 (for HTTP), which makes collaboration easier and more dynamic. The dynamic nature of Web Services interaction opens up several exciting services for users.
What are the key technologies that made Web Services possible? Let us now examine the key interactions and the key standards involved in the Web Services stack.
Web Services Stack
To understand what technologies are required for Web Services, we need to understand a typical Web Service interaction.
The Web Services model follows the publish, find, and bind paradigm. In the first step, a service provider publishes a Web Service in a Web Service registry. Second, a client that is looking for a service to meet its requirements searches the registry. The search may return multiple matches, and the client chooses one of them based on its preferences. The client then downloads the service description and binds to it to invoke and use the service.
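The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the registry is an in-memory dictionary rather than a network service, and the names ServiceDescription, Registry, and bind_and_invoke are hypothetical, not part of any Web Services standard.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """Hypothetical stand-in for a published service description."""
    name: str
    category: str
    endpoint: Callable[[str], str]  # a local callable stands in for a remote endpoint


class Registry:
    """Toy in-memory registry; a real registry would be a network service."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        # Step 1: the provider publishes its service description.
        self._services[description.name] = description

    def find(self, category: str) -> List[ServiceDescription]:
        # Step 2: the client searches by some criterion (here, a category).
        return [s for s in self._services.values() if s.category == category]


def bind_and_invoke(description: ServiceDescription, request: str) -> str:
    # Step 3: the client binds to the chosen description and invokes the service.
    return description.endpoint(request)


registry = Registry()
registry.publish(ServiceDescription("QuoteA", "stock-quote", lambda s: f"A:{s}"))
registry.publish(ServiceDescription("QuoteB", "stock-quote", lambda s: f"B:{s}"))
registry.publish(ServiceDescription("Weather", "weather", lambda s: f"W:{s}"))

matches = registry.find("stock-quote")             # several services may match
chosen = sorted(matches, key=lambda s: s.name)[0]  # client preference: first by name
result = bind_and_invoke(chosen, "XYZ")
```

In a real deployment, each step would be carried out over the network with the standards described in the rest of this section.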
One of the primary concerns of Web-based programmers was how to transmit data in an interoperable manner. The XML standard, at the bottom-most layer of the stack, addresses this. SOAP (Simple Object Access Protocol) is an XML-based mechanism for messaging and RPC (Remote Procedure Call). It addresses the fundamental problem of firewall traversal in RPC systems by using HTTP as the transport. SOAP is the protocol used for invoking the service.
WSDL (Web Services Description Language) provides an XML-based way of describing a Web Service, giving the details needed to use it. WSDL is an XML equivalent of IDL (Interface Definition Language), used in the RPC days. UDDI (Universal Description, Discovery, and Integration) provides a "Yellow Pages" directory of Web Services, making it easier for clients to discover the services of their choice. The service provider publishes the service description (WSDL) and other searchable details in the UDDI registry. A client uses UDDI to find a service.
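To make the idea of an XML-based message concrete, here is a minimal sketch that builds and reads a SOAP 1.1 style envelope with Python's standard xml.etree.ElementTree module. The application namespace http://example.com/quotes and the GetQuote and symbol elements are made up for illustration. Sending the message over HTTP is left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/quotes"                   # hypothetical application namespace

# Ask the serializer to use the conventional "soap" prefix for the envelope namespace.
ET.register_namespace("soap", SOAP_NS)


def build_request(symbol: str) -> bytes:
    """Wrap a hypothetical GetQuote call in an Envelope/Body pair."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetQuote")
    ET.SubElement(call, f"{{{APP_NS}}}symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")


def read_symbol(message: bytes) -> str:
    """Extract the symbol argument from a message built by build_request."""
    root = ET.fromstring(message)
    node = root.find(f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetQuote/{{{APP_NS}}}symbol")
    if node is None or node.text is None:
        raise ValueError("message does not contain a GetQuote symbol")
    return node.text


message = build_request("XYZ")
symbol = read_symbol(message)
```

Because the message is plain XML text, any platform with an XML parser can read it, which is the interoperability point made earlier.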
In this tutorial series, we will cover each and every standard in the Web Service stack moving from the bottom up, beginning with XML.
Extensible Markup Language (XML) is an extensible, portable, and structured text format. XML is playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere. XML was derived from SGML, a complex language for defining other markup languages.
The XML initiative consists of a number of related standards. Apart from the core XML standard, it includes XSL (Extensible Stylesheet Language), which is used to transform XML data into a customizable presentation. XQuery provides flexible query facilities to extract data from real and virtual XML documents on the Web. XLink describes links between resources, and XPath and XPointer are languages for addressing parts of an XML document.
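As a small illustration of addressing parts of a document, the sketch below uses the limited XPath subset supported by Python's standard xml.etree.ElementTree module. The catalog document and its element names are invented for this example, and a full XPath engine supports far more than the expressions shown.

```python
import xml.etree.ElementTree as ET

# Invented sample document.
CATALOG = """<catalog>
  <book id="b1" lang="en"><title>Learning XML</title><price>30</price></book>
  <book id="b2" lang="de"><title>XML Schema</title><price>45</price></book>
  <book id="b3" lang="en"><title>Web Services</title><price>50</price></book>
</catalog>"""

root = ET.fromstring(CATALOG)

# Child steps: every title under every book.
all_titles = [t.text for t in root.findall("./book/title")]

# Attribute predicate: only books whose lang attribute is "en".
english_titles = [t.text for t in root.findall("./book[@lang='en']/title")]

# Position predicate: the second book element (positions start at 1).
second_book_id = root.find("./book[2]").get("id")
```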
A previous article, "Understanding XML," introduced you to the fundamentals of XML. XML Schema is one of the key components of XML. Therefore, in this article we will closely look at working with XML Schema.
Working with XML
When working with XML, we think of creating XML documents and consuming XML documents. The creation process involves using editors and tools to create XML documents. On the other hand, consuming XML documents involves parsing the XML documents and extracting the useful data.
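Both halves can be sketched with Python's standard xml.etree.ElementTree module. This is a minimal example; the order and item element names and the two helper functions are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def create_order(order_id: str, items: Dict[str, int]) -> str:
    """Creation side: build an element tree and serialize it to XML text."""
    order = ET.Element("order", attrib={"id": order_id})
    for name, quantity in items.items():
        item = ET.SubElement(order, "item", attrib={"name": name})
        item.text = str(quantity)
    return ET.tostring(order, encoding="unicode")


def consume_order(document: str) -> Dict[str, int]:
    """Consumption side: parse the XML text and extract the useful data."""
    order = ET.fromstring(document)
    return {item.get("name"): int(item.text) for item in order.iter("item")}


xml_text = create_order("42", {"pen": 3, "notebook": 1})
quantities = consume_order(xml_text)
```

Note that consume_order has to trust that every item contains a number. Describing and enforcing that kind of rule is the job of the grammar discussed next.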
Creating XML documents
Creating XML documents is a two-step process that involves:
- Defining the grammar and restrictions over data for the XML document.
- Creating the XML document itself. This document can be validated against the grammar.
A DTD or a Schema is used to describe the grammar and the restrictions on data in the XML document.
DTD and Schema
DTDs and Schemas are used to specify the structure of instance documents and the datatype of each element and attribute. The DTDs used in XML today originated from the parent SGML specification. Because SGML was designed for a more document-centric model, it did not require complex datatype definitions. The XML Schema specification improves greatly upon the DTD content model by providing rich datatyping capabilities for elements and attributes, as well as support for object-oriented (OO) design principles.
XML Schema was approved as a W3C Recommendation in May 2001 and is now widely used for structuring XML documents in e-commerce and Web Services applications.
The two major goals that the W3C XML Schema working group focused on during the design of the XML Schema standard were:
- Incorporating the object-oriented design principles found in common OO programming languages into the specification.
- Providing rich datatyping support similar to the datatyping functionality available in most relational database systems.
XML Schema provides a means of creating a set of rules that govern the validity of the XML documents that you create. Schemas provide a means of defining the structure, content, and semantics of XML documents so that they can be shared between different types of computers and applications.
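The sketch below gives a rough feel for what datatype rules add. It parses a small schema (itself an XML document) with Python's standard library and checks the text of an instance document against three built-in types. This is a toy under loud assumptions, not a conforming validator: it assumes the xs prefix, ignores element order and nesting, and uses Python's own parsing rules, which only approximate the lexical rules of xs:integer and xs:date. The order schema and the helper names are invented for illustration. The standard library does not include an XML Schema validator.

```python
import xml.etree.ElementTree as ET
from datetime import date

XS = "http://www.w3.org/2001/XMLSchema"  # XML Schema namespace

# Invented schema: an order with a string, an integer, and a date.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="customer" type="xs:string"/>
        <xs:element name="quantity" type="xs:integer"/>
        <xs:element name="shipDate" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Approximate checkers: each raises ValueError when the text does not fit.
CHECKERS = {"xs:string": str, "xs:integer": int, "xs:date": date.fromisoformat}


def declared_types(schema_text: str) -> dict:
    """Map each element name that carries a type attribute to that type."""
    schema = ET.fromstring(schema_text)
    return {
        el.get("name"): el.get("type")
        for el in schema.iter(f"{{{XS}}}element")
        if el.get("type") is not None
    }


def type_errors(instance_text: str, types: dict) -> list:
    """Return names of child elements that are missing or badly typed."""
    root = ET.fromstring(instance_text)
    errors = []
    for name, type_name in types.items():
        child = root.find(name)
        if child is None:
            errors.append(name)
            continue
        try:
            CHECKERS[type_name](child.text or "")
        except ValueError:
            errors.append(name)
    return errors


GOOD = ("<order><customer>Ann</customer><quantity>3</quantity>"
        "<shipDate>2001-05-02</shipDate></order>")
BAD = ("<order><customer>Ann</customer><quantity>three</quantity>"
       "<shipDate>May 2</shipDate></order>")

types = declared_types(SCHEMA)
good_errors = type_errors(GOOD, types)
bad_errors = type_errors(BAD, types)
```

A DTD could state that order contains customer, quantity, and shipDate, but it could not state that quantity must be an integer or that shipDate must be a date.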