One of SOA’s selling points was the concept of reuse; however, most enterprises that have tried SOA have failed to find a lot of reuse. Often, SOA technology itself is blamed for the low reuse problem, but further analysis shows that lack of adherence to standards and best design practices may be to blame. Best practices such as using enterprise-worthy canonical models, enterprise-level metadata, creation of utility services, and adoption of rules-processing engines enable an enterprise to reuse previously deployed service assets and business components when building new business solutions. The hidden benefit is the standardization of information flow—which in turn leads to predictability of information usage and a reduction in the overall operations and maintenance cost.
Business Scenario or Use Case
The example chosen to demonstrate some of the key principles is a rich content collaboration service that supports the business needs of an extended enterprise. In this scenario, the service meets the business need of sharing rich content across multiple lines of business. The needs of diverse user communities, including content requests such as searches and retrievals as well as specialized user community workflows, can all be met, provided the right canonical models, design patterns, metadata, and reusable utility services are applied with discipline.
The whitepaper walks the reader through the runtime interactions between the various components that allow the content management service to be provisioned for the needs of any end-user community. A key assumption here is that the runtime behavior depends on the use of enterprise canonical models, the availability of the right metadata taxonomy on the content, and the availability of appropriate generic exception-handling workflows, all of which are defined at design time. The mapping of information that appears on the canonical models occurs at runtime, provided the execution flows have access to the metadata mapping elements defined at design time.
At a high level, the interactions described here depend on keyword/metadata associations made at design time that can be used to dynamically wire the execution path at runtime. In other words, metadata that appears on the canonical request model (which defines the user intent) can be associated with metadata tags that are linked with the content at design time (i.e., at the time the content is persisted to the content repository). In addition, the metadata that appears on the canonical response is associated with the workflow service metadata so that the appropriate workflow rules and user navigation flows are executed to handle error or exception scenarios. The same design-time meta-tag association with other implementation constructs, such as XPath/XQuery expressions, SQL WHERE clauses, or data access APIs, makes the runtime wiring of new content searches dynamic without sacrificing execution efficiency. The key is that as long as the canonical model keywords can be associated with the implementation construct metadata, all components of the Content Collaboration service can satisfy a consumer request.
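The design-time association described above can be illustrated with a minimal Python sketch. It assumes a simple in-memory registry that maps canonical keywords to implementation constructs; the names `MetadataRegistry`, `contract_filter`, `summary_filter`, and `wire`, as well as the column names in the filters, are hypothetical and do not come from any particular product.

```python
from typing import Callable, Dict, Tuple

# A construct turns request parameters into a parameterized SQL filter.
Construct = Callable[[dict], Tuple[str, tuple]]


class MetadataRegistry:
    """Holds design-time associations between keywords and constructs."""

    def __init__(self) -> None:
        self._constructs: Dict[str, Construct] = {}

    def register(self, keyword: str, construct: Construct) -> None:
        # Keywords are matched case-insensitively.
        self._constructs[keyword.lower()] = construct

    def resolve(self, keyword: str) -> Construct:
        try:
            return self._constructs[keyword.lower()]
        except KeyError:
            raise LookupError(f"no construct mapped to keyword {keyword!r}") from None


def contract_filter(params: dict) -> Tuple[str, tuple]:
    # Hypothetical filter for contract documents in one region.
    return "doc_type = ? AND region = ?", ("contract", params["region"])


def summary_filter(params: dict) -> Tuple[str, tuple]:
    # Hypothetical filter for summarized (not transactional) data.
    return "report_level = ? AND fiscal_year = ?", ("summary", params["year"])


# Design time: publish the keyword-to-construct associations.
registry = MetadataRegistry()
registry.register("contract", contract_filter)
registry.register("sales-summary", summary_filter)


def wire(keyword: str, params: dict) -> Tuple[str, tuple]:
    """Runtime: pick the construct by keyword, then build the filter."""
    return registry.resolve(keyword)(params)
```

Under these assumptions, supporting a new kind of search means registering one more construct at design time; the runtime path in `wire` does not change.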
This whitepaper discusses the following high-level constructs:
- Content-tagging keywords
- Canonical Request/Response models (Standardized XML backed by XML Schemas)
- Content Management Repository
- Metadata Repository
- Rules engine
- Workflow engine
- Generic Data Access Objects
- Authentication and Authorization Service
- XML filtering functions (such as XPath/XQuery or XSL functions)
- XML and XPath/XQuery Processing appliance or engine
It must be noted that web services technologies such as Service Component Architecture (SCA), Web Services Composite Application Framework (WS-CAF), WS-Notification, and WS-Addressing can be leveraged to enable an enterprise to perform service composition and orchestration. In addition, the interactions between the service consumer and the Content Collaboration service can be mediated via the Enterprise Service Bus (ESB). However, these cannot replace the need for an enterprise to invest in the above-mentioned architecture tenets: these SOA technologies are only enablers of good architectural principles and do not by themselves enforce architectural best practices.
Overview of Architectural Components
A full-scale set of architectural service components includes the following:
- Enterprise Content Collaboration Service: This is a coarse-grained Service Façade responsible for orchestrating multiple services. The Service Façade hides the details of internal service orchestration and the inner workings of the request/response flows from the consumers (whether they are visual consumers or true system consumers). This layer of indirection also enables the Content Collaboration Service provider to insulate its consumers from any negative impacts when switching workflow engines, content management repositories, or XML-processing engines. In addition, this layer insulates the provider by enabling it to make product decisions based on each product's ability to meet QoS needs and business SLA needs without getting tied down in consumer interactions with the low-level components. Having the coarse-grained service interface also allows the service provider to make better capacity-planning decisions about which candidate services and/or business components it can scale out to remove bottlenecks without causing disruptions to the consumers. Without this layer of indirection, consumers may become hard-wired to legacy components, low-level APIs, or even specific service URLs, which makes both the provider and the consumers more fragile and vulnerable to change.
- Enterprise Security and Workflow Service: This is a service component that authenticates Content Collaboration service users and provides an interface to edit the authorization rules that control how malformed user request exceptions are handled. Exception flows could include scenarios where the user request cannot be processed due to “data not found,” “invalid request parameters,” or “incompatible grain-of-information issues.”
- Enterprise Information Optimization Service: This Utility Service component applies XPath/XSL and CSS-type XML payload and/or envelope transformation rules to “canonicalize” a request or response. It also internally deals with the complexity of processing the XML, dealing with XML fragments resulting from XPath/XQuery operations, and allows the request to be generated to optimize the interaction with the Information Integration service. Other benefits of the utility service include consumer ease of use, because it can formulate a request with correct metadata without forcing the consumer to create the request in a specific format. This facility enables the provider to offer a friendlier service interface while ensuring that the rest of the service components do not have to invest in exception-handling logic—which in turn makes the subsequent provider service component interactions more deterministic. The response is also canonicalized in one and only one utility to ensure that none of the other service components have to worry about changing their response parameters, all while still enabling consumers to get a well-formed and standard response back from the Content Collaboration service provider.
- Enterprise Information Integration Service: Another utility service that coordinates between structured content and unstructured content repositories to process and aggregate user requests. The efficiency of this service is proportional to the mapping metadata available on the content, the inclusion of the right request metadata, and the canonical request formation by the Information Optimization service. This service takes a canonical XML request and turns it into SQL to access a content repository and/or relational data structures to get the information to create a response.
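As a rough illustration of the last component, the sketch below turns a canonical XML request into a parameterized SQL WHERE clause and runs it. It is only a sketch under stated assumptions: the request layout (`<request>`, `<criteria>`, `<docType>`, `<region>`), the `content` table, its columns, and the `COLUMN_MAP` mapping are all hypothetical, and SQLite stands in for whatever content repository or relational store an enterprise actually uses.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical design-time mapping: canonical element name -> repository column.
COLUMN_MAP = {"docType": "doc_type", "region": "region"}


def to_where_clause(canonical_xml: str) -> Tuple[str, tuple]:
    """Translate the <criteria> children of a canonical request into a filter."""
    root = ET.fromstring(canonical_xml)
    clauses, params = [], []
    for criterion in root.findall("./criteria/*"):
        column = COLUMN_MAP.get(criterion.tag)
        if column is None:
            # Corresponds to the "mapping metadata not found" exception case.
            raise LookupError(f"no mapping metadata for {criterion.tag!r}")
        clauses.append(f"{column} = ?")
        params.append((criterion.text or "").strip())
    if not clauses:
        raise ValueError("canonical request contains no search criteria")
    return " AND ".join(clauses), tuple(params)


def search(conn: sqlite3.Connection, canonical_xml: str) -> List[str]:
    """Run the generated filter and return matching titles."""
    where, params = to_where_clause(canonical_xml)
    # Column names come only from COLUMN_MAP; values are bound as parameters.
    rows = conn.execute(
        f"SELECT title FROM content WHERE {where} ORDER BY title", params
    )
    return [row[0] for row in rows]
```

Because only mapped column names ever reach the SQL text and request values are always bound as parameters, the request content cannot alter the shape of the query.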
Execution Path of the Architectural Components
All these architectural components must be coordinated to execute a call. As discussed earlier, user request metadata, content metadata taxonomy, XPath and XQuery expressions, SQL filter statements, or DAO API mappings must all be determined at design time, leaving associations of keywords and metadata to occur efficiently at runtime. Here’s an explanation of the execution path:
- The end-user service or customer logs in to the enterprise portal.
- Request parameters and service authentication credentials are passed to the Enterprise Content Collaboration Service.
- The Enterprise Content Collaboration Service calls the Security Service to authenticate the user and forwards the call to the Enterprise Information Optimization service.
- The Enterprise Information Optimization service validates the request parameters and identifies the right pre-configured XPath/XQuery function to perform the XML transformation, making the request a “canonical request” (one that includes the correct request parameters and mapping metadata).
- The canonical request is now forwarded to the Enterprise Information Integration Service, which associates the request keywords with content-tagging information in the content repository or with the structured relational constructs. The XML request is turned into SQL WHERE filter statements by a Data Access Object (i.e., a data access API) to call the content repository and/or the relational structures.
- The SQL query gets executed and retrieves the binary content and/or relational data based on the user request parameters. If an error occurs, an exception or error message gets sent back to the Enterprise Information Integration service.
- The Enterprise Information Optimization service processes the response to turn it into a canonical response or to report any processing errors that occurred during the request. Exceptions could occur because mapping metadata wasn’t found or because user request parameters were at a different grain of information than the data granularity in the content repository, causing the request to be incompatible with the data. For example, a user request for low-level transactional information is at a different grain or level than a request for summarized information in the structured data repository. Similarly, a user might request legal pre-trial preparatory documents, but the content repository may contain only trial proceeding documents.
- The Enterprise Content Collaboration service returns the canonical response or error notification to the user. The Content Collaboration service examines the response status to determine the next steps.
- If the response caused an error notification, that error notification is forwarded to the Security and Workflow service to identify the “appropriate manual workflow process” to be executed, determined by mapping the error code in the response to the error-handling flow in the workflow engine.
- As part of the exception workflow, the response error also allows the appropriate call center or help desk service associate to help resolve the issue. In addition, the request/response pair may be logged, to track future enhancements to the Content Collaboration service that will help satisfy the service consumer.
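The steps above can be sketched in Python as a façade that authenticates, canonicalizes, integrates, and then maps any error code to an exception workflow. This is a simplified illustration, not a reference implementation: the class and function names, the error codes, and the workflow names are hypothetical, the collaborating services are injected as plain callables, and requests are shown as dictionaries rather than XML to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


class ServiceError(Exception):
    """Raised by a downstream service; carries a canonical error code."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


@dataclass
class CanonicalResponse:
    status: str                       # "OK" or "ERROR"
    payload: Optional[list] = None
    error_code: Optional[str] = None


# Hypothetical design-time mapping from error codes to exception workflows.
WORKFLOWS = {
    "DATA_NOT_FOUND": "suggest-alternative-keywords",
    "INVALID_PARAMETERS": "help-desk-review",
    "INCOMPATIBLE_GRAIN": "help-desk-review",
}
DEFAULT_WORKFLOW = "help-desk-review"


class ContentCollaborationFacade:
    """Coarse-grained entry point that hides the orchestration details."""

    def __init__(
        self,
        authenticate: Callable[[dict], bool],
        canonicalize: Callable[[dict], dict],
        integrate: Callable[[dict], list],
    ) -> None:
        self._authenticate = authenticate
        self._canonicalize = canonicalize
        self._integrate = integrate

    def handle(
        self, credentials: dict, raw_request: dict
    ) -> Tuple[CanonicalResponse, Optional[str]]:
        """Return the canonical response and, on error, the workflow to run."""
        if not self._authenticate(credentials):
            response = CanonicalResponse("ERROR", error_code="NOT_AUTHENTICATED")
        else:
            try:
                canonical = self._canonicalize(raw_request)
                response = CanonicalResponse("OK", payload=self._integrate(canonical))
            except ServiceError as exc:
                response = CanonicalResponse("ERROR", error_code=exc.code)
        workflow = None
        if response.status == "ERROR":
            # Unmapped error codes fall back to the default workflow.
            workflow = WORKFLOWS.get(response.error_code, DEFAULT_WORKFLOW)
        return response, workflow
```

Because the collaborators are injected, a workflow engine or repository could be swapped without changing what consumers see, which is the insulation the façade is meant to provide.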
All the steps above demonstrate the importance of including the right keywords in the canonical request/response models to enable identification and execution of the right design constructs. This underscores the importance of having XML Schema-based validation for the canonical request/response models. Additionally, library selection is based on the association of the “appropriate” metadata (such as content meta-tagging, mapping to structured data filter statements, mapping to a DAO to execute the appropriate SQL, association to XPath/XQuery functions to extract the right keywords from the canonical models, error notification tagging of the exception-handling flows etc.).
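To make the validation point concrete, here is a lightweight structural check written with the Python standard library. It is a stand-in for, not an equivalent of, XML Schema validation, which would normally be done by a schema-aware processor. The required element names and the set of allowed actions are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical rules that a real XML Schema would express declaratively.
REQUIRED_ELEMENTS = ("action", "criteria")
ALLOWED_ACTIONS = {"search", "retrieve"}


def validate_canonical_request(xml_text: str) -> List[str]:
    """Return a list of problems; an empty list means the request passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    for name in REQUIRED_ELEMENTS:
        if root.find(name) is None:
            problems.append(f"missing element: {name}")

    action = (root.findtext("action") or "").strip()
    if action and action not in ALLOWED_ACTIONS:
        problems.append(f"unknown action: {action}")
    return problems
```

Rejecting a malformed request at this point keeps the downstream components free of defensive checks, at the cost of maintaining the rules in one place.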
Key Service Architectural Tenets
Figure 1. Interaction Sequence: Here’s the Service Consumer/Enterprise Content Collaboration Service interaction sequence.
- Successful service assembly requires that service interactions (see Figure 1) be based on predictable service interface contracts, the availability of DAO or data access APIs, the existence of generic exception handling workflow APIs, and QoS management and monitoring tools.
- Use standardized payload formats to package the request and response parameters provided on the envelope. These should include clear action verbs and other enterprise-standard instructions to facilitate mapping the request-based metadata to the processing instructions in the enterprise rules repository.
- Make sure an enterprise-level governance process is in place to rationalize the metadata and the rules published to these centralized repositories to ensure that mapping and processing instructions in the canonical request are predictable and efficient. The governance process should also ascertain the quality of enterprise-level taxonomy and relationship-mapping rules to enable the Enterprise Information Integration Service to navigate efficiently between the structured content and unstructured content repositories.
- Ensure that standardized metadata keywords and an efficient metadata repository are available to the runtime environment.
- Adopt standards such as XML, XML Schemas, XSLT, XPath and XQuery functions to make information processing efficient.
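As a small example of the last tenet, the following Python sketch pulls keywords out of a canonical request with an XPath expression. It uses the limited XPath subset supported by the standard library's `xml.etree.ElementTree`; a full XPath or XQuery engine would allow richer expressions. The request layout and the `type` attribute are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical canonical request; element and attribute names are illustrative.
REQUEST = """<request>
  <action>search</action>
  <keywords>
    <keyword type="topic">pre-trial</keyword>
    <keyword type="topic">deposition</keyword>
    <keyword type="format">pdf</keyword>
  </keywords>
</request>"""


def extract_keywords(xml_text: str, keyword_type: str) -> List[str]:
    """Return the text of every <keyword> whose type attribute matches."""
    root = ET.fromstring(xml_text)
    # keyword_type is assumed to be a trusted design-time value, not user input.
    path = f".//keyword[@type='{keyword_type}']"
    return [el.text.strip() for el in root.findall(path) if el.text]


def extract_action(xml_text: str) -> str:
    """Return the action verb carried by the request, or an empty string."""
    return (ET.fromstring(xml_text).findtext("action") or "").strip()
```

The same expressions could be stored in the metadata repository at design time and looked up by keyword at runtime, so that new searches need no code change.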
In conclusion, using the best practices and architecture tenets mentioned in this article allows an enterprise to reuse service-based investments to gain time-to-market benefits while promoting information flow standardization and improving the quality of business service management. In addition, using utility service components that “canonicalize” requests and responses into XML allows downstream interactions to be more predictable, while still providing ease of use and flexibility to consumers, making their service interactions simple and easy to code.
The most important byproduct from adopting these best practices and architectural principles is that they make it easy for enterprises to dynamically wire additional exception-handling flows, additional repository searches and/or to offer alternative keyword suggestions to end users or service consumers—provided the design-time constructs are all in place to facilitate the runtime interactions. In addition, given the separation of concerns among the utility service components, any new feature implementation can be concurrently designed, tested and deployed.
One might also be able to offer this type of content collaboration service in the context of cloud computing, where the content collaboration service provider has a mechanism to deal with multi-tenancy in both the content and metadata repositories. The challenge in cloud computing lies more in the realm of transporting large binary content across the wire and in the scalability of the content collaboration service across firewalls rather than in the architectural construction itself. But that’s a discussion for another article!
About the Author
Surekha Durvasula is an enterprise architect with more than 11 years of experience designing and architecting a variety of enterprise applications in the financial services sector and the retail industry vertical. Most of her work has been in the area of distributed N-tiered architecture, including J2EE component architecture. Her more recent focus has been on Service-Oriented Architecture and Business Process Management. Her efforts as an Enterprise Architect involve not only architecting new applications and business services, but also leveraging the principles of SOA to extend the life of existing Enterprise Information Systems.