The Native Connector Architecture (NCA) is a model for automatically generating code bridges between differing languages and platforms. Given a description of the interfaces that are exposed and the interfaces that are desired, the code necessary to bridge the two bodies of code can be derived and generated. The architecture defines an extensible and highly configurable model, built on open standards for general use, for describing interfaces and for generating bridging code. As a result, bridges can easily be created between, for example, the C++ and Java™ languages, differing C++ Application Binary Interfaces (ABIs), Web service protocols (such as SOAP), and different platforms (for example, Microsoft COM/.NET to Solaris™ operating environment binaries).
At present, there are a number of models for bridging bodies of code or applications. Examples include:
Problems and Issues
The general problem of integration between languages and platforms is that for each such bridge between architectures, a new implementation must be created to provide the binding between environments. Worse, for each new application, even though the type of bridge may be identical (for example, between C++ and Java technology), specific interfaces used may be different, resulting in a brand new implementation of the bridge even though the general pattern remains the same.
In general, this manual implementation of bridging code can (and does) lead to the following issues and problems:
- Code being bridged must often be adapted to allow for the use of a different programming language or infrastructure, for example by coding to a specific binary object model (COM/CORBA), porting to a specific platform, or using an intermediate language such as C.
- The developer must have in-depth knowledge of the platforms and languages involved, and sometimes of an independent model for the bridge code as well.
- Such bridges are difficult to maintain for large systems. As the number of interfaces increases, the complexity also increases, which leads to errors in initial development as well as issues with tracking and reproducing changes across an entire system.
- Because of the differing environments on each side of a piece of bridge code, required features such as runtime memory management, object lifetime management, and exception propagation must be managed somewhere within the system as well as injected into each API or set of APIs in the bridge.
The general result is sub-optimal performance of a bridged system because tuning a large-scale, complex application across multiple languages or platforms is necessarily difficult and frequently not attempted until it becomes a problem on deployment.
JNI Integration Model
As an example of such an existing technology, the JNI (Java Native Interface) is a model for integration used by the Java platform. As with all existing bridging technologies or architectures, the general rule is to pick a common implementation language or technology (for instance, a subset of the languages to be bridged) and use that as the foundation for the bridging code. Each component being bridged must therefore implement any conversions and handle any implementation-specific features on its side of the bridge. For the Java language, the C language is a natural fit as a bridging language, for a number of reasons:
- Most popular programming languages in commercial use today (including Java) either belong to the C language family or have a common model for integrating with C.
- Most operating systems and environments provide support for the use of libraries which follow C guidelines for interface binding.
The general steps one must use in a JNI-based solution are shown in Figure 1:
Figure 1: The JNI Process
JNI requires special handling within the Java code to enable, load, and use native libraries within the Java environment, namely:
- The dynamic library must be locatable, either via an absolute path or through the use of environment settings, which the dynamic loader uses to find such libraries.
- The Java application must explicitly load the dynamic library with functions provided by the JDK™.
- The functions defined must be named appropriately so that the dynamic linker can bind calls within the Java code to the native methods.
- The implementation of the native methods must follow specific conventions and use types defined by the JNI specification.
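The requirements above can be illustrated with a minimal Java sketch. The class name HelloWorld, the method value, and the library name "hello" are assumptions chosen for illustration. The jniSymbol helper applies the name-escaping rules from the JNI specification for the simple, non-overloaded case; overloaded native methods additionally carry an encoded argument signature, which this sketch does not handle.

```java
// Hypothetical class; the native library name "hello" is an assumption.
class HelloWorld {
    // Declared in Java, implemented in a native library.
    native int value(int arg);

    // Escapes one name according to the JNI symbol naming rules.
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');            // package separator
            } else if (c == '_') {
                sb.append("_1");           // literal underscore
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else if (c < 128 && Character.isLetterOrDigit(c)) {
                sb.append(c);              // ASCII letters and digits pass through
            } else {
                sb.append(String.format("_0%04x", (int) c)); // other characters
            }
        }
        return sb.toString();
    }

    // Symbol the dynamic linker looks for (non-overloaded methods only).
    static String jniSymbol(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        System.out.println("expected symbol: " + jniSymbol("HelloWorld", "value"));
        try {
            // Resolved through the platform's library search path.
            System.loadLibrary("hello");
            System.out.println(new HelloWorld().value(42));
        } catch (UnsatisfiedLinkError e) {
            // Thrown when the library cannot be located or loaded.
            System.out.println("native library not available: " + e.getMessage());
        }
    }
}
```

Without a matching native library on the search path, the program reports the load failure instead of calling the native method, which is the typical first obstacle in a hand-written JNI bridge.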
Native Connector Architecture Concepts
NCA defines a model and a process designed to reduce or eliminate the risks and complexity of manual integration between languages or platforms, as represented by the Java to C++ language example. While the architecture is agnostic about which languages or platforms can be integrated, the following discussion typically uses the bridging of the Java and C++ languages to illustrate the solutions that the architecture provides.
The underlying principles behind NCA are:
- Existing APIs and the desired API are known and can be represented as metadata.
- Tools implementing NCA, such as the Sun ONE Studio™ Native Connector Tool (part of Sun ONE Studio’s enterprise class development tools), can understand the semantics of the languages or platforms being bridged.
- Integration between the interfaces, languages, and platforms can be represented by reproducible patterns.
- Applying these patterns against the metadata-based API descriptions will result in a data-driven automation process to generate code.
- A suitable runtime environment exists to manage a generated bridge.
The patterns defined by NCA are designed to be implemented against a collection of metadata that describes the interfaces of the existing source (client) and of the generated target API. Given this metadata and a specific pattern to apply, the intermediate code can be generated according to the rules and semantics defined for a particular bridge. In modern computing, there are many potential systems that can provide descriptive information about APIs, the most powerful today being XML. To that end, NCA defines PLanML (Programming Language Markup Language), which is designed to represent a large number of the common languages used today.
PLanML is defined by its own schema, which defines the principal types, methods, and enumerations required by NCA-compliant tools to generate code according to the principal patterns. A representative sample of the metadata generated for a simple C++ class may look like Example 1.
<typeDef aliasName="HelloWorld" byteSize="1" category="struct" name="HelloWorld" typeID="110">
  <struct aliasName="HelloWorld" type="class">
    <method access="public" methodID="144" name="value" scopeRef="110" typeRef="105">
      <parm typeRef="87" name="arg"/>
    </method>
  </struct>
</typeDef>
Example 1: PLanML for a simple class
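To show how a tool might consume such metadata, the following sketch reads the fragment from Example 1 with the standard Java DOM parser and lists the methods it describes. The class PlanMLReader and its output format are hypothetical; only the element and attribute names that appear in Example 1 are used, and nothing else about the PLanML schema is assumed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical reader; not part of any NCA tool.
class PlanMLReader {
    // The fragment from Example 1, reproduced as a string.
    static final String SAMPLE =
          "<typeDef aliasName='HelloWorld' byteSize='1' category='struct'"
        + " name='HelloWorld' typeID='110'>"
        + "<struct aliasName='HelloWorld' type='class'>"
        + "<method access='public' methodID='144' name='value'"
        + " scopeRef='110' typeRef='105'>"
        + "<parm typeRef='87' name='arg'/>"
        + "</method></struct></typeDef>";

    // Returns one "access name/parameterCount" entry per <method> element.
    static List<String> listMethods(String planml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(planml)));
            List<String> result = new ArrayList<>();
            NodeList methods = doc.getElementsByTagName("method");
            for (int i = 0; i < methods.getLength(); i++) {
                Element method = (Element) methods.item(i);
                int parmCount = method.getElementsByTagName("parm").getLength();
                result.add(method.getAttribute("access") + " "
                        + method.getAttribute("name") + "/" + parmCount);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse PLanML", e);
        }
    }

    public static void main(String[] args) {
        for (String entry : listMethods(SAMPLE)) {
            System.out.println(entry);
        }
    }
}
```

The typeRef values (105 and 87) refer to type definitions that Example 1 does not show, so the sketch counts parameters rather than attempting to resolve their types.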
Given the existence of common metadata describing the APIs used for a set of common languages (PLanML), a natural extension is to use XSL/T (XSL Transformations) to allow for the generation of bridging code compliant with the Native Connector Architecture. As part of the transformation process, a compliant tool will be able to:
- Generate all of the bridging code required to create a Connector.
- Generate build instructions/rules (such as a Makefile usable by the make tool).
- Generate descriptor information describing the generated code as well as the output of the build for use in assembly and deployment of the Connector.
NCA defines some default patterns for generating Java language to C++ connectors; however, one of the distinct advantages of the architecture is that the templates used are human readable (and thus modifiable). By invoking an XSL/T processor against different templates, one can generate different types of bridges and different bodies of code, resulting in a highly extensible bridging toolkit.
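As a rough illustration of template-driven generation, the sketch below runs the standard javax.xml.transform API against the metadata from Example 1. The stylesheet is a hypothetical stand-in written for this example and is not one of the NCA default patterns. Because Example 1 does not show the definitions behind typeRef 105 and 87, the generated code uses placeholder type names (T105, T87) instead of real Java types.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical generator; the stylesheet below is an illustrative assumption.
class ConnectorGenerator {
    static final String PLANML =
          "<typeDef aliasName='HelloWorld' byteSize='1' category='struct'"
        + " name='HelloWorld' typeID='110'>"
        + "<struct aliasName='HelloWorld' type='class'>"
        + "<method access='public' methodID='144' name='value'"
        + " scopeRef='110' typeRef='105'>"
        + "<parm typeRef='87' name='arg'/>"
        + "</method></struct></typeDef>";

    // Emits a Java interface skeleton: one declaration per <method>.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/typeDef'>"
        + "<xsl:text>interface </xsl:text><xsl:value-of select='@name'/>"
        + "<xsl:text> {&#10;</xsl:text>"
        + "<xsl:for-each select='struct/method'>"
        + "<xsl:text>    T</xsl:text><xsl:value-of select='@typeRef'/>"
        + "<xsl:text> </xsl:text><xsl:value-of select='@name'/>"
        + "<xsl:text>(</xsl:text>"
        + "<xsl:for-each select='parm'>"
        + "<xsl:if test='position() &gt; 1'>, </xsl:if>"
        + "<xsl:text>T</xsl:text><xsl:value-of select='@typeRef'/>"
        + "<xsl:text> </xsl:text><xsl:value-of select='@name'/>"
        + "</xsl:for-each>"
        + "<xsl:text>);&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "<xsl:text>}&#10;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies a stylesheet to PLanML metadata and returns the generated text.
    static String generate(String stylesheet, String planml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(planml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(generate(STYLESHEET, PLANML));
    }
}
```

Passing a different stylesheet to generate with the same metadata would yield a different body of code, which is the extensibility point the paragraph above describes.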
Resulting Component Anatomy
A given component that adheres to the Native Connector Architecture is modeled as shown in Figure 2:
- The original code is unchanged and may be composed of a single object, or a collection of objects or methods.
- The original code is “wrapped” in methods and objects, which both abstract the original code’s APIs as well as provide any “plumbing” (such as marshalling, un-marshalling, or system calls) that may be needed to isolate the original code.
- A single, new interface is generated around the “plumbing” code, which provides an interface that is natural within the context of the client environment. This means, for example, that if the client is a Java application, the new interface is a pure Java class/object rather than a collection of native methods.
- The client interacts with the component in its natural environment, obeying the semantics and rules that govern it. The mechanisms and foreign semantics of the underlying code are encapsulated and hidden from the client application.
Figure 2: Native Connector Component Anatomy
The runtime architecture of a system based on Native Connectors presumes that the existing client code and existing legacy code remain unchanged by the addition of the Connector. The Native Connector Architecture defines the following elements:
- A Container, which logically encapsulates the APIs exposed by the legacy code. The Container methods delegate to the functionality exposed by the legacy code.
- A Bridge, which maps the Container methods to the Component (the point of integration for client code) and performs any marshalling, un-marshalling, or special system processing.
- A Component or Accessor, which serves as the point of integration for client code.
- A Runtime component, which handles any system behavior required, such as object lifetime management, exception handling, and so forth.
The general runtime architecture for a Connector-based system can be described as in Figure 3:
Figure 3: Runtime Architecture Overview
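The division of responsibilities among these elements can be sketched in plain Java. This is only an analogy under stated assumptions: in a real connector the legacy code and Container would live on the native side, and every class name here (LegacyHello, HelloContainer, HelloBridge, Hello, ConnectorRuntime) is hypothetical. The legacy code is assumed to signal failure with a negative return value, which the Bridge translates into an exception.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the unchanged legacy code (native C++ in a real connector).
final class LegacyHello {
    int value(int arg) { return arg < 0 ? -1 : arg * 2; } // -1 signals an error
}

// Container: encapsulates the legacy API and delegates to it.
final class HelloContainer {
    private final LegacyHello target = new LegacyHello();
    int value(int arg) { return target.value(arg); }
}

// Runtime: tracks object lifetimes through opaque handles.
final class ConnectorRuntime {
    private final Map<Long, Object> live = new HashMap<>();
    private long nextHandle = 1;

    long register(Object o) { live.put(nextHandle, o); return nextHandle++; }

    Object lookup(long handle) {
        Object o = live.get(handle);
        if (o == null) throw new IllegalStateException("stale handle " + handle);
        return o;
    }

    void release(long handle) { live.remove(handle); }
    int liveCount() { return live.size(); }
}

// Bridge: maps Component calls onto Container methods and converts
// legacy error codes into exceptions.
final class HelloBridge {
    private final ConnectorRuntime runtime;
    HelloBridge(ConnectorRuntime runtime) { this.runtime = runtime; }

    long create() { return runtime.register(new HelloContainer()); }

    int value(long handle, int arg) {
        HelloContainer container = (HelloContainer) runtime.lookup(handle);
        int result = container.value(arg);
        if (result < 0) {
            throw new IllegalArgumentException("legacy code rejected " + arg);
        }
        return result;
    }

    void destroy(long handle) { runtime.release(handle); }
}

// Component: the natural Java interface that client code sees.
final class Hello implements AutoCloseable {
    private final HelloBridge bridge;
    private final long handle;

    Hello(ConnectorRuntime runtime) {
        bridge = new HelloBridge(runtime);
        handle = bridge.create();
    }

    int value(int arg) { return bridge.value(handle, arg); }

    @Override
    public void close() { bridge.destroy(handle); }
}

class ConnectorDemo {
    public static void main(String[] args) {
        ConnectorRuntime runtime = new ConnectorRuntime();
        try (Hello hello = new Hello(runtime)) {
            System.out.println(hello.value(21));
        }
        System.out.println("live objects: " + runtime.liveCount());
    }
}
```

The client only ever touches Hello; handles, error codes, and lifetime bookkeeping stay behind the Bridge and Runtime, which mirrors the isolation the architecture aims for.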
Connector Generation Process
The process by which the Bridge, Container, and Component are generated is a mechanical, data-driven set of actions that forms a code generation process. The foundation of this generation is two-fold:
- PLanML, a specific schema of XML that describes the interfaces both consumed and emitted in a language- and platform-agnostic fashion
- XSL/T, a set of patterns embodied as XSL transformations that consume PLanML and emit code for later compilation
The overall process is shown in Figure 4:
Figure 4: Overall Connector Generation Process
The process of metadata generation is shown in Figure 5 and will largely depend on the implementation of tools using the Native Connector Architecture. Examples of metadata sources include:
- Compilers. During the process of compilation, the compiler can emit the metadata that is required by the system.
- Introspection Tools. Certain languages (such as Java, Visual Basic, and languages designed for Microsoft .NET) encode metadata into the generated binaries. Such data can be extracted by introspection tools, which can emit the data required by the system.
- Code Descriptor Databases. While generally not as complete, databases of source data (such as found with source browsers) may be used to generate the metadata. The drawback may be incompleteness because some information needed for correct generation may not be present.
- Other Tools. Other classes of tools, such as interpreters, can also provide such information.
Figure 5: Metadata Generation Phase
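The introspection route can be sketched with Java reflection, which exposes the metadata that the compiler encodes into class files. The emitter below is a hypothetical illustration: its element names echo Example 1, but the attributes it writes (returns, type) are assumptions for readability and are not actual PLanML. The Greeter interface is a made-up sample type.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical introspection tool; output only loosely resembles PLanML.
class MetadataEmitter {
    // Made-up sample type to introspect.
    interface Greeter {
        int value(String arg);
    }

    // Describes the declared methods of a type as XML-like metadata.
    static String describe(Class<?> type) {
        StringBuilder xml = new StringBuilder();
        xml.append("<typeDef name=\"").append(type.getSimpleName()).append("\">\n");
        Method[] methods = type.getDeclaredMethods();
        // Reflection does not guarantee an order, so sort for stable output.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        for (Method method : methods) {
            if (method.isSynthetic()) {
                continue; // skip compiler-generated methods
            }
            String access = Modifier.isPublic(method.getModifiers()) ? "public" : "non-public";
            xml.append("  <method name=\"").append(method.getName())
               .append("\" access=\"").append(access)
               .append("\" returns=\"").append(method.getReturnType().getName())
               .append("\">\n");
            for (Class<?> parameterType : method.getParameterTypes()) {
                xml.append("    <parm type=\"").append(parameterType.getName()).append("\"/>\n");
            }
            xml.append("  </method>\n");
        }
        xml.append("</typeDef>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(Greeter.class));
    }
}
```

Parameter names are omitted on purpose: they are only available through reflection when the class was compiled with the -parameters option, a small instance of the incompleteness risk noted for some metadata sources.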
The process of component specification can be performed automatically or manually and is shown in Figure 6.
Figure 6: Component Design/Specification Phase
The limitations imposed on automated component specification are largely driven by the compatibility between the languages being bridged as well as whether a set of rules can be devised that can automatically handle issues such as naming conflicts, type promotions, or conversions. In contrast, a fully manual implementation requires user interaction to drive the component specification process, including choosing the interfaces, identifiers used, and parameter conversions, as well as the management of unsupported types and methods.
The ideal solution is a hybrid, one in which a tool based on the NCA will handle much of the automated conversion process by the judicious use of established defaults for naming conventions, parameter usage, and conflict resolution. Such a tool will flag elements it cannot handle appropriately and allow the user to interact with the tool to resolve conflicts.
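A small sketch of this hybrid approach follows. It assumes one possible default, converting underscore-separated native identifiers to Java camel case, and flags anything that collides with a Java keyword or with a name already assigned so that a user can resolve it manually. The NameMapper class, its conversion rule, and the abbreviated keyword list are illustrative assumptions, not rules defined by NCA.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical naming defaults with conflict flagging.
class NameMapper {
    // Abbreviated list of Java keywords, for illustration only.
    static final Set<String> RESERVED = new HashSet<>(Arrays.asList(
            "class", "int", "new", "native", "package", "private", "public"));

    // Default convention: get_value becomes getValue.
    static String toJavaName(String nativeName) {
        StringBuilder sb = new StringBuilder();
        boolean upperNext = false;
        for (char c : nativeName.toCharArray()) {
            if (c == '_') {
                upperNext = sb.length() > 0; // leading underscores are dropped
            } else {
                sb.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            }
        }
        return sb.toString();
    }

    // Maps what it can automatically; names it cannot handle are added
    // to needsReview for manual resolution.
    static Map<String, String> mapAll(List<String> nativeNames, List<String> needsReview) {
        Map<String, String> mapped = new LinkedHashMap<>();
        Set<String> used = new HashSet<>();
        for (String nativeName : nativeNames) {
            String javaName = toJavaName(nativeName);
            if (javaName.isEmpty() || RESERVED.contains(javaName) || !used.add(javaName)) {
                needsReview.add(nativeName);
            } else {
                mapped.put(nativeName, javaName);
            }
        }
        return mapped;
    }

    public static void main(String[] args) {
        List<String> review = new ArrayList<>();
        Map<String, String> mapped =
                mapAll(Arrays.asList("get_value", "getValue", "new", "set_value"), review);
        System.out.println("mapped automatically: " + mapped);
        System.out.println("needs user input: " + review);
    }
}
```

In the sample run, get_value and set_value are mapped by default, while getValue (a collision after conversion) and new (a Java keyword) are left for the user.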
Code generation (Figure 7) is the process of creating the bridge code to map between the two sets of APIs, those present in the target and those specified as part of the client interface.
Figure 7: Code Generation Phase
Since the metadata used to describe both sets of APIs is a specific schema based on XML (PLanML), the architecture defines the use of XSL stylesheets to drive a pattern-based code generation process. Tools based on NCA will consume PLanML and pass this data to an XSL processing engine. In general, this may require multiple passes through the processing engine, once for each type of output desired. This approach may be useful for extensibility, since properly factored XSL stylesheets can be used in alternative bridge generation scenarios.
Build, Assembly, and Deployment
The final process involves taking the generated source, building the result into usable binary components, and packaging the results for deployment, as shown in Figure 8.
Figure 8: Build, Assembly, and Deployment Phase
While building is generally independent of the type of bridge or target environment (although the latter may require building on the target platform), packaging and deployment change depending on the type of bridge being created. For example, when deploying into the J2EE™ environment, the location, naming, and descriptor information will differ from those of a standalone J2SE™ application.
Because the generation of a bridge by a tool such as the Sun ONE Studio Native Connector Tool is fully automated, and the target type is known, the system knows precisely which components were generated. It is therefore fully capable not only of generating the build rules (for example, scripts or makefiles), but also of populating the resulting packages appropriately and emitting the descriptor data needed for deployment.
To illustrate the differences between JNI and Native Connector processes, Figure 9 provides a comparison of the steps involved in each.
Figure 9: Process comparison between JNI and a Native Connector implementation.
As can be seen from the above, most of the steps required by JNI are manual in nature, while those required by a Native Connector Architecture implementation are generally automatic:
- Metadata Acquisition. There is no counterpart in JNI. The Native Connector Architecture derives this metadata via implementation-provided tools.
- API Specification. In JNI, this is enabled by the use of the native keyword, followed by manual invocations of javac and javah. NCA derives these APIs either automatically via some implementation-provided tool or via manual specification.
- Code Generation. JNI code must be manually created as part of the implementation of the bridge. Under NCA, code generation is fully automatic.
- Build. Manual under JNI, automatic under NCA.
- Assembly. Manual under JNI, automatic under NCA.
- Deployment. Manual under JNI, automatic under NCA.
The Native Connector Architecture defines a model for the easy, automatic generation of components in, for instance, the Java language, which are bound to existing legacy implementations written in languages such as C and C++. The architecture and tools provide an extensible model for the addition of new bridge types and support for additional languages and platforms. While at this time tools exist that allow for the generation of J2SE and J2EE Connectors to C and C++, future implementations will likely provide additional capabilities and language/platform support, such as interoperability between Microsoft .NET components and other popular languages, such as Fortran.
About the Author
Robert Brewin is the Architect for Enterprise Development Tools responsible for architectural and engineering issues for Sun ONE Studio products.
Sun, Sun Microsystems, the Sun Logo, Sun ONE Studio, Java, Solaris, J2SE, and J2EE are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.