A Quick Look at InterSystems' Cache Database
Way back at the dawn of time, dinosaurs ruled the earth (you know, 40 years ago). These dinosaurs used the cutting-edge data architectures known fondly as hierarchical and network data models. These models worked well for their intended purposes (giving fast access to the applications data), but were limited in the sense that only the application knew how to access the data (putting the various pieces together as needed) and did help other system, users, and others make use of the data without going through the application guarding the data, or replicating the logic to navigate through the data. Thus was born the relational model for data access incorporating tables, tuples, DDL, DML, and all the other things we take for granted in a relational database.
Now, if you are like me, you occasionally hit the wall when trying to use cutting-edge object technologies based upon a typical relational database. Mapping internal object classes to their relational table equivalents and working out the underlying details can often be quite a mess.
A significant portion of your code is related to marshalling the data to and from the database into the internal object representation of your code. OODB (Object Oriented Database) vendors like to call this impedance mismatch. So, you either deal with all of the marshalling code, or you tend to deal with the data in a non-object oriented manner to avoid the marshalling code. You probably also find that you spend a lot of time writing code that performs a left outer join on a dozen tables just to get the information for the query. I often find that the update SQL code is considerably worse that the select SQL. And somehow the so-called SQL standard is just not useful enough to handle the real world that was supposed to make it all worthwhile, because to actually do real work, you are forced to either A) use the vendor extensions and code a mound of triggers and stored procedures, or B) accept substandard performance from the database and embed all of the business logic in the application.
These are not new problems as it was recognized in the mid eighties, though it has grown worse over the years as we demand more from computers. With the growing popularity of Object Oriented Programming Language (OOPL), vendors realized a new way to model their data; thus was born the OODB. The OODB's of the mid eighties unfortunately repeated the problems of the earlier network database, where the only way to access the data was through the underlying OOPL. There were also performance problems, vendor instability, and similar problems as typical of new technologies.
The best that I am able to figure out, InterSystem's new Caché 5 post-relational database enters the picture with the goal of presenting the best of relational and object databases in a unified model without sacrificing performance, scalability, or maintainability. Let's see how well they did.
Caché 5 boasts a wealth of features that any relational database programmer or DBA would appreciate, including:
- A multidimensional data engine that uses an advanced storage mechanism to ensure ultra-fast transactional performance, even in high-volume environments. This effectively eliminates the need to do advanced joins as in typical relational databases. Caché's built-in transactional bitmap indexing mechanism delivers excellent query performance while maintaining update performance equal to that of typical RDBMS indexes. Traditional indices are also available, but the bitmapped index is just the ticket for a data-mining application.
- A multi-tier architecture that allows separation of application servers from data servers, or data-mining servers from transaction servers, or replication servers all in an easily administered environment. Intersystems ECP (Enterprise Cache Protocol) is the secret to making this work.
- Unified data architecture that allows data to be simultaneously represented as both objects and tables, giving developers the option to work with the model they are most comfortable with. Developers can also model their data as objects, tables, or multidimensional arrays. Defining an object property automatically results in a corresponding column and vice-versa. Caché includes a powerful IDE (Integrated Development Environment) for creating Caché Object Classes, and can also import its data from external sources such as DDL or Rational Rose.
- Open data access architecture that supports many technologies and protocols including ODBC, SOAP, ActiveX, .NET, XML, HTML, C++, COM, Java, EJB, and JDBC. There are native binding for Java and C++, but other programmers do not have to feel left out.
- Web architecture that allows developers to create cutting-edge Web applications using either their own Web technologies or Caché's own high-performance Web technology known as Caché Server Pages (CSP). In addition, Caché has built-in support for Web Services; any Caché object method or stored procedure can be exposed as an SOAP service or an XML object.
- Cross-Platform support—Red Hat & Suse Linux, AIX, HP/UX, Solaris, Microsoft Windows, as well as some other platforms for older versions of Caché. Caché Web technology also works on various Web servers; IIS and Apache are both supported in release 5.0.
I have looked at OODB before, and I must say that after looking at Caché, I am favorably impressed. Caché has taken great strides in object-relational technology, providing a comprehensive solution that is both powerful and flexible enough for even the most demanding environments. Other environments that attempt to mix OODB and relational technology simply attempt to graft one on top of the other (which is native and which is grafted depends on the vendor). Unlike grafted branches on fruit tree, the grafted DB fruit does not taste right and often has additional compromises. Performance is often a forced tradeoff for the ability to use the grafted technology and you are usually relegated to a second-rate implementation of the grafted technology, forcing the programmer to use unfamiliar object routines or jump through extra hoops to access the data. With Caché, you really do get the best of "both" worlds, without having to compromise performance for flexibility.