Object Serialization


November 7, 2005

This series, The Object-Oriented Thought Process, is intended for someone who is just learning an object-oriented language and wants to understand the basic concepts before jumping into the code, or for someone who wants to understand the infrastructure behind an OOP language they are already using.

In keeping with the code examples used in the previous articles, Java will be the language used to implement the concepts in code. One of the reasons that I like to use Java is that you can download the Java compiler for personal use at the Sun Microsystems Web site, http://java.sun.com/. You can download the J2SE 1.4.2 SDK (software development kit) to compile and execute these applications, and I will provide the code listings for all examples in this article. I have the SDK 1.4.0 loaded on my machine. I will also provide figures and the output (when appropriate) for these examples. See the previous articles in this series for detailed descriptions of how to compile and run all the code examples.

In the previous article in this series, I covered the topic of object signatures. Although object signatures may seem like a fairly basic concept, they are in fact the cornerstones of object-oriented design. The design of the object signatures defines how the objects are used. These signatures can define brand new object services, or they can interface to existing services. This is where the term object wrapper comes into play.

In the next few columns, you will explore the concept of object wrappers by considering several interesting technologies. In many ways, object wrapping is simply another example of the paramount object-oriented concept pertaining to separating the interface from the implementation.

Perhaps the most obvious topic to illustrate this concept is that of data handling—saving program data to external media with the intent of restoring it later. The basic idea is to develop a way to encapsulate various data access technologies and make them transparent to the user. In short, you will create a single interface that will allow the user to save and restore an object, regardless of the technology.

In this article, you will investigate the process of serializing an object so that you can write it out to a file. Next month, you will create wrappers that will allow you to write data from a Java program to a database using JDBC (your first example will be Microsoft Access). Over the course of these next few articles, you will focus on the concept of object wrappers. After you cover the topics of object serialization and connecting to a database, you will see the power of object wrappers first hand.

Interface/Implementation Revisited

The overriding concept behind object wrappers is that of the separation of the interface from the implementation. By now, you should have realized that this concept is the basis of much of the material in this column. Thus, when designing a class, what the user needs to know and what the user does not need to know are of vital importance. Encapsulation is the means by which nonessential data is hidden from the user.

Consider the example of designing and producing a simple toaster. The toaster, or any appliance for that matter, is simply plugged into an interface, which is an electrical outlet. All appliances can access electricity by complying with and using the correct interface: the electrical outlet. The toaster doesn't need to know about the implementation, or how the electricity is produced. A coal plant or a nuclear plant could produce the electricity—the appliance does not care which, as long as the interface works.

As another example, consider an automobile. The interface between you and the car includes components such as the steering wheel, gas pedal, brakes, and ignition switch. For most people, aesthetic issues aside, the main concern when driving a car is that the car starts, accelerates, stops, steers, and so on. The implementation, basically the stuff that you don't see, is of little concern to the average driver. In fact, most people would not even be able to identify certain components, such as the catalytic converter and gaskets. However, any driver would recognize and know how to use the steering wheel because this is a common interface. By installing a standard steering wheel in the car, manufacturers are assured that the people in their target market will be able to use the mechanism.

If, however, a manufacturer decided to install a joystick in place of the steering wheel, most drivers would balk at this, and the automobile might not be a big seller (except for some eclectic people who love bucking the trends). On the other hand, as long as the performance and aesthetics didn't change, the average driver would not notice whether the manufacturer changed the engine—part of the implementation—of the automobile.

It must be stressed that the interchangeable engines must be identical in every way—as far as the driver's perceptions go. Replacing a four-cylinder engine with an eight-cylinder engine would change the rules just as changing the current from AC to DC would affect the rules in the power plant example.

The engine is part of the implementation, and the steering wheel is part of the interface. A change in the implementation should have no impact on the driver, whereas a change to the interface might.

Interfaces also relate directly to classes. End users do not normally see any classes; they see the GUI or command line. However, programmers would see the class interfaces. Class reuse means that someone has already written a class. Thus, a programmer who uses a class must know how to use the class. This programmer will combine many classes to create a system. The programmer is the one who needs to understand the interfaces of a class. Therefore, when I talk about users in this article, I mean designers and developers, not end users. And when I talk about interfaces, I am talking about class interfaces, not GUIs.

To encapsulate data, classes are designed in two parts—the interface and the implementation.

The Interface

The interface consists of the services that are presented to a user of the class. In the best case, only the services that the user needs are presented. Of course, which services the user needs may be a matter of opinion. If you put 10 people in a room and ask each of them to do an independent design, you might receive 10 totally different designs. There is nothing wrong with this. However, as a rule of thumb, the interface to a class should contain only what the user needs to know. In the toaster example, the user only needs to know that the toaster must be plugged into the interface, which in this case is the electrical outlet.

Perhaps the most important issue when designing a class is identifying the audience, or users, of the class.

The Implementation

The implementation details of the interface services are hidden from the user and can be changed as long as the interface remains the same. Recall that in the toaster example, although the interface is always the electric outlet, the implementation could change from a coal power plant to a nuclear power plant without affecting the toaster. There is one very important caveat to be made here: The coal or nuclear plant must also conform to the interface specification. If the coal plant produces AC power, but the nuclear plant produces DC power, there is a problem. The bottom line is that both the user and the implementation must conform to the interface specification.
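The toaster example can be sketched in Java. This is only an illustration: the names PowerSource, CoalPlant, NuclearPlant, and Toaster are hypothetical and do not appear elsewhere in this series. The point is that the Toaster is written against the interface alone, and both plants conform to the same specification (AC power).

```java
// Hypothetical names used only to illustrate the toaster analogy.
interface PowerSource {
    // The interface specification: every implementation must deliver AC power.
    String supplyPower();
}

class CoalPlant implements PowerSource {
    public String supplyPower() { return "AC"; }
}

class NuclearPlant implements PowerSource {
    public String supplyPower() { return "AC"; }
}

class Toaster {
    private final PowerSource outlet;

    Toaster(PowerSource outlet) { this.outlet = outlet; }

    String toast() {
        // The toaster depends only on the interface, never on the kind of plant.
        return "Toasting with " + outlet.supplyPower() + " power";
    }
}

class ToasterDemo {
    public static void main(String[] args) {
        // Swapping the implementation does not change what the toaster does.
        System.out.println(new Toaster(new CoalPlant()).toast());
        System.out.println(new Toaster(new NuclearPlant()).toast());
    }
}
```

Both lines of output are identical, which is exactly what you want: the implementation changed, but the user of the interface could not tell.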

Using Wrappers to Hide the Implementation

How does the concept of interface/implementation relate to the discussion on serialization and connecting to a database? The basic idea is that you, as a user of a class, should be able to write an object to a persistent data storage device without knowing what the implementation of the device is. This holds for specific implementations, such as whether you use an MS Access database or an Oracle database. It also holds true for the actual means of storage, such as whether you use serialization or connect to a database. In short, all you should have to do is write the object; the underlying implementation should be hidden. In this way, you can change the implementation without affecting the user code. You will explore this concept in great detail after you cover the specific technologies. As stated earlier, this article describes object serialization; the next will cover connecting to a database using JDBC. Then, you will build the appropriate wrappers to allow you to seamlessly use one or both of these technologies. First, though, you will see how to write an object to a file.
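To make the idea concrete, here is a rough sketch of what such a wrapper might look like. It is not the design that later articles in this series will build; the names ObjectStore, FileObjectStore, and MemoryObjectStore are hypothetical, the in-memory map merely stands in for a database, and the code uses try-with-resources, so it needs a newer JDK than the J2SE 1.4 SDK mentioned earlier.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: the only thing user code needs to know about.
interface ObjectStore {
    void save(String key, Serializable object) throws IOException;
    Object restore(String key) throws IOException, ClassNotFoundException;
}

// One possible implementation: one serialized file per key.
class FileObjectStore implements ObjectStore {
    private final File directory;

    FileObjectStore(File directory) { this.directory = directory; }

    public void save(String key, Serializable object) throws IOException {
        File target = new File(directory, key + ".ser");
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(target))) {
            out.writeObject(object);
        }
    }

    public Object restore(String key) throws IOException, ClassNotFoundException {
        File source = new File(directory, key + ".ser");
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(source))) {
            return in.readObject();
        }
    }
}

// Another possible implementation: an in-memory map standing in for a database.
class MemoryObjectStore implements ObjectStore {
    private final Map<String, byte[]> rows = new HashMap<>();

    public void save(String key, Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        rows.put(key, bytes.toByteArray());
    }

    public Object restore(String key) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(rows.get(key)))) {
            return in.readObject();
        }
    }
}

class ObjectStoreDemo {
    // User code: identical no matter which implementation is passed in.
    static Object saveAndRestore(ObjectStore store, Serializable object) {
        try {
            store.save("person", object);
            return store.restore("person");
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        File tempDir = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(saveAndRestore(new FileObjectStore(tempDir), "Jack Jones"));
        System.out.println(saveAndRestore(new MemoryObjectStore(), "Jack Jones"));
    }
}
```

Notice that saveAndRestore never mentions files or maps. Replacing one store with the other changes the implementation without touching the user code.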

Serializing an Object

No matter what type of business application you create, saving the data to a storage device most likely will be part of the mix. In fact, one of my favorite lines when it comes to software development is "it's all about the data." In short, no matter what hardware, operating system, application software, and so forth is used when creating a software application, the data may well be the reason for creating the system in the first place.

Persistent Objects Basics

Recall that when an object is instantiated by an application, it lives only as long as the application itself. Thus, if you instantiate an Employee object that contains attributes such as name, ss#, and the like, that Employee object will cease to exist when the application terminates. Figure 1 illustrates the traditional object life cycle.

Figure 1: The Object Life Cycle.

When the Employee object is instantiated and initialized, it has a specific state. Remember that the state of an object is defined by the values of its attributes. If you want to maintain the state of the Employee object, you must take some sort of action to save the state of this object beyond the life of the application. The concept of saving the state of an object so that it can be used later is called persistence. Thus, you use the term persistent object to define an object that can be restored and used independent of the application. Figure 2 illustrates the traditional object life cycle with persistence.

Figure 2: Object Life Cycle with Persistence.

There are many ways to save the state of an object. Some of these are as follows:

  • Save to a flat file
  • Save to a relational database
  • Save to an object database

Saving to a Flat File

The first example I will cover is that of using a flat file for object persistence. I define a flat file as a simple file managed by the operating system. This is a very simple concept, so don't get too caught up in this description.

Note: Many people do not like to use the term flat file. The word flat implies that the object is literally flattened, and in a way it is.

One of the things that you may have thought about is the fact that an object cannot be saved to a file like a simple variable—and this is true. In fact, the problem of saving the state of an object has led to a complete software application industry, which I will explain at length later in this article. Normally, when you save a number of variables to a file, you know the order and type of each variable, and then you simply write them out to the file. It could be a comma-delimited file or any other protocol that you may determine.
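As a point of comparison, the following sketch shows the "simple variable" approach just described. It is only an illustration: the class name DelimitedFieldsDemo and the age field are made up for this example, and a StringWriter stands in for a real file so the code has no side effects. Both the writer and the reader have to agree on the order and type of every field.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class DelimitedFieldsDemo {
    // Writes the fields in a fixed order: name first, then age.
    static String write(String name, int age) {
        StringWriter buffer = new StringWriter();   // stands in for a FileWriter
        PrintWriter out = new PrintWriter(buffer);
        out.print(name + "," + age);
        out.flush();
        return buffer.toString();
    }

    // The reader must already know the order and the type of every field.
    static String describe(String line) {
        String[] fields = line.split(",");
        String name = fields[0];
        int age = Integer.parseInt(fields[1]);
        return name + " is " + age;
    }

    public static void main(String[] args) {
        String line = write("Jack Jones", 42);
        System.out.println(line);             // Jack Jones,42
        System.out.println(describe(line));   // Jack Jones is 42
    }
}
```

This works for a handful of primitive values, but it does not scale to objects that contain other objects, which is the problem the rest of this section addresses.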

The problem with an object is that it is not simply a collection of primitive variables. An object can be thought of as an indivisible unit that is composed of a number of parts. Thus, the object must be decomposed into a unit that can be written to a flat file. After the object is decomposed and written to a flat file, there is one major issue left to consider—recomposing the object, basically putting it back together.

Another major problem with storing objects relates to the fact that an object can contain other objects. Consider that a Car object may contain objects like Engines and Wheels. When you save the object to a flat file (or anything else for that matter), you need to save the entire object hierarchy, Car, Engines, and the like.

Java has a built-in mechanism for object persistence. Like other C-based languages, Java largely utilizes the concept of a stream to deal with I/O. To save an object to a file, Java writes it to the file via a Stream. To write to a Stream, objects must implement either the Serializable or Externalizable interface.
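The Car example can be sketched as follows. The Car and Engine classes here are minimal, hypothetical versions written for this illustration, and the object is written to an in-memory byte array rather than a file to keep the example self-contained. The code uses try-with-resources, so it needs a newer JDK than the J2SE 1.4 SDK mentioned earlier.

```java
import java.io.*;

class Engine implements Serializable {
    private final int cylinders;

    Engine(int cylinders) { this.cylinders = cylinders; }

    int getCylinders() { return cylinders; }
}

class Car implements Serializable {
    private final String model;
    private final Engine engine;   // a contained object, saved along with the Car

    Car(String model, Engine engine) {
        this.model = model;
        this.engine = engine;
    }

    String getModel() { return model; }

    Engine getEngine() { return engine; }
}

class CarGraphDemo {
    // Writes the Car to a byte array through a stream, then reads it back.
    static Car roundTrip(Car car) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(car);   // one call writes the Car and its Engine
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Car) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Car restored = roundTrip(new Car("Roadster", new Engine(4)));
        System.out.println(restored.getModel() + " with "
                + restored.getEngine().getCylinders() + " cylinders");
    }
}
```

Note that the Engine class must also implement Serializable; otherwise, writing the Car would fail when the stream reaches the contained Engine.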

Serializing and Marshalling Objects

You have already seen the problem of using objects in environments that were originally designed for structured programming. The middleware example, where you wrote objects to a relational database, is one good example. You also touched on the problem of writing an object to a flat file or sending it over a network.

Basically, to send an object over a wire (for example, to a file, over a network), the system must deconstruct the object (that is, flatten it out), send it over the wire, and then reconstruct it on the other end of the wire. This process is called serializing an object. The act of actually sending the object across a wire is called marshalling an object. A serialized object, in theory, can be written to a flat file and retrieved later, in the same state in which it was written.

The major issue here is that the serialization and de-serialization must use the same specifications. It is sort of like an encryption algorithm. If one object encrypts a string, the object that wants to decrypt it must use the same encryption algorithm. Java provides an interface called Serializable that marks a class as eligible for this translation; the Java runtime then applies the same serialization format on both ends.

Serializing a File

As an example, consider the following code for a class called Person:

import java.util.*;
import java.io.*;
class Person implements Serializable{
   private String name;
   public Person(){
   }
   public Person(String n){
      System.out.println("Inside Person's Constructor");
      name = n;
   }
   String getName() {
      return name;
   }
}

This is a simple class that contains only a single attribute representing the name of the person.

The item of note here is the line that identifies the class as Serializable. If you actually inspect the Java documentation, you will realize that the Serializable interface really does not contain much; in fact, it is meant solely to identify that the object will be serialized. Below is a short description from the J2SE API specification.

public interface Serializable

Serializability of a class is enabled by the class implementing the java.io.Serializable interface. Classes that do not implement this interface will not have any of their state serialized or deserialized. All subtypes of a serializable class are themselves serializable. The serialization interface has no methods or fields and serves only to identify the semantics of being serializable.

http://java.sun.com/
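You can see the effect of this marker interface with a small experiment. In the sketch below, PlainPerson and MarkerDemo are hypothetical names used only for this illustration; PlainPerson deliberately does not implement Serializable. The code uses try-with-resources, so it needs a newer JDK than the J2SE 1.4 SDK mentioned earlier.

```java
import java.io.*;

// Hypothetical class that does NOT implement Serializable.
class PlainPerson {
    String name = "Jack Jones";
}

class MarkerDemo {
    // Returns true if the object could be written to an ObjectOutputStream.
    static boolean canSerialize(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            // Thrown when the object's class lacks the Serializable marker.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new PlainPerson()));   // false
        System.out.println(canSerialize("Jack Jones"));        // true (String is Serializable)
    }
}
```

Nothing else about PlainPerson prevents it from being written; the missing interface alone is what makes writeObject refuse it.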

You use this Serializable interface to allow you to serialize your Person objects.

class Person implements Serializable {
}

This Person class also contains a method called getName( ) that returns the name of the object. Other than the Serializable interface, there is really nothing new about this class that you have to consider. Here is where the interesting stuff starts. You now want to write an application that will write this object to a flat file. The application is called SavePerson and is as follows:

import java.util.*;
import java.io.*;
public class SavePerson{
   public static void main(String args[]){
      Person person = new Person("Jack Jones");
      try{
         FileOutputStream fos = new FileOutputStream("Name.txt");
         ObjectOutputStream oos = new ObjectOutputStream(fos);
         System.out.print("Person's Name Written: ");
         System.out.println(person.getName());
         oos.writeObject(person);
         oos.flush();
         oos.close();
      }  catch(Exception e){
         e.printStackTrace();
      }
   }
}

Although some of this code, such as the file I/O, uses more sophisticated Java that has not been explicitly covered, you can get a general idea of what is happening when an object gets serialized and written to a file. You can explore the code in much greater detail with a few of the books referenced at the end of this article.

By now you should realize that this is an actual application. How can you tell this? The fact that the code has a main method in it is a sure tip that this is an actual application.

This application basically does three things:

  1. Instantiates a Person object.
  2. Serializes the object.
  3. Writes the object to the file Name.txt.

The actual act of serializing and writing the object is accomplished in the following code:

oos.writeObject(person);

Now, this is obviously a lot simpler than writing each individual attribute out one at a time. It is very convenient to simply write the object directly to the file.

It is important to know that the underlying implementation is not quite that simple. This is yet another great example of the difference between the interface and the implementation. The programmer's interface is to simply write the object to the file. You don't care how the object is actually written at the physical level. All you care about is:

  • That you can write the object as an indivisible unit
  • That you can restore the object exactly as you stored it

It is interesting to actually look at the file produced when the SavePerson application is executed. Remember that when you write data to a text file you can inspect and even edit the data. When you write a serialized object to a file, you can look at it; however, you can't (or shouldn't) edit it. Figure 3 shows what the Name.txt file looks like when opened in Notepad.

Figure 3: The Serialized Object as a Text File.

Although the format of this file is far from obvious, notice that there are some things that you can identify—for example, the name "Jack Jones".
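If you would rather inspect the serialized bytes from code than in an editor, the following sketch is one way to do it. SamplePerson is a hypothetical stand-in for the Person class, and the object is serialized to a byte array instead of Name.txt. A Java serialization stream begins with a fixed header (the magic number 0xACED followed by the stream version 5), which is part of what makes the file look like gibberish in a text editor. The code uses try-with-resources, so it needs a newer JDK than the J2SE 1.4 SDK mentioned earlier.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the Person class used in this article.
class SamplePerson implements Serializable {
    private final String name;

    SamplePerson(String name) { this.name = name; }
}

class StreamPeekDemo {
    // Serializes the object into a byte array instead of a file.
    static byte[] serialize(Serializable object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Formats the first four bytes (the stream header) as hexadecimal.
    static String headerHex(byte[] data) {
        return String.format("%02X %02X %02X %02X",
                data[0] & 0xFF, data[1] & 0xFF, data[2] & 0xFF, data[3] & 0xFF);
    }

    // Checks whether a piece of plain text is visible among the raw bytes.
    static boolean containsText(byte[] data, String text) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] data = serialize(new SamplePerson("Jack Jones"));
        System.out.println(headerHex(data));                    // AC ED 00 05
        System.out.println(containsText(data, "Jack Jones"));   // true
    }
}
```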

The SavePerson application writes the object to the file Name.txt. The following code restores the object.

import java.io.*;
import java.util.*;
public class RestorePerson{
   public static void main(String args[]){
      try{
         // Open the file that SavePerson wrote
         FileInputStream fis = new FileInputStream("Name.txt");
         ObjectInputStream ois = new ObjectInputStream(fis);
         // Rebuild the Person object from the serialized data
         Person person = (Person )ois.readObject();
         System.out.print("Person's Name Restored: ");
         System.out.println(person.getName());
         ois.close();
      }  catch(Exception e){
         e.printStackTrace();
      }
   }
}

The main line of interest here is the code that retrieves the object from the file Name.txt.

Person person = (Person )ois.readObject();

It is important to note that the object is reconstructed from the flat file and a new instance of a Person object is instantiated and initialized. This Person object is an exact replica of the Person object that you stored in the SavePerson application. Figure 4 shows the output of both the SavePerson and the RestorePerson applications.

Figure 4: Serializing an Object.

Note: The name "Jack Jones", part of the Person object, is stored in the file Name.txt when the application is executed and then the object is restored when the application RestorePerson is executed. When the object is restored, you can access the name attribute. The beauty of this approach is that all the programmer must do is save and restore the object. There is no need to deal with the individual parts of the object separately.
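The claim that the restored object is a new instance with the same state can be checked with a short sketch. TrackedPerson and ReplicaDemo are hypothetical names for this illustration, and the object makes its round trip through a byte array rather than Name.txt. The static counter is an addition for demonstration purposes: it shows that restoring a Serializable object does not run that class's constructor. The code uses try-with-resources, so it needs a newer JDK than the J2SE 1.4 SDK mentioned earlier.

```java
import java.io.*;

// Hypothetical variant of Person that counts constructor calls.
class TrackedPerson implements Serializable {
    static int constructorCalls = 0;   // static fields are not part of the saved state
    private final String name;

    TrackedPerson(String name) {
        constructorCalls++;
        this.name = name;
    }

    String getName() { return name; }
}

class ReplicaDemo {
    // Serializes the object to a byte array and restores it again.
    static TrackedPerson roundTrip(TrackedPerson original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TrackedPerson) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TrackedPerson original = new TrackedPerson("Jack Jones");
        int callsBefore = TrackedPerson.constructorCalls;
        TrackedPerson restored = roundTrip(original);

        System.out.println(restored.getName());                              // Jack Jones
        System.out.println(restored != original);                            // true
        System.out.println(TrackedPerson.constructorCalls - callsBefore);    // 0
    }
}
```

The restored object carries the same name but is a separate instance, and its state came from the stream rather than from a constructor call.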

Conclusion

The main thrust of this article was to describe the technology of object serialization. By studying and using the code in the examples above, you can save objects to a simple flat file and restore them later. However, the underlying theme for this article and the following articles is the idea of object wrappers. Next month, you will cover the technology of connecting to a database using JDBC. Once this is complete, you then will create useful object wrappers and design an application that will successfully separate the implementation of the data storage medium from the interfaces that you have carefully designed.

References

Gilbert, Stephen, and Bill McCarty: Object-Oriented Design in Java. The Waite Group, 1998.

Meyers, Scott: Effective C++. Addison-Wesley, 1992.

Tyma, Paul, Gabriel Torok and Troy Downing: Java Primer Plus. The Waite Group, 1996.

Ambler, Scott: The Object Primer. Cambridge University Press, 1998.

Jaworski, Jamie: Java 1.1 Developers Guide. Sams Publishing, 1997.

www.javasoft.com

About the Author

Matt Weisfeld is a faculty member at Cuyahoga Community College (Tri-C) in Cleveland, Ohio. Matt is a member of the Information Technology department, teaching programming languages such as C++, Java, and C# .NET as well as various web technologies. Prior to joining Tri-C, Matt spent 20 years in the information technology industry gaining experience in software development, project management, business development, corporate training, and part-time teaching. Matt holds an MS in computer science and an MBA in project management. Besides The Object-Oriented Thought Process, which is now in its second edition, Matt has published two other computer books, and more than a dozen articles in magazines and journals such as Dr. Dobb's Journal, The C/C++ Users Journal, Software Development Magazine, Java Report, and the international journal Project Management. Matt has presented at conferences throughout the United States and Canada.
