The following is extracted from the book Applied Microsoft .NET Framework Programming by Jeffrey Richter (Microsoft Press, 2002, ISBN: 0-7356-1422-9). Copyright 2002, Jeffrey Richter. Reproduced by permission of Microsoft Press. All rights reserved.
Common Object Operations
In this chapter, I’ll describe how to properly implement the operations that all objects must exhibit. Specifically, I’ll talk about object equality, identity, hash codes, and cloning.
Object Equality and Identity
The System.Object type offers a virtual method, named Equals, whose purpose is to return true if two objects have the same “value”. The .NET Framework Class Library (FCL) includes many methods, such as System.Array’s IndexOf method and System.Collections.ArrayList’s Contains method, that internally call Equals. Because Equals is defined by Object and because every type is ultimately derived from Object, every instance of every type offers the Equals method. For types that don’t explicitly override Equals, the implementation provided by Object (or the nearest base class that overrides Equals) is inherited. The following code shows how System.Object’s Equals method is essentially implemented:
class Object {
    public virtual Boolean Equals(Object obj) {
        // If both references point to the same
        // object, they must be equal.
        if (this == obj) return(true);

        // Assume that the objects are not equal.
        return(false);
    }
    ...
}
As you can see, this method takes the simplest approach possible: if the two references being compared point to the same object, true is returned; in any other case, false is returned. If you define your own types and you want to compare their fields for equality, Object’s default implementation won’t be sufficient for you; you must override Equals and provide your own implementation.
When you implement your own Equals method, you must ensure that it adheres to the four properties of equality:
- Equals must be reflexive; that is, x.Equals(x) must return true.
- Equals must be symmetric; that is, x.Equals(y) must return the same value as y.Equals(x).
- Equals must be transitive; that is, if x.Equals(y) returns true and y.Equals(z) returns true, then x.Equals(z) must also return true.
- Equals must be consistent. Provided that there are no changes in the two values being compared, Equals should consistently return true or false.
If your implementation of Equals fails to adhere to all these rules, your application will behave in strange and unpredictable ways.
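To make the rules above concrete, here is a minimal sketch of an Equals override that breaks symmetry. The CaseInsensitiveName type is hypothetical and exists only for this illustration; the comparison calls (String.Equals with StringComparison.OrdinalIgnoreCase and StringComparer.OrdinalIgnoreCase) are real FCL members from .NET Framework 2.0 and later.

```csharp
using System;

// Hypothetical type, used only to illustrate a symmetry violation.
class CaseInsensitiveName {
    private readonly String name;

    public CaseInsensitiveName(String name) {
        this.name = (name == null) ? String.Empty : name;
    }

    // BROKEN ON PURPOSE: this override accepts a plain String as an equal.
    public override Boolean Equals(Object obj) {
        if (obj is CaseInsensitiveName)
            return String.Equals(name, ((CaseInsensitiveName) obj).name,
                StringComparison.OrdinalIgnoreCase);
        if (obj is String)
            return String.Equals(name, (String) obj,
                StringComparison.OrdinalIgnoreCase);
        return false;
    }

    public override Int32 GetHashCode() {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(name);
    }
}

class App {
    static void Main() {
        CaseInsensitiveName x = new CaseInsensitiveName("Jeff");
        String y = "jeff";

        Console.WriteLine(x.Equals(y)); // True
        Console.WriteLine(y.Equals(x)); // False: String only equals another String
    }
}
```

Because x.Equals(y) and y.Equals(x) disagree, a collection that searches by calling Equals may or may not find the item depending on which object it calls the method on. Rejecting objects of a different type, as the patterns in this chapter do, avoids the problem.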
Unfortunately, implementing your own version of Equals isn’t as easy and straightforward as you might expect. You must do a number of operations correctly, and, depending on the type you’re defining, the operations are slightly different. Fortunately, there are only three different ways to implement Equals. Let’s look at each pattern individually.
Implementing Equals for a Reference Type Whose Base Classes Don’t Override Object’s Equals
The following code shows how to implement Equals for a type that directly inherits Object’s Equals implementation:
// This is a reference type (because of 'class').
class MyRefType : BaseType {
    RefType refobj;   // This field is a reference type.
    ValType valobj;   // This field is a value type.

    public override Boolean Equals(Object obj) {
        // Because 'this' isn't null, if obj is null,
        // then the objects can't be equal.
        if (obj == null) return false;

        // If the objects are of different types, they can't be equal.
        if (this.GetType() != obj.GetType()) return false;

        // Cast obj to this type to access fields. NOTE: This cast can't
        // fail because you know that objects are of the same type.
        MyRefType other = (MyRefType) obj;

        // To compare reference fields, do this:
        if (!Object.Equals(refobj, other.refobj)) return false;

        // To compare value fields, do this:
        if (!valobj.Equals(other.valobj)) return false;

        return true;   // Objects are equal.
    }

    // Optional overloads of the == and != operators
    public static Boolean operator==(MyRefType o1, MyRefType o2) {
        // Object's static Equals handles null on either side. Writing
        // 'o1 == null' here would call this operator again and recurse.
        return Object.Equals(o1, o2);
    }

    public static Boolean operator!=(MyRefType o1, MyRefType o2) {
        return !(o1 == o2);
    }
}
This version of Equals starts out by comparing obj against null. If the object being compared is not null, then the types of the two objects are compared. If the objects are of different types, then they can’t be equal. If both objects are the same type, then you cast obj to MyRefType, which can’t possibly throw an exception because you know that both objects are of the same type. Finally, the fields in both objects are compared, and true is returned if all fields are equal.
You must be very careful when comparing the individual fields. The preceding code shows two different ways to compare the fields based on what types of fields you’re using.
- Comparing reference type fields: To compare reference type fields, you should call Object’s static Equals method. Object’s static Equals method is just a little helper method that returns true if two reference objects are equal. Here’s how Object’s static Equals method is implemented internally:
public static Boolean Equals(Object objA, Object objB) {
    // If objA and objB refer to the same object, return true.
    if (objA == objB) return true;

    // If objA or objB is null, they can't be equal, so return false.
    if ((objA == null) || (objB == null)) return false;

    // Ask objA if objB is equal to it, and return the result.
    return objA.Equals(objB);
}
You use this method to compare reference type fields because it’s legal for them to have a value of null. Certainly, calling refobj.Equals(other.refobj) will throw a NullReferenceException if refobj is null. Object’s static Equals helper method performs the proper checks against null for you.
- Comparing value type fields: To compare value type fields, you should call the field type’s Equals method to have it compare the two fields. You shouldn’t call Object’s static Equals method because value types can never be null and calling the static Equals method would box both value type objects.
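The difference between the two ways of comparing fields can be seen in a short sketch. The variable names are illustrative only; every call shown is a standard FCL member.

```csharp
using System;

class App {
    static void Main() {
        String a = null;     // Stands in for a reference type field that is null.
        String b = "text";

        // The static helper tolerates null on either side.
        Console.WriteLine(Object.Equals(a, b));    // False
        Console.WriteLine(Object.Equals(a, null)); // True

        // The instance call dereferences 'a' and fails.
        try {
            Console.WriteLine(a.Equals(b));
        }
        catch (NullReferenceException) {
            Console.WriteLine("NullReferenceException");
        }

        // Value type fields can never be null, so calling the
        // field type's own Equals method is always safe.
        Int32 v1 = 5, v2 = 5;
        Console.WriteLine(v1.Equals(v2));          // True
    }
}
```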
Implementing Equals for a Reference Type When One or More of Its Base Classes Overrides Object’s Equals
The following code shows how to implement Equals for a type that inherits an implementation of Equals other than the one Object provides:
// This is a reference type (because of 'class').
class MyRefType : BaseType {
    RefType refobj;   // This field is a reference type.
    ValType valobj;   // This field is a value type.

    public override Boolean Equals(Object obj) {
        // Let the base type compare its fields.
        if (!base.Equals(obj)) return false;

        // All the code from here down is identical to
        // that shown in the previous version.

        // Because 'this' isn't null, if obj is null,
        // then the objects can't be equal.
        // NOTE: This line can be deleted if you trust that
        // the base type implemented Equals correctly.
        if (obj == null) return false;

        // If the objects are of different types, they can't
        // be equal.
        // NOTE: This line can be deleted if you trust that
        // the base type implemented Equals correctly.
        if (this.GetType() != obj.GetType()) return false;

        // Cast obj to this type to access fields. NOTE: This
        // cast can't fail because you know that objects are of
        // the same type.
        MyRefType other = (MyRefType) obj;

        // To compare reference fields, do this:
        if (!Object.Equals(refobj, other.refobj)) return false;

        // To compare value fields, do this:
        if (!valobj.Equals(other.valobj)) return false;

        return true;   // Objects are equal.
    }

    // Optional overloads of the == and != operators
    public static Boolean operator==(MyRefType o1, MyRefType o2) {
        // Object's static Equals handles null on either side. Writing
        // 'o1 == null' here would call this operator again and recurse.
        return Object.Equals(o1, o2);
    }

    public static Boolean operator!=(MyRefType o1, MyRefType o2) {
        return !(o1 == o2);
    }
}
This code is practically identical to the code shown in the previous section. The only difference is that this version allows its base type to compare its fields too. If the base type doesn’t think the objects are equal, then they can’t be equal.
It is very important that you do not call base.Equals if this would result in calling the Equals method provided by System.Object. The reason is that Object’s Equals method returns true only if the references point to the same object. If the references don’t point to the same object, then false will be returned and your Equals method will always return false!
Certainly, if you’re defining a type that is directly derived from Object, you should implement Equals as shown in the previous section. If you’re defining a type that isn’t directly derived from Object, you must first determine if that type (or any of its base types, except Object) provides an implementation of Equals. If any of the base types provide an implementation of Equals, then call base.Equals as shown in this section.
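As a sketch of this decision, consider a two-level hierarchy. Person and Employee are hypothetical types invented for this illustration: Person derives directly from Object and therefore does not call base.Equals, whereas Employee derives from a type that overrides Equals and therefore does.

```csharp
using System;

// Hypothetical base type: directly derived from Object.
class Person {
    private String name;
    public Person(String name) { this.name = name; }

    public override Boolean Equals(Object obj) {
        // Do NOT call base.Equals here; it would be Object's identity test.
        if (obj == null) return false;
        if (this.GetType() != obj.GetType()) return false;
        Person other = (Person) obj;
        return Object.Equals(name, other.name);
    }

    public override Int32 GetHashCode() {
        return (name == null) ? 0 : name.GetHashCode();
    }
}

// Hypothetical derived type: its base type overrides Equals.
class Employee : Person {
    private Int32 id;
    public Employee(String name, Int32 id) : base(name) { this.id = id; }

    public override Boolean Equals(Object obj) {
        // Let Person compare its fields first. Person also rejects
        // null and objects of a different type, so the cast is safe.
        if (!base.Equals(obj)) return false;
        Employee other = (Employee) obj;
        return id.Equals(other.id);
    }

    public override Int32 GetHashCode() {
        return base.GetHashCode() ^ id;
    }
}

class App {
    static void Main() {
        Employee e1 = new Employee("Ann", 1);
        Employee e2 = new Employee("Ann", 1);
        Employee e3 = new Employee("Ann", 2);
        Person p = new Person("Ann");

        Console.WriteLine(e1.Equals(e2)); // True:  same name and id
        Console.WriteLine(e1.Equals(e3)); // False: ids differ
        Console.WriteLine(e1.Equals(p));  // False: types differ
        Console.WriteLine(p.Equals(e1));  // False: types differ
    }
}
```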
Implementing Equals for a Value Type
As I mentioned in Chapter 5, all value types are derived from System.ValueType. ValueType overrides the implementation of Equals offered by System.Object. Internally, System.ValueType’s Equals method uses reflection (covered in Chapter 20) to get the type’s instance fields and compares the fields of both objects to see if they have equal values. This process is very slow, but it’s a reasonably good default implementation that all value types will inherit. However, it does mean that reference types inherit an implementation of Equals that is really identity and that value types inherit an implementation of Equals that is value equality.
For value types that don’t explicitly override Equals, the implementation provided by ValueType is inherited. The following code shows how System.ValueType’s Equals method is essentially implemented:
class ValueType {
    public override Boolean Equals(Object obj) {
        // Because 'this' isn't null, if obj is null,
        // then the objects can't be equal.
        if (obj == null) return false;

        // Get the type of 'this' object.
        Type thisType = this.GetType();

        // If 'this' and 'obj' are different types, they can't be equal.
        if (thisType != obj.GetType()) return false;

        // Get the set of public and private instance
        // fields associated with this type.
        FieldInfo[] fields = thisType.GetFields(BindingFlags.Public |
            BindingFlags.NonPublic | BindingFlags.Instance);

        // Compare each instance field for equality.
        for (Int32 i = 0; i < fields.Length; i++) {
            // Get the value of the field from both objects.
            Object thisValue = fields[i].GetValue(this);
            Object thatValue = fields[i].GetValue(obj);

            // If the values aren't equal, the objects aren't equal.
            if (!Object.Equals(thisValue, thatValue)) return false;
        }

        // All the field values are equal, and the objects are equal.
        return true;
    }
    ...
}
Even though ValueType offers a pretty good implementation for Equals that would work for most value types that you define, you should still provide your own implementation of Equals. The reason is that your implementation will perform significantly faster and will be able to avoid extra boxing operations.
The following code shows how to implement Equals for a value type:
// This is a value type (because of 'struct').
struct MyValType {
    RefType refobj;   // This field is a reference type.
    ValType valobj;   // This field is a value type.

    public override Boolean Equals(Object obj) {
        // If obj is not your type, then the objects can't be equal.
        if (!(obj is MyValType)) return false;

        // Call the type-safe overload of Equals to do the work.
        return this.Equals((MyValType) obj);
    }

    // Implement a strongly typed version of Equals.
    public Boolean Equals(MyValType obj) {
        // To compare reference fields, do this:
        if (!Object.Equals(this.refobj, obj.refobj)) return false;

        // To compare value fields, do this:
        if (!this.valobj.Equals(obj.valobj)) return false;

        return true;   // Objects are equal.
    }

    // Optionally overload operator==
    public static Boolean operator==(MyValType v1, MyValType v2) {
        return (v1.Equals(v2));
    }

    // Optionally overload operator!=
    public static Boolean operator!=(MyValType v1, MyValType v2) {
        return !(v1 == v2);
    }
}
For value types, the type should define a strongly typed version of Equals. This version takes the defining type as a parameter, giving you type safety and avoiding extra boxing operations. You should also provide strongly typed operator overloads for the == and != operators. The following code demonstrates how to test two value types for equality:
MyValType v1, v2;

// The following line calls the strongly typed version of
// Equals (no boxing occurs).
if (v1.Equals(v2)) { ... }

// The following line calls the version of
// Equals that takes an object (4 is boxed).
if (v1.Equals(4)) { ... }

// The following doesn't compile because operator==
// doesn't take a MyValType and an Int32.
if (v1 == 4) { ... }

// The following compiles, and no boxing occurs.
if (v1 == v2) { ... }
Inside the strongly typed Equals method, the code compares the fields in exactly the same way that you'd compare them for reference types. Keep in mind that the code doesn't do any casting, doesn't compare the two instances to see if they're the same type, and doesn't call the base type's Equals method. These operations aren't necessary because the method's parameter already ensures that the instances are of the same type. Also, because all value types are immediately derived from System.ValueType, you know that your base type has no fields of its own that need to be compared.
You'll notice in the Equals method that takes an Object that I used the is operator to check the type of obj. I used is instead of GetType because calling GetType on an instance of a value type requires that the instance be boxed. I demonstrated this in the "Boxing and Unboxing Value Types" section in Chapter 5.
Summary of Implementing Equals and the ==/!= Operators
In this section, I summarize how to implement equality for your own types:
- Compiler primitive types: Your compiler will provide implementations of the == and != operators for types that it considers primitives. For example, the C# compiler knows how to compare Object, Boolean, Char, Int16, UInt16, Int32, UInt32, Int64, UInt64, Single, Double, Decimal, and so on for equality. In addition, these types provide implementations of Equals, so you can call this method as well as use operators.
- Reference types: For reference types you define, override the Equals method and in the method do all the work necessary to compare object states and return. If your type doesn't inherit Object's Equals method, call the base type's Equals method. If you want to, overload the == and != operators and have them call the Equals method to do the actual work of comparing the fields.
- Value types: For your value types, define a type-safe version of Equals that does all the work necessary to compare object states and return. Implement the non-type-safe version of Equals (the one that takes an Object) by having it call the type-safe Equals internally. You also should provide overloads of the == and != operators that call the type-safe Equals method internally.
Identity
The purpose of a type's Equals method is to compare two instances of the type and return true if the instances have equivalent states or values. However, it's sometimes useful to see whether two references refer to the same, identical object. To do this, System.Object offers a static method called ReferenceEquals, which is implemented as follows:
class Object {
    public static Boolean ReferenceEquals(Object objA, Object objB) {
        return (objA == objB);
    }
}
As you can plainly see, ReferenceEquals simply uses the == operator to compare the two references. This works because of rules contained within the C# compiler. When the C# compiler sees that two references of type Object are being compared using the == operator, the compiler generates IL code that checks whether the two variables contain the same reference.
If you're writing C# code, you could use the == operator instead of calling Object's ReferenceEquals method if you prefer. However, you must be very careful. The == operator is guaranteed to check identity only if the variables on both sides of the == operator are of the System.Object type. If a variable isn't of the Object type and if that variable's type has overloaded the == operator, the C# compiler will produce code to call the overloaded operator's method instead. So, for clarity and to ensure that your code always works as expected, don't use the == operator to check for identity; instead, you should use Object's static ReferenceEquals method. Here's some code demonstrating how to use ReferenceEquals:
static void Main() {
    // Construct a reference type object.
    RefType r1 = new RefType();

    // Make another variable point to the reference object.
    RefType r2 = r1;

    // Do r1 and r2 point to the same object?
    Console.WriteLine(Object.ReferenceEquals(r1, r2)); // "True"

    // Construct another reference type object.
    r2 = new RefType();

    // Do r1 and r2 point to the same object?
    Console.WriteLine(Object.ReferenceEquals(r1, r2)); // "False"

    // Create an instance of a value type.
    Int32 x = 5;

    // Do x and x point to the same object?
    Console.WriteLine(Object.ReferenceEquals(x, x));   // "False"
    // "False" is displayed because x is boxed twice
    // into two different objects.
}
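The danger of relying on == for identity is easy to demonstrate with System.String, which overloads the == operator to compare text. The following sketch builds two distinct String objects with the same characters (constructing them from Char arrays, so that they are separate objects rather than one shared literal) and compares them three ways.

```csharp
using System;

class App {
    static void Main() {
        // Two separate String objects that contain the same text.
        String s1 = new String(new Char[] { 'h', 'i' });
        String s2 = new String(new Char[] { 'h', 'i' });

        // Both variables are of type String, so the compiler calls
        // String's overloaded operator==, which compares the text.
        Console.WriteLine(s1 == s2);                       // True

        // Both operands are of type Object, so the compiler emits
        // a reference comparison.
        Console.WriteLine((Object) s1 == (Object) s2);     // False

        // ReferenceEquals always checks identity, whatever the
        // compile-time types of the variables are.
        Console.WriteLine(Object.ReferenceEquals(s1, s2)); // False
    }
}
```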
Object Hash Codes
The designers of the FCL decided that it would be incredibly useful if any instance of any object could be placed into a hash table collection. To this end, System.Object provides a virtual GetHashCode method so that an Int32 hash code can be obtained for any and all objects.
If you define a type and override the Equals method, you should also override the GetHashCode method. In fact, Microsoft's C# compiler emits a warning if you define a type that overrides just one of these methods. For example, compiling the following type yields this warning: "warning CS0659: 'App' overrides Object.Equals(object o) but does not override Object.GetHashCode()."
class App {
    public override Boolean Equals(Object obj) { ... }
}
The reason why a type must define both Equals and GetHashCode is that the implementation of the System.Collections.Hashtable type requires that any two objects that are equal must have the same hash code value. So if you override Equals, you should override GetHashCode to ensure that the algorithm you use for calculating equality corresponds to the algorithm you use for calculating the object's hash code.
Basically, when you add a key/value pair to a Hashtable object, a hash code for the key object is obtained first. This hash code indicates what "bucket" the key/value pair should be stored in. When the Hashtable object needs to look up a key, it gets the hash code for the specified key object. This code identifies the "bucket" that is now searched looking for a stored key object that is equal to the specified key object. Using this algorithm of storing and looking up keys means that if you change a key object that is in a Hashtable, the Hashtable will no longer be able to find the object. If you intend to change a key object in a hash table, you should first remove the original object/value pair, next modify the key object, and then add the new key object/value pair back into the hash table.
Defining a GetHashCode method can be easy and straightforward. But, depending on your data types and the distribution of data, it can be tricky to come up with a hashing algorithm that returns a well-distributed range of values. Here's a simple example that will probably work just fine for Point objects:
class Point {
    Int32 x, y;

    public override Int32 GetHashCode() {
        return x ^ y;   // x XOR'd with y
    }
    ...
}
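The warning about changing a key that is already stored in a Hashtable can be reproduced with a short sketch. The Point class here is a hypothetical, deliberately mutable variant of the one above, with public fields, a constructor, and an Equals override added for the demonstration.

```csharp
using System;
using System.Collections;

// Hypothetical mutable key type, for illustration only.
class Point {
    public Int32 x, y;
    public Point(Int32 x, Int32 y) { this.x = x; this.y = y; }

    public override Boolean Equals(Object obj) {
        if (obj == null || this.GetType() != obj.GetType()) return false;
        Point other = (Point) obj;
        return x == other.x && y == other.y;
    }

    public override Int32 GetHashCode() {
        return x ^ y;
    }
}

class App {
    static void Main() {
        // Changing a key in place strands its entry.
        Hashtable table = new Hashtable();
        Point key = new Point(1, 2);
        table[key] = "first";
        Console.WriteLine(table.ContainsKey(key));             // True

        key.x = 100;   // The hash code changes from 3 to 102.
        Console.WriteLine(table.ContainsKey(key));             // False
        Console.WriteLine(table.ContainsKey(new Point(1, 2))); // False
        Console.WriteLine(table.Count);                        // 1

        // The safe sequence: remove, modify, add back.
        Hashtable table2 = new Hashtable();
        Point key2 = new Point(1, 2);
        table2[key2] = "first";

        Object value = table2[key2];
        table2.Remove(key2);
        key2.x = 100;
        table2[key2] = value;
        Console.WriteLine(table2.ContainsKey(key2));           // True
    }
}
```

In the first table, the entry still exists (Count is 1), but no lookup can reach it: the modified key hashes to a different value, and a fresh Point(1, 2) no longer equals the stored key.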
When selecting an algorithm for calculating hash codes for instances of your type, try to follow these guidelines:
- Use an algorithm that gives a good random distribution for the best performance of the hash table.
- Your algorithm can also call the base type's GetHashCode method, including its return value in your own algorithm. However, you don't generally want to call Object's or ValueType's GetHashCode method because the implementation in either method doesn't lend itself to high-performance hashing algorithms.
- Your algorithm should use at least one instance field.
- Ideally, the fields you use in your algorithm should be immutable; that is, the fields should be initialized when the object is constructed and they should never again change during the object's lifetime.
- Your algorithm should execute as quickly as possible.
- Objects with the same value should return the same code. For example, two String objects with the same text should return the same hash code value.
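Several of these guidelines can be followed at once, as the sketch below tries to show. The Money type is hypothetical, and the multiply-and-add scheme (with the constants 17 and 31) is a commonly used way of combining field hash codes rather than anything the FCL prescribes. Unlike a plain XOR, it does not give the same result when two field values are swapped.

```csharp
using System;

// Hypothetical immutable value type, for illustration only.
struct Money {
    private readonly Int64 cents;
    private readonly String currency;

    public Money(Int64 cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    public override Boolean Equals(Object obj) {
        if (!(obj is Money)) return false;
        return this.Equals((Money) obj);
    }

    public Boolean Equals(Money other) {
        return cents == other.cents &&
            Object.Equals(currency, other.currency);
    }

    // Uses exactly the fields that Equals compares, so that
    // equal objects always produce the same hash code.
    public override Int32 GetHashCode() {
        unchecked {   // Let the arithmetic overflow silently.
            Int32 hash = 17;
            hash = hash * 31 + cents.GetHashCode();
            hash = hash * 31 + (currency == null ? 0 : currency.GetHashCode());
            return hash;
        }
    }
}

class App {
    static void Main() {
        Money a = new Money(1999, "USD");
        Money b = new Money(1999, "USD");

        Console.WriteLine(a.Equals(b));                        // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True
    }
}
```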
System.Object's implementation of the GetHashCode method doesn't know anything about its derived type and any fields that are in the type. For this reason, Object's GetHashCode method returns a number that is guaranteed to uniquely identify the object within the AppDomain; this number is guaranteed not to change for the lifetime of the object. After the object is garbage collected, however, its unique number can be reused as the hash code for a new object.
System.ValueType's implementation of GetHashCode uses reflection and returns the hash code of the first non-null instance field defined in the type. This is a naive implementation that might be good for some value types, but I still recommend that you implement GetHashCode yourself. Even if your hash code algorithm returns the hash code for the first instance field, your implementation will be faster than ValueType's implementation. Here's what ValueType's implementation of GetHashCode looks like:
class ValueType {
    public override Int32 GetHashCode() {
        // Get this type's public/private instance fields.
        FieldInfo[] fields = this.GetType().GetFields(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);

        if (fields.Length > 0) {
            // Return the hash code for the first non-null field.
            for (Int32 i = 0; i < fields.Length; i++) {
                Object obj = fields[i].GetValue(this);
                if (obj != null) return obj.GetHashCode();
            }
        }

        // No non-null fields exist; return a unique value for the type.
        // NOTE: GetMethodTablePtrAsInt is an internal, undocumented method
        return GetMethodTablePtrAsInt(this);
    }
}
If you're implementing your own hash table collection for some reason or you're implementing any piece of code where you'll be calling GetHashCode, you should never persist hash code values. The reason is that hash code values are subject to change. For example, a future version of a type might use a different algorithm for calculating the object's hash code.
Object Cloning
At times, you want to take an existing object and make a copy of it. For example, you might want to make a copy of an Int32, a String, an ArrayList, a Delegate, or some other object. For some types, however, cloning an object instance doesn't make sense. For example, it doesn't make sense to clone a System.Threading.Thread object since creating another Thread object and copying its fields doesn't create a new thread. Also, for some types, when an instance is constructed, the object is added to a linked list or some other data structure. Simple object cloning would corrupt the semantics of the type.
A class must decide whether or not it allows instances of itself to be cloned. If a class wants instances of itself to be cloneable, the class should implement the ICloneable interface, which is defined as follows. (I'll talk about interfaces in depth in Chapter 15.)
public interface ICloneable {
    Object Clone();
}
This interface defines just one method, Clone. Your implementation of Clone is supposed to construct a new instance of the type and initialize the new object's state so that it is identical to the original object. The ICloneable interface doesn't explicitly state whether Clone should make a shallow copy of its fields or a deep copy. So you must decide for yourself what makes the most sense for your type and then clearly document what your Clone implementation does.
Many developers implement Clone so that it makes a shallow copy. If you want a shallow copy made for your type, implement your type's Clone method by calling System.Object's protected MemberwiseClone method, as demonstrated here:
class MyType : ICloneable {
    public Object Clone() {
        return MemberwiseClone();
    }
}
Internally, Object's MemberwiseClone method allocates memory for a new object. The new object's type matches the type of the object referred to by the this reference. MemberwiseClone then iterates through all the instance fields for the type (and its base types) and copies the bits from the original object to the new object. Note that no constructor is called for the new object; its state will simply match that of the original object.
Alternatively, you can implement the Clone method entirely yourself, and you don't have to call Object's MemberwiseClone method. Here's an example:
using System;
using System.Collections;

class MyType : ICloneable {
    ArrayList set;

    // Private constructor called by Clone
    private MyType(ArrayList set) {
        // Refer to a copy of the set passed rather than to the same
        // ArrayList. Clone returns Object, so a cast is required.
        this.set = (ArrayList) set.Clone();
    }

    public Object Clone() {
        // Construct a new MyType object, passing it the
        // set used by the original object.
        return new MyType(set);
    }
}
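The practical difference between the two approaches shows up as soon as a clone is modified. In the following sketch, ShallowBag and DeepBag are hypothetical types invented for the comparison: the first clones itself with MemberwiseClone, and the second copies its ArrayList explicitly.

```csharp
using System;
using System.Collections;

// Hypothetical type whose Clone makes a shallow copy.
class ShallowBag : ICloneable {
    public ArrayList items = new ArrayList();

    public Object Clone() {
        // Copies the 'items' reference, not the list it refers to.
        return MemberwiseClone();
    }
}

// Hypothetical type whose Clone also copies the list.
class DeepBag : ICloneable {
    public ArrayList items = new ArrayList();

    public Object Clone() {
        DeepBag copy = new DeepBag();
        // ArrayList's Clone copies the list itself; the elements
        // are still shared, which is harmless for immutable Strings.
        copy.items = (ArrayList) items.Clone();
        return copy;
    }
}

class App {
    static void Main() {
        ShallowBag s1 = new ShallowBag();
        s1.items.Add("a");
        ShallowBag s2 = (ShallowBag) s1.Clone();
        s2.items.Add("b");
        Console.WriteLine(s1.items.Count); // 2: both bags share one list

        DeepBag d1 = new DeepBag();
        d1.items.Add("a");
        DeepBag d2 = (DeepBag) d1.Clone();
        d2.items.Add("b");
        Console.WriteLine(d1.items.Count); // 1: the original is unaffected
    }
}
```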
You might have realized that the discussion in this section has been geared toward reference types. I concentrated on reference types because instances of value types always support making shallow copies of themselves. After all, the system has to be able to copy a value type's bytes when boxing it. The following code demonstrates the cloning of value types:
static void Main() {
    Int32 x = 5;
    Int32 y = x;      // Copy the bytes from x to y.
    Object o = x;     // Boxing x copies the bytes from x to the heap.
    y = (Int32) o;    // Unbox o, and copy bytes from the heap to y.
}
Of course, if you're defining a value type and you'd like your type to support deep cloning, then you should have the value type implement the ICloneable interface as shown earlier. (Don't call MemberwiseClone, but rather, allocate a new object and implement your deep copy semantics.)
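A sketch of what this might look like for a value type that holds a reference to an array follows. The Samples type is hypothetical, and the code assumes its array field is never null.

```csharp
using System;

// Hypothetical value type that supports deep cloning.
struct Samples : ICloneable {
    public Int32[] cells;

    public Samples(Int32[] cells) {
        this.cells = cells;
    }

    public Object Clone() {
        // Allocate a new array rather than copying the reference.
        // Array's Clone returns Object, so a cast is required.
        return new Samples((Int32[]) cells.Clone());
    }
}

class App {
    static void Main() {
        Samples a = new Samples(new Int32[] { 1, 2, 3 });

        Samples b = a;                   // Shallow: b.cells is the same array.
        Samples c = (Samples) a.Clone(); // Deep: c.cells is a new array.

        a.cells[0] = 99;

        Console.WriteLine(b.cells[0]);   // 99
        Console.WriteLine(c.cells[0]);   // 1
    }
}
```

Note that Clone returns Object, so the new Samples value is boxed on return and unboxed again by the cast in the caller.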
About the Author
Jeffrey Richter is a co-founder of Wintellect (www.Wintellect.com), a training, debugging, and consulting firm dedicated to helping companies build better software, faster. He is the author of "Applied Microsoft .NET Framework Programming" (Microsoft Press) and several Windows programming books. Jeffrey is also a contributing editor to MSDN Magazine where he authors the .NET column. Jeffrey has been consulting with Microsoft's .NET Framework team since October 1999.