Brent Rector is with Wintellect.
When you learn a new programming language and framework, such as .NET, you try to map what you already know onto the new concepts you are learning. Generally, this helps you come up to speed in the new environment more quickly. Occasionally, however, concepts that seem similar are not, and you can end up creating applications that don’t operate as you intended.
C++ destructors and .NET finalizers initially seem to be similar, if not identical, concepts. A developer who creates a finalizer, expecting it to work like a C++ destructor, will be unpleasantly surprised. A C++ destructor is a specially named method (~className) that the C++ runtime executes immediately when you tell it to destroy an instance of the class. A C# finalizer is also a specially named method. In fact, you define it using the ~className syntax just like a C++ destructor (which I personally think is unfortunate). However, any similarity between the two concepts stops at this point.
When the .NET garbage collector (GC) decides it can collect an object, the GC checks whether the object has a Finalize method that it needs to call. If it does, the GC does not collect the object, but instead schedules a background thread to call the Finalize method at some indefinite time in the future. This difference in semantics can have huge effects on your object’s behavior.
It is the C++ programmer’s responsibility to control the lifetime of an object instance and tell the C++ runtime when to call the destructor method for an object and reclaim the memory for the object. Note that this model makes a few assumptions.
C++ Destructor, .NET Finalizer, and Lifetime Management Semantics
First, the burden is on the client of a C++ object to control the lifetime of both the object’s resources and the object’s memory. When you heap allocate a C++ object, you must call delete at the appropriate time to invoke the destructor (which typically releases the object’s resources) and free the memory allocated for the object. Alternatively, you must stack allocate the object to tell the runtime to call the destructor and (implicitly) reclaim the memory when the stack frame goes out of scope. Of course, some stack frames don’t go out of scope for hours, even days or weeks, so the burden is still on the client to decide whether this is the appropriate lifetime management of the object. When the client gets the lifetime management wrong, the object never gets its destructor called, and the memory leaks.
Second, placing resource clean-up code in a destructor assumes that resource lifetime and object lifetime end at the same point in time. Some designs reuse objects and you may well want to free resources of an object far sooner than releasing the object itself. This is easy to do, actually. You simply add a Close or Dispose method to the object and have the client call the method as soon as the resources should be released. In C++ terms, this means the client must determine the appropriate time to instruct the object to release its resources and the client must determine the appropriate time to tell the object to self-destruct (end of object lifetime).
The client of a .NET object must also determine the appropriate resource lifetime and call the Dispose method to free those resources. However, the client of a .NET object never controls the lifetime of the object. At some unknown point in time, after you can no longer use an object, the GC will attempt to collect the object. At some unknown time after that, the .NET runtime will call your Finalize method. At some unknown time after that, the .NET runtime will reclaim the memory for the object.
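The timing described above can be observed with a minimal sketch. The Tracked class and FinalizerTimingDemo are hypothetical names invented for illustration, and forcing a collection with GC.Collect and GC.WaitForPendingFinalizers is done only to make the demonstration finish; real code should not rely on it. Exact behavior can vary with the runtime and build configuration.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical class for illustration: counts how many finalizers have run.
class Tracked
{
    public static int FinalizedCount;

    ~Tracked()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

static class FinalizerTimingDemo
{
    // Allocates in a separate, non-inlined method so that no reference to the
    // object lingers in the caller once this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void AllocateAndAbandon()
    {
        new Tracked();
    }

    // Returns the finalizer count observed before and after a forced collection.
    public static (int Before, int After) Run()
    {
        AllocateAndAbandon();

        // The object is already unreachable here, but dropping the last
        // reference does not by itself run the finalizer.
        int before = Volatile.Read(ref Tracked.FinalizedCount);

        GC.Collect();                  // demonstration only
        GC.WaitForPendingFinalizers(); // wait for the finalizer thread to drain its queue

        int after = Volatile.Read(ref Tracked.FinalizedCount);
        return (before, after);
    }

    static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"finalized before GC: {before}, after GC: {after}");
    }
}
```

Without the forced collection, nothing in this program tells you when, or even whether, the count would change before the process exits.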
The obvious advantage to this design is that you can never forget to free unreferenced objects. Additionally, you can never mistakenly reference a previously freed object. This eliminates most common memory management bugs. However, this model forces you to separate the lifetime management of resources encapsulated by an object and the lifetime of the object itself.
Therefore, in the .NET world, by design, we place resource clean-up code in a Dispose method and require the client to call it at the appropriate time. The GC itself automatically handles the object lifetime and memory reclamation.
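As a rough sketch of that division of labor, the following uses a hypothetical LogFile class whose "resource" is only a flag. The class and method names are assumptions made for illustration. The C# using statement calls Dispose when the block exits, which is one way for the client to release the resource at a known point.

```csharp
using System;

// Hypothetical resource wrapper for illustration; the "resource" is just a flag.
sealed class LogFile : IDisposable
{
    public bool IsOpen { get; private set; } = true;

    public void Write(string message)
    {
        if (!IsOpen) throw new ObjectDisposedException(nameof(LogFile));
        Console.WriteLine(message);
    }

    // Releases the resource at a time the client chooses. Safe to call twice.
    public void Dispose()
    {
        IsOpen = false;
    }
}

static class DisposeDemo
{
    // Returns whether the resource is still open after the using block ends.
    public static bool OpenAfterUsing()
    {
        LogFile log = new LogFile();
        using (log)
        {
            log.Write("working");
        } // Dispose runs here, even if Write throws

        // The object itself still exists; only its resource has been released.
        return log.IsOpen;
    }

    static void Main()
    {
        Console.WriteLine(OpenAfterUsing());
    }
}
```

Note that the object outlives its resource: the client ended the resource lifetime, and the GC will end the object lifetime whenever it gets around to it.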
Performance Considerations
class Foo { ~Foo () { . . . } }
The ~Foo method declaration in the above C# code is identical to a destructor syntactically, but in C# it represents a finalizer. In fact, the C# compiler actually generates the following code:
class Foo { protected override void Finalize () { try { . . . } finally { base.Finalize(); } } }
However, adding a finalizer to a class, even one that does nothing, makes creating instances of the class slower. In addition, the mere presence of such objects in the heap makes the GC run more slowly.
When creating each instance of a class with a finalizer, the GC must record that it needs to call the finalizer for the instance before it collects the object. This process causes allocations of such objects to take longer. (In absolute terms, this isn’t a big performance loss but in comparison to the extremely fast allocation of non-finalizable objects, it is considerably slower.)
Each time the garbage collector runs, for each collectable object in the heap, the GC has to search this finalization object list to see whether the collectable object is in the list. The longer this list, the more time the GC spends searching the list and the slower the GC runs.
Finally, when the GC finds a collectable object on the finalization list, it cannot collect it right away. Instead, it must keep the object alive and schedule a background thread to call the Finalize method at some time in the future. This operation can have pervasive side-effects.
For example, when the collectable object itself has references to other objects, keeping the finalizable object alive means all the objects to which it holds references are now also kept alive, plus their dependencies, and so on. Additionally, because a finalizable object survives its first garbage collection, the GC promotes it from generation zero to generation one. The GC collects objects in higher generations less frequently than it does objects in lower generations, so the net effect is that the object and everything it references stay in memory much longer than they would have if the initial object had no finalizer at all, not even an empty one.
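One way to watch an object survive its first collection is with a weak reference that tracks resurrection. This is a sketch under assumptions: Plain, WithFinalizer, and SurvivalDemo are hypothetical names, and the result depends on the runtime (I would expect it on the standard desktop and .NET Core runtimes, but a runtime that scans the stack conservatively could behave differently).

```csharp
using System;
using System.Runtime.CompilerServices;

class Plain { }

class WithFinalizer
{
    ~WithFinalizer() { } // empty, but the object is still registered for finalization
}

static class SurvivalDemo
{
    // trackResurrection: true creates a "long" weak reference, which keeps
    // tracking the object after it has been queued for finalization.
    // The helpers are non-inlined so no strong reference lingers in Run.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static WeakReference MakePlain() =>
        new WeakReference(new Plain(), trackResurrection: true);

    [MethodImpl(MethodImplOptions.NoInlining)]
    static WeakReference MakeFinalizable() =>
        new WeakReference(new WithFinalizer(), trackResurrection: true);

    public static (bool PlainAlive, bool FinalizableAlive, bool FinalizableAliveLater) Run()
    {
        WeakReference plain = MakePlain();
        WeakReference finalizable = MakeFinalizable();

        GC.Collect(); // first collection (demonstration only)
        bool plainAlive = plain.IsAlive;             // memory reclaimed right away
        bool finalizableAlive = finalizable.IsAlive; // kept alive for its finalizer

        GC.WaitForPendingFinalizers(); // let the finalizer thread run
        GC.Collect();                  // second collection reclaims the memory
        bool finalizableAliveLater = finalizable.IsAlive;

        return (plainAlive, finalizableAlive, finalizableAliveLater);
    }

    static void Main()
    {
        var result = Run();
        Console.WriteLine($"plain alive after 1st GC: {result.PlainAlive}");
        Console.WriteLine($"finalizable alive after 1st GC: {result.FinalizableAlive}");
        Console.WriteLine($"finalizable alive after 2nd GC: {result.FinalizableAliveLater}");
    }
}
```

The two classes differ only by an empty finalizer, yet the finalizable instance needs a second collection before its memory can go away.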
Usage Considerations
Let’s now examine what code in a Finalize method realistically can do. First, you should put as little code as possible in the method and make it execute as quickly as possible. There is a single background thread that sequentially calls all Finalize methods. When your Finalize method runs slowly, it causes a delay before subsequent objects get their Finalize method called.
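To see which thread actually runs the finalizers, a small sketch can record the managed thread ID inside each one. Probe and FinalizerThreadDemo are hypothetical names, and the forced collection is again for demonstration only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;

// Hypothetical class for illustration: its finalizer only records the thread it ran on.
class Probe
{
    public static readonly ConcurrentQueue<int> FinalizerThreadIds =
        new ConcurrentQueue<int>();

    ~Probe()
    {
        // Short and non-blocking, so it does not hold up other finalizers.
        FinalizerThreadIds.Enqueue(Environment.CurrentManagedThreadId);
    }
}

static class FinalizerThreadDemo
{
    // Non-inlined so the abandoned objects are unreachable once it returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void AllocateAndAbandon(int count)
    {
        for (int i = 0; i < count; i++) new Probe();
    }

    // Returns the calling thread's ID and the IDs recorded by the finalizers.
    public static (int CallerId, int[] FinalizerIds) Run()
    {
        AllocateAndAbandon(3);
        GC.Collect();                  // demonstration only
        GC.WaitForPendingFinalizers(); // wait until the queued finalizers have run
        return (Environment.CurrentManagedThreadId, Probe.FinalizerThreadIds.ToArray());
    }

    static void Main()
    {
        var (callerId, finalizerIds) = Run();
        Console.WriteLine($"caller thread: {callerId}");
        Console.WriteLine($"finalizer threads: {string.Join(", ", finalizerIds)}");
    }
}
```

If one of these finalizers slept or waited on a lock, every finalizer queued behind it would wait too.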
Second, don’t block in a finalizer. Technically, this is the same point as the first one because blocking can be considered making the method run extremely slowly.
Third, you cannot, in general, use any member reference variables of an object in its Finalize method. It is impossible to guarantee the finalization order of objects. Therefore, when the GC background thread calls an object’s Finalize method, other objects referenced by its member variables may have already had their Finalize methods called. This means that these referenced objects may no longer be functional.
Conclusion
So what’s the purpose of a Finalize method? That’s a good question. My opinion is that Java had one, so .NET needed one as well. Therefore, marketing comparisons could show that .NET matches Java feature by feature as much as possible. However, I think it wasn’t a terribly useful feature in Java and it’s not a terribly useful feature in .NET.
The only code you can usefully put in a Finalize method releases non-memory resources. However, it would be far better to have the client release such resources as soon as possible via a Dispose method. So, in effect, you are creating a Finalize method solely to free resources in the case the client screws up and forgets to free them.
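A minimal sketch of such a safety net follows, assuming a hypothetical NativeBuffer class that wraps a block of unmanaged memory as a stand-in for any resource the GC knows nothing about. GC.SuppressFinalize is a runtime API this article does not discuss; it appears here as my assumption about how you would tell the GC that the finalizer is no longer needed once the client has called Dispose.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper for illustration around a resource the GC does not manage.
class NativeBuffer : IDisposable
{
    private IntPtr _memory;

    public NativeBuffer(int bytes)
    {
        _memory = Marshal.AllocHGlobal(bytes);
    }

    public bool IsReleased => _memory == IntPtr.Zero;

    // Preferred path: the client releases the resource promptly.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this); // assumption: skip finalization for this instance
    }

    // Fallback path: runs only if the client never called Dispose.
    ~NativeBuffer()
    {
        Release();
    }

    // Touches only this object's own IntPtr field, never other managed
    // objects, because those may already have been finalized.
    private void Release()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}

static class SafetyNetDemo
{
    static void Main()
    {
        using (var buffer = new NativeBuffer(64))
        {
            Console.WriteLine($"released inside using: {buffer.IsReleased}");
        }
    }
}
```

Even with this arrangement, every instance still pays the allocation-time cost of being finalizable, which is the trade-off weighed in the rest of this section.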
In the best-case scenario, you are done using an object (its lifetime ends) at the same time as you are done using its non-memory resources (its resources’ lifetimes end), and you’ve willingly incurred all the performance hits of implementing the Finalize method. Immediately after the point at which you could have called the object’s Dispose method, the garbage collector runs and attempts to collect the object. It schedules the background finalizer thread, which happens to be idle, and the thread immediately calls your Finalize method. The method releases the non-memory resources. The memory for the object still stays around until the next GC run, at which time the GC reclaims the memory for the object and all objects it references.
In a worst-case scenario, your object’s Finalize method won’t run until your process terminates gracefully. (Actually, the real worst case is that it never runs at all because the process terminates abnormally.) Shortly thereafter, the non-memory resources held by the process will be released anyway as the process goes away, so the Finalize method served no real purpose other than to keep the application from running as fast as it otherwise could.
Just say No to Finalize methods!
About the Author
Brent Rector has designed and written operating systems, compilers, and many, many applications. He began developing Windows applications using a beta version of Windows 1.0. Brent is the founder of Wise Owl Consulting, Inc. (http://www.wiseowl.com), and the architect and primary developer of Demeanor for .NET, Enterprise Edition—the premier .NET code obfuscator. In addition, Brent is an instructor for Wintellect (http://www.wintellect.com). He has also written several books on Windows programming, including ATL Internals, Win32 Programming, and others.