Brent Rector is with Wintellect.
Many times, I’ve heard developers refer to the upcoming generics feature of .NET 2.0 (also known as Whidbey) as “templates for .NET.” While often a useful shorthand expression, in many ways, it’s a very misleading expression. I’d like to discuss a number of the ways in which .NET generics differ from C++ templates.
Probably the most noticeable, if not significant, difference is that specialization of a .NET generic class or method occurs at runtime whereas specialization occurs at compile time for a C++ template. Let’s look at C++ templates first.
For example, let’s assume the compiler encounters the following specialization of the List<T> template class for the first time in a compilation unit:
List<int> a;
The compiler processes the source code for the List<T> template class, replacing the type parameter placeholder with the specialization type of int and, in effect, creates a new data type called List<int> within the assembly. Basically, in C++, the compiler translates the source code for a template class into the appropriate binary specialization of that class within the output object module.
Here’s a subtle point, but it’s a direct result of how types are identified in .NET. Let’s say that C++ code in assembly A has used a List<int> template specialization and that C++ code in assembly B expects to use that exact same specialized type. Code in assembly A creates an instance of the List<int> type and passes a reference to that instance to code in assembly B that wishes to use its List<int> type. Boom! At the very least, you will get a runtime type mismatch exception. Depending on how you pass the reference, you might even get the error at compile time. The error occurs because the List<int> type in assembly A is not the same type as List<int> in assembly B.
In .NET, all types are qualified by the assembly in which they are defined. (You generally don’t notice this when programming using C# or VB.NET, but it’s very explicit if you ever look at the IL version of your code.) Therefore, type Foo in assembly A is not the same as type Foo in assembly B even if Foo is defined in exactly the same way in both assemblies. So, [A]List<int> is a different type from [B]List<int>.
Now, let’s compare the above behavior to how .NET generics behave. In C++, a template exists as source code and specialization of a template type causes the compiler to create the specialized type from the template. With .NET generics, a compiler expects to read metadata describing the generic type. In other words, you must first compile the source code for a generic type to an assembly. Subsequent specializations of the generic type refer to the metadata definition of the generic type. Let’s look at the above example in the context of generics.
First, to use the generic List type, you must compile it into an assembly. Let’s call that assembly C (for Collection). Now, to compile the source code for assembly A, you will reference assembly C so that the compiler is aware of the definition of the generic List type that the code for A uses.
At runtime, a method that references a generic type with a set of type arguments causes the runtime to create a specialization of the generic type based on the generic definition’s metadata. Therefore, when the compiler encounters the List<int> specialization in the source code for assembly A, it emits IL and metadata in assembly A that says, in effect, to the .NET runtime “Go to assembly C, find the definition of the generic type called List<T>, and, based on its definition, create a new specialization of that class called List<int>.” So, when assembly A creates an instance of List<int>, it is creating a specialization of [C]List<T>, not [A]List<T> as templates did.
Similarly, when the compiler encounters a parameter of type List<int> when creating assembly B, it generates code that says, in effect, “Expect to receive an instance of the List<int> specialization of the [C]List<T> class.” Another way of saying all this is that the definition of a generic type is unique across the application and the single definition can be used by all assemblies. In contrast, a C++ template type definition is local to the assembly into which you compile it.
As a minor point, in the bad ol’ days of C++ templates, for each source file in which you used List<int>, the compiler produced a definition of the specialized class in the resulting object file. This often resulted in many duplicate copies of the same template class definition—one per object module. However, modern compilers/linkers typically remove all these duplicate classes so it’s not really an issue anymore for C++.
Again, .NET works a little differently here. As the .NET runtime encounters a request to use a specialization of a generic class, it can determine if the specialization, for example, [C]List<int>, has previously been created and, if so, simply reuse the existing class rather than generate a new, duplicate, class. But, the initial request for a different specialization, for example, List<DateTime>, may produce another type definition. In fact, in Whidbey, the runtime creates a new specialization of a generic type or method whenever you specify a different value type for a substitutable type parameter.
However, the .NET runtime optimizes the case where the replaceable type parameters are reference types. Generally speaking, the .NET runtime can produce a single specialization of a generic type that can be reused for all specializations where the replaceable types are reference types. For example, when the runtime produces the specialization of List<System.String>, the resulting specialized type properly handles all possible reference types. Therefore, a subsequent specialization request for List<Employee>, where Employee is a reference type, can reuse the existing code for List<System.String>. After all, in both cases the code is simply manipulating a reference and all references are conceptually used the same way and occupy the same amount of space—unlike the case for value types. Using one shared type for all reference type specializations of a particular generic class or method reduces code bloat and working set.
Another subtle point also related to the compile time/runtime specialization of templates vs. generics relates to what you can actually do in the template code. For example, let’s say you have the following method in a template/generic class:
    static T Min<T>(T arg1, T arg2) {
        if (arg1.CompareTo(arg2) < 0) {
            return arg1;
        }
        return arg2;
    }
In a C++ template, this code is processed during template specialization. If the type specified for T during specialization contains a CompareTo method that accepts a single argument of the same type, the compiler knows how to generate the code to call the method. If the type specified for T does not contain such a method, you get a compile-time error. When compiling a generic type definition such as the above, however, the compiler has a problem.
A .NET generic type definition is compiled into an assembly typically long before any specializations are known. In addition, a .NET generic type definition should be type safe (assuming no use of type unsafe features). There is no way, as the example currently stands, for the compiler to generate type safe code to call the CompareTo method on any arbitrary, yet to be specified, class. This ultimately requires the generic constraint feature.
A constraint is a restriction on the set of possible types that may be specified for a replaceable generic parameter. For example, I can rewrite the above example as follows:
    static T Min<T>(T arg1, T arg2) where T : IComparable {
        if (arg1.CompareTo(arg2) < 0) {
            return arg1;
        }
        return arg2;
    }
The where clause is a constraint on the possible types that may be used for parameter T during specialization of the generic method. In this example, I’ve informed the compiler that whatever type is used for T during a subsequent specialization, that type must implement the IComparable interface. This implies that any type that implements the interface must provide an implementation of the IComparable::CompareTo virtual method.
Here’s what happens. The compiler sees a call to the CompareTo method on parameter arg1. The parameter arg1 is of type T and type T is known to be a type that implements IComparable. Therefore, the type of the object to which arg1 refers will have an instance method called CompareTo reachable via the type’s IComparable implementation. So, the compiler generates a call to the interface method. With the constraint, the compiler has enough information to compile the generic method definition into type safe code prior to encountering any specializations of the method. Of course, subsequently, if you try to specialize this method using a type that does not implement IComparable, you will receive a compiler error because the type used during the specialization doesn’t satisfy the constraint.
I need to point out that there are subtleties lurking in this approach. As mentioned above, when using generics, this code calls the IComparable::CompareTo method body. However, in .NET, one can explicitly implement an interface method on a class. In effect, this makes the method “invisible” on the class itself and only reachable via the interface. Therefore, you can additionally implement a method with the same name and signature on the class itself. Here’s an example:
    class Foo : IComparable {
        int IComparable.CompareTo (System.Object o) { ... }
        public int CompareTo (System.Object o) { ... }
    }
    ...
    Foo f1 = ...
    Foo f2 = ...
    Foo lesser = Min<Foo> (f1, f2);
The template equivalent of the above would call the public CompareTo method on the class, not the private explicit implementation of the interface’s method.
There are other differences between C++ templates and .NET generics as well.
- Templates support partial specialization; generics don’t.
- Templates allow you to write explicit specializations of a type (typically for more efficiency); generics permit you to provide a single generic definition that will be used at runtime for all specializations.
- Templates allow non-type parameters, such as integer constants, in addition to type parameters; generics require every replaceable parameter to be replaced by a type name.
- Templates are a C++ only language feature; generics can be supported by any .NET language that wishes to do so.
- A constraint specification is limited to a base class or a list of interfaces. This fundamentally means that code in a generic method body can primarily call interface methods on a parameterized type. Unfortunately, this also means that you cannot call static methods of the type argument’s class, which eliminates many potentially useful methods such as its static conversion operators or static operators.
I’m sure the C++ template fans will continue to advocate template use. I expect VB.NET and C# developers to use generics heavily. Heck, if you’re using a collection class containing value types in the Whidbey release, it makes no sense not to use the generic collection classes. However, with the new language features in the Whidbey version of C++ with Managed Extensions you can have your cake and eat it too (use templates and generics in the same code).
In fact, I may have to retreat from my stance of preferring C# and disliking MC++. With Whidbey, MC++ is a whole new language and nearly all of my current objections to it have disappeared. But, that’s yet another article.
About the Author
Brent Rector has designed and written operating systems, compilers, and many, many applications. He began developing Windows applications using a beta version of Windows 1.0. Brent is the founder of Wise Owl Consulting, Inc. (http://www.wiseowl.com), and the architect and primary developer of Demeanor for .NET, Enterprise Edition—the premier .NET code obfuscator. In addition, Brent is an instructor for Wintellect (http://www.wintellect.com). He has also written several books on Windows programming, including ATL Internals, Win32 Programming, and others.