
Choosing to Generate

  • By Mike Gunderloy

There's been a fair amount of buzz lately about code generation: writing code to write code. To some extent, developers use code generators all the time. When you use a compiler, for example, you're generating code in a lower-level language. But that's not the sort of code generation I'm discussing in this article. The current crop of code generators uses a variety of techniques, and often their own high-level languages, to generate high-level source code.

Like most software tools and techniques, code generation can be overdone. Though it's extremely useful in the right circumstances, you need to be sensible about identifying those circumstances. In this article, I'll try to offer some guidance.

A Generator in Action

To get a better feel for this modern world of code generation, let's take a look at one particular generator: CodeSmith. CodeSmith has its own scripting language that's very similar to ASP.NET. You use this scripting language to define templates. For example, here's a small piece of a CodeSmith template for a HashTable class in C#:

<% if (TargetNamespace != null && TargetNamespace.Length > 0) { %>
namespace <%= TargetNamespace %> {
<% } %>
    #region Class <%= ClassName %>

    /// <summary>
    /// Implements a strongly typed collection of <see cref="<%= PairType %>"/>
    /// key-and-value pairs that are organized based on the hash code of the key.
    /// </summary>
    /// <remarks>
    /// <b><%= ClassName %></b> provides a <see cref="Hashtable"/> that is strongly typed
    /// for <see cref="<%= KeyType %>"/> keys and <see cref="<%= ItemType %>"/> values.
    /// </remarks>

    <%= GetAccessModifier(Accessibility) %> class <%= ClassName %>:

This is not the place to go into the details of the CodeSmith template language (there's a fine tutorial on the CodeSmith Web site), but you should be able to get the general idea. The template provides a picture of the code to be generated, and itself contains some scripting logic and some replaceable parameters. Figure 1 shows this template in use in the CodeSmith interface. To use CodeSmith, you fill in values on the property sheet and click the Generate button. The result is code that can be pasted directly into your application. CodeSmith also includes a Visual Studio .NET add-in, which can automatically generate code whenever you rebuild a VS .NET project.

Figure 1. The HashTable template in use in the CodeSmith interface.

This little example demonstrates the essential features of most modern code generators: they are tools that take some sort of abstract input, and use that input to generate source code that can then be fed into a compiler or other build process. Code generators can be freeware or commercial products, or developed in-house for specific projects.
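
As a rough illustration of that pipeline (abstract input in, source text out), here is a minimal sketch in Python rather than in CodeSmith's own template language. The template text, the render_class function, and its parameter names are all invented for this example; only the ideas of replaceable parameters and an optional namespace wrapper are borrowed from the CodeSmith fragment above.

```python
from string import Template

# Hypothetical template, loosely modelled on the CodeSmith fragment above.
# The $-prefixed names are the replaceable parameters.
CLASS_TEMPLATE = Template('''\
/// <summary>
/// Strongly typed collection of $key_type keys and $item_type values.
/// </summary>
$access class $class_name
{
}
''')


def render_class(class_name, key_type, item_type, access="public", namespace=""):
    """Return C# source text for one class, optionally wrapped in a namespace."""
    body = CLASS_TEMPLATE.substitute(
        class_name=class_name,
        key_type=key_type,
        item_type=item_type,
        access=access,
    )
    if not namespace:
        return body
    # Indent the class by four spaces so it nests inside the namespace block.
    indented = "".join(
        "    " + line if line.strip() else line
        for line in body.splitlines(keepends=True)
    )
    return f"namespace {namespace}\n{{\n{indented}}}\n"


if __name__ == "__main__":
    print(render_class("CustomerTable", "string", "Customer", namespace="Shop.Data"))
```

The output is plain text that a C# compiler could consume. Nothing in the sketch understands C# itself, which is typical of template-driven generators.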

Signs That You Need a Code Generator

So far, code generation is just a neat parlor trick. The idea of writing code to write code is interesting to most developers, but that's not enough to make it useful. When should you choose to buy or build a code generator for a real project?

One good sign that it's time to consider a code generator is that you're wearing out the Ctrl-C and Ctrl-V key combinations on your keyboard. If you find yourself building large chunks of your project by cut and paste, you need to stop and ask yourself why this is happening. It might be that you're just being sloppy, and that a bit of refactoring will eliminate the cut and paste; perhaps all you need to do is define a utility function that you can call from elsewhere in the project. In that case, you don't need a code generator.

But it's also possible that you're going through a cut, paste, and edit cycle. Consider the case of a HashTable class, for example. If your application contains many business classes, you might find yourself with a HashTable that holds Customer objects, and wanting one that holds Order objects. In that case, your first temptation will be to copy the first class and then edit it so it holds Orders instead of Customers. That's not a situation that can be easily solved with refactoring -- but it is a great place to use a code generator. With a single template and a code generator that's capable of making successive substitutions, you could quickly build both HashTable classes, without needing to do any manual editing.
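
The successive-substitution idea can be sketched in a few lines. This is only an illustration in Python, not CodeSmith: the template, the generate_tables function, and the Customer and Order placeholder classes are assumptions made up for the example.

```python
from string import Template

# Hypothetical template for a type-checked table; ${item} is the only parameter.
TABLE_TEMPLATE = Template('''\
class ${item}Table:
    """Maps keys to ${item} objects (generated code; edit the template instead)."""

    def __init__(self):
        self._items = {}

    def add(self, key, value):
        if not isinstance(value, ${item}):
            raise TypeError("expected ${item}, got " + type(value).__name__)
        self._items[key] = value

    def get(self, key):
        return self._items[key]
''')


def generate_tables(item_types):
    """Apply the same template once per item type; return {class name: source}."""
    return {f"{name}Table": TABLE_TEMPLATE.substitute(item=name) for name in item_types}


# Placeholder business classes standing in for the real ones.
class Customer:
    pass


class Order:
    pass


sources = generate_tables(["Customer", "Order"])

# A real build would write each source to a file; exec keeps the sketch short.
namespace = {"Customer": Customer, "Order": Order}
for source in sources.values():
    exec(source, namespace)

customers = namespace["CustomerTable"]()
customers.add("c1", Customer())
```

Adding a third table class is then a matter of adding one name to the list, with no manual editing of the copies.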

As the cut-paste-edit cycle gets longer, the attraction of a code generator correspondingly increases. Consider the case of writing data access logic classes to interface with a database. Your database might have 50 or 500 tables, each of which will require stored procedures and code to handle data access. You could put a junior developer to work writing that boring boilerplate code over and over again, but a better solution is to use a tool that can iterate through all of the tables in the database and automatically spit out the required stored procedures and classes.
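
Here is one way such a tool might look, sketched in Python against an in-memory SQLite database so that it runs anywhere. The schema, the function names, and the SQL Server-style procedure text are assumptions for illustration only, and the sketch further assumes that every table has a single-column integer primary key.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all user tables, in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def primary_key_and_columns(conn, table):
    """Return (primary key column, all column names) for one table."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    columns = [row[1] for row in info]
    keys = [row[1] for row in info if row[5]]
    return keys[0], columns  # assumes exactly one primary key column


def select_procedure(table, key, columns):
    """Render the text of a 'fetch one row by key' stored procedure."""
    return (
        f"CREATE PROCEDURE Get{table}ByKey @{key} INT AS\n"
        f"SELECT {', '.join(columns)} FROM {table} WHERE {key} = @{key};"
    )


# Hypothetical two-table schema standing in for the 50 or 500 real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, CustomerId INTEGER, Total REAL);
""")

procedures = {
    table: select_procedure(table, *primary_key_and_columns(conn, table))
    for table in list_tables(conn)
}

if __name__ == "__main__":
    for text in procedures.values():
        print(text, end="\n\n")
```

The loop does not care whether the database has two tables or five hundred; the per-table cost is paid once, in the template.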

In some cases, schedule pressure will force you to consider code generation, even if it's not uppermost in your mind. What if you need to generate an HTML Help file documenting a large class library, but you don't have much time in the schedule between the final testing of the class library and its release? The answer is to use a tool that can automatically generate the documentation from the final source code. Note that in this case, "code generation" means building the source files that are compiled into HTML Help, rather than something like C# or Java files. It's the same basic process whatever the target.
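
To make the documentation case concrete, here is a small sketch that reads source code and emits an HTML page from its documentation comments. It uses Python and Python docstrings purely for illustration; the sample class and the document_source function are invented, and the output is a bare HTML page rather than a real HTML Help project.

```python
import ast
import html

# Hypothetical source file to document.
SAMPLE = '''
class Customer:
    """A person or company that places orders."""

    def rename(self, name):
        """Change the display name."""
        self.name = name
'''


def _entries(nodes, prefix=""):
    """Yield (qualified name, docstring) for classes and functions, in source order."""
    for node in nodes:
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            name = prefix + node.name
            yield name, ast.get_docstring(node) or "(undocumented)"
            if isinstance(node, ast.ClassDef):
                # Descend into the class body to pick up its methods.
                yield from _entries(node.body, name + ".")


def document_source(source):
    """Return an HTML page listing every class and function with its docstring."""
    parts = [
        f"<h2>{html.escape(name)}</h2>\n<p>{html.escape(doc)}</p>"
        for name, doc in _entries(ast.parse(source).body)
    ]
    return "<html><body>\n" + "\n".join(parts) + "\n</body></html>"


page = document_source(SAMPLE)

if __name__ == "__main__":
    print(page)
```

Because the page is rebuilt from the final source on every run, the documentation cannot lag behind a late change to the library.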

Finally, don't overlook the impact that code generation can have on code quality. Let's think again about those data access layer bits. Each table will likely require dozens of lines of SQL code and hundreds of lines of high-level code in your application. If you write all of those lines by hand, what's the chance that a typo introduces a subtle bug somewhere along the line? All too high, at least in my experience. By generating the code automatically, you only need to make sure that the template is correct. After that, the individual stored procedures and classes should be correct as well -- assuming that there's not an error in the code generation code itself! Code generation does not free you from the need to test your code, but it can lower the chance of silly errors.
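
Testing generated code can itself be automated. The sketch below is a hedged example in Python with SQLite, in which the SCHEMA dictionary and both function names are hypothetical: it generates one INSERT statement per table from a single template and then runs each one against a throwaway database. An error in the template shows up for every table at once.

```python
import sqlite3

# Hypothetical schema: table name -> column names.
SCHEMA = {
    "Customers": ["CustomerId", "Name"],
    "Orders": ["OrderId", "CustomerId", "Total"],
}


def insert_statement(table, columns):
    """Generate a parameterized INSERT statement for one table."""
    placeholders = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"


def smoke_test(schema):
    """Run every generated statement once; return how many tables were checked."""
    conn = sqlite3.connect(":memory:")
    for table, columns in schema.items():
        conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
        # Dummy values 0, 1, 2, ... are enough to prove the statement executes.
        conn.execute(insert_statement(table, columns), list(range(len(columns))))
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        assert count == 1, f"generated INSERT for {table} did not add a row"
    conn.close()
    return len(schema)


if __name__ == "__main__":
    print(smoke_test(SCHEMA), "generated statements executed")
```

A check like this exercises the generator's output rather than the template alone, which is where an error in the generation code would surface.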

Warning Signs

On the other hand, there are times when code generation just doesn't make sense. Start with the most obvious barrier: if your application doesn't contain a lot of repetitive code, you're unlikely to save effort with a code generator. Remember, the code generator itself is code that needs to be maintained and tested. If you only need a single HashTable class, you might as well write it by hand. Using a generator for it only adds more code in which errors can hide.

In fact, you should be very wary of using a code generator for any code that you couldn't write by hand if you needed to. The code generator should be a time-saving device, not a black box that turns out magic code. You need to consider that you might have to maintain the code by hand in the future, for a variety of reasons. Perhaps you've upgraded your tools and the code generator won't work in the new version, or perhaps you need to customize the code after all. Either way, you'd best understand the code that's in your project.

Cultural factors can also get in the way of code generation. Just because you've identified a need for such a tool doesn't mean that your boss understands the same need. Building a code generator takes time, and buying one takes money. Before you can spend either one, you need to make sure that you have buy-in from your management. Otherwise, the time you spend writing code to write code may well seem wasted to someone who's only monitoring your output in the "real" project.

Other members of your team can get in the way of code generation, too. Some developers have the attitude that "real programmers don't use code generators." They are unlikely to adopt such a tool, even if it's plain to you that it's the best thing for the project. Worse, they may actively sabotage your efforts to use code generation. Most code generation tools are designed to replace the classes they generate when they're run again. If someone else is making changes to those classes by hand, you can end up in an endless cycle of check-ins and check-outs, as people try to recover code that was overwritten by your tool. In such a case, if you can't educate your coworkers, you might as well give up on code generation.

Finally, beware of the "all I have is a hammer" syndrome, where you treat everything as a nail whether it is one or not. Code generators tend to be very targeted tools that build a particular kind of code. Don't get overly fond of a particular tool before you determine whether it's right for your own project. If a particular object-relational mapper creates business objects that don't have the interfaces you're expecting, you don't want to end up rewriting your entire project to accommodate the tool. Rewrite the tool instead, find a different tool, or rethink your strategy.

If It Works, Do It!

The bottom line is simple: code generation can save you a considerable amount of time and money on a large project. When you're faced with a huge project and time pressure (and when was there ever a huge project without time pressure?), take a few days to get a feel for the code that you need to build. You may well identify areas where a targeted code generation tool can help take the pressure off, and that's a big win when it happens.

About the Author

Mike Gunderloy is the author of over 20 books and numerous articles on development topics, and the lead developer for Larkware. Check out his MCAD 70-305, MCAD 70-306, and MCAD 70-310 Training Guides from Que Publishing. When he's not writing code, Mike putters in the garden on his farm in eastern Washington state.

This article was originally published on September 19, 2003
