http://www.developer.com/

Back to article

Clojure's Approach to Polymorphism: Method Dispatch


April 26, 2010

The word polymorphism derives from Greek where poly means many and morph means form. In programming languages, polymorphism is the idea that values of different data-types can be treated in "the same" way. Often, this is manifested via class inheritance where classes in a hierarchy can have methods of the same name. When such a method is called on an object, the correct code is executed at runtime depending on which specific class the object is an instance of. There are various ways of achieving polymorphic behavior, however, and inheritance is only one such way.

This article helps to understand how Clojure approaches polymorphism. We'll start by looking at method dispatch -- starting with the commonly available single dispatch, followed by double and multiple dispatch. The mechanism of dynamic method dispatch is a fancy way of saying that a when a method is called, the name of the method is mapped to a concrete implementation at runtime.

This article is an excerpt from an Early Access Edition of the book "Clojure in Action" (Manning Publications; ISBN: 9781935182597), written by Amit Rathore.

Single and Double Dispatch

Languages such as Java only support single dispatch. What this means is that the type of receiver (the object on which a method is called) is the only thing that determines which method is executed. To understand this and the next couple of paragraphs, we'll use a commonly cited but well-suited example of a program that needs to process an abstract syntax tree (AST). Let's imagine our AST is implemented in a Java-like OO language and is modeled via classes shown in Figure 1.

Polymorphism in Clojure: Method Dispatch
Figure 1. A simplistic hierarchy representing an AST. Each node is a subclass of a generic SyntaxNode and has functions for various tasks a compiler or IDE might perform.

When the IDE needs to format the source code represented by such an AST, it calls format on the it. Internally, the AST delegates the format call to each component in a typical composite fashion. This walks the tree and ultimately calls format on each node in the AST.

This is straightforward enough. The way this works is that, even though the AST is made up of different kinds of concrete nodes, the operation format is defined in the generic base class and is called using a reference to that base class. Dynamically, the receiver type is determined and the appropriate method is executed. If any of these methods were to accept arguments, the types of those would not influence the determination of which method that needs to execute.

Single dispatch leads to possibly unexpected situations. For instance, consider the following code defined inside some sort of Receiver class.

public void collectSound(Dog d) {
  recordBark(d);
}  

public void collectSound(Cat c) {
  recordMeow(c);
}


Now imagine we had a collection of Animal objects (specifically, a collection of instances of Dog and Cat). The following code would not even compile:

for (Object element: animals) {
  Animal animal = (Animal) element;
  receiver.collectSound(animal);
}


The compiler would complain that there is no such method as collectSound that accepts an instance of the Animal class. The reason, to reiterate, is that thanks to Java and single dispatch, only the type of receiver is used to determine which method to call. The type of the argument is not used at all; leaving the compiler to complain that there is no method defined that accepts an Animal.

The way to satisfy the compiler would be to add another method collectSound(Animal a). This would make the compiler happy, but would not do the desired job of dispatching to the right method. The programmer would have to test for the object type and then dispatch again to appropriate methods (which would need to be renamed to collectDogSound and collectCatSound). It might end up looking like this:

public void collectSound(Animal a) {
  if (a instanceof Dog)
   collectDogSound(a);
  if (a instanceof Cat)
   collectCatSound(a);
}


This is highly unsatisfactory! The language should do the job of dispatching methods not the programmer, and this is the limitation of single dispatch. If the language supported double dispatch, the original code would work just fine, and we wouldn't need what we are about to do in the next section. Later on, we'll see how Clojure completely side-steps this recurring issue by giving complete control to the programmer.

The Visitor Pattern (and Simulating Double Dispatch)

In a language that resolved methods through double dispatch, the types of receiver and the first argument would be used to find the method to execute. This would then work as desired.

We can simulate double dispatch in Java and other languages that suffer from single dispatch. In fact, that is exactly what the visitor pattern does. We'll examine that here and later see how it is not needed in Clojure.

Consider the program we mentioned earlier that needed to process an AST. The implementation from there included things like checkValidity and generateASM inside each concrete implementation of SyntaxNode. The problem with this approach, of course, is that adding new operations like this is quite invasive, as every type of node needs to be touched. The visitor pattern allows such operations to be separated from the data object hierarchy (which, in this case, is the SyntaxNode hierarchy). Let's take a quick look at the modification needed to it, followed by the new visitor classes.

Polymorphism in Clojure: Method Dispatch
Figure 2. Simulating double dispatch requires modification. The accept method is a somewhat unclear but required method in each node, which will call back the visitor.

Figure 3 shows the new visitor classes.

Polymorphism in Clojure: Method Dispatch
Figure 3. Each visitor class does one operation and knows how to process each kind of node. Adding new kinds of operation involves adding new visitors.

The infrastructure method accept(NodeVisitor) in each SyntaxNode is required because of the need to simulate a double dispatch. Inside it, the visitor itself is called back using an appropriate visit method. Here's some more Java code showing an example involving AssignmentNode:

public void accept(NodeVisitor visitor) {
  visitor.visitAssignment(this);
}


If the language natively supported double dispatch, all of this boilerplate code wouldn't need to exist. Further, the NodeVisitor hierarchy needs to know about all the classes in the SyntaxNode hierarchy, which makes it a more coupled design.

Now that we know about double dispatch, the obvious question is -- what about triple dispatch? What about dispatching on any number of argument types?

Multiple Dispatch

In essence, multiple dispatch takes the idea of double dispatch to its logical end. A language that supports multiple dispatch can look up methods based on the type of receiver and all the arguments passed to it.

A language that supports this feature doesn't need the convoluted visitor pattern. Simple, straight forward calls with multiple arguments of different types just work. Figure 4 shows how it might look.

Figure 3 shows the new visitor classes.

Polymorphism in Clojure: Method Dispatch
Figure 4. With multiple dispatch, visitors can have a straightforward, polymorphic visit method (off the type of SyntaxNodes), and SyntaxNode does not need the accept method infrastructure.

Languages such as Dylan, Nice, and R support multiple dispatch. Common Lisp supports multiple dispatch via Common Lisp Object System (CLOS), which is a comprehensive DSL written on top of the base language. Clojure also supports multiple dispatch through its multimethods feature.

Multimethods

Now that we've seen how method polymorphism works underneath, we're finally ready to look at Clojure's multimethod feature. Clojure multimethods support not just multiple dispatch, but much more. Indeed, once you look past multiple dispatch, a commonly asked question is can a language dispatch on things other than the types of values. With Clojure's multimethods, methods can be made to dispatch based on any arbitrary rule.

A multimethod is defined using defmulti. A multimethod by itself is not useful; it needs candidate implementations from which to choose when it is called. The defmethod macro is used to define implementations. Let's take a look.

An Example

Consider the situation where our expense tracking service has become very popular. We've started an affiliate program where we pay referrers if they get users to sign up for our service. Different affiliates have different fees. Let's begin with the case where we have two main ones -- mint.com and the universal google.com.

Using Plain Functions

We would like to create a function that calculates the fee we pay to the affiliate. For the sake of illustration, let's decide we'll pay our affiliates a percentage of the annual salary the user makes. We'll pay Google 0.01%, we'll pay Mint 0.03%, and everyone else gets 0.02%. Let's write this without multimethods first.

(defn fee-amount [percentage user]
  (float (* 0.01 percentage (:salary user))))

(defn affiliate-fee-cond [user]
  (cond 
    (= :google.com (:referrer user)) (fee-amount 0.01 user)
    (= :mint.com (:referrer user)) (fee-amount 0.03 user)
    :default (fee-amount 0.02 user)))


The trouble with this way of writing this function, of course, is that it is painful to add new rules about affiliates and percentages. The cond form will soon get messy. We will now rewrite it using Clojure's multimethods.

Using Multimethods

Before we implement the same functionality using multimethods, let's take a moment to understand how they work. As mentioned earlier, multimethods are declared using the defmulti macro. Here's a simplified general form of this macro:

(defmulti name dispatch-fn & options)


The dispatch-fn is a regular Clojure function that accepts the same arguments that are passed when the multimethod is called. The return value of this dispatch-fn is what is called a dispatching value. Before moving on, let's look at the previously mentioned defmethod macro:

(defmethod multifn dispatch-value & fn-tail)


This creates a concrete implementation for a previously defined multimethod. The multifn identifier should match the name in the previous call to defmulti. The dispatch-value is a value will be compared with the earlier return value of the dispatch-fn in order to determine which method will execute. This will be clearer through an example, so let's now rewrite the affiliate-fee function using multimethods.

(defmulti affiliate-fee :referrer)

(defmethod affiliate-fee :mint.com [user]
  (fee-amount 0.03 user))

(defmethod affiliate-fee :google.com [user]
  (fee-amount 0.01 user))

(defmethod affiliate-fee :default [user]
  (fee-amount 0.02 user))


That looks a lot cleaner than the cond form, since it separates each case into its own method (which looks somewhat similar to a plain function). Let's set up some test data, so we can try it out.

(def user-1 {:login "rob" :referrer :mint.com :salary 100000})
(def user-2 {:login "kyle" :referrer :google.com :salary 90000})
   (def user-3 {:login "celeste" :referrer :yahoo.com :salary 70000})


And now to try a few calls to affiliate-fee:

(affiliate-fee user-1)
30.0
(affiliate-fee user-2)
9.0
   (affiliate-fee user-3)
14.0


So this works as expected. What happens is when a call is made to affiliate-fee, the multimethod first calls the dispatch-fn. In this case, we used :referrer as the dispatch-fn. The arguments to the dispatch-fn are the same as the arguments to the multimethod, which in this case is the user hash-map. Calling :referrer on the user object returns something like :google.com and is considered the dispatch-value. The various methods are then checked over, and when a match is found, it is executed. If a match is not found, then the default case is picked. The default value for this catch-all case is :default, but you can specify anything you want when calling defmulti, like so:

(defmulti affiliate-fee :referrer :default :else)


After changing the default value with such a call, the default case method would need to use the same value:

(defmethod affiliate-fee :else [user]
  (fee-amount 0.02 user))


It's that simple! Now, to add new cases, we just add new methods, which is far cleaner than ending up with a long-winded cond form. Now let's try and stretch this example a bit by expanding the capabilities of our system.

Multiple Dispatch

Now let's imagine that our service is even more successful than before, and that the affiliate program is working out great. So great, in fact, that we'd like to pay them a higher fee for more profitable users. This would be a win-win for the affiliate network and our service, so let's get started by quantifying a user's level of profitability. Consider the following dummy implementation of this functionality:

(defn profit-rating [user]
 (let [ratings [::bronze ::silver ::gold ::platinum]
       randomizer (java.util.Random. )
       random-index (.nextInt randomizer (count ratings))]
   (nth ratings random-index)))


This is quite the dummy implementation, as you can see; it doesn't even use the user parameter. It serves our purpose nicely though, since it demonstrates that this function could be doing anything (including things like number crunching, database access, web service calls, whatever you like). In this case, it returns one of the four possible ratings ::bronze, ::silver, ::gold, or ::platinum.

Now let's consider the business rules shown in the Table 1.

Table 1. Affiliate Fee Business Rules
Affiliate Profit Rating Fee (% of salary)
mint.com Bronze 0.03
mint.com Silver 0.04
mint.com gold/platinum 0.05
google.com gold/platinum 0.03

From the rules it is clear that there are two values based on which the fee percentage is calculated -- the referrer and the profit rating. In a sense, the combination of these two values is the affiliate fee type. Since we'd like to dispatch on this virtual type consisting of two values, let's create a function that computes the pair:

(defn fee-category [user]
  [(:referrer user) (profit-rating user)])


This returns a vector containing two values, for instance [:mint.com ::bronze], [:google.com ::gold] and [:mint.com ::platinum]. We can use these as dispatch-values in our methods, like so:

(defmulti profit-based-affiliate-fee fee-category)
(defmethod profit-based-affiliate-fee [:mint.com ::bronze] [user]
  (fee-amount 0.03 user))
(defmethod profit-based-affiliate-fee [:mint.com ::silver] [user]
  (fee-amount 0.04 user))
(defmethod profit-based-affiliate-fee [:mint.com ::gold] [user]
  (fee-amount 0.05 user))
(defmethod profit-based-affiliate-fee [:mint.com ::platinum] [user]
  (fee-amount 0.05 user))
(defmethod profit-based-affiliate-fee [:google.com ::gold] [user]
  (fee-amount 0.03 user))
(defmethod profit-based-affiliate-fee [:google.com ::platinum] [user]
  (fee-amount 0.03 user))
(defmethod profit-based-affiliate-fee :default [user]
  (fee-amount 0.02 user))


This reads a lot like our table with business rules, and adding new rules is still quite easy, and doesn't involve modifying existing code.

On a separate note, this is a form of double dispatch since we're dispatching on two values. The affiliate-id and profit-level, even though they aren't types in the class-based sense of the word, behave as types in this domain. In other words, the Clojure code we just wrote performs a double dispatch based on types in our domain. There's nothing to stop us from dispatching on any number of such types if we wanted to, and the mechanism is just as straightforward.

Note also that although we've seen polymorphic behavior in the examples above, we've not mentioned inheritance at all. This shows that inheritance is not a necessary condition for polymorphism. However, inheritance can be quite convenient, as we will see in the next few paragraphs.

Ad-hoc Hierarchies

Now that we have the profit-rating infrastructure in place, we might want to expose the ratings to our members in a form similar to airlines membership status. We could treat bronze as an entry-level status, and members could graduate to silver and higher levels. We could also treat gold and platinum as premier statuses, affording such members a higher class of service. This classification might look like this:

Polymorphism in Clojure: Method Dispatch
Figure 5. Our domain demands a hierarchy of membership statuses. We can use this externally facing classification to simplify our code by defining an ad-hoc hierarchy reflecting this structure.

We can codify this hierarchy using Clojure's derive function, as follows:

(derive ::bronze ::basic)
(derive ::silver ::basic)
(derive ::gold ::premier)
(derive ::platinum ::premier)


Having done this, we can programmatically ensure it works as expected using the isa? function.

(isa? ::platinum ::premier)
true

(isa? ::bronze ::premier)
false


Since we now have a hierarchy, we can redefine the multimethod to calculate affiliate fee using it, as follows:

(defmulti affiliate-fee-for-hierarchy fee-category)
(defmethod affiliate-fee-for-hierarchy [:mint.com ::bronze] [user]
  (fee-amount 0.03 user))
(defmethod affiliate-fee-for-hierarchy [:mint.com ::silver] [user]
  (fee-amount 0.04 user))
(defmethod affiliate-fee-for-hierarchy [:mint.com ::premier] [user]
  (fee-amount 0.05 user))
(defmethod affiliate-fee-for-hierarchy [:google.com ::premier] [user]
  (fee-amount 0.03 user))
(defmethod affiliate-fee-for-hierarchy :default [user]
  (fee-amount 0.02 user))


This is even more succinct; it also captures the intent that acquiring a premier member from an affiliate partner is more expensive.

Java Class Hierarchy

While it is easy enough to create ad-hoc hierarchies as shown earlier, Clojure goes one step further to make things simple by providing support for Java classes out of the box. When Java classes are used as dispatch values, inheritance relationships are automatically respected, relieving the need for the hierarchy to be created again via calls to derive.

This ensures that both javax.swing.JComboBox and javax.swing.JFileChooser automatically treat javax.swing.JComponent as a parent. A method that uses the JComponent as a dispatch value will match if the dispatching function were to return either a JFileChooser or a JComboBox. The programmer doesn't have to do anything special for this to work.

The Visitor Pattern Revisited

Now that we know how multimethods solve dispatch problem, let's rewrite the AST program using multimethods. Imagine we represent a couple of syntax nodes thus:

(def aNode {:type :assignment :expr "assignment"})
(def vNode {:type :variable-ref :expr "variableref"})


We will use the appropriately named :type value of the hash-map to behave as the type upon which we will dispatch our multimethods. Here's the implementation of checkValidity:

(defmulti checkValidity :type)
(defmethod checkValidity :assignment [node]
  (println "checking :assignment, expression is" (:expr node)))
(defmethod checkValidity :variable-ref [node]
  (println "checking :variable-ref, expression is" (:expr node)))


And, here's the multimethod for generateASM:

(defmulti generateASM :type)
(defmethod generateASM :assignment [node]
  (println "gen ASM for :assignment, expr is" (:expr node)))
(defmethod generateASM :variable-ref [node]
  (println "gen ASM for :variable-ref, expr is" (:expr node)))


This is much simpler than having created the whole double dispatch mechanism in a language like Java or C++ that doesn't support it. In order to add new types of operations on the AST, we just create a new multimethod.

We've covered creation and usage of multimethods. Before closing our discussion, let's look at the situation where more than one dispatching value matches, which causes the multimethod to become confused.

Resolving Conflicts

Sometimes, a hierarchy may involve what looks like multiple inheritance. Consider the situation where a Programmer can be both an Employee and a Geek.

Polymorphism in Clojure: Method Dispatch
Figure 6. A hierarchy could lead to multiple inheritance. Clojure doesn't do anything to prevent this, but gives an elegant way to break a tie if multiple dispatch values match.

To reflect this, one would call derive with the following:

(derive ::programmer ::employee)
(derive ::programmer ::geek)


Such "multiple inheritance" relationships are perfectly valid in several domains, and Clojure doesn't stop the programmer from creating or using them. However, when methods are created with dispatch values of ::employee and ::geek, and the dispatch function returns ::programmer, the multimethod doesn't know which one to pick since they are both valid.

This multiple inheritance problem is by no means unique to Clojure. Languages like Java avoid it by disallowing classes from being derived from more than one parent class. In Clojure, we can break the tie by specifying our order of preference using the prefer-method function. Here's how we would specify that being a geek trumps being employed:

(prefer-method multimethod-name ::geek ::employee)


Read this as a preference of ::geek over ::employee.

We've covered the mechanics of multimethods and seen how they're a superior way of achieving polymorphism than what languages limited to single dispatch can ever provide. Having said that, multimethods aren't used very much, mostly because the functional programming style makes this kind of polymorphism less needed. When there is a need, however, they can be a very elegant solution.

Summary

As we have seen, what is generally referred to as polymorphism in the context of languages such as Java and C++, is actually a rather specific variety called subtype polymorphism. Further, the implementation in such languages is limited to single dispatch, which is again a very specific case of the far more general possibility of multiple dispatch.

Recall that these languages subscribe to a view of OOP that demands methods (and data) belong to the enclosing class. It is probable that this is the reason these languages support only single dispatch -- they only dispatch on the receiver since it is the class that owns the potentially polymorphic method.

In Clojure, however, a multimethod doesn't belong to anything. The concrete methods, instead, belong to the multimethod, and they can be dispatched based off any number of types. In fact, the dispatch function is just an ordinary function written by the programmer and can do anything it wants. It is by no means limited to only the type of the arguments, opening up possibilities that can probably not even be dreamed off in other languages.

About the Author

Amit Rathore has 10 years of software development experience, building mission-critical enterprise systems. He's the chief architect at runa.com, a targeted dynamic sale pricing service to online merchants. The system, written entirely in Clojure, collects extensive click-stream data from consumers browsing products, analyzes it using statistical models, and then offers the consumers special, personalized pricing in real-time. The system must handle extremely high-loads, and is written in a highly distributed, fault tolerant manner ideal for Clojure.

Sitemap | Contact Us

Thanks for your registration, follow us on our social networks to keep up-to-date