Showing posts with label patterns. Show all posts
Showing posts with label patterns. Show all posts

Saturday, February 9, 2008

Continuations Passing Style (poorly) Explained II: Java 7 BGGA Closures Prototype

Update: Like many people, I've been grappling with trying to understand continuations. In these posts, I tried to reduce continuations to their absolute essence. But in doing so, I may have missed the mark widely. I might have reduced and twisted them to the point where they aren't continuations at all, but something else entirely. As the joke goes, "Why isn't an elephant small, round, and white? Because then it would be an aspirin." Maybe all that is left of this post is a strange clever trick/ugly hack. Sorry if I've added to your confusion and misinformation... it was fun to write.

Update #2: This post is pretty much a disaster. I mangled the return statements within the closures because I didn't understand the non-local return "return" keyword.

Learning a new language opens your mind to new ways of solving old problems, and so I found myself staring at my Java IDE this afternoon thinking, "What I really need is continuation passing style and recursion." To fully convince myself, I came home and spent Friday night writing a continuation passing style solution to my problem four different times: in Java 4 (no generics), in Java 5 (with generics), in Java 7 (using Neal Gafter's closures prototype, and in Groovy.

This is Part II of the post, in which I attempt to show how continuation passing style can be used in Java 7 with the current BGGA closures prototype. If you're new to the style you might read Part I for a broader background on the topic and to see the Java 4 and 5 examples. If you're a continuation expert, you might do the same, but then leave a slew of corrections to the post in the comments section. Part III details the Groovy version, and is available at http://hamletdarcy.blogspot.com/2008/02/continuations-explained-iii-groovy.html.

So a continuation is a function that represents "the rest of the computation given a point in the computation". Continuation passing style is, roughly, passing a function into another function which calls the passed function as the return statement. In a sense, the passed function "receives" the result of method call. In the previous post, I explained how a method to find the first element of a list and a collector can be combined to emulate the standard foreach-loop. They can also be combined to emulate a for-every-other loop, or a for-multiples-of-3-loop, or a for-first-matching, or anything you want. That's the power of moving the flow control to the continuation and out of the language (foreach) or enclosing code (Strategy/Template Patterns).

The original Java 5 examples combined an interface representing a function (Continuation) and another function (findFirst) which finds the first element of a list and calls the continuation, passing that first element and the rest of the list without that first element:

interface Continuation<T, U> {
U call(T first, List<T> rest);
}

<T, U> U findFirst(List<T> list, Continuation<T, U> continuation) {
if (list.isEmpty()) return continuation.call(null, list);
T first = list.get(0);
return continuation.call(first, list.subList(1, list.size()));
}

If you want a foreach-loop then you make sure your continuation recursively calls back findFirst on the rest of the list. If you want a for-every-other loop then you have the continuation remove the first element of the "rest" parameter and then call back findFirst recursively. Any iteration strategy is allowed because it is left up to the calling code to specify it. In this way we can define an evens( ) method which will generate a list of the even numbers from a source list, which normally requires a foreach-loop:
List<Integer> evens(final List<Integer> numbers) {
return findFirst(numbers, new Continuation<Integer, List<Integer>>(){
public List<Integer> call(Integer first, List<Integer> rest) {
if (first == null) return Collections.emptyList(); //no more elements
List<Integer> result = new ArrayList<Integer>();
if (first % 2 == 0) result.add(first); //match!
result.addAll(evens(rest));
return result; //process rest of list
}
});
}

You can also define an exists( ) method that would normally require a for-first-loop:
Boolean exists(final Integer needle, List<Integer> haystack) {
return findFirst(haystack, new Continuation<Integer, Boolean>(){
public Boolean call(Integer first, List<Integer> rest) {
if (first == null) return false; //never found
if (first.equals(needle)) return true; //match!
return exists(needle, rest); //process rest of list
}
});
}

I won't go into detail because I want to spend more time on the Java 7 version, which theoretically should be simpler and easier to read. It's also more fully explained in Part I.

So in Java 7 there is no need to declare the Continuation interface or create an anonymous inner class. The findFirst function can simply declare that it accepts a closure:
<T, U> U findFirst(List<T> list, {T, List<T>=>U} continuation) {
if (list.isEmpty()) return continuation.invoke(null, list);
T first = list.get(0);
return continuation.invoke(first, list.subList(1, list.size()));
}

So reading left to right...

  • <T, U>... this is a function that requires two generic types: T and U

  • U... this function will return an object of type U

  • findFirst... this function is named "findFirst"

  • List<T> list... the first parameter is a List containing elements of type T

  • {T, List<T>=>U} continuation.... ooohh, the fun begins. The second parameter is a function, declared using the curly braces. The parameters to this function are a T object and a List of T objects. This function returns an object of type U, and the function is named "continuation". The => thing separates the parameter list from the return type.

Other than that, there is no difference in implementation of findFirst in Java 5 vs. Java 7. If the list passed to findFirst is empty, then the continuation is invoked with null to signal the end of the computation. Otherwise, the first element and the rest of the list are passed to "continuation" for final processing. For a foreach-loop, the continuation will do something and the call back findFirst with the rest of the list as a parameter. For a find-first-matching-loop, the continuation will simply return true on a matching element and end the looping right there.

Have a look at the evens( ) implementation:
List<Integer> evens(final List<Integer> numbers) {
return findFirst(numbers, { Integer first, List<Integer> rest =>
List<Integer> result;
if (first == null) {
result = Collections.emptyList(); //no more elements
} else {
result = new ArrayList<Integer>();
if (first % 2 == 0) result.add(first); //match!
result.addAll(evens(rest));
}
result //process rest of list
}
);
}

What's missing is the anonymous inner class. It's been replaced by { Integer first, List<Integer> rest =>... removing the class declaration and the method definition reduces the lines of code by 2! (That's by two, not by half). But why is the semi-colon missing off the last line? The original version I wrote used the "return" keyword, which doesn't behave the same way inside and outside of a closure. If you want to return a value from a closure, then it must be the last line of the closure, have no return statement, and lack a semi-colon. This means you can't return from multiple places within a closure, and you can't easily move code in and out of closures without some IDE help. I don't feel bad about getting it wrong the first time.

The exists( ) method becomes similarly changed:
Boolean exists(final Integer needle, List<Integer> haystack) {
return findFirst(haystack, { Integer first, List<Integer> rest =>
boolean result = false;
if (first == null) result = false; //never found
else if (first.equals(needle)) result = true; //match!
else result = exists(needle, rest); //process rest of list
result
}
);
}

Et voila. You have used a collector to define your own flow control. One function worries about how elements in a list are found, and another function worries about what to do with those elements.

The purpose of this post is to answer how to do this in Java 7, not whether you should do this. But I can't resist providing a couple thoughts...

  • I wrote this code with only a basic understanding of recursion
  • ...but I love closures in Groovy, so I already comfortable using closures
  • The generics involved were not difficult to write
  • ...but at work, I'm the guy people go to with generics questions (sometimes I wonder why)
  • With minimum effort, a Java developer could learn functional programming
  • ...but the same can be said of generics, which many Java developers refuse to adopt. I certainly got a lot wrong about continuations in this series of posts.
  • The first revision of the code used the return statement, which does not do what I thought it would do
  • ... maybe using "return" for a non-local return by default isn't the right default
  • Finally, without tail call optimization (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4726340), it would be risky to deploy this code. You'll run out of memory quickly.

After doing this work, I'm impressed with how easy it was for me, personally, to adopt the new closures features. And this isn't the simplest use of closures. I'm also surprised how easy it was to get core concepts wrong, like what "return" does and how continuations work. Some people are asking the question of whether a functional style has any place within Java. I've started to think more about this... and those critics might have a point. It's not entry level code for many beginning programmers. They have my sympathy. But that seems like an issue that businesses can address within a development team rather that within a language feature. After all, beginner programmers will go to extraordinary lengths to do the wrong thing. If you don't want your developers using a language feature then make a policy to forbid it. Easier said then done, though.

This blog post probably won't descend into a closures debate simply because my readership hovers near zero most days... thanks for listening all the same.

Full executable source is available at: http://www.hamletdarcy.com/files/2008/continuations.zip (requires Junit 4 to run). Have fun!

Friday, February 8, 2008

Continuation Passing Style Explained I: Java 5

Update: Like many people, I've been grappling with trying to understand continuations. In these posts, I tried to reduce continuations to their absolute essence. But in doing so, I may have missed the mark widely. I might have reduced and twisted them to the point where they aren't continuations at all, but something else entirely. As the joke goes, "Why isn't an elephant small, round, and white? Because then it would be an aspirin." Maybe all that is left of this post is a strange clever trick/ugly hack. Sorry if I've added to your confusion and misinformation... it was fun to write.

Learning a new language opens your mind to new ways of solving old problems, and so I found myself staring at my Java IDE this afternoon thinking, "What I really need is continuations and recursion." To fully convince myself I came home and spent Friday night writing a continuation based solution to my problem four different times: in Java 4 (no generics), in Java 5 (with generics), in Java 7 (using Neal Gafter's closures prototype), and in Groovy.

In this post I'll try to explain continuation passing style using Java 4 and Java 5 examples. Part II will show how the code changes with the latest closures prototype code, and Part III demonstrates the idea in Groovy.

I run across the scenario that made me think of continuation passing style fairly frequently. It's sort of a variation on the "Hole in the middle" problem described over on Enfranchised Mind. But it's really more of a "Function at the end" approach. I had reams of boilerplate foreach-loop code and just a few lines different right in the inner most loop. A simple, scaled back example is to consider writing a few functions to operator on numeric List data structures, where you need to write an exists( ) method to check for the presence of an element and an evens( ) method to filter all the odd numbers out. Your first revision will probably look something like this:

boolean exists(Integer needle, List haystack) {
for (Object element : haystack) {
if (needle.equals(element)) return true;
}
return false; // nothing found
}

List evens(List numbers) {
List result = new ArrayList();
for (Object element : numbers) {
Integer x = (Integer)element;
if ((x % 2) == 0) result.add(x);
}
return result;
}

You only have to write a couple of these functions before you realize that the list iteration methods are repeated across all of them. And the Template Method and Strategy Pattern don't exactly help here because it's hard to replicate the return statements within the functions. Some iterations stop after a few loops while others iterate over every element. Others might even perform an action on every other element. You'd never write all these loops into the language (foreach, for-first, for-every-other, for-multiples-of-3, etc.)

Enter continuations. A continuation is a function that represents the "rest of the computation given a point in the computation". Naively put, break an algorithm into two parts, have the first part call the second part in the return statement, and then the second part is your continuation. Perhaps you want to count all the files in a set of directories. You might create one function that finds all your directories ( f( ) ) and another function that counts files in a directory ( g( ) ). If you pass g( ) to f( ) as a parameter, and you code f( ) to do it's work and then call g( ), then you have composed a new function and g( ) is the continuation. Most examples you'll find on the Net have to do with tree traversal or web frameworks. The Little Schemer explains it much more simply, and I'm personally sticking with the simpler explanation. Now, full continuations are more complex than this, and I've actually just described a "continuation passing style" and not all continuations. True continuations carry a lot of rules about stack frames and variable scoping that I'm not going to touch.

Now, hopefully your language allows you to easily assign the continuation to a variable and pass it to methods, but perhaps not. You might be stuck with Anonymous Inner Classes, but that's alright. We can piece together a simple version of the evens( ) and exists( ) methods using continuations to see how they work.

The first thing to do is declare a function type:
interface Continuation {
Object call(Integer first, List rest);
}

The argument list may seem odd, but I'll explain. I've been taught to think of a foreach-loop as doing something for each element of a list. Another (better?) way to think of a foreach-loop is doing something for the first element of a list, and then repeating that something on the rest of the list until there are no more elements. That's just recursion. So if we want to emulate a foreach-loop, all we really need is a findFirst( ) method and a continuation that can do something for the first element and then call findFirst( ) again on the remainder of the list.
Object findFirst(List numbers, Continuation continuation) {
if (numbers.isEmpty()) return continuation.call(null, numbers);

Integer first = (Integer)numbers.get(0);
return continuation.call(first, numbers.subList(1, numbers.size()));
}

Basically, if the findFirst method is given an empty list, then our continuation is going to get called with null, signaling the end of the computation. Otherwise, we're going to pop the first element off the head of the list, and hand that element to the continuation along with the rest of the list. In this way you can implement an exists( ) method like so:
Boolean exists(final Integer needle, List haystack) {
return (Boolean) findFirst(haystack, new Continuation(){
public Object call(Integer first, List rest) {
if (first == null) return false; //never found
if (first.equals(needle)) return true; //match!
return exists(needle, rest); //process rest of list
}
});
}

The exists( ) method is looking for a needle (Integer) in a haystack (List). It uses the findFirst method to find the first element in haystack and passes it a continuation which knows how to proceed with the computation. In this case, if the continuation sees that a null is passed then it just returns false because no matching element was ever found. Otherwise, if the element is a match the continuation returns true, control is passed back to findFirst, back out to exists( ), and a false is received on the calling code. Finally, if there is no match, exists( ) is recursively called, but this time the list does not contain the first element.

The evens( ) method is similarly implemented in terms of the findFirst method:
List evens(final List numbers) {
return (List) findFirst(numbers, new Continuation(){
public Object call(Integer first, List rest) {
if (first == null) return Collections.emptyList(); //no more elements
List result = new ArrayList();
if (first % 2 == 0) result.add(first); //match!
result.addAll(evens(rest));
return result; //process rest of list
}
});
}
The continuation looks at the first element, if it is even then it adds the element to it's result set. Then the continuation recursively calls evens( ) on the rest of the list. In the end, you're left with a list of even numbers. The important distinction between the exists( ) and evens( ) methods is the way that both contain unique flow control. Being able to modify how your code returns or accumulates values is an important benefit of using continuation passing style in this way. Also being able to abstract out the control flow allows you to reuse that flow in different contexts. It would be difficult to use the Template or Strategy pattern and have one strategy stop the computation returning a boolean and another continue the computation accumulating Integers without repeating the code the iterates over lists. This example creates more code than it removes and introduces a functional style that is foreign to most Java code bases. Both good reasons not to use it. Other languages are different. Part II covers how the Java 7 prototype will improve this pattern, and Part III shows how Groovy would simplify it. But before moving on, let's consider how Java 5 Generics will make this code look.

I, for one, prefer this pattern with generics. Mostly because it makes me feel smarter than those poor suckers stuck on Java 1.4. We all have our own motives in life.
interface Continuation<T, U> {
U call(T first, List<T> rest);
}

<T, U> U findFirst(List<T> list, Continuation<T, U> continuation) {
if (list.isEmpty()) return continuation.call(null, list);
T first = list.get(0);
return continuation.call(first, list.subList(1, list.size()));
}

The Continuation is declared as a function which takes a T and a List of T's, and returns a U. Think forward and see how exists( ) will take an Integer and a List of Integers, and return a Boolean. Evens( ) will take the same parameters but return a List of Integers.

The implementation of findFirst is really no different, except that it now scares off novice Java developers. Its parameters are a list of T's and a continuation that accepts T's and returns U's. The benefit here is that Continuation interface can now be used type safely in many more scenarios than the previous version.

As far as the implementations of exists( ) and evens( ) goes, they really don't change at all except to pick up some more angle brackets:
Boolean exists(final Integer needle, List<Integer> haystack) {
return findFirst(haystack, new Continuation<Integer, Boolean>(){
public Boolean call(Integer first, List<Integer> rest) {
if (first == null) return false; //never found
if (first.equals(needle)) return true; //match!
return exists(needle, rest); //process rest of list
}
});
}

List<Integer> evens(final List<Integer> numbers) {
return findFirst(numbers, new Continuation<Integer, List<Integer>>(){
public List<Integer> call(Integer first, List<Integer> rest) {
if (first == null) return Collections.emptyList(); //no more elements
List<Integer> result = new ArrayList<Integer>();
if (first % 2 == 0) result.add(first); //match!
result.addAll(evens(rest));
return result; //process rest of list
}
});
}
I love generics, but I'm at a bit of a loss trying to explain how this version is better. Maybe there is a message in that. And maybe the Java 7 version will fare better.

Next up is the Java 7 Prototype version of this, along with the Groovy version.

Full executable source is available at: http://www.hamletdarcy.com/files/2008/continuations.zip (requires Junit 4 to run). Have fun!

Sunday, November 11, 2007

Groovy Closures - The End of Template Method?

I've always loved the Template Method design pattern. If there were ever a good reason to use implementation inheritance (i.e. an abstract class) this seems like it. I find it natural to want to define the outline of an algorithm in a parent class, and then provide the implementation details within concrete subclasses. A while back, Alex Miller provided a great criticism of the pattern, pointing out several deficiencies in it. Particularly, it is sometimes hard to comprehend and maintain as it grows in complexity.

After working more with Groovy, another limitation of how the pattern is traditionally written has become apparent. In multi-step algorithms, where a skeleton algorithm in an abstract class calls several steps in a subclass, each of the steps are tightly coupled to one another. A concrete subclass locks you into using a certain composition of steps at compile time, and as the number of steps grows, the combinatorial explosion of subclasses becomes unmanageable. An example might make this clearer.

Consider an encryption algorithm composed of two steps (double encryption must be more secure, right?). Say you want step one to reverse the text of a message, and step two to perform a rot13 operation on it. (Okay, okay, this is really a cipher and not an encryption, but that's not the point). In this scenario you would define the algorithm template in an abstract class, say TwoPhaseEncryption. Then you would create a subclass with two methods, one to perform the reverse and one to perform the rot13.

Here is the traditional Java code, with the subclass implemented as an inner class for brevity:

public abstract class TwoPhaseEncryption {

public String perform(String text){
return phase2(phase1(text));
}

protected abstract String phase1(String text);
protected abstract String phase2(String text);

public static final class ReversingRot13 extends TwoPhaseEncryption {

protected String phase1(String text) {
return new StringBuffer(text).reverse().toString();
}

protected String phase2(String text) {
String result = "";
for (int i = 0; i < text.length(); i++) {
char c = text.charAt(i);
if (c >= 'a' && c <= 'm') c += 13;
else if (c >= 'n' && c <= 'z') c -= 13;
else if (c >= 'A' && c <= 'M') c += 13;
else if (c >= 'N' && c <= 'Z') c -= 13;
result += c;
}
return result;
}
}
}
I tried to choose a dead simple example, but the rot13 method is still 10 lines of code. The rest of the class should be pretty familiar, though. The abstract TwoPhaseEncryption provides a perform(String) method that calls phase1() and phase2() on its subclasses, and then the Rot13Reverser subclass reverses the string as phase one and shifts all the letters 13 positions as phase 2. As is evidenced by the usage, this code does indeed provide symmetrical encryption and decryption:
TwoPhaseEncryption encryptor = new ReversingRot13();

String encrypted = encryptor.perform("abcdefghijklmnopqrstuvwxyz");

String decrypted = encryptor.perform(encrypted);

//prints Encrypted: mlkjihgfedcbazyxwvutsrqpon
System.out.println("Encrypted: " + encrypted);

//prints Encrypted: abcdefghijklmnopqrstuvwxyz
System.out.println("Decrypted: " + decrypted);
This is elegant to me. It's easy to understand, especially if you already know what a template method is. It's easy to test, which I place a high value on. But what happens when you want to reverse the two steps... say you needed a Rot13Reverser instead of a ReversingRot13? Or maybe you have a new DES encryption step, and now you need all the possible combinations: a Rot13DESer, a DESRot13er, a ReversingDES, and a DESReverser. Do you really want 6 subclasses? (Your answer is no). How about adding one more encryption type? That's too many types.

Please don't be deceived by the simplicity of this example! This code shows all of the steps being homogenous, they can be mixed and matched any which way, so a composition might be a better answer. But oftentimes the steps aren't homogenous. Perhaps you have an algorithm with a common setup-run-teardown phase, where each step can't be substituted with one another, yet different run phases might be used with different setup/teardown combinations. There are times where a template method is the desired pattern. But I'm here to tell you that subclassing probably isn't the desired implementation.

Let's see how you would program a template method in Groovy, which provides simple and easy to use closures as part of the language.
class TwoPhaseEncryptor {
def phase1;
def phase2;

TwoPhaseEncryptor(phase1, phase2) {
this.phase1 = phase1;
this.phase2 = phase2;
}

//perform two phase encryption
def perform = { text ->
return phase2(phase1(text));
}
}
Seriously? That's it? Sort of. This is just a class that takes two closures (the phases) and runs the supplied text through each in the perform() method. But the implementation of the algorithm can be done whereever you want. In my example I do it in the client script that uses the TwoPhaseEncryptor, but they could be defined on a common class and then assembled by the client as need be:
//closure that reverses a string
reverser = { return it.reverse() }

//closure that performs rot13 substitution on a string
rot13 = {
def rot13Array = it.collect {
if (it >= 'A' && it <= 'M') return (char)(it.charAt(0) + 13);
if (it >= 'N' && it <= 'Z') return (char)(it.charAt(0) - 13);
if (it >= 'a' && it <= 'm') return (char)(it.charAt(0) + 13);
if (it >= 'n' && it <= 'z') return (char)(it.charAt(0) - 13);
return it; //out of range
}
def result = ""
rot13Array.each {
result += it;
}
return result
}

encryptor = new TwoPhaseEncryptor(reverser, rot13)

//encrypt the alphabet
def encrypted = encryptor.perform("abcdefghijklmnopqrstuvwxyz")

//decrypt the encrypted alphabet
def decrypted = encryptor.perform(encrypted)

println "Encrypted: " + encrypted
println "Decrypted: " + decrypted
So in this example, the first closure takes the place of the phase1() Java method, and the second closure takes the place of the phase2() java method. The phase1() and phase2() methods are then simply passed to the TwoPhaseEncryptor as parameters at runtime, rather than provided by the concrete subclass.

At first glance, the code doesn't look as clean to me. I like the way Java declares the intent of the algorithm by making you declare abstract methods, and all the code for the phases are in one place, making it easy to take it all in quickly. But the Groovy way has a much larger advantage: low coupling.

What needs to change in the Groovy version if you want to do the rot13 first instead of second? You switch the order of the parameters. Hmmm, that was easy. What needs to change to introduce a DES reverser? Well, you write the method and then pass it as a parameter. Where is the class explosion? Where are the multitudes of types? They aren't there! Don't like how the calling code needs to know about the phases? Move the phases to a helper class, or better yet, use Spring to inject your calling code with a constructed TwoPhaseEncryptor.

The Gang of Four Design Pattern book was written with C++ users in mind, so the high coupling between the template method and its implementations wasn't immediately apparent. Having written quite a few template methods in C++ myself, it wasn't until I discovered Groovy that I saw the Path to the Better Way. As Neal Ford argues, many of the design patterns are recipes, they show you how to code around common issues using the languages of the day. But they are seasonal recipes, and as languages come and go, the recipes are going to change while design problems won't. And at this point, my recipe for template methods no longer requires subclassing because I can better achieve the object oriented principles I strive for (low coupling, high cohesion) by using object composition and dependency injection.

And if anyone can write a better rot13 method in Groovy then please post it!

Tuesday, September 18, 2007

Java Performance Myths

I recently looked at the Jave Performance Myths presentation that Cliff Click gave at JavaOne. Some of the results were not surprising: final keyword does not help performance, autoboxing is nearly free. But other results were more interesting...

1. Try Catch Blocks are free - Ah, the smug feeling associated with quoting Joshua Block's Effective Java #39, Use Exceptions Only for Exceptional Conditions. This Effective Java entry says to use exceptions only when an exception has truly occurred. The rationale behind this is twofold: exception handling on JVMs is slow, and the code produced is obscure and confusing. It looks like the first reason is now longer true, but I'd still say that using exceptions to exit looks is poor code because it does not reveal intentions the way normal exiting of loops does.

2. Synchronization is cheaper than bugs - In Java Concurrency in Practice, Brian Goetz describes a proper synchronization policy as one that manages concurrent access to an object's state without violating any invariants or postconditions of that object. The speed at which critical sections can be entered and exited really shouldn't effect this policy. While I'm glad that the speed of synchronization has been increased, I don't see how this increase would effect software design. Originally, this myth seemed a lot more important than I think it is.

3. Garbage Collection Works, Use Object Pools sparingly - Garbage collection on lightweight and medium weight objects (think Hashtable) is now faster than if object pools and caching were used. Only under rare circumstances should we be using something like the flyweight pattern. This is what Neal Ford and Stuart Halloway have been saying for some time... many of the GoF design patterns are recipes for Java and C++ that don't have relevance in other languages. And now some of the patterns don't have relevance in Java.

4. Enum values() calls are expensive - The enum data type is Java 5 was supposed to be syntactic sugar and hold no performance implications. It looks like if-blocks and for-loops invoke enum.values() under the covers and this method holds a significant performance hit when called repeatedly. This can be avioded by moving calls to enum.values() out of the loop. Of course, profiling data should support this move before any decision to optimize is made (Effective Java 37 - Optimize Judiciously).

Check out the original Jave Performance Myths presentation for more info!

Friday, July 13, 2007

Observer in the Wild

The Design Patterns book study group I participate in met last week and discussed the Observer Pattern for an hour.

The Observer Pattern appears frequently in GUI development, and most people are at least familiar with the Swing style MouseListener interface (or others like it). Basically, a client object declares itself to be an observer (listener) and passes itself off to an observable subject. The subject keeps a list of all observers and notifies them when some event has occurred.

Be cautious in thinking that this is a pattern that needs to be emulated. There are a lot of different ways to implement the Observer Pattern beyond this common method. In fact, as Stuart Halloway recently pointed out, some languages are sufficiently advanced (or just different) to have no need for this pattern at all.

One Java compatible alternative implementation is to not require the observers to declare a specific interface and instead use reflection to notify listeners that an event has occurred. This may sound suspect on the surface… but consider it for a moment. If a method name were invoked reflectively then there is little advantage to not declaring an interface… it just makes the Observer pattern harder to understand and harder to support in an IDE. But if method parameters are analyzed instead, then you get several advantages. One, the Observable object is not coupled to any specific client interface, and any object at all can become an observer of events. Kinda nice. Two, parameters can be analyzed and the most specific method can be called for the event. For instance, if you have a listener of cascading style sheet property events, and want to be notified when a property changes, then you may have a listener of a CSSChangeEvent… or maybe you only want a CSSFontChangeEvent, which is a more specific sub type of the change event. Traditionally, the listener would have to analyze the observable object to see if the event was /really/ want it was interested in or cast the data sent to it to perform the same check. Ugly, but totally acceptable in the Java world because that’s the way it is done. Using the parameter reflection scheme cleans up the code considerably because methods can declare specifically what types of events they are interested in and the Observable object can suppress events that the listener doesn’t care about; all while still being extremely loosely coupled with that listener. Before writing this off as a hare-brained scheme, consider that the Spring framework for .NET uses a very similar mechanism for their observer implementation. The code base I work on has a similar version also. Of course, there are a few problems with this. Method parameters may be ambiguous, and how the “most specific” method is determined must be documented somewhere… AND the programmer must somehow know to read the documentation (yeah, that’ll happen). Also, IDE support for this does not exist and things like Find Usages and automated refactorings won’t work. A final problem has to do with typing. At what point does the observable check to make sure that the listener actually has a method that can be called? The Java world adheres to the fail early/fail fast error handling principle, so the initial inclination would be to check the listener as it is registered to make sure it has an event handler method that can be called. But as Groovy gains usage, it would be possible to register a Java/Groovy object with an Observable that does NOT have valid callback methods and then later dynamically add those methods onto the object as a mixin. Far fetched, I know, but technically possible. An open discussion point from the meeting is whether or not this parameter reflection theme constitutes “Duck Typing“, but I’ll leave that for the comments section below.

A second alternative that is not Java compatible is to use closures. But when Java 7 is released it just may be part of the language! In many languages, a method is a first class citizen of the object world: it can be passed as a parameter, returned from a function, declared as a variable, and stored as an item’s state. C++, SmallTalk, JavaScript, Ruby, and Groovy all have some construct like this (plus many other’s, I’m sure). ActionScript 1.2 in Flash performed all of their event listeners like this. The variation here is that instead of creating a small, anonymous class to be a listener, you simply create a small, anonymous method.

Your code goes from:

component.addCSSChangeListener(new CSSChangeListener(){
  onFontChange(){System.out.println("font changed");}
});

to something like:

component.addCSSChangeListener({
  System.out.println("font changed");
});

Closures open up a whole lot of new programming techniques. It is a big change that may or may not be added to the language. If it does, a lot of the interface based programming that we do can be streamlined to be closure/method based. Of course, you can write FORTRAN code in any language you want, and you will always be free to use obsolete constructs with languages that don’t require them. For instance, I googled “Ruby + Observer” and found quite a few examples showing how to implement an interface based observer pattern in Ruby code. I’m sure there is a time and place where that is needed, but a language that provides closures should obsolete this pattern.

A final option, that even less languages support, is to handle observers using pattern matching functions. Some languages support what is called Pattern Matching in method parameters... so dynamic dispatch doesn’t occur simply on the object type, but based on the value of the object too. For instance, consider the following if statement:

void foo(int x) {
  if (x == 0) {
    print("zero");
  } else if (x == 1) {
    print "one");
  } else
    print(x);
  }
}

In a pattern matching language you could instead write it like this:

void foo(0) {
  print("zero");
}
void foo(1) {
  print("one");
}
void foo(int x) {
  print(x);
}

This means that you could declare your listener object to have the method expected, and have the events it wants to handle in the parameters of that method. No if statements, no casts, and no inspecting the Observable to see what event was generated. This has all the advantages to the first option discussed (reflection based on parameters) except that the language supports it instead of having to use reflection and then defining your own parameter precedence level. Also, dynamic dispatch isn’t based on solely object type, but the values of that type as well. So, using the previous CSS handler example, you could have one method that handles changes to the “Tahoma” font, one method that handles changes to any font, and one default method to handle everything else. Ultimate flexibility, and the Observerable is still loosely coupled to the observer because there isn’t even an interface declared that it is dependent on (assuming a weak type system). This is the language feature I want!

So, Java programmers (that’s you), which of these observer patterns is the most awful? Which one would have you cursing the original developer when you’re trying to maintain the code years later? Feel free to let loose...