Sunday, December 27, 2009

Replacing conditional logic

A conditional dispatcher is a conditional statement (such as a switch) that performs request routing and handling. For example:

public void doAction(String action) {
    if (action.equals(UPLOAD_FILE)) {
        System.out.println("uploading file");
    } else if (action.equals(SAVE_FILE)) {
        System.out.println("saving file");
    } else if (action.equals(DEL_FILE)) {
        System.out.println("deleting file");
    }
    // etc.
}



Another variation is to use "instanceof" and then do something depending on the type of the object.
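As a minimal sketch of that variation (the request classes and the dispatcher name below are hypothetical, invented only for illustration), the conditional branches on the runtime type of the argument instead of on a string constant:

```java
// Hypothetical request types, for illustration only.
class UploadRequest { }
class SaveRequest { }

class InstanceofDispatcher {
    // Same shape as the string-based dispatcher, but keyed on the object's type.
    static String handle(Object request) {
        if (request instanceof UploadRequest) {
            return "uploading file";
        } else if (request instanceof SaveRequest) {
            return "saving file";
        }
        return "unknown request";
    }
}
```

It suffers from the same problem: every new request type means another branch in the same method.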

The two most common reasons to refactor from a conditional dispatcher are the following:

  1. Not enough runtime flexibility: Clients that rely on the conditional dispatcher develop a need to dynamically configure it with new requests or handler logic. Yet the conditional dispatcher doesn't allow for such dynamic configurations because all of its routing and handling logic is hard-coded into a single conditional statement.
  2. A bloated body of code: Some conditional dispatchers become enormous as they evolve to handle new requests or as their handler logic becomes ever more complex with new responsibilities. Extracting the handler logic into different methods doesn't help enough because the class that contains the dispatcher and extracted handler methods is still too large to work with.

The Command pattern provides an excellent solution to such problems. To implement it, you simply place each piece of request-handling logic in a separate "command" class that has a common method, like execute() or run(), for executing its encapsulated handler logic. Once you have a family of such commands, you can use a collection to store and retrieve instances of them; add, remove, or change instances; and execute instances by invoking their execution methods.

...
public enum FileOps {UPLOAD_FILE, SAVE_FILE, DEL_FILE}

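// Maps each operation to the command object that handles it.
// The type parameters let handlers.get(...) return a Handler without a cast.
private Map<FileOps, Handler> handlers = new HashMap<FileOps, Handler>();

public CommandDriver() {
    createHandlers();
}

private void createHandlers() {
    handlers.put(FileOps.UPLOAD_FILE, new UploadHandler());
    handlers.put(FileOps.SAVE_FILE, new SaveHandler());
    handlers.put(FileOps.DEL_FILE, new DeleteHandler());
}

// execute() is declared to throw IOException, so the dispatcher propagates it
public void doAction(FileOps action) throws IOException {
    Handler handler = handlers.get(action);
    handler.execute(null); // no context data is needed in this example
}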
....

Each command defines the operation to execute and accepts a context object carrying the data it needs:

public class UploadHandler implements Handler {
    public void execute(Map params) throws IOException {
        System.out.println("uploading file");
    }
}
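The runtime flexibility mentioned above (adding, removing, or changing commands) can be sketched as follows. This is only an illustration under assumptions: the Command interface and CommandRegistry class are hypothetical names, and the command returns a String rather than taking a params map so that the result is easy to inspect.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical command interface; simplified compared to the Handler above.
interface Command {
    String execute();
}

class CommandRegistry {
    private final Map<String, Command> commands = new HashMap<String, Command>();

    // Adds a command, or replaces the one already registered for this action.
    void register(String action, Command command) {
        commands.put(action, command);
    }

    void unregister(String action) {
        commands.remove(action);
    }

    // Looks the command up and runs it; no conditional logic is involved.
    String dispatch(String action) {
        Command command = commands.get(action);
        if (command == null) {
            throw new IllegalArgumentException("No command registered for " + action);
        }
        return command.execute();
    }
}
```

A client could call register("UPLOAD_FILE", ...) at startup and later register a different command under the same key, without touching the dispatching code.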


Another way to replace conditional logic is the Strategy pattern. This pattern works particularly well with IoC containers: it's the responsibility of the container to inject the strategy to use. It is also easier to test using mocks, and we can change the strategy at runtime.

public class StrategyDriver {

    private FileStrategy fileStrategy;

    public void setFileStrategy(FileStrategy fileStrategy) {
        this.fileStrategy = fileStrategy;
    }


    public void doAction(Map params) throws IOException {
       fileStrategy.handleFile(params);
    }

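    public static void main(String[] args) throws IOException {
        StrategyDriver driver = new StrategyDriver();
        // No container here, so inject a strategy by hand; without one,
        // doAction would fail with a NullPointerException.
        // Assumes FileStrategy is an interface declaring handleFile(Map).
        driver.setFileStrategy(new FileStrategy() {
            public void handleFile(Map params) {
                System.out.println("uploading file");
            }
        });
        driver.doAction(null);
    }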
}
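To illustrate the testing and runtime-swap claims, here is a minimal sketch. It assumes a simplified FileStrategy that takes a file name instead of a params map; FileService and RecordingStrategy are hypothetical names. The hand-written RecordingStrategy plays the role a mock would play in a real test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified strategy interface.
interface FileStrategy {
    void handleFile(String fileName);
}

class FileService {
    private FileStrategy fileStrategy;

    // Setter injection: a container, a test, or plain code can supply the strategy.
    void setFileStrategy(FileStrategy fileStrategy) {
        this.fileStrategy = fileStrategy;
    }

    void process(String fileName) {
        if (fileStrategy == null) {
            throw new IllegalStateException("No FileStrategy configured");
        }
        fileStrategy.handleFile(fileName);
    }
}

// Test double that records every call it receives.
class RecordingStrategy implements FileStrategy {
    final List<String> handled = new ArrayList<String>();

    public void handleFile(String fileName) {
        handled.add(fileName);
    }
}
```

A test injects a RecordingStrategy, calls process, and inspects the handled list; calling setFileStrategy again swaps the behaviour while the service keeps running.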

Friday, December 4, 2009

Redeployment pain



Did you know that the average turnaround time (for deploy/redeploy) of a typical web application is about 1 minute? This sums up to almost 1 month a year (of pure waste)!
That's a completely insane amount of waste.
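As a rough sanity check of that figure, here is a back-of-the-envelope calculation. Only the 1-minute turnaround comes from the claim above; the redeploy frequency, the length of the working day, and the number of working days per year are assumptions picked purely for illustration.

```java
class RedeployCost {
    // Returns the yearly redeploy time expressed in working days.
    static double wastedWorkingDaysPerYear(double minutesPerRedeploy,
                                           int redeploysPerHour,
                                           int hoursPerDay,
                                           int workingDaysPerYear) {
        double minutesPerYear =
                minutesPerRedeploy * redeploysPerHour * hoursPerDay * workingDaysPerYear;
        double hoursPerYear = minutesPerYear / 60.0;
        return hoursPerYear / hoursPerDay;
    }
}
```

With the assumed inputs of 5 redeploys per hour, 8 hours a day, and 230 working days, wastedWorkingDaysPerYear(1, 5, 8, 230) comes to roughly 19 working days, which is close to a working month.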
And don't forget the effort and time it takes to pick up where we left off once the application is running again, after staring at the console for a while.
So, how can we put the redeployment torture to an end?



Deploy the application in exploded format

With an exploded deployment you can edit view templates and other GUI resources in place and just hit "refresh", but this doesn't work for classes, so it is only a partial solution.

OSGi modules

By dividing the application into smaller modules we can speed up the turnaround time. Less code has to be deployed, but we still have to wait, and there's a degradation cost.

Web framework class loaders

For example, Seam supports incremental redeployment of JavaBean components. That's pretty nice but it doesn't work for entity classes (plus several other limitations) so it works partially. I think Tapestry does something similar.

JVM Hotswap

When running an application in debug mode you can "reload" changed classes. This works OK as long as you only make changes in method bodies (no new methods, no new fields). On top of that it's too slow, so it's not really a solution.

JavaRebel

I attended a Devoxx presentation about this product and was impressed. :)

It installs as a JVM plugin (-javaagent) and works by monitoring the timestamps of class files. When a class file is updated (e.g. when a developer saves a class from the IDE), JavaRebel reloads the changes to class code and structure while preserving all existing class instances.

Check http://www.zeroturnaround.com/javarebel/

I've actually tried this and it works both for JavaBean components and entities. Cool! Of course it also works for any Java SE application. Goodbye redeployment pain.

For example, for JBoss all you have to do is add the following line to run.bat:

set JAVA_OPTS=-noverify -javaagent:javarebel.jar %JAVA_OPTS%


And after copying the javarebel jar file to /bin you are ready to start your application and enjoy zero turnaround glory :)

Sunday, March 8, 2009

Performance and scalability concepts



Performance is about going fast; it has nothing to do with the application or with meeting business needs. It is just about low latency: it describes the elapsed time it takes to execute an operation.

Scalability is about adding units of operation and making a system pull more volume in order to meet increased demand.

Ideally we want both attributes in a running system. This is usually not so easy since, more often than not, scalability features will conflict with raw performance figures. It is hard to decide (up front) between scalability and raw performance (most applications do not need to scale, though).

In addition, scalability describes how a system behaves given an increasing number of simultaneous operations. Scalable performance describes a system that can scale predictably under load and can also execute operations quickly. This behavior has to be architected in.

How do we make applications that rely on databases scalably performant? A few rules define the situation:
  1. Databases are far away, usually on the other side of a serialization barrier (lots of I/O, up and down network stack)
  2. Databases are optimized for bulk calculations
  3. Databases don't scale past a single server machine

And a few things to keep in mind:
  • Minimize roundtrips: Because of 1, trips to the DB are slow. Because of 3, each hit brings us closer to the limit of the DB server.
  • Offload calculations and joins: Because of 2, the DB does these very fast. Because of 3, this could end up limiting scalability.
  • Coarse-grained transactions: Because of 1, coarse-grained boundaries and caching boost performance (make the unit of work as wide as it can be). The number of transactions goes down, which increases scalability. This also opens the door to write-behind caching and queued updates (which can potentially remove synchronicity from DB operations).
  • Effective statement batching: With coarse-grained transactions, batching boosts performance and reduces the statement execution count on the server, which results in better scalability. A good ORM can do this for you.
  • Reduce lock duration: Locking limits concurrency, which is necessary to maintain ACID properties. The problem is that locks on data force parallel data access to happen in sequence. Use optimistic locks to improve scalability!
  • Polish the DB schema: Use the DBA :)
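The optimistic locking advice can be sketched in memory as follows. This is only an analogy under assumptions: VersionedRow is a hypothetical class standing in for a table row with a version column, and its update method mimics an UPDATE statement that only matches when the version is unchanged. Nothing is locked while the caller "thinks"; the brief synchronized section corresponds to the database applying that single statement atomically.

```java
// Hypothetical in-memory stand-in for a row with a version column.
class VersionedRow {
    // Immutable view of the row as seen by one reader.
    static final class Snapshot {
        final int value;
        final long version;

        Snapshot(int value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private int value;
    private long version;

    VersionedRow(int initialValue) {
        this.value = initialValue;
    }

    synchronized Snapshot read() {
        return new Snapshot(value, version);
    }

    // Applies the write only if nobody committed since expectedVersion was read.
    synchronized boolean update(long expectedVersion, int newValue) {
        if (version != expectedVersion) {
            return false; // stale read: the caller must re-read and retry
        }
        value = newValue;
        version++;
        return true;
    }
}
```

If two callers read the same snapshot, the first update succeeds and the second returns false, so the second caller re-reads and retries instead of waiting on a lock.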

Thursday, January 1, 2009

Code with less noise

Accidental complexity

Like most developers (I guess), I often have to deal with code that other people have written. More often than not, the code that provides the real value pales in comparison to the amount of code that generates noise due to language verbosity and lack of abstraction power. Brain cycles are simply wasted focusing on stuff that provides no value because it does not add meaning (on the contrary). This is one of the side effects of accidental complexity, or of choosing the wrong tool (i.e. language/framework) for the task at hand. Which leads us to the concept of:

Noise and signal

Noise in programs: the hard to read and the tedious, a.k.a. bloat, a.k.a. accidental complexity.

Obviously, noise makes code much less meaningful. Ideally:
We want to strive to make sure every single line of code has some value or meaning to the programmer. Programming languages are for humans, not machines. If you have code that looks like it doesn’t do anything useful, is hard to read, or seems tedious, then introduce an abstraction that will let you remove it.
So then:
If a shorter program is shorter by virtue of having less accidental complexity, it’s better. It has a higher ratio of signal to noise. - Reg Braithwaite

Some code

Let's write some simple Groovy code to illustrate this.







So what?

Well, as a simple example take the line:

assert ['ULC'] == invoices.items.grep {it.total() > 7000}.product.name

In Java we would write something like:
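A rough sketch of such an equivalent follows. The Product, LineItem, and Invoice classes, their fields, and the InvoiceQuery helper are assumptions modelled on the names in the Groovy expression, not a copy of the book's listing:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes, shaped after the Groovy expression above.
class Product {
    final String name;
    final int price;

    Product(String name, int price) {
        this.name = name;
        this.price = price;
    }
}

class LineItem {
    final Product product;
    final int count;

    LineItem(Product product, int count) {
        this.product = product;
        this.count = count;
    }

    int total() {
        return product.price * count;
    }
}

class Invoice {
    final List<LineItem> items = new ArrayList<LineItem>();
}

class InvoiceQuery {
    // Collects the product name of every line item whose total exceeds the threshold.
    static List<String> productNamesOver(List<Invoice> invoices, int threshold) {
        List<String> names = new ArrayList<String>();
        for (Invoice invoice : invoices) {
            for (LineItem item : invoice.items) {
                if (item.total() > threshold) {
                    names.add(item.product.name);
                }
            }
        }
        return names;
    }
}
```

The nested loops, the temporary list, and the explicit condition are exactly the ceremony that the Groovy one-liner hides.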
I'd much rather read the one line. This is just a simple example (from Groovy in Action, by the way), but the point stands:
Less noise -> less accidental complexity -> more meaning -> less maintenance cost.

Programming is a pedagogical trait
If your language’s mechanisms for abstracting away accidental complexity are so laborious that you cannot remove the useless, the hard to read, and the tedious from your programs without introducing code that is even more useless, harder to read, and more tedious to your framework, then change languages. -Reg Braithwaite
To sum it up:
  • Code is read much more than it is written. If people can't read your story, they can't improve it or fix it. Unreadable code has a real cost. This is the famous technical debt.
  • There are many ways to write meaningful code, even in Java
  • Groovy is definitely a good option for people who would like to stop torturing themselves with native Java syntax and want to make their code easier to read while at the same time protecting the Java investment that has been made in different frameworks and tools (flat learning curve). It has the best IDE support and the syntax is very Java-like. Yes, syntax is very important.
This last point leads us to the concept of learning: new languages, new techniques, in order to work at higher levels of abstraction and thus reach higher productivity. There's no other way if you want to be among the mythical 5%. :)