Sunday, August 07, 2011

Scala Again

Recently I've been re-reading Odersky's Scala book (the new edition). I looked back at my original blog posts about Scala and was surprised to see they were over two years old. Time flies!

I set up a copy of Eclipse (3.7 Indigo) with the latest Scala plugin (2.0.0beta09) so I could play around.

One of the examples in the book (and other places) is using case classes and pattern matching to simplify algebraic equations. I do some of that in Suneido (both in the language and the database query optimization) so it's somewhat familiar. Here are the classes:


I chose two optimizations - subtraction of two constant numbers, and addition of zero


This version works for a single level:

BinOp("-", Num(2), Num(2)) => Num(0)

BinOp("+", Var("x"), Num(0)) => Var("x")

The obvious way to make it handle multiple levels is:


But this is top down, which doesn't handle cases like:

BinOp("+", Var("x"), BinOp("-", Num(2), Num(2)))

=> BinOp("+",Var("x"),Num(0))

We need to operate bottom up, which means processing a node's children before itself:


BinOp("+", Var("x"), BinOp("-", Num(2), Num(2)))

=> Var("x")

One thing that bugs me about the recursive versions is that they copy every node, even if they're not changing anything. In Suneido I do something like:

arg = simplify(node.arg)
if (arg != node.arg)
    node = new UnOp(node.op, arg)

But that seemed verbose and didn't fit with this Scala code. Instead, I added "alter" methods to UnOp and BinOp like:


so simplify only had to change slightly:


Of course, I ask myself if this is premature optimization. Obviously, for this simple test, it's unnecessary. On the other hand, it was part of my exploration.

I always struggle with this question of premature optimization. The problem is (as I've written about before) if you totally ignore performance issues, you can easily end up with a program that, for example, does several times as much allocation as it needs to. And that sub-optimal code won't be conveniently in one hot spot - it'll be spread throughout your code. Allocation in modern JVM's is very fast, but garbage collection and memory bandwidth and cache effects still cost.

The more I play with Scala, the more I like it.

No comments: