Wednesday, April 08, 2009

jSuneido Progress

I've been making steady progress on the Java version of Suneido. I've been working on compiling to Java byte code and have functions with expressions and control statements pretty much finished. That leaves exceptions, blocks, and classes.

But yesterday I went back and made a fairly radical change to how jSuneido is implemented. I started making changes at 7:30 am, broke almost every test, and finally got the tests running again at 9:30 pm (with breaks for lunch and supper).

cSuneido's data types all inherit from SuValue. Everything is done in terms of virtual calls on SuValue. I followed the same pattern in jSuneido and it was working ok.

The problem is that in Java this means everything is "double wrapped", e.g. a Java string is wrapped in a Suneido SuString. This isn't twice as much memory, since the wrappers are small, but it is twice as many objects in memory. And it means every operation (e.g. string concatenation) has to create both a result object and a wrapper.

Integers aren't as bad since the wrapper can contain the primitive type. But even integers are "worse" than in cSuneido, which encodes them directly in the pointer. Needless to say, the JVM doesn't allow this sort of trick.

The other problem is that using wrappers like this makes the code a lot more "awkward". Instead of 123 or "hello" you have to write SuInteger.valueOf(123) or SuString.valueOf("hello").
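To make the double wrapping concrete, here is a rough sketch of what the wrapper style looks like. SuValue, SuString, and valueOf are the names mentioned above, but the cat and strValue methods are made up for illustration and this is not the actual jSuneido source.

```java
// Hypothetical sketch of the wrapper style; not the real jSuneido classes.
abstract class SuValue {
    // every operation is a virtual call on the common base class
    abstract SuValue cat(SuValue other);   // string concatenation (illustrative name)
    abstract String strValue();            // illustrative accessor
}

final class SuString extends SuValue {
    private final String s;                // the wrapped Java string

    private SuString(String s) { this.s = s; }

    static SuString valueOf(String s) { return new SuString(s); }

    @Override
    SuValue cat(SuValue other) {
        // two allocations per operation: the new Java String
        // and the SuString wrapper around it
        return new SuString(s + other.strValue());
    }

    @Override
    String strValue() { return s; }
}
```

Even a trivial concatenation ends up as SuString.valueOf("hello").cat(SuString.valueOf(" world")) instead of "hello" + " world".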

Another area that I was thinking about was integration with Java code. With Suneido having its own types, you'd have to wrap and unwrap values in the interface, which seems ugly.

One thing that got me thinking about this was Scala's use of implicit conversions to extend existing classes (like String or Integer) without actually storing them in a wrapper. (Although I wonder about the performance impact of wrapping on the fly.)

I started wondering whether I could drop the idea of an SuValue base class and the derived wrapper classes, and just use Java Object instead. That way I could directly use native Java types like Integer, String, and BigDecimal.

The downside of this approach is that you can't do everything with virtual calls on SuValue. Since there's no common base class (other than Object, which obviously doesn't have the methods I need) you have to resort to instanceof and getClass. The object-oriented purists would frown on this. But again, Scala (and other functional languages) opened my eyes on this a bit, since matching on "type" is quite common and accepted.

So operations like "add" have to check the types of values and handle any conversions. This is a little ugly, but it's isolated in a small amount of core code that doesn't change much. And I can't say I was sorry to drop the double dispatch I was using. It's a good technique, but I still find it confusing.
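As a rough illustration of the type checking, an "add" over plain Java objects might look something like the following. The Ops class, the helper name, and the choice to promote to BigDecimal on integer overflow are assumptions for the sake of the example, not necessarily what jSuneido does.

```java
import java.math.BigDecimal;

// Hypothetical sketch; class and method names are illustrative only.
final class Ops {
    private Ops() {}

    // Values are plain Java objects (Integer, BigDecimal),
    // so the operation has to inspect the runtime types itself.
    static Object add(Object x, Object y) {
        if (x instanceof Integer && y instanceof Integer) {
            // add as long so int overflow can be detected
            long sum = ((Integer) x).longValue() + ((Integer) y).longValue();
            if (sum >= Integer.MIN_VALUE && sum <= Integer.MAX_VALUE)
                return (int) sum;               // boxed back to Integer
            return BigDecimal.valueOf(sum);     // assumed promotion on overflow
        }
        // mixed or non-integer operands: convert both and add
        return toBigDecimal(x).add(toBigDecimal(y));
    }

    static BigDecimal toBigDecimal(Object x) {
        if (x instanceof BigDecimal)
            return (BigDecimal) x;
        if (x instanceof Integer)
            return BigDecimal.valueOf(((Integer) x).longValue());
        throw new IllegalArgumentException("can't convert to number: " + x);
    }
}
```

The instanceof checks stay inside this one class. Callers just write Ops.add(a, b) with ordinary Java values and never see a wrapper.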

Ideally, you'd test both approaches and measure speed and memory usage. Maybe if I had a team of programmers (or grad students) to assign to it. But with just me, part time, I can't see spending the time to do this. So I have to take my best guess at what the best approach is.

I decided dropping the wrappers was the way to go, so I made the switch. Better now, when I could do it in a day and not worry too much if I introduced bugs, than later, when I'd have more code and less tolerance for bugs.

So far, I'm happy with the results. There are a few isolated ugly parts, but other than that it seems cleaner. Time will tell, although there's no way to know how the other approach would have worked out, so I'll never know for sure.
