Sunday, March 08, 2009

jSuneido Language Implementation

As the saying goes, "be careful what you wish for".

I was getting tired of the somewhat mechanical porting of code from C to Java, and I was looking forward to the language implementation because I knew it would involve some thinking.

Of course, now that I'm in the middle of thrashing around trying to figure out the best way to implement things, I'm thinking maybe there's something to be said for having a clear path!

"Mapping" one complex object (Suneido's language) onto another (the JVM) is complicated. There are a lot of choices, each of them with their pro's and con's. And you never have complete information either about how things work, or what their performance characteristics are.

Some of the choices won't be easy to reverse if they turn out to be wrong. Often, you're not even sure they were "wrong". They might turn out to have awkward parts, but then again, any design choice is likely to have awkward aspects.

One of the things to remember is that the Java language and the Java Virtual Machine (JVM) are different. Some features are implemented at the Java compiler level, and so they're not directly available at the JVM level. You have to implement them in your own compiler if you want them.

I'm also having trouble fighting premature optimization. On one hand, I want a fast implementation. On the other hand, it's better to get something simple working and then optimize it.

For example, I'd like to map Suneido local variables to Java local variables. But to implement Suneido's blocks (think closures/lexical scoping) you need a way to "share" local variables between functions. The easiest way to do that is to use an array to store local variables. But that's slower and bigger code. One option is to only use an array when you've got blocks, and use Java local variables the rest of the time. Or even more fine grained, put variables referenced by blocks in an array, and the rest in Java local variables. While faster and smaller, this is also more complicated to implement. To make it worse, when the compiler first sees a variable, it doesn't know if later it will be used in a block. So you either have to parse twice, or parse to an abstract syntax tree first. (Rather than the single pass code generation during parse that cSuneido does.)

One of the things that complicates the language implementation is that Suneido allows you to Eval (call) a method or function in the "context" of another object, "as if" it were a method of that object. This is relatively easy if you're explicitly managing "this", but it's trickier if you're trying to use the Java "this".

I'd prefer to map "directly" to JVM features like local variables and inheritance because the resulting code will be smaller and presumably faster (since JVM features will have had a lot of optimization). And presumably the closer the match, the easier it will be to interface Suneido code with Java code.

Despite the initial thrashing, I think I'm settling into a set of decisions that should work out. I already have code that will compile a minimal Suneido function and output java bytecode (in a .class file).

So far, ASM has been working well. Another useful tool has been the Java Decompiler so I can turn my generated bytecode back to Java to check it.

No comments: