Wednesday, May 14, 2008

jSuneido - implementing classes & more

So far, so good. In two and a bit days I've written about 850 lines of Java, equivalent to about 2000 lines of C++. I've been porting the basic Suneido data types.

I'm lucky that Java is pretty close to C++, and Suneido's byte code is pretty close to Java byte code.

I'm gradually getting an idea of how to implement Suneido on top of Java and the JVM. Here's how I envision Suneido classes being compiled:
class SuClass { // ultimate base class - static, not generated
    public SuValue invoke(int method, SuValue ... args) {
        return invoke2(method, args);
    }
    public SuValue invoke2(int method, SuValue[] args) {
        if (method == DEFAULT)
            throw new SuException("method not found");
        // virtual call, so the search for Default restarts at the actual class
        return invoke2(DEFAULT, args);
    }
}

class xxx extends SuClass { // generated
    public SuValue invoke2(int method, SuValue[] args) {
        switch (method) {
        case 8346:
            return mymethod(massage(6, args, 4746, 9846, 3836));
        ...
        default:
            return super.invoke2(method, args);
        }
    }
    public SuValue mymethod(SuValue[] locals) {
        if (locals[2] == null) locals[2] = FALSE;
        invoke(8347, locals[1]); // call a method in "this"
        locals[4] = globals[123].invoke(7472, locals[0], locals[2]);
        locals[3] = new SuValue(
            JavaClass.a_static_method(locals[1].string()));
    }
    ...
}
Note: I'm showing this as Java source code, but I plan to compile directly to Java byte code.

The explanation:
  • int method is an index into the Suneido symbol table, since it's faster to dispatch on an int than on a string (and it also makes the compiled code smaller)
  • a variable number of arguments are received into the args array
  • a switch is used to dispatch to the correct method; in this example 8346 is the symbol index of "mymethod"
  • Suneido's extra argument passing features are implemented by "massage", which also allocates an array for the local variables of the method; the "6" is the size of this array, and the remaining (variable number of) arguments are symbol indexes for the method's parameter names (required to handle named/keyword arguments)
  • if the method is not found in the current class, invoke2 is called on the parent class; if this ends up at the ultimate base class (SuClass), Suneido's Default (method-not-found) method is looked up the same way; if that lookup also ends up back at SuClass, an exception is thrown
  • methods receive a "locals" array containing the arguments in the first part of the array
  • default argument values are compiled to code; in this example the default value of the third argument is false
  • classes and functions are stored in the globals array
  • blocks are compiled into separate methods, with context passed via the locals array (this is one of the reasons for using the locals array instead of native Java local variables)
  • Java classes can be called directly, with the appropriate conversions
This should be relatively fast - method dispatch is just a couple of extra function calls and a switch - no reflection or table lookup. Of course, methods deep inside class hierarchies will take longer due to the "chaining".
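To make the chaining concrete, here is a minimal runnable sketch of the same dispatch scheme. It is only an illustration under simplifying assumptions: Object stands in for SuValue, the class names (DispatchSketch, Base, Greeter, LoudGreeter) and symbol numbers are made up, and this massage only copies positional arguments, ignoring the named-argument handling described above.

```java
// Hypothetical sketch of int-based dispatch; none of these names come from Suneido.
class DispatchSketch {
    // Symbol indexes as a compiler might emit them (literals, so they can be case labels).
    static final int DEFAULT = 0, GREET = 1, SHOUT = 2, MISSING = 3;

    // Simplified massage: allocate the locals array and copy positional arguments in.
    static Object[] massage(int nlocals, Object[] args) {
        if (args.length > nlocals)
            throw new RuntimeException("too many arguments");
        Object[] locals = new Object[nlocals];
        System.arraycopy(args, 0, locals, 0, args.length);
        return locals;
    }

    static class Base { // plays the role of SuClass
        Object invoke(int method, Object... args) {
            return invoke2(method, args);
        }
        Object invoke2(int method, Object[] args) {
            if (method == DEFAULT)
                throw new RuntimeException("method not found");
            return invoke2(DEFAULT, args); // virtual call: restarts at the actual class
        }
    }

    static class Greeter extends Base { // plays the role of a generated class
        @Override
        Object invoke2(int method, Object[] args) {
            switch (method) {
            case GREET:
                return greet(massage(2, args));
            default:
                return super.invoke2(method, args); // chain to the parent class
            }
        }
        Object greet(Object[] locals) {
            if (locals[1] == null) locals[1] = "!"; // default value for the second parameter
            return "hello " + locals[0] + locals[1];
        }
    }

    static class LoudGreeter extends Greeter { // one level deeper in the hierarchy
        @Override
        Object invoke2(int method, Object[] args) {
            switch (method) {
            case SHOUT:
                return shout(massage(1, args));
            case DEFAULT:
                return "no such method"; // this class supplies its own Default handler
            default:
                return super.invoke2(method, args);
            }
        }
        Object shout(Object[] locals) {
            String name = String.valueOf(locals[0]).toUpperCase();
            return invoke(GREET, name, "!!!"); // call an inherited method on "this"
        }
    }

    public static void main(String[] args) {
        System.out.println(new Greeter().invoke(GREET, "bob"));     // hello bob!
        System.out.println(new LoudGreeter().invoke(SHOUT, "bob")); // hello BOB!!!
        System.out.println(new LoudGreeter().invoke(MISSING));      // no such method
    }
}
```

Calling an unknown method on Greeter, which has no Default handler, falls through both switches to Base and throws. Inherited methods cost one extra switch and call per level of the hierarchy.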

So far I'm using unchecked exceptions, partly because it's simpler and partly because that's what I'm used to from C++. The traditional advice (e.g. from Sun) is to use checked exceptions, but this is starting to be questioned.
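For readers coming from C++, the practical difference looks roughly like this. It's a small sketch with made-up names (ExceptionSketch, CheckedSuException, lookup); the SuException here is just a stand-in that extends RuntimeException, not necessarily how the real class is defined.

```java
// Hypothetical sketch contrasting unchecked and checked exceptions.
class ExceptionSketch {
    // Unchecked: extends RuntimeException, so callers need no throws clause.
    static class SuException extends RuntimeException {
        SuException(String message) { super(message); }
    }

    // Checked: extends Exception, so every caller must catch it or declare it.
    static class CheckedSuException extends Exception {
        CheckedSuException(String message) { super(message); }
    }

    static Object lookup(String name) {
        throw new SuException("method not found: " + name);
    }

    static Object lookupChecked(String name) throws CheckedSuException {
        throw new CheckedSuException("method not found: " + name);
    }

    // Intermediate code stays clean with the unchecked version...
    static Object callThrough(String name) {
        return lookup(name);
    }

    // ...but must repeat the throws clause (or wrap) with the checked one.
    static Object callThroughChecked(String name) throws CheckedSuException {
        return lookupChecked(name);
    }

    public static void main(String[] args) {
        try {
            callThrough("Foo");
        } catch (SuException e) {
            System.out.println("unchecked: " + e.getMessage());
        }
        try {
            callThroughChecked("Foo");
        } catch (CheckedSuException e) {
            System.out.println("checked: " + e.getMessage());
        }
    }
}
```

With the int-based dispatch above, a checked exception would have to appear in the throws clause of every invoke and invoke2, which is part of why the unchecked version is simpler.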

I'm using the standard Java indent/curly style. It hasn't been a problem but it's a bit of a transition after 25 years of C and C++ (and Suneido) using the Whitesmiths style.

I did figure out how to configure the Home and End keys to be beginning and end of line in Eclipse. That's one thing I miss on the Mac. The "standard" appears to be Apple + left/right arrow, but I find that more awkward. It also doesn't appear to be supported everywhere, and those keys are sometimes used for other purposes like switching OS X virtual desktops.

I notice that C++0x is proposing another approach to the problem I talked about in my last post. I suggested replacing:
Type var = new Type(...);
with:
Type var = new(...);
For C++0x they are proposing:
auto var = new Type(...);
This is better than my suggestion because it also allows things like:
auto var = func(...);
where the type of var is taken from the type of the return value of func.
