Thursday, June 26, 2008

jSuneido - JavaDoc

I've posted the JavaDoc for jSuneido so far:

http://www.suneido.com/jSuneido/doc/

I haven't spent a lot of time on JavaDoc comments, but there's already more than I had for the C++ version.

Reminder - you can browse the source at:

http://suneido.svn.sourceforge.net/viewvc/suneido/jsuneido/


Or check it out using Subversion from:

http://suneido.svn.sourceforge.net/svnroot/suneido

Friday, June 20, 2008

jSuneido - server (yet again)

Ok, so the plan is to use Java's Channel, Selector, and Executor classes to build a Reactor-style server. How hard can it be :-)
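
A bare sketch of the shape I have in mind (the handle method is hypothetical, there's no error handling, and managing interest ops from worker threads is exactly where the trickiness comes in):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReactorSketch {
    public static void main(String[] args) throws IOException {
        ExecutorService workers = Executors.newCachedThreadPool();
        Selector selector = Selector.open();
        ServerSocketChannel listener = ServerSocketChannel.open();
        listener.socket().bind(new InetSocketAddress(3147)); // port just for illustration
        listener.configureBlocking(false);
        listener.register(selector, SelectionKey.OP_ACCEPT);
        while (true) {
            selector.select();
            for (Iterator<SelectionKey> iter = selector.selectedKeys().iterator(); iter.hasNext(); ) {
                SelectionKey key = iter.next();
                iter.remove();
                if (key.isAcceptable()) {
                    SocketChannel channel = listener.accept();
                    channel.configureBlocking(false);
                    channel.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // stop selecting this channel while a worker is busy with it
                    key.interestOps(0);
                    final SelectionKey k = key;
                    workers.execute(new Runnable() {
                        public void run() {
                            handle(k); // hypothetical - read the request, process, write the response
                            k.interestOps(SelectionKey.OP_READ); // re-enable reads
                            k.selector().wakeup(); // interest ops were changed from another thread
                        }
                    });
                }
            }
        }
    }

    static void handle(SelectionKey key) {
    }
}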

The problem is multi-threading concurrency. Why isn't that surprising?

It turns out that using Selector with multiple threads requires some tricky locking. At first, looking at Architecture of a Highly Scalable NIO-Based Server, it didn't seem too bad.

But then, I found How to Build a Scalable Multiplexed Server With NIO Mark II (large pdf) and it started to seem harder. I downloaded his source code and at least it was small enough to be comprehensible. Full frameworks tend to be too big to easily grasp.

At first I was just going to use his code as an example. But after a half hour of copy and paste I came to my senses. Although I was simplifying it a little, I was basically ending up duplicating his code. Not a good plan. Especially since his code is still "preliminary" and could change.

I threw out what I'd done and dropped his code into my project. And so far it's working. We'll see.

Wednesday, June 18, 2008

Antlr and Eclipse

I just finished reading The Definitive Antlr Reference published by Pragmatic Programmers. Antlr is a parser (& lexer) generator. It's written in Java but it supports other target languages like C++, Ruby, and Python. It looks like a pretty cool system.

I have an interest in this area that goes a long way back. In the early 80's I reverse engineered Yacc to create my own look-alike. (Since there was nothing available on MS-DOS.) I didn't have the source code so it was an interesting exercise to feed different inputs to a "black box" and see what the outputs were. I only had a crude understanding of how Yacc worked but I was successful enough that I was able to use my program to generate parsers for several applications. (Later, I switched to the version of Yacc that came with the MKS Toolkit.)

Although I'd seen some mentions of Antlr, I probably wouldn't have got very interested if the book hadn't come out. Although there's information on the web, I still find a good book the best way to learn something new. And a book adds a kind of "legitimacy" to things.

When I wrote Suneido I chose to hand write a recursive descent parser. Coincidentally, unlike Yacc (and the similar GNU Bison), Antlr generates ... recursive descent parsers.

I wondered if I could use Antlr to replace Suneido's hand written parser. I installed the AntlrWorks gui tool and quite quickly managed to create a grammar for Suneido's database language.

Next, I wanted to try to incorporate it into my Eclipse project. I found a plugin and installed it and added my grammar to the project ... and got a bunch of weird errors. Argh!

After a bunch of messing around I realized that the Eclipse plugin I was using was for the old version of Antlr. The next problem was how to uninstall the plugin. This isn't as obvious as you'd think. It's under Help > Software Updates > Manage Configuration. That gives you options for updating, disabling, and showing the properties, but not uninstall. After some further poking around I discovered Uninstall is on the right click context menu on the tree. I'm not sure why it's not listed under Available Tasks. Maybe because it doesn't work very well. I chose to Uninstall the Antlr plugin. But I couldn't get rid of the grammar errors. And when I looked, there were still Antlr related files hanging around. I cleaned up manually as best I could.

I found a new plugin and tried to install it. It had dependencies on another package - the Eclipse Dynamic Languages Toolkit (DLTK). I couldn't find an update site for the specific version it needed so I had to install manually. Then there was a dependency on the Graphical Editing Framework (GEF) so I had to install that. Then I could finally install the new Antlr plugin. And finally get my grammar to compile.

Even better, if I want this on my other computers, I have to go through the same process to install all the dependencies. Yuck.

All in all, it was a pretty frustrating process and it soured me a bit on Antlr, although it's really nothing to do with Antlr itself. I can blame some of it on Eclipse, but it's more just a result of a rich ecosystem of third party plugins. I guess you have to expect some problems.

The real test would be to implement the main Suneido language grammar in Antlr. I think this could be a little tricky because Suneido has optional semicolons, which make some newlines significant (but the rest of the time they're ignored). It was tricky enough kludging this in the hand written parser. The Antlr book does give an example of how to parse Python-like languages where indenting whitespace is significant, although that's not quite the same problem. There's also a simplified Ruby grammar on the Antlr website that might help. (Since Ruby has similar optional semicolons.)

I'm still not convinced that using Antlr is the way to go. Since it generates recursive descent parsers, it's much easier to convert the grammar from the existing parser (than it would be to use Yacc or Bison). But it also wouldn't be hard to port the existing parser directly to Java. An explicit grammar is certainly easier to modify, but the language grammar doesn't change a lot.

Tuesday, June 17, 2008

jSuneido - Server (again)

First I thought the socket server portion of Suneido would be easy in Java. Then I realized it wasn't going to be as easy as I thought. Now, after reading Java Concurrency in Practice (recommended), I'm back to thinking it'll be "easy" (relatively).

I won't need to use Grizzly or MINA or any other library/framework. The key is the Executor framework, which was introduced in Java 5. I can use Executors.newCachedThreadPool to process requests.

This will result in (pooled) thread-per-request which is what I want. So if there are 50 users connected, but only 4 requests active at a given time, there will only be 4 worker threads. Actually, it's a bit more complicated because inactive threads are kept around for a while (cached) in case they're needed.

CachedThreadPool uses an unbounded number of threads. This isn't so good for uncontrolled situations because it can lead to problems like denial of service attacks. But for the situations where Suneido applications are used - on local area networks - it should be a reasonable choice.
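
A toy demo of the caching behavior (not server code - it just shows that when idle threads are available, new tasks reuse them instead of creating more; by default idle threads are kept for 60 seconds):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CachedPoolDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        for (int i = 0; i < 10; ++i) {
            pool.execute(new Runnable() {
                public void run() {
                    // with short tasks and a gap between submissions,
                    // the same one or two pool threads show up every time
                    System.out.println(Thread.currentThread().getName());
                }
            });
            Thread.sleep(100);
        }
        pool.shutdown();
    }
}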

It doesn't use the Leader/Followers pattern but it does use a SynchronousQueue to pass requests from the "dispatcher" to the worker threads. This passes the requests "directly" from one thread to another, avoiding the overhead of adding requests to and removing them from a normal queue.

I find this flip-flopping between optimism and pessimism quite common, both on small scales and on larger scales. Learning often fits into one of two categories - learning how to do things, or learning that things are more complex than you thought.

Friday, June 13, 2008

jSuneido - concurrency strategies

I've been thinking about concurrency as I make progress on porting the btree indexing code (inserting and iterating are working).

The simplest, obvious way to handle concurrency would be to lock entire indexes. You could argue that there are lots of tables and each table has multiple indexes, so usage should be spread out enough to minimize contention. Unfortunately, the 80-20 rule (or worse) applies and activity tends to focus on particular tables and indexes.

Since there is a lot more "read" activity than "write" activity, an obvious improvement would be to use "many readers / single writer" locking.

There are strategies for locking individual index nodes in btrees, but you have to be very careful on the order of locking or you end up with deadlocks.

Most btree updates only update a single node - it would be nice if those cases just locked that one node. This would definitely reduce contention. But some updates require changes to more of the tree and then it makes more sense to lock the entire index. I think I can reconcile these by having single node updaters first acquire a read lock on the entire index and then write lock the single node. The read lock will prevent anything else from acquiring a write lock on the entire index.

A complication is that you don't know if it will be a single node update until you get there. But if you discover that it will require larger scale changes, then you can upgrade the whole index read lock to a write lock.

This approach should be deadlock free since you're always locking in a fixed order - index as a whole, and then a single node.
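
A rough sketch of the locking order with java.util.concurrent.locks.ReentrantReadWriteLock (Slot, Node, and the find/insert methods are just stand-ins; note also that ReentrantReadWriteLock can't upgrade a read lock in place, so the "upgrade" has to release, re-acquire, and redo the work):

import java.util.concurrent.locks.ReentrantReadWriteLock;

class Slot {
}

class Node {
    boolean hasRoomFor(Slot slot) { return true; }
    void insert(Slot slot) { }
}

class BtreeLockingSketch {
    private final ReentrantReadWriteLock indexLock = new ReentrantReadWriteLock();

    void insert(Slot slot) {
        indexLock.readLock().lock(); // blocks whole-index writers, allows other readers
        try {
            Node leaf = findLeaf(slot);
            synchronized (leaf) { // "write lock" just this one node
                if (leaf.hasRoomFor(slot)) {
                    leaf.insert(slot);
                    return;
                }
            }
        } finally {
            indexLock.readLock().unlock();
        }
        // the node was full, so larger scale changes are needed -
        // "upgrade" by taking the whole-index write lock and starting over
        indexLock.writeLock().lock();
        try {
            insertWithSplit(slot);
        } finally {
            indexLock.writeLock().unlock();
        }
    }

    Node findLeaf(Slot slot) { return new Node(); } // stand-in
    void insertWithSplit(Slot slot) { } // stand-in
}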

There is, of course, another problem - there are no persistent in-memory objects for index nodes. They are mapped in and out of memory as needed. So it doesn't do any good to lock your node object because it's just a temporary object and other threads will have their own. (Locking the entire index is ok because there is a persistent shared object for the index as a whole.)

I could cache the node objects (with weak references so they could still be garbage collected) but then the cache would need synchronization.
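
Something like this, maybe (re-using the Node stand-in from the sketch above; readNode is hypothetical, and a real version would also have to purge cleared references now and then):

import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

class NodeCache {
    private final Map<Long, WeakReference<Node>> cache =
            new HashMap<Long, WeakReference<Node>>();

    synchronized Node get(long offset) {
        WeakReference<Node> ref = cache.get(offset);
        Node node = (ref == null) ? null : ref.get();
        if (node == null) {
            node = readNode(offset); // hypothetical - map in and build the node
            cache.put(offset, new WeakReference<Node>(node));
        }
        return node;
    }

    private Node readNode(long offset) {
        return new Node(); // stand-in
    }
}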

Or I could use OS file locking but I've had bad experiences with that in the past.

Another option would be to associate a lock object with each memory mapped "section" and lock at that level. But chunks are 4 MB so this would be pretty coarse-grained locking. (Index nodes are only 4 KB so you could potentially be locking a thousand of them.)

I'm busy reading concurrency books, maybe one of them will give me a better idea. To start I can always just use many-readers / single writer locking on the index as a whole.

Tuesday, June 10, 2008

jSuneido - Bye Bye Generics

Sometimes constraints are a good thing. They force you to look for alternatives. To think harder.

I started out by doing a fairly simple translation of C++ templates to Java generics in the Suneido btree code.

I immediately ran into problems because Java generics are very different from C++ templates.

I then proceeded to thrash and flail, trying one approach, abandoning it, throwing out the code, starting over with another approach. (This doesn't lead to much progress in terms of lines of code!)

The C++ btree design was fairly complicated. There were four Slot classes and four corresponding Slots classes. The Btree template was instantiated with four different combinations of these, for various purposes. Within Btree the slots were wrapped by LeafNode and TreeNode classes.

As I worked on it, I realized that three of the Slots classes were virtually identical. And on top of that, they were virtually identical to the Record code.

The fourth Slots/Slot classes were used in a single place in a specialized way that is much more suited for a hash table than a btree. At the time, I was probably so wrapped up in btrees that I didn't even think twice about it.

So in Java I've ended up with a non-generic Btree class, a single Slots class (that mostly delegates to the record class), and a single Slot class. Three classes instead of 10 or so in the C++ code. I have lost a little bit of type safety, but these classes are used in a pretty restricted way so this shouldn't be a problem. And I'll have tests anyway :-)

To be fair, some of the C++ use of templates was motivated by performance. Using templates avoids the overhead of virtual method calls. Since I never tried it without the templates it's hard to say whether that was justified.

Working with Java is forcing me to change my mindset a little. Coming from a time when memory, cpu, and disk resources were severely limited, and working on systems like Suneido where performance was important, I tend to always have "efficiency" in mind as I design.

So I'd try to avoid allocation, for example by putting objects on the stack instead of the heap. But you can't do that in Java.

I also worked hard in Suneido to make the mapping from database to memory as direct as possible. By carefully designing classes I could store them directly on disk, and with memory mapping access them directly without any reads/writes or translation. Again, this is a lot tougher in Java with no pointers. I can do a certain amount with memory mapped ByteBuffers but it's definitely not as direct. On the positive side, the Java code is ending up a lot simpler than the C++ code. Java just doesn't provide the same opportunities for "clever" tricks.
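
To illustrate: where the C++ code can cast a pointer into the mapped region and use a struct in place, in Java every field access is an explicit get against the buffer. A rough example (the file name and offsets are made up):

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedRead {
    public static void main(String[] args) throws Exception {
        RandomAccessFile file = new RandomAccessFile("suneido.db", "r");
        MappedByteBuffer buf = file.getChannel().map(
                FileChannel.MapMode.READ_ONLY, 0, Math.min(4096, file.length()));
        // no casting the mapping to a struct - each field is a separate get
        int nfields = buf.getShort(0); // offsets here are purely for illustration
        long offset = buf.getLong(2);
        System.out.println(nfields + " " + offset);
        file.close();
    }
}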

It'll be interesting to see what effect this has on the resulting performance. You might think that since hardware has improved so much it doesn't matter. But the two versions are going to be compared side by side and if the Java version is slower, it'll be a hard sell. Although, if I can get the multi-threading working reasonably well, that will give the Java version a big advantage, even if it ends up slightly slower on a single core.

I think it's also going to be beneficial to be forced to not only read the C++ code, but to understand it. I've already discovered several sections in the code that turned out not to be used at all. (Nothing major, but still "junk".) And although I'm not "finished", it's looking pretty clear that revisiting the btree code is going to lead to simplifying it a lot.

Monday, June 09, 2008

jSuneido - Server

I may have been a little premature when I said "writing a multi-threaded socket server seems a lot simpler in Java than in C++".

The examples I looked at that were so simple were for thread-per-connection. This is a reasonable model for low volume or for short connections (i.e. one request per connection).

But when you have persistent, intermittently active connections as Suneido does, then thread-per-connection doesn't scale very well. There are several problems - one is that each thread consumes resources so if you have a lot of inactive threads waiting for intermittent requests you tie up unnecessary resources. Another problem is that a thread that's been inactive will no longer be in the cpu caches and may even have been swapped out to disk, especially when resources are tight. This makes it slow to respond and can lead to thrashing.

One of the alternatives is the Reactor pattern. I was planning to use the ACE reactor implementation for the C++ version. The reactor model uses a smaller pool of threads that carry out requests, not tied to specific connections.

I could write my own reactor in Java, but it's tricky. The basic idea is simple, but getting the multi-threading right and getting good performance aren't so simple.

I may use Grizzly - a Java implementation of the reactor pattern, but it doesn't appear to support the Leader/Follower pattern.

The "obvious" approach to implementing a reactor is to have one "dispatcher" thread that listens for requests and a pool of worker threads to handle them. But this "forces" at least one context switch per request, not so good for "short" requests. With the Leader/Follower pattern there is no dispatcher - threads take turns accepting a request and then handling it. This avoids the dispatching context switch. If the operating system supports it, all the non-busy threads can simultaneously "wait" on all the connections and the OS will give an incoming request to one of them. If the OS doesn't support this, then one of the non-busy threads is chosen as the "Leader" and accepts the next request, and then another thread is promoted to "Leader".

Leader/Follower may not be that important for Suneido since most requests are large enough that the context switch overhead is probably insignificant.

There is also the Apache MINA network application framework. It seems a little "heavier" than what I need.

Sunday, June 08, 2008

jSuneido - C++ Templates versus Java Generics

I'm currently working on porting Suneido's btree code from C++ to Java.

It's challenging because the C++ version makes heavy use of templates (plus a few other pointer type tricks).

If I look at the C++ code one way it seems really elegant, but if I look at it another way it seems really twisted.

The problem is that Java generics are very different from C++ templates. I know that people complain about Java generics, but just using the standard generic collection classes it seems fine. It's only when you try to write a more complex generic class that you start to see the reason for the complaints.

Java generics work by "type erasure". In C++ terms this would be like compiling all your templates down to a single void* version. This had the big advantage that it was backward compatible and didn't require any VM changes. (It also results in smaller compiled code.) The disadvantage is that you "lose" all the type information at run-time.

For example, a generic class can't do new T() because T has been "erased" (to Object).

The way around it is to create new objects via an instance. This instance can be the class object - Class<T> - in which case you use newInstance from the reflection API. Or it can be a "factory" object. Unfortunately, this instance has to be passed in (since the whole problem is that generic classes can't instantiate parameter types).
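
Both workarounds end up looking something like this (names made up - a container that needs to create instances of its parameter type):

// passing in the Class object and using reflection
class NodeList<T> {
    private final Class<T> clazz;

    NodeList(Class<T> clazz) {
        this.clazz = clazz;
    }

    T createElement() throws InstantiationException, IllegalAccessException {
        return clazz.newInstance(); // needs a public no-arg constructor
    }
}

// or passing in a factory object
interface Factory<T> {
    T create();
}

class NodeList2<T> {
    private final Factory<T> factory;

    NodeList2(Factory<T> factory) {
        this.factory = factory;
    }

    T createElement() {
        return factory.create();
    }
}

So the caller ends up writing new NodeList<Slot>(Slot.class), or supplying a little factory object, where the C++ template just says new T().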

This will all be old hat to experienced Java programmers. And it wouldn't be a big issue for me, except that I'm trying to port C++ templates that have no problem instantiating parameter types ...

Java Books

Here are the Java books I've bought in an attempt to go from 0 to 100 in a matter of weeks. Some of them are fairly specific to my needs, like the virtual machine internals.

Saturday, June 07, 2008

Java Version for jSuneido on Eclipse on Mac

The last time I opened up my jSuneido project in Eclipse on my Mac at home I had an error. But I hadn't had an error on Windows at work.

It turned out I was using a method (Arrays.copyOfRange) that was introduced in 1.6

The default for the Mac still appears to be Java 1.5

To "fix" this go into Eclipse > Preferences > Java > Installed JREs and pick 1.6

PS. Don't forget to enable asserts in the newly chosen JRE (like I did!)

Friday, June 06, 2008

Fear

Fear is often a good thing. It makes us think twice. Ideally it can reduce mistakes.

But handled the wrong way, fear can backfire, it can be self-fulfilling.

For example, let's say you're afraid of upgrading your software. That's understandable - software upgrades can be painful.

But how you handle the fear is important. It's positive if it encourages you to be cautious, to test the upgrade first, to upgrade in a way you can rollback, to upgrade at a slow time when you'll have time to recover from any problems, to wait for version 2.1 instead of 2.0

Instead, you might let your fear stop you from upgrading. In the short term this saves you work and pain. But in the long term, you're going to have a problem. The longer you go without upgrading, the scarier it gets. Before you know it you're three versions behind and your version is no longer supported. Now you are forced to upgrade, and now you're pretty much guaranteed to have problems.

Or let's say you're afraid of your server crashing because you're not sure you know enough to re-install and re-configure it. Yikes! That's cause for fear. But the answer is not to cross your fingers. The answer is to solve the problem - get a spare machine, learn how to re-install and re-configure. Do a test run - pretend your server has crashed and go about re-building it.

Face your fears, use them, don't avoid them (or ignore them).

(Note: I'm not saying you should rush to do upgrades - that's just an example.)

Monday, June 02, 2008

RailsConf - Day 4

I started off the day with "The Worst Rails Code You've Ever Seen". Some of this was pretty basic mistakes. But not having written any Rails code myself, I couldn't even see what was wrong with some of the examples, let alone how to fix them. I'm sure it was obvious to most people.

The next session was supposed to be about Scaling Rails from the Inside Out by Engine Yard, but instead he talked about a new system for distributed application "cloud" computing. Interestingly, the system was based on XMPP (the Jabber instant messaging protocol) and much of it is written in Erlang.

Scaling was a common theme in the conference. Next I went to a talk by RightScale about using Amazon EC2. Pretty neat stuff. He talked about one case where a company got suddenly popular on Facebook and scaled from 80 servers to 3500 servers in a matter of days (although not without problems along the way!)

My last session was an open Q&A with four of the JRuby developers. Quite interesting, although not as much internals as I'd like. But then again, most people don't care so much about that end of things.

The conference wrapped up with a Q&A session with four of the Rails core developers.

Overall, a good conference, even though I'm not a Rails developer!

Sunday, June 01, 2008

RailsConf - Day 3

The initial keynote was by Jeremy Kemper on Rails 2.1 - some interesting new stuff.

Next I went to a session on Git. Git is a new version control system. In this crowd it seems to be rapidly replacing Subversion. It seems like not that long ago that SourceForge got Subversion; I wonder how long it'll be before they support Git? (However, it is possible to use Git together with Subversion.) It was a high speed overview of Git, but it did cover a little about how it works, which I always like to know. I've been thinking about improving Suneido's version control system. Something like Git might solve a lot of the problems we have.

Then a session on Rails deployment with jRuby. jRuby on Rails lets you take advantage of the multi-threading in the JVM so you can run multiple Rails instances within one VM (rather than the conventional multiple processes.) There's a new tool called Warbler that makes it really easy to do the deployment on jRuby.

And if you want to make deployment really easy you can use Heroku - a Rails "hosting" solution that uses Amazon EC2 and S3 to scale up and down as your traffic requires. Very cool. We use S3 and have been pretty happy with it. Currently we're running our own Rails server for eTrux, but it's a hassle and it would be nice to have it hosted.

Then another session on jRuby by Ola Bini - more general but still interesting

My final session was a comparison of Ruby testing frameworks, primarily Test/Unit, Rspec, and Shoulda. I thought I might pick up some new ideas but they're all pretty similar, nothing too exciting.

Kent Beck gave the keynote at the end of the day. He talked about three "big ideas" he's been involved in - patterns, test driven development, and XP/Agile. I've heard him speak before and he's worth listening to.

When I first started going to conferences a few years ago I felt under-privileged because I didn't have a Mac laptop like the cool people. Now it's not just the cool people - it's virtually everyone that has a Mac laptop - even me! But I still feel under-privileged because now everyone has MacBook Pros and I only have a regular MacBook. One guy that sat down next to me even commented "oh, you have one of those old Macs". (Actually, they're not that "old" - they're still being sold, but I got the point.) There weren't many of the new MacBook Air laptops - they're probably a little underpowered for this kind of programmer crowd.

I'm enjoying this conference more than I expected. The sessions have been varied enough that I can find stuff that I'm interested in. It's a pretty young group - both the presenters and the attendees. And there seems like a lot of energy and enthusiasm.

One thing that has struck me is just how rich and varied the software ecosystem is becoming. So many products - several in every niche it seems. Maybe that's partly a result of how much lower the barrier to entry is. One or two people can produce a lot these days. And with the internet, they can spread the word about what they've done. It makes it a little confusing, but the end result is that there is an amazing amount of amazing stuff out there.

Friday, May 30, 2008

RailsConf - Day 2

We started off with Joel Spolsky who is an entertaining speaker, but unfortunately doesn't have as much real content in his talks as he does in his blog.

Next I went to a talk on IronRuby (Ruby on the .NET CLR). They seem to be making good progress. I would have liked to hear more about the details of their implementation. It sounds like performance is a big issue - they are currently about 10 times slower than MRI (Matz's Ruby Interpreter - the original C implementation of Ruby).

Then a talk by the developer of Limelight. I wish he'd spent more time on Limelight and less on his "10 Things I Hate About Developing Web Apps". Limelight is a "runtime" like Adobe AIR that uses a Ruby DSL to make "web like" applications but in a "cleaner", simpler way than HTML + CSS + JavaScript. (Sorry, I can't seem to find a link.)

I went out at lunchtime and got back just before the sessions were due to start, only to find the ones I wanted to see were all full. This was ironic since in the morning introduction they'd made a point of saying they hadn't sold out and didn't want to sell out because they wanted everyone to get in. I guess I should have got in line earlier. Oh well, I grabbed a latte and worked on jSuneido.

The last two sessions were on Rubinius and MagLev - two more implementations of Ruby.

Rubinius' aim is RiR - Ruby In Ruby - i.e. trying to implement as much of Ruby in the Ruby language rather than in C like MRI. I was interested to hear they are planning/hoping to use LLVM to optimize their compiler. (I'd looked at this for Suneido.) I found it a little puzzling that one of their reasons for not using the StrongTalk VM was they couldn't understand the source code. But I seriously wonder if the LLVM source code is any easier to understand. If you're going to use another project because you want to skip the effort of writing it yourself, then I don't think it's fair to expect to "understand" it. You have to expect a certain amount of "black box".

MagLev is Ruby on top of GemStone's proprietary Smalltalk VM, including their impressive object persistence facilities. MagLev is claiming pretty amazing performance from their VM - up to 100 times faster than MRI. This seemed to raise some issues with the jRuby guys. Charles Nutter asked a question which came pretty close to saying, "if it doesn't work, you can make it as fast as you want" (MagLev doesn't implement all of Ruby yet.) I wonder if the GemStone VM is really much faster than the JVM? Not that it matters in the big picture because the JVM is everywhere whereas GemStone is nowhere (relatively).

You may notice that although this is RailsConf, none of the sessions I went to were on Rails! It suited me fine since it matched my interests, but I felt a little guilty. Of course, the one session that I couldn't get into because it was full was an actual Rails one. There is of course a Rails tie in to these sessions. It's Rails that has made Ruby so popular and the reason for all these implementations of Ruby is primarily to run Rails.

All in all, a pretty good day. The weather was beautiful, of course, since I was inside all day!

Thursday, May 29, 2008

RailsConf - Meta-programming Ruby

My tutorial this morning was on meta-programming Ruby.

The second half by Patrick Farley was about the implementation of Ruby and how it enables and affects meta-programming. With my interest in implementing programming languages this was right up my alley.

The first half, by Neal Ford, I didn't like as much. He spent some of his time bad mouthing other languages - easy prey but not very productive. I wasn't too impressed with some of his examples showing why Ruby is better - they didn't seem to be comparing apples to apples.

For example, he compared a Java static (class) method for determining if a string is empty to the Ruby empty instance method. Because it's a static method the Java class has to check for null. The Ruby method doesn't need to - not because Ruby is better, but because it's an instance method. A Ruby class method would not only have to check for nil, but also for completely different types. The Java method loops through the characters and checks if they are whitespace. The Ruby one just calls strip to remove the whitespace. I'm not sure if the Ruby strip creates a new string or modifies the instance (Ruby strings are mutable) but either way that's quite different from the read-only Java version. A Ruby version that was equivalent to the Java version would be much the same code, as would a Java version that was equivalent to the Ruby version.
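
To make the comparison concrete, the Java side was roughly this kind of static helper (my reconstruction, not the presenter's actual code):

public class StringUtil {
    // a static helper has no receiver, so it has to guard against null itself
    public static boolean isBlank(String s) {
        if (s == null)
            return true;
        for (int i = 0; i < s.length(); ++i)
            if (!Character.isWhitespace(s.charAt(i)))
                return false;
        return true;
    }
}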

Another example was using Ruby meta-programming to dynamically create test methods for a list of cases. But rather than impress me, it just made me ask why? Why not just loop through the cases in a single test method - no fancy meta-programming required.

The funny part is that Ruby does have definite advantages for some things, but the examples didn't really show that. Maybe it's just hard to come up with good examples.

Of course, at RailsConf they're mostly preaching to the converted, so everyone just laps it up. Not being a particular Ruby/Rails fanatic and being involved with other languages, I'm a little more critical.

I'm also not totally convinced about the whole meta-programming thing. Yes, it allows some pretty amazing things. But it also allows some things that are very twisted, hard to comprehend and debug.

Still, it was a good session containing some pretty interesting material.

Benchmarks

One of the optimizations I saw mentioned for JVM languages is pre-creating small integer objects, e.g. from -128 to +128.
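
The idea is roughly what java.lang.Integer.valueOf already does - something like this (SuInteger is a stand-in for whatever jSuneido's integer value class ends up being):

class SuInteger {
    private final int n;

    private static final SuInteger[] cache = new SuInteger[256];
    static {
        for (int i = 0; i < 256; ++i)
            cache[i] = new SuInteger(i - 128);
    }

    static SuInteger valueOf(int n) {
        if (n >= -128 && n < 128)
            return cache[n + 128]; // reuse the pre-created object
        return new SuInteger(n); // outside the range, allocate as usual
    }

    private SuInteger(int n) {
        this.n = n;
    }

    int intValue() {
        return n;
    }
}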

This seemed like an easy thing to try. So I made a simple benchmark and timed it both ways.

The problem was, there was no speed difference. There was as much variation between runs as there was between methods.

The problem with micro-benchmarks is that they generally don't do any "real" work, so if you're not careful the compiler/vm will optimize away the code you're trying to test. I figured that must be what was happening here.

Sure enough, I re-wrote the benchmark so it wouldn't be so "obvious" my code was useless, and then I got more reasonable results. (The times were also more consistent, with less variation between runs, for some reason.)

Without the optimization it took an average of 9.7 seconds. With the optimization it was 6.9 seconds. This is roughly a 30% speed improvement for a few lines of code. Not bad. Of course that's if you're doing 100% small integer operations which is never the case.

PS. I know - premature optimization. But at least I measured it! (the results anyway, if not the problem) But at this stage I'm still trying to get a feel for the JVM and this kind of experimentation helps. After all, the practicality of this whole project depends on achieving decent performance.

PPS. You won't see these (or any other) changes in Subversion until I figure out why my laptop is refusing to commit changes :-(

Tuesday, May 27, 2008

jSuneido - implementing classes and methods

Reading Headius: The Power of the JVM got me thinking about how to implement Suneido classes and methods on the JVM.

Suneido is a little easier to implement in this respect than more dynamic languages. Suneido's classes are more like Java classes - defined in a single source unit and not modifiable at runtime. Unlike Ruby you can't add methods dynamically.

It sounds like one of the keys to performance might be call site caching of method lookup. What this means is that after you've done the method lookup you cache the result at the call site. Next time you just call the cached method. Of course, the cached method is only valid if the receiver is the same class as when you cached it. If the receiver's class is different you have to fall back to the method lookup. But in practice it's a worthwhile optimization, and not too complex to use. It's an old idea that's been around since early Smalltalk days.

One way to implement this would be to compile each call to something like:

if (cache == null || cache.class != receiver.class)
    cache = receiver.method_lookup(...)
cache.call(...)
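
Fleshed out a bit using the reflection API, just to show the shape of it (a sketch only - and reflective invocation is relatively slow, which matters here):

import java.lang.reflect.Method;

class CallSite {
    private Class<?> cachedClass = null;
    private Method cachedMethod = null;

    Object call(Object receiver, String name, Object... args) throws Exception {
        if (receiver.getClass() != cachedClass) {
            cachedClass = receiver.getClass();
            cachedMethod = methodLookup(cachedClass, name); // only on a cache miss
        }
        return cachedMethod.invoke(receiver, args);
    }

    private Method methodLookup(Class<?> c, String name) throws NoSuchMethodException {
        // simplistic - a real lookup would consider argument types, inheritance, etc.
        for (Method m : c.getMethods())
            if (m.getName().equals(name))
                return m;
        throw new NoSuchMethodException(name);
    }
}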

One of the problems is that there's no "good" way to reference a method. I think that's one of the reasons why jRuby compiles each Ruby method to a Java class - so it can use a class reference as a method reference.

One of the features being considered in JSR292 (JVM improvements for dynamic languages) is method "handles" which would avoid having to compile each method into a separate class.

Another JSR292 feature is "dynamic invoke". This would move some of this mechanism into the JVM. And "anonymous" classes will help as well.

I'm still trying to figure out what's possible/practical/reasonable with the JVM. Unfortunately, there's not a lot of documentation at this level. It would be nice to know more about how other JVM languages like jRuby and Groovy are implemented, but there doesn't seem to be much out there. I guess I could try to figure it out from their source code but I bet that wouldn't be easy.

One feature I think I should investigate is the Java Proxy system. It sounds like this is used by Scala (another JVM language). Apparently it doesn't work for native type arguments but I don't think that's a problem for Suneido.
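
From a quick look, java.lang.reflect.Proxy routes all the calls on an interface through a single InvocationHandler - a toy example (nothing to do with Suneido yet):

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Greeter {
        String greet(String name);
    }

    public static void main(String[] args) {
        Greeter g = (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                new InvocationHandler() {
                    public Object invoke(Object proxy, Method method, Object[] methodArgs) {
                        // every call on the interface funnels through here by name
                        return "hello " + methodArgs[0] + " (via " + method.getName() + ")";
                    }
                });
        System.out.println(g.greet("world"));
    }
}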

Meanwhile I'm plugging away at porting the code. The record and memory mapped file code now seems to be working (at least at a basic level).

PS. I'm writing this (and checking code into Subversion) from the Vancouver airport on my way to RailsConf in Portland. Gotta love free wireless internet!

Monday, May 26, 2008

In Portland

I'll be in Portland, Oregon for RailsConf this week.

If anyone is interested in getting together, send me an email to mckinlay axonsoft.com

Java Roadblock Part 2

I tried a test program:
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class MapTest {
    public static void main(String[] args) throws IOException {
        // extend the test file to 3 gb by writing a byte at the end
        RandomAccessFile f = new RandomAccessFile("tmp", "rw");
        f.seek(3L * 1024 * 1024 * 1024);
        f.write(123);
        FileChannel channel = f.getChannel();
        long size = channel.size();
        System.out.println("File is " + size + " bytes large");
        // map the whole file in 512 mb chunks, ten times over,
        // relying on gc/finalization to release the old mappings
        for (int i = 0; i < 10; ++i) {
            long ofs = 0;
            int chunksize = 512 * 1024 * 1024;
            while (ofs < size) {
                int n = (int) Math.min(chunksize, size - ofs);
                int tries = 0;
                ByteBuffer buffer = null;
                do {
                    try {
                        buffer = channel.map(FileChannel.MapMode.READ_ONLY, ofs, n);
                    } catch (IOException e) {
                        if (++tries > 10)
                            throw e;
                        System.gc();
                        System.runFinalization();
                    }
                } while (buffer == null);
                System.out.println("Mapped " + n + " bytes at offset " + ofs +
                        " with " + tries + " tries");
                ofs += n;
                buffer = null; // "unmap" by dropping the only reference
            }
        }
        channel.close();
    }
}

It took 0 to 3 "tries" of forcing garbage collection and finalization. Of course, this is in the context of a simple test program. A large active heap might be better - because of frequent garbage collection - or it might be worse due to delayed finalization.

This might or might not be workable for Suneido. Currently I only map up to 1 gb at a time. That means by the time I might run out of address space there would be a lot of unreferenced mappings and hopefully at least some of them would get finalized.

This was on my Mac which I see is defaulting to Java 5 (although 6 is installed). On my work Windows machine with Java 6, I get 0 tries every time! Either it works better on Windows or it works better on Java 6. Hopefully the latter!

So I guess I'll forge ahead. I'll leave in the code to force garbage collection and finalization if the mapping fails since it doesn't hurt anything. I should probably log it somehow so I know if it's happening.

One thing I haven't quite figured out is how growing the file fits in with mapping. In my test program I had to actually "write" at the end of the file to allow mapping. This may differ between operating systems.

PS. The other "solution" is to use the 64 bit version of Java (on a 64 bit OS). Then there is no shortage of address space. I could even go back to the simpler approach of just mapping the whole file. (Although there are issues with that with growing the file.)

Sunday, May 25, 2008

Java Roadblock

I was just starting to port Suneido's memory mapped file code from C++ to Java. All was going well until I couldn't find any "unmap" function.

Searching the internet gave me:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4724038

Basically, it looks like I'm out of luck. Either I try to rely on one of the workarounds or else I have to come up with a completely different approach.

The problem is that Suneido uses memory mapped files to access the database. The database file can be bigger than the available address space so the file is mapped in chunks and the least recently used chunks are unmapped. But in Java I can't unmap!

This really sucks. Things were going well and I was quite happy with Java. I'm not sure what to do now. I guess I can experiment with the workarounds, but even if they work there's no guarantee they'll work elsewhere on a different JVM or OS.