Wednesday, November 11, 2009

jSuneido Multi-Threading Issues

It didn't take much testing to find something that worked single-threaded but failed multi-threaded.

I was expecting this - I figured there'd be issues to work out.

But I was expecting them to be hard to track down and easy to fix and it turned out to be the opposite - easy to track down but hard to fix.

The problem turned out to be more a design flaw than a bug. I've thought of a few solutions but I'm not really happy with any of them.

Oh well, I knew all along this wasn't going to be easy. It'll come.

Monday, November 02, 2009

IntlliJ IDEA Goes Open Source

I recently learned that IntelliJ has released a free, open source community edition of their IDE.

IntelliJ is one of the main IDE's along with Eclipse and Netbeans. I hadn't looked at it much because the other two are free, but it does get some good reviews. (Apparently they did offer free licenses to open source projects but I wasn't aware of that.)

I tried downloading it and installing it and had no problems. It comes with Subversion support "out of the box" and I was easily able to check out my jSuneido project. That's more than I can say for Eclipse where it's still a painful experience to get Subversion working (at least on a Mac). IntelliJ proves that it is possible to do it smoothly.

I haven't had time to play with it much yet. My first impression was that the UI was a little "rougher" than Eclipse. I can probably tweak the fonts to get it a bit closer. Maybe it's due to Eclipse using SWT. (I'm not sure what IntelliJ is using.)

IntelliJ is known for their strong refactoring. To be honest, I only use a few basic refactorings in Eclipse (like rename and extract method) so I don't know if this would be a big benefit. I should probably use more...

IntelliJ is also supposed to have the best Scala plugin. I'll have to try it. I tried the Eclipse one but wasn't too impressed with where it's at so far.

Friday, October 30, 2009

jSuneido Progress

Over the last couple of weeks I've been working on database concurrency in jSuneido. I basically ripped out all the transaction handling code that I ported from cSuneido and replaced it with a more concurrency friendly design.

The tricky part was keeping the code running (tests passing) during the major surgery. It's always tempting to just start rewriting and hope when you finish that it'll all work. Of course, it never does and you're then faced with major debugging because you have no idea what's broken or which part of your changes broke it.

Evolving gradually isn't so easy either, but at least there's no big-bang integration nightmare to face at the end. As you evolve the code some of the intermediate stages aren't pretty since you have two designs cohabiting. And that can lead to some ugly bugs due to the ugly code. But if you're making the changes in small steps, you can always just back up to a working state.

A few days it was touch and go whether I'd get that day's changes working by the end of the day but thankfully I always did. I always hate leaving things broken at the end of the day!

There were a few uneasy moments when things weren't working and I started to wonder if there was some flaw in the design that would negate the whole approach. But the bugs always turned out to be relatively minor stuff. (Like transactions conflicting with themselves - oops!)

The new design doesn't write any changes to the database file till a transaction commits. This eliminated code that had to determine if data should be visible to a given transaction, and code that had to undo changes when the transaction aborted and rolled back. It does mean accumulating more data in memory, but memory is cheap these days. And simpler code is worth a lot.

Switching to serializable snapshot isolation, as described in Serializable Isolation for Snapshot Databases turned out pretty well. You still need to track reads and writes, but the advantage is that conflicts are detected during operations, rather than having to do a slow read validation process during commit (Especially since, at least in my design, commit is single threaded.) It was also nice to see that this approach is a bit more permissive than my previous design i.e. allows more concurrency.

It was exciting (in a geek way) to finally get to the whole point of this exercise - making Suneido multi-threaded. It's taken so much work to get to this point that you almost forget why you're doing it.


I've also gradually been making things thread-safe where needed. My approach to concurrency has been:

  • keep as much data thread contained as possible, i.e. minimize shared data
  • immutable data where possible
  • persistent immutable data structures for multi-version (the equivalent of database snapshot isolation)
  • concurrent data structures where applicable e.g. ConcurrentHashMap
  • Java synchronized only for bottom level code that doesn't call any other application code (to avoid the possibility of deadlock and livelock)

The real test will be how scalable jSuneido is over multiple cores. My gut feeling is that the design should scale fairly well, but I'm not sure gut feelings are reliable in this area.

It's a funny feeling to be finally approaching the end of this project. There's still lots to do, but the big stuff is handled.

Tuesday, October 27, 2009

The First Mistake in Documentation

From the Java 6 docs for Buffer:

capacity

public final int capacity()
Returns this buffer's capacity.
Returns:
The capacity of this buffer
And no, it's not an isolated case. It's followed by "position - return this buffer's position"

I don't know how many times I've seen stuff like this. Why do people write this kind of thing? They must know it's useless. Is it just because they are "required" to write documentation so they fulfill the letter of their requirements? Maybe IDE's are too good at generating boilerplate which then gets left as generated.

Comments like "x = x + 1 // add one to x" are a similar phenomenon

Saturday, October 24, 2009

When Will I Learn

I had an annoying intermittent bug in jSuneido that I narrowed down to a problem with my PersistentMap. I eventually tracked it down by writing a random "stress" test.

The "funny" part is that I knew I should have done this right from the start. In my previous blog post I said:

I should probably do some random stress testing to exercise it more.
As is often the case, the fix was tiny - 3 characters!  And, of course, it was a situation that was relatively rare. It was a good example that 100% test coverage does not mean no bugs. This kind of thing is one of the downsides of rolling your own.

Here's the fixed source code. The bug was missing "<< h" on line 202

Friday, October 23, 2009

Give Them Something Better

 This quote was talking about movie theaters, but it could easily be talking about software.

"Giving the people what they want is fundamentally and disastrously wrong. The people don't know what they want ... [Give] them something better" - Samuel "Roxy" Rothapfel, 1914 (quoted in Blue Ocean Strategy)
Of course, the danger is going too far to the other extreme, thinking you know better, and ignoring peoples real needs.

Thursday, October 22, 2009

Eclipse Annoyance

Have I mentioned that the auto-indent/wrapping in Eclipse really sucks!

            insertExistingRecords(tran, columns, table, colnums,
 fktablename,
                    fktable, fkcolumns, btreeIndex);

I know formatting is somewhat subjective, but I can't see how this kind of result can be called "correct" under any interpretation. The frustrating part is that you fix it, and next time you turn around, it's mangled again.

I cannot understand how the Eclipse developers tolerate this. Surely somewhat would get annoyed enough to fix it. The only thing I can guess is that they don't use it. I wonder why. Maybe I'm the idiot for leaving it turned on!

To be fair, maybe it works for everyone else and it's something specific to my settings that messes it up. If so, I haven't been able to find the magic combination of settings that "works".

Scalability Isn't Easy

How We Made GitHub Fast - GitHub