The tricky part was keeping the code running (tests passing) during the major surgery. It's always tempting to just start rewriting and hope when you finish that it'll all work. Of course, it never does and you're then faced with major debugging because you have no idea what's broken or which part of your changes broke it.
Evolving gradually isn't so easy either, but at least there's no big-bang integration nightmare to face at the end. As you evolve the code some of the intermediate stages aren't pretty since you have two designs cohabiting. And that can lead to some ugly bugs due to the ugly code. But if you're making the changes in small steps, you can always just back up to a working state.
On a few days it was touch and go whether I'd get that day's changes working by the end of the day, but thankfully I always did. I always hate leaving things broken at the end of the day!
There were a few uneasy moments when things weren't working and I started to wonder if there was some flaw in the design that would negate the whole approach. But the bugs always turned out to be relatively minor stuff. (Like transactions conflicting with themselves - oops!)
The new design doesn't write any changes to the database file till a transaction commits. This eliminated code that had to determine if data should be visible to a given transaction, and code that had to undo changes when the transaction aborted and rolled back. It does mean accumulating more data in memory, but memory is cheap these days. And simpler code is worth a lot.
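To make the idea concrete, here is a minimal sketch of a transaction that buffers its writes in memory until commit. It is not the jSuneido code: `CommittedStore` and `BufferedTransaction` are hypothetical names, the "database file" is just an in-memory map, and deletes and snapshots are left out. The point is only that abort has nothing to undo, and that no other transaction can see uncommitted data because it never reaches the shared store.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the database file: a simple committed key/value store.
final class CommittedStore {
    private final Map<String, String> data = new HashMap<>();

    synchronized String get(String key) {
        return data.get(key);
    }

    synchronized void applyAll(Map<String, String> writes) {
        data.putAll(writes);
    }
}

// Hypothetical transaction that keeps all of its writes in memory until commit.
final class BufferedTransaction {
    private final CommittedStore store;
    private final Map<String, String> pending = new HashMap<>();
    private boolean finished = false;

    BufferedTransaction(CommittedStore store) {
        this.store = store;
    }

    String read(String key) {
        checkActive();
        // A transaction sees its own uncommitted writes first.
        if (pending.containsKey(key))
            return pending.get(key);
        return store.get(key);
    }

    void write(String key, String value) {
        checkActive();
        pending.put(key, value); // nothing touches the shared store yet
    }

    void commit() {
        checkActive();
        store.applyAll(pending); // the only place the shared store changes
        finished = true;
    }

    void abort() {
        checkActive();
        pending.clear(); // no undo logic: just drop the buffered writes
        finished = true;
    }

    private void checkActive() {
        if (finished)
            throw new IllegalStateException("transaction already finished");
    }
}
```

The cost is exactly the one mentioned above: `pending` grows with the size of the transaction.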
Switching to serializable snapshot isolation, as described in Serializable Isolation for Snapshot Databases, turned out pretty well. You still need to track reads and writes, but the advantage is that conflicts are detected during operations, rather than having to do a slow read validation process during commit (especially since, at least in my design, commit is single-threaded). It was also nice to see that this approach is a bit more permissive than my previous design, i.e. it allows more concurrency.
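As a rough illustration of detecting a conflict at the time of the operation rather than at commit, here is a deliberately simplified sketch. It only covers write-write conflicts between active transactions; it does not implement the read-write dependency tracking from the paper, and it ignores transactions that committed after the current one started. `WriteTracker` and `ConflictException` are hypothetical names, not jSuneido classes. Note the check that lets a transaction write the same key twice without conflicting with itself.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical exception raised as soon as a conflicting write is attempted.
final class ConflictException extends RuntimeException {
    ConflictException(String message) {
        super(message);
    }
}

// Hypothetical registry of which active transaction has written each key.
final class WriteTracker {
    private final ConcurrentHashMap<String, Integer> writers = new ConcurrentHashMap<>();

    // Called when the write happens, so a conflict surfaces immediately
    // instead of during a validation pass at commit.
    void recordWrite(int txnId, String key) {
        Integer owner = writers.putIfAbsent(key, txnId);
        // owner == null: this transaction claimed the key.
        // owner == txnId: our own earlier write, which is not a conflict.
        if (owner != null && owner != txnId)
            throw new ConflictException(
                    "transaction " + txnId + " conflicts with " + owner + " on " + key);
    }

    // Called when a transaction commits or aborts.
    void release(int txnId) {
        writers.values().removeIf(id -> id == txnId);
    }
}
```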
It was exciting (in a geek way) to finally get to the whole point of this exercise - making Suneido multi-threaded. It's taken so much work to get to this point that you almost forget why you're doing it.
I've also gradually been making things thread-safe where needed. My approach to concurrency has been:
- keep as much data thread-confined as possible, i.e. minimize shared data
- immutable data where possible
- persistent immutable data structures for multi-version (the equivalent of database snapshot isolation)
- concurrent data structures where applicable e.g. ConcurrentHashMap
- Java synchronized only for bottom-level code that doesn't call any other application code (to avoid the possibility of deadlock and livelock)
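The immutable, multi-version part of that list can be sketched roughly as follows. This is an assumption-laden toy, not jSuneido's data structures: `VersionedMap` is a hypothetical name, and it copies the whole map on every update, whereas a real persistent data structure would share most of its structure between versions. What it does show is that readers hold a version that never changes under them, with no locking.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical multi-version map: every update publishes a new immutable version.
final class VersionedMap {
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    // Readers take a reference to the current version; it is never modified
    // afterwards, so they effectively get a snapshot.
    Map<String, String> snapshot() {
        return current.get();
    }

    // Writers build a new version and publish it atomically,
    // retrying if another writer got in first.
    void put(String key, String value) {
        while (true) {
            Map<String, String> old = current.get();
            Map<String, String> copy = new HashMap<>(old); // full copy: a simplification
            copy.put(key, value);
            Map<String, String> next = Collections.unmodifiableMap(copy);
            if (current.compareAndSet(old, next))
                return;
        }
    }
}
```

A reader that called `snapshot()` before a `put` keeps seeing the old version, which is the in-memory equivalent of snapshot isolation.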
The real test will be how scalable jSuneido is over multiple cores. My gut feeling is that the design should scale fairly well, but I'm not sure gut feelings are reliable in this area.
It's a funny feeling to be finally approaching the end of this project. There's still lots to do, but the big stuff is handled.