Thursday, February 18, 2010


Now that jSuneido is more or less functional, I've been working on some of the peripheral functions, like checking and rebuilding databases (i.e. crash recovery).

The checking part went quite smoothly and quickly. The rebuilding/recovery part has been slow. I've been working on it for several days and it seems like I've made no progress. The only code I've written is tests and debugging! But whether it feels like it or not, I guess I was making progress because it "suddenly" started working today.

Recovery is hard for a number of reasons. First, you're assuming the database has been corrupted, so you have to code a lot more defensively. Second, because you're working outside the normal operation of the database, you can't use its normal functionality, such as transactions.
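
To give a rough idea of what "defensively" means here, this is a minimal Python sketch, not jSuneido's actual code or file format. It assumes a made-up layout where each block is a 4-byte big-endian length, the payload, then a CRC32 of the payload. The names pack_block and scan_valid_blocks are hypothetical. The scanner trusts nothing: it checks every length and checksum, and stops at the first block that doesn't hold up.

```python
import struct
import zlib


def pack_block(payload: bytes) -> bytes:
    """Build one block in the assumed (hypothetical) layout: length, payload, CRC32."""
    return (
        struct.pack(">I", len(payload))
        + payload
        + struct.pack(">I", zlib.crc32(payload))
    )


def scan_valid_blocks(data: bytes) -> list[bytes]:
    """Return the payloads of the leading run of intact blocks.

    Scanning stops at the first truncated header, impossible length,
    or checksum mismatch, since nothing after that point can be trusted.
    """
    blocks = []
    pos = 0
    while pos < len(data):
        if pos + 4 > len(data):
            break  # not enough bytes left for a length field
        (size,) = struct.unpack_from(">I", data, pos)
        end = pos + 4 + size + 4
        if end > len(data):
            break  # length field points past the end of the file
        payload = data[pos + 4 : pos + 4 + size]
        (stored_crc,) = struct.unpack_from(">I", data, pos + 4 + size)
        if stored_crc != zlib.crc32(payload):
            break  # payload does not match its checksum
        blocks.append(payload)
        pos = end
    return blocks
```

With something like this, a recovery pass can keep the blocks that check out and decide separately what to do with the damaged tail.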

One lesson I (re)learned was how important "visibility" is. By that I mean being able to "see" the data you're working on. That's a large part of why debuggers can be valuable - they let you inspect the data. Often, inserting the right "print" statement in the "right" place is all it takes to figure out a bug. Of course, finding that right place is not so easy. In this case visibility meant writing a dump utility so I could see exactly what was inside the database. Obviously, I have a pretty good idea in general how it's structured, but not the details of exactly which types of blocks of data are in which sequence. And dumping to text files meant I could use tools like diff to compare before and after recovery.
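
As an illustration of the dump-and-diff idea, here is a small Python sketch. It is not the actual dump utility; the block kinds ("data", "commit") and the function names dump_blocks and diff_dumps are assumptions made up for the example. It writes one text line per block, then uses the standard difflib module to compare a "before" dump against an "after" dump.

```python
import difflib


def dump_blocks(blocks):
    """Render (kind, payload) pairs as one text line per block.

    Each line shows a running offset, the block kind, its size,
    and the payload as hex, so the result diffs cleanly.
    """
    lines = []
    offset = 0
    for kind, payload in blocks:
        lines.append(f"{offset:08d} {kind} size={len(payload)} {payload.hex()}")
        offset += len(payload)
    return lines


def diff_dumps(before, after):
    """Return unified-diff lines between two dumps (empty list if identical)."""
    return list(
        difflib.unified_diff(
            before, after, fromfile="before", tofile="after", lineterm=""
        )
    )


if __name__ == "__main__":
    before = dump_blocks([("data", b"\x01\x02"), ("commit", b"\x03")])
    after = dump_blocks([("data", b"\x01\x02")])
    print("\n".join(diff_dumps(before, after)))
```

One line per block is the important part: it means a missing or reordered block shows up as a single added or removed line in the diff.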

PS. In hindsight, yes, it would have made sense to write the database checking much earlier so I could verify that databases weren't getting corrupted by operations. On the other hand, that was usually pretty obvious - it crashed!
