Friday, October 30, 2009

jSuneido Progress

Over the last couple of weeks I've been working on database concurrency in jSuneido. I basically ripped out all the transaction handling code that I ported from cSuneido and replaced it with a more concurrency-friendly design.

The tricky part was keeping the code running (tests passing) during the major surgery. It's always tempting to just start rewriting and hope when you finish that it'll all work. Of course, it never does and you're then faced with major debugging because you have no idea what's broken or which part of your changes broke it.

Evolving gradually isn't so easy either, but at least there's no big-bang integration nightmare to face at the end. As you evolve the code some of the intermediate stages aren't pretty since you have two designs cohabiting. And that can lead to some ugly bugs due to the ugly code. But if you're making the changes in small steps, you can always just back up to a working state.

On a few days it was touch and go whether I'd get that day's changes working before the end of the day, but thankfully I always did. I hate leaving things broken overnight!

There were a few uneasy moments when things weren't working and I started to wonder if there was some flaw in the design that would negate the whole approach. But the bugs always turned out to be relatively minor stuff. (Like transactions conflicting with themselves - oops!)

The new design doesn't write any changes to the database file till a transaction commits. This eliminated code that had to determine if data should be visible to a given transaction, and code that had to undo changes when the transaction aborted and rolled back. It does mean accumulating more data in memory, but memory is cheap these days. And simpler code is worth a lot.
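To illustrate the shape of this, here's a minimal sketch (not jSuneido's actual code - the names and interfaces are made up) of a transaction that buffers its updates in memory and only writes to storage at commit:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Updates are buffered in memory until commit. Abort is trivial -
    // nothing was written to the file, so there's nothing to undo.
    class BufferedTransaction {
        interface Storage {
            void write(long address, byte[] data);
        }

        private final Map<Long, byte[]> pending = new LinkedHashMap<Long, byte[]>();

        void update(long address, byte[] data) {
            pending.put(address, data); // buffered, not written to the file
        }

        void commit(Storage storage) {
            synchronized (storage) { // commit is single threaded
                for (Map.Entry<Long, byte[]> e : pending.entrySet())
                    storage.write(e.getKey(), e.getValue());
            }
        }

        void abort() {
            pending.clear(); // the database file was never touched
        }
    }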

Switching to serializable snapshot isolation, as described in Serializable Isolation for Snapshot Databases, turned out pretty well. You still need to track reads and writes, but the advantage is that conflicts are detected during operations, rather than having to do a slow read validation process during commit. (Especially since, at least in my design, commit is single threaded.) It was also nice to see that this approach is a bit more permissive than my previous design, i.e. it allows more concurrency.

It was exciting (in a geek way) to finally get to the whole point of this exercise - making Suneido multi-threaded. It's taken so much work to get to this point that you almost forget why you're doing it.


I've also gradually been making things thread-safe where needed. My approach to concurrency, illustrated with a small sketch after the list, has been:

  • keep as much data thread-contained as possible, i.e. minimize shared data
  • immutable data where possible
  • persistent immutable data structures for multi-version (the equivalent of database snapshot isolation)
  • concurrent data structures where applicable e.g. ConcurrentHashMap
  • Java synchronized only for bottom level code that doesn't call any other application code (to avoid the possibility of deadlock and livelock)
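To make a few of these points concrete, here's a minimal sketch (invented names, not jSuneido's actual classes) of an immutable value class held in a ConcurrentHashMap - between them they need no explicit locking at all:

    import java.util.concurrent.ConcurrentHashMap;

    // An immutable value: safe to share across threads without locking.
    // "Modifying" it means creating a new instance.
    final class TableInfo {
        final String name;
        final int nextRecordNum;

        TableInfo(String name, int nextRecordNum) {
            this.name = name;
            this.nextRecordNum = nextRecordNum;
        }

        TableInfo withNextRecordNum(int n) {
            return new TableInfo(name, n);
        }
    }

    class TableRegistry {
        // a concurrent data structure where applicable
        private final ConcurrentHashMap<String, TableInfo> tables =
                new ConcurrentHashMap<String, TableInfo>();

        // bottom level code - it calls no other application code,
        // so there's no possibility of deadlock or livelock
        TableInfo get(String name) {
            TableInfo info = new TableInfo(name, 1);
            TableInfo prev = tables.putIfAbsent(name, info);
            return prev == null ? info : prev;
        }
    }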

The real test will be how scalable jSuneido is over multiple cores. My gut feeling is that the design should scale fairly well, but I'm not sure gut feelings are reliable in this area.

It's a funny feeling to be finally approaching the end of this project. There's still lots to do, but the big stuff is handled.

Tuesday, October 27, 2009

The First Mistake in Documentation

From the Java 6 docs for Buffer:

capacity

public final int capacity()
Returns this buffer's capacity.
Returns:
The capacity of this buffer
And no, it's not an isolated case. It's followed by "position - return this buffer's position".

I don't know how many times I've seen stuff like this. Why do people write this kind of thing? They must know it's useless. Is it just because they're "required" to write documentation, so they fulfill the letter of their requirements? Maybe IDEs are too good at generating boilerplate, which then gets left as generated.

Comments like "x = x + 1 // add one to x" are a similar phenomenon.

Saturday, October 24, 2009

When Will I Learn

I had an annoying intermittent bug in jSuneido that I narrowed down to a problem with my PersistentMap. I eventually tracked it down by writing a random "stress" test.

The "funny" part is that I knew I should have done this right from the start. In my previous blog post I said:
I should probably do some random stress testing to exercise it more.
As is often the case, the fix was tiny - 3 characters!  And, of course, it was a situation that was relatively rare. It was a good example that 100% test coverage does not mean no bugs. This kind of thing is one of the downsides of rolling your own.

Here's the fixed source code. The bug was a missing "<< h" on line 202.
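For what it's worth, here's the shape of the stress test I mean: feed the same random operations to the map under test and to a trusted java.util.HashMap, then compare. (The PersistentMap API here - empty, with, without, get - is assumed from my linked source.)

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Random;

    public class PersistentMapStressTest {
        public static void main(String[] args) {
            Random rand = new Random(12345); // fixed seed so failures reproduce
            PersistentMap<Integer, String> pmap = PersistentMap.empty();
            Map<Integer, String> reference = new HashMap<Integer, String>();
            for (int i = 0; i < 1000000; ++i) {
                int key = rand.nextInt(10000); // small range forces collisions
                if (rand.nextInt(3) == 0) {
                    pmap = pmap.without(key);
                    reference.remove(key);
                } else {
                    String value = "v" + rand.nextInt(100);
                    pmap = pmap.with(key, value);
                    reference.put(key, value);
                }
            }
            // check every possible key, since PersistentMap has no iteration
            for (int key = 0; key < 10000; ++key) {
                String expected = reference.get(key);
                String actual = pmap.get(key);
                if (expected == null ? actual != null : !expected.equals(actual))
                    throw new AssertionError("mismatch at key " + key);
            }
            System.out.println("ok");
        }
    }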

Friday, October 23, 2009

Give Them Something Better

This quote was talking about movie theaters, but it could easily be talking about software.
"Giving the people what they want is fundamentally and disastrously wrong. The people don't know what they want ... [Give] them something better" - Samuel "Roxy" Rothapfel, 1914 (quoted in Blue Ocean Strategy)
Of course, the danger is going too far to the other extreme, thinking you know better, and ignoring people's real needs.

Thursday, October 22, 2009

Eclipse Annoyance

Have I mentioned that the auto-indent/wrapping in Eclipse really sucks!

            insertExistingRecords(tran, columns, table, colnums,
 fktablename,
                    fktable, fkcolumns, btreeIndex);

I know formatting is somewhat subjective, but I can't see how this kind of result can be called "correct" under any interpretation. The frustrating part is that you fix it, and next time you turn around, it's mangled again.

I cannot understand how the Eclipse developers tolerate this. Surely someone would get annoyed enough to fix it. The only thing I can guess is that they don't use it, though I wonder why that would be. Maybe I'm the idiot for leaving it turned on!

To be fair, maybe it works for everyone else and it's something specific to my settings that messes it up. If so, I haven't been able to find the magic combination of settings that "works".

Scalability Isn't Easy

How We Made GitHub Fast - GitHub

Tuesday, October 20, 2009

Another Unavailable eBook Reader

Barnes & Noble has announced their eBook reader. It looks pretty nifty, but of course, it's not available in Canada. (Only US currently.)

Nook, eBook Reader, eBook Device - Barnes & Noble

There are some interesting features like being able to loan books to other people, and being able to read your ebooks on your computer.

So far, the Sony eBook Readers are the only mainstream reader available in Canada.

The Sony eBook store claims about 55,000 books versus Amazon's 350,000. Barnes & Noble claims "over one million" but they must be counting all the public domain stuff. I'd be surprised if they have as good a selection as Amazon.

I wonder if you'd get away with shipping a Nook to a friend's US address. The question is whether they'll let you use a Canadian credit card to pay for books (Amazon doesn't).

Monday, October 19, 2009

Back to No Kindle in Canada

I just tried to order the new international version of the Kindle and got:

[screenshot: a message that the Kindle is not available in Canada]

When the announcement first came out I specifically went and checked Canada and I'd swear it said it was available. (my previous post)

It really annoys me to continually run into products and services that exclude Canada. I don't think Amazon has anything personal against Canada, so I have to assume it's our industry and government that are blocking it. Stupid and short-sighted.

Canada snubbed as Kindle goes global - The Globe and Mail

Wednesday, October 14, 2009

A Java Immutable Persistent List

Something else I needed for jSuneido was an immutable PersistentList (source). I also improved my PersistentMap (source).

One of the things that was bothering me was that I thought there should be a more efficient way to initially build immutable persistent lists and maps. Creating a new version for every element added seemed like overkill.

When I started looking at the Google Collections Library I found the Builder classes they have for their immutable collections. This seemed like a good approach so I added builders to PersistentList and PersistentMap. (Guava: Google Core Libraries for Java 1.6 also looks interesting.)
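Usage ends up looking something like this (the exact Builder API shown is just a guess at the shape, patterned on Google Collections):

    // accumulate in a mutable builder, then publish one immutable
    // persistent version, instead of creating a new version per element
    PersistentList<String> columns = new PersistentList.Builder<String>()
            .add("id")
            .add("name")
            .add("address")
            .build();

    PersistentMap<String, Integer> colnums = new PersistentMap.Builder<String, Integer>()
            .put("id", 0)
            .put("name", 1)
            .put("address", 2)
            .build();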

To be clear, if you just need a thread safe map or list, the java.util.concurrent ones are just fine. I'm using them in a few places.

And if you just need an immutable collection, java.util.Collections offers the unmodifiable wrappers and the Google Collections Library has immutable collections. However, they are not "persistent" (see Language Connoisseur) - they don't offer any easy way to create new versions with added or removed elements.

Persistent immutable data structures really shine if you need to make modified versions but still retain full access to previous versions. That's what I need in the jSuneido database code because it implements multi-version concurrency control. Each transaction sees the database "as of" a certain point in time. Persistent immutable data structures are a great fit for this.
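In code, the property that matters looks something like this (PersistentMap API assumed, and TableSchema is just a stand-in):

    // each transaction hangs onto the version that was current when it
    // started; making a new version never disturbs the old ones
    PersistentMap<String, TableSchema> v1 = schema;    // transaction 1's view
    PersistentMap<String, TableSchema> v2 =
            v1.with("invoices", invoicesSchema);       // transaction 2's view
    // v1 still has no "invoices" entry, and the two versions share
    // almost all of their structure - no copying, no locking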

Java ByteBuffer Annoyances

jSuneido uses ByteBuffers to memory map the database file, but they're not ideal for this. Unfortunately, they're the only option.

Java NIO Buffers are meant to be used in a very particular way, with methods like mark, reset, flip, clear, and rewind. For the purpose they were designed for, I'm sure they work well.

But I'm not sure that shoehorning the memory mapping interface into ByteBuffer was the best design choice.

jSuneido memory maps the database file in big chunks (e.g. 4mb). When you want access to a particular record it returns a new ByteBuffer that is a "slice" of the memory mapped one. This is the closest I can get to cSuneido returning a pointer. The problem is that slice doesn't take any arguments - you have to set the position and limit of the buffer first, which means the operation isn't thread safe, because a concurrent operation could also be modifying the position and limit.

Most operations on buffers have both "relative" (using the current position) and "absolute" (passing the position as an argument) versions. For some reason slice doesn't have an absolute version.

Another operation that is missing an absolute version is get(byte[]), so again you have to mess with the buffer's current position, meaning concurrency issues.
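As far as I can tell, the standard workaround is duplicate(), which makes a new buffer with its own position and limit over the same underlying data, so the mutation happens on thread-local state. Something like:

    import java.nio.ByteBuffer;

    class Buffers {
        // an "absolute" slice: position/limit are only set on the
        // duplicate, so the shared buffer is never mutated
        static ByteBuffer slice(ByteBuffer buf, int offset, int length) {
            ByteBuffer dup = buf.duplicate(); // same data, independent position/limit
            dup.position(offset);
            dup.limit(offset + length);
            return dup.slice();
        }

        // an "absolute" get(byte[]) via the same trick
        static byte[] getBytes(ByteBuffer buf, int offset, int length) {
            byte[] data = new byte[length];
            ByteBuffer dup = buf.duplicate();
            dup.position(offset);
            dup.get(data);
            return data;
        }
    }

It works, but allocating a throwaway buffer just to do an absolute operation is exactly the kind of baggage I mean.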

It would have been much better (IMO) if ByteBuffer was immutable (i.e. all absolute operations). All the position/mark/limit reset/clear/flip stuff should have been in a separate class that wrapped the base immutable version. Not only does the current design force you to drag along a bunch of extra baggage, it also (for some unknown reason) omits key operations that would let you use it as immutable.

Considering that one of the goals of java.nio was concurrency, it seems strange that they picked a design that is so unfriendly to it.

Monday, October 12, 2009

jSuneido Performance

I was troubled by jSuneido appearing to be 10 times slower than cSuneido so I did a little digging.

The good news is that with a few minor adjustments it's now only about 50% slower. That takes a load off my mind.

The adjustments included:

- using the server VM (adding -server to the JRE options)
- disabling asserts (I had some slow ones for debugging)
- removing dynamic loading of classes (if a global name wasn't found it would try to class load it)
- recording when a library lookup fails, to avoid doing it again

The C++ Suneido code makes a point of avoiding copying data as it moves from the database to the language. For example, a string that comes from a database field actually points to the memory mapped database - zero copying. Of course, I never knew how much benefit this was because I had nothing to compare it to. It just seemed like it had to be beneficial.

In Java it's a lot harder to avoid copying. To get a string from the database, not only do you have to copy it twice (from a ByteBuffer to a byte array and then to a String) but you also have to decode it using a character set. I could avoid the decoding by storing strings as Java chars (16 bit) but that would double the size in the database.

There may be better ways to implement this; I'll have to do some research. Another option would be to make the conversion lazy, i.e. not convert from ByteBuffer to String until the string is needed. Since a certain percentage of fields are never accessed, this would eliminate unnecessary work.
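A rough sketch of the lazy idea (invented names, not actual jSuneido code, and the charset is an assumption):

    import java.nio.ByteBuffer;
    import java.nio.charset.Charset;

    // holds the raw buffer slice and defers the two copies (buffer to
    // byte array, then charset decode) until the string is actually used
    class LazyString {
        private static final Charset CHARSET = Charset.forName("ISO-8859-1");
        private ByteBuffer raw; // slice of the memory mapped database
        private String decoded = null;

        LazyString(ByteBuffer raw) {
            this.raw = raw;
        }

        @Override
        public synchronized String toString() {
            if (decoded == null) {
                byte[] bytes = new byte[raw.remaining()];
                raw.duplicate().get(bytes);           // copy 1: buffer to array
                decoded = new String(bytes, CHARSET); // copy 2: decode
                raw = null; // let the slice go once we have the string
            }
            return decoded;
        }
    }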

One small benchmark I ran (admittedly not representative of real code) spent 45% of its time in java.nio.Bits.copyToByteArray!  Yikes, almost half the run time! Judging by this, my efforts to avoid copying in cSuneido were probably worthwhile.

A strange result from this same benchmark was that 25% of the time was in String.intern. I found this hard to believe but it seemed consistent. I even tried the YourKit profiler and got the same result. (Although if it is using the same low level data collection that would explain it.) But if I did a microbenchmark on String.intern alone it was very fast, as I would expect. (I checked the profiler output to make sure the intern wasn't getting optimized away.)

I don't know if this is an artifact of the profiling, or if intern slows down when there are a lot of interned strings, or if it slows down due to some interaction with the other code. I did find some reports of intern related performance problems but nothing definitive.

jSuneido currently interns method name strings so that it can dispatch using == (compare reference/pointer) instead of equals (compare contents). Most cases, e.g. object.method(), are interned at compile time. (Java always interns string literals.) But some cases, e.g. object[method](), are still interned at run time.
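The payoff is that dispatch can compare references. Something like this (illustrative only - size and members stand in for real built-in methods):

    // Java guarantees string literals are interned, so once the runtime
    // name is interned too, == is sufficient
    Object invoke(String method, Object... args) {
        method = method.intern(); // a no-op when it came from a literal
        if (method == "Size")
            return size();
        if (method == "Members")
            return members();
        throw new RuntimeException("method not found: " + method);
    }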

On larger, more realistic tests intern doesn't show up so I'm not going to worry about it for now.

Amazon Kindle International

It looks like we'll be able to get the Kindle in Canada soon.

Amazon.com: Kindle Wireless Reading Device (6" Display, U.S. & International Wireless, Latest Generation): Kindle Store

I've been getting close to buying a Sony reader, now I'll have to choose:

Sony
- touch screen
- more open formats e.g. ePub
- access to public domain Google Books
- access to some public libraries

Kindle
- bigger selection of books
- free 3G wireless
- access to Wikipedia
- purchased books can also be accessed on iPhone or iPod Touch

So far it's only the smaller Kindle that's going to be available internationally.

One thing it would be good for is computer books. I hate buying them because I know they'll be out of date within a year or two. But you can't get most of them from the library, and even if you could, often I need to refer back to them for longer than I could borrow them from the library. And it would be great to not have to haul physical books back and forth between home and office. Environmentally, you save trees, but you consume another gadget, which wouldn't be so bad if it lasted longer, but you know it'll be out of date in a year or two.

Quick and Dirty Java Profiling

Search for "Java Eclipse profiler" and one of the first things you'll find is The Eclipse Test & Performance Tools Platform. I installed it, but when I went to use it, it told me my platform wasn't supported. I'm not sure what that meant - 64 bit? OS X? Java 1.6? Eclipse 3.4? I didn't pursue it; life's too short.

I looked at a few more tools but nothing stood out. There's the YourKit profiler but it's a commercial product. (Although maybe free for open source projects?)  Eventually I found an article on using the built-in hprof. That sounded simple enough.

To use hprof you add something like -agentlib:hprof=cpu=samples to your Java command line. In Eclipse you can do this from Eclipse > Preferences > Java > Installed JREs > Edit. It's crude to edit the preferences to turn this on and off, but ok for limited use. (There may be a better way to do this, I'm still far from an Eclipse expert.)

This will produce a java.hprof.txt file with the results. It's fairly self-explanatory.

There are more options for hprof (you can get a list with -agentlib:hprof=help)
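For example (the class name is just a stand-in):

    java -agentlib:hprof=cpu=samples,interval=10,depth=10 some.main.Class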

Sunday, October 11, 2009

jSuneido Passes Accounting Tests

Another good milestone - jSuneido now successfully runs all the tests from our accounting application.

The further I go, the tougher the bugs tend to get. One of them took me two days to track down and fix. The problems were in predictable areas like obscure details of database rules and math. Some of them were not so much bugs as just small incompatibilities with cSuneido. A few were errors in the tests themselves.

I've got more application tests I can run. Hopefully they won't uncover too many more problems.

On a less positive note, jSuneido is taking almost 10 times longer than cSuneido to run the accounting tests. That's a significant difference. I haven't done any optimizing yet, but I wouldn't expect optimizing to make a 10 times difference.  It may be time to find a profiler and see what's taking all the time.

The stdlib tests run in similar amounts of time on cSuneido and jSuneido so I suspect the difference is in the database - the accounting tests use the database a lot more than the stdlib tests. cSuneido has a thinner, more direct interface to the memory mapping, which may make a fair difference in speed. I should be able to write some tests to see if this is the problem. Of course, if it is, I'm not sure how I'm going to fix it ...

Thursday, October 08, 2009

A Java Immutable Persistent Map

I know this is going to seem like I'm reinventing the wheel again, but I honestly tried to find existing Java code for a persistent map. I also took a stab at extracting something usable from Clojure, but it was not easy to untangle from the rest of the Clojure code. It does seem surprising that there isn't anything out there (at least easily findable). I guess Java programmers don't write functional code.

It took me most of a day and about 300 lines of code to implement, using the Ideal Hash Trees paper and the occasional glance at the Clojure code to see how it did things. I took a similar approach to Clojure's but simplified somewhat. I also made mine compatible with the Java collections classes. And mine doesn't depend on anything other than the standard Java libraries, so it should be a useful starting point for anyone else.

I started implementing iteration but gave up for lack of time. (And YAGNI - I may not need it.) Iterating through trees is always a pain.

The code is not as clean as I'd like, e.g. the methods could be smaller, but I did write tests with pretty much 100% coverage. I should probably do some random stress testing to exercise it more. And there are some magic numbers in there like 0x1f and 5 and 32.

I didn't find it right away, so I'll point out that if you're looking for CTPOP (count population) in Java, it's Integer.bitCount.
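That's also the key to the magic numbers: nodes have up to 32 children, indexed by 5 bits of hash at a time, and 0x1f masks off those 5 bits. A sketch of the standard indexing trick (my actual code differs in detail):

    // a 32 bit bitmap marks which children exist; bitCount of the bits
    // below yours gives your index into the compact child array
    static int childIndex(int hash, int shift, int bitmap) {
        int bit = 1 << ((hash >>> shift) & 0x1f);    // which of the 32 slots
        return Integer.bitCount(bitmap & (bit - 1)); // occupied slots before it
    }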

I tried to restrain myself from premature optimization and took the simplest approach I could. I'm sure it could be made to use less memory and run faster. But the algorithm is good, so speed should be reasonable. And it will certainly use less memory than naive copy on write.

It was actually a nice break from slogging away getting our accounting application tests to run on jSuneido.

PS. I realize that Git uses a persistent data structure, which is why it is so "cheap" to make new versions. I started implementing a Git-like system in Suneido a while ago, but at that point I hadn't run into persistent data structures. Tree data structures are no strangers anyway, given Suneido's btree database indexes.

Monday, October 05, 2009

Building the Perfect Beast

I just watched a good presentation by Rich Hickey on Persistent Data Structures and Managed References.

In it he mentions that Clojure's software transactional memory (STM) doesn't do read tracking.

That caught my attention. Read tracking (and validation) is one of the big hassles in multi-version concurrency like in Suneido's database.

I thought maybe there was a way to avoid it that I'd missed so I did some digging. Sadly, what I discovered is that Clojure's STM only implements snapshot isolation (SI).

This means you can still get "write skew" anomalies where multiple concurrent update transactions each write data that the other reads, leading to results that would not (could not) happen if the transactions were serialized.

Suneido implements serializable transactions, not just snapshot isolation, to prevent these kinds of anomalies. (I like how they call them "anomalies"; it doesn't sound as bad as "errors".)

Clojure implements snapshot isolation because it's simpler, easier, faster, etc.

Databases like Oracle and PostgreSQL supply snapshot isolation when you request serializable, again, for performance reasons. Amusingly, PostgreSQL says "Serializable mode does not guarantee serializable execution..."

It reminds me of the old saying "if it doesn't have to work correctly, you can make it as small/fast/easy as you want".

But while I was digging into this, I found that there is a way to make snapshot isolation serializable using commit ordering. Wow, if there is a way to avoid read tracking/validation and still be serializable that seems like the best of both worlds.

I found a paper on Serializable Isolation for Snapshot Databases, but if I understand it correctly from a quick read, it simply replaces read tracking with a new kind of read lock. I have to study it a bit more to figure out the advantage. I think it may keep the special read locks for a shorter period of time than read tracking. And I think it will detect potential anomalies earlier, avoiding the read validation phase at commit time.
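If I'm reading it right, the bookkeeping boils down to something like this (a very rough sketch of the paper's idea, not working code):

    // each transaction carries two flags; a read-write conflict between
    // two concurrent transactions sets outConflict on the reader and
    // inConflict on the writer; a transaction that ends up with both
    // set is the "pivot" of a potential anomaly and gets aborted
    class SsiTransaction {
        volatile boolean inConflict = false;  // someone read an item we wrote
        volatile boolean outConflict = false; // we read an item someone wrote

        void check() {
            if (inConflict && outConflict)
                throw new RuntimeException("abort: potential write skew");
        }
    }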

But this paper doesn't seem to have anything to do with commit ordering so I'm unsure if that's yet another approach or whether I'm just missing the connection.

Sometimes I think my father was right, that I should have become an academic so I could spend all my time on this kind of thing. But I think that's a fantasy. First there's all the politics in academia (that I would hate). Then there's the need to specialize so much (whereas I like to jump around). And finally, I like to have a practical end to what I'm doing - actual real users benefiting, not just a paper published in some journal to be read by other academics.

* Building the Perfect Beast is a Don Henley song

Snow Leopard Technology

Mac OS X 10.6 Snow Leopard: the Ars Technica review - Ars Technica

There's some good stuff in this article - 64 bit, LLVM, Clang, concurrency, GCD, etc.

Here's a quote I can relate to.
"The prospect of an automated way to discover bugs that may have existed for years in the depths of a huge codebase is almost pornographic to developers—platform owners in particular."

Saturday, October 03, 2009

Write Your Own Compiler

I recently listened to a podcast of Scott Hanselman talking to Joel Spolsky.

One of the things they talked about was how Fog Creek (Spolsky's company) wrote their own compiler because it was easier than rewriting their application, which was written in an ancient version of VBScript.

I've always been a little defensive about how my company writes its applications in our own language (Suneido). At best people look at me funny, at worst they think I'm crazy. Our salesmen have to skirt around the issue because it scares people to hear you're not using big name tools. As the old saying goes, "no one gets fired for buying IBM". Nowadays you can substitute Microsoft or Oracle or Java.

So it was heartening to hear Spolsky talk about (aka defend) why they wrote their own compiler. One of his points was that writing a compiler is not that hard. People tend to think it's a big deal, but for a smallish language it's not that big a job. For example, Suneido's compiler makes up a tiny fraction (less than 1%) of the total source code in my company's main application.

I'm the first to admit that part of the reason is simply that I like developing languages (and database servers, and IDEs, and frameworks) better than I like writing applications.

But there are other reasons. Having your own platform has its costs, but it also has its benefits. It's a good way to insulate yourself from the fast pace of change in the computer industry. My company still has customers running software that we originally wrote on MS-DOS. The exact same application code is now running on Vista. There are not many commercial platforms that can claim that. Even things like Java that are supposed to be stable platforms change a lot over the years. And Microsoft changes languages and frameworks faster than you want to keep up with. (Spolsky's problem.)

Of course, that can mean you're not using the latest bleeding edge technology, but that's not such a bad thing for a business application.

Friday, October 02, 2009

More Concurrency

Coincidentally, right after I wrote my last blog post about concurrency, Tim Bray started a series of blog posts about it. Here's the latest:

ongoing - Concur.next — The Laundry List

Thursday, October 01, 2009

Language Connoisseur

It took me a few tries to come up with the right word. Addict? Fanatic? Polyglot? Aficionado?

I think "connoisseur" fits best - "A person of informed and discriminating taste e.g. a connoisseur of fine wines."

I love to read about different programming languages, but I wouldn't say I actually "learn" them. I've only written significant amounts of code in three languages, C, C++, and now Java. It takes years for me to feel like I'm getting "good" at a language.

But I can "taste" other languages, and like a wine connoisseur, critique them (at least in my own mind). Ah yes, made from JVM grapes, definite functional nose, aged in strong typing, goes well with concurrent meals. (Sorry, I'm stretching, but you get the idea.)

It's interesting to note that 4 of the 6 languages I've read about recently are JVM based. That's partly because I'm interested in JVM based languages because of jSuneido. But I think it also indicates a trend. There are languages based on .Net as well, but for some reason they don't seem to be as popular, or at least don't have as many books.

One of my big interests these days is concurrency, again partly because of having to face it with jSuneido. But it's also something we all have to face as cpu's no longer get faster, and instead we get multiple cores.

It's becoming pretty obvious that threads, mutable data structures, and locking are not the answer. They work, and with a lot of effort it's possible to write programs using them, but it's hard and error prone. I think they will become the "machine language" of concurrency. Present under the covers but not dealt with directly by most people.

So the alternatives in other languages are pretty interesting. Erlang and Scala have "actors". Clojure has software transactional memory.

Pure functions and immutable objects seem to be part of the answer. Transactional memory is promising. All three of these go together.

When I first started reading about Clojure I was confused by the references to "persistent data structures". I was thinking of persistence as saving things in a database. But that's not what they're talking about. Persistent data structures in this sense are data structures that are optimized for making modified immutable "copies" that share representation. The classic and simplest example is a single linked list (like in Lisp). For example, when you add a new item to the front of a list, the new list "shares" the old list.
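In Java terms, the linked list example is just:

    // immutable cons list: "adding" to the front creates one new node
    // and shares the entire existing list as its tail
    class ImmutableList<T> {
        final T head;
        final ImmutableList<T> tail;

        ImmutableList(T head, ImmutableList<T> tail) {
            this.head = head;
            this.tail = tail;
        }

        ImmutableList<T> cons(T item) {
            return new ImmutableList<T>(item, this); // old list untouched
        }
    }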

Suneido strings are immutable and use persistent data structures. Unlike most languages, when you concatenate onto a string in Suneido it doesn't create a whole new string. Instead, behind the scenes, it creates a list of two strings. This is transparent - to the programmer it's just as if it created a new string.
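The shape of it is something like this (a sketch of the idea, not cSuneido's actual implementation):

    // concatenation is O(1): it builds a two-piece tree that shares both
    // halves; the flat string is only produced on demand
    abstract class Str {
        abstract int length();
        abstract void appendTo(StringBuilder sb);

        Str concat(Str other) {
            return new Concat(this, other);
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder(length());
            appendTo(sb);
            return sb.toString();
        }
    }

    class Flat extends Str {
        final String s;
        Flat(String s) { this.s = s; }
        int length() { return s.length(); }
        void appendTo(StringBuilder sb) { sb.append(s); }
    }

    class Concat extends Str {
        final Str left, right;
        Concat(Str left, Str right) { this.left = left; this.right = right; }
        int length() { return left.length() + right.length(); }
        void appendTo(StringBuilder sb) { left.appendTo(sb); right.appendTo(sb); }
    }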

This answered a question that had been lurking in the back of my mind. Isn't it really slow to use large immutable objects in functional languages? Now I can see how it can be done efficiently using some clever algorithms like Clojure's persistent hash map.

I'd been trying to figure out how to implement an efficient concurrent version of the schema cache in jSuneido. Suneido stores the database schema in tables in the database. But for speed it caches it in memory. The schema doesn't tend to change very often so it is well suited to making it immutable (and therefore not requiring locking). But it does change occasionally, and when it does, transactions have to see the appropriate version. (Suneido uses multi-version database transactions.) A persistent data structure is perfect for this because each transaction can have its own immutable "copy" without physically having to copy the schema (which can be quite large).

However, there's a catch. cSuneido loads the schema lazily (on demand). This adds a twist - the schema cache is logically immutable, but physically mutable. That would require locking, which I'm trying to avoid.

Then I realized that lazy loading was to speed up start up. But jSuneido is primarily (at least initially) aimed at the server, where start up speed is not as important. So I can just load the whole schema at start up, eliminating the problem. Nice.

There is another catch. Java doesn't come with any persistent data structures :-(  and I don't particularly want to write my own.  Java doesn't even have a single linked list.

I did a quick search but I didn't find any third party libraries either.

One possibility is to use the Clojure data structures (they're written in Java). I took a quick look at the source code and it seems like it might be feasible, but I'm not sure how dependent they are on other parts of Clojure. If I'm lucky they'll be usable on their own.

Of course, adopting other code always has its price. If there are bugs I won't know how to fix them. If Clojure fixes, updates, or improves the code I'll have to decide whether to switch to the new version.

Fun stuff! :-)