Monday, October 05, 2009

Building the Perfect Beast

I just watched a good presentation by Rich Hickey on Persistent Data Structures and Managed References.

In it he mentions that Clojure's software transactional memory (STM) doesn't do read tracking.

That caught my attention. Read tracking (and validation) is one of the big hassles in multi-version concurrency like in Suneido's database.

I thought maybe there was a way to avoid it that I'd missed so I did some digging. Sadly, what I discovered is that Clojure's STM only implements snapshot isolation (SI).

This means you can still get "write skew" anomalies where multiple concurrent update transactions each write data that the other reads, leading to results that would not (could not) happen if the transactions were serialized.

Suneido implements serializable transactions, not just snapshot isolation, to prevent these kinds of anomalies. (I like how they call it "anomalies", it doesn't sound as bad as "errors".)

Clojure implements snapshot isolation because it's simpler, easier, faster, etc.

Databases like Oracle and PostgreSQL supply snapshot isolation when you request serializable, again, for performance reasons. Amusingly, PostgreSQL says "Serializable mode does not guarantee serializable execution..."

It reminds me of the old saying "if it doesn't have to work correctly, you can make it as small/fast/easy as you want".

But while I was digging into this, I found that there is a way to make snapshot isolation serializable using commit ordering. Wow, if there is a way to avoid read tracking/validation and still be serializable that seems like the best of both worlds.

I found a paper on Serializable Isolation for Snapshot Databases but if I understand it correctly from a quick read, it simply replaces read tracking with a new kind of read lock. I have to study it a bit more to figure out the advantage. I think it may keep the special read locks for a shorter period of time than read tracking. And I think it will detect potential anomalies earlier, avoiding the read validation phase at commit time.

But this paper doesn't seem to have anything to do with commit ordering so I'm unsure if that's yet another approach or whether I'm just missing the connection.

Sometimes I think my father was right, that I should have become an academic so I could spend all my time on this kind of thing. But I think that's a fantasy. First there's all the politics in academia (that I would hate). Then there's the need to specialize so much (whereas I like to jump around). And finally, I like to have a practical end to what I'm doing - actual real users benefiting, not just a paper published in some journal to be read by other academics.

* Building the Perfect Beast is a Don Henley song

No comments: