Friday, May 29, 2015

suneido.js

suneido.js = Suneido + JavaScript

After getting back from my last traveling I mentioned to a friend that I was looking forward to getting back to looking into a Go version of Suneido. Not that I’d committed to it, but it seemed like a reasonable path to take since it had the potential to replace both the C++ cSuneido client and the Java jSuneido server.

But my friend pointed out that what we really needed was a web front end. Which he’s told me before, but for some reason this time I paid attention. It’s easy to get into a rut, to make decisions unconsciously based more on what is familiar than on what’s really best. Not that you ever really know what the best path is, but a Go implementation of Suneido wouldn’t really change the game. It’d just be an incremental improvement.

Currently our front end GUI is Windows specific. So to access our application software over the internet you have to use RDP (Remote Desktop Protocol), which requires a Windows server. So even though jSuneido will happily run on Linux, we can’t take advantage of that.

And although RDP works fairly well, our Windows GUI is not the look and feel of the web that people expect these days. We’re starting to get comments that our program looks dated. (Reminds me of when we moved from terminal mode MS-DOS to Windows. And yes, I have been in this business that long!)

If all we wanted was to write web software, there are plenty of languages and tools and frameworks to do that. But we have a million lines of complex Suneido code. Rewriting that in another language for another database is not something I even want to think about!

And therefore the idea of somehow running our existing code, with the least amount of changes, on a web front end.

One way to do that would be to rewrite just the front end in HTML + CSS + JavaScript. But that would mean a certain amount of duplication since some code has to run on the front end and the server. And it would mean getting all our programmers up to speed in HTML, CSS, and JavaScript as well as Suneido.

Suneido’s philosophy (right or wrong) has always been to provide an integrated “everything included” platform that shields programmers from the hassles and complexities of dealing with multiple languages and databases and frameworks. That shielding has also made Suneido a ery stable platform. I can’t think of too many languages or databases or frameworks where code you wrote 15 years ago would work unchanged today. (Java itself has been around that long, but that’s just the language piece of the puzzle.)

So what I really wanted was an approach that would encapsulate and hide the HTML, CSS, and JavaScript from application developers, like Google GWT does. (Of course, you’d need to know that stuff to work on the lower level implementation.) In my mind, that breaks down into two pieces.

The easy part is to be able to compile/translate Suneido code to JavaScript. I’ve been looking into that, and have a lot of it implemented. More on that in another blog post.

The harder part is to figure out how to map our Windows oriented GUI to the browser. At a general level that’s not that hard. Our user interface is “component” based with screens composed from a small number of basic building blocks. It shouldn’t be too hard to obtain or write equivalent “widgets” for the web. On the other hand I’m certain that some of the details will be painful.

One of the issues is that our current user interface is very chatty. It was developed on and for local area networks with low latency. So it uses a lot of lower level requests to the server. If we stayed on local area networks we could stick with the same model, but really we want to be able to run across the internet, where you have much higher latency. Bandwidth is also a factor, but even if you have really high bandwidth the latency will kill you if you’re doing too many little requests.

Talking about this with some of my programmers, one of the questions was whether we would use node.js for the server. I think they were thinking that the whole of Suneido would be re-written in JavaScript, similar to how cSuneido is written in C++ and jSuneido is written in Java. But it wouldn’t make sense to write Suneido’s database in JavaScript. The server can still run the existing jSuneido. It’s perfectly capable of acting as a web server and at the same time providing the database back end. The server back end would still be running Suneido code (compiled to JVM byte code). Only the front end user interface code would be translated to JavaScript to run on the browser.

I wouldn't say I've committed to this direction, but it seems worth looking into it. The big question is how much of our existing code we'd be able to use, and how much revising / rewriting we'd have to do.

Tuesday, May 19, 2015

Stonebraker on Databases

I recently listened to a podcast on Software Engineering Radio with database expert Michael Stonebraker. (at the recommendation of a friend - thanks Larry) It's a few years old, but still quite relevant.

As usual, I relate much of what I read and hear about to Suneido. In this case it was interesting to see how Suneido's database held up to Stonebraker's criticism of current conventional relational databases.

He talks about profiling a database and finding that 90% of the time was spent on "overhead". The time consisted of four main areas:

buffer pool management
Keeping a cache of database pages, page replacement tracking, and converting from external (disk) record format to internal (in-memory) format. Since most transactional (OLTP) databases now fit in main memory, this work is even more wasteful.

record locking
Most conventional relational databases use pessimistic row level read and write locks to ensure that transactions are atomic, consistent, and isolated (i.e. ACID). Managing locks and waiting to acquire locks can be slow.

thread locking
Most databases are multi-threaded which, in most cases, means they use locking to protect shared data structures. Locking limits parallelism.

crash recovery write-ahead logs
For durability (the last part of ACID) databases commonly use a write-ahead log where changes are written (and in theory flushed to disk) prior to updating the actual database. In case of a crash, the log can be used for recovery. But writing the log is slow.

So how does Suneido do in these areas?

Instead of a buffer pool, the database is memory mapped. This handles databases that fit into memory as well as ones that don't. When they don't the page replacement is handled by the operating system which is in a good position to do this efficiently. Modern operating systems and hardware already have so many layers of caching and buffering that it seems crazy to add yet another layer of your own!

Suneido also does as much work as possible using the external format of records, only converting to internal format when necessary. For example, most database "wheres" are done in external format. The encoding of values into external format maintains ordering, so sorting can also be done while still in external format.

Rather than pessimistic record locking, Suneido uses optimistic multi-version concurrency control in conjunction with an append-only "immutable" database. This means that read transactions do not require any locking and do not interact with update transactions. Write transactions only require locking for the actual commit.

Suneido's database server  is multi-threaded with requests handled by a pool of threads. But most of the internal data is in immutable persistent data structures, which require minimal locking. And the database itself is an immutable persistent data structure requiring minimal locking. (I minimized the locking as much to avoid bugs as for performance.)

Finally, Suneido doesn't use a write-ahead log. Instead, its immutable append-only design makes it a type of log structured database, where the database itself can act as the log.

NOTE: This is based on the newer Java implementation of Suneido which has a different database engine than the older C++ version.

I haven't benchmarked Suneido's database against other systems so I can't make any claims about speed. But in terms of avoiding most of the overheads that Stonebraker identifies, Suneido seems to hold up pretty well.

Saturday, May 16, 2015

Beautiful Code for Parsing

I've recently been reading up on JavaScript. One of the books I was re-reading was JavaScript: The Good Parts by Douglas Crockford. In there he mentioned the chapter he wrote in Beautiful Code. (Of which, there are mixed reviews.) It's been a while since I read that book and I didn't remember his chapter. So I pulled out my copy (old enough that it's an actual paper copy). I didn't want to carry the (large) book around so I took advantage of how O'Reilly lets you register your paper books and then buy the ebook for $5. Of course, after I did that I found Crockford's chapter is available on his web site. That's ok, I wouldn't mind re-reading the whole book.

The chapter was on parsing based on Top Down Operator Precedence (TDOP) by Vaughn Pratt. I thought I'd start by reading Pratt's original paper. Crockford's link takes you to the ACM citation which wants to charge you to read the paper (even though it's from 1973), but a quick web search found a public version. I found the paper a little hard to follow.

Crockford's article is an example of a parser for a subset of JavaScript written in that subset. It still took some effort to understand but I found it easier to follow than the original paper.

Suneido's hand written top down recursive descent parser is ok, but it has a lot of methods to parse expressions. It always seemed like there should be a better way. At some point I looked at the Go parser and it manages with a lot less methods by using precedence. I'd be quite interested to see what a Suneido parser would look like using Pratt's approach. Although, because Suneido's syntax grew ad hoc, it has some parts that are a little ugly to parse.

I love coming across elegant new algorithms and ideas. You can tell how much of a geek I am by the fact that I get as much pleasure out of a new algorithm as most people would from a bowl of ice cream :-) Which might help explain why I'm so skinny - not many calories in a delicious algorithm.