Wednesday, July 16, 2008

jSuneido - progress and Antlr

I now have the main database query language parsing to syntax trees (using Antlr). This is a good step because it includes all the expression syntax. Next, I'll work on the query optimization, and then query execution.

Antlr is working pretty well. One thing I don't like (but it's not unique to Antlr) is that the grammar gets obscured by the "actions". It's too bad, because a big reason for using something like Antlr is so the grammar is more explicit and easier to read and modify.

For example:

cols : ID (','? ID )* ;

turns into:

cols returns [List list]
    @init { list = new ArrayList(); }
    : i=ID { list.add($i.text); }
      (','? j=ID { list.add($j.text); }
      )* ;


(There may be easier/better ways of doing some of what I'm doing, but I think the issue would still remain.)

It would be nice if you could somehow separate the actions from the grammar, similar to how CSS separates style from HTML.

This might also solve another issue with Antlr. To write a tree grammar to process your syntax tree you copy your grammar and then modify it. Sounds like a maintenance problem to me.

Similarly, Suneido's language has pretty much the same expression grammar as the database query language. But I'm going to have to copy and modify because the actions will be different. (For queries I build a syntax tree, but for the language I generate byte code on the fly.) Which means if I want to change or add something, I have to do it in multiple places. Granted, that's no different than the handwritten C++ parsing code, but you'd like a better solution from a sophisticated tool.

No comments: