Tuesday, July 08, 2008

jSuneido and Antlr

I still haven't decided whether to use Antlr to replace my handwritten lexers and parsers in Suneido, but I wanted to take the experiment a little further.

I've been working on the methods to create/modify/drop database tables, so the obvious grammar to try was the database "admin requests" (the simplest of the grammars in Suneido).

I'd already written the basic grammar when I first played with Antlr, so the main task was to hook up the grammar to jSuneido. For small, simple grammars it is often more of a challenge to "hook them up" than it is to write the grammar.

My first approach was to use Antlr's built-in AST (abstract syntax tree) support. That wasn't too bad. It definitely helped to have AntlrWorks to test and debug.

But then I had to do something with the AST. For big grammars, you can write another grammar to "parse" the AST, but that seemed like overkill for this. I could have just extracted the information from the AST manually, but that isn't covered in the book and there's not much other documentation.

Instead, I decided not to use the AST support and to just accumulate the information myself. It took some experimentation to figure out how to do this. Again, it's not an approach the book really covers. In hindsight, I'm not sure it was any easier than figuring out how to use the AST.

One of the weaknesses of the handwritten C++ parsers is that I didn't decouple the parsing from the desired actions. It would be really nice to be able to use the same parser for other things, e.g. checking syntax without generating code. What I'm hoping to do with jSuneido is to have the parser call an interface, so that I can plug in different implementations. So even though I don't really need this for the admin request parser, I decided to try out the approach.
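To make the idea concrete, here is a minimal sketch in plain Java. It is not the jSuneido code: the names RequestSketch, Actions, SyntaxOnly and Recorder are made up for illustration, the "create name (columns)" / "drop name" syntax is a simplified stand-in for real admin requests, and the toy parse method stands in for the parser that Antlr would generate. The point is only that the parser reports what it recognized through an interface and never decides what happens next.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RequestSketch {
    // Hypothetical callback interface: the parser reports what it recognized.
    public interface Actions {
        void create(String table, List<String> columns);
        void drop(String table);
    }

    // Implementation 1: syntax check only, every callback is ignored.
    public static class SyntaxOnly implements Actions {
        public void create(String table, List<String> columns) { }
        public void drop(String table) { }
    }

    // Implementation 2: records what was parsed (a real one would
    // call the database code instead).
    public static class Recorder implements Actions {
        public final List<String> log = new ArrayList<>();
        public void create(String table, List<String> columns) {
            log.add("create " + table + " " + columns);
        }
        public void drop(String table) {
            log.add("drop " + table);
        }
    }

    // Toy stand-in for the generated parser. It knows only the made-up
    // request syntax and the Actions interface, nothing about the database.
    public static void parse(String request, Actions actions) {
        // Split on whitespace, commas and parentheses.
        String[] tokens = request.trim().split("[\\s,()]+");
        if (tokens.length == 2 && tokens[0].equals("drop")) {
            actions.drop(tokens[1]);
        } else if (tokens.length >= 3 && tokens[0].equals("create")) {
            List<String> columns =
                new ArrayList<>(Arrays.asList(tokens).subList(2, tokens.length));
            actions.create(tokens[1], columns);
        } else {
            throw new IllegalArgumentException("syntax error: " + request);
        }
    }
}
```

Calling RequestSketch.parse(text, new RequestSketch.SyntaxOnly()) checks a request without doing anything, while passing a Recorder (or, in a real system, something that talks to the database) runs the same parser with different effects.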

Once I got it working in AntlrWorks, the next step was to get it integrated into my Eclipse project. I had a few hiccups along the way. One was ending up with mismatched versions of the Antlr compiler and runtime, which leads to lots of errors.

But eventually I got it working. I have a few things left to implement but there's enough working to validate the approach.

One downside is that building jSuneido has gotten more complex. Now you need the Antlr compiler and runtime. I guess I could eliminate the need for the compiler if I distributed the Java files generated by the Antlr compiler.

Deploying jSuneido will now require the runtime as well. I'm not totally happy about that, considering one of Suneido's goals is ease of deployment.

For just the admin request lexing and parsing it's probably not worth it. The next step will be the actual database query language. But the real test is how well it works for the main Suneido language. If that goes well, then it's probably worth the "costs".

If you're interested, the code is in Subversion on SourceForge, e.g. Request.g
