This meant extending the Antlr grammars for requests (to define views) and queries (to expand views). I thought defining views would be easy and expanding them would be tricky. It turned out to be the opposite.
The trick with defining views is that I didn't want to parse the view definition, especially since the view definition follows the query grammar, not the request grammar. I just wanted to slurp up the remainder of the input as the definition. This sounds simple, but I thrashed for 3 or 4 hours on it.
I had something like:
request : VIEW id '=' VIEWDEF ;But you can't do that. What makes it trickier is that although this looks like a single grammar it actually produces separate lexer and parser classes. (lower case rules are for parsing, upper case for lexing.) The lexer operates independently so you can't define VIEWDEF as .* since that'll match anything/everything.
VIEWDEF : .* ;
I tried predicates based on some example I found for implementing C macros:
request : VIEW id '=' {viewdef = true; } VIEWDEF ;But this doesn't work either, probably for a variety of reasons, but one main one is that the parser and lexer are separate so the parser can't control the lexer (at least not easily).
VIEWDEF : {viewdef}? .* ;
So I decided to do the work all in the lexer:
request : VIEWDEF ;But then I got "alternatives are unreachable". Eventually I found an explanation that .* is not greedy. That works well for something like '/*' .* '*/' but not so well for what I wanted. The suggested solution was to use ('\u0000'..'\uFFFE')* instead. Here's the end result:
VIEWDEF : VIEW ID '=' .* ;
request : VIEWDEFThis seems a little ugly, but appears to work. It's a little roundabout to lexically split the view definition, then concatenate it back together, only to split it apart in the parser.
{
String s = $VIEWDEF.text;
int i = s.indexOf('=');
iRequest.view(s.substring(0, i), s.substring(i + 1).trim());
}
VIEWDEF : VIEW WHITE+ NAME WHITE* '=' ANY EOF
{ setText($NAME.text + "=" + $ANY.text); };
fragment
ANY : ('\u0000'..'\uFFFE')+ ;
There could be a better way to do this but I couldn't find it. This will do for now.
I did find a bug in the standard library code while I was working on the views. The QueryCount function was trying to optimize the case where the query was just a table name. But something that looks like just a table name could actually be a view, in which case the optimization wouldn't work. I removed the optimization. It should really have been in the database anyway.
I thought this would fix the last of the test failures, but when I ran them all I found there were still a couple more failing. Hopefully one more day will clean them up.
As powerful as Antlr is, it's not the easiest system to work with. One of the next things I want to work on is compiling the Suneido language. I have a feeling I'll be putting in many more hours getting Antlr to do what I want. I could always go back to porting the hand written C++ recursive descent parser, but I think Antlr is better if it's not too painful to get working.
No comments:
Post a Comment