Tuesday, January 13, 2009

Another Query Optimization

Database query optimization in Suneido is a collection of heuristics. Most of them have come from finding queries that aren't executed efficiently and then figuring out ways to improve it.

The latest example was a query like:
(table1 leftjoin table2) where table2_field = "stuff"
Joins are implemented by reading sequentially through the left hand table (or sub-query) and doing lookups on the right hand side. This means you can read through the left hand side in any order. The right hand side requires an index on the join fields (to do the lookups efficiently).

Normal join's are symmetrical, but the implementation isn't, so part of the optimization process is deciding whether to reverse the order of a join.

But leftjoin's preserve all the records from the left hand side even where there are no matching records on the right. This means leftjoin's are not reversible, which limits optimization and sometimes leads to inefficient execution.

The trick is that this case:
table1 leftjoin table2 where table2_field = "stuff"
is actually equivalent to:
table1 join table2 where table2_field = "stuff"
because table2_field can only be "stuff" if there are matching records.

Once we see that it's equivalent to a normal join, then it's reversible, and there are more opportunities for optimization.

You might wonder why the programmer wouldn't just see this and write the query with a join instead of a leftjoin. In this example, the leftjoin was in a view definition, with the where added later. So the programmer only saw:
myview where table2_field = "stuff"
Besides, programmers don't always catch these things. It's a lot better if the system handles it.

Suneido's query optimization has two main phases - transform which moves and merges operations, and optimize which chooses strategies and index use.

This new optimization will be in the transform phase. Suneido already tracks "fixed" fields that are restricted by where's to a constant value. So if we see a fixed (and non-empty) field on the right hand side of a leftjoin, we can convert to a regular join.

Note: Suneido already moves where's as close to the tables as possible, so even if the query starts out as:
(table1 leftjoin table2) where table2_field = "stuff"
it will be transformed to:
table1 leftjoin (table2 where table2_field = "stuff")
This makes it easier to "see" the fixed field on the right hand side.

This shouldn't be too hard too implement. Especially since I've been working on the join code fixing a bug I found recently (while working on jSuneido).

[I'm not sure how much sense this will make if you're not familiar with this stuff. I wrote it for my own notes as much as anything.]

No comments: