Wednesday, January 04, 2023

Three Weeks for Three Tweaks

It started with reports of slow database queries. At first I didn't pay too much attention; after all, some queries are slow. We're dealing with quite large data (in a small business sense, not in Google terms) and customers are choosing what to select on and how to sort.

It progressed to the same query being slow sometimes and fast other times. That seemed a little more suspicious, but there were still a lot of factors that might have explained it.

Finally someone came up with a simple example where the same query, run on the same data, was 1000 times slower when you ran it slightly differently. That was bad news for me, since it definitely meant there was a problem with the query optimization. Why was it picking such a slow strategy sometimes, when there was an obviously better one? Especially when it was such an extreme difference. The optimization shouldn't have to be very accurate when there is a 1000 times difference!

It didn't turn out to be a bug; the code was working as designed. It was just a particular scenario that wasn't handled well by the current design.

After 20 years, the easy improvements have been made and I'm into the realm of diminishing returns. I ended up needing to "tweak" three aspects of the code in order to fix the issue. And all three were required before I could see if it was going to work. TL;DR - it did solve the problem.

One way of looking at the issue was that the optimization was getting caught in a local minimum. It wasn't smart enough to use a sub-optimal strategy in one spot in order to allow a better strategy overall. (It doesn't explore every possible strategy; for a complex query that would be far too slow.)

None of the changes were difficult, but they were all somewhat invasive, requiring changes to all the query operations.

Background

In case you actually want to try to follow what I'm talking about, here's a little background on how Suneido implements queries.

A query like:

(table1 join table2) union table3

gets parsed into a tree like:
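         union
        /     \
     join     table3
    /    \
table1  table2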

where the leaves are database tables and the other nodes are operations. Each node has one or more strategies for execution, plus some parameters like choice of index.

Optimization starts with calling the optimize method on the root operation (union in this case) which then calls optimize on each of its sub-queries.

Execution works similarly by calling the get method on the root operation, which then "pulls" data from its sub-queries.
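As a very rough sketch in Go (hypothetical names and signatures, not the actual Suneido code), you can picture each node implementing something like:

package query

// Row is one result row (details omitted for this sketch).
type Row map[string]string

// Query is implemented by database tables and by operations
// like where, join, and union.
type Query interface {
	// Optimize estimates the cost of executing this (sub)query,
	// recursively optimizing its sub-queries in the process.
	Optimize() int

	// Get pulls the next result row, returning nil when there are no more.
	Get() Row
}

// Union pulls from its first source until it is exhausted, then from
// the second. (Duplicate handling and choice of strategy are omitted.)
type Union struct {
	source1, source2 Query
}

func (u *Union) Optimize() int {
	return u.source1.Optimize() + u.source2.Optimize()
}

func (u *Union) Get() Row {
	if row := u.source1.Get(); row != nil {
		return row
	}
	return u.source2.Get()
}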

Tweak #1

Let's say we have a query like:

table1 join table2 where id=1

During optimization, query operations can ask their children for their estimated number of rows. In this case the join would get the table size from table1, e.g. 1000, and one (1) from the where, since it's selecting on a unique id. But that information isn't sufficient to estimate the fan-out of the join.

So the first change I made was to return the "population" count in addition to the row count. The where could return 1 for the result count and e.g. 10,000 for the population, i.e. the table2 size. This allows the join to estimate a more realistic fan-out of 1:10.
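In the same sketch style (hypothetical names, and a deliberately simplistic formula), the change amounts to returning two numbers instead of one, which lets the join base its fan-out estimate on the population:

// Nrows reports both the estimated result rows and the "population"
// they were selected from (e.g. the table size before a where filters it).
type Nrows interface {
	Nrows() (nrows int, pop int)
}

// estimateFanout guesses how many right-hand rows will match each
// left-hand row, using the right side's population rather than its
// (possibly heavily filtered) row count.
// e.g. leftRows = 1000 (table1), rightPop = 10,000 (table2) => 1:10
func estimateFanout(leftRows, rightPop int) float64 {
	if leftRows == 0 {
		return 0
	}
	return float64(rightPop) / float64(leftRows)
}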

Tweak #2

The next problem is that when a temporary index is required, there are two costs - the "fixed" cost of building the index, and the "variable" cost of reading from it. But until now these had been combined into a single "cost". Other operations usually have only a variable cost.

Continuing the example from Tweak #1: now that the join can estimate the fan-out is 1:10, it can estimate that it's only going to read about 10 records from table2. So the variable cost will be low, but we'll still incur the entire fixed cost. So in this case we want to minimize the fixed cost. But to do that, I needed to separate the cost into fixed and variable parts everywhere.
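Continuing the sketch, the single cost number becomes a pair so the two parts can be considered separately:

// Cost separates the one-time cost of setting up a strategy (such as
// building a temporary index) from the cost that scales with how much
// of the result is actually read.
type Cost struct {
	Fixed    int // paid in full no matter how many rows are read
	Variable int // roughly proportional to the rows actually read
}

// Total is effectively what the old code compared: one combined number.
func (c Cost) Total() int { return c.Fixed + c.Variable }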

Tweak #3

The remaining problem was that join didn't have any way to tell the sub-query that only a small part of the result would be read. To address this, I added a "fraction" argument to allow queries to tell their sub-queries how much of the result they estimated they would read.

In the running example, 10 rows from 10,000 would be a fraction of .001. Before, it would choose a strategy with e.g. a fixed cost of 1000 and a variable cost of 1000 (total 2000) over one with a fixed cost of 0 and a variable cost of 5000 (total 5000). Now the 5000 is multiplied by .001, giving 5, which is obviously preferable to the other strategy, which still has to pay its fixed cost of 1000.
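Putting the pieces together, the comparison now works out roughly like this (hypothetical code, using the numbers from the example):

// effectiveCost scales only the variable part by the fraction of the
// result the parent expects to read; the fixed part is paid regardless.
func effectiveCost(fixed, variable int, fraction float64) float64 {
	return float64(fixed) + fraction*float64(variable)
}

// With fraction = .001 (reading 10 rows out of 10,000):
//   effectiveCost(1000, 1000, .001) = 1001  (e.g. a temporary index strategy)
//   effectiveCost(0, 5000, .001)    = 5     (e.g. a strategy with no setup cost)
// so the strategy with no fixed cost is now correctly preferred.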

This allowed the optimization to avoid the local minimum.

Conclusion

I'm not sure how common the problem scenario is. It was only because we ran into such an extreme case (1000 times slower) that I got motivated to investigate. Perhaps less extreme cases also occur and will be improved by these changes. As far as I can reason, the changes should not make any cases worse.
