Friday, May 29, 2009

YAGNI Strikes Again

This might also have been called "Premature Optimization Strikes Again".

For a given query, Suneido's query optimization chooses the "best" index to read each table by. But it also tries to determine if it is worthwhile to use additional indexes as well.

The problem is that using an additional index may not help, and may actually slow down the query.

Over the years, each time we ran into a case where it was doing more harm than good I've struggled to try to improve the heuristics.

One of the things that makes this hard is that the query optimizer doesn't really have enough information to know when additional indexes would be useful.

Another case came up recently. We added an index to speed up one thing and sometime later (through customer complaints) found that it had slowed down something else. (We probably need some performance tests to catch this kind of thing.)

Adding an index shouldn't slow anything down, it should only potentially speed things up. But extra indexes tend to lead to the query optimizer choosing additional indexes, and sometimes slowing things down.

As I was working on the current problem, I wanted to measure the speed without additional indexes so I disabled this feature.

Then I started to wonder if this feature was actually providing enough benefit to justify all these hassles (and slow downs). So I ran our test suite of over 1000 tests with additional indexes disabled and it made no difference to the speed!

That doesn't mean there aren't certain situations where this feature would be worthwhile. But it is a pretty good indication that overall it's not providing a lot of benefit. And there's no question that it's caused problems.

So I think I'll do some more testing and try it out in-house and if no problems come up I'll just remove this feature.

Obviously, I shouldn't have added this feature in the first place without "data" to tell me it was worthwhile. i.e. I shouldn't have prematurely optimized. The problem is, there's no way to know if it will be beneficial or not without implementing it. It's not like measuring a bottleneck in your code and then optimizing it. In this case, you're talking about whether adding something will help, and how can you "measure" that without implementing it?

Oh well, I guess there's nothing wrong with determining that a feature is more trouble than it's worth and ripping it out. You just have to get over the psychological hurdle of throwing away a bunch of work.

* YAGNI = you ain't gonna need it

No comments: