Wednesday, December 22, 2010

Optimizing jSuneido Argument Passing - part 2

I think I've finished optimizing jSuneido's argument passing and function calling. The "few days" I optimistically estimated in my last post turned into almost a month. I can't believe it's taken me that long! But I can't argue with the calendar.

I gained about 10% in speed. (running the stdlib tests) That might not seem like much, but that means function call overhead was previously more than 10%, presumably quite a bit more since I didn't improve it that much. Considering that's a percentage of total time, including database access, that's quite an improvement.

When I was "finished", I wanted to check that it was actually doing what I thought it was, mostly in terms of which call paths were being used the most.

I could insert some counters, but I figured a profiler would be the best approach. The obvious choice with Eclipse is their Test and Performance Tools Platform. It was easy to install, but when I tried to use it I discovered it's not supported on OS X. It's funny - the better open source tools get, the higher our expectations are. I'm quite annoyed when what I want isn't available!

NetBeans has a profiler built in so I thought I'd try that. I installed the latest copy and imported my Eclipse project. I had some annoying errors because NetBeans assumes that your application code won't reference your test code. But include the tests as part of the application so they can be run outside the development environment. I assume there's some way to re-configure this, but I took the shortcut of simply commenting out the offending code.

I finally got the profiler working, but it was incredibly slow, and crashed with "out of memory". I messed around a bit but didn't want to waste too much time on it.

I went back to manually inserting counters, the profiling equivalent of debugging with print statements. I got mostly the results I expected, except that one of the slow call paths was being executed way more often than I thought it should be.

So I was back to needing a profiler to track down the problem. I'd previously had a trial version of YourKit, so this time I downloaded a trial version of JProfiler. It ran much faster than the NetBeans profiler and gave good results. But unfortunately, it didn't help me find my problem. (Probably just due to my lack of familiarity.)

I resorted to using a breakpoint in the debugger and hitting continue over and over again checking the the call stack each time. I soon spotted the problem. I hadn't bothered optimizing one case in the compiler because I assumed it was rare. And indeed, when I added counters, it was very rare, only occurring a handful of times. The problem was, those handful of occurrences were executed a huge number of times. I needed to be optimizing the cases that occurred a lot dynamically, not statically.

Although I only optimize common, simple cases, they account for about 98% of the calls, which explains why my optimizations made a significant difference.

One of my other questions was how many arguments I needed to optimize for. I started out with handling up to 9 arguments, but based on what I saw, the number of calls dropped off rapidly over 3 or 4 arguments, so I went with optimizing up to 4.

I can't decide whether it's worth buying YourKit or JProfiler. It doesn't seem like I want a profiler often enough to justify buying one. Of course, if I had one, and learnt how it worked, maybe I'd use it more often.

No comments: