Thursday, July 07, 2011

Dreaded Deadlock

I was doing some refactoring and came across a stress test that I hadn't run for a while. (it's too slow to include in my unit test suite)

I ran it and it hung up. What the ...?

I thought maybe it was my refactoring but I reverted my changes and it still hung.

The test is multithreaded and using the Eclipse debugger I could see the threads were all hung at the some spot. It wasn't an infinite loop, but it was a synchronized method.

I used jConsole's deadlock detector to confirm what was going on.

Suspiciously, I had recently made a "minor" change to this exact part of the code.

The funny part was that I made the change because FindBugs reported a potential concurrency problem.

I removed the synchronized and used volatile instead and that fixed the problem. I think volatile is ok in this instance, but it's always hard to know for sure.

I think FindBugs was correct that there was a problem. My mistake was slapping in a fix without really thinking it through. That's definitely a no-no when it comes to concurrency. It's a good example of how the "intuition" that adding more locks can't hurt is very wrong.

My second mistake was not running the related stress tests when I made the change. I need to add the stress tests to our continuous build system at work. And maybe set up something at home as well, if I can make it lightweight enough.

Yet another reminder that concurrency is hard! (not that I should have needed the reminder)


No comments: