Wednesday, May 23, 2012

IllegalMonitorStateException and Stack Overflow

Just when I thought immudb was ready to deploy, I got sporadic IllegalMonitorStateException's.

I wasn't sure what that exception meant, but it sounded like concurrency, and that was not good news.

According to the documentation it comes from wait and notify. The catch is that my code doesn't directly use wait and notify. So something that my code uses is in turn using wait and notify.

I tried to catch the exception in the debugger to see where it was coming from, but of course, it never happened when I ran inside the debugger :-(

I started looking around at the docs for concurrency classes that I use and found that ReentrantReadWriteLock WriteLock unlock can throw IllegalMonitorStateException "if the current thread does not hold this lock".

Ouch. My server is NOT thread-per-connection. It uses a thread pool to service requests. So the thread that handles the request that starts a transaction (and acquires the lock) may not be the same thread that handles the request that ends the transaction (and releases the lock).

Because I'm not testing with lots of connections, most of the time all the requests will be handled by the same thread and it will work. But once in a while an additional thread would be used AND would happen to be ending a transaction, and then I'd get this error.

And my concurrency test is equivalent to thread-per-connection so it doesn't run into the problem. That's a weakness in my test, and in this respect the bounded executor I was using before is more equivalent to the actual server.

Of course, I can't guarantee this is the only source of the error, but regardless, it was a bug I needed to fix.

I needed a read-write lock that allowed lock and unlock to be in different threads. I didn't need (or want) it to be reentrant (where the thread holding the lock can acquire it multiple times).

I searched on the web but couldn't find anything.

It looks like it could be written using AbstractQueuedSynchronizer but I'm afraid if I write it myself I'll make some subtle concurrency mistake. I could find various examples of using AbstractQueuedSynchronizer but not a ReadWriteLock.

I'm stumped so I decided to post a question on Stack Overflow. (After searching to make sure there wasn't an existing question.)

I've always been a fan of Stack Overflow. I'm not a heavy user, but I've asked and answered a few questions, and if it comes up in web searches I give it preference. But this time I got a little frustrated with the responses. No one wanted to answer the question - they just wanted to tell me what I was doing was wrong. That's a valid response to some questions, and I have to admit I hadn't really explained the context. I thought the question was specific enough to make the context unnecessary.

I kept clarifying the question and trying to convince the responders that I really did need what I was asking for. It was even more frustrating that people were voting for the responses that just told me I was wrong.

Of course, part of the frustration was that I started to doubt my own design decisions. Maybe I should just be using thread-per-connection - it would have avoided the current issue.

But in the end, Stack Overflow came through again - someone posted an answer that was exactly what I needed. Bizarrely, that answer didn't get as many votes, even though it was the "right" answer.

The answer was to use Semaphore. I hadn't even noticed this class because it's in java.util.concurrent, not in java.util.concurrent.locks where I was looking. I guess it's not a "lock" although it can be used as one.

And when I went to look at the source code for Semaphore, I found that it is implemented (at least in OpenJDK) with AbstractQueuedSynchronizer (which is in java.util.concurrent.locks)

It was simple to write my own read-write lock using Semaphore and everything seems to work fine. I wondered about performance but it seems to be roughly the same as before. I ran some tests and I didn't get any IllegalMonitorStateException's, but it was sporadic before, so that doesn't guarantee it's fixed.

AbstractQueuedSynchronizer has both "shared" and "exclusive" features which seem to map well to read-write locking. But Semaphore doesn't use the exclusive feature. It seems like you could write a read-write lock based on AbstractQueuedSynchronizer that would be a little "cleaner" than Semaphore. But for now at least, I'm happier using something like Semaphore that is tried and tested.

No comments: