Wednesday, May 23, 2012

IllegalMonitorStateException and Stack Overflow

Just when I thought immudb was ready to deploy, I got sporadic IllegalMonitorStateExceptions.

I wasn't sure what that exception meant, but it sounded like concurrency, and that was not good news.

According to the documentation it comes from wait and notify. The catch is that my code doesn't directly use wait and notify. So something that my code uses is in turn using wait and notify.

I tried to catch the exception in the debugger to see where it was coming from, but of course, it never happened when I ran inside the debugger :-(

I started looking around at the docs for the concurrency classes that I use and found that ReentrantReadWriteLock.WriteLock.unlock() can throw IllegalMonitorStateException "if the current thread does not hold this lock".

Ouch. My server is NOT thread-per-connection. It uses a thread pool to service requests. So the thread that handles the request that starts a transaction (and acquires the lock) may not be the same thread that handles the request that ends the transaction (and releases the lock).
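A minimal sketch of that failure mode, assuming two single-thread executors stand in for two threads of the server's pool. The class and method names are made up for illustration, and the lambda syntax needs Java 8 or later.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class CrossThreadUnlock {
    // Returns true if unlocking the write lock from a second thread
    // fails with IllegalMonitorStateException.
    static boolean unlockFromOtherThreadFails() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        // Two separate single-thread executors guarantee two different threads.
        ExecutorService first = Executors.newSingleThreadExecutor();
        ExecutorService second = Executors.newSingleThreadExecutor();
        try {
            // "Start transaction" request: acquire the lock on the first thread.
            first.submit(() -> lock.writeLock().lock()).get();
            // "End transaction" request: release it on a different thread.
            second.submit(() -> lock.writeLock().unlock()).get();
            return false; // not expected: the unlock succeeded
        } catch (ExecutionException e) {
            // The exception thrown inside the task is wrapped by the Future.
            return e.getCause() instanceof IllegalMonitorStateException;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.shutdown();
            second.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(unlockFromOtherThreadFails());
    }
}
```

With a real pool the two requests usually land on the same thread, which is why the error only showed up sporadically.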

Because I wasn't testing with lots of connections, most of the time all the requests were handled by the same thread and it worked. But once in a while an additional thread would be used AND would happen to be ending a transaction, and then I'd get this error.

And my concurrency test is equivalent to thread-per-connection so it doesn't run into the problem. That's a weakness in my test, and in this respect the bounded executor I was using before is more equivalent to the actual server.

Of course, I can't guarantee this is the only source of the error, but regardless, it was a bug I needed to fix.

I needed a read-write lock that allowed lock and unlock to be in different threads. I didn't need (or want) it to be reentrant (where the thread holding the lock can acquire it multiple times).

I searched on the web but couldn't find anything.

It looks like it could be written using AbstractQueuedSynchronizer but I'm afraid if I write it myself I'll make some subtle concurrency mistake. I could find various examples of using AbstractQueuedSynchronizer but not a ReadWriteLock.

I was stumped, so I decided to post a question on Stack Overflow. (After searching to make sure there wasn't an existing question.)

I've always been a fan of Stack Overflow. I'm not a heavy user, but I've asked and answered a few questions, and if it comes up in web searches I give it preference. But this time I got a little frustrated with the responses. No one wanted to answer the question - they just wanted to tell me what I was doing was wrong. That's a valid response to some questions, and I have to admit I hadn't really explained the context. I thought the question was specific enough to make the context unnecessary.

I kept clarifying the question and trying to convince the responders that I really did need what I was asking for. It was even more frustrating that people were voting for the responses that just told me I was wrong.

Of course, part of the frustration was that I started to doubt my own design decisions. Maybe I should just be using thread-per-connection - it would have avoided the current issue.

But in the end, Stack Overflow came through again - someone posted an answer that was exactly what I needed. Bizarrely, that answer didn't get as many votes, even though it was the "right" answer.

The answer was to use Semaphore. I hadn't even noticed this class because it's in java.util.concurrent, not in java.util.concurrent.locks where I was looking. I guess it's not a "lock" although it can be used as one.

And when I went to look at the source code for Semaphore, I found that it is implemented (at least in OpenJDK) with AbstractQueuedSynchronizer (which is in java.util.concurrent.locks).

It was simple to write my own read-write lock using Semaphore and everything seems to work fine. I wondered about performance but it seems to be roughly the same as before. I ran some tests and I didn't get any IllegalMonitorStateExceptions, but the error was sporadic before, so that doesn't guarantee it's fixed.
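As a rough sketch of the idea (not the actual immudb code; the class name, method names, and the MAX_READERS bound are assumptions for illustration): a reader takes one permit and a writer takes all of them. Since a Semaphore has no notion of an owning thread, any thread can release.

```java
import java.util.concurrent.Semaphore;

class SemaphoreReadWriteLock {
    // Assumed upper bound on simultaneous readers; a writer takes every permit.
    private static final int MAX_READERS = 1 << 16;
    // Fair (FIFO) so readers arriving later queue behind a waiting writer.
    private final Semaphore permits = new Semaphore(MAX_READERS, true);

    void lockRead()    { permits.acquireUninterruptibly(); }
    void unlockRead()  { permits.release(); }
    void lockWrite()   { permits.acquireUninterruptibly(MAX_READERS); }
    void unlockWrite() { permits.release(MAX_READERS); }

    // Non-blocking attempt, used below to observe the lock state.
    boolean tryLockRead() { return permits.tryAcquire(); }

    // Helper for the demo: run a task on a fresh thread and wait for it.
    static void runInNewThread(Runnable task) {
        Thread t = new Thread(task);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Acquire the write lock on one thread, release it on another.
    static boolean crossThreadDemo() {
        SemaphoreReadWriteLock lock = new SemaphoreReadWriteLock();
        runInNewThread(lock::lockWrite);
        boolean blockedWhileHeld = !lock.tryLockRead();
        runInNewThread(lock::unlockWrite);
        boolean freeAfterUnlock = lock.tryLockRead();
        return blockedWhileHeld && freeAfterUnlock;
    }
}
```

The trade-off is that nothing stops a thread from releasing a lock that was never acquired, so the calling code has to pair every lock with exactly one unlock.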

AbstractQueuedSynchronizer has both "shared" and "exclusive" features which seem to map well to read-write locking. But Semaphore doesn't use the exclusive feature. It seems like you could write a read-write lock based on AbstractQueuedSynchronizer that would be a little "cleaner" than Semaphore. But for now at least, I'm happier using something like Semaphore that is tried and tested.
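For comparison, here is roughly what the AbstractQueuedSynchronizer version might look like. This is an untested-in-production sketch under my own assumptions: the class name and the state encoding are invented for illustration, it is not fair (a steady stream of readers could starve a writer), and it does nothing to detect unbalanced unlocks.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class AqsReadWriteLock {
    // Assumed state encoding: 0 = free, -1 = one writer, n > 0 = n readers.
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override protected boolean tryAcquire(int unused) {
            return compareAndSetState(0, -1);   // exclusive: only from free
        }
        @Override protected boolean tryRelease(int unused) {
            setState(0);                        // no owner check: any thread may release
            return true;
        }
        @Override protected int tryAcquireShared(int unused) {
            for (;;) {
                int s = getState();
                if (s < 0) return -1;           // a writer holds the lock
                if (compareAndSetState(s, s + 1)) return 1;
            }
        }
        @Override protected boolean tryReleaseShared(int unused) {
            for (;;) {
                int s = getState();
                // Only the last reader leaving can let a waiting writer in.
                if (compareAndSetState(s, s - 1)) return s == 1;
            }
        }
    }

    private final Sync sync = new Sync();

    void lockRead()    { sync.acquireShared(1); }
    void unlockRead()  { sync.releaseShared(1); }
    void lockWrite()   { sync.acquire(1); }
    void unlockWrite() { sync.release(1); }
    boolean tryLockRead() { return sync.tryAcquireShared(1) >= 0; }

    // Helper for the demo: run a task on a fresh thread and wait for it.
    static void runInNewThread(Runnable task) {
        Thread t = new Thread(task);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Acquire the write lock on one thread, release it on another.
    static boolean crossThreadDemo() {
        AqsReadWriteLock lock = new AqsReadWriteLock();
        runInNewThread(lock::lockWrite);
        boolean blockedWhileHeld = !lock.tryLockRead();
        runInNewThread(lock::unlockWrite);
        boolean freeAfterUnlock = lock.tryLockRead();
        return blockedWhileHeld && freeAfterUnlock;
    }
}
```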

Sunday, May 20, 2012

More Immudb Results

The test that I was using to measure concurrency performance was using a bounded executor - code I'd found on the web, back when I knew Java even less than I do now. I decided that it was more complex than it needed to be and I rewrote it just using a number of worker threads. Surprisingly, that seemed to eliminate the drop in performance with more threads.
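The plain worker-thread approach can be sketched along these lines. This is not the actual test code; the class name, the parameters, and the Runnable standing in for one database operation are assumptions.

```java
import java.util.concurrent.atomic.AtomicLong;

class WorkerThreadTest {
    // Starts nThreads workers that each run `operation` opsPerThread times,
    // waits for all of them, and returns the number of completed operations.
    static long run(int nThreads, int opsPerThread, Runnable operation) {
        AtomicLong completed = new AtomicLong();
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < opsPerThread; j++) {
                    operation.run();
                    completed.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long ops = run(4, 1000, () -> { /* placeholder for one transaction */ });
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.println(ops + " operations in " + seconds + " seconds");
    }
}
```

Note that each worker here behaves like its own connection, so a test like this is effectively thread-per-connection.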

I also tested on my Windows machine which has a similar CPU but with an SSD (solid state drive) instead of a hard disk. Here are the results:

Now the performance seems to consistently level off as the number of client threads increases. That's less worrying than the performance dropping, but it's still a little puzzling. It still doesn't appear to be due to actual concurrency scaling issues like lock contention. Perhaps it's running into some other limit like storage or memory bandwidth?

Windows + SSD was roughly 2 times as fast as Mac + HD. I suspect that's mostly due to the SSD, although the OS and hardware could also play a part.

immudb also shows more improvement on SSD than the previous version. (blue to red versus orange to green) My guess would be that this is because immudb's large contiguous writes are optimal for SSD.

On both platforms, immudb is about 4 times as fast as the previous version.

Tuesday, May 15, 2012

Immudb Concurrent Performance Puzzle

One of the tests I hadn't run on immudb (my new append-only database engine for Suneido) was a concurrency test I'd used to debug the previous database engine.

Here are some rough results. As usual, this is not a scientific benchmark. The operations it's doing aren't necessarily representative, and I'm not running it long enough or enough times to get accurate figures. But it still gives me some information.

I'm pretty happy with the results relative to the previous version. Up to four client threads it's about four times as fast - can't complain about that. And another good result that's not shown by this chart is that immudb has far fewer transaction conflicts.

Also on the positive side, I only found a couple of bugs, and none of them took more than a few minutes to fix. This is a sharp contrast to when I was debugging the previous version. I give credit to the immutability and the resulting reduction in locking.

But I'm puzzled by the drop in performance over 4 threads. And I didn't even include 8 threads because the results were all over the place - anywhere from 3000 to 6000. There's some variation with fewer threads, but nowhere near this much.

If the performance just levelled off, that would be one thing, but I'm not happy with the performance dropping under heavy load - that's not what you want to see.

The previous version may not perform as well, but it doesn't show the puzzling drop off with more threads. Although, even with the drop, the immudb performance is still much better than the previous version.

At first I was only running 1, 2, 4, and 8 threads. My computer has 4 cores with hyper-threading for more or less 8 hardware threads so I thought maybe it was because I was using up all the hardware threads and not leaving any for other things like garbage collection and the OS. But that wouldn't explain the drop off at 5 threads.

I looked at hprof output, but it didn't give me any clues. It did highlight some areas that I could probably improve, but they weren't related to concurrency. (They were related to one of my pet peeves - ByteBuffer.)

More threads could mean more memory usage, but JConsole and VisualVM don't show a lot of time in garbage collection.

The obvious issue is some kind of contention. Immudb doesn't do much locking (because most things are immutable) but I could see contention over the commit lock. But JConsole and VisualVM don't show much waiting for locks. Thread dumps show all the threads as RUNNABLE. And I don't see underutilization of the CPUs, as I would expect with heavy contention (since they'd be waiting for locks, not executing).

Another possibility is that more concurrent transactions will mean more work to do read validation. And since each commit has to check against all other concurrent transactions, this is O(N^2) which could be bad. But again, I don't see any signs of it in my monitoring.

And I can't think of any explanation for the excessive variation. Strangely, 16 threads shows much less variation than 8.

I'm more puzzled than worried at this point. In actual usage I don't think it will be an issue because our systems, even the largest ones, won't be applying a continuous load this heavy.

Anyone have any ideas or suggestions of things to look at or try?

Saturday, May 05, 2012

No Service

Every trip I make to the US, I end up wishing I had cell phone internet access.

I have a Canadian data plan on my iPhone. I don't use it a lot, since I can usually find wifi, but occasionally it is very handy, e.g. to use Google Maps to find an address while wandering around. I could use this plan in the US but the roaming charges are ridiculous and I refuse to pay them.

When I got my iPad, I bought the 3G model thinking that at some point I could get a US SIM card and data plan. In the winter when I was in Savannah I stopped at an AT&T store to try to set it up. The two young guys working there were friendly enough, but did not seem at all inclined to help. The first told me I couldn't do it because I had a Canadian SIM card. I told them I didn't have a SIM card. The second guy then said it wouldn't work because I didn't have a SIM card. Couldn't they give me a SIM card? Yeah, they could, but it wouldn't work. Could we try it? No, because I wouldn't have an account. Could they set up an account for me? No, because I'm Canadian. At this point I gave up.

So this trip, I thought I'd try an Apple store and see if they would be more helpful. I explained what I wanted to the greeter. He said "no problem" and gave me an AT&T SIM card and even installed it for me (between greeting each person that came in the store). All I had to do was phone AT&T and set up a plan.

Of course, that wasn't as easy as it sounded. I knew phoning would be a hassle, so first I tried to sign up on the iPad itself. That wouldn't work because it insisted on a US billing address, so my credit card wouldn't work. Then I tried on the AT&T web site, but same problem. So I gave in and phoned. It took me multiple tries to get through the computerized phone system, wait on hold, get the wrong department and get transferred, wait on hold again. Finally I got someone in the correct department who informed me it was impossible. I had to have a credit card with a US billing address. Considering this was for a pre-paid plan, I'm not sure why they care where the money comes from. But it's not like you can argue logic with a call center.

I did a little searching on the web and the suggestion I found was to use a prepaid gift credit card. I've never used one of these but it was worth a try. I picked up a Visa card from a local convenience store. But the instructions with the card said if you wanted to use it on the web you should register it with your address. I needed a US address so I used our hotel address. Finally, I could set up my AT&T plan.

It all seemed to work properly, but I could barely get service. AT&T must not have very good coverage in our area of Las Vegas. When we got to Fresno I tried again. Good service - five bars - but still no data connection. I checked all the settings and checked my account but everything looked correct.

So we stopped at the Apple store on our way out of Fresno. I talked to the greeter, but he had no idea what the problem would be and said I'd need to make an appointment for the Genius Bar. Unfortunately, the next free appointment wasn't for two hours :-( I wandered through the store thinking what to do next. I could just buy a US iPad (the new model would be nice), but I hated to do that when I didn't know what the problem was. There was a guy behind the Genius Bar who didn't appear to be busy so I asked if he had a couple of minutes to help me. He was a little reluctant (probably not supposed to take walk-ups) but agreed to take a look. He checked all the same things I had already checked. Eventually he decided it was because it was a Canadian iPad. (It always strikes me as funny how these support people come up with theories, but then present them as "fact".) But it's a US SIM, I said. He told me to wait a minute and disappeared into the back with my iPad. He returned after 10 or 15 minutes and handed me my iPad saying "here you go". I looked at the screen and there was the little "3G" icon at the top! And I was connected to the internet!

I asked what the fix was. He hesitated and then said they turned it off and back on. Shelley was standing next to me and laughed, "isn't that what you would have told me to do?" I kicked myself for not trying this. The reason I didn't is that iOS tries hard to hide the whole idea of "turning off" your device. They try to abstract away the issue of on or off, but as usual, it's a "leaky" abstraction. Of course, the other question is why it needed to be turned off and on - what did that reset?

So finally, I have my US data. Has it been useful? Not very, since we're mostly out in the boonies and there is poor coverage. It should be more useful when I'm traveling in cities. In any case, it had become a quest and the usefulness wasn't really the motivation any more!

I only signed up for the $15 prepaid plan with 250 MB since I wasn't sure how it would work. I'm not sure if that'll be sufficient. If I needed more I could get 3 GB for $30. Also, in theory, I think I could swap the SIM card into my iPhone since that's a little easier to carry around and use on the street. Now I just have to remember to cancel my plan when I get home, although the most they could bill was the $50 of the prepaid card.