Friday, April 30, 2010

jSuneido Concurrency Bug

I spent all day yesterday working on my concurrency bug in jSuneido. (see previous post)

It didn't seem like I made any progress, but I guess that's not true - I eliminated a few possibilities.

Originally I assumed it was a deadlock since it was "freezing". So I managed to make the problem happen (that takes a while), connected JConsole, and ran the deadlock detector. It didn't find any deadlocks. Thinking about it, that makes sense since it's really only the client that gets "stuck" - the server thinks everything is fine. There are no worker threads running for that client.
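
For reference, the same kind of check can be made from code. This is only a sketch, assuming (as far as I know) that JConsole's deadlock detection rests on the standard `ThreadMXBean` API; the class name `DeadlockCheck` is made up for illustration. It also shows why the check comes back empty here: it can only report threads that are blocked on each other, and there are no worker threads at all for the stuck client.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockCheck {
    /** Returns how many threads in this JVM are currently deadlocked (0 if none). */
    static int countDeadlockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Prefer the check that also covers java.util.concurrent locks, if the JVM supports it.
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        if (ids == null) {
            return 0; // both methods return null when no deadlock is found
        }
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info != null) { // a thread may have exited since the check
                System.out.println(info.getThreadName() + " is waiting on " + info.getLockName());
            }
        }
        return ids.length;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```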

So much for that assumption. I tried to narrow it down, without much success. I added a bunch of assertions, which failed, but I found it was my assumptions that were wrong, not the code.

The client appears to be waiting for a response to a request. But the server is not in the process of handling a request - it either didn't get the request, or it thinks it finished it. The read and write buffers for that client are empty.

Sometimes you have to run the tests for a long time before the problem shows up. So you try something, and the longer the tests run successfully, the more you think you've got it fixed. But eventually it has always shown up. I guess I should be thankful this is on the order of minutes rather than hours! Even so, it's reminiscent of the "old" days when you had to have something to read while you waited for compiles. (And no internet to browse back then.)

On the positive side, the handling for inactive transactions and clients seems to be working well - when the client gets stuck, the server eventually aborts its transactions and if it stays stuck, closes the connection.

My current hypothesis is that it's related to a potential overlap between handling requests. As an optimization, worker threads first attempt to directly write their response. This is in non-blocking mode so it may or may not write everything. Any remaining data is sent by the select thread when the socket becomes write ready. But if the worker does successfully write the entire response, and then there is a context switch back to the "select" thread, the next request can be read and another worker started, before the previous worker has a chance to finish. I'm not sure why this would be a problem, but my assumption when I was writing the code was that workers for the same client would not overlap, so I didn't worry about making the data structures thread safe. That's the next area I want to look at.
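
To make the suspected window concrete, here is a rough sketch of the "try a direct write, leave the rest for the select thread" pattern. It is not the jSuneido code: `DirectWriteSketch`, `Client`, `pending`, and `limitedChannel` are all invented names, and the fake channel just stands in for a non-blocking socket that accepts a limited number of bytes per write.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.concurrent.ConcurrentLinkedQueue;

public class DirectWriteSketch {
    /** Hypothetical per-client state shared by workers and the select thread. */
    static class Client {
        final WritableByteChannel channel;
        // Leftover data the select thread would send once the socket is write ready.
        final ConcurrentLinkedQueue<ByteBuffer> pending = new ConcurrentLinkedQueue<>();

        Client(WritableByteChannel channel) {
            this.channel = channel;
        }

        /** Called by a worker. Returns true if the whole response went out directly. */
        boolean writeResponse(ByteBuffer response) {
            try {
                channel.write(response); // non-blocking: may send all, some, or none
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (response.hasRemaining()) {
                pending.add(response); // the select thread finishes it later
                return false;
            }
            // Suspected window: the client already has its full response, so its next
            // request can be read and handed to a second worker before this worker
            // has finished whatever it still does after this point.
            return true;
        }
    }

    /** Stand-in for a non-blocking socket that takes at most `limit` bytes per write. */
    static WritableByteChannel limitedChannel(final int limit) {
        return new WritableByteChannel() {
            public int write(ByteBuffer src) {
                int n = Math.min(limit, src.remaining());
                src.position(src.position() + n); // pretend n bytes were sent
                return n;
            }
            public boolean isOpen() { return true; }
            public void close() { }
        };
    }

    public static void main(String[] args) {
        Client client = new Client(limitedChannel(64));
        System.out.println(client.writeResponse(ByteBuffer.wrap(new byte[10])));  // true
        System.out.println(client.writeResponse(ByteBuffer.wrap(new byte[100]))); // false
        System.out.println(client.pending.peek().remaining());                    // 36
    }
}
```

The queue here is a concurrent one only so that the sketch itself is safe to share between threads. Whether the real per-client structures need that kind of protection is exactly the open question.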

Thursday, April 29, 2010

Dig Deep

In the last few days I've questioned a number of changes that I saw going through version control. None of them was "incorrect". The recurring theme was that they were fixing problems at too "shallow" a level.

For instance, a dialog was using a feature to save its size after being changed by the user. This wasn't working properly, so the feature was disabled. There are a couple of problems with this. One is that we lose a feature from that dialog that we presumably wanted. But more importantly, that feature is used in many places and if there's a bug in it, it's going to show up elsewhere. Are we going to disable it everywhere? How much time are we going to spend dealing with each new instance that turns up?

Of course, sometimes you have to fix a bug the most immediate way. That's fine as long as you go back and look for the real problem.

Here's another example: we got an error about a missing database table. This was after wiping out all the data so the initial thought might be to ignore it. But this table is supposed to be created automatically. So we dug a bit deeper and we found one place where it didn't create it when necessary and fixed that. But access to this table is supposed to be encapsulated in one class. This class handled creating the table if necessary. The spot we fixed was accessing the table directly. So we moved this access into the class to restore encapsulation. But then we realized that it shouldn't even need to use this class/table. It turns out we had used it to work around yet another bug. So we removed the use of the class/table and fixed the original, previous bug. The end result was less code and simpler.

This process of digging deeper into a problem can take time. It can take orders of magnitude more time than a quick fix to the immediate problem. Which is why we often apply the quick fix and move on. But the long term result of that approach is technical debt - your code gets more complex, less understandable, less consistent, harder to change. And in the long run the quick fix ends up costing you more time than if you had fixed it properly in the first place.

This is a bit like the "lean" idea of finding and fixing "root causes". One of the techniques is to ask five (more or less) "whys". Why did we get a missing table error? Because the code wasn't creating it on demand. Why? Because it was bypassing the access class. Why? Because it was doing something non-standard. Why? To work around another bug.

Tuesday, April 27, 2010

Is It All About Sales?

It always bugs me when people (usually sales people) say "It's all about sales. Everyone in the company needs to be selling all the time, not just the sales and marketing department."

Before anyone gets the wrong idea and starts jumping on me - I am not disagreeing with this. But I don't think it's the whole picture, and I don't think it's the most useful way to think about it.

Take my company - we have sales and marketing, product development, and customer support. To me, they're like three legs of a stool. Does it make sense for one leg to say it's all about that leg? All three legs are essential. Take away any one and the stool will fall over.

Part of the reason sales is seen as primary is that they bring in the money. It's pretty obvious that's an important role. They have the most direct link to the success of the company. But just because the other roles have a less direct link to success doesn't make them less important.

Think of fighting a fire. The fireman holding the nozzle of the hose has the most direct link to spraying water on the fire. But without the rest of the hose, and the source of water, he would be out of business.

Take Apple for example, obviously great at sales and marketing. But are they "all" about sales? Where would they be without great products?

A good marketing and sales group sometimes think it can sell anything. (And there is some truth to that.) But when my company has tried to sell third-party products it has become pretty clear that if the product doesn't have good development and good support then in the long run, it's pretty hard to sell.

You don't usually hear "it's all about customer support" or "it's all about product development". To me, those views would be just as incomplete. For example, 37 Signals claims they have no marketing people. That may be true in the sense that they don't have anyone whose only job is marketing. But get real, those guys do more marketing than most marketing departments. They're certainly not "just" programmers.

Perhaps another reason you hear "everyone needs to sell" more often is that non-sales people tend to avoid selling. Whereas sales and support are usually more than happy to talk about where the product should be developed. Another weak spot is that programmers tend to avoid customer support (I know I did!).

While equally incomplete, I think there's value in the idea that everyone needs to do customer support and everyone needs to do product development, just like there is value in the idea that everyone needs to sell.

I think what it boils down to is that everyone needs to keep the big picture in mind, not just any one particular role. That's easier and more obvious in a small company. Most of the time I think my company does fairly well at this. Sales and support contribute to product development, programmers and sales help with customer support, and yes, hopefully everyone helps sell.

Saturday, April 24, 2010

Cultures and Attitudes

You would think that peoples' attitudes and behaviour would be an individual thing with a range of variation in any organization.

But attitudes and behaviours are greatly affected by the culture of an organization, biasing people within the organization to act a certain way. There is still variation, and you can find good people in a bad organization (or vice versa). But even this is reduced by a tendency for good people to leave a bad organization, and bad people to be removed from a good organization.

Amtrak (US passenger rail) seems to have a culture of small minded bureaucracy.

On the way to Portland there were only a handful of people in our car. So one person moved to a seat with more room. No big deal, until the steward came along: Is this your seat? You must remain in your designated seat. It is not possible to switch seats. The passenger wisely kept quiet. Eventually the steward wound down his litany of pronouncements. The passenger looked at him. Then the steward said "Ok, here's what we can do. I can let you stay in this seat as long as you agree to move if someone else is assigned this seat." I had to shake my head. This was exactly what the passenger had said he would do right from the start. "But you must leave your seat assignment card on your assigned seat."

But that wasn't the end of it. A bit later a second steward came around: Is this your assigned seat? No, but the other steward said I could sit there. You could tell he was frustrated at having his exercise of power cut short and looked for another outlet. "Ok, but you must have your seat assignment card on the seat you are sitting in, not the seat you were assigned." (The exact opposite of the first steward.)

This kind of bureaucratic attitude is often exemplified by people saying "you must" or "you cannot" or "I cannot", even (or especially) when this is obviously untrue.

For example, when I went to get tea I was told "I cannot put hot water in your cup." Obviously he "could"; presumably what he meant was that he wasn't supposed to. So why not say that? It would be a lot less frustrating if he said "sorry, I'm not allowed to do that". Or just bend the rules. The previous time, the person had put the hot water in a paper cup, as he was required to do, and then poured it into my cup. This satisfied his requirement, while at the same time fulfilling my desire to not waste a cup.

I think creatively interpreting (bending) the rules (and not being punished for it) is a sign of a good organizational culture. Whereas, presenting the rules as "cannot's" is a sign of a bad one.

The problem is that saying "you cannot" makes you feel more powerful. While saying "I'm not allowed" makes you feel less powerful. And of course, most people would prefer to feel more powerful.

I worry about where my own company's culture fits. I hope it's clear that people should understand the reason for "rules" and use their judgement in interpreting them. And at the very least be truthful and say "I'm not allowed to do that" rather than "you cannot".

Thursday, April 22, 2010

Building the Wrong Thing

"The biggest cause of failure in software-intensive systems is not technical failure; it's building the wrong thing." - Leading Lean Software Development

(Presumably that assumes a certain minimal level of technical expertise.)

I spend a lot of time thinking about how to reduce the errors (bugs) in our software. We spend a big percentage of development time on fixing errors. Just to be clear, most of these errors are small internal issues that aren't even noticed by the majority of users. I think as far as our users are concerned, our quality control is good.

But if we could reduce those errors, we would have more time to get stuff done.

But ... if the biggest cause of failure is building the wrong thing, then having more time won't necessarily help. Or at least it's not the best way to improve. By this line of thinking, we'd be better off working on building the right thing, rather than on fewer errors. (Of course, you still want to reduce errors as well.)

It reminds me of the old saying "if it doesn't have to work, you can make it as small/fast/etc. as you like", if you interpret "work" as "building the right thing".

Of course the trick is how to do better at building the right thing. In many ways that's tougher than reducing errors, and that's tough enough.

If you're not involved in the software business at this level, this talk of "building the right thing" may strike you as strange. Why wouldn't you build the right thing? The problem is that no one knows what the "right thing" is. And no, the customers don't know either. As Ford said, "If I'd asked my customers what they wanted, they'd have said a faster horse."

None of this is really new, but it's good to be reminded. Definitely food for thought.

Tuesday, April 20, 2010

Books and More Books

If you like books and you're in Portland, don't miss Powells. And if you like technical books (e.g. computer stuff) don't miss their technical book store a few blocks away.

The technical store even has a display of old computer hardware - an Apple II, an Osborne, an original Mac, etc. I can remember quite clearly when those machines were the hot new thing. That's dating myself, but there's something to be said for perspective.

One of the nice things about Powells is that they mix new and used books. That means you see older books and books that are out of print. They had several copies of Software Tools by Kernighan and Plauger. I remember how excited I was about that book. It came out in 1976, when I was 16. A few years later, a friend and I wrote versions of some of those tools (in C, compiled with Whitesmiths compiler) and sold them. Not that we made any money at it.

That's the exception though. Most computer books don't age very well. They become useless a few years after they're published. I should know, I have piles of them!

Between the two Powells stores, and a Borders I visited on the way there, and between computer, science, environment, and science fiction, I made note (on Evernote) of 30 books I wanted to read. I didn't buy any, which in a way was easier - it would have been tough to choose! (I've already bought one on my Kindle).

Oh well, I guess there's worse things to be addicted to!

Hands on the iPad

I got to touch an iPad today at the Apple store in Portland.

As others have mentioned, it is heavier than you expect. I'm not sure why that is when you've never felt one before. Maybe because the ads make it look so thin. You expect it to be as thin and light as an iPod Touch but it's thicker and heavier like an iPhone.

Beyond that it was like a bigger, faster, fancier iPhone. Cool, but nothing earth shaking.

Do I want one? Sure. Am I desperate to buy one this minute? Not really. My current iPhone plus Kindle combo works pretty well for me.

It was interesting to listen to people's reactions to the iPad. People still aren't quite sure where it fits.

As usual, the Apple store was packed. There always seems to be an undercurrent of excitement.

I fondled the new MacBooks, but externally they haven't changed much. It's that i7 heart I lust after. And I gazed longingly at the 27" iMac with quad core i7. Someday. It's not like there's anything wrong with my current MacBook or iMac. Just seductive marketing.

Location:SW Naito Pkwy,Portland,United States

Wednesday, April 14, 2010

A Plea to Evernote

Evernote is a popular app for the iPhone and iPod Touch. (There are also PC and Mac versions.)

You can write notes and record photos from the iPhone camera. Notes are synced to Evernote's servers and to all your devices. You can organize and search your notes. Evernote even does OCR (optical character recognition) on images so you can search for text in images.

It's a great product and many people swear by it.

But ...

The software is buggy. It freezes. It crashes. You have to reinstall it periodically to get it working again. Check out the problem reports on GetSatisfaction for more examples.

A tool like this is only worthwhile if it's dependable. I put all my travel information in it and it died almost as soon as I left home. That doesn't make a happy customer. (thankfully reinstalling got me going again)

I'm not the only one with a love hate relationship with Evernote. People give up on it, come back thinking it's fixed, only to discover some time later (after entering a bunch more notes) that it's still got bugs, just a different set from before.

I'm not sure where the problem lies. But if they want to be successful in the long run they have to do better.

Quit adding new features until it's more stable. Add more automated tests. Set up alpha and beta testing programs. Review the code. Refactor the code. Throw out problem code and rewrite it. Whatever it takes. I can't see anything more important.

It's a good product but it doesn't matter how many great features it has if it's not reliable.

Wednesday, April 07, 2010

Not Sold on Multiple Monitors

Jeff Atwood recently posted about using three monitors. He's a big fan of multiple monitors.

Personally, I'm not so sure. A few years ago we upgraded almost everyone at our company to dual monitors. I was one of the few hold outs.

At that point, one of my arguments was that I preferred one large monitor to two smaller ones. Even if the surface area is larger, two monitors isn't so good if you want one large window (e.g. Eclipse or Lightroom). Of course, if money isn't an issue, you could just have two (or three) large monitors.

To be clear, I'm not saying there aren't good, valid uses for multiple monitors. Studies have shown they can increase productivity. But I'm not sure that translates into normal usage. As the saying goes, "In theory, theory and practice are the same. In practice, they are not."

I think the key issue is how people use multiple monitors. I suspect that many people use them in ways that are counter productive, that just add distraction and interruptions to their environment. I think you should be closing your email and chat and Twitter and Facebook - not using multiple monitors to keep them constantly in your face.

Of course, it's virtually impossible to wean people away from keeping email etc. open all the time. It doesn't matter how much you talk about the cost of distractions and interruptions. Regardless of the excuses they give, I think the problem is that a lot of people welcome these distractions. After all, it's hard work to focus on one task for extended periods.

If you're going to focus on one thing at a time, then do you still need multiple monitors? Occasionally, I would like to have documentation or the program I'm working on open as well as my IDE. But I find virtual desktops to work pretty well for this.

I do agree that ending up with one monitor straight ahead is an advantage of three monitors over two.

Tuesday, April 06, 2010

Spoke Too Soon

Good thing I'm not superstitious or I'd be certain I jinxed myself writing my previous blog post.

Right after posting it I started to test limiting the maximum number of threads and ran into a problem I'd actually seen earlier in the day but written off as nothing because I was still getting my test program working.

For some reason, occasionally one of the clients will get "stuck". The server appears to have completed the request, but the client appears to still be waiting for a response. Eventually the client times out.

And, Murphy's Law, it seems to not happen (or at least nowhere near as often) when the debugger is attached. Which happens to be how I was running the tests all day, just in case I ran into any problems that I would need to inspect. When I resumed testing after writing the blog post, I didn't attach the debugger, since I was just wrapping up.

Of course, not being able to make it happen easily, and not at all when the debugger is attached, is going to make it hard to track down. Exactly what I was afraid of. Wish me luck!

Putting jSuneido Through Its Paces

I've spent the day exercising jSuneido. I planned to do more of this sooner, but I kept finding other parts to work on. Partly I'm scared of finding bugs, which at this stage in the game are liable to be obscure and hard to fix. Wimp!

But so far, no new bugs. I've had three computers, each with multiple instances of clients (from 4 to 40), all hammering constantly on the jSuneido server. So far so good - it appears to be solid. I've been using JConsole to monitor it. It's fun to watch the memory grow and then shrink after garbage collection, and threads come and go in the thread pool.

My iMac is just dual core. I also want to run some tests on a quad core machine at the office. More cores is a better test of concurrency since there is more going on literally at the same time, rather than just with context switching. Also, at the office I've got even more machines to run clients on. And of course, I want to see how it scales with more cores, which is to a large degree the point of this whole project.

One thing I'm not sure about is whether I should be limiting the size of the thread pool. Currently I'm just using a standard Executors.newCachedThreadPool which isn't limited. So if I fire up 40 clients making constant requests, I get 40 threads. But 40 threads on 2 cores presumably means a lot of context switching overhead. I'll have to experiment. (Note: in normal operation there wouldn't be one thread per client because the clients wouldn't be making continuous requests. There would only be as many threads as concurrent requests.) The minimum context switching would be to have the same number of threads as cores. But you want to service clients concurrently, i.e. a long request shouldn't block other requests. So you have to sacrifice some context switching overhead to gain more concurrency.
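
Here is a small sketch of that trade-off. It builds the two pools directly with `ThreadPoolExecutor`, using the same settings that `Executors.newCachedThreadPool()` and `Executors.newFixedThreadPool(n)` use in OpenJDK, so the peak thread count can be read back. The class and method names are made up, and "two threads per core" in `main` is just a guess at a starting point for experiments, not a recommendation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizing {
    /** Mirrors Executors.newCachedThreadPool(): no upper limit on threads. */
    static ThreadPoolExecutor unbounded() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    /** Mirrors Executors.newFixedThreadPool(n): at most n threads, extra requests queue up. */
    static ThreadPoolExecutor bounded(int n) {
        return new ThreadPoolExecutor(n, n, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    /** Submits `requests` tasks that are all in progress at once and returns the peak thread count. */
    static int peakThreads(ThreadPoolExecutor pool, int requests) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // simulate a request that is still being handled
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int peak = pool.getLargestPoolSize();
        release.countDown(); // let every simulated request finish
        pool.shutdown();
        return peak;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("unbounded, 40 clients: " + peakThreads(unbounded(), 40) + " threads");
        System.out.println("bounded, 40 clients: " + peakThreads(bounded(cores * 2), 40) + " threads");
    }
}
```

With the bounded pool, a long request ties up one of the few threads while the others wait in the queue, which is the loss of concurrency described above. The unbounded pool never makes a request wait, at the cost of one thread per in-flight request.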

Sunday, April 04, 2010

Slow Gmail on iPhone

Lately Gmail on my iPhone has been really slow. (This is the web version, not the mail app.) The screen pops up right away, but it takes a long time to finish checking for new mail - maybe 30 seconds.

I don't have a lot of messages in my inbox, often none. And it happens with a good wifi connection.

I didn't really think too much about it, but I let my sister check her gmail and it was much faster. Very strange.

I searched on the internet but didn't find too much. Most of what I found was for the mail app.

So I just did the usual "kick the machine" - I cleared the history, cache, and cookies. That seemed to do the trick - it was fast again (less than 5 seconds).

So I wrote this blog post and then went back and tried it again and it was slow!!!

Next, I tried deleting the databases for gmail. Still slow.

Try again, delete the databases and this time also clear the history, caches, and cookies. Log back in and it's fast again. And it seems to be staying fast this time, although I'm not totally confident that it'll stay that way.

If you're having the same problem and want to try this, go to Settings > Safari, scroll down to the bottom. Click on Databases, then Edit, then delete any gmail entries. I always seem to have two entries which appear identical. Then go back to the Safari settings and click on Clear History, Clear Cookies, and Clear Cache (one at a time).

NOTE: Clearing cookies will log you out of sites, so don't do this if you don't have your passwords handy to log back in.

Saturday, April 03, 2010

A Welcome iTunes Feature for the iPhone

For smaller iPods like the Shuffle, iTunes has for a long time had an option to convert audio to lower quality to fit more on the device. But for some reason this feature was not available on the iPhone until recently.

As I accumulate more music, and at higher quality, and therefore larger files, my iPhone has been getting full, despite being a 32gb model. I have a few apps that take up more space, like Wikipedia and CoPilot Live but the majority of the space is audio.

Recently I noticed that this option had appeared for my iPhone. I think it probably came in the last update, but it may have been there for a while and I just missed it. It freed up over 10gb of space on my phone! Very nice. Audiophiles will no doubt disapprove, but in the places where I listen to music on my iPhone, I doubt I'll notice the difference.

NOTE: Because this requires updating all the music files on your phone, the next sync after setting this will take a while. Don't do it when you're short of time.

Friday, April 02, 2010

Geek Envy

I was sitting reading my Kindle over lunch the other day at Innovation Place. Someone walking by did a double take and asked "Is that an iPad?!"  I said no, it was "just" a Kindle. He was relieved and explained, "My boss is getting the first iPad in Saskatoon and he'd be really upset if someone else got one first." I don't think he saw me shake my head and roll my eyes as he walked away.