Thursday, November 25, 2010

Social Search?

"search is getting more social every day and tomorrow's recommendations from people you know via Facebook are infinitely more valuable than search results from yesterday's algorithm"
- Publishing needs a social strategy - O'Reilly Radar

Really? What kinds of searches are we talking about? When I search for some technical question or someone searches for e.g. a solution to an aquarium problem, is Facebook really going to help? Personally, I think I'd rather have "yesterday's algorithm".

Sure, if I'm looking for something like a restaurant recommendation then I'd be interested in what my friends have to say. But unless you have a huge, well-travelled circle of friends, how likely is it that they'll have recommendations for some random city you're in? And if the recommendations aren't coming from friends, then we're back to regular search.

This "everything is social" craze drives me crazy. Believe it or not, Facebook is not the ultimate answer to every problem.

Tuesday, November 23, 2010

Goodbye Google App Engine

Definitely a different picture than Google paints.

I would think twice about using Google App Engine after reading this.

Maybe we'll just stick with Amazon EC2.

Sunday, November 21, 2010

Optimizing jSuneido Argument Passing

Up till now jSuneido has always passed arguments as an Object[] array, similar to a Java function defined as (Object... args).

Suneido has features (e.g. named arguments) that sometimes require this flexible argument passing. But most of the time it could use Java's standard argument passing.

I've wanted to optimize this for a while, and it was one of the motivations for the compiler overhaul (see part 1 and part 2).

Having finished adding client mode, I sat down to start working on optimizing argument passing. When I started to think about what was required, it began to seem like a fair bit of work.

I decided that first I should determine if the change was worthwhile. (In the back of my mind I was thinking it probably wouldn't be that much better and I wouldn't have to implement it.)

First I measured what percentage of calls were simple enough to optimize. I was surprised to see it was about 90%. But it makes sense - most calls are simple.

Next I measured what kind of improvement the optimization would give. For a function that didn't do much work, I got about a 30% speedup. (Of course, for functions that did a lot of work the argument passing overhead would be negligible and there would be little or no speedup.)

Those figures were just from quick and dirty tests, but I was only looking for rough order-of-magnitude results.

The changes avoid allocating and filling an argument array and also avoid the argument "massaging" required for more complex cases. The calls are simpler and use less byte code. This means they have a better chance of being inlined by the JVM JIT byte code compiler. The calls are also more similar to those produced by compiling Java code and therefore have a better chance of being recognized and optimized by the JIT compiler.
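To make the difference concrete, here is a minimal Java sketch of the two calling styles. The names (ArgPassingSketch, Callable, call2, Add) are made up for illustration and this is not the actual jSuneido code - it only shows the general idea of a fixed-arity entry point sitting next to the array-based one.

```java
// Hypothetical names throughout; an illustration, not jSuneido's code.
class ArgPassingSketch {

    // A callable value. The general entry point takes an argument array.
    // The fixed-arity entry point defaults to packing into an array.
    static abstract class Callable {
        abstract Object call(Object... args);

        Object call2(Object a, Object b) {
            return call(a, b); // fallback: allocates and fills an Object[]
        }
    }

    // A built-in that overrides the two-argument path, so a simple call
    // never allocates or fills an array.
    static final class Add extends Callable {
        @Override
        Object call(Object... args) {
            if (args.length != 2)
                throw new IllegalArgumentException("expected 2 arguments, got " + args.length);
            return call2(args[0], args[1]);
        }

        @Override
        Object call2(Object a, Object b) {
            return (Integer) a + (Integer) b;
        }
    }

    public static void main(String[] args) {
        Callable add = new Add();
        System.out.println(add.call(1, 2));  // general path: builds an Object[]
        System.out.println(add.call2(1, 2)); // direct path: no array
    }
}
```

Because the base class supplies a fallback for call2, callables can be converted one at a time, which is the kind of gradual transition that keeps a change like this manageable.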

To me, these numbers justified spending a few days making the changes. Thankfully, I was able to implement it in a way that allowed me to transition gradually. So far I have optimized the argument passing for calling built-in functions. I still have some work to do to handle other kinds of calls, but the general approach seems sound. I don't think it'll be quite as bad as I thought.

You can see the code on SourceForge.

One Bump in an Otherwise Smooth iMac Upgrade

I recently upgraded my 24" Core 2 Duo iMac to a 27" i7 iMac.

My usual rule of thumb is not to upgrade till the new machine will be twice as fast as the old one. But that rule is getting harder to judge. For a single thread, the i7 is not twice as fast. But with 8 threads (4 cores plus hyper-threading) it definitely has the potential to run much more than twice as fast. The iPhone MacTracker app shows the new machine with a benchmark of 9292 versus the old machine at 3995. I'm not sure what benchmark that's using.

I've been thinking about upgrading for a while but as winter arrives and I spend more time on the computer, I figured it was time. Another motivation for the upgrade was to be able to test the multi-threading in jSuneido better.

I really wanted to get the SSD drive to see how that would affect performance. But that option is still very expensive. Especially since I'd still need a big hard drive as well for all my photographs. So I didn't go for it this time. I'm not sure how big a difference it would have made for me. My understanding is that it mostly speeds up boot and application start times. But I generally boot only once a day and tend to start apps and then use them for a long time. It would be interesting to see how Suneido's database performed on an SSD.

I did upgrade from 4gb of memory to 8gb. Most of the time 4gb is fine, but when I run two copies of Eclipse (Java and C++), Windows under Parallels, Chrome with a bunch of tabs, etc. then things start to get noticeably sluggish. With the new machine things don't seem to slow down at all. And I can allocate more memory and cores to my Windows virtual machine. I also upgraded from a 1tb hard disk to 2tb. I hadn't filled up the 1tb, but I figure you can never have too much hard disk space and the cost difference wasn't that big.

Migrating from one machine to the other went amazingly smoothly and quickly - the easiest migration I've done. With Windows machines I never try to migrate any settings or applications since you need to clean everything out periodically anyway. I'm sure even with OS X there is a certain amount of "junk" accumulating (e.g. old settings) but it doesn't seem to cause any problems.

I used OS X's migration tool but I wasn't sure which method to use to connect the machines - Firewire, direct network connection, network via LAN, or via Time Machine backup. In the end I went with migrating from the Time Machine backup, partly because it didn't tie up my old machine and I could keep working.

Some estimates from the web made me think it might take 10 or 20 hours to migrate roughly 600gb, but it was closer to 2 hours - nice.

The one speed bump in this process was my working copy of jSuneido. I keep this Eclipse workspace in my Dropbox so I can access it from work (or wherever). Because I migrated from a Time Machine backup, my new workspace was a few hours out of date. Dropbox on the new machine copied these old files over the newer ones. Then Dropbox on the old machine copied the old files over its newer ones. So now both copies were messed up. No big deal - Dropbox keeps old versions, I'd just recover those. Except I couldn't figure out any easy way to recover the correct set of files without a lot of manual work. (I could only see how to restore one manually selected file at a time, with no way to easily locate the correct set of files that needed to be restored.) No problem - I'd get the files back from version control. Except for some reason I couldn't connect to version control anymore. Somehow the unfortunate Dropbox syncing had messed up something to do with the SSL keys. Except the keys were still there since I could check out a new copy from version control. Eventually, after a certain amount of thrashing and flailing I got a functional workspace. I still ended up losing about 2 hours of work, but thankfully it was debugging work driven by failing tests and it didn't take long to figure out / remember the changes.

Although the new 27" display is only about 10% bigger than the old 24", the resolution has increased from 1920 x 1200 to 2560 x 1440 - almost a third bigger, and quite a noticeable difference. But because of the higher DPI resolution, everything got smaller. As my eyes get older, smaller text is not a good thing!
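The "everything got smaller" effect can be put in numbers. This is a rough sketch, assuming the nominal 24" and 27" figures are the true panel diagonals. The class and method names are just for illustration.

```java
class DisplayDensity {
    // Pixels per inch, measured along the diagonal.
    static double ppi(int widthPx, int heightPx, double diagonalInches) {
        return Math.hypot(widthPx, heightPx) / diagonalInches;
    }

    public static void main(String[] args) {
        double oldPpi = ppi(1920, 1200, 24.0); // old 24" iMac
        double newPpi = ppi(2560, 1440, 27.0); // new 27" iMac
        System.out.printf("old: %.1f ppi, new: %.1f ppi%n", oldPpi, newPpi);
        // Anything drawn at a fixed pixel size shrinks by this ratio.
        System.out.printf("relative size: %.0f%%%n", 100 * oldPpi / newPpi);
    }
}
```

That works out to roughly 94 pixels per inch on the old display and 109 on the new one, so text drawn at a fixed pixel size ends up at about 87% of its old physical size.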

After all these years, with all the changes in display sizes and resolutions, you'd think we'd have better ways to adjust font sizes. Most people simply resort to overriding the display resolution to make things bigger, but that's a really ugly solution. But I can see why they do it. There's no easy way in OS X to globally adjust font sizes. You can only tweak them in a few specific places. Windows is actually a little better in this regard, but still not great. And even if you manage to change the OS, you still run into applications that disregard global settings.

And history continues to repeat itself. iPhone apps were all designed for a specific fixed pixel size and resolution. Then the iPad comes along and the "solution" is an ugly pixel doubling. Then the higher resolution retina display arrives and causes more confusion. When will we learn!

Even Eclipse, that lets you tweak every setting under the sun, has no way to adjust the font size in secondary views like the Navigator or Outline. This has been a well known problem in Eclipse since at least 2002 (e.g. this bug or this one) but they still haven't done anything about it. I'm sure it's a tricky problem but how hard can it be? Is it really harder than all the other tricky problems they solve? Surely there's something they could do. Instead they seem to be more interested in either denying there's a problem, arguing about which group is responsible for it, or reiterating all the reasons why it's awkward to fix.

Of course, it's open source, so the final defense is always - fix it yourself. Sure, I'll dive into millions of lines of code and find the exact spot and way to tweak it. I think it might be just a little easier for the developers that spend all their time in there to do it. I won't ask them to fix the bugs in my open source code, if they don't ask me to fix the bugs in their open source code.

On the positive side, Eclipse's flexibility with arranging all the panes virtually any way you want lets me take advantage of the extra space. It's a little funny though, because even with a big font, the actual source code pane is only about 1/3 of the screen. It's nice to have so many other panes visible all the time, but there are times when it would be nice to hide (or "dim") all the other stuff so I could focus on the code.

All in all, I'm pretty happy with the upgrade.

Note: No computers were killed in this story :-)  Shelley is taking over my old iMac and her nephew is taking over her old Windows machine.

Friday, November 19, 2010

Giving Up on Amanda

For the last few months we've been trying to implement Amanda for backups in our office.

The two main choices for open source backups seem to be Amanda and Bacula. Amanda was supposed to be easier to set up than Bacula so that's what we chose. (There are also commercial options but they tend to be expensive and even less flexible.)

Unfortunately, it hasn't gone smoothly. Every time we think we have it working it starts failing semi-randomly. Certain workstations will fail some nights with cryptic errors, then work other nights without us changing anything.

We've had some support from the Amanda community. At one point they suggested running a new beta version which appeared to solve some problems, but not all of them.

To add to the problems, when we upgraded Linux, Amanda broke. That's certainly not unique to Amanda, but it's yet another hassle.

I'm sure Amanda works well for a lot of people. Presumably there's something different in our server or network or workstations that leads to the problems. But that doesn't help us. We have a medium-size network of about 60 machines - not small, but not especially big either. We're not Linux experts, but we're not total newbies either.

I'm sure we could solve the current problems eventually, but I've lost confidence. It just seems like we'd have to expect more problems in the future. And it's complex enough that if the person that set it up wasn't here, we'd be lost.

For some things this might be acceptable, but for backups I want something that "just works", that I can count on to be reliable and trouble free. For us, Amanda just doesn't appear to be the answer.

For the last ten years or so we've been using a home-brew backup system. It's simple, but that's a good property. It has reliably done the job. And when we needed to adjust it we could. And it was simple enough that even unfamiliar people could dive in and grasp what it was doing and figure out how to change it or fix it.

The reason we tried to move to Amanda is that we wanted to improve our system. Currently we rely on the server "pulling" backups via open shares. But for security we want to get rid of the open shares which means the workstations have to "push" backups. At the same time, we wanted to start encrypting backups on the workstations. In theory Amanda will do what we want.
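As a rough illustration of the "encrypt on the workstation" half of this, here is a Java sketch using the standard javax.crypto API with AES-GCM. Everything about it is an assumption for the sake of example: the class name, the choice of cipher, and the message layout (IV followed by ciphertext) are not taken from Amanda or from our system, and key distribution and the actual push to the server are left out.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical sketch: data is sealed before it leaves the workstation,
// so the server only ever stores ciphertext.
class EncryptBeforePush {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Workstation side: returns a fresh random IV followed by the ciphertext.
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] sealed = cipher.doFinal(plain);
            return ByteBuffer.allocate(iv.length + sealed.length).put(iv).put(sealed).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Restore side: reads the IV from the front, then decrypts and verifies.
    static byte[] decrypt(SecretKey key, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
            return cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this split, the server can store and hand back whatever the workstations push without ever holding a key or needing access to a share on the workstation.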

I finally decided to pull the plug on trying to use Amanda. And as crazy as it might sound, we're going to try building a new home-brew system. I don't think it'll take us any more time than what we spent on trying to use Amanda.

You might think backups are too critical to trust to a home-brew system. But I'm more willing to trust a simple transparent solution that I understand, rather than someone else's complex black box. (Technically Amanda's not a black box since it's open source, but practically speaking we're not likely to spend the time to figure out how it works.)

And of course, we'll use Suneido to implement it. Suneido actually fits the requirements quite well - we can use it for a central server database, and run a client on the workstations. It's small and easy to deploy, and of course, we're very familiar with it. We'll see how it goes.

RethinkDB

I just listened to a podcast of a talk RethinkDB gave at the MySQL conference about better database storage engines. Apart from being an interesting talk, a lot of what they were talking about parallels my own ideas in Suneido.

For example, they talk about log structured append-only data storage. Suneido's database has always worked this way.
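For anyone not familiar with the idea, here is a toy Java sketch of append-only storage. It is not Suneido's database code (or RethinkDB's) - the class and method names are invented, an in-memory list stands in for the data file, and the index is an ordinary map that is updated in place.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model: records are only ever appended, never overwritten.
class AppendOnlyStore {
    private final List<byte[]> log = new ArrayList<>();         // stands in for the data file
    private final Map<String, Integer> index = new HashMap<>(); // key -> position of latest version

    // An update appends a new version and repoints the index at it.
    int put(String key, String value) {
        log.add(value.getBytes(StandardCharsets.UTF_8));
        int position = log.size() - 1;
        index.put(key, position);
        return position;
    }

    // Returns the latest version, or null if the key was never written.
    String get(String key) {
        Integer position = index.get(key);
        return position == null ? null : readAt(position);
    }

    // Old versions stay readable at their original position.
    String readAt(int position) {
        return new String(log.get(position), StandardCharsets.UTF_8);
    }

    int logSize() {
        return log.size();
    }
}
```

Since existing records never change, a reader holding an old position always sees consistent data. The index is the part that still gets modified in place in this sketch.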

Next they talk about append-only indexes. Suneido does not have this, but it is something I've been thinking about (see my post A Faster, Cleaner Database Design for Suneido). They have a different idea for reducing index writes. It's an interesting solution, but more complex. It won't be as fast as delaying writes, but it would allow crash recovery without rebuilding indexes (as Suneido requires).

I mostly arrived at these design ideas from basic principles e.g. immutable structures are better for concurrency. But it sounds like the file system folks have been working on a lot of the same ideas. It's hard to keep up with everything that might possibly be relevant.

The other interesting part of this talk was the idea that there are a lot of pieces that have to work together and performance depends on the combination. This means there is a huge possibility space that is hard to explore.

Wednesday, November 17, 2010

Java, Oracle, GUI, and jSuneido

A Suneido developer had a few questions that he thought I should blog about, so here goes. (Disclaimer: I'm not an expert on Oracle or Java and don't have any inside knowledge.)

What will happen with Java with Oracle buying Sun? Do I regret choosing Java to rewrite Suneido?

Oracle buying Sun does make me a little nervous, but I don't think Java is going to go away - there is too much of it out there.

I don't regret choosing Java to rewrite Suneido. There aren't a lot of mainstream alternatives - .Net would be a possibility, with Mono on Linux. I think .Net is a good platform, but I like being tied to Microsoft even less than I like being tied to Oracle. And I think Java is less tied to Oracle than .Net is tied to Microsoft.

There are, of course, other alternatives like LLVM or Parrot, but they don't have the same kind of support behind them.

I've heard persuasive arguments (albeit mostly from Oracle) that it is to Oracle's benefit to keep Java alive and "open" since much of Oracle's software is written in Java. They might try to charge for stuff but probably at the enterprise end, which doesn't bother me too much.

I do wish Java moved a little faster. Java 7 is taking forever, and now a lot of it has been postponed to Java 8. Meanwhile, Microsoft has moved surprisingly quickly with advancing .Net. On the positive side, the new features in Java 7 for dynamic languages (JSR 292) will be very nice. And I do want stability, so I can't complain too much.
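For a taste of what JSR 292 adds, here is a small sketch using the java.lang.invoke method handle API that Java 7 provides. The class and method names are made up, and this is not code from jSuneido (which runs on Java 6) - it just shows a method being looked up by name and type at run time and then called through a handle.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class MethodHandleSketch {
    // Looks up String.length() by name and type, then calls it via the handle.
    static int lengthViaHandle(String s) {
        try {
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            // invokeExact requires the call site types to match exactly: (String)int
            return (int) length.invokeExact(s);
        } catch (Throwable e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthViaHandle("jSuneido"));
    }
}
```

A language implementation would typically keep handles like this around and reuse them, rather than looking one up on every call.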

I don't think the Oracle buyout has any effect on jSuneido in the short term. I'm using Java 6 which is readily available.

What platform are you using to develop jSuneido?

I do most of my development on Mac OS X using Eclipse. I run the Windows cSuneido client using Parallels. I'm pretty happy with this. The only minor hassle was that Apple was really slow to release new versions of Java. And Sun/Oracle don't directly release OS X versions. You could get new OS X versions of Java from other places but it was an extra hassle. Now Apple has announced that they won't be distributing Java any more. This isn't a big deal - Microsoft doesn't distribute Java either. It would be nice if Oracle would add OS X as one of their supported platforms. (Not just on OpenJDK.)

But this is only the development environment. In terms of deploying the jSuneido server I expect it will be mostly Windows and Linux. There aren't many people using OS X for servers. And Apple just discontinued their rack mount server.

I also do some development on my Windows machine at work. There are slight differences, but mostly I can use the identical Eclipse setup.

I haven't done any testing on Linux. I would hope it would be fine but there could be minor issues. If we were only working on Windows then there might be more, but Linux shouldn't be too different from OS X.

We are getting close to switching our in-house accounting/CRM system over to jSuneido. This will be a good test since we have about 40 users. Currently we have a Windows server to run this system and a Linux server for other things. Once we are running on jSuneido we are hoping to get rid of the Windows server and just use the Linux one. So Linux support is definitely coming.

Where can I get a copy of jSuneido?

Currently, the only way to get it is as source code from Mercurial on SourceForge:

http://suneido.hg.sourceforge.net:8000/hgroot/suneido/jsuneido

I haven't started posting pre-built jar file releases yet, but probably soon. If anyone is interested in getting a copy to experiment with, just let me know and I can send you one.

What about a GUI for jSuneido?

Currently, jSuneido does not have any GUI. cSuneido's GUI is Windows based so it's not portable.

In the long run it would be nice to add a GUI to jSuneido. Then, eventually, I could stop supporting cSuneido.  Maintaining two parallel versions is a lot of extra work.

The conventional Java approaches would be Swing or SWT.

Another idea would be to try to use the newer GUI system from JavaFX, but there's not much support for using it outside of JavaFX yet.

Another possibility would be to switch to a web based GUI, even for local use. That's an intriguing idea that would be fun to investigate. The downside is that instead of just Suneido code, you'd have HTML and CSS and JavaScript and AJAX. Not exactly the self-contained model that Suneido has had up till now.

The bigger issue for us is that we have a lot of code based on the old GUI system. So a priority for us would be to minimize the porting effort.

Sunday, November 14, 2010

Upgrading Eclipse to Helios (3.6)

I recently upgraded my development environment for jSuneido to Helios (version 3.6) from Galileo (3.5).

Helios has been out since June but I needed to wait for the plugins I use to be updated. Actually, this was one of the things that nudged me to update since one of my plugins started giving errors on Galileo after they updated it for Helios.

It went quite smoothly. The plugins I use are:
  • Mercurial Eclipse
  • Bytecode Outline
  • EclEmma Java Code Coverage
  • FindBugs
  • Metrics (State of Flow)

A new version of Eclipse used to mean a bunch of great new stuff. But like most software products, it's matured and development has slowed down, at least in terms of major new features. In normal usage I didn't notice much difference.

One welcome addition is the Eclipse Marketplace (on the Help menu with the other update functions). EclEmma, Bytecode Outline, Mercurial Eclipse, and FindBugs can all be installed through the marketplace, which is a lot nicer since you don't have to go to their web site, find the URL of the update site, copy it, and then paste it into Eclipse. The other plugins show up in the marketplace, but don't have an install button. I'm not sure why, but it's a new feature so you have to expect some hiccups.

A minor complaint is that the marketplace is implemented as a wizard, even though it isn't really a multi-step process. Wizards can be a reasonable approach, but I think they're overused sometimes.

Tuesday, November 09, 2010

Email Overload

And you thought you had too much email :-)


This was on the latest Thunderbird. Not sure what I did to trigger it - looks like some kind of overflow.