Thursday, September 10, 2009

Apple Updates

It's been a busy couple of days for me with Apple updates.

First, Apple released Snow Leopard (the new version of OS X) ahead of schedule, while I was on holidays. The day after I got back was a holiday, but I headed for London Drugs, which I knew would still be open. Unfortunately, they were sold out. The next day I tried a few more places, including Neural Net (our local Apple-oriented computer store). They were all sold out!

Obviously the demand for Snow Leopard was higher than expected, even though there are no really big new features.

In a way the delay turned out to have a positive side. While I was waiting I decided I might as well get the "Boxed Set" which includes the latest iLife and iWork. I'd been thinking about buying iWork anyway and my Mini had an old version of iLife, so it seemed like a good deal. Even better, Neural Net had the Boxed Set Family Pack in stock :-)

Although there were some people recommending waiting to upgrade in case of problems, most people seemed to say it was ok. I updated my MacBook first and when that went smoothly, went ahead and updated my iMac and Mini. So far I haven't had any problems, but I haven't done too much.

OS X finally includes Java 6 :-) so I wondered if there'd be any glitches with Eclipse and jSuneido, but so far so good.

Coincidentally, iTunes 9 was released yesterday so I updated that on all my machines. iTunes finally has a Wish List :-) I always wondered why they didn't have this. Was it because they wanted people to buy right away? But then why would Amazon have a wish list?

The wish list is somewhat hidden. To add items you use the pull down menu attached to the Buy button. The annoying part about this design choice is that items that only have a Rent button (certain movies) can't be added to your wish list. To actually view the wish list, the only link I could find was at the very bottom of the iTunes home page under "Manage". The help describes a different location - under Quick Links on the right hand side - which seems like a better location. It's almost as if they still aren't sure about the feature so they're making it somewhat hidden.

Another major new feature in iTunes 9 is "Home Sharing" which lets you move your media between your different home computers. This should help me keep my living room Mini's music library up to date with purchases (which I mostly make on my main iMac).

You can only use Home Sharing between computers you have "authorized" for your iTunes account. (You're allowed to authorize up to 5 computers.) Originally authorization was for DRM-protected music. Since I refused to buy any DRM-protected music I never had to worry about authorization. Now I do. I find I have authorized 4 out of my allowance of 5 computers. At least one of those was a machine I no longer own (my old MacBook). I don't think there's any way to un-authorize a machine after the fact (you have to remember to do it before you get rid of the machine or reinstall the OS). As far as I know, the only solution Apple offers is that you can un-authorize all your machines, allowing you to re-authorize the ones you want. (But you can only do this once a year.)

After some searching I found the setting to automatically copy music purchased on other machines. I turned it on and waited. I knew I had purchased music since I last synced my library. Nothing seemed to be happening. I made sure iTunes was running on both machines. I left it for an hour in case it was a slow background task. Nope. I'm guessing that it only works for music you purchase after turning on this option. I guess it all depends how you interpret "automatic". No big deal, it was easy enough to view my iMac library, sort by date added, shift-select all the new stuff and drag it over. I'll have to wait till I purchase some new music to see if it actually syncs automatically then.

On top of all this, Apple released iPhone OS 3.1. I installed it, but there doesn't appear to be anything too exciting in it.

The other big announcement from Apple yesterday was the new version of the iPod Nano with video camera, microphone, speaker, FM radio, and pedometer (!?). I was surprised that the camera was video only, but according to comments by Steve Jobs, this was due to size/space limitations. The FM radio even has a "pause" feature like DVR's. It was nice to see Steve back up on the stage after all his health problems.

The iPod Touch (I keep wanting to call it the iTouch) is now being targeted as a game machine. I would never have predicted that, but then again, I very rarely play games so I tend to forget what a big market it is.

Monday, August 17, 2009

iPod Shuffle Won't Shuffle

I had an older model iPod Shuffle that I used for running. It started to get flaky and eventually died totally.

So I bought a new model - smaller, sleeker, and with more memory.

But ... I listen to podcasts when I'm running, not music, and the new iPod Shuffle won't shuffle podcasts.

Even if it would sort by date I could live with it. But it sorts by source, and I don't want to listen to all the podcasts from one source all in a row.

And although I like the controls on the headphone wire, you have to double click to skip tracks and it is a frustratingly slow process to skip all the podcasts from one source just to get to the next source. Good luck if you want to find a particular podcast.

I started doing some research and I read that you could skip between playlists. Ok, I'll put each podcast source in a playlist and then I can skip through them. Except you can only put music in playlists, not podcasts, despite the fact that they're all just mp3 files.

Ok, I'll just move my podcasts over into my music section so I can put them in playlists. Except you can't. For some reason, iTunes goes to great lengths to prevent this. Even if you remove the file from iTunes and then try to import into the music section, it's too "smart" and puts them back in the podcast section. There are various work-arounds but I don't want to have to do this every time I get new podcasts.

Why stop you from shuffling podcasts? Sure, not everyone will want to shuffle, but that's no different than music. After all, the one and only control on the body of this iPod is whether to shuffle or not!

Why stop you from putting podcasts into playlists? Again, I can't think of any reason for blocking this.

It's probably a similar issue to the K7 flaw - going overboard in trying to keep people on the correct path, refusing to accept that your (the designer's) idea of the "correct" path isn't necessarily the same as your users'.

Judging from all the stuff on the web about this, it obviously annoys a lot of people. Come on Apple - listen to your users!

Friday, August 14, 2009

Pentax K7 Flaw (IMO)

I just traded in my Pentax K10D camera for the new Pentax K7. Overall I'm pretty happy with the upgrade, but there's one thing that really annoys me.

Both the K10D and the K7 have a "Green" mode where everything is automatic and many settings are ignored (forced to safe, default settings).

But in the K7, Green mode now forces saving the images as JPEG - RAW is not allowed.

I shoot in RAW 100% of the time - it gives me a lot more control over the final images, and using Lightroom (or Picasa) it's just as easy to handle RAW as JPEG - there are no extra steps or things to deal with.

This means I can't use Green mode on the K7. It's not the end of the world because "Program" mode can also be used fully automatically. You just have to remember to put settings back to "normal" after changing them. On the K10D I'd use Program to do something different, but I could just flip back to Green mode without worrying about what settings I'd changed. I'd only have to worry about it when I went to Program mode. Now, staying in Program mode, I'll have to be more careful.

I'm sure Pentax had reasons for doing this, but I think they made the wrong decision. Beginners who can't deal with RAW are going to leave their camera set to JPEG. Anyone who is advanced enough to change their settings to RAW presumably did it deliberately (like me) and doesn't want it overridden by Green mode. Besides, given the cost of this camera, the market is not beginners anyway.

It's a fine line between "protecting" users from shooting themselves in the foot, and being over-protective and stopping them from doing valid things. This time I think they went over the line.

Tuesday, August 11, 2009

Tethering iPhone to MacBook

It was so nice out today, after a less than stellar summer so far, that I decided to take my laptop and go sit outside somewhere for coffee. The spot I picked (Pacific Gallery & Cafe) doesn't have wireless so it seemed like a good time to figure out how to tether my MacBook (13" unibody) to my iPhone (3Gs) for internet access.

It didn't turn out to be so easy. First you have to enable bluetooth on both devices (I hadn't brought a cable or that might have been an easier approach). Then you pair the devices. This went ok other than a little searching to find the appropriate settings.

But after pairing successfully, you're still not connected. Pulling down the bluetooth menu from the menu bar showed the iPhone but Connect to Network was grayed out (disabled). My network preferences showed a Bluetooth PAN (Personal Area Network) but it said the "cable" (!?) was disconnected. Not very helpful. In the Bluetooth preferences the tools menu (the "gear" at the bottom) had a Connect to Network that wasn't grayed out, but it also didn't seem to do anything.

If I picked the MacBook on the iPhone the Bluetooth preferences on the MacBook would switch to connected and then immediately switch back to unconnected.

Of course, I googled for the problem. A lot of the results were about how to get around AT&T not allowing tethering. But I was on Rogers (in Canada) and they supposedly do allow tethering.

Apart from the AT&T results, there seemed to be quite a few people with similar problems, but no real consensus on a solution. Some people claimed if you simultaneously hit connect on both the MacBook and the iPhone then it would work. It didn't for me. Some people suggested removing the bluetooth devices from both the MacBook and the iPhone and re-pairing. That didn't seem to help either.

Finally, one person said to restart the MacBook. That worked! I had to laugh because when people ask me about computer problems one of the first things I always suggest is to restart. But I don't expect to have to do that on Mac OS X.

The sad part is that even after I got it working it was too slow to be usable. I couldn't even bring up Gmail because it would time out. Pinging the name server was giving a response time of 4 seconds (4000 ms)! The iPhone was showing 5 bars and a 3G connection, but obviously I wasn't getting a good connection. Browsing on the iPhone was also very slow so it wasn't just the tethering.

I'll have to try it again when I've got a better 3G connection. I'm not sure if it's going to work easily in the future or not. Some of the people reporting problems had it working for a while and then it quit so I'm not totally optimistic. Maybe using a cable will be simpler. I wonder if the dedicated USB cell "modems" work better. (I would hope I'd be able to use my existing data plan?)

Friday, August 07, 2009

New Camera with Projector

Nikon | Imaging Products | COOLPIX S1000pj

I'm not sure it's something I'd use a lot, but having a projector built in to a camera is a cool feature.

Reading the fine print, I see the projector is only VGA resolution, which is not too impressive.

If you were using the camera to record your whiteboard, it might be handy to be able to redisplay it with the projector.

Thursday, August 06, 2009

Anatomy of a feature

inessential.com: Anatomy of a feature

An interesting description of all the little details behind even the simplest feature - details most non-programmers have no idea about.

via

iPhone Competition

The Zii Egg looks like a pretty cool gadget.
  • touch screen
  • GPS
  • VGA camera for video conferencing
  • HD camera
  • WiFi and Bluetooth
  • SD card slot
  • HD video output
  • runs open source Google Android OS
So far it's not a phone, but that's probably coming.

via

Saturday, July 11, 2009

Postino

Postino for iPhone - send real postcards with your photos :: AnguriaLab

One of the great things about the iPhone (and iPod Touch) is the diversity of apps. This is a cool one I just encountered. (via)

Thursday, July 09, 2009

Java Regular Expression Issue

I'm still grinding away on getting all the standard library tests to succeed on jSuneido.

I just ran into a problem because "^\s*$" doesn't match an empty string!?

Nor does "^$".

Nor does "^" (although just "$" does).

I find if I don't enable multi-line mode, then all of those match, as I'd expect.

Pattern.compile("^").matcher("").find() => true

Pattern.compile("^", MULTILINE).matcher("").find() => false


But I need multi-line mode to make it work the same as cSuneido.
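
Here's a standalone check, in case anyone wants to reproduce the comparison on their own JDK:

import java.util.regex.Pattern;

public class EmptyMatchCheck {
    public static void main(String[] args) {
        String[] patterns = { "^\\s*$", "^$", "^", "$" };
        for (String p : patterns) {
            // try each pattern against the empty string, with and without MULTILINE
            boolean plain = Pattern.compile(p).matcher("").find();
            boolean multi = Pattern.compile(p, Pattern.MULTILINE).matcher("").find();
            System.out.println(p + "  default: " + plain + "  MULTILINE: " + multi);
        }
    }
}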

I've tried to find anything in the documentation or on the web to explain this, but haven't had any luck. It doesn't make much sense to me. The documentation says:
By default, the regular expressions ^ and $ ignore line terminators and only match at the beginning and the end, respectively, of the entire input sequence. If MULTILINE mode is activated then ^ matches at the beginning of input and after any line terminator except at the end of input. When in MULTILINE mode $ matches just before a line terminator or the end of the input sequence.
The only thing I can think of is that it's applying "except at the end of input" even when there is no line terminator. I guess it depends whether you parse it as

(matches at the beginning of input) and (after any line terminator except at the end of input)

or

(matches at the beginning of input and after any line terminator) except at the end of input

To me, the first makes more sense, but it appears to be working like the second.

So far I've been able to handle the differences between Suneido regular expressions and Java regular expressions by translating and escaping the expressions. But that's tricky for this problem. I guess I could turn off multi-line mode if the string being matched doesn't have any newlines. Except I'm caching the compiled regular expressions so I'd have to cache two versions. And it also means an extra search of the string on every match. Yuck.
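
Just to make the workaround concrete, it would look something like this (a rough sketch with made-up names, not what jSuneido actually does):

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// sketch of the workaround: cache two compiled versions of each regex and
// only use MULTILINE when the string being matched contains a line terminator
class RegexCache {
    private final Map<String, Pattern> plain = new HashMap<String, Pattern>();
    private final Map<String, Pattern> multiline = new HashMap<String, Pattern>();

    Pattern patternFor(String regex, String subject) {
        // the extra scan of the subject on every match is the annoying part
        boolean hasNewline = subject.indexOf('\n') >= 0 || subject.indexOf('\r') >= 0;
        Map<String, Pattern> cache = hasNewline ? multiline : plain;
        Pattern pat = cache.get(regex);
        if (pat == null) {
            pat = Pattern.compile(regex, hasNewline ? Pattern.MULTILINE : 0);
            cache.put(regex, pat);
        }
        return pat;
    }
}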

Of course, my other option is to port cSuneido's regular expression code, rather than using Java's. Ugh.

Backwards compatibility is really a pain!

Friday, July 03, 2009

Firefox 3.5 Early Feedback

Firefox 3.5 came out the other day and of course I immediately upgraded. I had been tempted to try the release candidates, but I depend on my browser so much these days that I didn't risk it. Ideally, I probably should have waited until the new version had been out for a while before upgrading.

For the most part, I don't notice much difference. I don't doubt it's faster, but I haven't really noticed it. It seems most people don't notice when something acceptable gets faster; it's when something gets slower that you notice.

One minor annoyance is that when you only have one tab open (and you have the option turned on to still show tabs when there's only one), there is no longer a close button. Probably the thinking was that there is no point in closing the last tab. But I actually used that feature quite a lot when I wanted to keep my browser running but close e.g. Gmail so I wouldn't be distracted.

On the Mac it's not so bad because I can just close the browser window (on OS X this leaves the program running) and then click on the dock to open a fresh window when I need it. But on Windows, if you close the window, you exit the program.

There are workarounds available - obviously other people found this annoying too.

It's good for me to run into this problem from the user perspective. I tend to ignore our customers when they complain about minor "improvements" I've made. I have to try to remember how annoying it can be when you're used to working a certain way and it's taken away from you for no apparent reason.

I shouldn't have been surprised, but I was a little shocked when I ran 3.5 for the first time and it told me about all the add-ons that weren't supported. Luckily, none of them were show-stoppers for me or I would have had to figure out how to go back to the previous version.

It would be nice if the installer could tell you which add-ons were incompatible before it started the install process so you could cancel if necessary. Otherwise it would be a painful process to go through each of your add-ons and try to find out if it runs on the new version.

I guess another option would be to use a portable version of Firefox to test add-ons. But even then, you'd be faced with installing them one at a time since there's no way to sync add-ons yet (that I'm aware of). Maybe I need to look at something like Add-on Collections.

One of the add-ons I was surprised wasn't supported on 3.5 yet was Google Gears. Which means I've lost the off-line support in Google mail, calendar, docs, reader, etc. I assume they're working on it.

I've also switched back to Weave to sync Firefox between my multiple computers. I used it for a while before, but switched to Foxmarks because it seemed better. But Foxmarks has turned into Xmarks and doesn't seem to be focusing on synchronization. And Weave has improved a lot. (I originally used Google Browser Sync but that was discontinued.)

One annoyance with these kinds of sync tools is that the obvious time to check for changes is when you start the browser. But if you have a master password, then every time you start the browser it asks for your password, which is annoying and also not very good for security.

Thursday, July 02, 2009

Gmail Labels

Official Gmail Blog: Labels: drag and drop, hiding, and more

Finally Gmail is improving the label facility. I like the idea of tagging my emails, but previously it was quite awkward when you got too many labels. There was no way to hide labels for old projects or to make commonly used labels more accessible.

I'm guessing they imagined people would have a handful of labels, similar to the handful of built-in ones. But look at any tagging system, like Delicious or Flickr, and you'll see large numbers of different tags, not just a few.

There were workarounds like renaming labels to move them up or down the alphabetical list. Or add-ons like Gmail folders (which tended to break when Gmail made changes).

The drag and drop is nice, but to me the big improvement will be the ability to hide old labels and to normally only show the frequently used ones.

Google Update

Google Open Source Blog: Google Update, regularly scheduled

It has always seemed ridiculous that so many programs run their updater as a background process, even though they only have to run periodically (e.g. once a day or week). I realize they probably don't use a lot of cpu or memory and they're probably swapped out most of the time, but if nothing else starting them all slows down the boot process.

As a programmer, I can understand the desire to keep control, but these are updaters, not critical operations. The software itself can always check for updates as well.

It's nice to see at least Google switching to running their updater as a scheduled task.

Wednesday, July 01, 2009

3D Video Stabilization

Content-Preserving Warps for 3D Video Stabilization
via John Nack

I thought the image stabilization in iMovie '09 was cool, but it looks crude next to this stuff.

I wonder how long it'll be till we see this technology in video editing software, or even in cameras themselves.

Too bad you can't do this kind of software image stabilization for still images. But recovering from a fuzzy image is a lot tougher problem. Maybe if the camera shot multiple images (almost like a brief video) then you'd have enough information.

Tuesday, June 30, 2009

A Windows Feature I'd Like on the Mac

Both Windows and Mac OS X let you "minimize" windows to the task bar / dock.

Both let you bring a window back by clicking on the task bar / dock.

But on Windows you can click on the task bar icon a second time to minimize the window again. I've got in the habit of using this to take a quick look at a window and then hide it again. I keep trying to do that on the Mac but it doesn't work.

I can see one argument against this feature: people often get confused and double-click instead of single-clicking. If implemented naively, a double-click would show and then hide the window immediately, frustrating the user. But Windows solves this problem by treating a double-click the same as a single click.

If anyone knows a way to make this work on the Mac, leave me a comment and I'll owe you one.

One part of this that is nicer on the Mac is that "Hide" minimizes all of an application's windows, and clicking on the dock brings them all back, whereas on Windows it's one window at a time. I have a vague memory that Windows 7 might improve this.

Thursday, June 25, 2009

The iPhone Software Revolution

Coding Horror: The iPhone Software Revolution

Someone else who finally took the plunge and bought an iPhone.

And this rave review is from someone who isn't an Apple or Mac fan.

Monday, June 22, 2009

Apple Sells Over One Million iPhone 3GS Models

Apple Sells Over One Million iPhone 3GS Models in the first three days.

And one of those was me - I finally broke down and bought an iPhone, surprising some people, because although I love gadgets, I don't like cell phones. I could have bought an iPod Touch, but although I didn't really care about the phone, I wanted all the other features like 3G, GPS, compass, camera, etc. that don't come with the Touch. And I'm sure I'll end up using the phone occasionally now I have it.

I'm already loading up on iPhone apps. With over 50,000 available they're one of the best parts of the iPhone/Touch.

Friday, June 19, 2009

Ultra High Speed Photography

kurzzeit.com - Kameras

1 million frames per second - amazing!

(I couldn't get the videos to play in Firefox but Internet Explorer worked.)

Wednesday, June 17, 2009

Mac OS X Hangs from Lightroom

More often than I'd like lately, when I import photos into Lightroom (from an SD card in a USB reader) it hangs my whole Mac.

I can understand how Lightroom could crash, but I'm a little baffled that it manages to freeze the whole operating system. You get the spinning beachball and you can't do anything - can't switch apps, can't pull down menus, can't do Ctrl + Eject to shut down.

At first I thought it was because I would start to view photos while it was still downloading, so I quit doing that, but it's still happening.

The strange thing is that Lightroom is normally very stable. It doesn't crash or hang when I'm working in it, no matter what I do. I suspect this is more of an OS bug, or at least a bad interaction between the app and the OS.

This seems to have become a problem recently, perhaps related to either Lightroom updates, or OS X updates, or both. (That's one of the downsides of all these automatic updates.)

I wonder whether it has something to do with importing directly from the SD card through USB. Not that that is an excuse for the OS to die, but I could see where there would be some low level device stuff going on. Maybe I should copy the files to the Mac and then import from there. Although that's quite a bit more hassle since Lightroom auto-detects memory cards and goes straight to Import. However, I think you can set up Lightroom to "watch" a directory, so maybe I could do that and copy to that directory.

Friday, June 12, 2009

Continuous Obsolescence

I see my 13" MacBook has already been replaced by a new model.

It's been moved to the "Pro" label, gained its Firewire connector back, and now has an SD slot (which I'd like for downloading photos).

Apple seems to be coming out with new models faster than ever. The model I have was only out for 7 months before being replaced! Most software doesn't get upgraded that quickly, let alone hardware.

I like the rapid improvement, but I hate the resulting feeling of being left behind! Too bad we can't get automatic updates like software :-)

Edward Bear and Software

I just started reading Java Power Tools and the opening quote on the preface was this:

Here is Edward Bear coming downstairs now, bump, bump, bump, on the back of his head, behind Christopher Robin. It is, as far as he knows, the only way of coming downstairs, but sometimes he feels that there really is another way, if only he could stop bumping for a moment and think of it.

-- "We are introduced to Winnie-the-Pooh and some bees, and the stories begin,"
Winnie the Pooh, A. A. Milne

What a great quote for software development!

Sunday, June 07, 2009

Too Good to be True

I should have known that it was too good to be true that all the tests up to the N's were succeeding. I was a little suspicious, but who likes to question positive results.

What I wasn't remembering was that TestRunner reports errors at the end, not after each test. And I was never getting to the end because I'd get an unhandled exception (i.e. crash).

When I'd hit an unhandled exception I'd run that individual test by itself so once I fixed the exception I'd see the errors caught by TestRunner and I'd fix those, generally by implementing missing methods.

So the tests weren't succeeding up to the N's, they just weren't crashing. When I realized this, and specified that TestRunner should stop on the first failing test, I didn't even make it past the A's :-(

Oh well, there was nothing wrong with fixing the crashes first. I'm just nowhere near as far along in the process as I over-optimistically thought.

Back to slogging :-)

Saturday, June 06, 2009

A Sigh of Relief

The last while I've been working on getting the Suneido standard library tests to run on jSuneido.

Mostly this is a matter of implementing built-in functions and methods. Occasionally I find a bug in the existing jSuneido code, but thankfully that hasn't been too frequent.

It's actually been going quite well. The tests run alphabetically and I've got them succeeding all the way up to 'N'. Actually, I'm surprised that many tests succeed since I still have quite a few built-in methods to implement. 80-20 principle, I guess. (i.e. 80% of the tests only require 20% of the built-in functions.)

But one thing that's been nagging me during this process is that the tests report how long they take to run, and they've been running extremely slowly. I wasn't surprised they would run somewhat slower than cSuneido, since I'm not using the fast server JVM and I haven't done any optimization.

But jSuneido was on the order of 100 times slower. That was a little scary. If it was really going to be that much slower then I might as well give up now. But I figured there had to be a reason for it, probably something stupid.

As I expected, it was something stupid. Suneido loads and compiles code from libraries on demand. Once loaded, the compiled version stays in memory. I had this all working, but I missed one small but critical piece - I wasn't saving the compiled version. So it was re-loading and re-compiling on every reference. Yikes! It's not surprising it was slow.
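
The fix amounts to nothing more than remembering the result, something like this (simplified, with made-up names):

import java.util.HashMap;
import java.util.Map;

// simplified sketch of loading and compiling library code on demand
// and keeping the compiled version - the missing put() was the whole bug
class LibraryLoader {
    private final Map<String, Object> loaded = new HashMap<String, Object>();

    Object get(String name) {
        Object compiled = loaded.get(name);
        if (compiled == null) {
            compiled = compile(fetchSource(name)); // slow - load and compile
            loaded.put(name, compiled); // this is the step I was missing
        }
        return compiled;
    }

    private String fetchSource(String name) {
        return ""; // stand-in for reading the source from the library
    }

    private Object compile(String source) {
        return new Object(); // stand-in for compiling to Java bytecode
    }
}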

Now the tests actually seem to be running faster on jSuneido than cSuneido, although I'm not sure the resolution of the timing is accurate enough to tell. At this stage, as long as there's not a huge difference, I'm happy.

Friday, June 05, 2009

Google Squared

Here's my first Google "Square":

8000m peaks


My initial search for "8000m peaks" only came up with 7 of them. Clicking "Add next 10 items" at the bottom added 6 more, plus some climbers which I had to remove. For some reason it didn't find Cho Oyu. But it was the first suggestion when I clicked on "Add items".

It automatically came up with reasonable columns - Image, Description, First Ascent, Height, and Location.

I couldn't figure out any way to sort them, e.g. by height.

Wednesday, June 03, 2009

OnDemandBooks

OnDemandBooks

This is pretty cool. But I can't help thinking they're perfecting yesterday's technology. I guess that's typical - by the time we perfect some technology, we've moved on to something "better", albeit less perfected.

It's just like the big fancy machines in music stores that burn CDs on demand - pretty much made obsolete by mp3 players and buying music over the internet.

Now if only someone can open up digital book distribution the way Apple has opened up digital music distribution.

Amazon has digital music and books, but US only. Even Indigo's new digital book service is US only, despite being a Canadian company! I realize the US is a much bigger market, but I still can't help being annoyed by it!

Currently, most digital books are "proprietary", i.e. if you have a Sony ebook reader you have to buy your books from Sony, and if you have an Amazon Kindle you have to buy your books from Amazon. Even Apple, who are famous for their proprietary approach, let you play their music on devices made by other companies.

I'd really like to go digital with my books the way I have with my music, but we're not quite there yet.

Friday, May 29, 2009

YAGNI Strikes Again

This might also have been called "Premature Optimization Strikes Again".

For a given query, Suneido's query optimization chooses the "best" index to read each table by. But it also tries to determine if it is worthwhile to use additional indexes as well.

The problem is that using an additional index may not help, and may actually slow down the query.

Over the years, each time we ran into a case where it was doing more harm than good I've struggled to try to improve the heuristics.

One of the things that makes this hard is that the query optimizer doesn't really have enough information to know when additional indexes would be useful.

Another case came up recently. We added an index to speed up one thing and sometime later (through customer complaints) found that it had slowed down something else. (We probably need some performance tests to catch this kind of thing.)

Adding an index shouldn't slow anything down, it should only potentially speed things up. But extra indexes tend to lead to the query optimizer choosing additional indexes, and sometimes slowing things down.

As I was working on the current problem, I wanted to measure the speed without additional indexes so I disabled this feature.

Then I started to wonder if this feature was actually providing enough benefit to justify all these hassles (and slow downs). So I ran our test suite of over 1000 tests with additional indexes disabled and it made no difference to the speed!

That doesn't mean there aren't certain situations where this feature would be worthwhile. But it is a pretty good indication that overall it's not providing a lot of benefit. And there's no question that it's caused problems.

So I think I'll do some more testing and try it out in-house and if no problems come up I'll just remove this feature.

Obviously, I shouldn't have added this feature in the first place without "data" to tell me it was worthwhile. i.e. I shouldn't have prematurely optimized. The problem is, there's no way to know if it will be beneficial or not without implementing it. It's not like measuring a bottleneck in your code and then optimizing it. In this case, you're talking about whether adding something will help, and how can you "measure" that without implementing it?

Oh well, I guess there's nothing wrong with determining that a feature is more trouble than it's worth and ripping it out. You just have to get over the psychological hurdle of throwing away a bunch of work.

* YAGNI = you ain't gonna need it

Wednesday, May 27, 2009

Upgrading the Living Room

For some time my computer monitor has been bigger (not to mention newer and better) than our television. We've had an ancient (in technology terms) 21" CRT television for a long time.

It hasn't bothered me too much. We don't watch much TV and when we do, most shows hardly warrant high definition. ER in high def? Big deal. Movies maybe. And a wide screen would be nice for movies.

I did have it connected to my old Mac Mini and an EyeTV to play DVD's and use as a PVR and rent movies from iTunes. But the resolution on an old CRT TV is pretty awful. It might barely manage 640 x 480, but software these days tends to want at least 1024. I managed by using the handy screen zoom feature of the Mac - it wouldn't have been usable without it.

Shaw (my cable company) has been bugging me to upgrade to digital for a long time. They finally got the price right (free!) so I let them send me a digital terminal. The downside is that the Mac can't change channels on the cable terminal (unless I get an IR blaster) so I still have the EyeTV hooked up to the analog cable. So if I want to use the Mac to record TV or to pause and skip commercials, then I can't use the digital cable. I could get an HD terminal with PVR but that's another expensive box, and I'd rather use my Mac. Oh well, like I said, I'm not really too worried about picture quality for regular TV shows.

I've been planning on getting a new TV but the holdup was that our entertainment stand pre-dates big screen TV's and wasn't the right size or shape. We shopped for a new stand in town but couldn't find anything we liked. We found one we liked at Ikea in Edmonton, but they didn't have it in the color we wanted. Finally, on the way back from the mountains we stopped at Ikea in Calgary and got one.

The new wall unit wouldn't hold the old TV (not deep enough) so I decided I might as well take the plunge. I did a minimum of research on the internet (trying to avoid the tyranny of choice!) and ended up buying a 37" Sony Bravia XBR6. For movies and TV 720p would have been fine but I wanted full 1080p (1920 x 1080) resolution, partly for "future-proofing" and partly to view my photographs (almost the same as the 1920 x 1200 resolution of my 24" iMac). And, yes, photographs look great on it :-)

I had my Mac hooked up to the old TV with the DVI to Video adapter. I was using the composite (RCA) video output but it also has S-Video so that's what I used to hook up the new TV. The results were poor. It only went up to 1024 x 768 resolution and it was quite fuzzy. For some reason I had it in my head that S-Video was high resolution and digital, but it's not. It's analog and not much better than composite. (The TV doesn't have DVI input, just HDMI so I couldn't directly hook up DVI.)

I looked at my neighborhood London Drugs but they didn't have anything better. However, my local Apple store (not an official Apple store, just a store that specializes in Apple stuff) had a DVI to HDMI cable. They also sold me an optical audio cable (DVI doesn't have audio) but surprisingly, the Sony doesn't have optical input, only optical output. However one of the HDMI ports is paired with RCA audio inputs so I used that.

The results were much better, I got the full 1920 x 1080, nice and crisp and perfectly readable.

However, the next day I fired it up and the display was "bigger" than the screen so the top menu was no longer visible. Strange. I checked the Display preferences and they looked ok. I did notice that "Overscan" was turned on, so I tried turning this off, but then the display was too small - not filling the screen and no longer sharp. Eventually I rebooted and turned Overscan back on and it was back to working properly. I think what happened was that I turned the Mac on before the TV and it didn't recognize the display properly. I guess I'll have to remember to turn the TV on before the Mac. Probably at the same time would work since the Mac takes longer to boot than the TV. No, today I was careful to turn on the TV first, and it still did this. Even a restart didn't fix it. But toggling the overscan off and back on after the restart fixed it. Hopefully this won't be a recurring problem.

Regular DVD's are only 480i (720 x 480) resolution. You need Blu-Ray to get the full 1080p resolution. But I'm using the Mac as a DVD player, and although you can get external add-on Blu-Ray drives, OS X doesn't support Blu-Ray yet. Obviously I could get a separate Blu-Ray player, but I'm trying to reduce the number of boxes, not get more!

You also can't rent HD movies through iTunes on a regular Mac, only on an Apple TV. I'm not sure why they made that choice. To encourage people to buy Apple TV's? But isn't a more expensive actual Mac even better? I like the idea of the Apple TV - simpler and more energy efficient than a Mac. And you can run Boxee on Apple TV so you're not limited to the built-in software. But it can't act as a PVR. And I pretty much refuse to watch regular TV shows without a PVR since commercials drive me crazy.

DVD and regular iTunes movie resolution still looks pretty good scaled up to the higher resolution by the Mac. I assume it's doing similar processing as the "upscaling" that better DVD players and home theater systems do.

One thing I have noticed is that the Mac mini runs a lot hotter driving the new TV. I'm guessing this is due to more processing required for the higher resolution. I wonder if the newer Mac mini's would do better? I have come across people saying Apple TV has trouble driving full 1080 resolution due to the processing requirements.

I also took the plunge and moved my physical cd collection and cd player out of the living room. For a while now, I've had my old 30gb iPod hooked up to the stereo and used that. But Shelley still used the physical cd's. Now I don't even have the iPod hooked up - which means you have to fire up the Mac and TV to listen to music. It may be a little tough convincing Shelley that's an improvement! She still shies away from the Mac as a PVR, although if you ask me, it's no more complicated than the standalone PVR we had.

I was using my iPod instead of playing music via the Mac for a couple of reasons. One is that the iPod is pretty much instant on - no waiting for it to boot. Another issue was reading the screen, although I realize that FrontRow might have solved that. The other problem is how to access my music. My main collection is in iTunes on my iMac. I can share that library, but being energy conscious I turn my iMac off when I'm not using it, and then I can't access it. So I'd have to boot two machines.

But I keep a mirror of the music files on my Time Capsule, so I ended up importing all the files into a local iTunes library on the Mac mini. That works well, but I'm not sure how I'm going to keep it up to date. Maybe I should keep a single main library on the Time Capsule. On second thought, the problem with this approach is that currently you have to manually connect the Time Capsule shared drive every time you start up, which is a hassle and won't help with convincing Shelley this is a "better" setup. There are ways to automatically connect it, but then that slows up the boot process.

Maybe I should just periodically mirror my iTunes library from my iMac to the Time Capsule and from there to the Mac Mini. I think if I'm careful to keep the paths the same that will work.

I did still hook up the stereo receiver and run audio output from the TV to it, mainly to get better sound for music through the bigger JBL speakers rather than relying on the TV's little built-in ones.

But the receiver is getting dated too - it doesn't even have a remote! Maybe I should upgrade to a "home theater" system? I'm not really that concerned about surround sound for movies. I'd be afraid a set of small low end surround sound home theater speakers won't do as good a job for music as my current setup. I guess I could get a home theater system with an integrated Blu-Ray player. Or maybe there are decent speakers that would connect directly to the Mac that would eliminate the need for a receiver entirely. But what is the sound quality like for listening to music rather than playing games? Maybe I'll hold off on this - enough changes for one time!

The old TV

and the new setup:

Thursday, May 14, 2009

One Year of jSuneido Development

Just over a year ago I was trying to make the C++ implementation of Suneido multi-threaded.

I started to wonder if this was the right approach, and whether I might be better off leveraging an existing VM, and dreamed up jSuneido.

So for the last year I've been working on porting Suneido to Java. I think it's going pretty well, although I wish it would go faster.

jSuneido is currently about 22,000 lines of code. (About 5000 of those are tests.) I've been working on it an average of 3 days a week for a year - that's about 150 days, or about 150 lines per day.

About six weeks after I started I was doing about 100 lines per day. That was calendar days rather than working days, but for the first burst of enthusiasm I worked on it most days.

I also guessed jSuneido might be about 30,000 lines of code. Given the current size and what's left to do, that probably isn't too far off. Another 8000 lines at 150 lines per day is about 50 days, at 3 days per week, that's about 17 weeks or roughly 4 months.

If you'd told me at the start it was going to take me 16 months I would have been either skeptical or depressed or some combination. But where I stand now, 4 more months doesn't seem too bad. It almost feels like maybe I can see the light at the end of the tunnel.

Of course, that's a really crude estimate and it could be way off. Some of the things I've got left to do could take a lot longer. Tracking down obscure bugs could take indefinite amounts of time. And with concurrency issues looming, obscure bugs could be the order of the day.

I'm not much of a believer in software project estimating. But it's fun to play with the numbers, especially since there's no one to hassle me if they're wrong.

I'm still pretty comfortable that this was the right way to go. Maybe I could have got some multi-threading going in the C++ version in less time. But I still would have been stuck with an aging non-portable implementation with issues like poor garbage collection.

In some ways this qualified as rewriting from scratch, seldom a good idea. But because it has to run all the same Suneido code and interoperate with the existing version, it's more of a port than a rewrite. I obviously had to redesign some of the internals to work with Java and the JVM, but hopefully I've retained most of the accumulated wisdom of the existing implementation.

Tuesday, May 05, 2009

Another jSuneido Milestone

The last little while I've been working on getting the standard library to compile to Java bytecode.

I sat down to work on it today, ran the test to see where it would fail next, and it didn't - it compiled everything in stdlib. Sweet!

Of course, I'm not actually running the code, so I can't be sure if it's compiling "correctly", other than it's passing the Java verifier so I know the bytecode is "valid".

The next step is to actually run the stdlib tests, not just compile them.

No ... now that I think about it, the next step is to implement the built-in functions for things like string manipulation and database access that the stdlib tests will require.

Back to the grindstone :-)

Friday, April 24, 2009

Apple Makes a Billion

Apple recently passed a billion applications downloaded from their App Store for iPhone and iPod Touch. That's in roughly nine months.

Lots of those apps are free or only a few dollars. Lots are games. Lots probably only get used a few times and then forgotten.

But a billion of anything is pretty impressive.

Sunday, April 19, 2009

No Canadians, Eh

I guess they didn't consider that Canadians might occasionally visit a US store.

(note: this code has expired)

Blogger Line Spacing Problem

When I viewed my last post, I was annoyed once more by the line spacing changing after a block quote.

I did a quick Google search and found a fix.

It's a simple change to the CSS of the template, moving the line spacing from ".post p" to just ".post".

I should have done this a long time ago!

Suneido Exception Handling part 2

It actually didn't take that long to implement the ideas I talked about in my last post.

I ran into some problems with circular dependencies when I made my exceptions derive from SuString, but by splitting the header I got around these.

The hardest part was remembering how C++ exceptions work. A year of programming mostly in Java and already the C++ details are fading!

I remembered something in Effective C++ but it turned out to be in More Effective C++.

The recommended style is:
throw MyException(...)

...

catch (const MyException& e)

At first glance this seems wrong because you're throwing a temporary value, but C++ copies thrown exceptions. And catching by reference avoids a second copy that's done if you catch by value.

Of course, this recommendation assumes no garbage collection. With garbage collection, like in Suneido, it's reasonable to just throw and catch pointers, which is what I ended up doing. (Before changing to pointers I ran into problems because I passed the caught exception to the Suneido catch code, forgetting that the caught value is a temporary on the stack.)

It was a good reminder of the extra complexity in C++. In Java you don't have to worry about whether you have a value or a pointer or a reference and whether it's on the stack or the heap. You do have the difference between primitive types and reference types, but especially with auto-boxing and un-boxing it's pretty transparent. And in any case, it doesn't lead to crashes like forgetting a C++ value is on the stack.

Since exceptions are a sub-class of strings, all the existing code that assumes strings works fine.

I added two methods: exception.As(newString) to re-throw an exception with a different string. And exception.Callstack() to retrieve the call stack from an exception (e.g. to log it).

Next, I'll have to go through our libraries and make sure that any re-throws are done properly to preserve the original source of the exception, and that any logging uses the callstack from the exception.

Wednesday, April 15, 2009

Suneido Exception Handling

One annoying part of exceptions in Suneido is that there's no way to "properly" re-throw an exception.

You can catch (e.g. to perform some cleanup or logging) and then re-throw, but then the debugger shows the source of the error as being your re-throw, not the original. This makes it hard to figure out the original source of the problem.

This was an area that caused me some confusion moving from C++ to Java. In C++ you re-throw by saying just "throw;" without any exception value. Java doesn't have this. At first, I thought that meant you couldn't re-throw in Java, but you can.

I had also wondered why Java code always constructed new exception objects to throw. Why not just have some static exception objects you could re-use?

The reason is that it is the exception constructor that captures information about the location of the error, and the call stack, etc. So obviously, re-using exception objects wouldn't work too well!

So to re-throw in Java, you simply throw the exception you caught. It will have the correct information since it was constructed at the site of the original error. (Unless you want to disregard that information, in which case you can always just construct a new exception.)
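
A contrived example to show the difference:

public class RethrowExample {
    public static void main(String[] args) {
        try {
            try {
                fail();
            } catch (RuntimeException e) {
                System.err.println("logging: " + e); // do some cleanup or logging
                throw e; // re-throw the same object - the original stack trace is preserved
            }
        } catch (RuntimeException e) {
            e.printStackTrace(); // the top of the stack trace still points at fail()
        }
    }

    static void fail() {
        throw new RuntimeException("original error"); // stack captured here, at construction
    }
}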

It seemed like this approach might be a possible way to improve Suneido's exception handling. But Suneido exceptions are thrown and caught as strings, not exception objects, which makes it a little tougher to include exception information.

But it should be feasible. The value given to catch could be an object that inherits from SuString, but adds exception information. throw could allow either a simple string, in which case you'd construct an exception string with information about the current call stack. Or you could re-throw an exception string from catch, in which case you'd preserve the original information.

And the string-like exception object could also have methods for getting the call stack information, which would be useful for things like logging.

Actually, cSuneido already does part of this. Internally, a throw creates an exception object that records the current frame and stack pointers. But catch is only given the thrown string part of the exception. And it's only recording pointers, so if any code runs (e.g. exception handling code) then the call stack is lost (overwritten).

To preserve the information, you'd have to actually copy it before it was overwritten. Currently, that's only done for uncaught exceptions which end up in the debugger. To allow catch and re-throw you'd have to copy the information before you executed any exception handling code, which would probably be simplest to do at the throw. This could slow down exception handling, but exceptions shouldn't be getting used anywhere that would be a problem. (Note: On Java with jSuneido the Java exception object already captures the information for us.)

finally
Currently, Suneido doesn't have support for "finally" like Java. I thought it would be easy to add this to jSuneido since Java supports it. But unfortunately, finally is implemented by the Java compiler, rather than the JVM. TANSTAAFL

Wednesday, April 08, 2009

jSuneido Progress

I've been making steady progress on the Java version of Suneido. I've been working on compiling to Java byte code and have functions with expressions and control statements pretty much finished. That leaves exceptions, blocks, and classes.

But yesterday I went back and made a fairly radical change to how jSuneido is implemented. I started at 7:30 am, started making changes, broke almost every test, and finally got the tests running again at 9:30 pm. (with breaks for lunch and supper)

cSuneido's data types all inherit from SuValue. Everything is done in terms of virtual calls on SuValue. I followed the same pattern in jSuneido and it was working ok.

The problem is that in Java this means everything being "double wrapped". e.g. a Java string is wrapped in a Suneido SuString. This isn't twice as much memory, since the wrappers are small, but it is twice as many memory objects. And it means every operation, (e.g. string concatenation) has to create both a result object and a wrapper.

Integers aren't as bad since the wrapper can contain the primitive type. But even integers are "worse" than cSuneido because it encoded them directly in the pointer. Needless to say, the JVM doesn't allow this sort of trick.

The other problem is that using wrappers like this makes the code a lot more "awkward". Instead of 123 or "hello" you have to write SuInteger.valueOf(123) or SuString.valueOf("hello").

Another area that I was thinking about was integration with Java code. With Suneido having its own types, you'd have to wrap and unwrap values in the interface, which seems ugly.

One thing that got me thinking about this was Scala's use of implicit conversions to extend existing classes (like String or Integer) without actually storing them in a wrapper. (Although I wonder about the performance impact of wrapping on the fly.)

I started wondering whether I could drop the idea of an SuValue base class and the derived wrapper classes, and just use Java Object instead. That way I could directly use native Java types like Integer, String, and BigDecimal.

The downside of this approach is that you can't do everything with virtual calls on SuValue. Since there's no common base class (other than Object, which obviously doesn't have the methods I need) you have to resort to instanceof and getClass. The object-oriented purists would frown on this. But again, Scala (and other functional languages) opened my eyes on this a bit, since matching on "type" is quite common and accepted.

So operations like "add" have to check the types of values and handle any conversions. This is a little ugly, but it's isolated in a small amount of core code that doesn't change much. And I can't say I was sorry to drop the double dispatch I was using. It's a good technique, but I still find it confusing.
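
Roughly along these lines (a simplified sketch, not the actual jSuneido code):

import java.math.BigDecimal;

// simplified sketch of type-checking dispatch on plain Objects -
// just the general shape of it, handling Integer and BigDecimal values
public class Ops {
    public static Object add(Object x, Object y) {
        if (x instanceof Integer && y instanceof Integer)
            return (Integer) x + (Integer) y;
        if (x instanceof Number && y instanceof Number)
            return toBigDecimal((Number) x).add(toBigDecimal((Number) y));
        throw new RuntimeException("can't add " + typeName(x) + " and " + typeName(y));
    }

    private static BigDecimal toBigDecimal(Number n) {
        return (n instanceof BigDecimal) ? (BigDecimal) n : BigDecimal.valueOf(n.longValue());
    }

    private static String typeName(Object x) {
        return x == null ? "null" : x.getClass().getSimpleName();
    }
}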

Ideally, you'd test both approaches and measure speed and memory usage. Maybe if I had a team of programmers (or grad students) to assign to it. But with just me, part time, I can't see spending the time to do this. So I have to take my best guess at what the best approach is.

I decided it was the way to go and so I made the switch. Better now, when I could do it in a day, and not worry too much if I introduce bugs, versus later when I'd have more code, and less tolerance for bugs.

So far, I'm happy with the results. There are a few isolated ugly parts, but other than that it seems cleaner. Time will tell, although there's no way to know how the other approach would have worked out, so I'll never know for sure.

Tuesday, March 31, 2009

Testing Techniques

I've written quite a few tests that look like:

String[][] cases = new String[][] {
    { "something", "expected result" },
    ...
};
for (String[] c : cases)
    assertEquals(c[0], c[1], process(c[0]));


This works well enough, but one drawback is that when one case fails, there's no easy way to "go" to that case to edit it.

The testing people would probably say each case should be a separate test method, but that's a lot of extra declarations of methods.

So lately I've been writing tests like:

    test("something", "expected result");
    ...

void test(String input, String expected) {
    assertEquals(input, expected, process(input));
}


Written this way, when there's a failure Eclipse will take me straight to the failing case.

Another Technique

Sometimes it's useful to do some manual "exploratory" testing. (I know you should program test first, but I'm afraid I don't always manage it.)

For example, working on jSuneido's expression code generation I wrote a simple read-eval-print-loop so I could type in expressions and have them compile and run and print out the result.

One of the dangers is that you don't write so many automated tests because you're doing manual testing. To get around this I had my program write the cases to a text file in the correct format to paste straight into my tests.

For example, this session:

> 123 + 456
 => 579
> f = function (x, y) { x + y}; f(123, 456)
 => 579
> "hello world".Size()
 => 11
> q
bye


Added this to the log file:

test("123 + 456", "579");
test("f = function (x, y) { x + y}; f(123, 456)", "579");
test("'hello world'.Size()", "11");

Scala Attractions

Here are some of the things I find attractive about Scala. I still haven't written a line of code, though, so don't take it too seriously. Just first impressions of a language connoisseur :-)

user defined operators
In C++ you'd call this operator overloading, but Scala doesn't have a reserved set of operators. Even "+" is just a method. Obviously, this has potential to be abused, like any other feature. But it's a lot nicer to be able to say x + y for a new type of numbers, rather than x.plus(y)

traits
Scala traits are like Java interfaces, but they can include method implementations. They're similar to mix-ins. I struggled to do similar things in C++ so I can see the value.

implicit conversions
Ruby and other dynamic languages have "open classes" where you can modify existing classes at any time. They call it "monkey patching". It can be really useful, but I also find it really scary. Maybe I'm just a wimp, but I don't want standard classes changing under me. If I use some third party library or plug-in, I'm not sure I want the standard string or list class to change as a result.

Scala takes a different approach - you define implicit conversions to your own class. So instead of adding methods to String, you define an implicit conversion to MyString, which then has the extra methods you want. Scala provides a number of "rich wrappers" with implicit conversions for things like RichString and RichInt.

This is a good example of why you need to focus on the end, not the means. If what you want is to be able to write str.myOperation() there's more than one way to do that, and it doesn't necessarily require a dynamic language.

Of course, implicit conversions can cause problems as I know well from C++. But Scala's approach looks like it avoids most of the problems.

everything's an object
In other words, no more "primitive types", like in Java, that have to be handled specially. And no more boxing and unboxing. It's too bad Java didn't take this approach. There are performance issues, but with today's compiler technology they can be overcome.

first class functions
Meaning you can have unnamed function literals and pass functions around as values. To me, this is huge. Sure, you don't need this all the time, but to write higher level code it's invaluable.

lazy
This is a little thing, but I think it's cool that Scala provides a built-in way to do lazy initialization. As opposed to Java, where Effective Java has a whole section on how to do (and not do) lazy initialization.
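
For comparison, the "lazy initialization holder class" idiom from Effective Java looks roughly like this in Java (Config and Settings are placeholder names), versus a single lazy val in Scala:

public class Config {
    // Holder isn't loaded (and expensiveLoad isn't run) until getSettings is first called
    private static class Holder {
        static final Settings SETTINGS = expensiveLoad();
    }

    public static Settings getSettings() {
        return Holder.SETTINGS;
    }

    static Settings expensiveLoad() { return new Settings(); }   // placeholder
    static class Settings {}
}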

type inference
Scala seems to combine the best of both worlds - the benefits of static typing, with almost the brevity of dynamic languages. Would you rather write:

val a = Array("one", "two", "three")
val b = new HashMap[String, String]


or:

String[] a = new String[] { "one", "two", "three" };
Map<String, String> b = new HashMap<String, String>();


I've talked about this before. C++ is just as bad as Java, although they are proposing an improvement in C++0x.

== is equals
I think Java made the wrong choice in making == compare pointers and equals compare values. I've talked about this before. It's great to see Scala break with Java and normally make == the same as equals.
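
A tiny illustration of the Java behaviour being criticized:

public class EqualsDemo {
    public static void main(String[] args) {
        String a = new String("abc");
        String b = new String("abc");
        System.out.println(a == b);        // false - compares references
        System.out.println(a.equals(b));   // true  - compares values
    }
}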

actor model for concurrency
This is actually a Scala library rather than part of the Scala language. But it's a standard library. The actor model is what has made Erlang known for writing concurrent applications. I'm not sure how Scala's implementation compares to Erlang's, but it's nice to see tools for concurrency. Especially coming from C and C++ that basically said "not our problem". I wonder if I can make use of this when I get to making jSuneido's database concurrent?

lists and tuples
These are classic functional data structures. Although I wouldn't say I "know" functional programming, I'm certainly not ignorant of the benefits of immutable data structures and the dangers of side effects. My programmers will tell you I've been known to harp on the evils of side effects. cSuneido makes good use of lists and other immutable data structures.

class parameters
This looks like a great shortcut for writing simple classes. Would you rather write:

class Data(val count: Int, val name: String)

or:

class Data {
    final public int count;
    final public String name;
    Data(int count, String name) {
        this.count = count;
        this.name = name;
    }
}


optional semicolons
If nothing else, this will save me from the annoyance of typing semicolons in Suneido where I don't need them, and then turning around and omitting them from Java where I do! The joys of switching languages every other day!

Sunday, March 29, 2009

Mozilla Labs - Bespin

Here's another interesting Mozilla Labs project - a web based programmer code editor, with some interesting UI innovations.

Mozilla Labs - Bespin

Introducing Bespin from Dion Almaer on Vimeo.

Mozilla Labs - Ubiquity

This isn't new, but I just came across a reference and looked it up. Sounds pretty interesting.

Mozilla Labs - Ubiquity


Ubiquity for Firefox from Aza Raskin on Vimeo.


Wednesday, March 18, 2009

Scala Gets My Vote

Of all the languages I've read about recently, Scala strikes me as the one I'd most like to use.

I've always been interested in (computer) languages. I haven't actually seriously used many different languages - mainly C, C++, and recently Java. But I've avidly read about many.

I have books on Scheme, Smalltalk, Perl, Python, Icon, Dylan, C, C++, C#, Tcl, Ruby, Java, Lua, JavaScript, Erlang, Groovy, Scala, and probably more that I've forgotten.

I'm just finishing reading Programming in Scala by Martin Odersky (the creator of the language), Lex Spoon, and Bill Venners. Part of what prompted me to read it was a podcast with Bill Venners talking about why he likes Scala.

I liked C++. People criticize it for being too complex, but I liked its powerful features.

I can't get excited about Java. It's ok, and it has great IDE support, but it's a little boring. I like the garbage collection and the safety of no pointers, but I miss C++'s tricks with things like templates and operator overloading.

For some reason I haven't got too excited about Ruby either. I haven't written much code in it, but I've read a fair bit, gone to conferences, and managed a project written with Rails.

Ironically, Java compared to C++ is a bit like Suneido compared to Ruby. Both Java and Suneido try to avoid features that, although powerful, allow you to "shoot yourself in the foot" (if not in the head!)

Scala feels more like C++, lots of power and flexibility and tricks, enough to probably make your head hurt.

Both Scala and C++ try to support writing powerful, natural libraries and extensions, which I like.

For "system" programming (as opposed to "application" programming) I still lean towards static typing. But I do get tired of the verbosity of type declarations in C++ and Java. Scala is statically typed, but it also does type inference which greatly reduces the verbosity.

And it's a plus for me that Scala runs on the JVM and can be easily integrated with Java. (There's a .Net version as well, although not as well supported.) It seems like I could write parts of jSuneido in Scala if I wanted. I suspect you still might want to implement performance bottlenecks in Java, though. Or at least be careful which Scala features you used, since power often comes at a price.

Scala supports functional programming, but in combination with more conventional object-oriented programming. I haven't spent the time to wrap my head around "pure" functional languages like Haskell or (O(Ca))ML. But I could see using some functional programming in Scala.

I thought the Programming in Scala book itself was good. It's not small - 2.6 lbs and 776 pages - but I had no trouble getting through it. I see it won a Jolt Award. Apress also has a Scala book, and both Pragmatic and O'Reilly have Scala books coming.

Tech Social Skills

Our internet access quit working the other night. I tried the usual things, restarting computers, routers, cable boxes, etc. But no luck.

So I called tech support. After working my way through the voice menus I finally reached a recording - which said more or less:
"We are currently experiencing problems in the City Park area.
We are working on it.
If you are calling from this area, hang up now."
Perfectly logical, but it sounds like Spock from Star Trek. No apology, no explanation, no asking for patience. Basically just "get lost, don't bug us".

Update: The next day (yesterday) the internet was working during the day, but sometime around 4 or 5 pm it died. I would speculate that everyone got home from work and rushed to see if it was working, the load crashed their system, and it stayed off all evening.

Side Note: Why do you have to work through Shaw Cable's entire voice menu system to get to internet support? I got the number from a sticker on the cable modem. Couldn't they have a separate phone number? I guess that wouldn't be as convenient for them.

Sunday, March 15, 2009

"Getting" Git

The Git version control system, originally developed by Linus Torvalds for the Linux kernel, has been getting pretty popular lately, especially in the Ruby on Rails crowd.

There's a preliminary Eclipse plugin and even a TortoiseGit project. SourceForge is even offering it already, whereas it took them forever to get Subversion.

At RailsConf last year I went to a session on Git Internals by Scott Chacon and was quite intrigued. You can see the slides with voice over.

Suneido has its own version control system that has served us well, but it's pretty basic. Git's distributed style and easy branching would be nice to have.

At OSCON 2006 I had gone to a tutorial on the Subversion APIs with the idea that maybe we could replace Suneido's version control with an interface to Subversion. I was disappointed to find that the APIs are all file oriented. I even talked to the presenter after the session to see if there was any way around this, but he couldn't suggest anything. As far as I can tell, Git is similarly file based. Probably a big reason for this is that these types of systems usually evolve from shell scripts that, understandably, work with files.

My interest in Git was raised by reading Pragmatic Version Control Using Git. On top of that, we've recently had some problems with Suneido's version control. Nothing major but still impetus to think about something better. I started to think about what it would take to implement Git like version control in Suneido. Having enjoyed his talk, I also bought Scott Chacon's PeepCode pdf book on Git Internals which expands on the same material.

The basic ideas behind Git are pretty straightforward, although quite different from most other version control systems. The way it works is attractively elegant and simple. Yet it supports powerful uses.

One area where Suneido's version control has problems (as do other version control systems) is with moving and renaming. Git supposedly handles this, but I couldn't figure out how. There didn't seem to be anything in the data structures to track it. Strangely, neither the Pragmatic book nor the Git Internals pdf talked much about this.

But it didn't take long on the internet to find the answer. The Wikipedia article was surprisingly useful. It turned out to be an issue that had seen some discussion. For example this mailing list message and this one.

I was right, the data structures don't record any information about moves or renames. Instead, that information is heuristically determined after the fact, by looking at file contents.

One of the common uses of Suneido's version control is to check the history of a record. (Although we haven't gone as far as to implement a "blame" utility yet.) But the way Git works, this is not a simple task. You have to work back through each version of the source tree looking for changes to the record. But no doubt there are ways to speed this up, maybe by caching which records were involved in each commit.
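
A rough sketch of that search, using hypothetical Commit and Tree types (not Git's actual data structures or API):

import java.util.ArrayList;
import java.util.List;

interface Tree { String hashOf(String path); }        // content hash of the record, or null
interface Commit { Commit parent(); Tree tree(); }

class RecordHistory {
    // collect the commits where the record's content hash differs from its parent's
    static List<Commit> history(Commit head, String path) {
        List<Commit> changes = new ArrayList<Commit>();
        for (Commit c = head; c != null; c = c.parent()) {
            String hash = c.tree().hashOf(path);
            String parentHash = c.parent() == null ? null : c.parent().tree().hashOf(path);
            if (hash != null && !hash.equals(parentHash))
                changes.add(c);
        }
        return changes;
    }
}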

I worked up enough interest and understanding to take a stab at implementing something this weekend. I have a prototype (a few hundred lines of code) that can save a source tree in a repository and retrieve it again.

Performance is a bit of an issue. Git works by comparing source trees. We have about 15,000 records (think source files) in our main application. Building a Git style tree of the working copy takes some time. But it turns out most of the time is not reading the records, it's calculating the hashes. It shouldn't be too hard to make a cache to speed that up. (You could store the hashes in the libraries but that would be dangerous since something might update content without updating the hash.)
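
A sketch of the kind of cache I have in mind. The hashing matches Git's blob format (SHA-1 over "blob <length>\0" plus the content), but the cache key - record name plus size - is just a guess at a cheap way to invalidate:

import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

public class HashCache {
    private final Map<String, String> cache = new HashMap<String, String>();

    // hash a library record the way Git hashes a blob: sha1("blob <length>\0" + content)
    public String hash(String name, byte[] content) throws Exception {
        String key = name + ":" + content.length;   // crude invalidation - just a guess
        String sha = cache.get(key);
        if (sha == null) {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(("blob " + content.length + "\0").getBytes("UTF-8"));
            md.update(content);
            sha = toHex(md.digest());
            cache.put(key, sha);
        }
        return sha;
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes)
            sb.append(String.format("%02x", b & 0xff));
        return sb.toString();
    }
}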

There's obviously still a ways to go before it would be able to replace Suneido's existing version control, but it's enough to convince me that it would be feasible. Just what I needed, another project :-)

Leaky Abstractions

Originally, Suneido memory mapped the whole database file at once. This is a simple, fast way to access a file. (Note: This doesn't load the whole file into memory. It uses the virtual memory system to load pages of the file on demand.)

Unfortunately, 32-bit CPUs have a limited amount of address space. Windows only allows a maximum of 2 GB and often you have to share that space with other things.

To allow databases bigger than the address space, I changed Suneido to memory map the database file in sections. Once you've filled up the address space, if you want to map another section, you have to unmap one of the previous ones.

This is a lot like how virtual memory works, and you have the same question of which section to unmap. What you really want is to unmap the section that you're not going to need for the longest time into the future. Unfortunately, there's no way to predict the future. Instead, various heuristics are used, a common one is to unmap the least recently used (LRU) section. In other words, the mapped sections form an LRU cache into the database file.

I used an efficient approximation of LRU called the clock algorithm. We've been using this for a long time and it seemed to work reasonably well.

But recently, we started to get occasional errors from loading large databases. Often this kind of thing is caused by memory or disk errors or corrupted data. But the strange thing was that the error was consistently reproducible. We could bring the dump file back to our office and try to load it and get the same error. And the database wasn't corrupt as far as we could tell - later dumps would load fine.

I started to suspect something to do with the memory mapping. One clue was that we only had this problem on large databases.

Eventually, I discovered the problem. The rest of the database code doesn't do anything to "lock" sections in memory while it works on them. Instead, it simply relies on LRU not unmapping anything that's been recently used.

But I was using an approximation of LRU. In certain rare cases the clock algorithm can end up unmapping a recently used page. (That's why it's only an approximation of LRU.) Normally that's not a problem. But because I was depending on LRU, when this rare case happened, it resulted in the error we were getting.

To fix it, I simply implemented a "true" LRU algorithm. It'll be a little slower than the clock algorithm, but given how it's used I don't think this will have a noticeable effect. And the code is actually simpler, which is always nice.
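
(The actual code is C++, but for the record, in Java you can get a true LRU cache of mapped sections almost for free from LinkedHashMap in access order. A sketch, with made-up section size and count - and note Java has no explicit unmap, you just drop the buffer and let the GC reclaim it:)

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.LinkedHashMap;
import java.util.Map;

public class MappedSections {
    static final int SECTION_SIZE = 4 * 1024 * 1024;   // made-up numbers
    static final int MAX_SECTIONS = 256;

    private final FileChannel channel;
    // accessOrder = true makes iteration order least-recently-used first
    private final LinkedHashMap<Long, MappedByteBuffer> sections =
        new LinkedHashMap<Long, MappedByteBuffer>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<Long, MappedByteBuffer> eldest) {
                // dropping the buffer lets the GC eventually unmap it
                return size() > MAX_SECTIONS;
            }
        };

    public MappedSections(String filename) throws Exception {
        channel = new RandomAccessFile(filename, "rw").getChannel();
    }

    public MappedByteBuffer section(long offset) throws Exception {
        long base = offset - offset % SECTION_SIZE;
        MappedByteBuffer buf = sections.get(base);   // get() counts as a "use"
        if (buf == null) {
            buf = channel.map(FileChannel.MapMode.READ_WRITE, base, SECTION_SIZE);
            sections.put(base, buf);
        }
        return buf;
    }
}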

The moral of the story is I got burnt by what Joel Spolsky calls a leaky abstraction.

PS. With jSuneido, I'm considering going back to the old system of mapping the whole database file at once. This would limit the size of the database on 32 bit systems, but it's probably not unreasonable, especially in the future, to require a 64 bit system if you want a big database. This would simplify the code and eliminate bugs like the one above.

Friday, March 13, 2009

String Concatenation

Working on jSuneido, I noticed that I hadn't implemented Suneido's optimizations for concatenating strings.

In most languages this kind of code is grossly inefficient:

s = ""
for each line
  s = s + line // concatenate on the end

That's because each concatenation is allocating a new string, a little bit longer each time. The normal solution is to use a different mechanism, e.g. a StringBuffer in Java.

Suneido optimizes this case by constructing a linked list instead of allocating a new string each time. Later operations then "flatten" the list into a simple string.

The downside is that if you concatenate a lot of little strings the linked list representation uses a lot more memory than a simple string, i.e. we've traded memory for speed.

Reading the 2nd edition of Programming in Lua, I came across the usual discussion of why it's bad to concatenate like this. But they recommend a different solution - to keep a stack/list of pieces with the invariant that you can't put a bigger piece on top of a smaller piece. Instead you combine them, and if the newly combined piece is bigger than the one below it, you combine those, and so on.

It took me a minute to think this through. If you're adding random sized pieces, then 50% of the time you'll be adding one bigger than the last one, and therefore combining them. The end result, if I'm thinking this through properly, is that, on the average, each piece in the stack/list will be half as big as the one before. So a list of N entries will hold roughly 2 ^ N characters. i.e. you can build a big string with a short stack/list.

The Lua book is suggesting doing this in your application code, but my thought is to replace Suneido's current algorithm with this one. I'd be interested to see how it compares. It will do more concatenation and allocation, but will use less memory when dealing with a lot of pieces. It also does the concatenation more "incrementally" rather than one big concatenation to flatten the list like cSuneido currently does.
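
A sketch of the scheme in Java - just the bare algorithm, not Suneido's implementation:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

public class ConcatStack {
    // invariant: no piece is larger than the piece below it
    private final Deque<String> stack = new ArrayDeque<String>();

    public void add(String piece) {
        // combine with the top piece while the new piece is at least as big
        while (!stack.isEmpty() && piece.length() >= stack.peek().length())
            piece = stack.pop() + piece;
        stack.push(piece);
    }

    public String value() {
        StringBuilder sb = new StringBuilder();
        // the bottom of the stack holds the earliest text, so build from the bottom up
        Iterator<String> it = stack.descendingIterator();
        while (it.hasNext())
            sb.append(it.next());
        return sb.toString();
    }
}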

High-Tech Storytelling

This is a cool video. It's about 10 min and best viewed full screen (button just left of the "vimeo" at the bottom).


World Builder from Bruce Branit on Vimeo.

Tuesday, March 10, 2009

Digital Photography

One of the interesting trends in digital photography is combining multiple images to overcome limitations of sensors and lenses. This is primarily a software trend that seems to be gradually moving from desktop software to in-camera features. Of course, the desktop software isn't standing still either - the latest Photoshop has some amazing features, and special purpose software does even more.

One of the first uses of multiple shots was for simple panoramas. But people are also using this to get high resolution images from lower resolution cameras. For a while, cameras have had modes to help with this, showing you the previous shot while you line up the next one (and keeping the exposure the same). Now the new Sony HX1 has a "sweep" mode where you just hold down the shutter button and "wave" the camera. It decides what individual shots to take. Cool. (It also has a 20x zoom lens and shoots 1080p hi-definition video!)

Multi-shot techniques are also being used for high dynamic range (HDR), where you take multiple shots at different exposures and blend them together. (examples) You can do this with software like Photoshop, but it's also moving into cameras like the Ricoh CX1.

And multiple shots can be used to overcome limited depth of focus. (You can also go the other direction and simulate a smaller depth of focus.)

The Sony HX1 will even take multiple shots and keep the one where the person's eyes are open.

None of this would be practical without a variety of technological advances - faster image capture, bigger/cheaper memory, more processing power in camera and in computer.

These techniques are sometimes called "computational photography".

There are so many cameras out these days that it's impossible to keep track. A couple that have caught my eye lately are the Canon PowerShot SX200 IS which is a descendant of the S3 I have but much smaller and with 12 mp and 720p video. It looks perfect for travel. And the Olympus Stylus Tough-8000 which is waterproof (to 10m/33ft), shockproof, freezeproof, and crushproof. Can't beat that for adventure sports.

I used to shake my head at my father's collection of cameras, but if I'm not careful I'll end up with my own. The sad difference is that his film cameras stayed relevant a lot longer than any camera you'll buy today, which will be out of date before you get it home.

Sunday, March 08, 2009

jSuneido Language Implementation

As the saying goes, "be careful what you wish for".

I was getting tired of the somewhat mechanical porting of code from C to Java, and I was looking forward to the language implementation because I knew it would involve some thinking.

Of course, now that I'm in the middle of thrashing around trying to figure out the best way to implement things, I'm thinking maybe there's something to be said for having a clear path!

"Mapping" one complex object (Suneido's language) onto another (the JVM) is complicated. There are a lot of choices, each of them with their pro's and con's. And you never have complete information either about how things work, or what their performance characteristics are.

Some of the choices won't be easy to reverse if they turn out to be wrong. Often, you're not even sure they were "wrong". They might turn out to have awkward parts, but then again, any design choice is likely to have awkward aspects.

One of the things to remember is that the Java language and the Java Virtual Machine (JVM) are different. Some features are implemented at the Java compiler level, and so they're not directly available at the JVM level. You have to implement them in your own compiler if you want them.

I'm also having trouble fighting premature optimization. On one hand, I want a fast implementation. On the other hand, it's better to get something simple working and then optimize it.

For example, I'd like to map Suneido local variables to Java local variables. But to implement Suneido's blocks (think closures/lexical scoping) you need a way to "share" local variables between functions. The easiest way to do that is to use an array to store local variables. But that's slower and bigger code. One option is to only use an array when you've got blocks, and use Java local variables the rest of the time. Or even more fine grained, put variables referenced by blocks in an array, and the rest in Java local variables. While faster and smaller, this is also more complicated to implement. To make it worse, when the compiler first sees a variable, it doesn't know if later it will be used in a block. So you either have to parse twice, or parse to an abstract syntax tree first. (Rather than the single pass code generation during parse that cSuneido does.)
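
The array approach is essentially the standard Java idiom for sharing a mutable local with an anonymous inner class; a tiny illustration (not the generated code):

public class BlockSharing {
    public static void main(String[] args) {
        // Suneido:  x = 0;  b = { x = x + 1 };  b(); b()
        final Object[] vars = new Object[1];   // slot 0 is "x"
        vars[0] = 0;
        Runnable block = new Runnable() {
            public void run() {
                vars[0] = (Integer) vars[0] + 1;   // the block sees and updates the same slot
            }
        };
        block.run();
        block.run();
        System.out.println(vars[0]);   // prints 2
    }
}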

One of the things that complicates the language implementation is that Suneido allows you to Eval (call) a method or function in the "context" of another object, "as if" it were a method of that object. This is relatively easy if you're explicitly managing "this", but it's trickier if you're trying to use the Java "this".
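
A sketch of what "explicitly managing this" means - made-up names, not jSuneido's actual API:

// methods compiled to take an explicit "self" rather than using Java's this
interface SuCallable {
    Object call(Object self, Object... args);
}

class Eval {
    // call any function or method "as if" it were a method of target
    static Object eval(Object target, SuCallable method, Object... args) {
        return method.call(target, args);
    }
}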

I'd prefer to map "directly" to JVM features like local variables and inheritance because the resulting code will be smaller and presumably faster (since JVM features will have had a lot of optimization). And presumably the closer the match, the easier it will be to interface Suneido code with Java code.

Despite the initial thrashing, I think I'm settling into a set of decisions that should work out. I already have code that will compile a minimal Suneido function and output Java bytecode (in a .class file).

So far, ASM has been working well. Another useful tool has been the Java Decompiler so I can turn my generated bytecode back to Java to check it.
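
To give a flavor of ASM, generating a trivial class looks roughly like this (a made-up example, not jSuneido's code generator):

import java.io.FileOutputStream;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import static org.objectweb.asm.Opcodes.*;

public class GenerateSample {
    public static void main(String[] args) throws Exception {
        ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
        cw.visit(V1_5, ACC_PUBLIC, "Sample", null, "java/lang/Object", null);

        // public static int answer() { return 42; }
        MethodVisitor mv = cw.visitMethod(ACC_PUBLIC + ACC_STATIC, "answer", "()I", null, null);
        mv.visitCode();
        mv.visitIntInsn(BIPUSH, 42);
        mv.visitInsn(IRETURN);
        mv.visitMaxs(0, 0);   // values ignored because of COMPUTE_MAXS
        mv.visitEnd();

        cw.visitEnd();
        FileOutputStream out = new FileOutputStream("Sample.class");
        out.write(cw.toByteArray());
        out.close();
    }
}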

Friday, March 06, 2009

Computer Vision

Knowing in theory that things are possible is a far cry from actually seeing them.

I'm not sure what the practical applications of this technology are, but it's sure cool!

Wednesday, March 04, 2009

jSuneido Progress

Yesterday I worked on re-implementing the database request and query parsers.

I got the request one done and a good portion of the query one. I'd hoped to finish it in one day, but I didn't quite make it. Probably another half day or so.

It slowed me down a bit trying to share code with the language lexer and parser, yet keep the code independent.

The only difference in the lexers is in the keywords. Unfortunately, there's no way (that I know of) to extend enums. Separate Token classes would have meant duplicating the common stuff - yuck!

Instead, I decided to make the lexer include both sets of keywords. The extra coupling is unfortunate, but I figure it's better than duplication.

Luckily, the way the parsers work, keywords are not "reserved". So it doesn't hurt either parser to have extra keywords. (It wouldn't have been hard to "enable" only ones needed in a particular case, but it would have been one more complication.)

I also ran into complications with generics. The parsers take a type parameter so they can be reused with different "generators", e.g. to generate code, execute directly, or build a syntax tree. There are a lot of actions shared between the different parsers (e.g. for expressions). The parsers also call each other (e.g. queries call the expression parser) so the generators have to be type compatible. I ended up with code like:
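
(The line itself was posted as an image - see the note below - so what follows is only a hypothetical reconstruction of the kind of bounded type parameters involved; none of these names are the real jSuneido ones.)

// hypothetical sketch - not the actual jSuneido declarations
interface Generator<T> {
    T constant(String value);            // actions shared by all the parsers
}

interface QueryGenerator<T> extends Generator<T> {
    T table(String name);                // query-specific actions
}

class Parse<T, G extends Generator<T>> {
    protected G generator;               // expression parsing uses the shared actions
}

// the query parser's generator must also satisfy the expression parser's bound
class ParseQuery<T, G extends QueryGenerator<T>> extends Parse<T, G> {
}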

But compared to some of the twisted C++ template code, it doesn't seem too bad.

(I ended up having to use an image for that line of code because Blogger kept getting messed up by the angle brackets, even if I entered them as & lt ; in Edit Html mode.)

Java Enums

When Java (1.)5 came out with the new enum facilities, I thought it was cool, but I wasn't sure how practical it was. Allowing members and methods on enums? It seemed weird. Of course, this was coming from a C(++) background where an enum is just a thinly disguised integer.

But now that I'm programming in Java I'm actually finding these facilities quite useful. For example, in the lexical scanner Token class I attach various information to the tokens:
  • the string "name" in the case of keywords
  • the "matching" token for parenthesis, braces, and brackets
  • whether it's an infix or assignment operator
I use a separate TokenFeature enum for the infix and assignment features, and store them using an EnumSet, which represents them as bits in an integer (for small sets).
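
Roughly what that looks like, much simplified (the real Token class has more constants, members, and constructors):

import java.util.EnumSet;

enum TokenFeature { INFIX, ASSIGN }

enum Token {
    LPAREN, RPAREN, LCURLY, RCURLY,
    ADD("+", TokenFeature.INFIX),
    ADDEQ("+=", TokenFeature.INFIX, TokenFeature.ASSIGN),
    IF("if"),
    WHILE("while");

    final String string;                    // name/symbol, if any
    final EnumSet<TokenFeature> features;
    Token other;                            // matching token, set up below

    Token() { this(null); }
    Token(String string, TokenFeature... features) {
        this.string = string;
        this.features = EnumSet.noneOf(TokenFeature.class);
        for (TokenFeature f : features)
            this.features.add(f);
    }

    boolean infix() { return features.contains(TokenFeature.INFIX); }

    static {
        // pair up the matching tokens after all the constants exist
        LPAREN.other = RPAREN;  RPAREN.other = LPAREN;
        LCURLY.other = RCURLY;  RCURLY.other = LCURLY;
    }
}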

The only awkward part is having to define a bunch of different constructors.

Having this information attached to the tokens simplifies the code quite a bit, avoids duplicating the information in several places, and makes it easy to maintain.

PS. Looking back at the code, I think it could be simplified. All assignment operators are infix (at least for my purposes) so I don't really need a set of features, the infix method could just check for INFIX or ASSIGN. Sigh ... sometimes it seems like the refactoring never ends!

Robots

Check out these photographs:

http://www.boston.com/bigpicture/2009/03/robots.html


Sunday, March 01, 2009

cSuneido Query Improvement

Recently we've been working on fixing some of the "slow" queries in our applications. Usually this means adding appropriate composite (multi-field) indexes.

But in one case, that didn't completely fix the problem. We added the index necessary to select the right data, but then we were summarizing by different fields, and that was requiring a temporary index.

Thinking about this, I realized that it doesn't make much sense to create a temporary index just so summarize can read the data in the right order to process it sequentially. Better to just read the data in any order and accumulate the results. That means you have to process all the data before you can produce any output, but that's no different than having to create the temporary index first. And (in most cases) the accumulated results will be smaller than the temporary index, since you're summarizing.
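
The idea, sketched in Java for brevity (cSuneido is C++, and the real summarize handles multiple group-by fields and operations):

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Accumulate {
    // summarize by reading rows in any order, accumulating one total per key
    static Map<String, Integer> totalBy(List<String[]> rows, int keyCol, int valueCol) {
        Map<String, Integer> totals = new HashMap<String, Integer>();
        for (String[] row : rows) {
            String key = row[keyCol];
            Integer total = totals.get(key);
            int value = Integer.parseInt(row[valueCol]);
            totals.put(key, total == null ? value : total + value);
        }
        return totals;   // output only starts now, but no temporary index was needed
    }
}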

It didn't seem like this should be too hard to add. (typical programmer optimism)

But then I started looking at the code and I realized that although the query operations choose strategies, they weren't actually using the strategy pattern. Instead they just have a bunch of if's to alter their behavior based on the chosen strategy.

It's funny that hasn't struck me before. It seems so obvious. And it's not like I just learned about the strategy pattern. I read Design Patterns when it first came out in 1994, 15 years ago. I'd made each query operation a separate class, but for some reason I'd stopped there.

So I decided to refactor the summarize code to use separate strategy classes. Which again turned out to be a bigger job than I thought.
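
The shape of the refactoring, again sketched in Java rather than the actual C++:

// the query operation delegates to a strategy chosen at optimization time,
// instead of if'ing on a strategy flag throughout its code
abstract class SummarizeStrategy {
    abstract Object[] next();
}

class SeqStrategy extends SummarizeStrategy {
    // source already ordered by the summarized fields - emit groups as they complete
    Object[] next() { return null; /* placeholder */ }
}

class MapStrategy extends SummarizeStrategy {
    // source in any order - accumulate results in a map, then emit them
    Object[] next() { return null; /* placeholder */ }
}

class Summarize {
    SummarizeStrategy strategy;   // set when the strategy is chosen
    Object[] next() {
        return strategy.next();
    }
}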

I started out just using my text editor (Scite), but I've been spoilt by all my IDE usage. So I tried using NetBeans (6.5). But that didn't work too well. I could build from NetBeans but I couldn't double click on the errors to go to the offending source. (Even with Scite I can do that.) And I couldn't seem to configure it to my indenting style (Whitesmith's).

So I switched to Eclipse. It's not as easy to import a project into Eclipse as it is in NetBeans, but after a few false starts I got it going. It does support Whitesmith's style. And you can double click on the errors. But C++ support in NetBeans or Eclipse isn't anywhere close to as nice as Java support.

One problem with splitting things up after the fact is that the code tends to have developed interconnections. Luckily it wasn't too hard to pass the necessary information.