Monday, February 09, 2015

Gimme Structure

"I believe that it may happen that one will succeed, and one must not begin to despair, even though defeated here and there; and even though one sometimes feels a kind of decay, though things go differently from the expected, it is necessary to take heart again and new courage. For the great things are not done by impulse, but by a series of small things brought together. And great things are not something accidental, but must certainly be willed. What is drawing? How does one learn it? It is working through an invisible iron wall that seems to stand between what one feels and what one can do."
-- Vincent Van Gogh

I used to think what I was looking for was good design. On more cynical days I'd settle for any design, or not even design, just some kind of structure.

I guess that's a bit like saying you want "quality". Would that be good quality or bad quality? Obviously, good structure is better than bad structure. But even bad structure is better than no structure.

I see, and work with, a lot of bad code, some of it written by my programmers, some of it (sadly) written by myself. The code seems to be split up into methods and classes more or less randomly. Names of variables and methods make no sense or are even outright misleading. It might work (most of the time) but it is difficult to understand, usually has duplication, commonly has logic errors, often old dead code, incorrect comments, etc. It will come as no surprise that it is hard to modify.

Part of the problem is incremental development. Even if there was some structure at some point, unless everyone modifying the code pays attention to maintaining that structure, it will degrade. And if it didn't have much structure to begin with it's even worse.

I don't think you can blame this on "evolution". Bad code is not very "fit". Natural selection would soon kill it off. Evolution is not intelligent design, but it comes up with lean, efficient solutions. It's not sloppy.

Much of the blame goes back to a common weakness in programmers - thinking that you are done when you have something that appears to work. Not going the extra distance to make sure it's readable, understandable, logically complete and correct. Often not even bothering to take care of the low hanging fruit like variable and method names.

And of course, once the code is a tangled mess no one wants to touch it to clean it up. Understandably, since it's a lot of work. And there's no doubt unobvious behavior in that code that you need to figure out and preserve. And there's a high risk of breaking things, and many programmers pay more attention to fear than to any desire for good code.

I have no silver bullets. Just a plea - please try to write code with some sort of comprehensible structure, for your own sake if nothing else.

Sunday, January 25, 2015

Effective Modern C++

I just finished reading Effective Modern C++ by Scott Meyers. Like his More Effective C++ and the original Effective C++ it's well written with good explanations and examples. This third book covers the latest C++ features in C++11 and 14.

It's been a long time since I read the first two books. Effective C++ was published in 1991! Back then I was writing fair amounts of C++ code. Nowadays the only C++ programming I do is maintaining the C++ implementation of Suneido.

I expected the new book to be similar to the previous ones - practical advice on how to effectively use modern C++. And there is lots of that. But it was also full of "gotchas" - things that won't compile (and give horrendous error messages), or compile but won't run, or compile and run but do the wrong thing.

C++ has always been a complex language and the new versions have only pushed that even further. It makes me appreciate the simplicity of the Go language which in some ways is a reaction to the complexity of C++.

Don't get me wrong, the new features of C++ are great, they improve the language in many ways. But my head is spinning with things like when perfect forwarding isn't perfect, when universal references aren't, and when uniform initialization isn't uniform.
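As a small illustration of the last of those, here is a minimal sketch of one well-known case where braces and parentheses mean different things. The function names are made up for the example; the std::vector behavior itself is standard.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Parentheses select the (count, value) constructor:
// ten elements, each equal to 20.
std::size_t size_with_parens() {
    std::vector<int> v(10, 20);
    return v.size();
}

// Braces prefer the std::initializer_list constructor:
// two elements, 10 and 20.
std::size_t size_with_braces() {
    std::vector<int> v{10, 20};
    return v.size();
}
```

Calling size_with_parens() gives 10 and size_with_braces() gives 2, even though the two declarations differ only in the kind of bracket.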

Monday, January 05, 2015

Safety First

I recently fixed a long standing (many years) bug in the C++ implementation of Suneido. A friend remarked how you'd wish that after this long all the bugs would have been found. Of course, it doesn't take much code to provide room for bugs to lurk.

The reported problem was that if you created one thread inside another, cSuneido would crash. It seemed to happen quite consistently and predictably. That was from the IDE. If you ran the same code without the IDE it worked fine. Or if you played around a bit in the IDE first, it would also work fine.

cSuneido "threads" aren't real threads. They are Windows "fibers" - more like coroutines. They don't actually run concurrently, but they allow cooperative multi-tasking. The big advantage is that since you control when the task switching happens and can do it at "safe" points in the code, you don't have to worry about low level concurrency issues. The downside is that you can't take advantage of multiple CPUs. But this was implemented at a time when no one had multiple CPUs and Moore's Law was still happily improving single CPU performance.

Suneido's C++ fiber code had a std::vector of fibers. It also had a main fiber, separate from the vector. The current fiber was a reference (pointer) to either the main fiber or an element of the vector.

Even from that minimal description you could probably guess the problem. Vector implementations normally grow by allocating a new larger array, copying over the data, and throwing out the smaller old array. So adding an element to a vector can invalidate any references to its contents. So the current fiber reference would be pointing to stale data. (It wouldn't actually be a dangling pointer because cSuneido uses garbage collection.) The reference to stale data could cause an "impossible" situation that would lead to a fatal error. (So the problem had nothing to do with creating one fiber inside another, it was simply that creating two fibers in that sequence happened to be one way to expose the bug.)
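Here is a minimal sketch of that failure mode. The Fiber struct and function names are hypothetical stand-ins, not the actual cSuneido code, and the index-based approach in the second function is just one common way to sidestep the problem, not necessarily the fix that was applied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a fiber record; not the real cSuneido type.
struct Fiber {
    int id;
};

// Appends until the vector has to reallocate, then reports whether the
// address of element 0 changed. Addresses are compared as integers so
// the stale pointer itself is never dereferenced.
bool growth_moves_elements() {
    std::vector<Fiber> fibers;
    fibers.push_back(Fiber{0});
    const auto before = reinterpret_cast<std::uintptr_t>(&fibers[0]);
    const std::size_t old_capacity = fibers.capacity();
    while (fibers.capacity() == old_capacity) {
        fibers.push_back(Fiber{static_cast<int>(fibers.size())});
    }
    const auto after = reinterpret_cast<std::uintptr_t>(&fibers[0]);
    return before != after;
}

// Remembering an index instead of a pointer: the index still names the
// same element after the vector grows (though not after an erase).
int id_via_index_after_growth() {
    std::vector<Fiber> fibers;
    fibers.push_back(Fiber{7});
    const std::size_t current = 0;  // "current fiber" kept as an index
    for (int i = 0; i < 1000; ++i) {
        fibers.push_back(Fiber{100 + i});
    }
    return fibers[current].id;
}
```

With the standard allocator the new array is allocated while the old one is still live, so growth_moves_elements() reports that element 0 has moved, while id_via_index_after_growth() still finds the fiber with id 7.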

The problem was rare because it required a specific sequence of events. First, the vector had to grow. Which is why if you played around first (and expanded the vector) it wouldn't happen. Second, the stale reference had to be used in such a way that it caused a problem. Since the data would normally be identical the stale reference wouldn't matter. And the next fiber switch would update it to a valid value so the stale reference wouldn't hang around.

Actually, I think there was at least one more potential problem scenario. When fibers ended they were removed from the vector. This probably wouldn't cause a reallocation (many implementations never shrink the array) but it would invalidate any references to items after the removed one. You'd either end up with a reference to the wrong item or past the end of the array.
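A sketch of that second scenario, again with a hypothetical Fiber struct and made-up function names. It tracks positions rather than raw pointers so the example stays well defined; a raw pointer to the same slot would see the same shift.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a fiber record; not the real cSuneido type.
struct Fiber {
    int id;
};

// Erasing an earlier element shifts the later ones down by one, so a
// saved position now names a different fiber.
int id_at_saved_position_after_erase() {
    std::vector<Fiber> fibers{Fiber{1}, Fiber{2}, Fiber{3}};
    const std::size_t current = 1;   // position of fiber 2
    fibers.erase(fibers.begin());    // fiber 1 ends and is removed
    return fibers[current].id;       // the slot now holds fiber 3
}

// If the saved position was the last slot, it is past the end afterwards.
bool saved_position_is_past_end_after_erase() {
    std::vector<Fiber> fibers{Fiber{1}, Fiber{2}, Fiber{3}};
    const std::size_t current = 2;   // position of fiber 3, the last one
    fibers.erase(fibers.begin());    // size drops from 3 to 2
    return current >= fibers.size();
}
```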

I'm a little embarrassed to discover such a long standing blatant mistake, and a newbie mistake at that. All the times I've looked at that code and I never picked up on it. Ouch.

But to me the real moral of the story is "don't use unsafe languages". Interestingly, this bug was not a memory management issue since cSuneido (unlike almost all C++ programs) uses garbage collection. It's just a result of C++ allowing unsafe raw pointers/references.

C++ fans would tell you that modern C++ has plenty of high level features that are "safe". But the point is that it still has lots of unsafe features. (And AFAIK there is no way to enforce use of a "safe" subset. And C++ continues to resist "real" garbage collection.) I would much rather work in a language like Java or Go (or others) that just doesn't allow unsafe code of this nature, and eliminates a whole class of problems. Figuring out my high level issues is challenging enough without worrying about unsafe low level issues.

Thursday, January 01, 2015

Go Editors

Up till recently I've been using Sublime Text with GoSublime to write Go code. It works pretty well. Sublime is a good editor and GoSublime integrates with the Go tools fairly well. But coming back to it after being away I found it quite annoying that compile errors are only shown in the output pane, not marked on the source code. And you can't even click on the error to go to that line. I'm not a big fan of using line numbers but with Sublime I was pretty much forced to display line numbers and use them manually. (There's probably some way to get clicking on errors to go to the line but nothing obvious.)

I'm not sure where Sublime is at. Sublime 3 has been in beta for a long time. GoSublime has some activity but doesn't seem to be doing too much either.

So I've been on the lookout for alternatives. And I needed something that was available on both Mac and Windows.

I came across something about GitHub's Atom editor and the go-plus extension. I had some difficulties getting it working on Windows; it was easier on Mac. It has better integration between Go and the editor, showing lines with errors and letting you click on the errors. But it doesn't seem to have much support for things like running tests. I realize that's outside the scope of just an editor, and I can always run the tests outside the editor. But I'd still prefer to have it. (Again, there may be some way to do it, but if so it wasn't obvious.)

Both Eclipse and IntelliJ have facilities for Go but they seem like very heavy weight tools for a "lightweight" language like Go.

The other recommendation I'd seen was LiteIDE. It's somewhere in between a full IDE like Eclipse, and an editor like Atom. It was easier to install than either Sublime or Atom since it's a single package, no add ons to worry about. I haven't used it a lot yet but it seems like it might be a good option. The editor is decent and it doesn't force me to use line numbers. I can run tests. The only weakness I've found so far is that it doesn't support column select or multiple select. I can probably live without that, if need be I can always use another editor for the odd time I need it. And it looks like the Kate editor that LiteIDE uses does support this so I'd guess it might be added at some point.

The project seems quite active. I found a bug where some keyboard shortcuts didn't work when you had multiple windows open. I couldn't find any mention of this problem so I entered a bug for it. Within hours I got a notification of a fix committed. It looked like an easy fix, and I haven't tried to build from source to test it, but it's still impressive that the issue was addressed so quickly.

Monday, December 29, 2014

Just a Minute

I bought a new iMac Retina 5K. (amazing display!) So Shelley gets my previous four year old iMac. (Replacing her even more ancient iMac.) Personally I prefer to set up new machines from scratch rather than migrate potential junk and problems from the old machine. But for Shelley I knew that would be a big hassle so I used Apple's Migration Assistant.

Shelley was out for the afternoon so I started the migration. It's simple to use, you start it up on both machines and indicate which is the source and which is the destination.

The estimated time remaining went up and down, but was around 4 hours. That seemed relatively accurate since after about 4 hours it said it had a minute left. That was good timing since Shelley had just arrived home.

But an hour later it still said it had a minute left. Crap! I started searching on the web and found lots of other people with the same problem. It's been an issue for years, but I didn't find any official response from Apple. For some people it seemed if they left it long enough it would eventually finish. But other people waited e.g. 24 hours and it still didn't finish. I could abort it at any time and leave Shelley with the old computer, but she was ok with waiting overnight.

I could tell it was still doing something because our internal network was slow. In fact, the first clue that it might have finished was that the network suddenly got a lot faster. It ended up taking about another 4 hours for that "last minute". It reminded me of the 90-90 rule in software development that "The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time."

I understand that estimating completion times is difficult, and progress indicators are infamous for stalling at the end. But "a minute" is more than two orders of magnitude different from 4 hours. Surely Apple could do better, maybe obsess over the migration experience as well as the un-boxing experience.

If they really can't improve the time estimation, then give some visibility to the process. For example, show a list of "things" to be copied and check them off. Sticking at one minute remaining looks like it's hung up and I suspect a lot of people cause additional problems because they kill the process and then try to recover from a half copied machine.

Other than this hiccup the migration seems to have been successful. But instead of being the hero for giving Shelley a newer, bigger, faster computer, I ended up being the indirect cause of "breaking" her Microsoft Office. It needed the product key to reactivate it on the new computer and that seems to be long gone. The key would have been on the physical package which probably got thrown out sometime over the years. And worse, Microsoft now wants you to pay a monthly fee to use Office, rather than just a one time purchase. On top of which, they haven't updated Office for Mac since 2011. Sigh. Home tech support can be a thankless job!

PS. With Migration Assistant you have a choice of copying from the old machine, or copying from a Time Machine backup. I chose to copy from the old machine just in case the backup didn't include everything. Some of what I found on the web seems to indicate that copying from a Time Machine backup doesn't have the same problem.

Tuesday, May 27, 2014

Java 8 Performance

I was just looking at some stats on the average time our customers' servers take to run our application test suite.

I noticed on a particular day the times dropped from an average of 240 seconds to an average of 200 seconds, a reduction of about 17%. (These are averages from about 240 customer sites.) The numbers are generally quite stable so I was curious what changed.

I discovered that was the day we updated everyone to the Java 8 JRE so it looks like that's the reason for the improvement. Assuming that's the correct explanation that's a pretty nice upgrade!

It made sense that it was Java related since customers running the older cSuneido did not show any improvement that day.

Note: jSuneido is still compiled for Java 7, so this improvement would just be from the runtime.

Monday, May 19, 2014

Portable Tests

With two implementations of Suneido (the original C++ cSuneido and the newer Java jSuneido) I've ended up with three sets of overlapping tests - in each of the implementations plus in the Suneido standard library. And as I play with implementing Suneido in Go I find myself creating yet another set of tests.

Obviously this is not ideal. Apart from the duplication, each version of the tests has better or worse coverage of different areas depending on where I had issues with the implementation. Ideally, I'd like to run the same complete set of tests everywhere, and if I added a test case it would be included everywhere, not just in one of the versions.

One option would be to use something like Fit or FitNesse. But that would still require writing code (for fixtures and "slim" interfaces) and it would mean accepting a third party dependency which in turn depends on Java.

I figured the simplest thing would be to have the test cases in text files and to write a test runner for each of the versions.

But what format should I use for the test cases? I realized that I could use a format that was easy to read with the Suneido lexical scanner. Any implementation of Suneido has to have this, and it's generally one of the first things I implement. Using the scanner made it easy to handle quoted strings and to ignore comments and whitespace.

I implemented a test runner in Suneido code first, and designed the format to keep the parsing simple. Here is an example:

@add

1, 1, 2 // i.e. assert 1 + 1 == 2

@regex_match

"abc" "b"
"abc", "x", false // this should not match

"foo
bar", 
"^bar" // ^ should match after a newline

An '@' followed by a name precedes a list of test cases for the named test "fixture". Each version has to implement each of the fixtures, but these are simple and I already have equivalent code in the existing tests.

Normally each line of values is a test case. Commas between values are optional, but newlines are ignored after a comma to allow splitting a test case over several lines.
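To make the rules concrete, here is a rough sketch of a parser for this format in C++. It uses a simplified hand-written tokenizer rather than the Suneido lexical scanner, and the Fixture struct and parse_tests function are names invented for the example. It also keeps every value as a plain string, so it does not distinguish a quoted "false" from a bare false, which a real runner would need to do.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct Fixture {
    std::string name;                             // text after '@'
    std::vector<std::vector<std::string>> cases;  // one inner vector per test case
};

std::vector<Fixture> parse_tests(const std::string& src) {
    std::vector<Fixture> fixtures;
    std::vector<std::string> current;  // values of the case being collected
    bool after_comma = false;          // true between a comma and the next value

    auto end_case = [&]() {
        if (!current.empty() && !fixtures.empty()) {
            fixtures.back().cases.push_back(current);
        }
        current.clear();
    };
    auto is_space = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    };

    const std::size_t n = src.size();
    std::size_t i = 0;
    while (i < n) {
        const char c = src[i];
        if (c == '\n') {
            if (!after_comma) end_case();  // newline ends a case unless it follows a comma
            ++i;
        } else if (is_space(c)) {
            ++i;
        } else if (c == '/' && i + 1 < n && src[i + 1] == '/') {
            while (i < n && src[i] != '\n') ++i;  // skip comment, keep the newline
        } else if (c == ',') {
            after_comma = true;
            ++i;
        } else if (c == '@') {
            end_case();
            after_comma = false;
            const std::size_t start = ++i;
            while (i < n && !is_space(src[i])) ++i;
            Fixture f;
            f.name = src.substr(start, i - start);
            fixtures.push_back(f);
        } else if (c == '"') {
            const std::size_t start = ++i;
            while (i < n && src[i] != '"') ++i;  // quoted strings may span lines
            current.push_back(src.substr(start, i - start));
            if (i < n) ++i;                      // skip the closing quote
            after_comma = false;
        } else {
            const std::size_t start = i;
            while (i < n && !is_space(src[i]) && src[i] != ',') ++i;
            current.push_back(src.substr(start, i - start));
            after_comma = false;
        }
    }
    end_case();
    return fixtures;
}
```

A runner would then look up each fixture by name and call it once per case, passing the values as arguments.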

After the Suneido code version it was straightforward to implement a version in Go. Java and C++ should also be simple.

I still want to run these tests as part of the existing automated testing, but that's easy to do by writing a test (e.g. in JUnit) that calls the test runner for the portable tests.

A remaining question is where to put the test files. Expecting them to be in the current directory or a subdirectory is easiest, but then each version will have its own copy and I'd have to keep them in sync. It makes more sense to have a single copy somewhere, but then I need some way to point each version at that central location. One option would be an environment variable but that can be problematic. Instead I decided I'd put a text file at the root of each project that would contain the directory path to the tests. (And if that doesn't fit, each implementation is free to handle this differently.)
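A sketch of what reading that pointer file might look like in C++. The file name "testdir.txt" and both function names are assumptions made for the example; the post doesn't specify what the file is called or how each implementation reads it.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Returns the first line of the text with surrounding whitespace removed,
// or an empty string if the first line is blank.
std::string first_line_trimmed(const std::string& text) {
    const std::string line = text.substr(0, text.find('\n'));
    const char* ws = " \t\r";
    const std::string::size_type first = line.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::string::size_type last = line.find_last_not_of(ws);
    return line.substr(first, last - first + 1);
}

// Reads the test directory from a small text file at the project root.
// "testdir.txt" is a hypothetical name. Returns an empty string if the
// file is missing or empty, so the caller can fall back to a default.
std::string read_test_dir(const std::string& config_path = "testdir.txt") {
    std::ifstream in(config_path);
    std::stringstream buffer;
    buffer << in.rdbuf();
    return first_line_trimmed(buffer.str());
}
```

Keeping the trimming separate from the file access makes it easy to check the parsing without touching the disk.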

My main concern with this was tests with a lot of different cases, where you'd use data driven tests (like regular expressions). In other areas what I probably should have is more of a BDD (Behavior-driven development) style of tests that would form a kind of specification for Suneido. To keep this portable it would make sense to use the JBehave style that separates the specification from the implementation.