Wednesday, August 12, 2020

A New Assert Package for Go


Back when I first started using Go I wrote an assert package similar to Hamcrest. You used it like:

Assert(t).That(x, Equals(y))

That has worked OK, but it has a few weaknesses, which I blame partly on being relatively new to Go when I wrote it.

  • It requires a dot import, which brings all its public names into the current package scope. That's not recommended because it pollutes the namespace and can easily lead to name clashes. (Technically it doesn't "require" a dot import, but it would be very awkward to use otherwise.) The other problem with a dot import is that the code doesn't mention the package, so automatic imports don't work and I end up copying the import from another file.
  • The way it works isn't really amenable to things like IntelliSense in VS Code; you have to already know the names of the matcher functions (like Equals).
  • It's more complex than it really needs to be - matchers return closures, which is clever, but complicates the code and makes it harder to understand.
  • It's not amenable to use outside of tests, so I wrote another tiny verify package, which was similar but different.

I finally got irritated by it enough to write a new version that addresses these issues.

Here's how you use the new package:

assert.That(a || b)
assert.This(x).Is(y)
assert.T(t).This(x).Like(y)
assert.Msg("first time").That(a || b)
assert.T(t).Msg("second").This(fn).Panics("illegal")

If you're doing multiple assertions and you don't want to specify .T(t) each time, you can do:

assert := assert.T(t)

The testing.T is optional (if it's not supplied errors just panic) so you can use asserts in regular code. (And I don't need my redundant verify package.)

Now VS Code IntelliSense works much better. If you type assert. it will suggest appropriate functions like That, This, T, Msg. If you type assert.This(x). it will suggest Is, Isnt, Like, Panics.

One advantage of my old style is that you could write additional matchers in a separate package. But I never actually did this. And it's my code so if I need another matcher, I can just add it to the assert package.

One of the popular Go testing packages, Testify, uses a different style:

assert.Equal(t, x, y, "message")

There's nothing wrong with that style, and in some respects it's simpler. But I've never liked it, partly because it's unclear which is the actual value and which is the expected value.


Tuesday, August 04, 2020

Sorted List

Let's say you want to add a large number (e.g. up to millions) of values to a list, one at a time, in no particular order, and at the end you want to iterate through them in sorted order. There are a number of ways you can accomplish this.

Most languages have some sort of list data structure. In C++ you could use a vector, in Java an ArrayList, in Go a slice. Most of these grow by doubling (or possibly some other factor) in size as required. At the end you can simply sort the list. There are several drawbacks to this approach. One is wasted space: assuming doubling, the list is between 50% and 100% full, so you waste 25% on average. You also allocate roughly double the amount of space you end up with (e.g. to get 8 elements you allocate 1, 2, 4, and then 8). In a garbage collected language this puts extra pressure on the garbage collector. And each time the size doubles you have to copy all the values to the new space.

One way around the space, allocation, and copying overhead is to split the list up into fixed size blocks and grow the list by adding more blocks.

You could use an ordered set data structure (assuming the values are unique) like a Java SortedSet or a C++ set. These are often implemented with something like a red-black tree which is not very efficient space-wise since each value has one or more pointers associated with it.

Log structured merge (LSM) trees are another option. If you just use lists for each level, the space overhead is minimal. Like the simple list, you can split the levels into blocks to reduce space and allocation overhead.

But we don't really need an ordered data structure since we don't need the order till all the values have been added. So an ordered set or LSM tree is doing more work than necessary.

My current implementation in jSuneido evolved from an LSM tree. It uses a list of blocks and at the end it sorts each block individually and does a series of two-way merges to combine them. Deciding what to use for gSuneido, it struck me that it would probably be more efficient to just sort the list as a whole, since the two-way merges merge (compare & copy) each value log2(n) times. It was relatively easy to modify my jSuneido implementation to try that. Curiously, it was slower. That seems a bit odd since the overall sort should do fewer compares and copies. I have to assume that once again modern CPU design is making intuition invalid. I would guess that sorting the blocks individually is fast because each block fits into the memory cache. And merging is fast because it's reading and writing sequentially, which is optimal for cache usage.

You could merge all the blocks at once with an N-way merge, rather than a sequence of two-way merges. This would mean fewer comparisons and copies, but it doubles the space usage during the merge. A nice aspect of a block list and two-way merges is that you can recycle the blocks. As soon as you finish merging a block, you can put it on a free list to reuse. This means the space overhead for two-way merging is only a couple of blocks instead of double the size (which reduces allocation, which in turn reduces GC pressure).

One interesting aspect of this approach is that the sorting and merging can be done incrementally. As soon as you fill up a block you can sort it. And as soon as you have two (or power of two) blocks you can merge them. This doesn't change the amount of work required, it just redistributes it over time. Spreading work more evenly and making it less bursty can be beneficial but it's not a clear win.

But then I realized you could do the sorting and merging concurrently with building the list. Again, it doesn't change the amount of work, it just allows some of it to be done in parallel. There are various ways to do this. I created a worker goroutine and triggered it via a channel. You could just start a new goroutine each time, but that's a bit more overhead. In Java I'd be thinking about a thread pool, but goroutines are lightweight enough that it's probably not necessary.

I ended up writing multiple Go versions so I could benchmark them. The results were pretty much what I expected. Simple slice and sort is fast, but does lots of allocation. The block list does much less allocation but is slightly slower overall. Merging is slightly faster than a complete sort. The block list with concurrent sort/merge was the winner: it had the low memory usage of the block list, and was 10 to 20% faster due to the concurrency.

BenchmarkSimple  30 35164847 ns/op 664934 B/op 23 allocs/op
BenchmarkBlocks  26 43607473 ns/op 131139 B/op  6 allocs/op
BenchmarkMerge   27 37056507 ns/op 197306 B/op 31 allocs/op
BenchmarkConcur  37 28624720 ns/op 197768 B/op 34 allocs/op


This was with a block size of 4096 and adding 16,384 items.

The initial code is on GitHub. (This points to the original commit so the link doesn't break; in the future the latest version is probably a better bet.)