Thursday, October 03, 2024

String Allocation Optimization

This post follows on from the previous one. (Go Interface Allocation) After realizing the allocation from assigning strings to interfaces (common in gSuneido) I started thinking about how this could be improved. I had a few ideas:

  • predefine all 256 one byte strings (common when looping over a string by character)
  • implement a "small string" type (i.e. a one byte size and 15 byte array in 16 bytes, the same space as a Go string pointer + size)
  • intern certain strings with the new Go unique package (unique.Make returns a Handle which is a pointer so it could be stored in an interface without allocation)

Unfortunately, the last two require introducing new string types. That's feasible (gSuneido already has an additional SuConcat string type to optimize concatenation) but it would take some work.

Luckily the first idea was easy and didn't require a new string type.

func SuStr1(s string) Value {
    if len(s) == 1 {
        return SuStr1s[s[0]]
    }
    return SuStr(s)
}

var SuStr1s = [256]Value{
    SuStr("\x00"), SuStr("\x01"), SuStr("\x02"), SuStr("\x03"),
    ...
    SuStr("\xfc"), SuStr("\xfd"), SuStr("\xfe"), SuStr("\xff"),
}

It didn't make much difference overall but it was easy to see the improvements on specific types of code. For a simple change I think it was worth it. (Thanks to "AI" for auto-completing most of that 256 element array.)

See the commit on GitHub

Wednesday, October 02, 2024

Go Interface Allocation

I was running some benchmarks recently and I noticed the memory allocation was higher than I expected. After some investigation, I found it was an issue I was aware of, but that was easily overlooked.

The simplest Go "interface" is "any" aka "interface { }". This is a bit like void* in C or C++, except it has a runtime type. An interface is a fat pointer consisting of a pointer plus the runtime type (another pointer). You can store a pointer in an interface without any allocation. But if you store a value that isn't a pointer e.g. a struct, then it has to be on the heap, which usually means allocation.

The catch is that in Go there are quite a few fat pointers. It is a distinctive feature of Go. For example, a string value consists of a pointer and a length, and a slice consists of a pointer, a length, and a capacity. So even though mentally you might think of strings and slices as pointers, they don't fit in an interface. So storing a string in an interface allocates a copy of the string value (16 bytes) which in turn points to the actual data. (Also not ideal for locality.)

For example:

Even though this benchmark doesn't create any new values, it still allocates. If the type of X is changed from any to string, then there is no allocation.

There's not much you can do about this. In some cases you can use generics to avoid interfaces. But interfaces are a basic feature of Go, it's hard to avoid them totally. My point is just that if you're trying to write high performance code that minimizes allocation, watch out for interfaces.

See also: Go Interfaces and Immediate Integers