Tuesday, February 18, 2025

Go, Cgo, and Syscall

I recently overhauled the Windows interface for gSuneido. The Go documentation for this is sparse and scattered and sometimes hard to interpret, so here are some notes on my current understanding of how Cgo and Syscall should be used.

The big issue is interfacing between typed, garbage-collected Go and external code. Go doesn't "see" values in external code. If values are garbage collected by Go while still in use externally, it can cause corruption and crashes. This is hard to debug because it happens randomly, often rarely, and is hard to reproduce.

The other issue with Go is that goroutine stacks can grow. When a stack grows it is copied to a new location, which means the addresses of values on the stack change. Unlike C or C++, in Go you don't control which values are on the stack and which are on the heap. This leads to the need to "pin" values, so they don't move while being referenced by external code. (There is also the potential that the garbage collector might move values in a future version.)

Go pointers passed as function arguments to C functions have the memory they point to implicitly pinned for the duration of the call. link

Note the "Go pointers" part and remember that uintptr is not a pointer. If you want to pass "untyped" pointer values to C functions, you can use void* which has the nice feature that unsafe.Pointer converts automatically to void* without requiring an explicit cast.

SyscallN takes its arguments as uintptr. This is problematic because uintptr is just a pointer-sized integer. It does not protect the value from being garbage collected. So SyscallN has special behavior built into the compiler. If you convert a pointer to uintptr in the argument to SyscallN, the compiler keeps the pointer alive during the call. It is unclear to me whether this applies to Cgo calls. Do they qualify as "implemented in assembly"? Strangely, SyscallN has no documentation, and the one example uses the deprecated Syscall instead of SyscallN. You won't even find SyscallN unless you add "?GOOS=windows" to the URL.

If a pointer argument must be converted to uintptr for use as an argument, that conversion must appear in the call expression itself. The compiler handles a Pointer converted to a uintptr in the argument list of a call to a function implemented in assembly by arranging that the referenced allocated object, if any, is retained and not moved until the call completes, even though from the types alone it would appear that the object is no longer needed during the call. link

The examples for this generally show both the uintptr conversion and the unsafe.Pointer conversion in the argument, for example:

syscall.SyscallN(address, uintptr(unsafe.Pointer(&foo)))

My assumption (?) is that the key part is the uintptr cast and that it's ok to do the unsafe.Pointer outside of the argument, for example:

p := unsafe.Pointer(&foo)
syscall.SyscallN(address, uintptr(p))

What you do NOT want to do, is:

p := uintptr(unsafe.Pointer(&foo))
// WRONG: foo is not referenced or pinned at this point
syscall.SyscallN(address, p)

To prevent something from being garbage collected prematurely you can use runtime.KeepAlive which is simply a way to add an explicit reference to a value, to prevent it from being garbage collected before that point in the code. KeepAlive is mostly described as a way to prevent finalizers from running, but it also affects garbage collection.

KeepAlive marks its argument as currently reachable. This ensures that the object is not freed, and its finalizer is not run, before the point in the program where KeepAlive is called. link

However, KeepAlive does not prevent stack values from moving. For that you can use runtime.Pinner:

Pin pins a Go object, preventing it from being moved or freed by the garbage collector until the Pinner.Unpin method has been called. A pointer to a pinned object can be directly stored in C memory or can be contained in Go memory passed to C functions. If the pinned object itself contains pointers to Go objects, these objects must be pinned separately if they are going to be accessed from C code. link
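
A minimal sketch of how this might look, assuming Go 1.21 or later (where runtime.Pinner was added). The message type and the withPinned helper are hypothetical, and the call parameter is an ordinary Go function standing in for the external code. The point is only the shape: pin the outer object, pin the object it points to separately, and unpin when the call is done.

```go
package main

import (
	"fmt"
	"runtime"
)

// message is a hypothetical struct to hand to external code.
// It contains a Go pointer (data), which must be pinned separately.
type message struct {
	data *byte
	size int
}

// withPinned builds a message for payload, pins the message and the buffer
// it points to, runs call, and unpins everything when call returns.
func withPinned(payload []byte, call func(*message) int) int {
	var pinner runtime.Pinner
	defer pinner.Unpin() // releases every pin made through this Pinner

	msg := &message{size: len(payload)}
	if len(payload) > 0 {
		msg.data = &payload[0]
		pinner.Pin(msg.data) // the nested pointer needs its own pin
	}
	pinner.Pin(msg)
	return call(msg)
}

func main() {
	// The function literal stands in for a call into external code.
	n := withPinned([]byte("hello"), func(m *message) int { return m.size })
	fmt.Println(n) // 5
}
```

Deferring Unpin right after declaring the Pinner keeps the pin and unpin together, so an early return can't leave objects pinned.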

Passing Go strings to C functions is awkward. Cgo has C.CString, but it allocates with malloc, so you have to make sure you free the result. Another option is:

buf := make([]byte, len(s)+1) // +1 for nul terminator
copy(buf, s)
fn((*C.char)(unsafe.Pointer(&buf[0])))

If you don't need to add a nul terminator, you can use:

buf := []byte(s)
fn((*C.char)(unsafe.Pointer(&buf[0])))

Or if you are sure the external function doesn't modify the string, and you don't need a nul terminator, you can live dangerously and pass a direct pointer to the Go string with:

fn((*C.char)(unsafe.Pointer(unsafe.StringData(s))))

One thing to watch out for is that &buf[0] panics at run time if len(buf) == 0. If it's possible for the length to be zero, you can use unsafe.SliceData(buf) instead.
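
The buffer handling can be wrapped in small helpers. This is only a sketch in pure Go, assuming Go 1.20 or later for unsafe.SliceData. The names nulTerminated and dataPointer are made up for this example, and the actual call into C is left out.

```go
package main

import (
	"fmt"
	"unsafe"
)

// nulTerminated returns a copy of s with a trailing nul byte.
func nulTerminated(s string) []byte {
	buf := make([]byte, len(s)+1) // the extra byte is already zero
	copy(buf, s)
	return buf
}

// dataPointer returns a pointer to the start of buf's backing array.
// Unlike &buf[0], it does not panic when len(buf) == 0.
// For a nil slice it returns nil.
func dataPointer(buf []byte) unsafe.Pointer {
	return unsafe.Pointer(unsafe.SliceData(buf))
}

func main() {
	buf := nulTerminated("hi")
	fmt.Println(len(buf), buf[len(buf)-1])                   // 3 0
	fmt.Println(dataPointer(buf) == unsafe.Pointer(&buf[0])) // true
	fmt.Println(dataPointer(nil) == nil)                     // true
}
```

With Cgo, the result of dataPointer would still need converting to the parameter type, for example (*C.char)(dataPointer(buf)).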

There is a certain amount of run time checking for some of this.

The checking is controlled by the cgocheck setting of the GODEBUG environment variable. The default setting is GODEBUG=cgocheck=1, which implements reasonably cheap dynamic checks. These checks may be disabled entirely using GODEBUG=cgocheck=0. Complete checking of pointer handling, at some cost in run time, is available by setting GOEXPERIMENT=cgocheck2 at build time. link

Sorry for the somewhat disorganized post. Hopefully if someone stumbles on this it might help. Or maybe it'll just be gobbled up and regurgitated by AI.

See also:
Cgo documentation
Go Wiki: cgo
Addressing CGO pains, one at a time