Sunday, September 07, 2025

A Priority Queue with sync.Cond

The good news is Claude (Sonnet 4) found a bug for me.

The bad news is that Claude wrote the code that had the bug.

It's not all Claude's fault though. Garbage in, garbage out. When I searched on the internet for an example, I found a post with the same bug. I left a comment explaining the bug, hopefully it'll get corrected.

The history was somewhat amusing as well: (paraphrased)

Me: we could write a priority queue

Claude: that would be complex

Me: but we could encapsulate the complexity in the queue package

Claude: good idea

Me: write a priority queue (followed by the requirements)

Claude: no problem, here you go

The code for a concurrent producer-consumer queue is not that complicated, even with specific priority and ordering constraints.

I also got Claude to write tests and benchmarks and everything seemed to work well. The benchmarks showed it had comparable performance to a Go channel. I changed my code to use the new queue (replacing a simple Go channel) and it worked fine. It even appeared to show the hoped for performance improvements. I got Claude to write more benchmarks to measure performance better. But the benchmarks would hang sometimes. I tried to get Claude to fix it but it just thrashed around rewriting the benchmark different ways. I suspected a deadlock in the priority queue so I wrote a better test for that. It appeared to work fine, until I increased the number of threads to 16 or 32, then it would hang consistently with a deadlock. I gave the simple failing test to Claude and it immediately spotted the bug.

The "obvious" approach is to use a single condition variable (sync.Cond) for the queue. Put and Get wait on the Cond and then signal it. It seems straightforward, and under low concurrency it appears to work. But under high concurrency, it can deadlock. The problem is that Signal only wakes up one goroutine. If the queue is full and it wakes up a producer, or the queue is empty and it wakes up a consumer, it will deadlock.
One solution is to use Broadcast to wake up all waiters but that can be inefficient with many goroutines. The better solution is to use two sync.Cond, one for not-full and one for not-empty. Put waits on not-full and signals not-empty. Get waits on not-empty and signals not-full. I should have read the Wikipedia article instead of trusting AI generated code.

Go's sync.Cond has a bit of a funny story in itself. There is no example for it in the Go documentation. When someone suggested adding an example they were told that sync.Cond is tricky and therefore shouldn't be used and they didn't want to add an example because that would "encourage" people to use it. So it becomes a self fulfilling prophecy that people misuse it. At one point there was even a movement to remove sync.Cond from Go. It seems a little odd to me, since condition variables are a standard concurrency concept. It seems to me a few examples would prevent almost all the misuse.

Here's the priority queue code if you're interested.

No comments: