Tuesday, January 07, 2020

Recurring Nightmares

gSuneido has been coming along nicely. It's solid enough that it has replaced cSuneido for my personal use. Of course, there are still issues but I'm gradually cleaning them up.

cSuneido has a Windows COM/OLE interface that we primarily use for the Windows browser control. I used go-ole to implement this in gSuneido. I wasn't completely happy with taking a dependency on a small project that wasn't very active, but the code was fairly straightforward and I knew I could write my own if I needed to.

go-ole worked well and I thought that area was complete. But when I got other people testing gSuneido, more issues came up. Ctrl+C (copy) in a browser window crashed. Some links crashed. Passing an empty string caused a later crash.

I spent several days trying to debug the problems. The crashes were access violations inside Windows, which meant I didn't have much insight into what was happening. Windows is a very large black box.

Working on a previous issue, I had wondered whether it would be better to isolate the COM/OLE interface the same way I had isolated the DLL interface (see gSuneido Roller Coaster). I'd actually started implementing this but abandoned it when I solved the bug another way.

These COM/OLE bugs were somewhat different from the DLL issues - more reproducible and less random. But they also had a lot of similarities. I suspected there were some threading issues but I don't understand COM/OLE threading well enough to be sure.

Searching the web, I found more reports of problems interfacing between Go and Windows, often ending in crashes. Go's threading model, garbage collection, and stack movement do not play well with Windows. As I've come to know, it's a dangerous area to work in. Note: This is not really a flaw with Go itself. If you stick to straight Go you won't run into these problems. It's only a problem when you have "unsafe" code calling external Windows code. Unfortunately, that's what I'm doing.

I decided to finish isolating the COM/OLE interface. It was a bit of a gamble since I had no idea if it would solve the problems. But it didn't seem like it would take too long and I didn't have a lot of other ideas.

It only took me a day or two to replace go-ole with my own interface. But ... it didn't work. I kept getting DISP_E_TYPEMISMATCH and CTL_E_INVALIDPROPERTYVALUE. I had basically ported my cSuneido implementation, which I knew worked. cSuneido is 32-bit whereas gSuneido is 64-bit, but that didn't seem to be the problem. I also had the go-ole code as an example. I spent a whole day trying to figure out why it didn't work. I added logging to all three implementations and the output looked identical.

Finally I realized that the IDispatch Invoke arguments need to be passed in reverse order (the last argument goes first in the array). The way I'd added the logging had hidden this, and I'd only been looking at the values of the arguments. A day's worth of time wasted!
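The ordering rule is easy to show in isolation. Below is a minimal sketch, not the actual gSuneido code: variant is a hypothetical simplified stand-in for a COM VARIANT, and toRgvarg is a hypothetical helper that puts the arguments into the last-first order that the rgvarg array of DISPPARAMS expects.

```go
package main

import "fmt"

// variant is a hypothetical, simplified stand-in for a COM VARIANT.
// A real VARIANT is a tagged union; here it is just a type tag and
// an integer value.
type variant struct {
	vt  uint16 // type tag, e.g. 3 for VT_I4
	val int64
}

// toRgvarg returns a copy of args in the order IDispatch::Invoke
// expects in DISPPARAMS.rgvarg: the last argument first.
func toRgvarg(args []variant) []variant {
	out := make([]variant, len(args))
	for i, a := range args {
		out[len(args)-1-i] = a
	}
	return out
}

func main() {
	// Arguments in the order the caller wrote them: 1, 2, 3.
	args := []variant{{vt: 3, val: 1}, {vt: 3, val: 2}, {vt: 3, val: 3}}
	for _, v := range toRgvarg(args) {
		fmt.Println(v.val) // prints 3, then 2, then 1
	}
}
```

Logging only the set of values, rather than their positions in the array, would make the correct and incorrect orderings look the same, which is how this kind of bug can hide.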

Once I made that trivial but crucial fix, it all started working.

And even better, the original weird bugs and crashes were gone :-)

So my hunch was right and my gamble paid off (this time). And as a side bonus, I no longer have a dependency on an external library.