Monday, May 05, 2008

Old Lessons

One of the common suggestions for testing is to test around limits and boundaries, e.g. try -1, 0, and 1. I was painfully reminded of this by a recent debugging session.

Some background. Originally Suneido memory mapped the entire database file into memory. That was simple and fast, but it limited the size of the database to less than 2 gb (the user-mode address space a Win32 process gets by default).

When our customers' databases started to approach this size, I had to change Suneido to memory map a limited number of segments of the database file and re-use address space in an LRU (least recently used) fashion.
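
The general idea looks something like this sketch (made-up names, segment size, and limit, not the actual Suneido code), with map_segment and unmap_segment standing in for the Win32 calls sketched further down:

    #include <stdint.h>
    #include <list>
    #include <map>
    #include <utility>

    const int64_t SEGMENT_SIZE = 4 * 1024 * 1024; // hypothetical 4 mb segments
    const unsigned MAX_MAPPED = 256;              // address space budget

    void* map_segment(int64_t segment);  // wraps MapViewOfFile (sketched later)
    void unmap_segment(void* base);      // wraps UnmapViewOfFile

    class SegmentCache
    {
    public:
        // Translate a file offset into a pointer, mapping the containing
        // segment on demand and unmapping the least recently used segment
        // when the address space budget is used up. The returned pointer is
        // only valid until that segment gets evicted - hence the dangling
        // pointer worries.
        void* address_of(int64_t offset)
        {
            int64_t segment = offset / SEGMENT_SIZE;
            std::map<int64_t, Entry>::iterator it = mapped.find(segment);
            if (it == mapped.end())
            {
                if (mapped.size() >= MAX_MAPPED)
                    evict();
                Entry e;
                e.base = map_segment(segment);
                lru.push_front(segment);
                e.pos = lru.begin();
                it = mapped.insert(std::make_pair(segment, e)).first;
            }
            else // already mapped: move it to the front of the LRU list
            {
                lru.erase(it->second.pos);
                lru.push_front(segment);
                it->second.pos = lru.begin();
            }
            return (char*) it->second.base + offset % SEGMENT_SIZE;
        }
    private:
        void evict() // unmap the segment that was used longest ago
        {
            int64_t victim = lru.back();
            lru.pop_back();
            unmap_segment(mapped[victim].base);
            mapped.erase(victim);
        }
        struct Entry
        {
            void* base;                        // where the segment is mapped
            std::list<int64_t>::iterator pos;  // its spot in the LRU list
        };
        std::list<int64_t> lru;                // front = most recently used
        std::map<int64_t, Entry> mapped;
    };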

Considering how major this change was, it went amazingly well. I was quite nervous about weird bugs from things like dangling pointers caused by re-mapping. So I did quite a lot of testing before we rolled it out. And we kept a close eye on it at first.

This was quite a while ago and I figured everything was good. Until our client with the largest database started having problems. Their database was right around 4 gb. So I did some testing and, sure enough, when the database reached 4 gb, MapViewOfFile would fail with "Access Denied". The "Access Denied" part threw me off, since it sounded like some kind of security/permissions issue. I dug through MSDN trying to find some mention of a 4 gb limit but couldn't find anything.

As with so many of these things, the error turned out to be really stupid. Memory mapping a file with Win32 is a two-step process: first CreateFileMapping, then MapViewOfFile. Since it was MapViewOfFile that was failing, I hadn't paid much attention to the CreateFileMapping step.

Both API calls take 64 bit values as two 32 bit halves (a maximum mapping size for CreateFileMapping, a file offset for MapViewOfFile). I was only passing the low 32 bits to CreateFileMapping. Stupid! Of course, it worked up to 4 gb. But past 4 gb CreateFileMapping wasn't getting the right value, so the file mapping it created was too small. I was passing both halves of the offset to MapViewOfFile, so it was asking for the correct offset, but failing because that offset was beyond the end of the truncated mapping. (A little confusing, because the call with the wrong arguments was succeeding and the call with the right arguments was failing.)
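
For reference, here's roughly what the corrected code looks like - a simplified sketch with made-up names, not the actual Suneido source. It assumes the file was opened for read/write and that the offset is a multiple of the 64 kb allocation granularity.

    #include <windows.h>
    #include <stdint.h>

    void* map_chunk(HANDLE hfile, int64_t offset, DWORD chunk_size)
    {
        // the mapping object has to be big enough to cover offset + chunk_size;
        // passing only the low half of this was exactly the bug described above
        int64_t max_size = offset + chunk_size;
        HANDLE hmap = CreateFileMapping(hfile, NULL, PAGE_READWRITE,
            (DWORD) (max_size >> 32),        // high 32 bits - the part I dropped
            (DWORD) (max_size & 0xffffffff), // low 32 bits
            NULL);
        if (hmap == NULL)
            return NULL;
        // the view's file offset is also passed as two 32 bit halves
        void* p = MapViewOfFile(hmap, FILE_MAP_ALL_ACCESS,
            (DWORD) (offset >> 32),
            (DWORD) (offset & 0xffffffff),
            chunk_size);
        CloseHandle(hmap); // the view (if any) keeps the mapping object alive
        return p;
    }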

Why hadn't I caught this with my testing? The old database size limit was under 2 gb, so I had tested with over 2 gb. The bigger the database, the slower the testing, so I never bothered going as big as 4 gb. Oops.

I should have realized that the 4 gb limit of a 32 bit value was another obvious failure point and tested up to and past it.

After I fixed this bug I was able to grow a database past 4 gb. For a minute I thought I was done. But a little more testing revealed that it wasn't accessing the data past 4 gb properly.

This was a little tougher to track down. I got fed up with waiting so long to create big databases and came up with a way to quickly expand the temporary databases used by the tests to over 4 gb. (I could have saved considerable time in the long run if I'd done this at the start!)

The problem turned out to be a place in the code where I mistakenly stored a 64 bit file offset in a 32 bit integer. Again, this worked up to 4 gb but not beyond.
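
Here's a small demonstration of why this one, too, only showed up past 4 gb: offsets below 2^32 survive being squeezed into 32 bits, bigger ones silently lose their high bits. (The names here are made up for illustration.)

    #include <stdint.h>
    #include <iostream>

    int main()
    {
        int64_t gb = 1024LL * 1024 * 1024;
        int64_t offsets[] = { 3 * gb, 5 * gb };
        for (int i = 0; i < 2; ++i)
        {
            uint32_t stored = (uint32_t) offsets[i]; // the mistake: a 32 bit variable
            int64_t back = stored;                   // what the rest of the code sees
            std::cout << (int) (offsets[i] / gb) << " gb -> "
                      << (int) (back / gb) << " gb\n";
        }
        return 0; // prints "3 gb -> 3 gb" then "5 gb -> 1 gb"
    }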

After this fix all the tests ran successfully.

Next time maybe I'll remember to test around obvious limits and boundaries.

PS. Not that it's any excuse, but it seems odd that the compiler wouldn't give a warning about assigning a 64 bit value to a 32 bit variable. I'm using -Wall with GCC and the only warning I have turned off is -Wno-sign-compare. Maybe there are warnings I need to turn on that aren't included in -Wall. I should probably look into this.
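
My best guess so far is -Wconversion, which isn't included in -Wall and which, at least in newer versions of GCC, is supposed to warn about implicit conversions that may change a value. A throwaway test case (call it truncate.cpp) should show the difference:

    // g++ -Wall -c truncate.cpp                no warning, as far as I can tell
    // g++ -Wall -Wconversion -c truncate.cpp   should warn that the conversion
    //                                          may alter the value
    #include <stdint.h>

    int32_t shrink(int64_t offset)
    {
        return offset; // the kind of silent 64 to 32 bit truncation that bit me
    }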

PPS. Along the way I got an excellent demonstration of the effects of locality on multi-level storage. If I created a big database by doing: for each table, add 10000 records, it was reasonably fast. If I turned it around to: 10000 times, for each table, add 1 record, it was about 10 times slower. Dealing with one table at a time kept all the data and index nodes for that table together so they could all be mapped into memory. Looping over the tables and adding one record to each mixed the data and index nodes for all the tables together over too large an area to keep mapped all at once, causing continuous un-mapping and re-mapping.
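
In code terms, the difference is just the order of two loops. (add_record here is a hypothetical stand-in, not Suneido's actual API.)

    #include <string>
    #include <vector>

    void add_record(const std::string& table, int n); // insert one record

    // reasonably fast: each table's data and index nodes stay clustered,
    // so they can all stay mapped while that table is being loaded
    void load_table_at_a_time(const std::vector<std::string>& tables)
    {
        for (size_t t = 0; t < tables.size(); ++t)
            for (int i = 0; i < 10000; ++i)
                add_record(tables[t], i);
    }

    // about 10 times slower: every table's nodes end up interleaved across
    // the file, so the working set is too big to stay mapped and segments
    // get unmapped and re-mapped continuously
    void load_round_robin(const std::vector<std::string>& tables)
    {
        for (int i = 0; i < 10000; ++i)
            for (size_t t = 0; t < tables.size(); ++t)
                add_record(tables[t], i);
    }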
