<h1>The Software Life</h1>
<p>Thoughts and experiences from the world of commercial and open source software. By Andrew McKinlay.</p>

<h2>Fuzzing - the gift that keeps on giving (2024-01-12)</h2>
<p>It's been almost a year since I started <a href="https://thesoftwarelife.blogspot.com/2023/02/fuzz-testing-database-queries.html" target="_blank">fuzz testing Suneido database queries</a> and it's still finding bugs. I assumed it would find bugs at first and then peter out. But we set the fuzz tests up to run for an hour every night to catch any regressions, and that has turned out to be very worthwhile. Of course, I'm sure I've fixed the last bug now :-)</p>
<p>I have a bunch of Go tests for the query code, and they have decent coverage. But they barely scratch the surface in terms of combinations of query operations. Passing these tests gives me little confidence that changes are correct.</p>
<p>We also have lots of application tests that do a lot of queries. But all the queries are written by humans and follow similar styles and approaches. The bugs found by fuzzing were not detected by the application tests. That's probably partly because I used those application tests to initially develop and debug the gSuneido query implementation. So those types of queries have been well tested and debugged.</p>
<p>It might seem obvious, but one thing the fuzz testing has hammered home is that making something work for "normal" cases can be quite different from making it work for every bizarre unreasonable case. I look at the failing queries from fuzzing and I think "no one would ever write a query like that". And yet, they are bugs nonetheless.</p>
<p>In most cases, when the nightly fuzzing found an error (a difference between the results from gSuneido and jSuneido) I could easily reproduce it. But occasionally it depended on the randomly generated data (e.g. whether a certain value existed or not), which could make it hard to reproduce the error to debug it. We tried saving the data from each run, but that was awkward. We ended up saving the seed used to generate the random data. We could include that with the results and use it to recreate the data.</p>
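<p>A minimal sketch of the seed idea: record the seed alongside the night's results, and the identical "random" data can be rebuilt later to reproduce a data-dependent failure. The makeData function here is a made-up stand-in for the real data generator.</p>
<pre>package fuzz

import "math/rand"

// makeData stands in for the real data generator.
func makeData(rng *rand.Rand) []int {
	data := make([]int, 10)
	for i := range data {
		data[i] = rng.Intn(1000)
	}
	return data
}

func nightlyRun() (seed int64, data []int) {
	seed = rand.Int63() // save this with the results
	return seed, makeData(rand.New(rand.NewSource(seed)))
}

func reproduce(seed int64) []int {
	return makeData(rand.New(rand.NewSource(seed))) // identical data
}
</pre>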
<p>The data dependency made me realize that we weren't testing many sets of data, just one per night. Rather than generate more sets of data (slow relative to running queries) I changed the fuzzing to try several variations of the constant values in each query. That had a similar effect to trying different sets of data.</p>
<p>It wasn't entirely surprising, but it was interesting that the fuzzing didn't find any errors in the jSuneido query implementation. The fuzzing only detects differences between the implementations; it doesn't tell me which one is right and which is wrong. Several times I convinced myself that it was jSuneido that was wrong, but it always turned out to be gSuneido. The gSuneido query implementation started out very similar to jSuneido's, but over the years it got refactored and improved and optimized. gSuneido's query optimization is now quite a bit more sophisticated, which means more complex, which unfortunately means more bugs. jSuneido has also been around longer, so maybe that helped work out the bugs. It's also much more stable - I've barely touched the code for several years, whereas the gSuneido code has been changing constantly.</p>
<p>Another aspect that was surprising was how "rare" some of these failing queries were. We are running about a million queries per night. I would have thought that would explore the possibilities fairly well. But some of the bugs were only caught once after weeks of running. In hindsight that shouldn't be surprising - there are an infinite number of possible queries, and when you add in the data dependencies, the failing cases are even more rare.</p>
<p>Like other fuzz testing systems, I accumulate the failing queries in a "corpus" and run them every time along with new random queries. Quite often these are the ones that fail again. They obviously are "tricky" queries as far as my query implementation goes.</p>
<p>One concern I had was whether my ad hoc random query generation would generate "all" the possible queries. There could be bugs that could not be "reached" by the possible range of queries I was generating. Obviously, I don't know if there are bugs it's missing. But judging by the bugs it has found, it seems to have quite good coverage. In theory it should probably be based on the query grammar (the reverse of parsing). I ended up using a random sequence of additions and substitutions. One of the challenges is to ensure that the query is still valid e.g. that the columns referenced actually exist in the sub-query.</p>
<p>When I get a failure, the first thing I try to do is simplify the query. Most of the time there are parts of the query that can be removed and it will still fail. It's a lot easier to debug if e.g. there is only one join in the query. This is a fairly mechanical process and could probably be automated. (More easily from the parse tree than from the query text.) But it only takes a few minutes to do manually so it's not worth the effort to automate. (As long as there aren't too many bugs!)</p>
<p>One issue with my current approach is that it depends on using jSuneido for comparison. But we have replaced jSuneido with gSuneido and we're not using it any more. It's also awkward to have to run two processes and compare the results. I decided to write a "simplest possible query implementation" within gSuneido. I procrastinated on this for a while because I thought it would be a lot of work. But it only took me a few hours. I have to remember that the complexity is not inherent in the operations, it's almost all from the optimization. It still uses the same parser, but that seems ok because that's not where the problems are. It doesn't use any of the normal query optimization or execution. And in a way it's better than using jSuneido since it's not at all the same approach. And it's easier to be sure the simple implementation is correct. The alternate implementation is quite slow and isn't suitable for large results but it's sufficient for the fuzzing. And to get around the slower speed, with it all being within one gSuneido process, it's easy to run multiple threads.</p>
<h2>Poisoning Workers (2023-09-22)</h2>
<p>As in worker threads, not people.</p>
<p>gSuneido uses an unbounded thread pool to handle client requests to the server. Most Go thread pools are intended to limit the number of goroutines, e.g. to match the number of hardware threads. But in this case I want all the requests to execute in parallel, even if that isn't the most efficient, so that every request makes progress. Normally in Go you don't need an unbounded thread pool; goroutines are usually cheap enough that you can just start one per request. However, if the goroutine needs to allocate memory, either for the stack or for other things, then creating new ones every time becomes more expensive. One option is to use a sync.Pool to avoid the allocation. But that doesn't solve the cost of stack growth (e.g. <a href="https://github.com/golang/go/issues/18138" target="_blank">issue 18138</a>).</p>
<p>In the old version of my code, each worker was responsible for terminating itself if it was idle. This was implemented with a timer that was reset by each request. </p>
<script src="https://gist.github.com/apmckinlay/1d4eb7b5489cdff0e0c5a00a9eee9dee.js"></script>
<p>This seemed straightforward. I tested it and it worked fine. But then recently I added some metrics, including the number of workers, and I noticed the number of workers rarely went down. Definitely not dependably after the 5 second timeout. I puzzled over this for a while before I realized what was happening. If multiple worker goroutines are waiting to receive from the same channel, it's unpredictable which worker will receive the message. This doesn't seem to be defined in the language spec so I'm not sure if it's random (as the select statement is defined to be) or if the receivers are queued and processed first-in-first-out. It doesn't really matter for my issue. The bottom line is that tasks end up distributed over workers. So even with a low level of activity, workers tend to get at least one task per 5 seconds, and therefore they don't terminate.</p>
<p>I could dream up lots of complicated approaches but it seemed like there should be a simple solution. I could probably get away with just not ever killing workers. But then if a spike in load creates a bunch of workers, they will hang around forever, which isn't the best behavior. Searching the web, I found a lot of introductory material about goroutines and channels, and a lot of bounded thread pools, but not much on unbounded ones. I found <a href="https://github.com/lnquy/elastic-worker-pool" target="_blank">lnquy/elastic-worker-pool</a>, but that seemed more complicated than I needed.</p>
<p>Eventually I realized I could make the pool a "leaky bucket" i.e. just periodically kill a worker. That would ensure the pool would eventually shrink. This is very simple and doesn't require anything complex like trying to measure the load. The downside is that even if you're in a steady state with the correct number of workers, it's still going to kill workers and they'll end up being recreated. But a worker pool is really just a cache and a cache doesn't need to have 100% hit rate to be effective. I made a slight improvement by preventing killing too soon after creating a new worker goroutine.</p>
<p>I implemented this by having a killer goroutine that sends a "poison pill" to the same channel as the tasks. One of the workers would receive this and terminate itself.</p>
<script src="https://gist.github.com/apmckinlay/66cb963f726d171170418d78ef3b965a.js"></script>
<p>The killClock and nworkers are atomic since they are accessed from multiple threads. The workers just needed to recognize the poison pill and terminate. The submit code just needed to reset the killClock to zero whenever it created a new worker goroutine.</p>
<p>For example, if the interval is 1 second and the delay 10 seconds, in a steady state a worker would be killed and then recreated every 10 seconds. Given that creating a worker is almost fast enough to do on every task, that's very minor overhead.</p>
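<p>To make the scheme concrete, here is a rough sketch of the shape of it. This is not the actual gSuneido code; the names, constants, and the idle-worker handling are made up for illustration.</p>
<pre>// Rough sketch of the "leaky bucket" pool described above, not the real code.
package workerpool

import (
	"sync/atomic"
	"time"
)

type task func()

type pool struct {
	tasks     chan task
	nworkers  atomic.Int32
	killClock atomic.Int32 // ticks since a worker was last created
}

func newPool() *pool {
	p := &pool{tasks: make(chan task)}
	go p.killer()
	return p
}

// submit hands a task to an idle worker, or starts a new one.
func (p *pool) submit(t task) {
	select {
	case p.tasks <- t: // an idle worker picked it up
	default: // no idle worker; start a new one
		p.nworkers.Add(1)
		p.killClock.Store(0) // don't kill right after creating a worker
		go p.worker(t)
	}
}

func (p *pool) worker(t task) {
	for {
		t()
		t = <-p.tasks
		if t == nil { // poison pill
			p.nworkers.Add(-1)
			return
		}
	}
}

// killer periodically sends a poison pill so the pool slowly shrinks.
func (p *pool) killer() {
	for range time.Tick(time.Second) {
		if p.killClock.Add(1) >= 10 && p.nworkers.Load() > 0 {
			select {
			case p.tasks <- nil: // one idle worker will exit
			default: // no worker waiting right now; try again next tick
			}
		}
	}
}
</pre>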
<p>Hopefully I haven't overlooked any flaws in this solution. It seems simple and fast.</p>
<p>As usual, <a href="https://github.com/apmckinlay/gsuneido/blob/master/dbms/mux/workers.go" target="_blank">the code is on GitHub</a>.</p>
<h2>A gSuneido Database Optimization (2023-09-10)</h2>
<p>While working on <a href="https://thesoftwarelife.blogspot.com/2023/09/a-gsuneido-database-issue.html" target="_blank">the recent gSuneido database issue</a> I had an idea for optimizing the transaction conflict checking code.</p>
<p>The current code kept a list of outstanding transactions and for each a list of tables and their reads and writes. To check a new read or write meant a linear search through all the transactions for other activity on that table. If the data was organized by table rather than by transaction, it would eliminate the linear search. An action on a "rare" table wouldn't have to look at so much data. Some tables would be common to many transactions, but still normally not all of them. And even in a situation where all the outstanding transactions included a particular table it wouldn't be any slower than before.</p>
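<p>An illustrative sketch of the reorganization, not the actual checker code; the type names are invented and the read/write tracking is simplified to strings.</p>
<pre>package checker

type actions struct {
	reads  []string // keys read (simplified)
	writes []string // keys written
}

// Before: organized by transaction. Checking activity on one table meant a
// linear search through every outstanding transaction.
type byTransaction map[int]map[string]*actions // tran -> table -> actions

// After: organized by table. Checking a read or write only looks at the
// transactions that actually touched that table.
type byTable map[string]map[int]*actions // table -> tran -> actions

// othersOnTable returns the other outstanding transactions that have
// touched the given table.
func othersOnTable(bt byTable, table string, tran int) []int {
	var others []int
	for t := range bt[table] {
		if t != tran {
			others = append(others, t)
		}
	}
	return others
}
</pre>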
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfj9dRKwaYYOuABuiT9L4-Tiqdk_oJWGZqFNIHJGoNqq7JyFZnP9gAPBBIVa7c3Jqwn_zKTP_PRrY_t6z9Lg6uy50oxHawybn4WZqmZbO619GDjJ9Ph0O-l__9a0h1f6wbD6MEgQDKP90ThMLPSPdCIaPS-wM9fj7gtDG-UTAgCYDpdmA6N06wmg/s541/checker%20optimization.png" style="display: block;"><img alt="" border="0" data-original-height="366" data-original-width="541" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfj9dRKwaYYOuABuiT9L4-Tiqdk_oJWGZqFNIHJGoNqq7JyFZnP9gAPBBIVa7c3Jqwn_zKTP_PRrY_t6z9Lg6uy50oxHawybn4WZqmZbO619GDjJ9Ph0O-l__9a0h1f6wbD6MEgQDKP90ThMLPSPdCIaPS-wM9fj7gtDG-UTAgCYDpdmA6N06wmg/s400/checker%20optimization.png" width="541" /></a>
<p>At a high level it was a small change, but I wasn't sure how it would fit with the code. I knew I'd still need to keep a list of transactions and the tables they had accessed. It turned out to be relatively straightforward to modify the code. (A few hours work.) One of the questions was where to embed data and where to use pointers to separately allocated data. Embedding tends to be faster, but embedding in maps or slices can use more memory because empty slots are larger.</p>
<p>Before starting the changes I wrote a Go benchmark so I could see what effect the changes had on performance. As usual, it was hard to know what a "typical" pattern of activity was. For this benchmark the new version was over twice as fast. That was a bigger difference than I'd expected. Sadly, but not surprisingly, the change made no difference to the speed of our application test suite. It doesn't do many concurrent transactions so it wouldn't benefit. However, the stress test I'd used to investigate the "too many outstanding transactions" issue did show big improvements. When I collected the same measurements as before and graphed them, it looked even better. Up to three threads it didn't make much difference but after that it was substantially better. And performance didn't drop off under load like it had before. (Or at least, not until much higher load.)</p>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQejeUqYBRav6SmnytR5pp1QYdl5cGrBvD-RUkO2oXryMHxJF4iQLOEshjav25YPCa6eP7Omgv3u2CYaOSJwfkPY-L_IQrkR-V1LFgnCUubhtK5tbZErPvsWbYu7SesNtsOhVE3XT0UfIOAgr2sqHf-HdavBEd5TrWZYrbpwz1ojWlg8WY0CcBkQ/s600/chart.png" style="display: block;"><img alt="" border="0" data-original-height="371" data-original-width="600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQejeUqYBRav6SmnytR5pp1QYdl5cGrBvD-RUkO2oXryMHxJF4iQLOEshjav25YPCa6eP7Omgv3u2CYaOSJwfkPY-L_IQrkR-V1LFgnCUubhtK5tbZErPvsWbYu7SesNtsOhVE3XT0UfIOAgr2sqHf-HdavBEd5TrWZYrbpwz1ojWlg8WY0CcBkQ/s600/chart.png" width="600" /></a>
<p>The conventional wisdom for optimization would be to profile the code and find the hot spots. That wouldn't have helped in this case. The impact of the data structure was spread over the code. Nor would profiling have given any insight into a better data structure. As I commonly find, what is required is to have good mental models of the problem, the code, the programming language, and the hardware. Then you need to find an approach that is a good fit between all of these. Of course, the hardware performance model is a moving target. These days it's mostly about cache misses and branch prediction.</p>
<p>I'm always happy to improve performance, especially by significant amounts. But in the back of my mind I can't help thinking I should have written it better the first time around. Of course, that doesn't make sense because you can't write the "perfect" version. There's always going to be room for improvement.</p>
<h2>A gSuneido Database Issue (2023-09-02)</h2>
<p>We've been gradually converting our customers' servers from jSuneido to gSuneido. Recently we converted a larger customer and they started to get "too many outstanding transactions" errors. Unfortunately, we'd changed multiple things at once with this customer, so it wasn't immediately clear whether this issue was because of the switch to gSuneido, but I suspected it was.</p>
<p>I had set the limit in gSuneido to the same value as jSuneido - 200 outstanding transactions. "Outstanding" in this case means either still in progress, or completed but overlapping with an uncompleted one.</p>
<p>It seemed a little counterintuitive. gSuneido is faster than jSuneido, so why should it run into this more? My theory is that because it's faster, more transactions get processed, and if a few of them are slow, more of the quickly completed ones will overlap with them.</p>
<p>One of my first thoughts was to look at optimizing the gSuneido code. But if my theory was right, this could actually make the problem worse rather than better.</p>
<p>We temporarily solved the problem by adding slight delays to code that was using a lot of transactions. (i.e. we throttled or rate limited them). This worked but it seemed a little ugly to slow it down all the time, just to avoid a problem that only occurred occasionally.</p>
<p>That led me to add throttling to gSuneido, but only when the number of outstanding transactions got high (e.g. 100). Throttling would prevent hitting the limit (200). I wrote a stress test and this did solve the problem. And it was better than adding delays in the application code. But the problem was that once it throttled, it got really slow. If you throttled enough to prevent hitting the limit, e.g. to 10 transactions per second, then 1000 transactions would go from maybe .1 seconds to 100 seconds - ouch! And because throttling slowed things down so much, it increased the chances of transactions hitting the 20 second time limit. The cure seemed worse than the disease; it won the battle but lost the war. (Sorry for the proverbs.)</p>
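<p>For illustration, the throttling experiment looked roughly like this; the numbers and names are made up and this is not the actual gSuneido code.</p>
<pre>package db

import (
	"errors"
	"sync/atomic"
	"time"
)

var errTooMany = errors.New("too many outstanding transactions")

var outstanding atomic.Int64

const (
	throttleAt = 100 // start delaying new transactions
	hardLimit  = 200 // refuse new transactions entirely
)

func startTransaction() error {
	for outstanding.Load() >= throttleAt {
		if outstanding.Load() >= hardLimit {
			return errTooMany
		}
		time.Sleep(100 * time.Millisecond) // ~10 new transactions per second
	}
	outstanding.Add(1)
	return nil
}
</pre>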
<p>I started to wonder what the right limit was. 200 was an arbitrary value. The reason to have a limit was to prevent it from "falling over" under heavy load. But at what point would that happen? So I removed the limit and tested under different loads. </p>
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4XcQsHP9QNnBRtVRrYUBiTAVKjdLbkRzetksy1P_o-4ZbWFiPAI8wbSoQVzd03mQhwPpFB3Rl-LL5xqVRcEOmQt2abtelajnG9YOnp-MZq2cCqJpSUAfJ5o3NVJfjJwd0zTZafFDmYexw1H1shJRt7by48EpaZDT2dOQCeRBIkXjCEKD3h18P5A/s600/tran_sec%20vs.%20threads.png" style="display: block; padding: 1em 0px;"><img alt="" border="0" data-original-height="371" data-original-width="600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4XcQsHP9QNnBRtVRrYUBiTAVKjdLbkRzetksy1P_o-4ZbWFiPAI8wbSoQVzd03mQhwPpFB3Rl-LL5xqVRcEOmQt2abtelajnG9YOnp-MZq2cCqJpSUAfJ5o3NVJfjJwd0zTZafFDmYexw1H1shJRt7by48EpaZDT2dOQCeRBIkXjCEKD3h18P5A/s400/tran_sec%20vs.%20threads.png" width="400" /></a></div>
<table border="1" cellpadding="4px" style="border-collapse: collapse;">
<thead>
<tr>
<th id="threads" style="text-align: right;">threads</th>
<th id="time" style="text-align: right;">time</th>
<th id="tran/sec" style="text-align: right;">tran/sec</th>
<th id="conflicts" style="text-align: right;">conflicts</th>
<th id="max_trans" style="text-align: right;">max trans</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: right;">1</td>
<td style="text-align: right;">2 sec</td>
<td style="text-align: right;">5000</td>
<td style="text-align: right;">100</td>
<td style="text-align: right;">70</td>
</tr>
<tr>
<td style="text-align: right;">2</td>
<td style="text-align: right;">2 sec</td>
<td style="text-align: right;">10000</td>
<td style="text-align: right;">150</td>
<td style="text-align: right;">110</td>
</tr>
<tr>
<td style="text-align: right;">3</td>
<td style="text-align: right;">3 sec</td>
<td style="text-align: right;">10000</td>
<td style="text-align: right;">200</td>
<td style="text-align: right;">150</td>
</tr>
<tr>
<td style="text-align: right;">4</td>
<td style="text-align: right;">4 sec</td>
<td style="text-align: right;">10000</td>
<td style="text-align: right;">200</td>
<td style="text-align: right;">180</td>
</tr>
<tr>
<td style="text-align: right;">5</td>
<td style="text-align: right;">4 sec</td>
<td style="text-align: right;">12500</td>
<td style="text-align: right;">200</td>
<td style="text-align: right;">200</td>
</tr>
<tr>
<td style="text-align: right;">6</td>
<td style="text-align: right;">5 sec</td>
<td style="text-align: right;">12000</td>
<td style="text-align: right;">300</td>
<td style="text-align: right;">230</td>
</tr>
<tr>
<td style="text-align: right;">8</td>
<td style="text-align: right;">8 sec</td>
<td style="text-align: right;">10000</td>
<td style="text-align: right;">300</td>
<td style="text-align: right;">270</td>
</tr>
<tr>
<td style="text-align: right;">10</td>
<td style="text-align: right;">10 sec</td>
<td style="text-align: right;">10000</td>
<td style="text-align: right;">300</td>
<td style="text-align: right;">300</td>
</tr>
<tr>
<td style="text-align: right;">16</td>
<td style="text-align: right;">20 sec</td>
<td style="text-align: right;">8000</td>
<td style="text-align: right;">400</td>
<td style="text-align: right;">450</td>
</tr>
<tr>
<td style="text-align: right;">20</td>
<td style="text-align: right;">30 sec</td>
<td style="text-align: right;">7000</td>
<td style="text-align: right;">500</td>
<td style="text-align: right;">490</td>
</tr>
<tr>
<td style="text-align: right;">30</td>
<td style="text-align: right;">50 sec</td>
<td style="text-align: right;">7000</td>
<td style="text-align: right;">600</td>
<td style="text-align: right;">580</td>
</tr>
</tbody>
</table>
<p>I was quite happy with the results. Performance did drop off with more than 5 or 6 threads, but not that badly. The maximum outstanding transactions got well above the old limit of 200, but that didn't seem to be a major problem. The number of transactions aborting due to conflicts increased with load, but that's expected with that much concurrent activity. To some extent it seemed to be self-limiting, i.e. the load itself slowed things down and effectively did its own throttling.</p>
<p>Of course, this is completely dependent on what my test is doing. Is it "typical"? No, probably not. For starters, it's doing continuous activity on multiple transactions as fast as it can, with no other work. Real application code is not going to be doing that. Our applications never do 10,000 transactions per second for long periods. The test does result in shorter and longer transactions, but the longest are only around 1 second, whereas in production we do see transactions taking much longer and sometimes hitting the 20 second limit (usually as a result of external factors). And my workstation only has 8 cores / 16 threads, so it's not really 30 threads.<br /></p>
<p>Out of curiosity, I removed the limit from jSuneido to see how it would perform. 10 threads took 15 seconds or about 50% slower. That wasn't bad. But roughly half the transactions (5000) failed due to conflicts. Ouch! That wasn't so good. Even with only one thread, roughly 10% (1000) conflicted. As opposed to 1% (100) with gSuneido. The usual method of dealing with conflicts is to retry, but that would just increase the load and make things worse.</p>
<p>So what is the correct limit for gSuneido? Or should there be no limit? Should it ever throttle, and if so, at what point? Honestly, I still don't know the answers to those questions. But it does appear the limit could be considerably higher e.g. 500 instead of 200, which would likely eliminate the problem, other than rare extreme cases.</p>
<h2>Keyword Recognition (2023-07-20)</h2>
<p>Daniel Lemire's recent blog post <a href="https://lemire.me/blog/2023/07/14/recognizing-string-prefixes-with-simd-instructions/" target="_blank">Recognizing string prefixes with SIMD instructions</a> and one he linked to, <a href="http://0x80.pl/notesen/2023-04-30-lookup-in-strings.html" target="_blank">Modern perfect hashing for strings</a>, prompted me to revisit how I was recognizing keywords in gSuneido. Although I find it interesting how you can use advanced processor instructions, I didn't really want to get into that. But like most things, even without looking at my code, I figured it could probably be improved.</p>
<p>I found that I was recognizing the 28 language keywords with a linear search, sorted by frequency of occurrence. For query keywords I was using a Go map. Both were quite fast, certainly not a bottleneck.</p>
<p>The reason I hadn't used a Go map for language keywords was that I wanted to "intern" the strings. And there's no way to retrieve the key of a Go map (unless you loop through all of them). Normally it doesn't matter since you need the key to do a lookup in the first place. But the key I'm doing a lookup with is a slice from the source code. If I hold onto that key, it will keep the entire source code in memory (because the reference prevents garbage collection).</p>
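<p>For background, the usual interning workaround when you do want a map is to store a durable copy of the string as the value, so that looking up with a slice of the source doesn't end up retaining the whole source. This is just to illustrate the problem; it isn't what gSuneido does for language keywords.</p>
<pre>package lexer

import "strings"

var interned = make(map[string]string)

func intern(s string) string {
	if v, ok := interned[s]; ok {
		return v // return the stored copy, not the slice of the source
	}
	v := strings.Clone(s) // copy so the map doesn't pin the source string
	interned[v] = v
	return v
}
</pre>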
<p>One of the first things I tried (Compile-time switches from the Modern article) was one of the fastest (4x faster than the old linear search) - switch on the length, and then switch on the i'th character (chosen for each length). For example:</p>
<pre>switch len(s) {
case 2:
    switch s[1] {
    case 'f':
        if s == "if" {
            return tok.If, "if"
        }
    case 's':
        if s == "is" {
            return tok.Is, "is"
        }</pre>
<p>I find it a little surprising that this turned out to be the fastest since it has quite a few branches - two switches plus a string comparison. Using <a href="https://godbolt.org/" target="_blank">Compiler Explorer</a> to look at the assembler, I found the code was quite good. I was expecting a call to a string comparison function, but it is unrolled inline. It may actually be beneficial that the switches are explicit so each branch can get its own prediction.</p>
<p>Based on <a href="https://godbolt.org/z/Tcq3ePEPn" target="_blank">the assembler</a>, I simplified the code. This was just as fast, smaller source code, and smaller machine code.</p>
<pre>switch len(s) {
case 2:
    if s == "if" {
        return tok.If, "if"
    }
    if s == "is" {
        return tok.Is, "is"
    }</pre>
<p>Because it was comparing constant strings and it knew the length matched, it compared them as multi-byte integers instead of strings. 2, 4, and 8 bytes matched int16, int32, and int64. For lengths of 3, 5, 6, and 7 it used a combination e.g. 6 characters was an int32 and an int16.</p>
<p>I did try various other approaches.</p>
<p>I tried finding "perfect" hash functions and table sizes, which turned out to be trickier than I thought. It was almost as fast as the switch version.</p>
<p>I tried another hash table implementation, but it was slower than the Go map.</p>
<p>I tried switching on the length and then doing a linear search.</p>
<p>In the end I went with the switch version since it was fastest. It was a little more verbose, but Tabnine handled most of the code generation.</p>
<p>Did this micro-optimization make any difference to the overall speed? No, not that I saw from a quick test. But it was an interesting investigation. And yes, premature optimization can be a mistake. But ignoring speed is also a mistake.</p>

<h2>New Regular Expression Implementation for gSuneido (2023-04-23)</h2>
<p>Suneido has its own regular expression implementation. 25 years ago, when I wrote the original C++ cSuneido implementation, there weren't a lot of regular expression libraries, and I already had an implementation from a previous product, so that's what I used. With the Java version, I initially tried to use the Java standard regular expressions. But it was too hard to get exactly the same behavior, so I ended up using my own implementation. And I did the same thing with the Go version of Suneido.</p>
<p>But I haven't been totally happy with my implementation. It uses backtracking (like many other implementations) which means it has some bad worst cases. I had to put limits on the backtracking, which meant we couldn't use regular expressions in some places, which meant hand writing code to replace them. To make it worse, there was no easy explanation of what kinds of regular expressions were ok.</p>
<p>Recently I re-encountered this series of articles about implementing regular expressions. </p>
<p><a href="https://swtch.com/~rsc/regexp/regexp1.html">Regular Expression Matching Can Be Simple And Fast</a><br />
<a href="https://swtch.com/~rsc/regexp/regexp2.html">Regular Expression Matching: the Virtual Machine Approach</a><br />
<a href="https://swtch.com/~rsc/regexp/regexp3.html">Regular Expression Matching in the Wild</a></p>
<p>I'd read them before but at that point I'd been satisfied with what I had. The articles made it seem easy to implement a non-backtracking approach to avoid the worst case behavior. Of course, it didn't turn out to be quite that easy to write a full implementation. Naturally, the articles didn't cover all the details.</p>
<p>It started out as a weekend project, but it ended up taking me longer than that, partly because I implemented optimizations for literal strings and single pass expressions.</p>
<p>Because Suneido is ASCII only (due to its age), the code is a little simpler and faster than the Go standard library version which handles UTF-8.</p>
<p>The Go regular expression package compiles to linked structs that are relatively large. I chose to compile to byte code stored in a string. I like to use strings because they are one of the few constant immutable types in Go. And string references are only 16 bytes compared to 24 bytes for byte slices. There's probably a bit more overhead processing the byte code, but it's a more compact chunk of memory so it'll be more cache friendly.</p>
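<p>A toy illustration of the "byte code in a string" idea, compiling just literal characters and '.'; the opcodes are invented and this is not gSuneido's actual instruction set.</p>
<pre>package regex

const (
	opChar  byte = iota // match the literal byte that follows
	opAny               // match any character
	opMatch             // successful match
)

// compile turns a pattern of literal characters and '.' into byte code.
func compile(pat string) string {
	var code []byte
	for i := 0; i < len(pat); i++ {
		if pat[i] == '.' {
			code = append(code, opAny)
		} else {
			code = append(code, opChar, pat[i])
		}
	}
	code = append(code, opMatch)
	return string(code) // immutable, and only 16 bytes to reference
}
</pre>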
<p>Humorously (to me) when I was fuzz testing my implementation against the Go standard library implementation I discovered <a href="https://github.com/golang/go/issues/59007">a bug in the Go code</a>. As expected, one of the initial responses was denial, blaming it on my lack of understanding. Thankfully, someone else (not on the Go team) confirmed the problem and even located a possible location of the bug. I thought a bug in something like regular expressions would be treated fairly seriously, but over a month later it's still just labeled as "NeedsInvestigation" rather than "NeedsFix". It was put under the Go1.21 milestone at least.</p>
<p>The new implementation is about 2000 lines of Go code. Probably half of that I was able to reuse from the previous version. As usual, <a href="https://github.com/apmckinlay/gsuneido/tree/master/util/regex">the code is on GitHub</a>.</p>
<h2>Full Text Search (2023-04-08)</h2>
<p>Suneido uses full text search for its help and wiki. jSuneido used <a href="https://lucene.apache.org/" target="_blank">Lucene</a>, a natural fit for Java. I needed something different for gSuneido. I looked at <a href="https://blevesearch.com/" target="_blank">Bleve</a>, which seems to be the most common Go full text search library. But it was big and slow, and the indexes were large. I had used <a href="https://lunrjs.com/" target="_blank">Lunr.js</a> on a previous project and it had worked well. It's small and fast and easy to use. It's JavaScript, but we display our help and wiki in a browser window so that seemed ok.</p>
<p>We (thanks Jatin) got it working, but it was a little ugly. Our customer systems have different combinations of applications and options so we built custom indexes on each system. But for Lunr we were using Node.js to build the indexes and we didn't want to install Node on all our customer systems. So we had to build an index containing "everything", ship that to our customers (every time we updated the help), and then filter the search results based on their configuration.</p>
<p>We only use our Wiki in-house, but with it there was another issue. People edit the wiki all the time. With Lucene, we could update the index with changes, but Lunr.js doesn't support that, you have to rebuild the whole index.</p>
<p>Eventually I got frustrated with the "friction" of using Lunr and decided to continue my long tradition of re-implementing the wheel. I'd considered writing my own full text index/search previously but had resisted the urge. But I had a weekend, and nothing urgent to do, so I decided to give it a try. I started with <a href="https://artem.krylysov.com/blog/2020/07/28/lets-build-a-full-text-search-engine/" target="_blank">Let's build a Full-Text Search engine</a>, which makes it sound easy, but mostly because the simple version it describes isn't complete.</p>
<p>Tokenizing is easy, although there are questions about exactly what constitutes a token. Obviously, sequences of letters, but what length? I decided to ignore single letters and also ignore anything over a certain length (e.g. 32 characters). Especially in our wiki we also wanted to search on numbers. There's also the issue of punctuation, which I'm currently ignoring.</p>
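<p>A sketch of the kind of tokenizing described above: runs of letters or digits, ignoring single characters and anything over 32 characters. The details are illustrative, not the exact gSuneido rules.</p>
<pre>package ftsearch

func tokenize(text string) []string {
	var tokens []string
	start := -1
	flush := func(end int) {
		if start >= 0 && end-start > 1 && end-start <= 32 {
			tokens = append(tokens, text[start:end])
		}
		start = -1
	}
	for i := 0; i < len(text); i++ {
		c := text[i]
		alnum := (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
			(c >= '0' && c <= '9')
		if alnum && start < 0 {
			start = i
		} else if !alnum {
			flush(i)
		}
	}
	flush(len(text))
	return tokens
}
</pre>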
<p>I used the <a href="https://github.com/kljensen/snowball" target="_blank">Go Snowball (Porter2) stemmer</a> mentioned in the article. A stemmer simplifies words. For example, fishing, fished and fisher are reduced to the base form (stem) fish. That reduces the number of terms in the index and makes searching more forgiving.</p>
<p>I got sidetracked into looking at bitmap data structures like <a href="https://github.com/RoaringBitmap/roaring" target="_blank">roaring bitmaps</a> (as suggested in the article), but then I started looking at how to implement search scoring and found that bitmaps weren't sufficient; I needed counts per term per document, not just true/false.</p>
<p>I decided to use BM25 scoring like Lucene and Lunr. I found a good <a href="https://www.elastic.co/blog/practical-bm25-part-2-the-bm25-algorithm-and-its-variables" target="_blank">article</a> that explained it and even gave some examples I could check my results against.</p>
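<p>For reference, this is the standard BM25 formula in Go, with the conventional default constants (k1 = 1.2, b = 0.75); these are the textbook values, not necessarily the exact ones gSuneido uses.</p>
<pre>package ftsearch

import "math"

const (
	k1 = 1.2
	b  = 0.75
)

// idf is the BM25 inverse document frequency for a term.
// n = documents containing the term, total = total documents.
func idf(n, total int) float64 {
	return math.Log(1 + (float64(total)-float64(n)+0.5)/(float64(n)+0.5))
}

// score is one term's contribution to a document's score.
// tf = term count in the document, docLen = document length in terms,
// avgLen = average document length across the index.
func score(tf int, docLen, avgLen float64, n, total int) float64 {
	f := float64(tf)
	return idf(n, total) * f * (k1 + 1) / (f + k1*(1-b+b*docLen/avgLen))
}
</pre>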
<p>It came together surprisingly easily and by the end of the weekend I had about <a href="https://github.com/apmckinlay/gsuneido/tree/master/util/ftsearch" target="_blank">1000 lines of Go code</a> that could create indexes and search them. It was fast and the indexes were small (compared to Lunr). The results seemed comparable. I felt a little guilty because it meant throwing out all the work that went into trying to use Lunr. For a change I felt like I should have written my own version sooner.</p>
<h2>Java 20 (2023-03-22)</h2>
<p>I don't program much in Java these days, just minimal maintenance on jSuneido. But I still have an interest in what's happening with the language.</p><p>Once upon a time, Java had a problem - no new releases for a long time. They solved that by switching to a fixed release every six months. For a while we got all kinds of new features. It was great. But lately there's been a new problem: the releases have been on time, but they haven't "released" much. Here's the complete list of features for the latest release:</p><h3 id="Features">Features</h3>
<table class="jeps" summary="jeps"><tbody><tr>
<td>429:</td>
<td><a href="https://openjdk.org/jeps/429">Scoped Values (Incubator)</a></td>
</tr>
<tr>
<td>432:</td>
<td><a href="https://openjdk.org/jeps/432">Record Patterns (Second Preview)</a></td>
</tr>
<tr>
<td>433:</td>
<td><a href="https://openjdk.org/jeps/433">Pattern Matching for switch (Fourth
Preview)</a></td>
</tr>
<tr>
<td>434:</td>
<td><a href="https://openjdk.org/jeps/434">Foreign Function & Memory API (Second
Preview)</a></td>
</tr>
<tr>
<td>436:</td>
<td><a href="https://openjdk.org/jeps/436">Virtual Threads (Second Preview)</a></td>
</tr>
<tr>
<td>437:</td>
<td><a href="https://openjdk.org/jeps/437">Structured Concurrency (Second
Incubator)</a></td>
</tr>
<tr>
<td>438:</td>
<td><a href="https://openjdk.org/jeps/438">Vector API (Fifth Incubator)</a></td></tr></tbody></table><p></p><p>Every feature is either incubator or preview. I understand the need for incubator and preview. But "Fourth Preview" and "Fifth Incubator"? Yes, you want to get it "right", but if you wait for perfect in the eyes of committees, you could be waiting a long time.<br /></p><p>I wonder if this is partly a matter of Java's "enterprise" association. Let's face it, the hip kids aren't using Java. They don't care if e.g. Java Virtual Threads are ever released. On the other hand the enterprise customers don't really want new features. They're still running Java 8. But they do want to be reassured that Java is alive and well. Previews and incubators serve that purpose just fine.</p><p>Maybe I'm just impatient. I'm excited about some of the new features (and others, like Valhalla that have been "coming soon" for many years but haven't even made it to incubation). But I don't have that many decades left.<br /></p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-76185951769494539242023-02-15T11:45:00.000-06:002023-02-15T11:45:13.460-06:00Go Telemetry<p>There has been a big debate recently over the proposal to add telemetry to Go. </p><p>It started with Russ Cox's multi-part <a href="https://research.swtch.com/telemetry" target="_blank">Transparent Telemetry</a> <br /></p><p>I read the proposal and it seemed well thought out. I could understand the need to get more information about how Go was actually being used. Collecting a few anonymous counters seemed relatively benign compared to the "big" (i.e. invasive) data being collected by seemingly everyone these days.</p><p>Naively, I didn't foresee the big push back in the discussion at <a href="https://github.com/golang/go/discussions/58409" target="_blank">telemetry in the Go toolchain</a> which was eventually locked after 506 comments. (518 thumbs down to 118 thumbs up)<br /></p><p>I must admit I have a few qualms myself because it's Google. Go is it's own team, and I would say they have a good track record, but it's still Google paying their salaries and running their servers.</p><p>One point I missed until reading the discussion was that they would "temporarily" collect traffic logs with IP addresses. Supposedly this data would just be discarded, but how long until someone at Google decides they could "use" this data?</p><p>I think part of the knee jerk reaction was because it's a compiler. That seems wrong somehow. It's a bit reminiscent of the <a href="https://www.win.tue.nl/~aeb/linux/hh/thompson/trust.html" target="_blank">Ken Thompson hack</a>. We may not like it, but these days we accept that Facebook and Apple etc. are going to track us. VS Code is one of the most popular editors, and it sends large amounts of telemetry. (I keep meaning to switch to <a href="https://vscodium.com/" target="_blank">VSCodium</a>) I used to always opt in to sending telemetry because I wanted to help the developers. Nowadays I opt out of everything I can because it seems that most of it is just spying. </p><p>I don't have a lot to add to the debate. But I do have an idea/proposal that might help. How about if the telemetry was collected and published by a third party, someone with a vested interest in not abusing it. Perhaps someone like the <a href="https://www.eff.org/" target="_blank">Electronic Frontier Foundation</a>. The proposal was already saying the data would be public. 
The Go team could access it from the public source just like anyone else. The Go team would still control the actual telemetry code, but since they wouldn't be collecting the data, it would be pointless to "sneak in" extra information.</p>
<p>It's a bit sad that it's almost impossible to collect legitimate data because so many big companies have abused data collection.</p>

<h2>A Go Generics Technique (2023-02-13)</h2>
<p>Say you want to make a generic hash map in Go. One of the questions is whether hash and equals should be methods on the key type, or whether they should be functions supplied when creating an instance. In general, Go recommends passing functions. One advantage of this approach is that the functions can be closures, which then have access to context.</p>
<p>In my case I had a mix of uses. In several cases I already had hash and equals methods (from a previous incarnation). In several other cases I needed context, so closures would be better.</p>
<p>After a certain amount of head scratching I came up with a way to handle both.</p>
<p>Normally a generic hash map would be parameterized by key and value types. I added a third "helper" type. This type supplies the hash and equals functions. I created two helpers - one that calls methods on the key, and one that stores references to supplied hash and equals functions.</p>
<p>To use the helper type you need an instance. A neat Go trick is that the helper that calls the methods can be struct{} - a valid type that is zero size, so there's no space overhead.</p>
<script src="https://gist.github.com/apmckinlay/1b590eecc8e1843846706fd3181d0210.js"></script>
<p>Getting the type constraints right took some experimentation. The key and value types do not have any constraints (any). The helper is parameterized by the key type. The helper that calls methods obviously is constrained by an interface with those methods. It confused me that the constraints that would normally be on the key type get moved to the helper, but I guess that makes sense because it is specific helpers that have requirements.</p>
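<p>This is roughly the shape of the technique as described above; the names are invented and it is not the actual gSuneido container code.</p>
<pre>package hmap

type Helper[K any] interface {
	Hash(k K) uint64
	Equal(x, y K) bool
}

// MethodHelper delegates to methods on the key type. It is struct{},
// so it adds no space to a map that embeds it.
type MethodHelper[K interface {
	Hash() uint64
	Equal(K) bool
}] struct{}

func (MethodHelper[K]) Hash(k K) uint64   { return k.Hash() }
func (MethodHelper[K]) Equal(x, y K) bool { return x.Equal(y) }

// FuncHelper stores hash and equals functions, which can be closures
// with access to context.
type FuncHelper[K any] struct {
	hash  func(K) uint64
	equal func(K, K) bool
}

func (h FuncHelper[K]) Hash(k K) uint64   { return h.hash(k) }
func (h FuncHelper[K]) Equal(x, y K) bool { return h.equal(x, y) }

// Map is parameterized by key, value, and helper types.
type Map[K any, V any, H Helper[K]] struct {
	helper H
	// buckets of key/value pairs omitted
}
</pre>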
<p>PS. Go has a built-in map type that is "generic" (but predates generics in the language). The problem is that it only works with types that have built-in hash and equals. If you need to write your own hash and equals, you can't use it.</p>

<h2>Fuzz Testing Database Queries (2023-02-01)</h2>
<p>The gSuneido database was ready for production, as far as I could tell. All our tests passed.</p>
<p>But there was a steady trickle of bugs showing up. Sure, they were obscure cases, but they were showing up in actual application code, so they weren't that obscure.</p>
<p>Every time I'd think it was ready to deploy further, another bug would show up.</p>
<p>I finally decided I had to do something to flush out these obscure bugs. Waiting for them to show up in production was not the way to go.</p>
<p>I had thought about trying to fuzz test database queries before, but it always seemed too hard. Even if I could figure out a way to generate random but legal queries, how would I check the results? Then I realized I could compare jSuneido and gSuneido. It wasn't a perfect test since they were roughly the same design and therefore could have matching bugs. But I had made enough changes and improvements to gSuneido that they were now significantly different. And even if it wasn't a perfect test, it was a heck of a lot better than nothing.</p>
<p>I puzzled over how to generate valid random queries. And what kind of data to use. I started with a simple set of four tables with obvious joins between them. The joins seemed to be the most constrained element, so I started with simply picking randomly from a fixed list of joins.</p>
<p>e.g. cus join ivc</p>
<p>I represented the query as a tree of nested objects and wrote a function to randomly pick a sub-expression branch.</p>
<p>unions are added by randomly picking a branch and replacing it with: branch union branch</p>
<p>e.g. (cus join ivc) union (cus join ivc)</p>
<p>leftjoins are added by randomly replacing some of the joins.</p>
<p>Then where, rename, extend, and remove (project) are added at random spots.</p>
<p>(((cus where ck = 12) join ivc) union ((cus leftjoin ivc) extend x1)) remove c1</p>
<p>Finally it optionally adds a sort.</p>
<p>The resulting queries aren't necessarily realistic, but they seemed to cover most of the variations.</p>
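<p>A condensed sketch of those generation steps. The real generator works on a tree of query objects; strings keep the idea short here, and the table and column names are the ones from the examples above.</p>
<pre>package fuzz

import (
	"math/rand"
	"strconv"
)

var joins = []string{"cus join ivc", "cus leftjoin ivc"} // fixed list to pick from

func pick(rng *rand.Rand, list []string) string {
	return list[rng.Intn(len(list))]
}

func randomQuery(rng *rand.Rand) string {
	q := pick(rng, joins)
	if rng.Intn(2) == 0 { // sometimes union two branches
		q = "(" + q + ") union (" + pick(rng, joins) + ")"
	}
	if rng.Intn(2) == 0 { // sometimes add a where on a column known to exist
		q = "(" + q + ") where ck = " + strconv.Itoa(rng.Intn(100))
	}
	if rng.Intn(3) == 0 { // sometimes extend with a new column
		q = "(" + q + ") extend x1"
	}
	if rng.Intn(3) == 0 { // optionally add a sort
		q += " sort ck"
	}
	return q
}
</pre>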
<p>It's a little more ad-hoc than I originally hoped. You could generate random queries from the grammar, and they would be valid syntax, but queries also have to have valid semantics, and that isn't represented in the grammar.</p>
<p>First I just parsed and optimized the queries, no execution. This soon triggered some assertion failures which uncovered a few bugs.</p>
<p>The next step was to compare the actual results between gSuneido and jSuneido. I decided the simplest approach was to calculate a checksum of the result and output queries and their result checksums to a text file. Then I could re-run those queries on jSuneido and compare the checksums.</p>
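<p>A sketch of the checksum idea: hash every field of every result row and write the query with its checksum to a text file for later comparison. The row representation here is simplified to strings.</p>
<pre>package fuzz

import (
	"fmt"
	"hash/crc32"
	"io"
)

func logResult(w io.Writer, query string, rows [][]string) {
	ck := crc32.NewIEEE()
	for _, row := range rows {
		for _, field := range row {
			ck.Write([]byte(field))
			ck.Write([]byte{0}) // separator so adjacent fields don't merge
		}
	}
	fmt.Fprintf(w, "%08x\t%s\n", ck.Sum32(), query)
}
</pre>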
<p>In the end I found about a dozen bugs. Several of them were in jSuneido, which is a little surprising since it has been in production with thousands of users for about 10 years.</p>
<p>The problem with finding a bug through fuzzing is that the inputs it finds are often messy. Some fuzzing systems will try to simplify the inputs for you. I just did it manually, starting the debugging of each failure by simplifying the query as much as possible while still retaining the failure. Fuzzing can make finding bugs easier, but it doesn't really help with fixing them. And the more you fuzz, the more obscure the bugs get. On the other hand, by the end I got pretty good at tracking them down, inserting prints or assertions or using the debugger.</p>
<p>I added all the failed queries to a "corpus" that gets run every time, along with new random queries as a defense against regressions. In theory they would get generated randomly again, but that could potentially take a long time.</p>
<p>I called it success when I ran about 500,000 queries (roughly two hours of processing) with no differences. The plan is to add this to our continuous test system and fuzz for an hour or so per night when we have spare compute power. That should prevent regressions and possibly find even more obscure bugs.</p>
<p>I'm pretty happy with how this turned out. It only took me a few days work to write the fuzzing system (not counting the time to actually fix the bugs), and it flushed out a bunch of bugs that I'm sure would have haunted me for a long time otherwise.</p>
<p>Of course, in hindsight I should have done this 20 years ago, or at least 10 years ago when I wrote a second implementation (jSuneido). Sometimes I'm a little slow! But better now than never.</p>
<p>Maybe the counterpart to Eric Raymond's "given enough eyeballs, all bugs are shallow" is "given enough fuzzing, all bugs are visible". (Of course, it's never "all", but you get the idea.)</p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-37355874843529304002023-01-26T21:25:00.003-06:002023-01-26T21:25:52.236-06:00AI<p>I used to think AI (when it finally "arrived") would help compensate for the many irrationalities of Homo sapiens. What can I say, I grew up on a steady diet of science fiction.</p>
<p>And now AI is arriving in the form of ChatGPT. And it successfully duplicates the failings of human brains. I could already get all kinds of biases, misconceptions, confabulations, and outright lies from humans, amplified by the internet. Now I can get it from AI too.</p>
<p>Even in the specialized area of programming assistance, I'm somewhat skeptical. Tabnine is a good, helpful tool. It claims it wrote 15% of my code in the last month. But when I review the highlights, they're one or two short lines of code. Not much 'I' in that AI. To me, the challenge in coding is writing coherent well organized code. Copying and pasting snippets will never achieve that. It seems to me it's just going to encourage even more boilerplate. Why not, when it's generated for you. Think how many lines a day you can "write". Most code is crap (including mine), and that's what these models are trained on. Even if you wanted to train it on "good" code, where do you find that? Who is going to judge it?</p>
<p>Perhaps this is just a temporary situation and AI will solve these problems. I'm not optimistic for the near future because it seems inherent in the current approach.</p>
<p>Although, there is perhaps some cause for hope:
<a href="https://writings.stephenwolfram.com/2023/01/wolframalpha-as-the-way-to-bring-computational-knowledge-superpowers-to-chatgpt" target="_blank">https://writings.stephenwolfram.com/2023/01/wolframalpha-as-the-way-to-bring-computational-knowledge-superpowers-to-chatgpt</a>/</p>
<p>Meanwhile, after billions of dollars' worth of research, self-driving cars are currently a flop. I suspect that's a temporary setback.</p>
<p>Interesting times.</p> Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-24407329341000139462023-01-04T21:17:00.000-06:002023-01-04T21:17:37.104-06:00Three Weeks for Three Tweaks<p>It started with reports of slow database queries. At first I didn't pay too much attention, after all, some queries are slow. We're dealing with quite large data (in a small business sense, not in Google terms) and customers are choosing what to select on and how to sort.</p>
<p>It progressed to the same query being slow sometimes and fast other times. That seemed a little more suspicious, but there were still a lot of factors that might have explained it.</p>
<p>Finally someone came up with a simple example where the same query, run on the same data, was 1000 times slower when you ran it slightly differently. That was bad news for me, since it definitely meant there was a problem with the query optimization. Why was it picking such a slow strategy sometimes, when there was an obviously better one? Especially when it was such an extreme difference. The optimization shouldn't have to be very accurate when there is a 1000 times difference!</p>
<p>It didn't turn out to be a bug, the code was working as designed. It was just a particular scenario that wasn't handled well by the current design.</p>
<p>After 20 years, the easy improvements have been made and I'm into the realm of diminishing returns. I ended up needing to "tweak" three aspects of the code in order to fix the issue. And all three were required before I could see if it was going to work. TL;DR - it did solve the problem.</p>
<p>One way of looking at the issue was that the optimization was getting caught by a local minimum. It wasn't smart enough to use a sub-optimal strategy in one spot in order to allow a better strategy overall. (It doesn't explore every possible strategy; for a complex query that would be far too slow.)</p>
<p>None of the changes were difficult, but they were all somewhat invasive, requiring changes to all the query operations.</p>
<h3 id="background">Background</h3>
<p>In case you actually want to try to follow what I'm talking about, here's a little background on how Suneido implements queries.</p>
<p>A query like:</p>
<p><code>(table1 join table2) union table3</code></p>
<p>gets parsed into a tree like: <br /></p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhn7u_xYKXfOUCBN-sXVFMUj6oz0qzcXiPJGy0v-dyD8msq5T0weZJox7K2jctO7NPm2FGzGUnC0G99Zc3pbFo9AW2Xk-ScU3xyGA_mBhXfBUs81YhQ1Xb4zc0J533UwtWw1kTbI9hSvRUx5GmvNwveyS3Kl_in83byaFnLnz1CHNB5WhvwBPw/s377/query%20tree.png"><img border="0" data-original-height="246" data-original-width="377" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhn7u_xYKXfOUCBN-sXVFMUj6oz0qzcXiPJGy0v-dyD8msq5T0weZJox7K2jctO7NPm2FGzGUnC0G99Zc3pbFo9AW2Xk-ScU3xyGA_mBhXfBUs81YhQ1Xb4zc0J533UwtWw1kTbI9hSvRUx5GmvNwveyS3Kl_in83byaFnLnz1CHNB5WhvwBPw/s320/query%20tree.png" width="320" /></a></div>
<p>where the leaves are database tables and the other nodes are operations. Each node has one or more strategies for execution, plus some parameters like choice of index.</p>
<p>Optimization starts with calling the optimize method on the root operation (union in this case) which then calls optimize on each of its sub-queries.</p>
<p>Execution works similarly by calling the get method on the root operation, which then "pulls" data from its sub-queries.</p>
<h3 id="tweak-1">Tweak #1</h3>
<p>Let's say we have a query like:</p><p><code>table1 join table2 where id=1</code></p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj02OCl_RF34sFEYa8SveJ-xoChWONw14QNiupv2Nz_EgF-ro0Uos_ZhPlfZckI8xsG43VgghR8pqqjPSiT2W2FuXmHGDo6sAiNxm2t9ApQF8Vw0_P4d8knWxa-h1F37GUEbnioYkrS8o7Hh0QHb6jtqeKeFuT81u6nZLJ5RLErK_xHgWDMmxY/s293/problem1.png"><img border="0" data-original-height="234" data-original-width="293" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj02OCl_RF34sFEYa8SveJ-xoChWONw14QNiupv2Nz_EgF-ro0Uos_ZhPlfZckI8xsG43VgghR8pqqjPSiT2W2FuXmHGDo6sAiNxm2t9ApQF8Vw0_P4d8knWxa-h1F37GUEbnioYkrS8o7Hh0QHb6jtqeKeFuT81u6nZLJ5RLErK_xHgWDMmxY/s1600/problem1.png" /></a></div>
<p>During optimization, query operations can ask their children for their estimated number of rows. In this case <b>join</b> would get the table size from <b>table1</b> e.g. 1000 and one (1) from the <b>where</b> since it's selecting on a unique id. But that information isn't sufficient to estimate the fan-out of the join.</p>
<p>So the first change I made was to return the "population" count in addition to the row count. The <b>where</b> could return 1 for the result count and e.g. 10,000 for the population i.e. table2 size. This allows the <b>join</b> to estimate a more realistic fan-out of 1:10.</p>
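<p>A sketch of how I read that change: each operation reports both its result size and the underlying "population", so join can estimate fan-out. The names and structure are illustrative, not gSuneido's actual interfaces.</p>
<pre>package query

type Query interface {
	Nrows() (nrows, pop int)
}

// A where selecting on a unique id returns one row, but its population is
// still the whole table, e.g. 1 of 10,000.
type whereUnique struct{ source Query }

func (w *whereUnique) Nrows() (int, int) {
	_, pop := w.source.Nrows()
	return 1, pop
}

// join can use the populations to estimate fan-out, e.g. a 1,000 row table
// joined to a 10,000 row table gives a fan-out of roughly 1:10.
func fanout(left, right Query) float64 {
	_, lpop := left.Nrows()
	_, rpop := right.Nrows()
	if lpop == 0 {
		return 1
	}
	return float64(rpop) / float64(lpop)
}
</pre>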
<h3 id="tweak-2">Tweak #2</h3>
<p>The next problem is that when a temporary index is required, there are two costs - the "fixed" cost of building the index, and the "variable" cost of reading from it. But until now these had been combined into a single "cost". Other operations usually have only a variable cost.</p>
<p>Continuing the example from Tweak #1, now that the <b>join</b> can estimate the fan-out is 1:10, it can estimate it's only going to read 10 records from table2. So the variable cost will be low, but we'll still incur the entire fixed cost. So in this case we want to minimize the fixed cost. But to do that, I needed to separate the cost into fixed and variable parts everywhere.</p>
<h3 id="tweak-3">Tweak #3</h3>
<p>The remaining problem was that <b>join</b> didn't have any way to tell the sub-query that only a small part of the result would be read. To address this, I added a "fraction" argument to allow queries to tell their sub-queries how much of the result they estimated they would read.</p>
<p>In the running example, 10 rows from 10,000 would be a fraction of .001. Before, it would choose a strategy with e.g. a fixed cost of 1000 and a variable cost of 1000 (total 2000) over one with a fixed cost of 0 and a variable cost of 5000 (total 5000). Now the 5000 would be multiplied by .001, giving 5, which is obviously preferable to 2000.</p>
<p>This allowed the optimization to avoid the local minimum.</p>
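<p>The cost comparison from that example, written out: the fixed part is always paid in full, and only the variable part is scaled by the caller's fraction. The numbers are the ones used above; the names are illustrative.</p>
<pre>package query

type cost struct{ fixed, variable float64 }

// total is what the caller actually pays if it reads only `fraction`
// of the result: all of the fixed cost, but only part of the variable cost.
func total(c cost, fraction float64) float64 {
	return c.fixed + fraction*c.variable
}

func example() (float64, float64) {
	withIndex := cost{fixed: 1000, variable: 1000} // build a temp index
	without := cost{fixed: 0, variable: 5000}      // read without the index
	// Reading only 10 of 10,000 rows, fraction = .001:
	return total(withIndex, .001), total(without, .001) // 1001 vs 5
}
</pre>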
<h3 id="conclusion">Conclusion</h3>
<p>I'm not sure how common the problem scenario is. It was only because we ran into such an extreme case (1000 times slower) that I got motivated to investigate. Perhaps less extreme cases also occur and will be improved by these changes. As far as I can reason, the changes should not make any cases worse.</p>
<h2>Go Tip (2022-12-17)</h2>
<p>Let's say you have a function:</p>
<pre>func ShouldNotReachHere() {
    panic("should not reach here")
}
</pre>
<p>And you try to use it like:</p>
<pre>func Fn(x int) int {
    switch x {
    case 1:
        ... return ...
    case 2:
        ... return ...
    default:
        ShouldNotReachHere()
    }
}
</pre>
<p>Unfortunately, that won't work. You'll get a "missing return" error when you try to compile. That's because the compiler doesn't know that ShouldNotReachHere never returns. (Actually, it does know that when it's compiling ShouldNotReachHere, but it doesn't keep track of that information.) C++ has [[noreturn]] to specify that, but currently Go doesn't have an equivalent.</p><p>You could add a dummy return statement, but I find that misleading since it will never actually be reached. <br /></p>
<p>What I tend to do is add a dummy return type to ShouldNotReachHere (it's not critical what type since you're not actually using it).</p>
<pre>func ShouldNotReachHere() <i><b>int</b></i> {
    panic("should not reach here")
}
</pre>
<p>and then use it like:</p>
<pre>panic(ShouldNotReachHere())
</pre>
<p>If that was the only way you used it, then ShouldNotReachHere could just be a string constant. But defining it this way means you're not forced to use panic. (And if it was a function returning the string, then you could forget to use panic.)<br /></p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-16065439452738144382022-12-04T17:45:00.002-06:002022-12-04T17:45:14.238-06:00Blogging, Suneido, and Programming Languages<p>Another blogger I follow has announced they are stopping blogging. It's sad (to me) how the world has shifted away from blogging. Of course, there are still active blogs, but far fewer than there were. I thought maybe the pandemic would give people time to do more blogging but it didn't seem to work that way. Of course, it always surprised me that smart productive busy people would take the time to write good blog posts.<br /></p><p>I'm no exception. I post less and less often, other than photographs on my non-work blog. But even that is a shift from my past longer text posts. Readers have dwindled too. (Not that I ever had a big audience.)</p>
<p>But looking back through some of my blog posts, I'm glad I wrote them even if hardly anyone read them. It's a bit like a sporadic diary, but more thoughtfully written than a personal diary would be. I started blogging in 2005. It would have been interesting to go back farther but even in 2005 the tech world was a different place than today. </p>
<p>Thinking about this led me back here. I was writing a document for my staff, and some of it seemed of slightly more general interest so I thought I'd write about it.</p>
<p>Here's roughly when the versions of Suneido went into "production" (used by customers):<br /></p>
<ul>
<li>2000 cSuneido (C++)</li>
<li>2010 jSuneido (Java)</li>
<li>2020 gSuneido client (Go)</li>
<li>2022 gSuneido server</li>
<li>2022 suneido.js (browser client)</li>
</ul>
<p>Two data points don't mean much, but it's interesting that it was roughly 10 years between the major versions. The concurrent release of suneido.js is because another programmer did the development. (I started it, but developing both gSuneido and suneido.js at the same time was too much.)</p>
<p>All three implementation languages (C++, Java, Go) were in their early stages when I started using them. I was lucky that all went on to become mainstream. Some of the other languages I considered, such as C#/.Net and Kotlin have also been successful. Scala and D not as much, although they're still around.</p>
<p>C++ is a fascinating language. But it's too complex and it's unsafe. Life is too short. (gSuneido does have some unsafe code to interface with Windows and it has been the source of the hardest bugs.)</p>
<p>Java has a world class runtime with a great garbage collector and JIT compiler. But the language itself is not really a good fit for low level systems programming. gSuneido, with a simple byte code interpreter and no JIT, still manages to outperform jSuneido. If <a href="https://openjdk.org/projects/valhalla/" target="_blank">Project Valhalla</a> ever gets released that will help. But it's been "coming soon" since 2014.<br /></p>
<p>Go felt like a bit of a gamble at the time. The garbage collector wasn't very good and neither was performance. But the language was a good fit for implementing Suneido. And it seemed like Google was committed to it. Although I wish it was a little less of a one-company language.<br /></p>
<p>One of the things that bugged me about Go was its lack of generics.
But now that it has them, I don't really use them that much. The people
that resisted generics would say "I told you so". But they did simplify certain kinds of code and I'm still glad to have them when I need them. </p><p>I
wish Go had better support for immutability. C++ and Java aren't much
better but at least they have a little bit. Immutability (and pure
functions) are one of the best tools to handle concurrency, and
concurrency is supposed to be Go's strong point. I think there's potential for a conventional mainstream language with strong immutability. I realize there are functional languages with strong immutability, but so far they're not really mainstream.<br /></p>
<p>The big thing these days is Rust, but Suneido is garbage collected, and
unless you want to write your own garbage collector (no thanks, been
there, done that) the implementation language also needs to be garbage
collected.</p>
<p>As far as the Suneido language itself, I think it has stood the test of time reasonably well. It was never intended to be especially novel or original, although at the time object-oriented programming and dynamic languages were leading edge. There have been a few minor adjustments to the language, a bit of syntactic sugar here and there, but for the most part code written 20 years ago would still run today. There are a few things I'd change but overall, when I program in Suneido, there's not a lot I wish was different. </p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com1tag:blogger.com,1999:blog-11565005.post-80201736534343640562022-11-02T17:26:00.005-06:002022-11-02T17:26:54.212-06:00The Value of Fuzz Testing<p>Coincidentally after my <a href="https://thesoftwarelife.blogspot.com/2022/11/go-fuzz.html" target="_blank">Go Fuzz post</a> yesterday, today I see:<br /></p><p><a href="https://words.filippo.io/dispatches/openssl-punycode/" target="_blank">Why Did the OpenSSL Punycode Vulnerability Happen</a></p><p>Buffer overruns aren't a problem with Go (unless you're using unsafe) but the lesson still applies. I should spend more time writing fuzz tests. Especially when the Go tools make it easy.<br /></p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-29872021711582997722022-11-01T09:19:00.001-06:002022-11-02T17:27:08.314-06:00Go Fuzz<p>Go 1.18 included a new <a href="https://go.dev/security/fuzz/" target="_blank">fuzz testing facility</a>. (overshadowed by generics.) I haven't used it a lot but it has been helpful in a few cases. For example, I used it to test gSuneido's regular expression code and found a number of obscure bugs.</p>
<p>A few days ago, one of our customers got a runtime bounds error from parsing a date. This code has been running in production for years without seeing this error so presumably it was something obscure. The date parsing code is a little complicated because dates can be written many different ways. The error log didn't have the bad input, but it did have a call stack, so it probably wouldn't have been too hard to find the issue manually. But I was curious whether fuzzing would find it. And if there was one bug, maybe there were more.</p>
<script src="https://gist.github.com/apmckinlay/081a5ee957f1e2befd2526feb729ae0d.js"></script>
<p>To keep it simple, I didn't check for correct results, so the test was just looking for panics. </p><p>One of the things that should be simple, but I found confusing, was how to actually run fuzz tests. I'm not sure if it's the simplest, but this seems to work:</p><p><span style="font-family: courier;">go test -fuzz=FuzzParseDate -run=FuzzParseDate</span></p><p>Within a few seconds the fuzz test found an input that would cause a panic.</p><p><span style="font-family: courier;">"0A0A0A0A0A0A0A0A0A00000000"</span></p><p>That's probably not what the customer entered, but it was the same bug, and easily fixed once I could recreate it.</p><p>I let it run until it was no longer finding "interesting" inputs which took a few minutes. It didn't find any more bugs.</p><p>Ideally the test would be checking for correct output, not just lack of panics. But that requires a second implementation, which I don't have. Sometimes you can round-trip e.g. encode/decode but that wouldn't work in this case.<br /></p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-82371643592283605372022-04-18T12:08:00.000-06:002022-04-18T12:08:25.028-06:00Git + Windows + Parallels + make + msys sh + Go<p>Git recently made a security fix. Unfortunately, it broke things for a lot of people, including me. Just in case anyone else has a similar obscure setup (and for my own notes), here's how I solved my issue.<br /></p>
<p>My configuration is a bit off the beaten path.</p>
<ul style="text-align: left;">
<li>I am working on a Mac</li>
<li>I have my git repos in Dropbox</li>
<li>I use Parallels to build and test on Windows</li>
<li>I use make to run my go build</li><li>make is using msys sh as its shell</li>
</ul>
<p>As of 1.18, Go includes VCS info in builds. So when I ran a Go build on Windows, it would fail with:</p>
<pre># cd x:\gsuneido; git status --porcelain
fatal: unsafe repository ('//Mac/Dropbox/gsuneido' is owned by someone else)
To add an exception for this directory, call:
git config --global --add safe.directory '%(prefix)///Mac/Dropbox/gsuneido'
error obtaining VCS status: exit status 128
Use -buildvcs=false to disable VCS stamping.
</pre>
<p>Adding -buildvcs=false did get the builds working, but it seemed ugly to disable a Go feature to work around a Git issue. It also didn't help if I wanted to do other Git commands.</p>
<p>I struggled with adding the safe.directory. I wasn't sure if %(prefix) was literal or was a variable. I also wasn't sure about whether it should be forward slashes or back slashes and how many (a perpetual issue on Windows). And I wasn't sure about the quoting. Eventually, I just edited my .gitconfig with a text editor.</p>
<p>Here's what worked:</p>
<pre>[safe]
directory = %(prefix)///Mac/Dropbox/gsuneido
</pre>
<p>Now I could do git commands.</p>
<p>But my build still failed with the same error!?</p>
<p>make is using msys sh as its shell. And sure enough, from within sh, I was back to the same error. git config --global --list didn't show my safe directory. That turned out to be because my home directory from inside sh was different. If I ran git config --global --add safe.directory from within sh, then it created another .gitconfig in /home/andrew. Now I could run git commands from sh, and now my build works.</p>
<p>I'm a little nervous about having multiple .gitconfig files, one on Mac, one for Windows, and one for sh on Windows but I don't have anything critical in there, so hopefully it'll be ok.</p>
<p>I'm all for security fixes, and I try to keep up to date, but it's definitely frustrating when it breaks stuff.</p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-57733903697681469002022-02-01T22:17:00.000-06:002022-02-01T22:17:04.807-06:00A Better LRU Cache for gSuneido<p>When I was working on gSuneido performance I discovered that the LRU cache in the standard library was one of the hot spots.</p>
<p>It was a simple design - a hash map of keys to values, plus an array of keys to track which was least recently used. When I wrote it, many years ago, it was just a quick addition. I had no idea it would be a heavily used bottleneck. Usage increased a lot when we added the easy ability to memoize a function.</p>
<p>I actually had to work backwards to find that LruCache was a bottleneck. What I first discovered (with the Go memory profiler) was that deepEqual was doing the majority of allocation. It was being called by SuObject Equal. The allocation was to track which comparisons were in progress, in order to handle self-referential data. I was able to tweak deepEqual to greatly reduce the allocation. That removed the majority of the bottleneck.</p>
<p>But why were there so many object comparisons? I added debugging to print the Suneido call stack every 100 comparisons. It was coming from object.Remove which was being called by LruCache. To maintain its LRU list, it would move items to the most-recently-used end of the list by removing them and re-adding them. Since this was just an array, finding the item to remove meant a linear search, i.e. a lot of comparisons. Originally, caches had mostly been small, as you could tell from the default size of 10. But now the most common size was 200. 10 is reasonable for linear search, 200 is not so good.</p>
<p>In addition, originally keys were simple values like strings. But keys often ended up being several values in an object. The combination of linear searches through long arrays of composite values is what led to the bottleneck.</p>
<p>I decided to make LruCache built-in i.e. written in Go. But I also needed a new design that would avoid the linear searches. The normal way to track LRU is with a doubly linked list. <a href="https://github.com/apmckinlay/gsuneido/commit/7a4aff034db02e6bc3d6d8429104a856fbfdc31b" target="_blank">I started in that direction</a> but linked lists are not a good fit for modern CPUs. Because the entries are individually allocated, they are scattered in memory, which leads to CPU cache misses. Contiguous arrays are much better because memory is optimized for sequential access. You could store the linked list entries in a contiguous block, but you still have the overhead of pointers (or indexes) and you still don't have locality or sequential access.<br /></p>
<p>I ended up with the following design.</p>
<ul><li>an unordered array of key, value pairs (entries)</li><li>a hash map of keys to entry indexes (hmap)<br /></li><li>an array of entry indexes in LRU order</li></ul>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEg7EuH8ik0gngj5C9mT3mFN2DBO5lbOk8F2W3gawYrgW34PdejdqqZhvx-iCrSLHvHQHgzY_lBCGbpbbj4F87peXZnZNAgGQdzdDza8-TmTpuvk1zD2zqRTrzBLGoweEp3QHMw005oI6g_a5xN5X8tU0_P0oLYYw11Z3fVU43Jx-gNZIhtNS3w=s1044" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="306" data-original-width="1044" height="118" src="https://blogger.googleusercontent.com/img/a/AVvXsEg7EuH8ik0gngj5C9mT3mFN2DBO5lbOk8F2W3gawYrgW34PdejdqqZhvx-iCrSLHvHQHgzY_lBCGbpbbj4F87peXZnZNAgGQdzdDza8-TmTpuvk1zD2zqRTrzBLGoweEp3QHMw005oI6g_a5xN5X8tU0_P0oLYYw11Z3fVU43Jx-gNZIhtNS3w=w400-h118" width="400" /></a></div>
<p>A successful lookup becomes slightly more complex - look in the hash table for the entry index, and then get the value from the entries array. Finally, we have to find that entry index in the lru array (a linear search but of small integers) and move it to the most recently used end.</p>
<p>The hash table lookup will still require comparing multiple argument objects, but far fewer than with a linear search.</p>
<p>To replace the oldest item, we take the entry index from the least recently used end of the lru array. From that entry we get the key to remove it from the hash table. And we reuse that slot in the entries array to store the new key and value.</p>
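<p>Here's a stripped-down sketch of that design in Go (hypothetical code, not the actual built-in), showing the three structures and the lookup path:</p>
<pre>// Sketch only - a simplified version of the design, not the gSuneido built-in.
type entry struct {
    key   string
    value any
}

type LruCache struct {
    entries []entry        // unordered key/value pairs
    hmap    map[string]int // key to index in entries
    lru     []int          // entry indexes, least recently used first
}

func (c *LruCache) Get(key string) (any, bool) {
    i, ok := c.hmap[key]
    if !ok {
        return nil, false
    }
    c.touch(i)
    return c.entries[i].value, true
}

// touch moves entry index i to the most recently used end of lru.
// The linear search here is over small integers, not composite keys.
func (c *LruCache) touch(i int) {
    for j, ei := range c.lru {
        if ei == i {
            c.lru = append(append(c.lru[:j], c.lru[j+1:]...), i)
            return
        }
    }
}
</pre>
<p>Eviction, as described above, would take the entry index from the front of lru, delete that key from hmap, and reuse that slot in entries for the new key and value.</p>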
<p><a href="https://github.com/apmckinlay/https://github.com/apmckinlay/gsuneido/blob/5cefaa85e79b3a69b0b9a6c2ef3844048bdb77d2/builtin/lrucache.go" target="_blank">The new built-in LruCache</a> was about 100 times faster than the old one. Overall, it improved the speed of our application test suite by about 5%. That’s pretty good for a couple days work.</p>
<p>ASIDE: There are two main aspects to a cache. The focus is usually on which items to evict i.e. least recently used or least frequently used. The other aspect is which items to add. The normal assumption is that you add everything. But actually, you really only want to add items you’re going to be asked for again. Otherwise you’re evicting potentially useful items. We can’t predict the future, but we can assume that items we see twice are more likely to be seen again. One approach is to use a bloom filter to track which keys you’ve seen and only add to the cache ones you’ve seen before.</p>
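<p>For illustration only, the admit-on-second-sight idea might look something like this (using a plain map for clarity; a real implementation would use a Bloom filter, as mentioned above, to bound memory):</p>
<pre>// Sketch only - admission check for a cache: only add a key the
// second time it is requested.
type admitter struct {
    seen map[string]bool
}

// shouldAdd records the key and reports whether it was seen before,
// i.e. whether it is worth adding to the cache this time.
func (a *admitter) shouldAdd(key string) bool {
    if a.seen[key] {
        return true
    }
    a.seen[key] = true
    return false
}
</pre>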
Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-37288319959292953442022-01-29T19:41:00.004-06:002022-01-29T19:41:37.004-06:00Checking Immutable Data<p>The database implementation in the Go version of Suneido relies a lot on immutable data that is shared between threads without locking.</p>
<p>The problem is that Go has no support for immutability. It doesn’t have const like C++ or even final like Java. (Go has const but only for simple numbers and strings.)</p>
<p>I was seeing some sporadic crashes and corruption. One possible cause was something accidentally modifying the theoretically immutable data. The immutable database state is a large complex set of linked structures processed by a lot of complex code. As with a lot of immutable persistent data structures, you don’t copy the entire structure to make a new version; you basically path copy from each spot you want to change up the tree to the root. It's easy to miss copying something and end up mistakenly modifying the original.<br /></p>
<p>I had recently been using checksums (hashes) to verify that the data read from the database file was identical to the data before it was written. (To verify the reading and writing code, not to check for hardware errors.)</p>
<p>I realized I could use the same checksums to detect erroneous changes to the immutable state. I ran a background goroutine that looped forever, fetched the state (atomically), computed its checksum, waited a short time (e.g. 100 ms), and then computed the checksum again (on the same state). If something had incorrectly mutated the state, the checksums would differ.</p>
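<p>A minimal sketch of that checker (hypothetical names; the real code is part of gSuneido's database) looks something like this:</p>
<pre>package db // hypothetical

import "time"

// checkState loops forever, checksumming the same state snapshot
// before and after a short pause. Any difference means something
// mutated data that was supposed to be immutable.
func checkState(getState func() any, checksum func(any) uint32) {
    for {
        state := getState() // atomically fetch the current state
        before := checksum(state)
        time.Sleep(100 * time.Millisecond)
        if checksum(state) != before {
            panic("immutable state was modified")
        }
    }
}

// started with something like: go checkState(db.GetState, stateChecksum)
</pre>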
<p>And I actually found a few bugs that way :-)</p>
<p>Although unfortunately they were not the ones that were causing the crashes :-(<br /></p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-9410561995016592782021-05-15T08:40:00.005-06:002021-05-15T08:40:48.483-06:00Blogger Issues<p>Yesterday afternoon I wrote a blog post about my gSuneido progress. In the evening I got an email saying "Your post has been deleted" because "Your content has violated our Malware and Viruses policy."</p>
<p>The post was just some text and a couple of screenshots. It's hard to see how it could contain malware or viruses. Of course, it was gone, so I couldn't prove that. And of course, there was no human to contact.</p><p>It was funny because it actually made me a bit upset. I think that was partly from the feeling of helplessness against the faceless Google behemoth. A bit like dealing with the government. And it's free, so what can you say?<br /></p>
<p>This morning I got another email saying "We have re-evaluated the post. Upon review, the post has been reinstated." Who exactly is "we"? Somehow I doubt that was a human. Now our software gets to use the royal "we"? I suspect it would have been more honest to say "sorry, we screwed up"<br /></p>
<p>It was still not showing up, but then I found they had put it back as a draft and I had to publish it again.</p>
<p>A quick search found someone else reporting a similar issue and Google responding with "we're aware of the problem".</p>
<p>It was a good reminder to back up my content. Not that there's anything too important, but it's of nostalgic interest to me, if nothing else. (You can download from the blog Settings.)<br /></p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-6293059253568680942021-05-14T17:18:00.003-06:002021-05-15T07:52:58.522-06:00Another gSuneido Milestone<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAz9SJg6nsGi_OmqMO6seUzjbbja5QkUf3J-hZKXRduxlvEiKFeIGv5NIH8ceEoz1q1vhsNPDvvNnxvvrIeqVN1aV795qsrtJR9TlKl-vya0ZVeU5mTRx6ozeP0eJrf5pdP2iRug/s640/Screen+Shot+2021-05-14+at+11.11.06+AM.png"><img border="0" data-original-height="437" data-original-width="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAz9SJg6nsGi_OmqMO6seUzjbbja5QkUf3J-hZKXRduxlvEiKFeIGv5NIH8ceEoz1q1vhsNPDvvNnxvvrIeqVN1aV795qsrtJR9TlKl-vya0ZVeU5mTRx6ozeP0eJrf5pdP2iRug/s16000/Screen+Shot+2021-05-14+at+11.11.06+AM.png" /></a></div>
<p>This screenshot probably doesn't look too significant - just the Suneido IDE. The only noticeable difference is down in the bottom left corner. Normally it would show something like:</p>
<div class="separator" style="clear: both; "><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgq-JvZuWmgxTmC8Z6XWEmQYWCbbvTqgOWhXKN6sKdQX9S3S5WAePfJLZzkv2BQoXjcjthGQ4cVixbZlIMa4_rED4ei3fuFJMXcH3j81T6zf9XEjY-uljkHdBkQ7KP5F4H8FxbkUw/s1232/Screen+Shot+2021-05-14+at+11.15.28+AM.png"><img border="0" data-original-height="94" data-original-width="1232" height="50" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgq-JvZuWmgxTmC8Z6XWEmQYWCbbvTqgOWhXKN6sKdQX9S3S5WAePfJLZzkv2BQoXjcjthGQ4cVixbZlIMa4_rED4ei3fuFJMXcH3j81T6zf9XEjY-uljkHdBkQ7KP5F4H8FxbkUw/w640-h50/Screen+Shot+2021-05-14+at+11.15.28+AM.png" width="640" /></a></div>
<p>Instead it shows:</p>
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiW0wlCOd50drQdjHdED2MLUrvHhEFSKg9BRAvJTnDlP_wsdj55wpl7ETXLcZK-08RTgZtTBvU0JJvfadXa9ZeqfdZeQeKdpbSEAS-oMcXyiZ7LjfblJOtFdFIJX85Wi-tsqDLQw/s1232/Screen+Shot+2021-05-14+at+11.16.12+AM.png"><img border="0" data-original-height="90" data-original-width="1232" height="46" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiW0wlCOd50drQdjHdED2MLUrvHhEFSKg9BRAvJTnDlP_wsdj55wpl7ETXLcZK-08RTgZtTBvU0JJvfadXa9ZeqfdZeQeKdpbSEAS-oMcXyiZ7LjfblJOtFdFIJX85Wi-tsqDLQw/w640-h46/Screen+Shot+2021-05-14+at+11.16.12+AM.png" width="640" /></a></div>
<p>That means gSuneido is running "standalone", i.e. using its own database instead of connecting as a client to a jSuneido database. While the surface difference is tiny, internally this is a huge jump.</p>
<p>I've been working away on the gSuneido database over the last year at the same time that we've been debugging and then rolling out the gSuneido client.</p><p>If I had just ported the jSuneido database implementation it would have been much easier, but what would be the fun in that. I kept the query implementation but redesigned the storage engine and transaction handling. I'd call it second system effect, but it's more like third system since I also redesigned this for jSuneido.</p>
<p>I still have lots to do. Although the IDE starts up, it's quite shaky and easily crashed. Many of the tests fail. But even to get to this point a huge number of pieces have to work correctly together. It's a bit like building a mechanical clock and reaching the point where it first starts to tick.</p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com4tag:blogger.com,1999:blog-11565005.post-8013456368725376572021-03-14T13:42:00.000-06:002021-03-14T13:42:26.260-06:00Twenty Years of cSuneido<p>Last week was a milestone. We (my company) finished converting all our customers from cSuneido (the original C++ implementation) to gSuneido (the most recent implementation, in Go).<br /><br />That means I no longer have to maintain cSuneido and I no longer have to deal with C++. (sigh of relief)<br /><br />cSuneido has had a good run. We first deployed it to the first customer in 2000, so it's been in continuous production use for over 20 years. It's served us well.<br /><br />When I first started developing Suneido, in the late 1990's, C++ was relatively new on PC's. I started with Walter Bright's Zortech C++ which became Symantec C++ (and later Digital Mars C++). Later I moved to Microsoft C++ and MinGW.<br /><br />Suneido, like most dynamic languages is garbage collected. But C++ is not. I implemented a series of my own increasingly sophisticated conservative garbage collectors. But eventually I admitted my time was better spent on other things and I switched to using the <a href="https://en.wikipedia.org/wiki/Boehm_garbage_collector" target="_blank">Boehm-Demers-Weiser conservative garbage collector</a>. Under normal use conservative garbage collection works well. But there are cases where memory is not recycled and eventually you run out. That was somewhat tolerable on the client side, but it wasn't so good on the server side. (That was one of the factors that prompted the development of jSuneido, the Java version that we use on the server side. Another factor was the lack of concurrency support in C++ at that time.) It seemed for a while that the pendulum was swinging towards garbage collection. But Rust has given new life to manual memory management.<br /></p><p>Honestly, I won't be sorry to leave C++ behind. It has grown to be extremely complex, and while you can avoid much of that complexity, it's hard to not be affected by it. I've also had my fill of unsafe languages. Even after 20 years of fixing bugs, there are very likely still things like potential buffer overflows in cSuneido. (Ironically, one of the things that added a lot of complexity to C++ was template generics. Meanwhile I am anxiously waiting for the upcoming addition of generics in Go. However, Go's generics will be much simpler than C++'s Turing complete template programming.)<br /></p><p>While it might seem crazy to re-implement the same program (Suneido) three times, it's been an interesting exercise. I've learned things each time, and made improvements each time. It's been extra work to maintain multiple versions, but it's also caught bugs that would have been missed if I'd only had a single implementation. Doing it in three quite different languages - C++, Java, and Go, has also been enlightening. 
And having the hard constraint of needing to flawlessly run a large existing code base (about a million lines of Suneido code) means I've avoided most of the dangers of <a href="https://en.wikipedia.org/wiki/Second-system_effect" target="_blank">"second system" effects</a>.<br /></p><p>So far I've only implemented the client side of gSuneido. We are still using jSuneido (the Java version) for the server side. I'm currently working on implementing the database/server for gSuneido (in Go). Once that's finished I intend to retire jSuneido as well and be back to a single implementation to maintain, like the good old days :-) And given where I'm at in my career gSuneido will almost certainly be the last implementation I do. I wonder if it will last as long as cSuneido did?<br /></p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com2tag:blogger.com,1999:blog-11565005.post-2368872918178829062021-02-10T20:26:00.000-06:002021-02-10T20:26:17.872-06:00Tools: Joplin Notes App<p>I recently (Nov.) started using <a href="https://joplinapp.org/" target="_blank">Joplin</a>, "An open source note taking and to-do application with synchronization capabilities" and I'm quite happy with it.</p><p>I've been a long time <a href="https://evernote.com/" target="_blank">Evernote</a> user (over 10 years). Although it was a bit rough at first (see <a href="https://thesoftwarelife.blogspot.com/2010/04/plea-to-evernote.html" target="_blank">A Plea to Evernote</a>) it has worked well for me. Like Joplin, it meets my requirement for something that runs on Windows, Mac, and phone/tablet, and works off-line. (Joplin also runs on Linux.)<br /></p><p>I'm pretty sure Evernote is an example of <a href="https://en.wikipedia.org/wiki/Conway%27s_law" target="_blank">Conway's Law</a> at work. Their versions have the same overall features, but there are enough small differences to be quite annoying when you're switching back and forth. You'd think someone at a high level would push for consistency. It's stupid stuff like one version puts a button at the right and another at the left. Then they came out with a new Mac version that was missing a bunch of features, and made yet another set of differences.</p><p>I've looked for alternatives in the past, but haven't found anything that matched my needs. I can't remember where I came across Joplin. I hadn't found it when I looked in the past. I wasn't specifically looking for an open source solution, but it's nice that Joplin is open source. It seems to have an active and growing community and user base. </p><p>One of the things I like about Joplin is that it's primarily Markdown. Most of my notes are plain text (see <a href="https://thesoftwarelife.blogspot.com/2015/12/taking-notes.html" target="_blank">Taking Notes</a>) but it's nice to have a little formatting at times. There is a new WYSIWYG editor, but previously it was the standard split screen edit & preview, or toggle back and forth. I mostly stay in the raw markdown mode.<br /></p><p>One of the things I feared about switching was having to leave all my old notes behind in another program. But Joplin can import Evernote's export format. I haven't moved everything yet (there's a lot) but I transferred my Suneido notebook which is roughly 2000 notes. It took a little while to sync/upload and then sync/download on other devices but it worked well. There are a few formatting glitches, but that isn't surprising.</p><p>Interestingly, Joplin doesn't (yet) run its own sync servers. 
Instead you can use Nextcloud, Dropbox, OneDrive, WebDAV, or the file system. I already use Dropbox so that was the easiest for me. They are working on their own sync server software.</p><p>Since I got Joplin I have hardly touched Evernote. I use Joplin every day to keep notes on my work. If you're looking for a notes app it's worth checking out.<br /></p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0tag:blogger.com,1999:blog-11565005.post-30887093396245887222021-01-04T20:41:00.001-06:002021-01-04T20:41:44.415-06:00Unix: A History and a Memoir<p>I just finished reading <a href="https://www.cs.princeton.edu/~bwk/memoir.html" target="_blank">Unix: A History and a Memoir by Brian Kernighan</a>. I'm not much of a history buff, but I enjoyed it, mostly because it brought back memories.<br /></p><p>By the time I was in high school I was obsessed with computers. My father somehow arranged for me to get a Unix account in the university computer science department. I'm not sure of the details - my father worked on campus, but wasn't part of the university, he was an entomologist (studied insects) with Canada Agriculture.</p><p>My father also arranged permission for me to sit in on some computer science classes. But my high school principal refused to let me take a few hours a week for it. I can't recall why, something about disrupting my school work. Which is totally ridiculous given that I was at the top of my classes. You wonder what goes through the minds of petty bureaucrats.<br /></p><p>I remember being quite lost at first. I was entirely self taught which left plenty of gaps in my knowledge. I was intimidated by the university and too shy to ask anyone for help, But I muddled along, teaching myself Unix and C. This would have been in 1977 or 1978, so the early days of Unix. Of course, it was in the universities first.<br /></p><p>I recall being baffled that C had no way to input numbers. For some reason I either didn't discover scanf or didn't realize it was what I was looking for. It wasn't really a problem, I just figured out how to write my own string to integer conversions. When the C Programming Language book (by Kernighan and Ritchie) came out in 1978, that helped a lot.</p><p>Software Tools (by Kernighan and Plauger) was probably the book I studied the most. I implemented a kind of <a href="https://en.wikipedia.org/wiki/Ratfor">Ratfor</a> on top of TRS-80 Basic, and implemented most of the tools multiple times over the years. I still have my original copy of the book - dog eared, coffee stained, and falling apart. A few years later, I reverse engineered and implemented my own Lex and then Yacc. Yacc was a challenge because I didn't know the compiler-compiler theory it was based on. Nowadays there are open source versions of all this stuff, but not back then.<br /></p><p>I read many more Kernighan books over the years, The Elements of Programming Style (with Plauger), The Unix Programming Environment (with Pike), The Practice of Programming (with Pike), and more recently, The Go Programming Language (with Donovan).</p><p>Nowadays, with the internet and Stack Overflow, it's hard to
remember how much harder it was to access information in those days
(especially if you didn't talk to other people). I had one meeting with someone from the computer science department. (Again, arranged by my father, probably because I was so full of questions.) Looking back it was likely a grad student (he was young but had an office). I questioned him about binding time. He didn't know what I was talking about. I don't remember why I was fixated on that question. I must have seen some mention of it in a book. Me and unsolved questions/problems, like a dog and a bone.</p><p>Presumably the Unix man
pages were my main resource. But if you didn't know where to start it
was tough. I remember someone gave me a 20 or 30 page photocopied
introduction to Unix and C. That was my main resource when I started
out. Nowadays, I'd be hard pressed to make it through a day of
programming without the internet.<br /></p>Andrew McKinlayhttp://www.blogger.com/profile/14951795633428513769noreply@blogger.com0