One of the things I've been meaning to work on for a while is the memory usage for temporary indexes in Suneido's database.
I started by measuring the current memory usage. I used two test cases: first, a single table with 400,000 records sorted by a non-indexed field, and second, the same table joined with another one. I used VisualVM to watch the heap and force garbage collections to get an idea of memory usage. The first case used about 200 MB and the second about 400 MB. For the simple case, that's about 500 bytes per record. I was pretty sure I could improve on this.
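VisualVM does this interactively, but a rough in-code version of the same measurement looks something like the following sketch. The workload here is just a stand-in that mimics per-record allocations, not actual Suneido code:

```java
import java.util.ArrayList;
import java.util.List;

// Rough heap measurement, similar in spirit to forcing a GC in
// VisualVM and reading the heap graph.
public class HeapMeasure {
    public static void main(String[] args) {
        long before = usedMemory();
        List<Object[]> workload = buildWorkload();
        long after = usedMemory();
        System.out.printf("%d rows, ~%d bytes/row%n",
                workload.size(), (after - before) / workload.size());
    }

    // System.gc() is only a hint to the JVM, but it's usually
    // good enough for ballpark numbers like these
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // stand-in workload: one small array per record, roughly
    // mimicking the per-record overhead being measured
    static List<Object[]> buildWorkload() {
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < 400_000; i++)
            rows.add(new Object[] { "key" + i });
        return rows;
    }
}
```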
My assumption was that a lot of the usage was Java's per-object overhead.
For each record, the temporary index code created a new sort key record, plus an Object[] array holding references to each of the source data records (more than one in the case of joins). These sort key arrays were then added to a MergeTree.
My first step was to switch from allocating a separate small array for each sort key to putting them all into one big array and referencing them by offset.
I could have used a standard ArrayList, but it works by doubling the size of its internal array each time it needs to grow. At very large sizes, this is not the best strategy. For example, to grow from 100 MB to 200 MB you need to allocate a new 200 MB array, copy the old 100 MB array into it, and then let the old array be garbage collected. On top of this, if the data is long-lived, it will probably be promoted to older generations by the garbage collector, where it won't get collected as easily.
Instead, I wrote an ArraysList which uses a bunch of medium-sized arrays rather than one big array. To grow, it simply allocates another medium-sized array, with no copying or garbage collecting. This is not necessarily better in every situation, but for my usage it is.
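A minimal sketch of the idea (the chunk size and generic element type here are my assumptions, not the actual jSuneido code):

```java
import java.util.ArrayList;
import java.util.List;

// Grows by adding fixed-size chunks instead of reallocating
// and copying one big array.
public class ArraysList<T> {
    private static final int CHUNK_SIZE = 512 * 1024; // elements per chunk
    private final List<T[]> chunks = new ArrayList<>();
    private int size = 0;

    @SuppressWarnings("unchecked")
    public int add(T value) {
        if (size % CHUNK_SIZE == 0) // current chunk full (or no chunks yet)
            chunks.add((T[]) new Object[CHUNK_SIZE]);
        chunks.get(size / CHUNK_SIZE)[size % CHUNK_SIZE] = value;
        return size++; // callers keep this offset instead of a reference
    }

    public T get(int offset) {
        return chunks.get(offset / CHUNK_SIZE)[offset % CHUNK_SIZE];
    }

    public int size() {
        return size;
    }
}
```

Growing never touches existing elements, so there is no copy spike and no large dead array for the collector to reclaim; the trade-off is an extra level of indirection on every access.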
The next problem was that I was now storing an int offset (into the ArraysList) instead of an array. My MergeTree is generic, which means that to use it for ints they'd have to be boxed into Integers. Then I'd be back to too much per-object overhead. So I had to make a specialized IntMergeTree. It's too bad that Java makes such a big distinction between primitives and references, and that generics only work with references. (Scala tries to reduce the distinction, and it even lets you specialize a generic class for a primitive type, but it still ends up boxing and unboxing most of the time.)
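Here's a sketch of a merge tree specialized for ints: sorted runs kept at power-of-two levels and merged upward like incrementing a binary counter. This is my reconstruction of the general technique, not the actual IntMergeTree; the real one would order offsets by the sort keys they reference (via a comparator) rather than by value, but the point is that everything lives in plain int[] arrays with no boxing:

```java
public class IntMergeTree {
    private final int[][] levels = new int[32][]; // level i holds 2^i sorted ints

    public void add(int x) {
        int[] run = { x };
        int i = 0;
        while (levels[i] != null) { // "carry": merge equal-sized runs upward
            run = merge(levels[i], run);
            levels[i++] = null;
        }
        levels[i] = run;
    }

    /** Merge every level into one fully sorted result. */
    public int[] toSortedArray() {
        int[] result = new int[0];
        for (int[] level : levels)
            if (level != null)
                result = merge(result, level);
        return result;
    }

    // standard two-way merge of sorted int arrays
    private static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        for (int i = 0, j = 0, k = 0; k < out.length; k++)
            out[k] = (j >= b.length || (i < a.length && a[i] <= b[j]))
                    ? a[i++] : b[j++];
        return out;
    }
}
```

A MergeTree&lt;Integer&gt; holding a million entries would mean a million Integer objects (16+ bytes each, plus the reference); the int[] version stores the same data in 4 bytes per entry.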
In the common case where you're just sorting a single table, there's no need for the array at all, and instead of storing the ArraysList offset, you can just store the integer "address" of the record.
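A tiny sketch of that branch, reusing the ArraysList sketch above (the names here are hypothetical, not from the actual code):

```java
// Single-table shortcut: with one source record per output row there
// is nothing to group, so the record's own int address goes straight
// into the merge tree instead of an offset into a list of Object[]s.
public class TempIndexRefs {
    private final ArraysList<Object[]> groups = new ArraysList<>();
    private final boolean singleTable;

    public TempIndexRefs(boolean singleTable) {
        this.singleTable = singleTable;
    }

    /** Returns the int that gets added to the IntMergeTree. */
    public int refFor(int recordAddress, Object[] group) {
        return singleTable
                ? recordAddress      // no array needed at all
                : groups.add(group); // joins: offset into the ArraysList
    }
}
```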
The next step was to store the sort key records in larger blocks of memory rather than individually, and again reference them by offset rather than by a Java reference. For this I could use the MemStorage class I had already written to create in-memory databases for tests. It inherits from the same Storage base class as my MmapFile memory-mapped file access; both are based on ByteBuffers. Each sort key record used at least three memory objects (the record, its ByteBuffer, and the ByteBuffer's array), so this was a significant saving. (This doesn't require much serialization because the sort key records are already in a serialized form.)
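MemStorage itself is part of the existing database code; the sketch below just illustrates the general technique. The block size and the two-byte length-prefix layout are my assumptions:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Append already-serialized keys into large shared blocks and hand
// back an int offset, instead of a per-key record/ByteBuffer/array trio.
public class KeyStore {
    private static final int BLOCK_SIZE = 1024 * 1024; // 1 MB blocks
    private final List<ByteBuffer> blocks = new ArrayList<>();
    private ByteBuffer current = newBlock();

    private ByteBuffer newBlock() {
        ByteBuffer b = ByteBuffer.allocate(BLOCK_SIZE);
        blocks.add(b);
        return b;
    }

    /** Append a serialized key (assumed much smaller than a block);
        returns an offset usable with get(). */
    public int add(byte[] key) {
        if (current.remaining() < key.length + 2)
            current = newBlock();
        int offset = (blocks.size() - 1) * BLOCK_SIZE + current.position();
        current.putShort((short) key.length); // length prefix
        current.put(key);
        return offset;
    }

    /** Read back the key bytes stored at the given offset. */
    public byte[] get(int offset) {
        ByteBuffer b = blocks.get(offset / BLOCK_SIZE).duplicate();
        b.position(offset % BLOCK_SIZE);
        byte[] key = new byte[b.getShort()];
        b.get(key);
        return key;
    }
}
```

Thousands of keys share a handful of ByteBuffers, so the per-key object overhead drops to essentially zero.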
The end result was that memory usage was less than a quarter of what it was before. My first case now took about 40 MB instead of 200 MB, and my second case took about 90 MB instead of 400 MB. Not bad for a couple of days' work.
I wouldn't normally recommend this kind of messing around - the overhead is seldom a big issue. But in the case of a server, with large numbers of users competing for memory, I think it was worth it.