tscs37's Blog

The Journey so far and what lies ahead

I’m currently finishing parts of the base layer of the previously discussed tree architecture.

The BlobLayer has received an update that allows writing directly to blobs. On top of this functionality, slice access has been added, which addresses slices by their index. The start offset of a slice is its index multiplied by the slice size, and the end offset is the start plus one slice size.
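As a minimal sketch of that index arithmetic (not the actual BlobLayer code; `sliceSize`, `sliceBounds` and `writeSlice` are names I made up for illustration, and the blob is just an in-memory byte slice here):

```go
package main

import (
	"errors"
	"fmt"
)

// sliceSize is a hypothetical fixed slice size in bytes.
const sliceSize = 64

// sliceBounds returns the half-open byte range [start, end) of slice idx.
func sliceBounds(idx, size int) (start, end int) {
	start = idx * size
	return start, start + size
}

// writeSlice copies data into slice idx of blob, in place.
func writeSlice(blob []byte, idx int, data []byte) error {
	start, end := sliceBounds(idx, sliceSize)
	if idx < 0 || end > len(blob) {
		return errors.New("slice index out of range")
	}
	if len(data) > sliceSize {
		return errors.New("data larger than one slice")
	}
	copy(blob[start:end], data)
	return nil
}

func main() {
	blob := make([]byte, 4*sliceSize)
	if err := writeSlice(blob, 2, []byte("hello")); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	start, end := sliceBounds(2, sliceSize)
	fmt.Println(start, end) // 128 192
}
```

Because every slice has the same size, looking one up needs no index structure, only a multiplication.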

Records have been moved from independent slices into the Node itself, and the number of records per node has been reduced. This helps to reduce storage overhead. Some of their functionality was dropped or revised. For example, instead of using a boolean to indicate whether the data is present, the buffer is simply set to nil when unloading. To prevent unloading, code can call the Hold and Unhold functions; while a record is held, its buffer will not be paged out.
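One way this could look is sketched below. Only the names Hold and Unhold come from the real code; the `record` type, the hold counter, the mutex and the `Unload` and `Loaded` methods are assumptions of mine to make the example complete.

```go
package main

import (
	"fmt"
	"sync"
)

// record is a hypothetical stand-in for a record stored in a node.
type record struct {
	mu    sync.Mutex
	buf   []byte // nil means the data is not loaded
	holds int    // number of outstanding Hold calls
}

// Hold pins the buffer so Unload will not drop it.
func (r *record) Hold() {
	r.mu.Lock()
	r.holds++
	r.mu.Unlock()
}

// Unhold releases one pin taken by Hold.
func (r *record) Unhold() {
	r.mu.Lock()
	if r.holds > 0 {
		r.holds--
	}
	r.mu.Unlock()
}

// Loaded reports whether the buffer is currently in memory.
func (r *record) Loaded() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.buf != nil
}

// Unload drops the buffer unless it is held and reports whether it did.
func (r *record) Unload() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.holds > 0 {
		return false
	}
	r.buf = nil
	return true
}

func main() {
	r := &record{buf: []byte("payload")}
	r.Hold()
	fmt.Println(r.Unload(), r.Loaded()) // false true
	r.Unhold()
	fmt.Println(r.Unload(), r.Loaded()) // true false
}
```

Using the nil buffer as the "not loaded" marker saves the extra boolean, at the cost of having to distinguish nil from an empty but loaded buffer.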

The previously mentioned cache was replaced by a simple reference store, which the tree maintains.

To access the tree, a NodeRef is obtained. This struct proxies requests: it first searches the cache and only falls back to searching the tree if the node is not found there.

The free-slice list is already implemented but not yet tested. However, it is not stored in-tree, so it has to be stored externally. The Tree object currently maintains a lot of information manually. I hope this can be moved in-tree eventually to simplify the structure and remove limitations.

First benchmarks indicate that pure data records can be read at roughly 1GiB/s on my laptop. With unsafe writes, it achieves 200MB/s, while with safe and sync’d writes it drops to 2MB/s. This means that frequently rewriting records or nodes will induce overhead and latency. However, since subtrees can be accessed directly, the root node should be less of a bottleneck. Additionally, the tree will maintain a constant operation speed while it’s converging the messages to the root.

This doesn’t mean the tree is ready for heavy write loads yet. On the contrary, it will probably clog up once the overall write rate on one part of the tree exceeds the 2MB/s mark.

Nodes have received some care, since the first iterations contained a lot of fragile code. Some parts still rely on panics to communicate failures; I hope to reduce this to a minimum in the future.