Status
11 May 2017
Leonardo Silvestri

ztsdb is still alpha. To transition to beta, additional testing and user validation are necessary, as well as some improvements in the areas outlined below.

R interface

ztsdb's time type maps to POSIXct. This is very problematic because time has nanosecond precision, whereas POSIXct stores seconds since the epoch as a double-precision floating-point number and cannot represent present-day timestamps to the nanosecond (adjacent representable values are roughly 240 ns apart). ztsdb's duration, interval and period types do not yet have a mapping. zts (ztsdb's time-series type) maps to xts with a POSIXct index.
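The precision loss can be illustrated outside of R. The following is a minimal sketch in Python, whose float is the same IEEE 754 double that underlies POSIXct; the helper name to_double is purely illustrative and is not part of ztsdb or R.

```python
import math

NS_PER_SEC = 1_000_000_000

def to_double(ns: int) -> float:
    """Seconds since the epoch as a double, the representation POSIXct relies on."""
    return ns / NS_PER_SEC

# 11 May 2017 00:00:00 UTC, and the same instant plus one nanosecond.
t0 = 1_494_460_800 * NS_PER_SEC
t1 = t0 + 1

# Both instants land on the same double.
collapsed = to_double(t0) == to_double(t1)

# Converting back to integer nanoseconds does not recover the extra nanosecond.
recovered = round(to_double(t1) * NS_PER_SEC)

# Gap between adjacent doubles near this epoch value, expressed in nanoseconds.
# A double has a 53-bit significand, so the gap is 2**(exponent - 53).
_, exponent = math.frexp(to_double(t0))
spacing_ns = math.ldexp(1.0, exponent - 53) * NS_PER_SEC

print(collapsed, recovered == t0, spacing_ns)
```

Any two timestamps closer together than that spacing may become indistinguishable once converted.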

time could map to nanotime from the nanotime package and duration could map to integer64 from the bit64 package, but interval and period would still need to be taken care of. Another issue is that xts does not currently support nanotime as an index. Hopefully it will be possible to add nanotime to the list of valid xts index types. In the shorter term, forking xts or using data.table instead are options.
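The appeal of an integer-based mapping can be sketched as follows. This is an illustration in Python of 64-bit nanosecond counts in general, not of the nanotime or bit64 APIs; all names are hypothetical.

```python
NS_PER_SEC = 1_000_000_000
INT64_MAX = 2**63 - 1

# Two instants stored as integer nanoseconds since the epoch.
t0 = 1_494_460_800 * NS_PER_SEC      # 11 May 2017 00:00:00 UTC
t1 = t0 + 1_500                      # 1.5 microseconds later

# A duration is then an exact integer difference.
duration_ns = t1 - t0

# The price is a bounded range: a signed 64-bit count of nanoseconds
# spans roughly 292 years on either side of the epoch.
range_years = INT64_MAX / NS_PER_SEC / (365.25 * 86_400)

print(duration_ns, round(range_years, 1))
```

Integer arithmetic keeps every nanosecond, at the cost of a range that ends in the year 2262.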

In order to make sensible decisions on this R interface there needs to be more input from ztsdb users. The current interface is sufficient to show the level of integration between R and ztsdb that can be achieved, but it is not production grade.

R functionality

Although ztsdb is a deliberately minimal subset of R and will continue to provide only complementary time-series DBMS functionality, there are certainly some core R functions that are still missing (e.g. rep, abs, grep, functionals, ...).

Performance

Many design choices were made with speed in mind. For example, incoming buffers are directly assembled into the data structure they represent (this is true for both requests and appends) and the internals are designed to eliminate spurious copies (for example in argument passing, list assignment, etc.).
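The idea of assembling incoming buffers directly into their final data structure can be sketched by analogy. The Python code below is not ztsdb's implementation; the wire format (a flat run of native doubles) and the function name assemble are assumptions made for illustration.

```python
import struct

def assemble(chunks, n_values):
    """Write incoming byte chunks straight into the final storage.

    The storage is allocated once at its final size. Each chunk is placed
    at its final position, and the result is a typed view over the same
    memory rather than a converted copy.
    """
    storage = bytearray(n_values * 8)
    view = memoryview(storage)
    offset = 0
    for chunk in chunks:
        # Stand-in for socket.recv_into(view[offset:]), which would let the
        # network data land in the storage without an intermediate buffer.
        view[offset:offset + len(chunk)] = chunk
        offset += len(chunk)
    return view.cast("d")  # reinterpret the bytes as doubles, no copy

# Simulate a message arriving in unevenly sized pieces.
payload = struct.pack("3d", 1.0, 2.5, -3.0)
values = assemble([payload[:5], payload[5:17], payload[17:]], 3)
result = values.tolist()
print(result)
```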

Nonetheless, a rigorous study of performance needs to be made in order to assess areas for improvement. Pending this investigation, these areas are currently potential candidates:

Durability

ztsdb uses Linux's mmap infrastructure to persist objects to files. From initial testing this seems to work well, but for very large time-series it could make sense to split files into smaller time chunks and ask ztsdb to string the chunks together. This would in particular allow a better backup strategy for very long time-series (e.g. financial time-series).
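As a rough illustration of mmap-backed persistence and of stringing time chunks together, here is a minimal Python sketch. The file layout (a bare run of little-endian 64-bit integers), the file naming and all function names are assumptions, not ztsdb's on-disk format.

```python
import mmap
import os
import shutil
import struct
import tempfile

def persist(path, values):
    """Store 64-bit integers in a file through a shared memory mapping."""
    size = 8 * len(values)
    with open(path, "w+b") as f:
        f.truncate(size)  # the backing file must have its full size first
        with mmap.mmap(f.fileno(), size) as m:
            struct.pack_into(f"<{len(values)}q", m, 0, *values)
            m.flush()     # push the modified pages out to the file

def load(path):
    """Read the integers back through a read-only mapping."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as m:
            return list(struct.unpack_from(f"<{size // 8}q", m, 0))

def load_chunks(paths):
    """String per-period chunk files together in chronological order."""
    out = []
    for path in sorted(paths):  # ISO dates in the names sort chronologically
        out.extend(load(path))
    return out

tmpdir = tempfile.mkdtemp()
chunks = {"2017-05-10": [1, 2, 3], "2017-05-11": [4, 5]}
paths = []
for day, vals in chunks.items():
    path = os.path.join(tmpdir, f"series-{day}.bin")
    persist(path, vals)
    paths.append(path)

restored = load_chunks(paths)
shutil.rmtree(tmpdir)
print(restored)
```

With one file per period, a backup only has to copy the chunks that changed since the previous run.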

Note that ztsdb's arrays are designed to handle mmap truncation from the beginning of the array, i.e. throwing away old observations. In-memory time-series already have this ability (see this example), but it is more difficult for file-backed mappings: using fallocate, it would only be possible to achieve this on an XFS filesystem.
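One reason file-backed truncation from the front is harder is that filesystem-level range removal typically operates on whole blocks, so the array still has to skip a partial leading block. The arithmetic can be sketched in Python as below; the 4096-byte block size, the 8-byte element size, the absence of any file header and the function name are all assumptions.

```python
def plan_front_truncation(drop_elems, elem_size=8, block_size=4096):
    """Split a front truncation into a block-aligned part and a remainder.

    Returns (collapse_bytes, residual_offset):
      collapse_bytes   bytes that can be removed from the file as whole blocks
      residual_offset  bytes of dropped data left at the start of the file,
                       which the array must skip when reading
    """
    drop_bytes = drop_elems * elem_size
    collapse_bytes = (drop_bytes // block_size) * block_size
    residual_offset = drop_bytes - collapse_bytes
    return collapse_bytes, residual_offset

# Dropping 1000 eight-byte observations is 8000 bytes: one whole block
# can be released and 3904 bytes of stale data remain to be skipped.
plan_a = plan_front_truncation(1000)

# Dropping exactly one block's worth of observations leaves no remainder.
plan_b = plan_front_truncation(512)

print(plan_a, plan_b)
```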