Stream is a Go library for online statistical algorithms. Provided statistics can be computed globally over an entire stream, or over a rolling window.
Use go get:
go get github.com/alexander-yu/stream
In-depth examples are provided in the examples directory, but a small taste is provided below:
// tracks the autocorrelation over a
// rolling window of size 15 and lag of 5
autocorr, err := joint.NewAutocorr(5, 15)
// handle err
// all metrics in the joint package must be passed
// through joint.Init in order to consume values
err = joint.Init(autocorr)
// handle err
// tracks the global median using a pair of heaps
median, err := quantile.NewGlobalHeapMedian()
// handle err
for i := 0.; i < 100; i++ {
	err = autocorr.Push(i)
	// handle err
	err = median.Push(i)
	// handle err
}
autocorrVal, err := autocorr.Value()
// handle err
medianVal, err := median.Value()
// handle err
fmt.Printf("%s: %f\n", autocorr.String(), autocorrVal)
fmt.Printf("%s: %f\n", median.String(), medianVal)
For time/space complexity details on the algorithms listed below, see here.
Quantile keeps track of the quantiles of a stream. It can calculate quantiles either globally or over a rolling window. You can also configure which implementation to use as the underlying data structure, as well as which interpolation method to use when a quantile lies between two elements. For now, skip lists and order statistic trees (in particular, modified forms of AVL trees and red-black trees) are supported.
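To illustrate what an interpolation method decides, here is a minimal sketch that computes a quantile from a sorted copy of a slice. It is not the library's implementation (which, per the above, is backed by skip lists or order statistic trees); the function name quantileOf and its midpoint flag are hypothetical and exist only for this example.

```go
package main

import (
	"math"
	"sort"
)

// quantileOf returns the q-quantile (0 <= q <= 1) of vals.
// Hypothetical helper for illustration; not part of the stream library.
// If the quantile position falls between two elements, midpoint=true
// returns their average, while midpoint=false interpolates linearly.
func quantileOf(vals []float64, q float64, midpoint bool) float64 {
	if len(vals) == 0 {
		return math.NaN()
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), vals...)
	sort.Float64s(sorted)

	pos := q * float64(len(sorted)-1) // fractional index of the quantile
	lo := int(math.Floor(pos))
	hi := int(math.Ceil(pos))
	if lo == hi {
		return sorted[lo] // quantile lands exactly on an element
	}
	if midpoint {
		return (sorted[lo] + sorted[hi]) / 2
	}
	frac := pos - float64(lo)
	return sorted[lo] + frac*(sorted[hi]-sorted[lo])
}
```

For the values 1, 2, 3, 4 and q = 0.25, the position falls between 1 and 2: linear interpolation gives 1.75, while the midpoint method gives 1.5.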
Median keeps track of the median of a stream; this is simply a convenience wrapper over Quantile that automatically sets the quantile to 0.5 and the interpolation method to the midpoint method.
IQR keeps track of the interquartile range of a stream; this is simply a convenience wrapper over Quantile that retrieves the 1st and 3rd quartiles and sets the interpolation method to the midpoint method.
HeapMedian keeps track of the median of a stream with a pair of heaps. In particular, it uses a max-heap and a min-heap to keep track of elements below and above the median, respectively. HeapMedian can calculate the global median of a stream, or over a rolling window.
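The two-heap idea can be sketched with the standard library's container/heap. This is an illustrative sketch of the global case only, not the library's code; the type names floatHeap and twoHeapMedian are made up for this example, and the max-heap is simulated by storing negated values in a min-heap.

```go
package main

import "container/heap"

// floatHeap is a min-heap of float64 values (implements heap.Interface).
type floatHeap []float64

func (h floatHeap) Len() int            { return len(h) }
func (h floatHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h floatHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *floatHeap) Push(x interface{}) { *h = append(*h, x.(float64)) }
func (h *floatHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// twoHeapMedian is a hypothetical type for illustration.
type twoHeapMedian struct {
	low  floatHeap // lower half, stored negated so the root is the largest
	high floatHeap // upper half, root is the smallest
}

// Push inserts x and rebalances so that low holds either the same number
// of elements as high, or exactly one more.
func (m *twoHeapMedian) Push(x float64) {
	if m.low.Len() == 0 || x <= -m.low[0] {
		heap.Push(&m.low, -x)
	} else {
		heap.Push(&m.high, x)
	}
	if m.low.Len() > m.high.Len()+1 {
		heap.Push(&m.high, -heap.Pop(&m.low).(float64))
	} else if m.high.Len() > m.low.Len() {
		heap.Push(&m.low, -heap.Pop(&m.high).(float64))
	}
}

// Value returns the current median; ok is false if nothing was pushed.
func (m *twoHeapMedian) Value() (median float64, ok bool) {
	if m.low.Len() == 0 {
		return 0, false
	}
	if m.low.Len() > m.high.Len() {
		return -m.low[0], true // odd count: root of the lower half
	}
	return (-m.low[0] + m.high[0]) / 2, true // even count: average the roots
}
```

Each push costs O(log n) and reading the median is O(1), which is the usual motivation for this structure. A rolling-window variant additionally needs a way to remove expired elements, which this sketch does not attempt.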
Min keeps track of the minimum of a stream; it can track either the global minimum, or over a rolling window.
Max keeps track of the maximum of a stream; it can track either the global maximum, or over a rolling window.
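One common way to maintain a rolling maximum is a monotonic deque, sketched below. Whether the library uses this technique is not stated here, so treat it as an assumption; rollingMax and newRollingMax are hypothetical names. A rolling minimum follows by flipping the comparison.

```go
package main

// rollingMax tracks the maximum of the last `window` pushed values.
// Hypothetical type for illustration; not the library's API.
type rollingMax struct {
	window int
	count  int       // total number of values pushed so far
	idx    []int     // push indices of the candidates, oldest first
	vals   []float64 // candidate values, kept in strictly decreasing order
}

func newRollingMax(window int) *rollingMax {
	return &rollingMax{window: window}
}

// Push adds x to the window and evicts the value that fell out of it.
func (r *rollingMax) Push(x float64) {
	// Older candidates that are <= x can never be the maximum again.
	for len(r.vals) > 0 && r.vals[len(r.vals)-1] <= x {
		r.vals = r.vals[:len(r.vals)-1]
		r.idx = r.idx[:len(r.idx)-1]
	}
	r.vals = append(r.vals, x)
	r.idx = append(r.idx, r.count)
	r.count++

	// The window now covers indices [count-window, count-1];
	// drop the front candidate if it is older than that.
	if r.idx[0] < r.count-r.window {
		r.vals = r.vals[1:]
		r.idx = r.idx[1:]
	}
}

// Value returns the current maximum; ok is false if nothing was pushed.
func (r *rollingMax) Value() (max float64, ok bool) {
	if len(r.vals) == 0 {
		return 0, false
	}
	return r.vals[0], true
}
```

With a window of 3 and the pushes 5, 3, 4, 1, 2, 0, the reported maximum is 5 for the first three pushes, then 4, 4, and finally 2. Every value is appended and removed at most once, so a push is amortized O(1).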
Mean keeps track of the mean of a stream; it can track either the global mean, or over a rolling window.
EWMA keeps track of the global exponentially weighted moving average.
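As a sketch of the update rule, the example below assumes the common convention that the decay factor weights the newest value, i.e. ewma = decay*x + (1-decay)*ewma, with the first value used to initialize the average. That convention is an assumption; check the godoc for how the library defines its decay parameter. The type name ewma is hypothetical.

```go
package main

// ewma is a hypothetical type illustrating an exponentially weighted
// moving average. Assumption: decay is the weight given to the newest
// value, and the first pushed value seeds the average.
type ewma struct {
	decay float64
	value float64
	seen  bool
}

// Push folds x into the running average.
func (e *ewma) Push(x float64) {
	if !e.seen {
		e.value = x
		e.seen = true
		return
	}
	e.value = e.decay*x + (1-e.decay)*e.value
}

// Value returns the current average; ok is false if nothing was pushed.
func (e *ewma) Value() (avg float64, ok bool) {
	return e.value, e.seen
}
```

With a decay of 0.5, pushing 2, 4, 8 yields 2, then 3, then 5.5 under this convention.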
Moment keeps track of the k-th sample central moment; it can track either the global moment, or over a rolling window.
EWMMoment keeps track of the global k-th exponentially weighted moving sample central moment. This uses the exponentially weighted moving average as its center of mass, and uses the same exponential weights for its power terms.
Std keeps track of the sample standard deviation of a stream; it can track either the global standard deviation, or over a rolling window. To track the sample variance instead, you should use Moment, i.e.
variance := New(2, window)
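For intuition on how a sample variance and standard deviation can be maintained online without storing the stream, here is a sketch of Welford's algorithm for the global case. It is a standard technique offered as an illustration, not a description of the library's internals; runningVar is a hypothetical name.

```go
package main

import "math"

// runningVar accumulates the sample variance of a stream in one pass
// using Welford's algorithm. Hypothetical type for illustration.
type runningVar struct {
	n    int
	mean float64
	m2   float64 // sum of squared differences from the current mean
}

// Push folds x into the running mean and sum of squared differences.
func (r *runningVar) Push(x float64) {
	r.n++
	delta := x - r.mean // difference from the old mean
	r.mean += delta / float64(r.n)
	r.m2 += delta * (x - r.mean) // old-mean delta times new-mean delta
}

// Variance returns the sample variance (dividing by n-1),
// or NaN if fewer than two values were pushed.
func (r *runningVar) Variance() float64 {
	if r.n < 2 {
		return math.NaN()
	}
	return r.m2 / float64(r.n-1)
}

// Std returns the sample standard deviation.
func (r *runningVar) Std() float64 {
	return math.Sqrt(r.Variance())
}
```

Pushing 1, 3, 5 gives a sample variance of 4 and a standard deviation of 2.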
EWMStd keeps track of the global exponentially weighted moving standard deviation. To track the exponentially weighted moving variance instead, you should use EWMMoment, i.e.
variance := NewEWMMoment(2, decay)
Skewness keeps track of the sample skewness of a stream (in particular, the adjusted Fisher-Pearson standardized moment coefficient); it can track either the global skewness, or over a rolling window.
Kurtosis keeps track of the sample kurtosis of a stream (in particular, the sample excess kurtosis); it can track either the global kurtosis, or over a rolling window.
Core is the struct powering all of the statistics in the stream/moment subpackage; it keeps track of a pre-configured set of centralized k-th power sums of a stream in an efficient, numerically stable way; it can track either the global sums, or over a rolling window.
To configure which sums to track, you'll need to instantiate a CoreConfig struct and provide it to NewCore:
config := &moment.CoreConfig{
	Sums: moment.SumsConfig{
		2: true, // tracks the sum of squared differences
		3: true, // tracks the sum of cubed differences
	},
	Window: stream.IntPtr(0),     // tracks global sums
	Decay:  stream.FloatPtr(0.3), // tracks exponentially weighted sums with a decay factor of 0.3
}
core, err := moment.NewCore(config)
// handle err
See the godoc entry for more details on Core's methods.
Cov keeps track of the sample covariance of a stream; it can track either the global covariance, or over a rolling window.
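A one-pass update for the sample covariance can be sketched in the same spirit as Welford's algorithm for the variance. This covers the global case only and is an illustration rather than the library's implementation; runningCov is a hypothetical name.

```go
package main

// runningCov accumulates the sample covariance of paired values in one
// pass. Hypothetical type for illustration; not the library's API.
type runningCov struct {
	n     int
	meanX float64
	meanY float64
	c     float64 // running sum of (x - meanX) * (y - meanY)
}

// Push folds the pair (x, y) into the running means and co-moment.
func (r *runningCov) Push(x, y float64) {
	r.n++
	dx := x - r.meanX // difference from the old mean of x
	r.meanX += dx / float64(r.n)
	r.meanY += (y - r.meanY) / float64(r.n)
	r.c += dx * (y - r.meanY) // old-mean dx times new-mean dy
}

// Value returns the sample covariance (dividing by n-1);
// ok is false if fewer than two pairs were pushed.
func (r *runningCov) Value() (cov float64, ok bool) {
	if r.n < 2 {
		return 0, false
	}
	return r.c / float64(r.n-1), true
}
```

For the pairs (1, 2), (2, 4), (3, 6), where y is always twice x, the sample covariance is 2, which is twice the sample variance of x.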
EWMCov keeps track of the global exponentially weighted sample covariance of a stream. This uses the exponentially weighted moving average as its center of mass, and uses the same exponential weights for its power terms.
Corr keeps track of the sample correlation of a stream (in particular, the sample Pearson correlation coefficient); it can track either the global correlation, or over a rolling window.
EWMCorr keeps track of the global sample exponentially weighted correlation of a stream (in particular, the exponentially weighted sample Pearson correlation coefficient). This uses the exponentially weighted moving average as its center of mass, and uses the same exponential weights for its power terms.
Autocorr keeps track of the sample autocorrelation of a stream for a given lag; it can track either the global autocorrelation, or over a rolling window.
Autocov keeps track of the sample autocovariance of a stream for a given lag; it can track either the global autocovariance, or over a rolling window.
Core is the struct powering all of the statistics in the stream/joint subpackage; it keeps track of a pre-configured set of joint centralized power sums of a stream in an efficient, numerically stable way; it can track either the global sums, or over a rolling window.
To configure which sums to track, you'll need to instantiate a CoreConfig struct and provide it to NewCore:
config := &joint.CoreConfig{
	Sums: joint.SumsConfig{
		{1, 1}, // tracks the joint sum of differences
		{2, 0}, // tracks the sum of squared differences of variable 1
	},
	Vars:   stream.IntPtr(2),     // declares that there are 2 variables to track (optional if Sums is set)
	Window: stream.IntPtr(0),     // tracks global sums
	Decay:  stream.FloatPtr(0.3), // tracks exponentially weighted sums with a decay factor of 0.3
}
core, err := joint.NewCore(config)
// handle err
See the godoc entry for more details on Core's methods.
SimpleAggregateMetric is a convenience wrapper that stores multiple univariate metrics and will push a value to all metrics simultaneously; instead of returning a single scalar, it returns a map of metrics to their corresponding values.
SimpleJointAggregateMetric is a convenience wrapper that stores multiple multivariate metrics and will push a value to all metrics simultaneously; instead of returning a single scalar, it returns a map of metrics to their corresponding values.
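The aggregate pattern can be sketched as a fan-out over a slice of metrics. The metric interface below is hypothetical: it is modeled on the Push, Value, and String calls seen in the example near the top, but it is not the library's actual interface, and the map here is keyed by each metric's String() output as a simplifying assumption. The sumMetric and countMetric types are toy metrics that exist only to make the sketch self-contained.

```go
package main

// metric is a hypothetical univariate metric interface for illustration.
type metric interface {
	String() string
	Push(x float64) error
	Value() (float64, error)
}

// aggregate pushes each value to every wrapped metric.
type aggregate struct {
	metrics []metric
}

// Push forwards x to all metrics, stopping at the first error.
func (a *aggregate) Push(x float64) error {
	for _, m := range a.metrics {
		if err := m.Push(x); err != nil {
			return err
		}
	}
	return nil
}

// Values collects every metric's current value, keyed by its name.
func (a *aggregate) Values() (map[string]float64, error) {
	out := make(map[string]float64, len(a.metrics))
	for _, m := range a.metrics {
		v, err := m.Value()
		if err != nil {
			return nil, err
		}
		out[m.String()] = v
	}
	return out, nil
}

// sumMetric is a toy metric that tracks the running sum.
type sumMetric struct{ total float64 }

func (s *sumMetric) String() string          { return "sum" }
func (s *sumMetric) Push(x float64) error    { s.total += x; return nil }
func (s *sumMetric) Value() (float64, error) { return s.total, nil }

// countMetric is a toy metric that tracks how many values were pushed.
type countMetric struct{ n int }

func (c *countMetric) String() string          { return "count" }
func (c *countMetric) Push(x float64) error    { c.n++; return nil }
func (c *countMetric) Value() (float64, error) { return float64(c.n), nil }
```

Wrapping a sumMetric and a countMetric and pushing 1, 2, 3 yields a map with "sum" mapped to 6 and "count" mapped to 3. A joint variant would follow the same shape with Push accepting multiple values per observation.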