
statistics

Yeray edited this page Jun 28, 2017 · 2 revisions

Statistics and Maps

Several basic statistics are calculated when data is imported.

TDataItem has a "Stats" property that returns the calculations at column and table level.

MyData.Stats

These basic values (such as standard deviation, kurtosis and skewness) are intended to be used by the machine-learning algorithms. Calculating them in advance, at import time, and persisting them together with the data saves time, because they do not need to be recalculated each time the data is used.
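To make the idea concrete, here is a minimal sketch of calculating such values once and keeping the result next to the data. It is written in Python purely for illustration and does not use the library's API: the names `ColumnStats` and `compute_stats` are hypothetical, and the choice of population (not sample) moments is an assumption, since the page does not say which definitions are used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnStats:
    """Hypothetical container for statistics calculated once per column."""
    count: int
    mean: float
    std_dev: float   # population standard deviation (assumption)
    skewness: float  # third standardized moment
    kurtosis: float  # fourth standardized moment (not "excess" kurtosis)


def compute_stats(values):
    """Calculate the basic statistics of a numeric column."""
    n = len(values)
    if n == 0:
        raise ValueError("cannot compute statistics of an empty column")

    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n

    if m2 == 0:
        # All values are equal: the shape measures are undefined,
        # so this sketch reports them as zero.
        skewness = kurtosis = 0.0
    else:
        skewness = m3 / m2 ** 1.5
        kurtosis = m4 / m2 ** 2

    return ColumnStats(n, mean, math.sqrt(m2), skewness, kurtosis)


# Calculated once, at "import" time, then reused by later algorithms.
stats = compute_stats([2, 4, 4, 4, 5, 5, 7, 9])
```

Because the result is immutable and stored with the column, any later algorithm can read `stats.std_dev` or `stats.skewness` without scanning the data again.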

A "map" of each column is also created. A map is a class that contains a sorted array with the unique element values of a data item, and the frequency of each element (the number of times it appears).

The map's sorted array is used for fast searching of data (using a binary search algorithm).

MyData.DataMap
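The structure described above (sorted unique values, a frequency for each, and binary search over the sorted array) can be sketched as follows. This is an illustration in Python, not the library's implementation: the class name `ColumnMap` and its `find` and `frequency` methods are hypothetical.

```python
from bisect import bisect_left
from collections import Counter


class ColumnMap:
    """Hypothetical map: sorted unique values plus how often each appears."""

    def __init__(self, values):
        counts = Counter(values)
        # Unique values in ascending order.
        self.items = sorted(counts)
        # counts[i] is the number of times items[i] appears in the column.
        self.counts = [counts[v] for v in self.items]

    def find(self, value):
        """Return the index of value in the sorted array, or -1 if absent."""
        i = bisect_left(self.items, value)  # binary search
        if i < len(self.items) and self.items[i] == value:
            return i
        return -1

    def frequency(self, value):
        """Return how many times value appears in the column."""
        i = self.find(value)
        return self.counts[i] if i >= 0 else 0


column_map = ColumnMap(["b", "a", "c", "a", "b", "a"])
```

Here `column_map.items` is `["a", "b", "c"]` and `column_map.counts` is `[3, 2, 1]`. A lookup such as `column_map.find("b")` takes a number of comparisons proportional to the logarithm of the count of unique values, instead of scanning every row.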