statistics
Yeray edited this page Jun 28, 2017 · 2 revisions
Several basic statistics are calculated when data is imported.
TDataItem has a "Stats" property that returns these calculations at column and table level.
MyData.Stats
These basic values (such as standard deviation, kurtosis and skewness) are intended to be used by the machine-learning algorithms. Calculating them in advance, at import time, and persisting them together with the data saves time, because they do not have to be recalculated later.
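To make these values concrete, here is a minimal Python sketch of how such statistics could be computed once for a numeric column. It is only an illustration and not the library's implementation: the function name `basic_stats`, the dictionary keys, and the choice of population (rather than sample) formulas are all assumptions.

```python
import math


def basic_stats(values):
    """Compute a few descriptive statistics for a numeric column.

    Hypothetical helper for illustration. Uses population formulas
    (dividing by n); the library may use different conventions.
    """
    n = len(values)
    if n == 0:
        raise ValueError("cannot compute statistics of an empty column")

    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n

    return {
        "count": n,
        "mean": mean,
        "std_dev": math.sqrt(m2),
        # Skewness and excess kurtosis are undefined for constant data;
        # this sketch reports 0.0 in that case.
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0,
    }


stats = basic_stats([2, 4, 4, 4, 5, 5, 7, 9])
print(stats["mean"], stats["std_dev"])
```

Computing the result once and storing it next to the data means later consumers can read it instead of scanning the column again.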
A "map" of each column is also created. A map is a class that contains a sorted array of the unique values of a data item, together with the frequency of each value (the number of times it appears).
Because the array is sorted, data can be searched quickly using a binary search algorithm.
MyData.DataMap
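The idea behind a map can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not the library's code: the names `build_map` and `find` are hypothetical, and the two parallel lists stand in for whatever structure the map class actually uses.

```python
from bisect import bisect_left
from collections import Counter


def build_map(values):
    """Return (items, freqs): sorted unique values and their counts.

    Hypothetical helper; items[i] appears freqs[i] times in values.
    """
    counts = Counter(values)
    items = sorted(counts)
    freqs = [counts[v] for v in items]
    return items, freqs


def find(items, value):
    """Binary-search the sorted unique values.

    Returns the index of value in items, or -1 if it is absent.
    """
    i = bisect_left(items, value)
    return i if i < len(items) and items[i] == value else -1


items, freqs = build_map(["b", "a", "c", "a", "b", "a"])
print(items, freqs)      # ['a', 'b', 'c'] [3, 2, 1]
print(find(items, "b"))  # 1
print(find(items, "z"))  # -1
```

Searching the sorted unique values takes a number of comparisons that grows logarithmically with the count of distinct values, instead of scanning every row of the column.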