Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dynamic time warping algorithm with optional constraints #250

Open
wants to merge 158 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
158 commits
Select commit Hold shift + click to select a range
1dfea5c
dtw/fastdtw client files
lewardo Aug 11, 2023
fe0d3cf
dtw initial structure
lewardo Aug 15, 2023
6903f19
plagiarise datasetclient for now
lewardo Aug 22, 2023
46ecaf3
rename object to dataseries
lewardo Aug 22, 2023
9993a1e
functional clone of dataset
lewardo Aug 22, 2023
0aa7d8b
dataseries CTORs
lewardo Aug 23, 2023
1bd2e93
addSeries member
lewardo Aug 23, 2023
12e52c4
getSeries member
lewardo Aug 23, 2023
f54ce93
addFrame/getFrame members
lewardo Aug 23, 2023
725342b
get member function
lewardo Aug 23, 2023
8ffcfd3
modify member getters
lewardo Aug 23, 2023
8a9a92d
change container type to vector of tensors
lewardo Aug 23, 2023
c8aa44c
modify initFromData for vector change
lewardo Aug 23, 2023
a1b89a8
removeSeries/Frame and updateSeries/Frame member functions
lewardo Aug 23, 2023
af564f9
remove superfluous dataset members and convert names
lewardo Aug 23, 2023
db247f6
remove kFrameLen paramter
lewardo Aug 23, 2023
2774b29
add const qualifiers to input views
lewardo Aug 23, 2023
d875bd4
add `DataSeries` to libmanipulation
lewardo Aug 23, 2023
9e116fa
fix constness point setting issues
lewardo Aug 23, 2023
66897f9
xvalue views for casting issues
lewardo Aug 23, 2023
534fd36
printing shenanigains
lewardo Aug 23, 2023
9536952
series const getter
lewardo Aug 23, 2023
4576eb9
fix json (de)serialisation
lewardo Aug 23, 2023
2145f25
deft solution for tensor vector to tensorview vector casting
lewardo Aug 23, 2023
6e93be4
acc incrementing the right iterator might help
lewardo Aug 23, 2023
14a9d8e
const casting shenanigains
lewardo Aug 23, 2023
eab2162
change buffer frame return paradigm
lewardo Aug 23, 2023
4459cf9
rename messages
lewardo Aug 23, 2023
adb599d
fix assert crash by casting new points properly
lewardo Aug 23, 2023
440bfa6
printing now neat and tidyy
lewardo Aug 23, 2023
5503969
fixed read/write operations
lewardo Aug 23, 2023
5a34df7
replace asserts with errors
lewardo Aug 23, 2023
327ed07
consistent whitespace formatting
lewardo Aug 23, 2023
08ec779
deleteSeries message
lewardo Aug 23, 2023
9e41fb0
view converting methods for buffer `float` shenanigains
lewardo Aug 24, 2023
409d3cd
fix printing issues
lewardo Aug 24, 2023
99015b8
remove superflous reference creation
lewardo Aug 24, 2023
06692a0
convert to template view function calls
lewardo Aug 24, 2023
a6a23ef
addSeries message
lewardo Aug 24, 2023
77251f5
register new series-level messages
lewardo Aug 24, 2023
2845d37
regroup member functions
lewardo Aug 24, 2023
784fff6
`getSeries` message
lewardo Aug 24, 2023
2c46d51
`setSeries` message
lewardo Aug 24, 2023
6128d91
`updateSeries` message
lewardo Aug 24, 2023
54d48c6
`getDataSet` message
lewardo Aug 24, 2023
3ab06c2
actually pointing to the right member might be helpful
lewardo Aug 24, 2023
7d15aaf
fix `deleteframe` case with single frame in series
lewardo Aug 24, 2023
f9214a9
Squash merge `data-series` into `dtime-warp`
lewardo Aug 24, 2023
1a5c246
member matrix and helper distance function
lewardo Aug 24, 2023
fb4f7f9
dtw brute-force algorithm
lewardo Aug 24, 2023
eae785f
props and dataclient boilerplate
lewardo Aug 24, 2023
c25836c
member function constness
lewardo Aug 24, 2023
abdf5ce
fastdtw boilerplate
lewardo Aug 24, 2023
4d0bd5c
explicit eigen dot product
lewardo Aug 24, 2023
cdc8237
`kNearest` message for `DataSeries`
lewardo Aug 24, 2023
799374c
what was i thinking euclidian dist isnt remotely their dot product :|
lewardo Aug 24, 2023
fe27129
p-norm distance replaces euclidian
lewardo Aug 25, 2023
062df21
`DTW` client boilerplate
lewardo Aug 25, 2023
bd09568
add `DTW` to `libmanipulation`
lewardo Aug 25, 2023
711ef37
`dtw` templatifying shenanigains
lewardo Aug 25, 2023
19de4a2
`dtw` `bufCost` message
lewardo Aug 25, 2023
ee3490e
make stateless `dtw` static
lewardo Aug 25, 2023
4024033
added pNorm ui
lewardo Aug 25, 2023
75287c7
remove query object
lewardo Aug 25, 2023
6521884
returning the correct corner of the matrix might help
lewardo Aug 25, 2023
5869ebb
rm templating dipshittery i already have one i dont need a nested one
lewardo Aug 25, 2023
a8aa137
`dtw` `cost` message with `dataseries`
lewardo Aug 25, 2023
69ba4a6
private implementation of cost calculation
lewardo Aug 25, 2023
740dbe8
make algorithm stateful for consistency
lewardo Aug 26, 2023
ce0efe8
add `toBuffer` and `fromBuffer` message aliases
lewardo Aug 28, 2023
8c71e91
run clang-format
lewardo Aug 28, 2023
54d3281
`knearestdist` message on `dataseries` with `dtw` and clang-format
lewardo Aug 28, 2023
aaf04b1
fix empty buffer crash
lewardo Aug 28, 2023
acb3bbb
made pNorm exponent a member
lewardo Aug 28, 2023
1589b36
remove `fastdtw` from this branch
lewardo Aug 29, 2023
2724898
reorder member functions
lewardo Aug 29, 2023
14fccfc
remove window/path helper functions (`fastdtw` framework)
lewardo Aug 29, 2023
beb0ff1
added Constraint object
lewardo Aug 29, 2023
1273cbf
raster line range calculation
lewardo Aug 29, 2023
2ff212a
Sakoe-Chiba band constraint
lewardo Aug 29, 2023
357410e
Ikatura parallelogram constraint (WIP)
lewardo Aug 29, 2023
e9ac64b
rename row/col functions
lewardo Aug 29, 2023
b714b45
add constraint interface
lewardo Aug 29, 2023
9dbc09f
float param for fractional constraints
lewardo Aug 30, 2023
93f6fdc
variadic minimum helpers
lewardo Aug 30, 2023
3a55c3b
iterating lambda for main body to isolate range calculations
lewardo Aug 30, 2023
72d170e
add Ikatura gradient catch
lewardo Aug 30, 2023
f075e9d
separate constraint parameters by name
lewardo Aug 30, 2023
7f0378f
float `radius` param
lewardo Aug 30, 2023
067c9ad
acc using the different params might be useful
lewardo Aug 30, 2023
9bb7023
`dataSeries` distance messages now using sakoe-chiba
lewardo Aug 30, 2023
650035b
cannibalise `knnclassifier`
lewardo Aug 30, 2023
ce7b783
classifier data class
lewardo Aug 30, 2023
49e3f27
remove rt query object
lewardo Aug 30, 2023
a168e2f
hid pnorm power interface
lewardo Aug 30, 2023
892b3a6
use custom allocator on std containers
lewardo Aug 30, 2023
c78368a
use custom allocator on std containers
lewardo Aug 30, 2023
066bc70
`dtwClassifier` `fit` message
lewardo Aug 30, 2023
746f312
classifier parameters
lewardo Aug 30, 2023
07973f6
constraint getter
lewardo Aug 30, 2023
095b8ad
assignment ordering and custom allocator
lewardo Aug 30, 2023
e8834dc
convert `dataseries` template operations for consistency
lewardo Aug 30, 2023
1e82ed9
formatting
lewardo Aug 30, 2023
dde8232
convert `dataseries` template operations for consistency
lewardo Aug 30, 2023
4c41387
merge all relevant `data-series` into `dtime-warp`
lewardo Aug 30, 2023
845b218
custom vector allocator for `knearestdist`
lewardo Aug 30, 2023
48ef76e
classifier `predictpoint` message
lewardo Aug 30, 2023
eb5838f
fix templature correction
lewardo Aug 31, 2023
f1ede98
fix templature correction
lewardo Aug 31, 2023
cdf7faa
fix copy construction
lewardo Aug 31, 2023
3a237ee
remove pnorm option (not a practically useful parameter)
lewardo Aug 31, 2023
8ea77b4
check if labels and series have matching ids
lewardo Aug 31, 2023
5bc8aa9
`predictpoint` classifier message
lewardo Aug 31, 2023
e719394
private calculation member function
lewardo Aug 31, 2023
1160127
`predict` message
lewardo Aug 31, 2023
d6a2272
`predict` classifier message
lewardo Aug 31, 2023
d1cd2a0
distance weighting
lewardo Aug 31, 2023
d84b2c9
add `DTWClassifier` to clients
lewardo Aug 31, 2023
5857e3a
ass `DTWRegressor` client
lewardo Aug 31, 2023
ccc4342
more logical `dims` forwarding
lewardo Aug 31, 2023
1cc3c92
interface calls for no change
lewardo Aug 31, 2023
dbc11e7
boilerplate from classifier
lewardo Aug 31, 2023
fd4de21
weighted sum of k nearest points
lewardo Aug 31, 2023
649f4e0
predict dataset
lewardo Aug 31, 2023
cc8199e
predict buffer
lewardo Aug 31, 2023
0e3920f
set mapping space (`fit`)
lewardo Aug 31, 2023
46db66b
actually changing the name might avoid bugs
lewardo Aug 31, 2023
b24ad38
ohh so you actually have to give the correct path for includes
lewardo Aug 31, 2023
e94dd03
prevent distance 0 from losing sanity
lewardo Aug 31, 2023
91c9f6b
perhaps weighting in the corrent direction may be helpful
lewardo Aug 31, 2023
3a59eae
return max of distance and epsilon, rather than sum thereof
lewardo Sep 1, 2023
b847494
consistent printing terminology
lewardo Sep 5, 2023
c372c22
check if id exists before resizing buffer
lewardo Sep 5, 2023
dc4c2d3
squash merge `data-series` into `dtime-warp`
lewardo Sep 5, 2023
b8eb786
buffer `predictpoint` output
lewardo Sep 8, 2023
d2c3011
Merge branch 'main' into dtime-warp
tremblap Sep 9, 2023
1e2bd33
remove lpnorm argument to fix docs
lewardo Sep 9, 2023
bdc9fa4
remove `toBuffer` and `fromBuffer` aliases to unconfuse
lewardo Sep 9, 2023
6173b9c
remove `toBuffer` and `fromBuffer` aliases to unconfuse
lewardo Sep 9, 2023
c800a96
added proper dataseries error messages
lewardo Sep 9, 2023
8a9a059
added proper dataseries error messages
lewardo Sep 9, 2023
4cfc4f7
proper dataseries error messages
lewardo Sep 9, 2023
1740ee7
merge constraint parameters into once conceptual value
lewardo Sep 10, 2023
df9ccd9
fallback control path
lewardo Sep 10, 2023
8f19fef
proper knearest checks
lewardo Sep 10, 2023
69265ac
negative frame indexing
lewardo Sep 10, 2023
035638a
`getdataset` argument order
lewardo Sep 10, 2023
aa2e5d9
negative frame indexing
lewardo Sep 10, 2023
8981f98
`getdataset` argument order
lewardo Sep 10, 2023
902cdd7
Merge branch 'dtime-warp' of https://github.com/lewardo/flucoma-core …
tremblap Sep 10, 2023
ac1293e
modify the interface name - we predict from a buffer containing a Ser…
tremblap Sep 25, 2023
890985f
temporary fix to reload error - json is alphabetical - TODO: potentia…
tremblap Jan 8, 2024
502b0d2
Merge branch 'data-series' into dtime-warp
tremblap Jan 8, 2024
0d2f546
Merge branch 'main' into dtime-warp
tremblap Feb 29, 2024
c4a80fe
printseries fix print 2nd boundary counter
tremblap Feb 29, 2024
05af721
remove the T in dataset printing and json dumping
tremblap Mar 7, 2024
023ed85
correct padding per series instead
tremblap Mar 7, 2024
c16aacb
Merge branch 'main' into dtime-warp
tremblap Mar 9, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions FlucomaClients.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,9 @@ add_client(Transients clients/rt/TransientClient.hpp CLASS RTTransientClient )

#lib manipulation client group
add_client(DataSet clients/nrt/DataSetClient.hpp CLASS NRTThreadedDataSetClient GROUP MANIPULATION)
add_client(DataSeries clients/nrt/DataSeriesClient.hpp CLASS NRTThreadedDataSeriesClient GROUP MANIPULATION)
add_client(DataSetQuery clients/nrt/DataSetQueryClient.hpp CLASS NRTThreadedDataSetQueryClient GROUP MANIPULATION)
add_client(DataSeries clients/nrt/DataSeriesClient.hpp CLASS NRTThreadedDataSeriesClient GROUP MANIPULATION)
add_client(LabelSet clients/nrt/LabelSetClient.hpp CLASS NRTThreadedLabelSetClient GROUP MANIPULATION)
add_client(KDTree clients/nrt/KDTreeClient.hpp CLASS NRTThreadedKDTreeClient GROUP MANIPULATION)
add_client(KMeans clients/nrt/KMeansClient.hpp CLASS NRTThreadedKMeansClient GROUP MANIPULATION)
Expand All @@ -158,3 +160,6 @@ add_client(UMAP clients/nrt/UMAPClient.hpp CLASS NRTThreadedUMAPClient GROUP MAN
add_client(MLPRegressor clients/nrt/MLPRegressorClient.hpp CLASS NRTThreadedMLPRegressorClient GROUP MANIPULATION)
add_client(MLPClassifier clients/nrt/MLPClassifierClient.hpp CLASS NRTThreadedMLPClassifierClient GROUP MANIPULATION)
add_client(Grid clients/nrt/GridClient.hpp CLASS NRTThreadedGridClient GROUP MANIPULATION)
add_client(DTW clients/nrt/DTWClient.hpp CLASS NRTThreadedDTWClient GROUP MANIPULATION)
add_client(DTWClassifier clients/nrt/DTWClassifierClient.hpp CLASS NRTThreadedDTWClassifierClient GROUP MANIPULATION)
add_client(DTWRegressor clients/nrt/DTWRegressorClient.hpp CLASS NRTThreadedDTWRegressorClient GROUP MANIPULATION)
238 changes: 238 additions & 0 deletions include/algorithms/public/DTW.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,238 @@
/*
Part of the Fluid Corpus Manipulation Project (http://www.flucoma.org/)
Copyright University of Huddersfield.
Licensed under the BSD-3 License.
See license.md file in the project root for full license information.
This project has received funding from the European Research Council (ERC)
under the European Union’s Horizon 2020 research and innovation programme
(grant agreement No 725899).
*/

#pragma once

#include "../util/FluidEigenMappings.hpp"
#include "../../data/FluidDataSet.hpp"
#include "../../data/FluidIndex.hpp"
#include "../../data/FluidMemory.hpp"
#include "../../data/FluidTensor.hpp"
#include "../../data/TensorTypes.hpp"
#include <Eigen/Core>
#include <cstddef>
#include <iterator>
#include <random>

namespace fluid {
namespace algorithm {


enum class DTWConstraint { kUnconstrained, kIkatura, kSakoeChiba };

// debt of gratitude to the wonderful article on
// https://rtavenar.github.io/blog/dtw.html a better explanation of DTW than any
// other algorithm explanation I've seen

class DTW
{
struct Constraint;

public:
explicit DTW() = default;
~DTW() = default;

void init() {}
void clear() {}

index size() const { return mPNorm; }
constexpr index dims() const { return 0; }
constexpr index initialized() const { return true; }

double process(InputRealMatrixView x1, InputRealMatrixView x2,
DTWConstraint constr = DTWConstraint::kUnconstrained,
index param = 2,
Allocator& alloc = FluidDefaultAllocator()) const
{
ScopedEigenMap<Eigen::VectorXd> x1r(x1.cols(), alloc),
x2r(x2.cols(), alloc);
Constraint constraint(constr, x1.rows(), x2.rows(), param);

mDistanceMetrics.resize(x1.rows(), x2.rows());
mDistanceMetrics.fill(std::numeric_limits<double>::max());

constraint.iterate([&, this](index r, index c) {
x1r = _impl::asEigen<Eigen::Matrix>(x1.row(r));
x2r = _impl::asEigen<Eigen::Matrix>(x2.row(c));

mDistanceMetrics(r, c) = differencePNormToTheP(x1r, x2r);

if (r > 0 || c > 0)
{
double minimum = std::numeric_limits<double>::max();

if (r > 0) minimum = std::min(minimum, mDistanceMetrics(r - 1, c));
if (c > 0) minimum = std::min(minimum, mDistanceMetrics(r, c - 1));
if (r > 0 && c > 0)
minimum = std::min(minimum, mDistanceMetrics(r - 1, c - 1));

mDistanceMetrics(r, c) += minimum;
}
});

return std::pow(mDistanceMetrics(x1.rows() - 1, x2.rows() - 1),
1.0 / mPNorm);
}

private:
mutable RealMatrix mDistanceMetrics;
index mPNorm{2};

// P-Norm of the difference vector
// Lp{vec} = (|vec[0]|^p + |vec[1]|^p + ... + |vec[n-1]|^p + |vec[n]|^p)^(1/p)
// i.e., the 2-norm of a vector is the euclidian distance from the origin
// the 1-norm is the sum of the absolute value of the elements
// To the power P since we'll be summing multiple Norms together and they
// can combine into a single norm if you calculate the norm of multiple norms
// (normception)
inline double
differencePNormToTheP(const Eigen::Ref<const Eigen::VectorXd>& v1,
const Eigen::Ref<const Eigen::VectorXd>& v2) const
{
// assert(v1.size() == v2.size());
return (v1.array() - v2.array()).abs().pow(mPNorm).sum();
}

// fun little fold operation to do a variadic minimum
template <typename... Args>
inline static auto min(Args&&... args)
{
auto m = (args, ...);
return ((m = std::min(m, args)), ...);
}

// filter for minimum chaining, if cond evaluates to false then the value
// isn't used (never will be the minimum if its the numeric maximum)
template <typename T>
inline static T useIf(bool cond, T val)
{
return cond ? val : std::numeric_limits<T>::max();
}

struct Constraint
{
Constraint(DTWConstraint c, index rows, index cols, float param)
: mType{c}, mRows{rows}, mCols{cols}, mParam{param}
{
// ifn't gradient more than digonal set it to be the diagonal
// (sakoe-chiba with radius 0)
if (c == DTWConstraint::kIkatura)
{
float big = std::max(mRows, mCols), smol = std::min(mRows, mCols);

if (mParam <= big / smol)
{
mType = DTWConstraint::kSakoeChiba;
mParam = 0;
}
}
};

void iterate(std::function<void(index, index)> f)
{
index first, last;

for (index r = 0; r < mRows; ++r)
{
first = firstCol(r);
last = lastCol(r);

for (index c = first; c <= last; ++c) f(r, c);
}
};

private:
DTWConstraint mType;
index mRows, mCols;
float mParam; // mParam is either radius (SC) or gradient (Ik)

inline static index rasterLineMinY(index x1, index y1, float dydx, index x)
{
return std::round(y1 + (x - x1) * dydx);
}

inline static index rasterLineMinY(index x1, index y1, index x2, index y2,
index x)
{
float dy = y2 - y1, dx = x2 - x1;
return rasterLineMinY(x1, y1, dy / dx, x);
}

inline static index rasterLineMaxY(index x1, index y1, float dydx, index x)
{
if (dydx > 1)
return rasterLineMinY(x1, y1, dydx, x + 1) - 1;
else
return rasterLineMinY(x1, y1, dydx, x);
}

inline static index rasterLineMaxY(index x1, index y1, index x2, index y2,
index x)
{
float dy = y2 - y1, dx = x2 - x1;
return rasterLineMaxY(x1, y1, dy / dx, x);
}

index firstCol(index row)
{
switch (mType)
{
case DTWConstraint::kUnconstrained: return 0;

case DTWConstraint::kIkatura: {
index colNorm = rasterLineMinY(mRows - 1, mCols - 1, mParam, row);
index colInv = rasterLineMinY(0, 0, 1 / mParam, row);

index col = std::max(colNorm, colInv);

return col < 0 ? 0 : col > mCols - 1 ? mCols - 1 : col;
}

case DTWConstraint::kSakoeChiba: {
index col = rasterLineMinY(mParam, -mParam, mRows - 1 + mParam,
mCols - 1 - mParam, row);

return col < 0 ? 0 : col > mCols - 1 ? mCols - 1 : col;
}
}

return 0;
};

index lastCol(index row)
{
switch (mType)
{
case DTWConstraint::kUnconstrained: return mCols - 1;

case DTWConstraint::kIkatura: {
index colNorm = rasterLineMaxY(0, 0, mParam, row);
index colInv = rasterLineMaxY(mRows - 1, mCols - 1, 1 / mParam, row);

index col = std::min(colNorm, colInv);

return col < 0 ? 0 : col > mCols - 1 ? mCols - 1 : col;
}

case DTWConstraint::kSakoeChiba: {
index col = rasterLineMaxY(-mParam, mParam, mRows - 1 - mParam,
mCols - 1 + mParam, row);

return col < 0 ? 0 : col > mCols - 1 ? mCols - 1 : col;
}
}

return mCols - 1;
};
}; // struct Constraint
};

} // namespace algorithm
} // namespace fluid
2 changes: 2 additions & 0 deletions include/clients/nrt/CommonResults.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,10 @@ static const std::string LargeK{"k is too large"};
static const std::string SmallDim{"Number of dimensions is too small"};
static const std::string LargeDim{"Number of dimensions is too large"};
static const std::string EmptyDataSet{"DataSet is empty"};
static const std::string EmptyDataSeries{"DataSeries is empty"};
static const std::string EmptyLabelSet{"LabelSet is empty"};
static const std::string NoDataSet{"DataSet does not exist"};
static const std::string NoDataSeries{"DataSeries does not exist"};
static const std::string NoLabelSet{"LabelSet does not exist"};
static const std::string NoDataFitted{"No data fitted"};
static const std::string NotEnoughData{"Not enough data"};
Expand Down
Loading
Loading