Support for multivariate datasets #15

Open
agrija9 opened this issue Dec 2, 2019 · 2 comments

agrija9 commented Dec 2, 2019

Hi Carlos,

I'm trying to visualize the performance of this model using a multivariate time series dataset. More specifically, I want to see the clustering behaviour of my time series. The data has shape (1000, 9601, 6), i.e. (time series, time steps, sensor readings).

In your example, you grab the trajectories of the Brownian motion and get a projection value. So your trajectory data has a shape of say (10000, 2). In my case, I have six sensor readings for each time series with shape (9601, 6).

When you do the scatter plot, you unpack these two columns and plot them with their corresponding energy value. When I run your code on my data, there are six columns to unpack instead of two.

I'm still not clear as to what your dimension reduction is doing though, are you just computing the projection value in a reduced space but plotting the trajectories in the original space?

I appreciate any insight!

@cxhernandez
Member

Hi @agrija9,

> I'm still not clear as to what your dimension reduction is doing though, are you just computing the projection value in a reduced space but plotting the trajectories in the original space?

The VDE captures slow dynamical information in an ergodic system (e.g. how a particle might diffuse over long timescales under a given potential). This is fundamentally different from, say, PCA or an autoencoder, in that the VDE's objective function (autocorrelation) explicitly maximizes time-lagged information content in its latent representation.
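To make the autocorrelation idea concrete, here is a minimal NumPy sketch, not the VDE's actual loss code. It only assumes that the quantity of interest is the correlation between a 1-D signal and a time-lagged copy of itself; the function name `lagged_autocorrelation`, the lag of 50 steps, and both toy signals are illustrative choices.

```python
import numpy as np

def lagged_autocorrelation(z, lag):
    """Pearson correlation between z[t] and z[t + lag] for a 1-D signal."""
    z = np.asarray(z, dtype=float)
    if not 0 < lag < len(z):
        raise ValueError("lag must be between 1 and len(z) - 1")
    a = z[:-lag] - z[:-lag].mean()   # signal at time t, centered
    b = z[lag:] - z[lag:].mean()     # signal at time t + lag, centered
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

rng = np.random.default_rng(0)
t = np.arange(5000)
fast = rng.normal(size=t.size)                                        # uncorrelated noise
slow = np.sin(2 * np.pi * t / 2000) + 0.1 * rng.normal(size=t.size)   # slow mode plus noise

slow_score = lagged_autocorrelation(slow, lag=50)
fast_score = lagged_autocorrelation(fast, lag=50)
print(slow_score > fast_score)
```

A coordinate that varies slowly stays correlated with itself across the lag, while noise does not, which is the sense in which an autocorrelation objective favours slow modes rather than high-variance ones.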

Apologies: the Mueller potential isn't the best example of results, since its slowest dynamical mode corresponds to its axis of largest variance. You can see an example here of the linear case as applied to a double-well potential, where this is not the case.

> In my case, I have six sensor readings for each time series with shape (9601, 6).

I'd recommend using PCA or UMAP to visualize your data in 2-D. You can run the VDE on the raw data and then project it onto a reduced space afterwards.
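As a rough sketch of the 2-D projection step, the snippet below flattens a (series, steps, sensors) array into per-frame rows and applies PCA through a plain NumPy SVD. The helper name `pca_2d` is hypothetical, and the small random array only stands in for the real (1000, 9601, 6) data. A library implementation of PCA, or UMAP, could be swapped in at the same point.

```python
import numpy as np

def pca_2d(frames):
    """Project rows of `frames` (n_frames, n_features) onto the top two principal components."""
    centered = frames - frames.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
data = rng.normal(size=(10, 50, 6))         # random stand-in: (time series, time steps, sensor readings)
frames = data.reshape(-1, data.shape[-1])   # one row per frame: (500, 6)

xy = pca_2d(frames)                         # (500, 2)
x, y = xy.T                                 # exactly two columns to unpack for a scatter plot
print(xy.shape)
```

The two columns `x` and `y` can then be passed to a scatter plot and coloured by whatever per-frame value the VDE produces, which avoids the six-column unpacking problem.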

You could also just project onto marginal distributions (e.g. with corner.py) if you'd like to keep the full dimensionality of your dataset.
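For illustration, the marginals that a corner-style plot displays can be computed directly with NumPy: one 1-D histogram per sensor channel and one 2-D histogram per pair of channels. This is only a sketch of the idea, not corner.py's own code; the random array is a stand-in for the flattened sensor readings and the bin count of 20 is arbitrary.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
frames = rng.normal(size=(500, 6))   # random stand-in for per-frame sensor readings
n_features = frames.shape[1]

# 1-D marginals: the diagonal panels of a corner-style plot.
marginals_1d = {
    i: np.histogram(frames[:, i], bins=20)[0]
    for i in range(n_features)
}

# 2-D marginals: one off-diagonal panel for each pair of channels.
marginals_2d = {
    (i, j): np.histogram2d(frames[:, i], frames[:, j], bins=20)[0]
    for i, j in combinations(range(n_features), 2)
}

print(len(marginals_1d), len(marginals_2d))
```

With six channels this gives 6 diagonal panels and 15 pairwise panels, so every dimension stays visible without any reduction.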

@agrija9
Author

agrija9 commented Dec 8, 2019

Hi @cxhernandez,

Can the VDE work with other types of physical systems or data (information contained in time series)? Or is it restricted to analyzing particle diffusion because of things like the objective function?

Thanks a lot for the data visualization tool you recommended; I'll have a look at it.
