Skip to content
This repository has been archived by the owner on Sep 11, 2023. It is now read-only.

Continuous time MSM #951

Open
MamtaMohan opened this issue Oct 2, 2016 · 23 comments
Open

Continuous time MSM #951

MamtaMohan opened this issue Oct 2, 2016 · 23 comments

Comments

@MamtaMohan
Copy link

Hi,

Is it possible to generate continuous time MSM with pyEMMA.

Mamta

@franknoe
Copy link
Contributor

franknoe commented Oct 2, 2016

This has been on our lists, but we hadn't had the time to finish it yet.
@fabian-paul has implemented rate matrix estimator using the
Kalbfleisch-Lawless method in a pull request here:
markovmodel/msmtools#57
But I'm not sure what the status is, Fabian can you comment? Maybe
having the estimator in msmtools would already help you, Mamta? My
feeling is that this would be easy to do.

I think it would also be relatively straightforward to wire it into
PyEMMA - we'd need to implement a Model class (MasterEquationModel or
ContinuousTimeMSM) that is very similar to the current MSM class but has
a few changes with respect to how the eigenvalues are handled. I think
we could re-use all MSM tests. The estimator would be similar to
MaximumLikelihoodMSM and would just call the msmtools estimator.
Probably much of the code in MSM and MaximumLikelihoodMSM can be
generalized to be used in both cases. My guess is that this is a few
days of coding for a good PyEMMA developer. Fabian, have you already
done any of that or did you exclusively work with the low-level method?

Am 01/10/16 um 20:45 schrieb MamtaMohan:

Hi,

Is it possible to generate continuous time MSM with pyEMMA.

Mamta


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#951, or mute the thread
https://github.com/notifications/unsubscribe-auth/AGMeQiQzYIDN25SJtosp4Urq4kIe3AOGks5qvwyigaJpZM4KL6L3.


Prof. Dr. Frank Noe
Head of Computational Molecular Biology group
Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354
Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

@MamtaMohan
Copy link
Author

MamtaMohan commented Oct 2, 2016

Thank you.

Appreciate your response.

I will give a try.

Mamta

@fabian-paul
Copy link
Member

fabian-paul commented Oct 3, 2016

Hi Mamta, hi Frank,

the continuous time estimator that is implemented in the pull request markovmodel/msmtools#57 is functional. To use the estimator as quick as possible, you could replace your version of msmtools with the version from my branch: https://github.com/fabian-paul/msmtools/tree/ratematrix
This version is quite up-to-date with the development branch.

The rate-matrix estimator is available after importing msmtools as
msmtools.estimation.estimate_rate_matrix
It is not integrated with Pyemma yet, estimate_rate_matrix takes a (numpy) count matrix as input and returns a (numpy) rate-matrix.
Integrating this in Pyemma will require a bit of work, as we would have to provide the implied time-scale plot, the Chapman-Kolmogorov test and Transition Path Theory expressed with the rate matrix as well. I'm would be pleased if we could add these features. However I'm quite busy right now so I would prefer to implement only the functionality that is needed right now.
I vigorously recommend that you carry out the Chapman-Kolmogorov-test with the estimated rate matrix because is it not guaranteed a priori that a continuous time MSM even exists for your data.
So I think it would be a sensible goal to implement everything in Pyemma that is needed for a continuous time MSM CK-test.
Mamta, to you need any specific functionality for the continuous time MSM?

Best,
Fabian

@franknoe
Copy link
Contributor

franknoe commented Oct 3, 2016

most of these additional features (Implied Timescales and CKTest) would
come for free if an Estimator and Model class are implemented, so I
think it's less work than you suggest, but of course I see you can't do
it right now.

Can we merge the PR in msmtools to have the basic functionality and
discuss the rest in Berlin?

Am 03/10/16 um 04:19 schrieb fabian-paul:

Hi Mamta, hi Frank,

the continuous time estimator that is implemented in the pull request
markovmodel/msmtools#57
markovmodel/msmtools#57 is functional. To
use the estimator as quick as possible, you could replace your version
of msmtools with the version from my branch:
https://github.com/fabian-paul/msmtools/tree/ratematrix
This version is quite up-to-date with the development branch.

The rate-matrix estimator is available after importing msmtools as
|msmtools.estimation.estimate_rate_matrix|
It is not integrated with Pyemma yet, estimate_rate_matrix takes a
(numpy) count matrix as input and returns a (numpy) rate-matrix.
Integrating this in Pyemma will require a bit of work, as we would
have to provide the implied time-scale plot, the Chapman-Kolmogorov
test and Transition Path Theory expressed with the rate matrix as
well. I'm would be pleased if we could add these features. However I'm
quite busy right now so I would prefer to implement only the
functionality that is needed right now.
I vigorously recommend that you carry out the Chapman-Kolmogov-test
with the estimated rate matrix because is it not guaranteed a priori
that a continuous time MSM even exists for your data.
So I think it would be a sensible goal to implement everything in
Pyemma that is needed for a continuous time MSM CK-test.
Mamta, to you need any specific functionality for the continuous time MSM?

Best,
Fabian


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#951 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AGMeQhLmSuhZ98cV9AhjmCAEYo2f7QwAks5qwMi7gaJpZM4KL6L3.


Prof. Dr. Frank Noe
Head of Computational Molecular Biology group
Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354
Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

@fabian-paul
Copy link
Member

I see. If we implement it as a Model with .timescales() and .propagate() we get much for free.

Yes the PR is ready to merge. It was on halt for two reasons:

  • We were waiting for my PR on scipy to make it into the next release. That has happened and I have removed all duplicate code that now is in scipy from the PR.
  • We wanted to clarify whether my or Robert's approach works better. We didn't sort this out yet. But I don't think this should stop us from merging. If Robert's method turns out the be better, we can change that later.

@franknoe
Copy link
Contributor

franknoe commented Oct 3, 2016

I agree. Let's go ahead, we can always swap out code if useful, but for
now it's important to have the functionality.

Am 03/10/16 um 14:26 schrieb fabian-paul:

I see. If we implement it as a Model with .timescales() and
.propagate() we get much for free.

Yes the PR is ready to merge. It was on halt for two reasons:

  • We were waiting for my PR on scipy to make it into the next
    release. That has happened and I have removed all duplicate code
    that now is in scipy from the PR.
  • We wanted to clarify whether my or Robert's approach works better.
    We didn't sort this out yet. But I don't think this should stop us
    from merging. If Robert's method turns out the be better, we can
    change that later.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#951 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AGMeQvShWgiDGHWoj_G7EC9zf7x8wA2Pks5qwVbNgaJpZM4KL6L3.


Prof. Dr. Frank Noe
Head of Computational Molecular Biology group
Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354
Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

@franknoe
Copy link
Contributor

Matma, the msmtools code is merged. You can find the low-level functionality in msmtools.estimation.rate_matrix if you install from github. If you want to use the released version, wait a few days until the new msmtools release is cut.

@marscher
Copy link
Member

@MamtaMohan you can now use msmtools-1.2 for rate matrix estimation

@MamtaMohan
Copy link
Author

Thank you Frank.

I will wait few days.

Mamta


From: Frank Noe [email protected]
Sent: Friday, October 21, 2016 1:24:29 PM
To: markovmodel/PyEMMA
Cc: Mohan, Mamta; Author
Subject: Re: [markovmodel/PyEMMA] Continuous time MSM (#951)

Matma, the msmtools code is merged. You can find the low-level functionality in msmtools.estimation.rate_matrix if you install from github. If you want to use the released version, wait a few days until the new msmtools release is cut.

You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHubhttps://github.com//issues/951#issuecomment-255429878, or mute the threadhttps://github.com/notifications/unsubscribe-auth/APcc_Yc_ztnFxcOdhhqRSFaXdHwGnTCPks5q2PVNgaJpZM4KL6L3.

@marscher
Copy link
Member

marscher commented Oct 25, 2016 via email

@MamtaMohan
Copy link
Author

Thank you Martin.

I was waiting for conda release.

Appreciate it.

Mamta


From: Martin K. Scherer [email protected]
Sent: Tuesday, October 25, 2016 3:01:19 PM
To: markovmodel/PyEMMA
Cc: Mohan, Mamta; Mention
Subject: Re: [markovmodel/PyEMMA] Continuous time MSM (#951)

the binaries the released version are also available ;)

just do a conda update msmtools

You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHubhttps://github.com//issues/951#issuecomment-256142554, or mute the threadhttps://github.com/notifications/unsubscribe-auth/APcc_VyrADVjBXMD7eEYHR1lg5rxNMpuks5q3lH_gaJpZM4KL6L3.

@MamtaMohan
Copy link
Author

Dear Fabian,

I was caught up with things so writing to after a delay.

I am new to lot of things so please be patient with me.

I just want basic Continuous Markov Model, functionality for it there now.

Please correct me where ever you think I am off.

To make Continuous Markov Model, Using PentaPeptide test files and tutorial and following script
I build the model up to the point to calculate Implied timescales
Bayesian Markov state model estimator, which give estimation of msm-lag time.
Then discrete Markov model is build by call: M = msm.estimate_markov_model(dtrajs, msm_lag)
I understand here you would like me to use new rate matrix, first by importing msmtools.

But Isn't msmtools is already loaded with PyEmma?

As per you suggested : The rate-matrix estimator is available after importing msmtools as
msmtools.estimation.estimate_rate_matrix

I tried [83]: import msmtools as msmtools.estimation.estimate_rate_matrix
File "", line 1
import msmtools as msmtools.estimation.estimate_rate_matrix
^
So I imported everything from msmtools

import msmtools as MT

In [87]: MC = MT.estimation.estimate_rate_matrix(dtrajs, msm_lag)

AttributeError Traceback (most recent call last)
in ()
----> 1 MC = MT.estimation.estimate_rate_matrix(dtrajs, msm_lag)

AttributeError: 'module' object has no attribute 'estimate_rate_matrix'

Last command I think is part of PyEMMA so it is not suppose to work.

So could you please guide me how to import the required module.

If you can point out calls in tutorial It will help.

@franknoe
Copy link
Contributor

Hi,

you may need to update msmtools:
conda install msmtools

then your import statement should work as this:
from msmtools.estimation import estimate_rate_matrix
and use as
estimate_rate_matrix(args)

or
import msmtools
and use as
msmtools.estimation import estimate_rate_matrix()

Am 16/11/16 um 02:50 schrieb MamtaMohan:

import msmtools as msmtools.estimation.estimate_rate_matrix


Prof. Dr. Frank Noe
Head of Computational Molecular Biology group
Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354
Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

@MamtaMohan
Copy link
Author

Dear Frank,

I will try your suggestion again as you mentioned.

I updated PyEMMA before I started via conda update and following packages were updated:

package build
progress_reporter-1.2 py_1 14 KB acellera
conda-4.2.12 py27_0 373 KB
msmtools-1.2 np111py27_0 1.2 MB acellera
pyemma-2.2.7 np111py27_0 1.3 MB acellera
------------------------------------------------------------ 
                                       Total:         2.8 MB 

The following NEW packages will be INSTALLED:

progress_reporter: 1.2-py_1          acellera                      

The following packages will be UPDATED:

conda:             4.2.7-py27_0                                     --> 4.2.12-py27_0             
msmtools:          1.1.3-np111py27_0 http://conda.binstar.org/omnia --> 1.2-np111py27_0   acellera 
pyemma:            2.2.3-np111py27_0 http://conda.binstar.org/omnia --> 2.2.7-np111py27_0 acellera 

@franknoe
Copy link
Contributor

then I guess you should have an msmtools that includes the feature

Am 16/11/16 um 13:43 schrieb MamtaMohan:

Dear Frank,

I will try your suggestion again as you mentioned.

I updated PyEMMA before I started via conda update and following
packages were updated:

package build
progress_reporter-1.2 py_1 14 KB acellera
conda-4.2.12 py27_0 373 KB
msmtools-1.2 np111py27_0 1.2 MB acellera
pyemma-2.2.7 np111py27_0 1.3 MB acellera

|------------------------------------------------------------ Total:
2.8 MB |

The following NEW packages will be INSTALLED:

|progress_reporter: 1.2-py_1 acellera |

The following packages will be UPDATED:

|conda: 4.2.7-py27_0 --> 4.2.12-py27_0 msmtools: 1.1.3-np111py27_0
http://conda.binstar.org/omnia --> 1.2-np111py27_0 acellera pyemma:
2.2.3-np111py27_0 http://conda.binstar.org/omnia --> 2.2.7-np111py27_0
acellera |


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#951 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AGMeQqZBb7283L1ntQ7A8X-FQ4KRPiUEks5q-vp3gaJpZM4KL6L3.


Prof. Dr. Frank Noe
Head of Computational Molecular Biology group
Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354
Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

@fabian-paul
Copy link
Member

Hi Mamta,

The comments above are a bit out-dated. We renamed the method in msmtools to msmtools.estimation.rate_matrix.
Could you give that a try?
Also msmtools doens't work with dtrajs directly (like pyemma). You first have to estimate a count matrix with msmtools.estimation.count_matrix and restrict that to its connected set with msmtools.estimation.largest_connected_submatrix. That submatrix has to be given to msmtools.estimation.rate_matrix as the first argument.

Best,
Fabian

@MamtaMohan
Copy link
Author

ok.

Let me give it a try and let me get back to you.

@marscher
Copy link
Member

On 11/16/2016 03:46 AM, Frank Noe wrote:

Hi,

you may need to update msmtools:
conda install msmtools

then your import statement should work as this:
from msmtools.estimation import estimate_rate_matrix
The function is just called rate_matrix, in analogy to the function transition_matrix.

To easily get a correct (connected count matrix) you can first estimate an msm via
pyemma and re-use the count_matrix_active property:

msmtools.estimation.rate_matrix(msm.count_matrix_active)

@franknoe
Copy link
Contributor

Yes, correct.

Note that estimation of the rate matrix on a fine clustering that MSMs
are built upon
is very expensive and not necessarily very meaningful. It's a better
idea to use a HMM and then do rate
matrix estimation on the resulting macroscopic count matrix.

Am 17/11/16 um 13:40 schrieb Martin K. Scherer:

On 11/16/2016 03:46 AM, Frank Noe wrote:

Hi,

you may need to update msmtools:
conda install msmtools

then your import statement should work as this:
from msmtools.estimation import estimate_rate_matrix
The function is just called rate_matrix, in analogy to the function
transition_matrix.

To easily get a correct (connected count matrix) you can first
estimate an msm via
pyemma and re-use the count_matrix_active property:

msmtools.estimation.rate_matrix(msm.count_matrix_active)


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#951 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AGMeQito0es5i0knR86mO07yvLGWEQlHks5q_EtNgaJpZM4KL6L3.


Prof. Dr. Frank Noe
Head of Computational Molecular Biology group
Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354
Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

@MamtaMohan
Copy link
Author

Thank you.

Hopefully I will be able to get this today or tomorrow and I will get back to you.

@MamtaMohan
Copy link
Author

Sorry for the delay.

This is what I understand from the discussion:

For Continuous Markov Model:
After I build discrete Markov Model via PyEMMA
I should first build HMM as Frank suggested.
I did make HMM for 4 states for Penta-peptide but I realized
As Martim suggested for rate_matrix
I imported msmtools as:
import msmtools as msmtools
I checked Markov model which is in M.
M.count_matrix_active give me result:
In [126]: M.count_matrix_active
Out[126]:
array([[ 6.91231723, 0. , 0. , ..., 1.97494778,
0. , 9.8747389 ],
0. , 4.95939945]])

Just listing part of output.
In [129]: M.count_matrix_active.shape
Out[129]: (250, 250)
Finally to get rate_matrix:
In [130]: MMC = msmtools.estimation.rate_matrix(M.count_matrix_active)
......
AssertionError:
Commands ends with AssertionError. I think it is this function:
def function_and_gradient(self, x):
343 if self.sparsity is None:
--> 344 assert np.all(x >= 0)
345 else:
346 assert np.all(x > 0)

If you can help me fix this I would appreciate it.

@fabian-paul
Copy link
Member

Hi Mamta,

apologies for the long delay. I had some time to dig into this error only now. I created a pull request with a tentative fix on markovmodel/msmtools#98
There seems to be some internal problem of the l_bfgs_b routine as it apparently can violate the constraints that are imposed during the minimization of the likelihood.
The fix would be to install a new version of msmtools from github once the pull request is merged and call rate_matrix with the option on_error='warn'. This will turn the error that you get into a warning. Hopefully then the l_bfgs_b minimizer will recover from overshooting the bounds in a subsequent iteration, s. t. the result will be ok.
I know that this might not be the solution you expect. However there are still some numerical difficulties with the maximum-likelihood rate matrix estimator. The gradient of the likelihood requires an element-wise division by the transition matrix which leads to a bad condition. In a previous commit, I had already implemented the ad-hoc fix that was proposed by Pande and McGibbon for this problem in DOI:10.1063/1.4926516 This might already solve the problem.

Fabian

@MamtaMohan
Copy link
Author

MamtaMohan commented Jan 3, 2017 via email

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants