Combine with TransitionIndicators.jl to create the ultimate package for identifying changes or transitions in timeseries #36
Comments
Hi,
Thanks for reaching out, this is an interesting piece of work you're proposing.
Changepoints.jl is currently only being maintained; there's no plan to add further content. Of course, it's open source software, so please feel free to use the code as you see fit. It sounds like a good idea to me.
Out of personal interest, do you have any sources describing your approach to change point detection from a statistical perspective? This is the field I work in and I haven't seen anything like your approach before.
Best wishes
Dom
From: George Datseris ***@***.***>
Sent: 27 April 2023 10:26
Hi there devs of Changepoints.jl.
Recently in JuliaDynamics we started developing a package currently called TransitionIndicators.jl for identifying or forecasting transitions in a timeseries. The package is here: https://github.com/JuliaDynamics/TransitionIndicators.jl . Unfortunately we started working on this package before we found Changepoints.jl. In any case, at the moment the algorithms of the two packages are completely different.
Here is a quick summary of how TransitionIndicators.jl works:
1. Input timeseries are first transformed into an indicator timeseries using a rolling window. The indicator is a quantity that can forecast an upcoming change. A typical application of this is the so-called critical slowing down<https://en.wikipedia.org/wiki/Critical_transition#Early-warning_signals_and_critical_slowing_down>: the AR1 coefficient of a timeseries increases as we approach a transition. Hence, the AR1 coefficient can be estimated over a rolling window and this gives us the "indicator timeseries".
2. The indicator timeseries is then transformed into a change metric timeseries using a rolling window. The change metric is a function mapping indicator timeseries windows into real numbers. Its goal is to quantify significant change in the indicator timeseries. In the above critical slowing down example, the change metric is simply the slope of the timeseries segment in the window.
3. Lastly, a significance test is done: we estimate at which time points (at which time windows, to be precise) the change metric is significantly higher than the mean change metric of the timeseries. We do this using the method of timeseries surrogates. The best way to understand this is to quickly read through our example here: https://juliadynamics.github.io/TransitionIndicators.jl/dev/tutorial/
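The three steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the TransitionIndicators.jl API: the function names (`ar1`, `slope`, `rolling`, `change_metric_series`, `surrogate_pvalues`), the window widths, and the test signal are all hypothetical. It also assumes the simplest possible surrogate, a random shuffle of the input, which is not necessarily the surrogate type the package uses.

```python
import random
import statistics


def ar1(window):
    """Lag-1 autocorrelation estimate of a window (0.0 for a constant window)."""
    m = statistics.fmean(window)
    num = sum((a - m) * (b - m) for a, b in zip(window[:-1], window[1:]))
    den = sum((a - m) ** 2 for a in window)
    return num / den if den > 0 else 0.0


def slope(window):
    """Least-squares slope of the window values against their index."""
    n = len(window)
    t_mean = (n - 1) / 2
    y_mean = statistics.fmean(window)
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(window))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def rolling(f, x, width):
    """Apply f to every full window of the given width."""
    return [f(x[i:i + width]) for i in range(len(x) - width + 1)]


def change_metric_series(x, ind_width, chg_width):
    indicator = rolling(ar1, x, ind_width)       # step 1: indicator timeseries
    return rolling(slope, indicator, chg_width)  # step 2: change metric timeseries


def surrogate_pvalues(x, ind_width, chg_width, n_surrogates=100, seed=0):
    """Step 3: per window, the fraction of shuffled surrogates whose
    change metric is at least as large as the one of the real data."""
    rng = random.Random(seed)
    real = change_metric_series(x, ind_width, chg_width)
    counts = [0] * len(real)
    for _ in range(n_surrogates):
        shuffled = x[:]
        rng.shuffle(shuffled)  # assumption: shuffle surrogates, the simplest choice
        sur = change_metric_series(shuffled, ind_width, chg_width)
        for i, (r, v) in enumerate(zip(real, sur)):
            if v >= r:
                counts[i] += 1
    return [c / n_surrogates for c in counts]


# Hypothetical test signal: an AR(1) process whose coefficient slowly increases.
rng = random.Random(1)
x, value = [], 0.0
for t in range(300):
    phi = 0.9 * t / 300
    value = phi * value + rng.gauss(0.0, 1.0)
    x.append(value)

pvals = surrogate_pvalues(x, ind_width=50, chg_width=20, n_surrogates=50)
flagged = [i for i, p in enumerate(pvals) if p < 0.05]
print("windows flagged as significant:", flagged)
```

Windows with a small p-value are those where the slope of the rolling AR1 estimate is larger than in nearly all surrogates. Which windows get flagged depends on the seed and the widths chosen here.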
From the get-go, our goal was to make software that can do both forecasting of transitions and the more established change point detection. Our approach can do both. Let's imagine the simple change point detection scenario of data coming from Gaussian distributions where the mean or std changes with time. In the above enumerated list, the indicator is the identity (i.e., the indicator timeseries and input timeseries are one and the same). Then, the change metric timeseries can be anything quantifying distribution distance, such as KL divergence: we calculate the distance between the distribution of the first half of the rolling window and that of the second half. An alternative is to instead use the p-value of a two-sample Kolmogorov-Smirnov test<https://juliastats.org/HypothesisTests.jl/stable/nonparametric/#HypothesisTests.ApproximateTwoSampleKSTest> between the data of the first and the second half of the window. Significance is once again tested using the method of surrogates.
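A minimal sketch of this "first half versus second half" change metric, in plain Python. The names `ks_statistic` and `half_window_metric` are hypothetical, and for simplicity the sketch uses the two-sample Kolmogorov-Smirnov statistic itself rather than its p-value (a large statistic corresponds to a small p-value). The surrogate significance step is left out.

```python
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic:
    the largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # advance both samples past every value <= v
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def half_window_metric(x, width, distance=ks_statistic):
    """Distance between the first and second half of each rolling window.
    The indicator is the identity, so the input is used directly."""
    half = width // 2
    return [
        distance(x[i:i + half], x[i + half:i + 2 * half])
        for i in range(len(x) - 2 * half + 1)
    ]


# Hypothetical data: Gaussian samples whose mean jumps from 0 to 3 at index 100.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(3.0, 1.0) for _ in range(100)]

metric = half_window_metric(x, width=40)
peak = max(range(len(metric)), key=metric.__getitem__)
print("largest distance for the window centred at index", peak + 20)
```

Any other distribution distance can be passed as `distance`, which is the point made above: the change metric is interchangeable while the rolling-window structure stays the same.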
Changepoints.jl differs in a fundamental way in two parts. First, the "casting timeseries into indicators" part is skipped and left to the discretion of the user. This isn't really a big change. The biggest differences, however, are the following two points:
* One, how the change metric is computed: it is the value of a loss function. Conceptually, the change metric is still something similar to our second example: it is some distance from some other reference distribution.
* Two, how the significance is estimated: here significance is simply when the cost function is smallest or falls below a threshold.
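To make the cost-function view concrete, here is a minimal sketch of optimization-based detection of a single change in mean, in plain Python. It is not the Changepoints.jl API and it is not PELT (which handles multiple change points efficiently); the names `segment_cost` and `best_single_changepoint`, the squared-error cost, and the penalty value are assumptions chosen for illustration.

```python
def segment_cost(x):
    """Sum of squared deviations from the segment mean (0.0 for an empty segment)."""
    if not x:
        return 0.0
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x)


def best_single_changepoint(x, penalty):
    """Find the split tau minimising cost(x[:tau]) + cost(x[tau:]) + penalty.
    Returns (None, cost(x)) when no split beats the no-change model."""
    best_tau, best_cost = None, segment_cost(x)
    for tau in range(1, len(x)):
        cost = segment_cost(x[:tau]) + segment_cost(x[tau:]) + penalty
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau, best_cost


# Hypothetical data: the mean jumps from 0 to 5 at index 30.
data = [0.0] * 30 + [5.0] * 30
tau, cost = best_single_changepoint(data, penalty=1.0)
print(tau, cost)
```

Here "significance" is decided by the penalty: a change point is reported only when splitting lowers the total cost by more than the penalty, in contrast to the surrogate test of step 3.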
Due to these considerations, we believe that we can merge the two approaches. Steps 1 and 2 of the current TransitionIndicators.jl pipeline stay as they are. Step 3 becomes modular and allows users to choose how to estimate significant change in the "change metric timeseries": either via surrogate testing or via minimization of a cost function.
Please let us know what you think of this plan and whether it seems sensible. Please also let us know whether you are still developing this Changepoints.jl repository, whether it is in pure maintenance mode, or whether it is completely archived and no new development effort will be spent here (unfortunately, this is not clear from the GitHub page).
cc @JanJereczek<https://github.com/JanJereczek>
Hi @Datseris, looks like an interesting project! From my point of view, a very important part of "change analysis" is continuous piecewise linear regression. For more information, see the following paper. The method presented in this paper is similar to the standard PELT method for estimating linear segments, but it is additionally forced to be continuous (no step changes between linear segments). See the code on GitLab, too. I am trying to collaborate with Jami Pekkanen to generalize this (extremely fast!) method so that it can work with user-defined fixed change points (in some user scenarios that possibility is very important) and with time-dependent estimation of the signal noise (sigma). But so far without any significant success.
Hi there @Dom-Owens-UoB and @michalkvasnicka ! First of all, thank you very much for the quick reply, I really appreciate it! To answer:
Sounds fair. My idea would be to use the code of Changepoints.jl as a "backend" in TimeseriesTransitions.jl (the new name the package will have). If it is possible to use Changepoints.jl directly as a dependency without much clutter in the API, I will go for this. If not, we will probably extract the core optimization loop code of the package, as making substantial changes to the API of Changepoints.jl does not seem wise at this point: the package is stable, so there is no reason to change it. In any case, the license will be copied as well.
No, and this is exactly the problem. As you correctly stated, change point detection is a topic large enough that someone can "work in". Nevertheless, in the last two years I have seen very clearly the existence of two communities that are separated from each other and have never really talked to each other. The first is change point detection, and the other is forecasting transitions. Yet I believe that these two communities are actually two sides of the same coin! They solve the "same problem", just dressed up differently according to the scientific question at hand. Indeed, as I have illustrated already to @JanJereczek, whether you "forecast" a transition or whether you "estimate its exact position" really depends on the "estimator" and "change metric" you choose. (Also, the two communities use different tools to estimate a significant change point: optimization on your end, surrogate testing on our end.)

I know that the approach of surrogate significance has been applied to change point detection. The typical example in the field where I come from (nonlinear dynamics) is to use the permutation entropy to detect a change in the dynamics of a timeseries. This uses as the "change metric" not a difference in statistical distributions in terms of means or stds, but rather a difference in how the points are sequenced. I am sure there are many papers that have done this, but in most cases the "change point detection" itself isn't the focus of the paper; the focus is rather the distinction of two dynamic regimes in the timeseries. Googling "permutation entropy change point" already gives many results, e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7513234/ (although that article in particular doesn't use surrogates for significance and goes the optimization route instead, I'm sure surrogate papers exist as well).
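For readers unfamiliar with permutation entropy, here is a minimal sketch in plain Python of how it can serve as a rolling indicator that reacts to how points are sequenced rather than to their mean or std. The function names, the pattern length `order=3`, the window width, and the test signal are all hypothetical choices, and the significance step (surrogates or optimization) is omitted.

```python
import math
import random
from collections import Counter


def permutation_entropy(x, order=3):
    """Shannon entropy of the ordinal patterns of length `order`,
    normalised by log(order!) so that the result lies in [0, 1]."""
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: x[i + k]))
        for i in range(len(x) - order + 1)
    )
    total = sum(patterns.values())
    h = sum((c / total) * math.log(total / c) for c in patterns.values())
    return h / math.log(math.factorial(order))


def rolling_permutation_entropy(x, width, order=3):
    """Permutation entropy over every full window of the given width."""
    return [permutation_entropy(x[i:i + width], order) for i in range(len(x) - width + 1)]


# Hypothetical signal: a smooth oscillation followed by white noise.
rng = random.Random(0)
smooth = [math.sin(0.1 * t) for t in range(200)]
noisy = [rng.gauss(0.0, 1.0) for _ in range(200)]

pe = rolling_permutation_entropy(smooth + noisy, width=100)
print("first window:", round(pe[0], 3), "last window:", round(pe[-1], 3))
```

The smooth segment is dominated by the monotonically rising and falling patterns, so its entropy is low, while the noisy segment visits all patterns and has an entropy close to 1.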
Wow, thanks for sharing! Actually, this kind of approach might be directly applicable to early warning signals, where your end goal is to estimate increases in the slopes of segments of the data! cc @JanJereczek this could go into your research on best estimators.
Closing this in favor of JuliaDynamics/TransitionsInTimeseries.jl#52. We have designed that package to be a general "find transitions/changes in timeseries" package. It defines a generic interface for doing so. Now PELT/Changepoints.jl can be part of this interface. Once JuliaDynamics/TransitionsInTimeseries.jl#52 is implemented, there will be a type corresponding to using Changepoints.jl to find transitions in timeseries. Currently the package has two other types that implement other analysis pipelines for finding transitions.
(oh yeah, also thanks again for the wonderful resources you guys shared!)