Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new feature peak-to-peak variation #332

Merged
merged 7 commits into from
Apr 5, 2024
Merged

new feature peak-to-peak variation #332

merged 7 commits into from
Apr 5, 2024

Conversation

GaluTi
Copy link
Contributor

@GaluTi GaluTi commented Mar 4, 2024

Realisation of the new feature: peak-to-peak variation
issue: https://github.com/light-curve/light-curve-feature/issues/104

Copy link

codspeed-hq bot commented Mar 4, 2024

CodSpeed Performance Report

Merging #332 will not alter performance

Comparing ptp_var (2e73538) with master (5044a27)

Summary

✅ 97 untouched benchmarks

Copy link
Member

@hombit hombit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase on master.

It is not clear what is the source of the desired values for random samples and sine. Could you please some comments to the code about this? For me it looks like this value should not converge with large n, could you explain why it is not true?

One more thing about the desired value. In the docs you say that threshold should be 0.1-0.15 and for non-variable data the value should be close to zero. Could you please justify these values in one or few tests, maybe modifying existing tests.

n = 100
t = np.arange(n)
m = np.ones_like(t)
sigma = 0.1 * np.ones_like(t)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sigma = 0.1 * np.ones_like(t)
sigma = np.full_like(t, 0.1)

sigma = 0.1 * np.ones_like(t)
actual = feature(t, m, sigma)
desired = 0.8
assert_allclose(actual, desired, rtol=3 / np.sqrt(n))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use rtol=3 / np.sqrt(n), I see no random number used...

rng = np.random.default_rng(0)
n = 100
t = np.linspace(0, 1, n)
m = np.abs(rng.normal(0, 1, n))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use np.abs here?

n = 100
t = np.linspace(0, 1, n)
m = np.abs(rng.normal(0, 1, n))
sigma = 0.2 * np.ones_like(t)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sigma = 0.2 * np.ones_like(t)
sigma = np.full_like(t, 0.2)

sigma = 0.2 * np.ones_like(t)
feature = PeakToPeakVar()
actual = feature(t, m, sigma)
desired = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we expect 1 here?

assert_allclose(actual, desired, rtol=3 / np.sqrt(n))


def test_ptpvar_norm_data_2():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous function

sigma = np.ones_like(t)
feature = PeakToPeakVar()
actual = feature(t, m, sigma)
desired = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we expect 1 here?

\frac{(m_i - \sigma_i)_\text{max} - (m_i + \sigma_i)_\text{min}}{(m_i - \sigma_i)_\text{max} + (m_i + \sigma_i)_\text{min}}
$$
For non-variable data, it should be close to zero.
If data is close to be variable, the index should be more or equal than 0.10-0.15.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "close to be variable" mean? Could you please clarify text here? Maybe we should say something like "commonly used variability threshold lies in a range of 0.1-0.15"?

$$
For non-variable data, it should be close to zero.
If data is close to be variable, the index should be more or equal than 0.10-0.15.
It is sensitive to magnitude of error values and can be negative in overestimated errors case.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we need this sentence here

Copy link

codecov bot commented Mar 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.04%. Comparing base (5044a27) to head (2e73538).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #332   +/-   ##
=======================================
  Coverage   85.04%   85.04%           
=======================================
  Files           9        9           
  Lines        2668     2668           
=======================================
  Hits         2269     2269           
  Misses        399      399           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@hombit hombit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Please see few more comments on testing

from light_curve.light_curve_py import PeakToPeakVar


def test_ptpvar():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't really understand what this test tests. Should it work for any statistical feature, not this one only?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need a test to make clear what "0.1-0.15" refers to

@hombit hombit enabled auto-merge (squash) April 5, 2024 16:43
@hombit hombit merged commit 670d374 into master Apr 5, 2024
79 checks passed
@hombit hombit deleted the ptp_var branch April 5, 2024 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants