Change timeseries data checks to warn instead of error on read #1793

Merged
merged 2 commits into from
Nov 30, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
Expand Up @@ -6,6 +6,7 @@
- For `NWBHDF5IO()`, change the default of arg `load_namespaces` from `False` to `True`. @bendichter [#1748](https://github.com/NeurodataWithoutBorders/pynwb/pull/1748)
- Add `NWBHDF5IO.can_read()`. @bendichter [#1703](https://github.com/NeurodataWithoutBorders/pynwb/pull/1703)
- Add `pynwb.get_nwbfile_version()`. @bendichter [#1703](https://github.com/NeurodataWithoutBorders/pynwb/pull/1703)
- Updated timeseries data checks to warn instead of error when reading invalid files. @stephprince [#1793](https://github.com/NeurodataWithoutBorders/pynwb/pull/1793)

## PyNWB 2.5.0 (August 18, 2023)

12 changes: 10 additions & 2 deletions src/pynwb/base.py
Expand Up @@ -174,15 +174,23 @@ def __init__(self, **kwargs):
         timestamps = args_to_process['timestamps']
         if timestamps is not None:
             if self.rate is not None:
-                raise ValueError('Specifying rate and timestamps is not supported.')
+                self._error_on_new_warn_on_construct(
+                    error_msg='Specifying rate and timestamps is not supported.'
+                )
             if self.starting_time is not None:
-                raise ValueError('Specifying starting_time and timestamps is not supported.')
+                self._error_on_new_warn_on_construct(
+                    error_msg='Specifying starting_time and timestamps is not supported.'
+                )
             self.fields['timestamps'] = timestamps
             self.timestamps_unit = self.__time_unit
             self.interval = 1
             if isinstance(timestamps, TimeSeries):
                 timestamps.__add_link('timestamp_link', self)
         elif self.rate is not None:
+            if self.rate <= 0:
+                self._error_on_new_warn_on_construct(
+                    error_msg='Rate must be a positive value.'
+                )
             if self.starting_time is None:  # override default if rate is provided but not starting time
                 self.starting_time = 0.0
             self.starting_time_unit = self.__time_unit
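
For readers unfamiliar with the pattern used in this hunk, here is a minimal sketch of "error on new, warn on construct" behavior. It is not the PyNWB implementation: `SketchContainer` and its `in_construct_mode` constructor argument are hypothetical stand-ins, and the body of `_error_on_new_warn_on_construct` is an assumption based on the method name and the tests in this PR.

```python
import warnings


class SketchContainer:
    """Hypothetical stand-in for a container class; not PyNWB code."""

    def __init__(self, rate, in_construct_mode=False):
        # Assumption: a boolean flag marks objects being built from a file on read.
        self._in_construct_mode = in_construct_mode
        self.rate = rate
        if rate is not None and rate <= 0:
            self._error_on_new_warn_on_construct(
                error_msg='Rate must be a positive value.'
            )

    def _error_on_new_warn_on_construct(self, error_msg):
        # Newly created objects fail fast; objects built while reading a file
        # only warn, so files that already hold invalid values stay readable.
        if not self._in_construct_mode:
            raise ValueError(error_msg)
        warnings.warn(error_msg, UserWarning)
```

Under these assumptions, `SketchContainer(rate=-1.0)` raises `ValueError`, while `SketchContainer(rate=-1.0, in_construct_mode=True)` emits a `UserWarning` and still returns the object.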
57 changes: 57 additions & 0 deletions tests/unit/test_base.py
Expand Up @@ -405,6 +405,63 @@ def test_get_data_in_units(self):
         ts = mock_TimeSeries(data=[1., 2., 3.])
         assert_array_equal(ts.get_data_in_units(), [1., 2., 3.])

+    def test_non_positive_rate(self):
+        with self.assertRaisesWith(ValueError, 'Rate must be a positive value.'):
+            TimeSeries(name='test_ts', data=list(), unit='volts', rate=-1.0)
+        with self.assertRaisesWith(ValueError, 'Rate must be a positive value.'):
+            TimeSeries(name='test_ts1', data=list(), unit='volts', rate=0.0)
Comment on lines +411 to +412
Collaborator

@rly @stephprince Hey there

Does this mean that on next release of PyNWB it will be impossible to specify any TimeSeries with a rate of zero, even if it only has a single element on the first axis?

If so, that's a fairly big break in backward compatibility. Over the years we've recommended that approach to several people for statically generated microscopy images, since it is the closest available data type for their setup (the static images module lacks the associated metadata, and the ophys module has no specific static image types)

Not that it's a bad thing overall, but there should be a bigger changelog note about it IMO, along with some update about what we recommend to such users as an alternative

Contributor

Thanks, I was not aware that folks were using the Ophys modules this way. Could you clarify why rate matters when there is only a single image? Wouldn't you just use an explicit timestamp array in that case? I think it is fine to change the check to only check for "self.rate < 0" instead of "self.rate <= 0" to avoid the breaking case you mentioned (and maybe have a warning for "rate == 0").

Collaborator

I guess the difference with that proposal is then how to treat NaN values for timestamps

For static images in this case, it might not matter when it was acquired w.r.t. any other data streams in the file, so the starting_time could be set to NaN while the rate is set to a true value of 0.0

It's true that an equivalent representation could use timestamps, but then it would be an array of one element, that element being NaN. We'd then need checks elsewhere to ensure no more than one NaN is set in other instances, such as a series with timestamps=[NaN, NaN, NaN, ...], which would make less sense, right?

In all cases, this is really because there's a lack of an appropriate neurodata object, so maybe I'll just kick this over to ndx-microscopy to resolve in the best way

I do think we shouldn't break back-compatibility for the exact 0.0 case for the time being however, just so there's no unintended disruption

Contributor

In this case I would suggest the following:

  • Change the new check to raise an error only for self.rate < 0
  • Optionally add a warning for self.rate == 0
  • Wait for ndx-microscopy to address the missing data type
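
A rough sketch of what the first two suggestions could look like, in case it helps the follow-up PR. `check_rate` is a hypothetical standalone helper and both message strings are placeholders; the actual change would presumably go through the existing `_error_on_new_warn_on_construct` path in `TimeSeries.__init__` rather than raising directly.

```python
import warnings


def check_rate(rate):
    """Hypothetical helper illustrating the proposed rate check; not PyNWB code."""
    if rate is None:
        return
    if rate < 0:
        # Negative rates remain an error (placeholder message).
        raise ValueError('Rate must not be a negative value.')
    if rate == 0.0:
        # A rate of exactly 0.0 is tolerated, with a warning, so existing
        # single-frame use cases keep working (placeholder message).
        warnings.warn('Rate is 0.0; expected a positive value.', UserWarning)
```

With this shape, `check_rate(0.0)` only warns, so the single-image use case described above would keep working, while `check_rate(-1.0)` still raises.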

Contributor

@stephprince could you create a PR to make those changes?

Contributor Author

Yes, I can do that.


+    def test_file_with_non_positive_rate_in_construct_mode(self):
+        """Test that UserWarning is raised when rate is 0 or negative
+        while being in construct mode (i.e., on data read)."""
+        obj = TimeSeries.__new__(TimeSeries,
+                                 container_source=None,
+                                 parent=None,
+                                 object_id="test",
+                                 in_construct_mode=True)
+        with self.assertWarnsWith(warn_type=UserWarning, exc_msg='Rate must be a positive value.'):
+            obj.__init__(
+                name="test_ts",
+                data=list(),
+                unit="volts",
+                rate=-1.0
+            )
+
+    def test_file_with_rate_and_timestamps_in_construct_mode(self):
+        """Test that UserWarning is raised when rate and timestamps are both specified
+        while being in construct mode (i.e., on data read)."""
+        obj = TimeSeries.__new__(TimeSeries,
+                                 container_source=None,
+                                 parent=None,
+                                 object_id="test",
+                                 in_construct_mode=True)
+        with self.assertWarnsWith(warn_type=UserWarning, exc_msg='Specifying rate and timestamps is not supported.'):
+            obj.__init__(
+                name="test_ts",
+                data=[11, 12, 13, 14, 15],
+                unit="volts",
+                rate=1.0,
+                timestamps=[1, 2, 3, 4, 5]
+            )
+
+    def test_file_with_starting_time_and_timestamps_in_construct_mode(self):
+        """Test that UserWarning is raised when starting_time and timestamps are both specified
+        while being in construct mode (i.e., on data read)."""
+        obj = TimeSeries.__new__(TimeSeries,
+                                 container_source=None,
+                                 parent=None,
+                                 object_id="test",
+                                 in_construct_mode=True)
+        with self.assertWarnsWith(warn_type=UserWarning,
+                                  exc_msg='Specifying starting_time and timestamps is not supported.'):
+            obj.__init__(
+                name="test_ts",
+                data=[11, 12, 13, 14, 15],
+                unit="volts",
+                starting_time=1.0,
+                timestamps=[1, 2, 3, 4, 5]
+            )
+

 class TestImage(TestCase):
     def test_init(self):