Edited installs impact data provenance #1087

Open
CBroz1 opened this issue Sep 5, 2024 · 1 comment
Labels
Database Issues with Frank Lab database, not Spyglass code

Comments

@CBroz1
Member

CBroz1 commented Sep 5, 2024

A user's edited fork of Spyglass has been used to process data in common_ephys.LFPBand here. The edit affected the following files, and perhaps others:

  • eliot20221017.nwb
  • eliot20221020.nwb
  • eliot20221024.nwb

In the short term, the data should be edited to reflect the pipeline without this edit.

In the long term, we need norms that dictate how one can and cannot edit Spyglass, so that we maintain confidence in data provenance.

@CBroz1 CBroz1 added the Database Issues with Frank Lab database, not Spyglass code label Sep 5, 2024
@LorenFrankLab LorenFrankLab deleted a comment Sep 5, 2024
@rly
Collaborator

rly commented Sep 6, 2024

You could check whether a user is running an editable install of spyglass:

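from importlib_metadata import Distribution
# origin is None when the package has no direct_url.json (e.g., a regular install from PyPI), so guard both lookups
getattr(getattr(Distribution.from_name("spyglass-neuro").origin, "dir_info", None), "editable", False)  # returns True for editable install
getattr(getattr(Distribution.from_name("numpy").origin, "dir_info", None), "editable", False)  # returns False for not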

and has a dirty git environment:

git status --untracked-files=no --porcelain  # prints nothing if there are no changes besides untracked files

or is not on the master branch:

git rev-parse --abbrev-ref HEAD  # returns just the name of the branch

(Or is not on the latest commit.)

and prevent or warn about making changes to the database if so. (There might be easier ways but those are what I found.)

But as @edeno mentions in #439, this would not check whether a user has modified any of Spyglass's dependencies (e.g., spikeinterface), which would also affect the integrity and provenance of the added data. Still, it might catch the most likely, accidental changes...
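
For illustration, the checks above could be combined into one helper along these lines. This is only a rough sketch, not Spyglass code: the function names, the `repo_dir` argument, and the wording of the messages are all hypothetical. It reads the `direct_url.json` file that pip records for direct and editable installs through the standard-library `importlib.metadata`, and it assumes `git` is on the PATH and that `repo_dir` points at the checkout behind the editable install.

```python
import json
import subprocess
from importlib.metadata import PackageNotFoundError, distribution
from typing import List, Optional


def editable_from_direct_url(raw: Optional[str]) -> bool:
    """Interpret the contents of a direct_url.json file (None if absent)."""
    if not raw:
        return False
    return bool(json.loads(raw).get("dir_info", {}).get("editable", False))


def is_editable_install(dist_name: str) -> bool:
    """Return True if dist_name was recorded as an editable install."""
    try:
        raw = distribution(dist_name).read_text("direct_url.json")
    except PackageNotFoundError:
        return False
    return editable_from_direct_url(raw)


def _git(repo_dir: str, *args: str) -> str:
    """Run a git command inside repo_dir and return its stripped stdout."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def provenance_warnings(
    dist_name: str, repo_dir: str, expected_branch: str = "master"
) -> List[str]:
    """Collect warnings about an editable install (hypothetical helper)."""
    if not is_editable_install(dist_name):
        return []  # not editable: nothing further to check here
    warnings = [f"{dist_name} is an editable install"]
    # Any output means tracked files have uncommitted changes.
    if _git(repo_dir, "status", "--untracked-files=no", "--porcelain"):
        warnings.append("working tree has uncommitted changes to tracked files")
    branch = _git(repo_dir, "rev-parse", "--abbrev-ref", "HEAD")
    if branch != expected_branch:
        warnings.append(f"on branch {branch!r}, expected {expected_branch!r}")
    return warnings
```

A caller could run something like `provenance_warnings("spyglass-neuro", "/path/to/spyglass")` before an insert and warn or refuse if the list is non-empty. As noted above, this says nothing about edits to dependencies such as spikeinterface.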
