Dandi export and read #956
Conversation
I know this is still a WIP, but we should also decide whether to make dandi a dependency or an optional dependency. |
src/spyglass/common/common_dandi.py
Outdated
```python
definition = """
-> Export.File
---
dandiset_id: str
"""
```
Is the `dandiset_id` the same for all files in a given `Export`? If so, I might avoid some redundancy by having `DandiPath` be a master table with one part per `Export.File`.
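The master/part arrangement suggested here could be sketched as below. The table and attribute names are assumptions for illustration (the real definitions live in `common_dandi.py`); only plain DataJoint definition strings are shown, with no database connection.

```python
# Sketch of the suggested layout (names are assumptions): dandiset_id is
# stored once on the master, and each exported file becomes one part row.
# These are plain DataJoint definition strings; nothing connects to a database.
dandi_path_master = """
-> Export
---
dandiset_id: varchar(16)
"""

dandi_path_part = """
-> master
-> Export.File
---
dandi_path: varchar(255)
"""

# The master holds the shared dandiset_id; parts hold per-file paths.
print("dandiset_id" in dandi_path_master)  # True
```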
As written, the dandiset ID is common for everything in the export. However, I could see a future case where an analysis file is used in multiple papers; rather than re-uploading a new copy of the file to dandi, you would point to its existing version in a different dandiset. I might lean toward leaving the flexibility for this in the table structure.
How does dandi handle this case? Presumably, they wouldn't want to store two copies of the same data file across dandisets. But if the name is contextually determined by other files, the name would have to differ. I wouldn't be surprised if they had a cross-dandiset soft-link tool, but I don't know for sure.
It would have different names on Dandi, but they would share the same object_id for the file. I haven't tested if that blocks upload.
Just took a quick first pass to keep up to speed
Good work, @samuelbray32. I had some comments, but I think the heaviest lifting would be migrating it all to `common_dandi` to revise the dependency order. I'm happy to help migrate to a more object-oriented approach so the helpers can share properties.
src/spyglass/common/common_usage.py
Outdated
```python
# make a temp dir with symbolic links to the export files
source_files = (self.File() & key).fetch("file_path")
paper_dir = f"{export_dir}/{paper_id}"
os.makedirs(paper_dir, exist_ok=True)
```
If replacing `os` with `pathlib`, you can use `pathlib_dir.mkdir(parents=True, exist_ok=True)`.
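A minimal sketch of the `pathlib` equivalent of the snippet above; the directory and file names here are placeholders, not the real export paths.

```python
from pathlib import Path
import tempfile

# Placeholder values standing in for export_dir / paper_id from the snippet above
export_dir = Path(tempfile.mkdtemp())
paper_id = "paper1"

# pathlib equivalent of os.makedirs(paper_dir, exist_ok=True)
paper_dir = export_dir / paper_id
paper_dir.mkdir(parents=True, exist_ok=True)

# symbolic links into the temp dir can then be made with Path.symlink_to
source = export_dir / "data.nwb"
source.touch()
link = paper_dir / source.name
link.symlink_to(source)

print(paper_dir.is_dir(), link.is_symlink())  # True True
```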
src/spyglass/common/common_usage.py
Outdated
```python
translations = [
    {
        **(
            self.File() & key & f"file_path LIKE '%{t['filename']}'"
```
My `file_like` helper would work here, I think.
Thanks @CBroz1. I can work on moving the functionality into `dandi_common`. For another common comment,
FYI
I had a couple quick comments. I can do a full review later today
Requesting some minor changes
```python
def validate_dandiset(
    folder, min_severity="ERROR", ignore_external_files=False
):
```
I read a recommendation recently that suggested setting default `min_severity` values to whatever the lowest is, like `warn`, the theory being that seeing warnings makes it easier to deal with the errors when they arise. Do you anticipate warnings at this stage that should be hidden from the user?
The primary use of this function is to check for errors that would prevent upload to Dandi, so the default `min_severity` is set to the level that would do so.
That makes sense - we can keep as is. If we can find a way to display warnings without interrupting validation, I think that would make maintenance easier - suggestions over time might push users to address warnings before they potentially become errors in future dandi versions
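One way to surface warnings without interrupting validation is sketched below with a generic result object. The attribute names `severity` and `message` mirror the validation results discussed here, but the function itself and its names are assumptions for illustration.

```python
import logging
from collections import namedtuple

logger = logging.getLogger("dandi_validation")

# Generic stand-in for a validation result (assumed attribute names)
Result = namedtuple("Result", "severity message")


def report_results(results, blocking=("ERROR",)):
    """Log sub-error findings as warnings; raise only for blocking severities."""
    errors = [r for r in results if r.severity in blocking]
    for r in results:
        if r not in errors:
            logger.warning("%s: %s", r.severity, r.message)  # visible, not fatal
    if errors:
        raise ValueError(f"{len(errors)} finding(s) would block Dandi upload")


report_results([Result("WARNING", "minor metadata issue")])  # logged, no raise
try:
    report_results([Result("ERROR", "missing subject")])
    blocked = False
except ValueError:
    blocked = True
print(blocked)  # True
```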
```python
filtered_results = [
    i
    for i in filtered_results
    if not i.message.startswith("Path is not inside")
]
```
This is how I would handle warnings we expect to see given our setup
Sorry, unsure what the recommendation is?
This comment was in reference to the above suggestion to relax `min_severity`, and then silence items we expect via the same list comprehension.
Responded to some comments re dandi warnings. Approved with or without addressing
* Give UUID to artifact interval
* Add ability to set smoothing sigma in get_firing_rate (#994)
  * add option to set spike smoothing sigma
  * update changelog
* Add docstrings to SortedSpikesGroup and Decoding methods (#996)
  * Add docstrings
  * update changelog
  * fix spelling
  * Co-authored-by: Samuel Bray <[email protected]>
* Add Common Errors doc (#997)
  * Add Common Errors
  * Update changelog
* Mua notebook (#998)
  * documented some of mua notebook
  * mua notebook documented
  * documented some of mua notebook
  * synced py script
* Dandi export and read (#956)
  * compile exported files, download dandiset, and organize
  * add function to translate files into dandi-compatible names
  * add table to store dandi name translation and steps to populate
  * add dandiset validation
  * add function to fetch nwb from dandi
  * add function to change obj_id of nwb_file
  * add dandi upload call and fix circular import
  * debug dandi file streaming
  * fix circular import
  * resolve dandi-streamed files with fetch_nwb
  * implement review comments
  * add admin tools to fix common dandi discrepancies
  * implement tool to cleanup common dandi errors
  * add dandi export to tutorial
  * fix linting
  * update changelog
  * fix spelling
  * style changes from review
  * reorganize function locations
  * fix circular import
  * make dandi dependency optional in imports
  * store dandi instance of data in DandiPath
  * resolve case of pre-existing dandi entries for export
  * cleanup bugs from refactor
  * update notebook
  * Apply suggestions from code review (Co-authored-by: Chris Broz <[email protected]>)
  * add requested changes from review
  * make method check_admin_privilege in LabMember
  * Co-authored-by: Chris Broz <[email protected]>
* Minor fixes (#999)
  * give analysis nwb new uuid when created
  * fix function argument
  * update changelog
* Fix bug in change in analysis_file object_id (#1004)
  * fix bug in change in analysis_file_object_id
  * update changelog
* Remove classes for unused tables (#1003)
  * #976
  * Remove notebook reference
* Non-daemon parallel populate (#1001)
  * initial non daemon parallel commit
  * resolve namespace and pickling errors
  * fix linting
  * update changelog
  * implement review comments
  * add parallel_make flag to spikesorting recording tables
  * fix multiprocessing spawn error on mac
  * move propert
  * Co-authored-by: Samuel Bray <[email protected]>
* Update pipeline column for IntervalList

Co-authored-by: Samuel Bray <[email protected]>
Co-authored-by: Samuel Bray <[email protected]>
Co-authored-by: Chris Broz <[email protected]>
Co-authored-by: Denisse Morales-Rodriguez <[email protected]>
Co-authored-by: Samuel Bray <[email protected]>
Description

Partial steps for #861

Step (3):
* `DandiPath`
* `DandiPath.compile_dandiset()`
* `DandiPath`
* `Export.prepare_files_for_export()`

Step (4):
* `fetch_nwb`, which streams file from dandi if available

Checklist:
* `CITATION.cff`
* `alter` snippet for release notes.
* `CHANGELOG.md` with PR number and description.