Dandi export and read #956

samuelbray32 · 2024-05-07T19:13:56Z

Description

Partial steps for #861

Step (3)
- makes new table DandiPath
  - tracks translations between analysis file name and renamed path in a dandiset
  - has method to stream data from dandi
- makes function DandiPath.compile_dandiset()
  - Uses dandi api functions to get dandi-compliant names for files
  - organizes and uploads files to dandi
  - populates DandiPath
- makes admin function Export.prepare_files_for_export()
  - Addresses common discrepancies between dandi standards and metadata in the nwb
  - ensures all analysis files in the export set have unique UUIDs (see "Uniqueness of object_ids in generated analysis nwb files", #953)
  - resolves resulting mismatch between database checksum and altered file
Step (4)
- adds new fallback to fetch_nwb which streams file from dandi if available
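
The Step (4) fallback can be sketched as a "local first, then stream" lookup. This is only an illustration of the control flow, not the code in this PR: `fetch_with_fallback` and the `stream_from_dandi` hook are hypothetical names, and the real `fetch_nwb` works on table entries and NWB objects rather than raw bytes.

```python
from pathlib import Path
from typing import Callable, Optional


def fetch_with_fallback(
    file_name: str,
    local_dir: Path,
    stream_from_dandi: Callable[[str], Optional[bytes]],
) -> bytes:
    """Return file contents, preferring local disk over a remote stream.

    `stream_from_dandi` is a hypothetical hook that returns the file's
    bytes, or None when the file is not available remotely.
    """
    local_path = local_dir / file_name
    if local_path.exists():
        # Fast path: the analysis file is already on local storage.
        return local_path.read_bytes()

    # Fallback: ask the streaming hook for the file.
    streamed = stream_from_dandi(file_name)
    if streamed is None:
        raise FileNotFoundError(
            f"{file_name} not found locally or via the remote fallback"
        )
    return streamed
```

Passing the streaming step in as a callable keeps the remote dependency optional, which is relevant to the question of whether dandi should be a hard dependency.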

Checklist:

This PR should be accompanied by a release: unsure
If release, I have updated the CITATION.cff
This PR makes edits to table definitions: no
N/A If table edits, I have included an alter snippet for release notes.
I have updated the CHANGELOG.md with PR number and description.
I have added/edited docs/notebooks to reflect the changes
Change the dandi URLs from the development server to the production server

edeno · 2024-05-08T00:18:32Z

I know this is still a WIP, but we should also decide whether to make dandi a dependency or an optional dependency.

CBroz1 · 2024-05-08T14:40:06Z

src/spyglass/common/common_dandi.py

+    definition = """
+    -> Export.File
+    ---
+    dandiset_id: str


Is the dandiset_id the same for all files in a given Export? If so, I might avoid some redundancy by having DandiPath be a master table with one part per Export.File

As written, the dandiset ID is common to everything in an export. However, I could see a future case where an analysis file is used in multiple papers and, rather than re-uploading a new copy of the file to dandi, you would point to its existing version in a different dandiset. I lean towards leaving the flexibility for this in the table structure.

How does dandi handle this case? Presumably, they wouldn't want to store two copies of the same data file across dandisets. But if the name is contextually determined by other files, the name would have to differ. I wouldn't be surprised if they had a cross-dandiset soft-link tool, but I don't know for sure.

It would have different names on Dandi, but they would share the same object_id for the file. I haven't tested if that blocks upload.
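
One way to read the flexibility argument above is as a lookup that stores a dandiset ID on every row, so that a file already uploaded for one paper can be pointed to instead of re-uploaded. The sketch below uses plain Python objects rather than DataJoint tables; `DandiPathRow`, its fields, and `resolve_row` are hypothetical stand-ins, not the table definition from this PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DandiPathRow:
    """Hypothetical stand-in for one translation entry."""

    analysis_file_name: str  # name of the file in the local database
    dandiset_id: str         # dandiset that holds the uploaded copy
    dandi_path: str          # renamed path inside that dandiset


def resolve_row(
    rows: List[DandiPathRow],
    analysis_file_name: str,
    new_dandiset_id: str,
    new_dandi_path: str,
) -> DandiPathRow:
    """Reuse an existing upload of the file if present, else record a new one."""
    for row in rows:
        if row.analysis_file_name == analysis_file_name:
            # The file already lives in some dandiset: point to it.
            return row
    row = DandiPathRow(analysis_file_name, new_dandiset_id, new_dandi_path)
    rows.append(row)
    return row
```

Because each row carries its own `dandiset_id`, two exports can reference the same file through different dandisets, which a single ID stored once per export could not express.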

CBroz1

Just took a quick first pass to keep up to speed

src/spyglass/common/common_usage.py

src/spyglass/common/common_dandi.py

src/spyglass/sharing/sharing_kachery.py

src/spyglass/utils/nwb_helper_fn.py

review-notebook-app · 2024-05-20T21:01:39Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

samuelbray32 · 2024-05-20T21:19:57Z

~~I'm leaving the functions pointing to the Dandi dev server for now until the other parts of the code are reviewed.~~

CBroz1

Good work, @samuelbray32. I had some comments, but I think the heaviest lifting would be migrating it all to common_dandi to revise the dependency order. I'm happy to help migrate to a more object-oriented approach so the helpers can share properties.

notebooks/py_scripts/05_Export.py

src/spyglass/common/common_dandi.py

CBroz1 · 2024-05-21T15:56:02Z

src/spyglass/common/common_usage.py

+        # make a temp dir with symbolic links to the export files
+        source_files = (self.File() & key).fetch("file_path")
+        paper_dir = f"{export_dir}/{paper_id}"
+        os.makedirs(paper_dir, exist_ok=True)


If replacing os with pathlib, you can call pathlib_dir.mkdir(parents=True, exist_ok=True)
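
A minimal sketch of the pathlib variant, assuming the goal stated in the diff comment (a directory of symbolic links to the export files). The function name `link_export_files` and its arguments are hypothetical; only `Path.mkdir`, `Path.symlink_to`, and related `pathlib` calls are standard library.

```python
import tempfile
from pathlib import Path


def link_export_files(source_files, export_dir, paper_id):
    """Create <export_dir>/<paper_id> and symlink each source file into it."""
    paper_dir = Path(export_dir) / paper_id
    # parents=True creates missing ancestors; exist_ok=True tolerates reruns,
    # matching os.makedirs(paper_dir, exist_ok=True).
    paper_dir.mkdir(parents=True, exist_ok=True)

    links = []
    for source in map(Path, source_files):
        link = paper_dir / source.name
        # Skip links left over from a previous run instead of failing.
        if not link.is_symlink() and not link.exists():
            link.symlink_to(source.resolve())
        links.append(link)
    return links


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "example.nwb"
        src.write_text("placeholder")
        print(link_export_files([src], Path(tmp) / "export", "paper_1"))
```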

src/spyglass/common/common_usage.py

CBroz1 · 2024-05-21T16:02:51Z

src/spyglass/common/common_usage.py

+        translations = [
+            {
+                **(
+                    self.File() & key & f"file_path LIKE '%{t['filename']}'"


I think my file_like helper would work here.

samuelbray32 · 2024-05-21T17:14:28Z

Thanks @CBroz1. I can work on moving the functionality into common_dandi. Regarding the other comment, dandi is already a dependency of pynwb, so it shouldn't cause issues on import.

rly · 2024-05-22T20:44:42Z

FYI dandi is not a dependency of pynwb.

CBroz1

I had a couple quick comments. I can do a full review later today

src/spyglass/utils/dj_helper_fn.py

notebooks/py_scripts/05_Export.py

src/spyglass/common/common_dandi.py

CBroz1

Requesting some minor changes

src/spyglass/common/common_dandi.py

CBroz1 · 2024-06-03T17:31:18Z

src/spyglass/common/common_dandi.py

+
+
+def validate_dandiset(
+    folder, min_severity="ERROR", ignore_external_files=False


I recently read a recommendation to default min_severity to whatever the lowest level is, such as a warning level. The theory is that seeing warnings early makes errors easier to deal with when they arise. Do you anticipate warnings at this stage that should be hidden from the user?

The primary use of this function is to check for errors that would prevent upload to Dandi, so the default min_severity is set to the level that would do so.

That makes sense; we can keep it as is. If we can find a way to display warnings without interrupting validation, I think that would make maintenance easier: surfacing suggestions over time might push users to address warnings before they potentially become errors in future dandi versions.

src/spyglass/common/common_dandi.py

CBroz1 · 2024-06-03T17:32:36Z

src/spyglass/common/common_dandi.py

+        filtered_results = [
+            i
+            for i in filtered_results
+            if not i.message.startswith("Path is not inside")


This is how I would handle warnings we expect to see given our setup

Sorry, I'm unsure what the recommendation is?

This comment referred to the suggestion above to relax min_severity and then silence the items we expect, using the same list-comprehension approach.
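
The two ideas in these threads (lower the severity threshold, then silence known messages) can be combined so that warnings are shown without blocking on them. The sketch below is self-contained and does not call the dandi validator: `ValidationResult`, `SEVERITY_ORDER`, and `split_results` are hypothetical stand-ins, and the severity names and their ordering are an assumption. Only the "Path is not inside" prefix comes from the diff above.

```python
from dataclasses import dataclass

# Assumed ordering from least to most severe; not taken from the dandi API.
SEVERITY_ORDER = ["HINT", "WARNING", "ERROR"]

# Message prefixes we expect given our setup and therefore silence.
EXPECTED_PREFIXES = ("Path is not inside",)


@dataclass
class ValidationResult:
    """Hypothetical stand-in for one validator result."""

    severity: str
    message: str


def split_results(results, min_severity="WARNING"):
    """Drop expected messages, then split the rest into warnings and errors."""
    threshold = SEVERITY_ORDER.index(min_severity)
    warnings, errors = [], []
    for result in results:
        if result.message.startswith(EXPECTED_PREFIXES):
            continue  # known and harmless for our layout
        if SEVERITY_ORDER.index(result.severity) < threshold:
            continue  # below the reporting threshold
        if result.severity == "ERROR":
            errors.append(result)
        else:
            warnings.append(result)
    return warnings, errors
```

A caller could log the returned warnings and raise only when the error list is non-empty, which keeps the current "errors block upload" behavior while still surfacing warnings.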

src/spyglass/utils/dj_helper_fn.py

src/spyglass/utils/nwb_helper_fn.py

Co-authored-by: Chris Broz <[email protected]>

CBroz1

Responded to some comments re dandi warnings. Approved with or without addressing

* Give UUID to artifact interval
* Add ability to set smoothing sigma in get_firing_rate (#994)
  * add option to set spike smoothing sigma
  * update changelog
* Add docstrings to SortedSpikesGroup and Decoding methods (#996)
  * Add docstrings
  * update changelog
  * fix spelling
  ---------
  Co-authored-by: Samuel Bray <[email protected]>
* Add Common Errors doc (#997)
  * Add Common Errors
  * Update changelog
* Mua notebook (#998)
  * documented some of mua notebook
  * mua notebook documented
  * documented some of mua notebook
  * synced py script
* Dandi export and read (#956)
  * compile exported files, download dandiset, and organize
  * add function to translate files into dandi-compatible names
  * add table to store dandi name translation and steps to populate
  * add dandiset validation
  * add function to fetch nwb from dandi
  * add function to change obj_id of nwb_file
  * add dandi upload call and fix circular import
  * debug dandi file streaming
  * fix circular import
  * resolve dandi-streamed files with fetch_nwb
  * implement review comments
  * add admin tools to fix common dandi discrepencies
  * implement tool to cleanup common dandi errors
  * add dandi export to tutorial
  * fix linting
  * update changelog
  * fix spelling
  * style changes from review
  * reorganize function locations
  * fix circular import
  * make dandi dependency optional in imports
  * store dandi instance of data in DandiPath
  * resolve case of pre-existing dandi entries for export
  * cleanup bugs from refactor
  * update notebook
  * Apply suggestions from code review
    Co-authored-by: Chris Broz <[email protected]>
  * add requested changes from review
  * make method check_admin_privilege in LabMember
  ---------
  Co-authored-by: Chris Broz <[email protected]>
* Minor fixes (#999)
  * give analysis nwb new uuid when created
  * fix function argument
  * update changelog
* Fix bug in change in analysis_file object_id (#1004)
  * fix bug in change in analysis_file_object_id
  * update changelog
* Remove classes for usused tables (#1003)
  * #976
  * Remove notebook reference
* Non-daemon parallel populate (#1001)
  * initial non daemon parallel commit
  * resolve namespace and pickling errors
  * fix linting
  * update changelog
  * implement review comments
  * add parallel_make flag to spikesorting recording tables
  * fix multiprocessing spawn error on mac
  * move propert
  ---------
  Co-authored-by: Samuel Bray <[email protected]>
* Update pipeline column for IntervalList

---------
Co-authored-by: Samuel Bray <[email protected]>
Co-authored-by: Samuel Bray <[email protected]>
Co-authored-by: Chris Broz <[email protected]>
Co-authored-by: Denisse Morales-Rodriguez <[email protected]>
Co-authored-by: Samuel Bray <[email protected]>

samuelbray32 added 10 commits May 3, 2024 12:30

compile exported files, download dandiset, and organize

8808041

add function to translate files into dandi-compatible names

f57287a

add table to store dandi name translation and steps to populate

717e04f

add dandiset validation

675d929

add function to fetch nwb from dandi

1e7491b

add function to change obj_id of nwb_file

58d6411

add dandi upload call and fix circular import

8aa9242

debug dandi file streaming

463a1d6

fix circular import

ff7b5c6

resolve dandi-streamed files with fetch_nwb

204859d

CBroz1 reviewed May 8, 2024

samuelbray32 added 7 commits May 20, 2024 09:02

Merge remote-tracking branch 'origin/master' into dandi_export

b641642

implement review comments

94ae801

add admin tools to fix common dandi discrepencies

49c493b

implement tool to cleanup common dandi errors

9cb11a3

add dandi export to tutorial

6a7e486

fix linting

530b128

update changelog

c8f4200

samuelbray32 added 3 commits May 20, 2024 14:08

Merge branch 'master' into dandi_export

00d1364

fix spelling

73a3f97

Merge branch 'dandi_export' of https://github.com/LorenFrankLab/spyglass into dandi_export

aba687c

samuelbray32 marked this pull request as ready for review May 20, 2024 21:17

edeno requested a review from CBroz1 May 20, 2024 21:18

CBroz1 requested changes May 21, 2024

samuelbray32 added 10 commits May 29, 2024 11:03

style changes from review

d9bf04e

reorganize function locations

21bbd3f

fix circular import

9fbb80c

make dandi dependency optional in imports

45f8d3e

store dandi instance of data in DandiPath

dba04df

resolve case of pre-existing dandi entries for export

f716ea4

cleanup bugs from refactor

5727528

update notebook

252f5f4

Merge remote-tracking branch 'origin/master' into dandi_export

4acf2e8

Merge branch 'master' into dandi_export

c693a93

samuelbray32 requested a review from CBroz1 May 30, 2024 15:30

samuelbray32 mentioned this pull request May 31, 2024

Insert session from Dandi #992

Open

CBroz1 reviewed Jun 3, 2024

src/spyglass/utils/dj_helper_fn.py (outdated)

notebooks/py_scripts/05_Export.py (outdated)

src/spyglass/common/common_dandi.py (outdated)

CBroz1 requested changes Jun 3, 2024

samuelbray32 and others added 3 commits June 3, 2024 12:52

Apply suggestions from code review

3d02e5f

Co-authored-by: Chris Broz <[email protected]>

add requested changes from review

5fdaba4

make method check_admin_privilege in LabMember

d802921

samuelbray32 mentioned this pull request Jun 3, 2024

Resolve case of shared file use in multiple Dandisets #995

Open

samuelbray32 requested a review from CBroz1 June 4, 2024 16:21

CBroz1 approved these changes Jun 4, 2024

edeno merged commit 3e7d35a into master Jun 6, 2024
7 checks passed

edeno deleted the dandi_export branch June 6, 2024 18:30

samuelbray32 mentioned this pull request Jun 6, 2024

Minor fixes #999

Merged

7 tasks
