Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

file fails validation #1031

Closed
bendichter opened this issue Jun 16, 2022 · 9 comments
Closed

file fails validation #1031

bendichter opened this issue Jun 16, 2022 · 9 comments
Assignees

Comments

@bendichter
Copy link
Member

bendichter commented Jun 16, 2022

I am trying to upload this file
ecephys_tutorial.nwb.zip

but when I run dandi upload I get

(base) MacBook-Pro-5:000121 bendichter$ dandi upload
2022-06-16 02:18:20,386 [    INFO] Found 2 files to consider
PATH                        SIZE     ERRORS UPLOAD STATUS           MESSAGE                  
dandiset.yaml               4.5 kB                 skipped          should be edited online  
sub-001/sub-001_ecephys.nwb 204.1 kB               pre-validating                            
Summary:                    208.7 kB               1 skipped        1 should be edited online
PATH                        SIZE        ERRORS     UPLOAD STATUS           MESSAGE                  
dandiset.yaml               4.5 kB                        skipped          should be edited online  ecosub-001/sub-001_ecephys.nwb 204.1 kB       1              skipped          failed validation        
Summary:                    208.7 kB 1 with errors        2 skipped        1 should be edited online
                                                                           1 failed validation      
2022-06-16 02:18:20,579 [    INFO] Logs saved in /Users/bendichter/Library/Logs/dandi-cli/20220616061815Z-87466.log
(base) MacBook-Pro-5:000121 bendichter$ 

The log does not appear to give me any useful information about why validation failed

20220616060840Z-86009.log

The standard way of validating NWB files, python -m pynwb.validate sub-001/sub-001_ecephys.nwb yields no errors.

Running nwbinspector does not help either, as all of the messages are "best practice suggestions" and should not prevent an upload:

(base) MacBook-Pro-5:000121 bendichter$ nwbinspector --config dandi sub-001/sub-001_ecephys.nwb 
Inspecting NWBFiles...:   0%|                                                                                                                                    | 0/1 [00:00<?, ?it/s]/Users/bendichter/opt/miniconda3/lib/python3.9/site-packages/pynwb/ecephys.py:93: UserWarning: The second dimension of data does not match the length of electrodes. Your data may be transposed.
  warnings.warn("The second dimension of data does not match the length of electrodes. Your data may be "
                                                                                                                                                                                       

**************************************************
NWBInspector Report Summary

Timestamp: 2022-06-16 02:14:22.338099-04:00
Platform: macOS-10.16-x86_64-i386-64bit
NWBInspector version: 0.4.0

Found 5 issues over 1 files:
       5 - BEST_PRACTICE_SUGGESTION
**************************************************


0  BEST_PRACTICE_SUGGESTION
===========================

0.0  sub-001/sub-001_ecephys.nwb: None - 'None' object with name 'None'
       Message: The second dimension of data does not match the length of electrodes. Your data may be transposed.

0.1  sub-001/sub-001_ecephys.nwb: None - 'None' object with name 'None'
       Message: Description is missing.

0.2  sub-001/sub-001_ecephys.nwb: None - 'None' object with name 'None'
       Message: Description ('no description') is a placeholder.

0.3  sub-001/sub-001_ecephys.nwb: None - 'None' object with name 'None'
       Message: Metadata /general/keywords is missing.

0.4  sub-001/sub-001_ecephys.nwb: None - 'None' object with name 'None'
       Message: Subject.sex is missing.

Two separate issues here.

  1. Why isn't this file validating?
  2. This is a bad user experience. Even if this file is not up to DANDI standards there needs to be a better communication of why.
@yarikoptic
Copy link
Member

sorry @bendichter for the late response, I might have been traveling (back then) ;-) Usually I say we recommend/rely on users running dandi validate on those files which failed validation. In this file case I see

❯ dandi validate /tmp/ecephys_tutorial.nwb
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/pynwb/ecephys.py:90: UserWarning: ElectricalSeries 'ElectricalSeries': The second dimension of data does not match the length of electrodes. Your data may be transposed.
  warnings.warn("%s '%s': The second dimension of data does not match the length of electrodes. "
[NWBI.check_electrical_series_dims] /tmp/ecephys_tutorial.nwb — The second dimension of data does not match the length of electrodes. Your data may be transposed.
2022-11-03 14:54:05,841 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20221103185404Z-355822.log

which is I guess was the one. Note that now we gather it from nwb-inspector whenever before it was plain invocation to pynwb.validate AFAIK.

and nwbinspector indeed lists it as CRITICAL ATM
❯ nwbinspector /tmp/ecephys_tutorial.nwb
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/hdmf/spec/namespace.py:531: UserWarning: Ignoring cached namespace 'core' version 2.4.0 because version 2.5.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/pynwb/ecephys.py:90: UserWarning: ElectricalSeries 'ElectricalSeries': The second dimension of data does not match the length of electrodes. Your data may be transposed.
  warnings.warn("%s '%s': The second dimension of data does not match the length of electrodes. "
Inspecting NWBFiles...:   0%|                                                                                                                                                                                                                | 0/1 [00:00<?, ?it/s]/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/pynwb/ecephys.py:90: UserWarning: ElectricalSeries 'ElectricalSeries': The second dimension of data does not match the length of electrodes. Your data may be transposed.
  warnings.warn("%s '%s': The second dimension of data does not match the length of electrodes. "
                                                                                                                                                                                                                                                                   

**************************************************
NWBInspector Report Summary

Timestamp: 2022-11-03 14:56:11.522285-04:00
Platform: Linux-5.19.0-2-amd64-x86_64-with-glibc2.35
NWBInspector version: 0.4.17

Found 7 issues over 1 files:
       1 - CRITICAL
       6 - BEST_PRACTICE_SUGGESTION
**************************************************


0  CRITICAL
===========

0.0  /tmp/ecephys_tutorial.nwb: check_electrical_series_dims - 'ElectricalSeries' object at location '/acquisition/ElectricalSeries'
       Message: The second dimension of data does not match the length of electrodes. Your data may be transposed.

1  BEST_PRACTICE_SUGGESTION
===========================

1.1  /tmp/ecephys_tutorial.nwb: check_description - 'Subject' object at location '/general/subject'
       Message: Description is missing.

1.2  /tmp/ecephys_tutorial.nwb: check_description - 'ElectricalSeries' object at location '/acquisition/ElectricalSeries'
       Message: Description ('no description') is a placeholder.

1.3  /tmp/ecephys_tutorial.nwb: check_experimenter_form - 'NWBFile' object at location '/'
       Message: The name of experimenter 'Dr. Bilbo Baggins' does not match any of the accepted DANDI forms: 'LastName, Firstname', 'LastName, FirstName MiddleInitial.' or 'LastName, FirstName, MiddleName'.

1.4  /tmp/ecephys_tutorial.nwb: check_keywords - 'NWBFile' object at location '/'
       Message: Metadata /general/keywords is missing.

1.5  /tmp/ecephys_tutorial.nwb: check_subject_sex - 'Subject' object at location '/general/subject'
       Message: Subject.sex is missing.

1.6  /tmp/ecephys_tutorial.nwb: check_col_not_nan - 'DynamicTable' object with name 'electrodes'
       Message: Column 'imp' has all NaN values. Consider removing it from the table.

@bendichter
Copy link
Member Author

Thanks @yarikoptic this helped identify an issue in the tutorial

@djarecka
Copy link
Member

djarecka commented Nov 9, 2022

@bendichter - I just had a similar issue, I thought that nwbinspector --config dandi should include all dandi-related errors, am I right?

@CodyCBakerPhD
Copy link
Contributor

CodyCBakerPhD commented Nov 9, 2022

I just had a similar issue, I thought that nwbinspector --config dandi should include all dandi-related errors, am I right?

@djarecka Can you elaborate on your issue (including, ideally, the report in your case)?

The point is that a failure in dandi validate (which should prevent upload) should be equivalent to seeing any CRITICAL or above results with nwbinspector --config dandi

@djarecka
Copy link
Member

djarecka commented Nov 9, 2022

I have an error from dandi validator, but no critical file from nwbinspector (file from the user guide)

Do you want me to open an issue in the inspector repo?

(dandi_pip) dhcp-10-31-85-60:205769 dorota$ dandi validate
2022-11-09 15:30:04,586 [   ERROR] sub-001/sub-001_ecephys.nwb: 1 error(s)
2022-11-09 15:30:04,586 [   ERROR]   Subject.sex is missing.
Summary: Validation errors in 1 out of 1 files
2022-11-09 15:30:04,586 [    INFO] Logs saved in /Users/dorota/Library/Logs/dandi-cli/20221109203003Z-76498.log
(dandi_pip) dhcp-10-31-85-60:205769 dorota$ nwbinspector . --config dandi
                                                                                                                                                   

**************************************************
NWBInspector Report Summary

Timestamp: 2022-11-09 15:30:11.718071-05:00
Platform: macOS-12.0.1-arm64-arm-64bit
NWBInspector version: 0.4.17

Found 7 issues over 1 files:
       1 - BEST_PRACTICE_VIOLATION
       6 - BEST_PRACTICE_SUGGESTION
**************************************************


0  BEST_PRACTICE_VIOLATION
==========================

0.0  sub-001/sub-001_ecephys.nwb: check_regular_timestamps - 'ElectricalSeries' object at location '/acquisition/test_ephys_data'
       Message: TimeSeries appears to have a constant sampling rate. Consider specifying starting_time=0.0 and rate=0.1 instead of timestamps.

1  BEST_PRACTICE_SUGGESTION
===========================

1.1  sub-001/sub-001_ecephys.nwb: check_description - 'Subject' object at location '/general/subject'
       Message: Description is missing.

1.2  sub-001/sub-001_ecephys.nwb: check_description - 'Device' object at location '/general/devices/trodes_rig123'
       Message: Description is missing.

1.3  sub-001/sub-001_ecephys.nwb: check_experimenter_form - 'NWBFile' object at location '/'
       Message: The name of experimenter 'Dr. Bilbo Baggins' does not match any of the accepted DANDI forms: 'LastName, Firstname', 'LastName, FirstName MiddleInitial.' or 'LastName, FirstName, MiddleName'.

1.4  sub-001/sub-001_ecephys.nwb: check_keywords - 'NWBFile' object at location '/'
       Message: Metadata /general/keywords is missing.

1.5  sub-001/sub-001_ecephys.nwb: check_subject_sex - 'Subject' object at location '/general/subject'
       Message: Subject.sex is missing.

1.6  sub-001/sub-001_ecephys.nwb: check_col_not_nan - 'DynamicTable' object with name 'electrodes'
       Message: Column 'imp' has all NaN values. Consider removing it from the table.

@CodyCBakerPhD
Copy link
Contributor

Do you want me to open an issue in the inspector repo?

Sure - it looks like it's not being properly elevated from that usage, will investigate

@yarikoptic
Copy link
Member

Hm, looking at https://github.com/dandi/dandi-schema/blob/71a8ea3f8231cc8bf4fadcc51ea76db86171925f/dandischema/models.py#L818 where it is

sex: Optional[SexType]

and we even test https://github.com/dandi/dandi-cli/blob/HEAD/dandi/tests/test_metadata.py#L237 with "sex": None to be ok.

And it seems to be not required but rather "good practice" according to nwbinspector. Having upgraded all versions of dandi-cli, nwbinspector, pynwb, and hdmf I see following

dandi validate exits with non-0 but with another error ()
❯ dandi validate ecephys_tutorial.nwb
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/pynwb/ecephys.py:90: UserWarning: ElectricalSeries 'ElectricalSeries': The second dimension of data does not match the length of electrodes. Your data may be transposed.
  warnings.warn("%s '%s': The second dimension of data does not match the length of electrodes. "
[NWBI.check_electrical_series_dims] ecephys_tutorial.nwb — The second dimension of data does not match the length of electrodes. Your data may be transposed.
[NWBI.check_subject_sex] ecephys_tutorial.nwb — Subject.sex is missing.
2022-11-09 17:38:48,387 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20221109223846Z-1414634.log
❯ echo $?
1

colors aren't copy/pasted but NWBI.check_electrical_series_dims is the red one and sex is blue, so I guess just a warning, right @TheChymera ?

nwbinspector also has the same CRITICAL error and sex among other recommended practices issues (not sure why we have only sex warning and not the others,,, need to check)
❯ nwbinspector ecephys_tutorial.nwb
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/hdmf/spec/namespace.py:531: UserWarning: Ignoring cached namespace 'core' version 2.4.0 because version 2.5.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/pynwb/ecephys.py:90: UserWarning: ElectricalSeries 'ElectricalSeries': The second dimension of data does not match the length of electrodes. Your data may be transposed.
  warnings.warn("%s '%s': The second dimension of data does not match the length of electrodes. "
Inspecting NWBFiles...:   0%|                                                                                                   | 0/1 [00:00<?, ?it/s]/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/pynwb/ecephys.py:90: UserWarning: ElectricalSeries 'ElectricalSeries': The second dimension of data does not match the length of electrodes. Your data may be transposed.
  warnings.warn("%s '%s': The second dimension of data does not match the length of electrodes. "
                                                                                                                                                      

**************************************************
NWBInspector Report Summary

Timestamp: 2022-11-09 17:46:36.829704-05:00
Platform: Linux-5.19.0-2-amd64-x86_64-with-glibc2.35
NWBInspector version: 0.4.17

Found 7 issues over 1 files:
       1 - CRITICAL
       6 - BEST_PRACTICE_SUGGESTION
**************************************************


0  CRITICAL
===========

0.0  ecephys_tutorial.nwb: check_electrical_series_dims - 'ElectricalSeries' object at location '/acquisition/ElectricalSeries'
       Message: The second dimension of data does not match the length of electrodes. Your data may be transposed.

1  BEST_PRACTICE_SUGGESTION
===========================

1.1  ecephys_tutorial.nwb: check_description - 'Subject' object at location '/general/subject'
       Message: Description is missing.

1.2  ecephys_tutorial.nwb: check_description - 'ElectricalSeries' object at location '/acquisition/ElectricalSeries'
       Message: Description ('no description') is a placeholder.

1.3  ecephys_tutorial.nwb: check_experimenter_form - 'NWBFile' object at location '/'
       Message: The name of experimenter 'Dr. Bilbo Baggins' does not match any of the accepted DANDI forms: 'LastName, Firstname', 'LastName, FirstName MiddleInitial.' or 'LastName, FirstName, MiddleName'.

1.4  ecephys_tutorial.nwb: check_keywords - 'NWBFile' object at location '/'
       Message: Metadata /general/keywords is missing.

1.5  ecephys_tutorial.nwb: check_subject_sex - 'Subject' object at location '/general/subject'
       Message: Subject.sex is missing.

1.6  ecephys_tutorial.nwb: check_col_not_nan - 'DynamicTable' object with name 'electrodes'
       Message: Column 'imp' has all NaN values. Consider removing it from the table.

so -- the question remains - how are we getting absent sex to be an error?

@CodyCBakerPhD
Copy link
Contributor

so -- the question remains - how are we getting absent sex to be an error?

That was upgraded to CRITICAL a while back at the discretion of @bendichter at @satra's request, I believe: NeurodataWithoutBorders/nwbinspector#247

We've required subject species, age, and sex ever since that was merged.

That's also the state it was left in (and checked by me last time I worked with this stuff) after #1108 (in which I also mentioned the heightened metadata requirements).

/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/pynwb/ecephys.py:90: UserWarning: ElectricalSeries 'ElectricalSeries': The second dimension of data does not match the length of electrodes. Your data may be transposed.
warnings.warn("%s '%s': The second dimension of data does not match the length of electrodes. "
[NWBI.check_electrical_series_dims] ecephys_tutorial.nwb — The second dimension of data does not match the length of electrodes. Your data may be transposed.
[NWBI.check_subject_sex] ecephys_tutorial.nwb — Subject.sex is missing.

I'm unsure about all this newfangled display of the suggestion-level stuff or coloration, but you'd expect to see both of them there as CRITICAL level things preventing dandi validation. The first is a problem with the tutorial itself, though.

nwbinspector ecephys_tutorial.nwb
/home/yoh/proj/dandi/dandi-cli-master/venvs/dev3/lib/python3.10/site-packages/hdmf/spec/namespace.py:531: UserWarning: Ignoring cached namespace 'core' version 2.4.0 because version 2.5.0 is already loaded.

As @djarecka pointed out, you need to run it with --config dandi to see the configuration that the validation should be using. That's not working as expected which is what I'm trying to fix ATM

@CodyCBakerPhD
Copy link
Contributor

Fix is now merged and available in NWB Inspector v0.4.18

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants