make ls more resilient in case of metadata loading fails with pynwb for some reason #282

Closed
t-b opened this issue Nov 24, 2020 · 9 comments · Fixed by #293

Comments

@t-b

t-b commented Nov 24, 2020

Follow up from NeurodataWithoutBorders/pynwb#1321.

Can someone explain why HardwareTests-V2-IP8.nwb is good with

$ dandi --log-level DEBUG ls HardwareTests-V2-IP8.nwb
2020-11-24 01:20:38,661 [   DEBUG] No newer (than 0.7.2) version of dandi/dandi-cli found available
2020-11-24 01:20:41,977 [   DEBUG] Calling memoized version of <function get_metadata at 0x0000000008A4A678> for E:\projekte\mies-igor\tools\unit-testing\HardwareTests-V2-IP8.nwb
- age: null
  date_of_birth: null
  experiment_description: null
  experimenter: null
  genotype: null
  identifier: ac24acc942a5b87538bf15d140e06b4576481565b77b114877c4d26ba23fc09e
  institution: null
  keywords: null
  lab: null
  nd_types:
  - Device (7)
  - IntracellularElectrode (6)
  - LabNotebook
  - LabNotebookDevice
  - StimulusSets
  - Subject
  - SweepTable
  - Testpulse
  - TestpulseDevice
  - TimeSeries (4)
  - UserComment
  - UserCommentDevice
  - VoltageClampSeries (2)
  - VoltageClampStimulusSeries (2)
  number_of_electrodes: 0
  number_of_units: 0
  nwb_version: 2.2.4
  path: HardwareTests-V2-IP8.nwb
  related_publications: null
  session_description: PLACEHOLDER
  session_id: null
  session_start_time: 2020-11-21 20:42:02.721000+00:00
  sex: null
  size: 9180392
  species: null
  subject_id: null

but bad.nwb gives

$ dandi --log-level DEBUG ls bad.nwb
2020-11-24 12:18:13,126 [   DEBUG] No newer (than 0.7.2) version of dandi/dandi-cli found available
2020-11-24 12:18:17,903 [   DEBUG] Calling memoized version of <function get_metadata at 0x0000000008A49678> for E:\projekte\mies-igor\tools\unit-testing\bad.nwb
2020-11-24 12:18:17,917 [   DEBUG] Running original <function get_metadata at 0x0000000008A49678> on 'E:\\projekte\\mies-igor\\tools\\unit-testing\\bad.nwb'
2020-11-24 12:18:20,087 [   DEBUG] Call to get_metadata on bad.nwb failed: Could not construct IntracellularElectrode object due to: IntracellularElectrode.__init__: incorrect type for 'description' (got 'bytes', expected 'str')
2020-11-24 12:18:20,087 [   DEBUG] Problem obtaining metadata for bad.nwb: 'NoneType' object has no attribute 'items'
2020-11-24 12:18:20,087 [ WARNING] Failed to operate on some paths (empty records were listed):
 AttributeError: 1 paths
- path: bad.nwb
  size: 31985390

Both have the exact same HDF5 type

E:\projekte\mies-igor\tools\unit-testing>h5dump -d "/general/intracellular_ephys/electrode_0/description" bad.nwb HardwareTests-V2-IP8.nwb
HDF5 "bad.nwb" {
DATASET "/general/intracellular_ephys/electrode_0/description" {
   DATATYPE  H5T_STRING {
      STRSIZE H5T_VARIABLE;
      STRPAD H5T_STR_NULLTERM;
      CSET H5T_CSET_UTF8;
      CTYPE H5T_C_S1;
   }
   DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
   DATA {
   (0): "Headstage 0"
   }
}
}
HDF5 "HardwareTests-V2-IP8.nwb" {
DATASET "/general/intracellular_ephys/electrode_0/description" {
   DATATYPE  H5T_STRING {
      STRSIZE H5T_VARIABLE;
      STRPAD H5T_STR_NULLTERM;
      CSET H5T_CSET_UTF8;
      CTYPE H5T_C_S1;
   }
   DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
   DATA {
   (0): "Headstage 0"
   }
}
}

files.zip

@yarikoptic
Member

Wrong second paste? (It looks like h5dump output, and there is no error in any of the pasted output.)

@t-b
Author

t-b commented Nov 24, 2020

@yarikoptic Thanks. Fixed.

@yarikoptic
Member

I cannot tell what is wrong. Most likely this is an underlying pynwb issue rather than a dandi one (but I could be wrong), judging by

Could not construct IntracellularElectrode object due to: IntracellularElectrode.__init__: incorrect type for 'description' (got 'bytes', expected 'str')

You could still try setting the environment variable DANDI_CACHE=ignore to check whether this is somehow caused by serialization during caching of the results (unlikely but possible).

Then there is a dandi ls issue:

Problem obtaining metadata for bad.nwb: 'NoneType' object has no attribute 'items'

This is a consequence of the failure above. I will probably submit a PR shortly to handle it a bit better, though I am still deciding whether it is worth it at this stage: we have yet to do the big refactoring (RF #249, still a TODO), which would address this more harmoniously.

And that bad.nwb is not the one from https://github.com/NeurodataWithoutBorders/pynwb/files/5578038/HardwareTests-V2-MIES-187d9143.zip, so is it available somewhere to look "deeper"?

@t-b
Author

t-b commented Nov 24, 2020

@yarikoptic I have attached files.zip to the issue of course.

@t-b
Author

t-b commented Nov 24, 2020

Same error with DANDI_CACHE=ignore.

@yarikoptic
Member

Eh, sorry, I missed that link.

For me, both files ls fine with pynwb 1.4.0.post.dev3, "installed" straight from the upstream repo at 1.4.0-3-g6d6ea6c9 (hdmf was 2.1.0 and then upgraded to 2.2.0), at NeurodataWithoutBorders/pynwb#1282

output
$> dandi  ls /tmp/bad.nwb /tmp/HardwareTests-V2-IP8.nwb      
PATH                          SIZE    IDENTIFIER                                                       NWB   ND_TYPES                                                                                                      SESSION_DESCRIPTION SESSION_START_TIME  
/tmp/bad.nwb                  32.0 MB 2ae7afd1a09f78c3d7c3311d71990095010fab706d91f9048986eef429991a70 2.2.4 CurrentClampSeries (73), CurrentClampStimulusSeries (73), Device (148), IntracellularElectrode (147), LabN... PLACEHOLDER         2019-11-08/18:46:09 
/tmp/HardwareTests-V2-IP8.nwb 9.2 MB  ac24acc942a5b87538bf15d140e06b4576481565b77b114877c4d26ba23fc09e 2.2.4 Device (7), IntracellularElectrode (6), LabNotebook, LabNotebookDevice, StimulusSets, Subject, SweepTable,... PLACEHOLDER         2020-11-21/20:42:02 
Summary:                      41.2 MB                                                                                                                                                                                                          2019-11-08/18:46:09>
                

So I then ran pip install --upgrade --force-reinstall pynwb==1.4.0, which pulled in everything including the kitchen sink:
(git)lena:~/proj/dandi/dandi-cli[tags/0.7.2^0]
$> pip install --upgrade --force-reinstall pynwb==1.4.0
Collecting pynwb==1.4.0
  Using cached pynwb-1.4.0-py2.py3-none-any.whl (94 kB)
Collecting numpy>=1.16
  Downloading numpy-1.19.4-cp38-cp38-manylinux2010_x86_64.whl (14.5 MB)
     |████████████████████████████████| 14.5 MB 4.9 MB/s 
Collecting h5py>=2.9
  Downloading h5py-3.1.0-cp38-cp38-manylinux1_x86_64.whl (4.4 MB)
     |████████████████████████████████| 4.4 MB 5.9 MB/s 
Collecting pandas>=0.23
  Downloading pandas-1.1.4-cp38-cp38-manylinux1_x86_64.whl (9.3 MB)
     |████████████████████████████████| 9.3 MB 5.3 MB/s 
Collecting hdmf<3,>=2.1.0
  Using cached hdmf-2.2.0-py2.py3-none-any.whl (132 kB)
Collecting python-dateutil>=2.7
  Using cached python_dateutil-2.8.1-py2.py3-none-any.whl (227 kB)
Collecting pytz>=2017.2
  Downloading pytz-2020.4-py2.py3-none-any.whl (509 kB)
     |████████████████████████████████| 509 kB 5.9 MB/s 
Collecting ruamel.yaml>=0.15
  Using cached ruamel.yaml-0.16.12-py2.py3-none-any.whl (111 kB)
Collecting scipy>=1.1
  Downloading scipy-1.5.4-cp38-cp38-manylinux1_x86_64.whl (25.8 MB)
     |████████████████████████████████| 25.8 MB 5.6 MB/s 
Collecting jsonschema>=2.6.0
  Using cached jsonschema-3.2.0-py2.py3-none-any.whl (56 kB)
Collecting six>=1.5
  Using cached six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting ruamel.yaml.clib>=0.1.2; platform_python_implementation == "CPython" and python_version < "3.9"
  Using cached ruamel.yaml.clib-0.2.2-cp38-cp38-manylinux1_x86_64.whl (578 kB)
Collecting setuptools
  Using cached setuptools-50.3.2-py3-none-any.whl (785 kB)
Collecting pyrsistent>=0.14.0
  Downloading pyrsistent-0.17.3.tar.gz (106 kB)
     |████████████████████████████████| 106 kB 7.3 MB/s 
Collecting attrs>=17.4.0
  Downloading attrs-20.3.0-py2.py3-none-any.whl (49 kB)
     |████████████████████████████████| 49 kB 3.2 MB/s 
Building wheels for collected packages: pyrsistent
  Building wheel for pyrsistent (setup.py) ... done
  Created wheel for pyrsistent: filename=pyrsistent-0.17.3-cp38-cp38-linux_x86_64.whl size=108030 sha256=35d9b62b6ae1719f89cdc792f8a5a1a519fc6b56cab22e086e3fadaebe431c40
  Stored in directory: /home/yoh/.cache/pip/wheels/3d/22/08/7042eb6309c650c7b53615d5df5cc61f1ea9680e7edd3a08d2
Successfully built pyrsistent
ERROR: launchpadlib 1.10.13 requires testresources, which is not installed.
Installing collected packages: numpy, h5py, six, python-dateutil, pytz, pandas, ruamel.yaml.clib, ruamel.yaml, scipy, setuptools, pyrsistent, attrs, jsonschema, hdmf, pynwb
  Attempting uninstall: numpy
    Found existing installation: numpy 1.19.3
    Not uninstalling numpy at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'numpy'. No files were found to uninstall.
  Attempting uninstall: h5py
    Found existing installation: h5py 2.10.0
    Not uninstalling h5py at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'h5py'. No files were found to uninstall.
  Attempting uninstall: six
    Found existing installation: six 1.14.0
    Uninstalling six-1.14.0:
      Successfully uninstalled six-1.14.0
  Attempting uninstall: python-dateutil
    Found existing installation: python-dateutil 2.8.1
    Not uninstalling python-dateutil at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'python-dateutil'. No files were found to uninstall.
  Attempting uninstall: pytz
    Found existing installation: pytz 2020.1
    Not uninstalling pytz at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'pytz'. No files were found to uninstall.
  Attempting uninstall: pandas
    Found existing installation: pandas 1.1.3
    Not uninstalling pandas at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'pandas'. No files were found to uninstall.
  Attempting uninstall: ruamel.yaml.clib
    Found existing installation: ruamel.yaml.clib 0.2.2
    Not uninstalling ruamel.yaml.clib at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'ruamel.yaml.clib'. No files were found to uninstall.
  Attempting uninstall: ruamel.yaml
    Found existing installation: ruamel.yaml 0.16.12
    Not uninstalling ruamel.yaml at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'ruamel.yaml'. No files were found to uninstall.
  Attempting uninstall: scipy
    Found existing installation: scipy 1.5.2
    Not uninstalling scipy at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'scipy'. No files were found to uninstall.
  Attempting uninstall: setuptools
    Found existing installation: setuptools 44.0.0
    Uninstalling setuptools-44.0.0:
      Successfully uninstalled setuptools-44.0.0
  Attempting uninstall: pyrsistent
    Found existing installation: pyrsistent 0.15.5
    Not uninstalling pyrsistent at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'pyrsistent'. No files were found to uninstall.
  Attempting uninstall: attrs
    Found existing installation: attrs 19.3.0
    Not uninstalling attrs at /usr/lib/python3/dist-packages, outside environment /home/yoh/proj/dandi/dandi-cli/venvs/dev3
    Can't uninstall 'attrs'. No files were found to uninstall.
  Attempting uninstall: jsonschema
    Found existing installation: jsonschema 3.2.0
    Uninstalling jsonschema-3.2.0:
      Successfully uninstalled jsonschema-3.2.0
  Attempting uninstall: hdmf
    Found existing installation: hdmf 2.2.0
    Uninstalling hdmf-2.2.0:
      Successfully uninstalled hdmf-2.2.0
  Attempting uninstall: pynwb
    Found existing installation: pynwb 1.4.0
    Uninstalling pynwb-1.4.0:
      Successfully uninstalled pynwb-1.4.0
Successfully installed attrs-20.3.0 h5py-3.1.0 hdmf-2.2.0 jsonschema-3.2.0 numpy-1.19.4 pandas-1.1.4 pynwb-1.4.0 pyrsistent-0.17.3 python-dateutil-2.8.1 pytz-2020.4 ruamel.yaml-0.16.12 ruamel.yaml.clib-0.2.2 scipy-1.5.4 setuptools-50.3.2 six-1.15.0
pip install --upgrade --force-reinstall pynwb==1.4.0  14.34s user 2.57s system 68% cpu 24.512 total
and now it doesn't work on either of those:
$> DANDI_CACHE=ignore dandi -l debug ls -f yaml /tmp/bad.nwb /tmp/HardwareTests-V2-IP8.nwb
2020-11-25 10:11:05,197 [   DEBUG] No newer (than 0.7.2) version of dandi/dandi-cli found available
2020-11-25 10:11:06,412 [   DEBUG] Calling memoized version of <function get_metadata at 0x7f2119214550> for /tmp/bad.nwb
2020-11-25 10:11:06,412 [   DEBUG] Running original <function get_metadata at 0x7f2119214550> on '/tmp/bad.nwb'
2020-11-25 10:11:07,503 [   DEBUG] Call to get_metadata on /tmp/bad.nwb failed: Could not construct IntracellularElectrode object due to: IntracellularElectrode.__init__: incorrect type for 'description' (got 'bytes', expected 'str')
2020-11-25 10:11:07,503 [   DEBUG] Problem obtaining metadata for /tmp/bad.nwb: 'NoneType' object has no attribute 'items'
2020-11-25 10:11:07,503 [   DEBUG] Calling memoized version of <function get_metadata at 0x7f2119214550> for /tmp/HardwareTests-V2-IP8.nwb
2020-11-25 10:11:07,504 [   DEBUG] Running original <function get_metadata at 0x7f2119214550> on '/tmp/HardwareTests-V2-IP8.nwb'
2020-11-25 10:11:08,175 [   DEBUG] Call to get_metadata on /tmp/HardwareTests-V2-IP8.nwb failed: Could not construct IntracellularElectrode object due to: IntracellularElectrode.__init__: incorrect type for 'description' (got 'bytes', expected 'str')
2020-11-25 10:11:08,175 [   DEBUG] Problem obtaining metadata for /tmp/HardwareTests-V2-IP8.nwb: 'NoneType' object has no attribute 'items'
2020-11-25 10:11:08,175 [ WARNING] Failed to operate on some paths (empty records were listed):
 AttributeError: 2 paths
- path: /tmp/bad.nwb
  size: 31985390
- path: /tmp/HardwareTests-V2-IP8.nwb
  size: 9180392

So at least there is some consistency ;) Note that the upgrade pulled in h5py-3.1.0, which hdmf is not compatible with yet.

Which h5py version do you have? (We need WTF, #57.)

@yarikoptic
Member

Downgrading via pip install --upgrade 'h5py==2.10.0' made it work again.

output
$> DANDI_CACHE=ignore dandi ls  /tmp/bad.nwb /tmp/HardwareTests-V2-IP8.nwb     
PATH                          SIZE    IDENTIFIER                                                       SESSION_START_TIME   ND_TYPES                                                                                                      SESSION_DESCRIPTION NWB  
/tmp/bad.nwb                  32.0 MB 2ae7afd1a09f78c3d7c3311d71990095010fab706d91f9048986eef429991a70 2019-11-08/18:46:09  CurrentClampSeries (73), CurrentClampStimulusSeries (73), Device (148), IntracellularElectrode (147), LabN... PLACEHOLDER         2.2.4
/tmp/HardwareTests-V2-IP8.nwb 9.2 MB  ac24acc942a5b87538bf15d140e06b4576481565b77b114877c4d26ba23fc09e 2020-11-21/20:42:02  Device (7), IntracellularElectrode (6), LabNotebook, LabNotebookDevice, StimulusSets, Subject, SweepTable,... PLACEHOLDER         2.2.4
Summary:                      41.2 MB                                                                  2019-11-08/18:46:09>                                                                                                                                        
                                                                                                       2020-11-21/20:42:02<  
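For context on why the downgrade helps: h5py 3.0 changed its default so that variable-length strings are read back as bytes rather than str, which matches the "got 'bytes', expected 'str'" error above. Below is a minimal sketch of the kind of normalization a reader could apply before passing such values on. The helper name as_text is hypothetical and is not part of pynwb, hdmf, or dandi; it only illustrates the idea.

```python
def as_text(value, encoding="utf-8"):
    """Return str for bytes-like input; pass any other value through unchanged.

    Hypothetical helper for illustration only (not an hdmf/pynwb API).
    """
    if isinstance(value, (bytes, bytearray)):
        # Values such as b"Headstage 0" are decoded to "Headstage 0".
        return bytes(value).decode(encoding)
    return value


if __name__ == "__main__":
    # The same description as read under the two h5py behaviours.
    print(repr(as_text(b"Headstage 0")))
    print(repr(as_text("Headstage 0")))
```

With h5py 3.x, a dataset can also be wrapped with Dataset.asstr() to read str directly, so a helper like this is mostly useful for code that has to support both major versions.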

yarikoptic added a commit that referenced this issue Nov 25, 2020
h5py 3.0 introduced breaking changes (see e.g. #282)
and hdmf yet to account for them.  We better cache that version too to improve
consistency of outputs etc
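The idea in this commit message, making cached results depend on the versions of the libraries that produced them, can be sketched as follows. This is an illustration under assumptions, not the actual dandi-cli caching code: version_token and cache_key are hypothetical names, and the default package list is only a guess at what matters here.

```python
from importlib import metadata


def version_token(packages):
    """Build a deterministic string such as 'h5py:2.10.0,hdmf:2.2.0'."""
    parts = []
    for name in sorted(packages):
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Keep the token well-defined even if a package is missing.
            version = "absent"
        parts.append(f"{name}:{version}")
    return ",".join(parts)


def cache_key(path, size, mtime, packages=("dandi", "pynwb", "hdmf", "h5py")):
    """Combine a file fingerprint with library versions (hypothetical layout)."""
    return (path, size, mtime, version_token(packages))
```

Because the token is part of the key, upgrading or downgrading h5py yields a different key, so results computed under another version are not reused.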
@yarikoptic
Member

So there is nothing really to "fix" on the dandi-cli side, besides improving the ls behavior when pynwb etc. is puking. I will retitle the issue accordingly.
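A rough sketch of what a more resilient listing step could look like, assuming a metadata loader that may either raise or return None (as the "'NoneType' object has no attribute 'items'" message suggests). The names safe_record and load_metadata are hypothetical and do not come from dandi-cli; the fix that was eventually merged may look quite different.

```python
import logging

lgr = logging.getLogger("ls-sketch")


def safe_record(path, size, load_metadata):
    """Return a listing record for path that survives metadata failures.

    load_metadata is any callable taking a path and returning a dict;
    it stands in for the real loader and is an assumption of this sketch.
    """
    record = {"path": path, "size": size}
    try:
        metadata = load_metadata(path)
    except Exception as exc:
        # Report the original failure instead of a secondary AttributeError.
        lgr.debug("Failed to load metadata for %s: %s", path, exc)
        record["error"] = f"{type(exc).__name__}: {exc}"
        return record
    if metadata is None:
        # Guard against loaders that swallow errors and return None.
        record["error"] = "no metadata returned"
        return record
    record.update(metadata)
    return record
```

The point of the design is that the record always carries path and size, and the user sees the underlying error rather than the secondary one.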

@yarikoptic changed the title on Nov 25, 2020
from: Chokes on file with "Could not construct IntracellularElectrode object due to: IntracellularElectrode.__init__: incorrect type for 'description' (got 'bytes', expected 'str')"
to: make ls more resilient in case of metadata loading fails with pynwb for some reason
@t-b
Author

t-b commented Nov 26, 2020

Thanks for digging. Once I downgraded h5py to 2.10.0, it indeed worked. I thought I had checked that, but it looks like I got lost tracking the various issues involved.
