dcm2bids_scaffold: FileNotFoundError: dcm2bids/scaffold/dataset_description.json #110

Closed
ArnaudC opened this issue Jan 25, 2021 · 5 comments · Fixed by #111

Comments

@ArnaudC

ArnaudC commented Jan 25, 2021

Hello,

Steps to reproduce:

apk add --no-cache python3-dev
pip3 install dcm2bids
/usr/bin/dcm2bids_scaffold
File "/usr/bin/dcm2bids_scaffold", line 11, in <module>
sys.exit(scaffold())
File "/usr/lib/python3.6/site-packages/dcm2bids/scaffold/__init__.py", line 51, in scaffold
   shutil.copyfile(os.path.join(SELF_DIR, _), os.path.join(args.output_dir, _))
File "/usr/lib/python3.6/shutil.py", line 120, in copyfile
   with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/python3.6/site-packages/dcm2bids/scaffold/dataset_description.json'
bash-4.4# ls /usr/lib/python3.6/site-packages/dcm2bids/scaffold/
__init__.py  __pycache__

As you can see, there is no file called dataset_description.json.
This was working last month.

Thanks

@SamGuay
Member

SamGuay commented Jan 25, 2021

I confirm I am getting the same error on a fresh installation of dcm2bids with python 3.8 on Ubuntu 20.04.

@marleenhaupt

I can also confirm that I get the same error on a fresh installation of dcm2bids with python 3.9.1 on macOS Big Sur version 11.1.

@kousu
Contributor

kousu commented Feb 1, 2021

Debugging

Indeed, this file doesn't appear anywhere in the package:

$ wget https://files.pythonhosted.org/packages/4b/75/437b9cab314cbbb380a11ef33d408779e099ac6da202c21b2f7d4ea167be/dcm2bids-2.1.5-py3-none-any.whl
--2021-02-01 11:00:21--  https://files.pythonhosted.org/packages/4b/75/437b9cab314cbbb380a11ef33d408779e099ac6da202c21b2f7d4ea167be/dcm2bids-2.1.5-py3-none-any.whl
Resolving files.pythonhosted.org (files.pythonhosted.org)... 151.101.137.63, 2a04:4e42:20::319
Connecting to files.pythonhosted.org (files.pythonhosted.org)|151.101.137.63|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 32304 (32K) [application/octet-stream]
Saving to: ‘dcm2bids-2.1.5-py3-none-any.whl’

dcm2bids-2.1.5-py3-none-any.whl       100%[=======================================================================>]  31.55K  --.-KB/s    in 0.004s  

2021-02-01 11:00:21 (7.22 MB/s) - ‘dcm2bids-2.1.5-py3-none-any.whl’ saved [32304/32304]

$ mv dcm2bids-2.1.5-py3-none-any.whl dcm2bids-2.1.5-py3-none-any.zip
$ unzip dcm2bids-2.1.5-py3-none-any.zip 
Archive:  dcm2bids-2.1.5-py3-none-any.zip
  inflating: dcm2bids/__init__.py    
  inflating: dcm2bids/dcm2bids.py    
  inflating: dcm2bids/dcm2niix.py    
  inflating: dcm2bids/helper.py      
  inflating: dcm2bids/logger.py      
  inflating: dcm2bids/sidecar.py     
  inflating: dcm2bids/structure.py   
  inflating: dcm2bids/utils.py       
  inflating: dcm2bids/version.py     
  inflating: dcm2bids/scaffold/__init__.py  
  inflating: tests/__init__.py       
  inflating: tests/test_dcm2bids.py  
  inflating: tests/test_dcm2niix.py  
  inflating: tests/test_sidecar.py   
  inflating: tests/test_structure.py  
  inflating: tests/test_version.py   
  inflating: dcm2bids-2.1.5.dist-info/LICENSE.txt  
  inflating: dcm2bids-2.1.5.dist-info/METADATA  
  inflating: dcm2bids-2.1.5.dist-info/WHEEL  
  inflating: dcm2bids-2.1.5.dist-info/entry_points.txt  
  inflating: dcm2bids-2.1.5.dist-info/top_level.txt  
  inflating: dcm2bids-2.1.5.dist-info/RECORD  

It's not even in the source distribution (sdist):

$ wget https://files.pythonhosted.org/packages/b9/5d/15558896b579b70cda401dc3e60f821949a45d7828234131ac3f8fe96f43/dcm2bids-2.1.5.tar.gz
--2021-02-01 11:06:29--  https://files.pythonhosted.org/packages/b9/5d/15558896b579b70cda401dc3e60f821949a45d7828234131ac3f8fe96f43/dcm2bids-2.1.5.tar.gz
Resolving files.pythonhosted.org (files.pythonhosted.org)... 151.101.137.63, 2a04:4e42:1e::319
Connecting to files.pythonhosted.org (files.pythonhosted.org)|151.101.137.63|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 29478 (29K) [application/x-tar]
Saving to: ‘dcm2bids-2.1.5.tar.gz’

dcm2bids-2.1.5.tar.gz                 100%[=======================================================================>]  28.79K  --.-KB/s    in 0.001s  

2021-02-01 11:06:29 (26.3 MB/s) - ‘dcm2bids-2.1.5.tar.gz’ saved [29478/29478]

$ tar -ztvf dcm2bids-2.1.5.tar.gz 
drwxr-xr-x runner/docker     0 2021-01-04 13:02 dcm2bids-2.1.5/
-rw-r--r-- runner/docker 35235 2021-01-04 13:01 dcm2bids-2.1.5/LICENSE.txt
-rw-r--r-- runner/docker    58 2021-01-04 13:01 dcm2bids-2.1.5/MANIFEST.in
-rw-r--r-- runner/docker  4955 2021-01-04 13:02 dcm2bids-2.1.5/PKG-INFO
-rw-r--r-- runner/docker  3169 2021-01-04 13:01 dcm2bids-2.1.5/README.md
drwxr-xr-x runner/docker     0 2021-01-04 13:02 dcm2bids-2.1.5/dcm2bids/
-rw-r--r-- runner/docker   277 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/__init__.py
-rw-r--r-- runner/docker  9042 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/dcm2bids.py
-rw-r--r-- runner/docker  3286 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/dcm2niix.py
-rw-r--r-- runner/docker  1115 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/helper.py
-rw-r--r-- runner/docker   674 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/logger.py
drwxr-xr-x runner/docker     0 2021-01-04 13:02 dcm2bids-2.1.5/dcm2bids/scaffold/
-rw-r--r-- runner/docker  1417 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/scaffold/__init__.py
-rw-r--r-- runner/docker  8435 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/sidecar.py
-rw-r--r-- runner/docker  6795 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/structure.py
-rw-r--r-- runner/docker  2668 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/utils.py
-rw-r--r-- runner/docker  4280 2021-01-04 13:01 dcm2bids-2.1.5/dcm2bids/version.py
drwxr-xr-x runner/docker     0 2021-01-04 13:02 dcm2bids-2.1.5/dcm2bids.egg-info/
-rw-r--r-- runner/docker  4955 2021-01-04 13:02 dcm2bids-2.1.5/dcm2bids.egg-info/PKG-INFO
-rw-r--r-- runner/docker   594 2021-01-04 13:02 dcm2bids-2.1.5/dcm2bids.egg-info/SOURCES.txt
-rw-r--r-- runner/docker     1 2021-01-04 13:02 dcm2bids-2.1.5/dcm2bids.egg-info/dependency_links.txt
-rw-r--r-- runner/docker   130 2021-01-04 13:02 dcm2bids-2.1.5/dcm2bids.egg-info/entry_points.txt
-rw-r--r-- runner/docker    15 2021-01-04 13:02 dcm2bids-2.1.5/dcm2bids.egg-info/requires.txt
-rw-r--r-- runner/docker    15 2021-01-04 13:02 dcm2bids-2.1.5/dcm2bids.egg-info/top_level.txt
-rw-r--r-- runner/docker   138 2021-01-04 13:01 dcm2bids-2.1.5/pyproject.toml
-rw-r--r-- runner/docker    38 2021-01-04 13:02 dcm2bids-2.1.5/setup.cfg
-rwxr-xr-x runner/docker  2532 2021-01-04 13:01 dcm2bids-2.1.5/setup.py
drwxr-xr-x runner/docker     0 2021-01-04 13:02 dcm2bids-2.1.5/tests/
-rw-r--r-- runner/docker    24 2021-01-04 13:01 dcm2bids-2.1.5/tests/__init__.py
-rw-r--r-- runner/docker  5770 2021-01-04 13:01 dcm2bids-2.1.5/tests/test_dcm2bids.py
-rw-r--r-- runner/docker  1163 2021-01-04 13:01 dcm2bids-2.1.5/tests/test_dcm2niix.py
-rw-r--r-- runner/docker   852 2021-01-04 13:01 dcm2bids-2.1.5/tests/test_sidecar.py
-rw-r--r-- runner/docker   952 2021-01-04 13:01 dcm2bids-2.1.5/tests/test_structure.py
-rw-r--r-- runner/docker   223 2021-01-04 13:01 dcm2bids-2.1.5/tests/test_version.py
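
The same check can be scripted rather than eyeballed. A minimal sketch, assuming only the stdlib: a wheel is a zip archive (so no rename is needed) and the sdist is a gzipped tarball. The helper names `archive_members` and `archive_contains` are made up for illustration and are not part of dcm2bids.

```python
import tarfile
import zipfile


def archive_members(path):
    """Return the member names of a wheel (a zip archive) or an sdist tarball."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()
    # tarfile.open() detects the compression (.tar.gz here) on its own.
    with tarfile.open(path) as tf:
        return tf.getnames()


def archive_contains(path, suffix):
    """True if any member of the archive ends with `suffix`."""
    return any(name.endswith(suffix) for name in archive_members(path))
```

For the two downloads above, one would expect `archive_contains("dcm2bids-2.1.5-py3-none-any.whl", "scaffold/dataset_description.json")` to be false, matching the listings.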

The fix

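This is a common issue with Python packaging. So common that I would almost call it a bug in pip/setuptools. Until there's a better default, every Python programmer needs to know about package_data and include_package_data. Note that include_package_data only picks up files that also make it into the sdist (for example via MANIFEST.in), and the listing above shows the JSON file never got that far: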

setup(
    ...
    include_package_data=True
)

You should also use the stdlib's API to access it. Instead of:

os.path.join(SELF_DIR, _)

use

pkgutil.get_data('dcm2bids', 'scaffold/dataset_description.json')

Now, there's a catch here: because Python's package system supports virtual packages, remote packages and zipped packages (not that anyone ever uses those features), you cannot get a filename from this API, because there isn't necessarily a file to give. As the docs say:

For packages located in the filesystem, which have already been imported, this is the rough equivalent of:

d = os.path.dirname(sys.modules[package].__file__)
data = open(os.path.join(d, resource), 'rb').read()

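So you need to deal with that: pkgutil.get_data() hands you the file's contents as bytes, not a path, so you can't pass the result to shutil.copyfile(). Write the bytes into the output directory yourself.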
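
A minimal sketch of what that could look like, not the actual dcm2bids fix. `copy_package_data` and `demo` are hypothetical helpers, and `demo_pkgutil_pkg` is a throwaway package built in a temp directory so the example runs without dcm2bids installed.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def copy_package_data(package, resource, output_dir):
    """Write a data file bundled inside `package` into `output_dir`."""
    # get_data returns the file's contents as bytes, or None when the
    # package's loader cannot provide data.
    data = pkgutil.get_data(package, resource)
    if data is None:
        raise FileNotFoundError(f"{resource!r} not available from {package!r}")
    target = Path(output_dir) / Path(resource).name
    target.write_bytes(data)
    return target


def demo():
    """Build a throwaway package with one data file and copy that file out."""
    with tempfile.TemporaryDirectory() as root, tempfile.TemporaryDirectory() as out:
        pkg_dir = Path(root) / "demo_pkgutil_pkg"  # hypothetical package name
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "dataset_description.json").write_text('{"Name": "demo"}')

        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            copied = copy_package_data(
                "demo_pkgutil_pkg", "dataset_description.json", out
            )
            return copied.read_text()
        finally:
            sys.path.remove(root)
```

In the scaffold code, the equivalent call would presumably be something like `copy_package_data('dcm2bids', 'scaffold/dataset_description.json', args.output_dir)`, once the file is actually shipped in the package.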

Alternate solutions

By the way, there is also pkg_resources.resource_stream(), but pkgutil supersedes it AFAIK. There's also pkg_resources.resource_filename(), which would let you keep the path-based code you already have. But avoid these: despite being critical pieces of Python infrastructure, they ship with setuptools and aren't part of the stdlib!

Apparently you can also use importlib.resources, which is part of the stdlib (files() needs Python 3.9 or newer):

importlib.resources.files('top_level_package.sub_package').joinpath('file.txt')
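
If a real path is needed so that shutil.copyfile() keeps working, importlib.resources.as_file() can supply one. A rough sketch, assuming Python 3.9+. `copy_resource` and `demo` are hypothetical helpers, and `demo_resources_pkg` is a throwaway package created only so the example is self-contained.

```python
import importlib
import importlib.resources
import shutil
import sys
import tempfile
from pathlib import Path


def copy_resource(package, resource, output_dir):
    """Copy `resource` from `package` into `output_dir` via a real file path."""
    traversable = importlib.resources.files(package).joinpath(resource)
    target = Path(output_dir) / resource
    # as_file yields a filesystem path, extracting to a temporary file
    # first if the package is not stored as plain files.
    with importlib.resources.as_file(traversable) as src:
        shutil.copyfile(src, target)
    return target


def demo():
    """Build a throwaway package with one data file and copy that file out."""
    with tempfile.TemporaryDirectory() as root, tempfile.TemporaryDirectory() as out:
        pkg_dir = Path(root) / "demo_resources_pkg"  # hypothetical package name
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "dataset_description.json").write_text('{"Name": "demo"}')

        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            copied = copy_resource(
                "demo_resources_pkg", "dataset_description.json", out
            )
            return copied.read_text()
        finally:
            sys.path.remove(root)
```

This keeps the copyfile-based flow intact, at the cost of requiring a newer Python than the 3.6 in the original report.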

The future

Because of another quirk of Python, your tests won't catch this until you publish and have a broken package: run from a source checkout, they import the package from the working tree, where the data files are present. It's important to always run Python tests against the installed package (here's more of the same advice). I have the same concern in SCT: spinalcordtoolbox/spinalcordtoolbox#3161. We wrote a custom package manager instead of using pip to distribute assets, which forces us to test our installed version (our tests won't work until you run ./install_sct). But as we migrate off that, we're going to run into this problem too.
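
One way to make such a test catch missing data files is to ask the import system, not the source tree, whether they exist. A small sketch, assuming Python 3.9+. `missing_package_files` is a hypothetical helper, and it only proves anything when the test suite runs against an installed copy of the package.

```python
import importlib.resources


def missing_package_files(package, expected):
    """Return the names in `expected` that are absent from the importable `package`."""
    root = importlib.resources.files(package)
    return [name for name in expected if not root.joinpath(name).is_file()]
```

A test could then assert that `missing_package_files('dcm2bids.scaffold', ['dataset_description.json'])` is empty. The file list is an illustration based on the traceback above, not the full set of scaffold files.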

@kousu
Contributor

kousu commented Feb 1, 2021

Oh wait actually maybe importlib.resources.path() is the currently blessed way?

@SamGuay
Member

SamGuay commented Feb 16, 2021

Thanks for reporting the issue @ArnaudC and @marleenhaupt. This is simply to let you know that the bug was fixed, thanks to @kousu and @arnaudbore :).
