Dataset extraction error #835

Closed
Ishihara-Masabumi opened this issue Jan 12, 2023 · 3 comments · Fixed by #842
Comments

@Ishihara-Masabumi

When I ran the following command,

python3 tools/train.py --config anomalib/models/padim/config.yaml

the following error occurred: "Unrecognized file format: datasets/MVTec/420938113-1629952094".

2023-01-12 17:20:27,177 - anomalib - INFO - Training the model.
2023-01-12 17:20:27,181 - anomalib.data.utils.download - INFO - Downloading the mvtec dataset.
mvtec: 5.26GB [16:26, 5.34MB/s]                                                                                           
2023-01-12 17:36:53,638 - anomalib.data.utils.download - INFO - Checking the hash of the downloaded file.
2023-01-12 17:37:02,944 - anomalib.data.utils.download - INFO - Extracting dataset into root folder.
Traceback (most recent call last):
  File "tools/train.py", line 75, in <module>
    train()
  File "tools/train.py", line 61, in train
    trainer.fit(model=model, datamodule=datamodule)
  File "/home/dl/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in fit
    self._call_and_handle_interrupt(
  File "/home/dl/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 723, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/dl/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 811, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/dl/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1164, in _run
    self._data_connector.prepare_data()
  File "/home/dl/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 124, in prepare_data
    self.trainer.datamodule.prepare_data()
  File "/home/dl/anomalib/anomalib/data/mvtec.py", line 260, in prepare_data
    download_and_extract(self.root, DOWNLOAD_INFO)
  File "/home/dl/anomalib/anomalib/data/utils/download.py", line 248, in download_and_extract
    raise ValueError(f"Unrecognized file format: {downloaded_file_path}")
ValueError: Unrecognized file format: datasets/MVTec/420938113-1629952094

Please let me know how to fix it.
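
For readers hitting the same traceback: the downloaded file is named `420938113-1629952094`, with no extension, so a check based on the file name has nothing to match. The sketch below shows one possible alternative, detecting the archive type from the file contents with the standard library. It is an illustration only, not anomalib's `download_and_extract`; the names `detect_archive_kind` and `extract_archive` are hypothetical.

```python
import tarfile
import zipfile
from pathlib import Path


def detect_archive_kind(path: Path) -> str:
    """Return "zip" or "tar" based on the file contents, ignoring the file name."""
    if zipfile.is_zipfile(path):
        return "zip"
    # tarfile.is_tarfile also recognizes gzip-, bz2- and xz-compressed tarballs.
    if tarfile.is_tarfile(path):
        return "tar"
    raise ValueError(f"Unrecognized file format: {path}")


def extract_archive(path: Path, dest: Path) -> None:
    """Extract a zip or tar archive into dest, whatever the archive is called."""
    kind = detect_archive_kind(path)
    dest.mkdir(parents=True, exist_ok=True)
    if kind == "zip":
        with zipfile.ZipFile(path) as zf:
            zf.extractall(dest)
    else:
        # Mode "r" (the default) lets tarfile pick the compression automatically.
        with tarfile.open(path) as tf:
            tf.extractall(dest)
```

Something like `extract_archive(Path("datasets/MVTec/420938113-1629952094"), Path("datasets/MVTec"))` could then be tried by hand on the already downloaded file, assuming it is an intact archive.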

@samet-akcay
Contributor

Hi @Ishihara-Masabumi, looks like this happens when the code tries to download the MVTec dataset. Do you have the MVTec dataset on your file system? If yes, can you try to train the model using the same CLI command? If it works, we would know that the script only fails when trying to download the dataset.
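
A quick way to answer the question above is to check whether a category folder already exists under the dataset root before training. This is a minimal sketch: the root `datasets/MVTec` comes from the error message, while the category name `bottle` and the helper `dataset_present` are assumptions made for illustration. The thread does not confirm how anomalib itself decides whether to download.

```python
from pathlib import Path


def dataset_present(root: Path, category: str) -> bool:
    """Return True if the category folder exists under the dataset root."""
    return (root / category).is_dir()


if __name__ == "__main__":
    root = Path("datasets/MVTec")  # taken from the error message
    category = "bottle"            # assumed category, adjust to your config
    if dataset_present(root, category):
        print(f"Found {root / category}; the dataset seems to be on disk.")
    else:
        print(f"{root / category} is missing; the dataset is probably not extracted.")
```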

@samet-akcay
Contributor

Also reported in #838. @djdameln, can you have a look at this to check if this is a bug?

@Ishihara-Masabumi
Author

It's OK.
Thanks.
