Fix broken archivers #285

Closed
1 of 2 tasks
e-belfer opened this issue Feb 20, 2024 · 3 comments · Fixed by #304
Labels
bug Something isn't working

Comments

@e-belfer
Member

e-belfer commented Feb 20, 2024

In #276 we encountered the following archiver errors:

All FERC archivers fail with the following error:

2024-02-20 14:16:35,954 [webCache:cacheDownloadRenamingError] [Errno 2] No such file or directory: '/home/runner/.config/arelle/cache/http/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd.tmp' -> '/home/runner/.config/arelle/cache/http/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd' 
Unsuccessful renaming of downloaded file to active file /home/runner/.config/arelle/cache/http/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd 
Please remove with file manager. - 

  File "/home/runner/micromamba/envs/pudl-cataloger/lib/python3.11/site-packages/aiohttp/client.py", line 449, in _request
    url = self._build_url(str_or_url)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/micromamba/envs/pudl-cataloger/lib/python3.11/site-packages/aiohttp/client.py", line 376, in _build_url
    url = URL(str_or_url)
          ^^^^^^^^^^^^^^^
  File "/home/runner/micromamba/envs/pudl-cataloger/lib/python3.11/site-packages/yarl/_url.py", line 179, in __new__
    raise TypeError("Constructor parameter should be str")
TypeError: Constructor parameter should be str

eiawater gets a "record has been deleted" response that seems incorrect. I have contacted Zenodo about it.

Tasks

@zaneselvans
Member

@e-belfer I just noticed that the EIA cooling water archives are not currently getting turned into annual zipfiles. Does that conflict with the recent change in the PUDL repo that removed the EIA-860M not-a-zipfile special case?

@e-belfer
Member Author

e-belfer commented Mar 12, 2024

We are not currently extracting these files at all, so we'll just have to handle it when we do. My suggestion is to subclass and override the load method there so that it expects a zipfile, which is not a massive change. If this becomes a more general case, we can either zip the files or handle Excel files again.
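To make that suggestion concrete, here is a minimal sketch of overriding a load method so that it expects a zipfile. The class names `ExcelLoader` and `ZippedExcelLoader` and the `load(raw)` signature are hypothetical stand-ins, not the actual PUDL extractor API; only the standard-library `zipfile` and `io` calls are real.

```python
import io
import zipfile


class ExcelLoader:
    """Hypothetical base loader that treats the payload as one bare Excel file."""

    def load(self, raw: bytes) -> dict[str, bytes]:
        # Assumption: the base class returns a mapping of filename to contents.
        return {"data.xlsx": raw}


class ZippedExcelLoader(ExcelLoader):
    """Hypothetical subclass whose load method expects a zip archive instead."""

    def load(self, raw: bytes) -> dict[str, bytes]:
        buffer = io.BytesIO(raw)
        if not zipfile.is_zipfile(buffer):
            raise ValueError("expected a zip archive, got something else")
        with zipfile.ZipFile(buffer) as archive:
            # Read every member of the archive into memory, keyed by name.
            return {name: archive.read(name) for name in archive.namelist()}
```

Only the subclass changes, so the not-a-zipfile case stays available in the base class if it is needed again.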

@jdangerx
Member

We are running into problems because arelle's taxonomy loading doesn't support concurrency. We can work around this in a couple of ways:

  1. we can make each FERC dataset run in its own GHA runner
  2. we can add retries for the FileExistsError
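The second option above could look roughly like the sketch below. The helper name `retry_on_file_error` and its parameters are hypothetical, and it wraps an arbitrary callable rather than any real arelle function. Retrying on `FileNotFoundError` as well is an assumption, based on the `[Errno 2]` rename failure in the log above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_file_error(
    func: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    retry_on: tuple[type[BaseException], ...] = (FileExistsError, FileNotFoundError),
) -> T:
    """Call func, retrying when a cache-file race raises one of retry_on."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of retries: let the last error propagate to the caller.
                raise
            # Linear backoff so that concurrent loaders drift apart.
            time.sleep(delay_seconds * attempt)
    raise AssertionError("unreachable")
```

A caller would pass the taxonomy-loading step as a zero-argument callable, for example `retry_on_file_error(lambda: load_taxonomy(path))`, where `load_taxonomy` stands in for whatever function triggers the arelle cache download.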
