Nightly Build Failure 2024-03-07 #3449

Closed
5 tasks
zaneselvans opened this issue Mar 7, 2024 · 4 comments
Labels
bug Things that are just plain broken. nightly-builds Anything having to do with nightly builds or continuous deployment. xbrl Related to the FERC XBRL transition

Comments

@zaneselvans
Member

zaneselvans commented Mar 7, 2024

Overview

This seems to be another iteration of the failure from 2 days ago in #3441, stemming from Arelle having trouble with a cached file that it downloads from xbrl.org.

It may also be related to the problems we've been having with the XBRL archiver.

The first questionable errors that show up in the logs seem to be:

2024-03-07 06:12:39,428 [webCache:cacheDownloadRenamingError] [Errno 2] No such file or directory: '/home/mambauser/.config/arelle/cache/http/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd.tmp' -> '/home/mambauser/.config/arelle/cache/http/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd' 
Unsuccessful renaming of downloaded file to active file /home/mambauser/.config/arelle/cache/http/www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd 
Please remove with file manager. - 

2024-03-07 06:12:39,429 [IOerror] sched-234_2022-01-01_def.xml: file error: [Errno 2] No such file or directory - ../../schedules/ScheduleAccumulatedDeferredIncomeTaxes/sched-234_2022-01-01.xsd 4

Next steps

  • Can we narrow the problem down enough that we can open a useful issue in the Arelle repo? This seems like a new problem that we haven't run into before.
  • They seem to have a lot of "linkbase table" PRs in flight.

Verify that everything is fixed!

Once you've applied any necessary fixes, make sure that the nightly build outputs are all in their right places.

Tasks

Relevant logs

[link to build logs from internal distribution bucket]( PLEASE FIND THE ACTUAL LINK AND FILL IN HERE )

@zaneselvans zaneselvans added bug Things that are just plain broken. xbrl Related to the FERC XBRL transition nightly-builds Anything having to do with nightly builds or continuous deployment. labels Mar 7, 2024
@zaneselvans zaneselvans moved this from New to Backlog in Catalyst Megaproject Mar 7, 2024
@zaneselvans
Member Author

@jdangerx Should we say that this seems to have fixed itself for the moment?

@jdangerx
Member

I've been able to reproduce this fairly consistently with the following snippet:

# Minimal reproduction: several forked workers load the same taxonomy
# against a freshly cleared Arelle web cache.
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

from arelle import Cntlr, ModelManager, ModelXbrl, WebCache

TAXONOMY_URL = "https://eCollection.ferc.gov/taxonomy/form60/2022-01-01/form/form60/form-60_2022-01-01.xsd"


def load_tax(_i):
    # Each worker builds its own controller and model manager.
    cntlr = Cntlr.Cntlr()
    model_manager = ModelManager.initialize(cntlr)
    ModelXbrl.load(model_manager, TAXONOMY_URL)
    return 1


if __name__ == "__main__":
    # Start from an empty cache so that every worker sees a cache miss.
    cntlr = Cntlr.Cntlr()
    cache = WebCache.WebCache(cntlr, None)
    cache.clear()
    with ProcessPoolExecutor(max_workers=10, mp_context=mp.get_context("fork")) as executor:
        taxonomies = list(executor.map(load_tax, range(5)))

I think the issue is that we split ferc_to_sqlite into form-specific ops. That works fine when you aren't using subprocesses, but once multiple subprocesses try to write to the cache at the same time, we hit a race condition where two processes execute this code concurrently:

        if reload or not filepathExists:
            return filepath if self._downloadFile(url, filepath) else None

P1 and P2 both see that not filepathExists; then P1 successfully downloads the file, and P2 tries to download the file but runs into:

FileExistsError: [Errno 17] File exists: '/Users/dazhong-catalyst/Library/Caches/Arelle/https/eCollection.ferc.gov/taxonomy/form60/2022-01-01/form/form60'
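
The check-then-act pattern behind this kind of failure can be shown with the standard library alone. The sketch below is not Arelle's code: `racy_make_dir`, `safe_make_dir`, and `demo` are hypothetical names, and the assumption that the error comes from creating a cache directory without tolerating an existing one is a guess based on the path in the traceback. The two "processes" are simulated by interleaving their steps by hand so the outcome is deterministic.

```python
import os
import tempfile


def racy_make_dir(path, exists_snapshot):
    """Act on an earlier existence check, which may be stale by now."""
    if not exists_snapshot:
        os.mkdir(path)  # raises FileExistsError if another caller got there first


def safe_make_dir(path):
    """Let the OS resolve the race: an existing directory is not an error."""
    os.makedirs(path, exist_ok=True)


def demo():
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "form60")
        # Both simulated processes check before either one acts.
        p1_saw = os.path.exists(target)
        p2_saw = os.path.exists(target)
        racy_make_dir(target, p1_saw)  # P1 creates the directory
        try:
            racy_make_dir(target, p2_saw)  # P2 acts on stale information
            racy_failed = False
        except FileExistsError:
            racy_failed = True
        safe_make_dir(target)  # idempotent, does not raise here
        return racy_failed
```

Calling `demo()` returns whether the second, stale caller hit `FileExistsError`. Making the write idempotent is one way out; the other is to make sure only one process ever writes.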

I think we need to warm the cache with an op that fetches all the taxonomies ahead of time. All the actual extraction ops can then depend on the warmed cache, which means filepathExists will always be True in every process and we avoid the race condition.
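
A rough sketch of the warm-then-fan-out shape, using only the standard library. Everything here is illustrative: `warm_cache`, `read_only_worker`, `run`, and the `fetch` callable are hypothetical stand-ins rather than PUDL or Arelle APIs, a temporary directory plays the role of the Arelle cache, and threads stand in for the subprocesses to keep the example self-contained.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def cache_path(cache_dir, name):
    return os.path.join(cache_dir, name)


def warm_cache(cache_dir, names, fetch):
    """Serially fetch every resource once, before any workers start."""
    os.makedirs(cache_dir, exist_ok=True)
    for name in names:
        path = cache_path(cache_dir, name)
        if not os.path.exists(path):
            with open(path, "w") as f:
                f.write(fetch(name))


def read_only_worker(cache_dir, name):
    """Workers only read; a miss here means the warm-up step was skipped."""
    with open(cache_path(cache_dir, name)) as f:
        return f.read()


def run(names, fetch, max_workers=4):
    with tempfile.TemporaryDirectory() as cache_dir:
        # Single writer: nothing else touches the cache during warm-up.
        warm_cache(cache_dir, names, fetch)
        # Fan out only after the cache is complete.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(lambda n: read_only_worker(cache_dir, n), names))
```

For example, `run(["a", "b"], lambda n: n.upper())` warms two entries and then reads them back from the workers. The important property is ordering: all writes finish in one process before any concurrent reader exists.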

@jdangerx jdangerx moved this from Backlog to In progress in Catalyst Megaproject Mar 19, 2024
@jdangerx
Member

This is the same problem as catalyst-cooperative/pudl-archiver#285, but we need to solve it separately because we're not operating in a Dagster environment. More thoughts there.

@zaneselvans
Member Author

@jdangerx I think this has been fixed with the new version of the extractor, so I'm closing.

@github-project-automation github-project-automation bot moved this from In progress to Done in Catalyst Megaproject Mar 27, 2024