Nightly Build Failure 2024-03-05 #3441

Closed
5 tasks
jdangerx opened this issue Mar 5, 2024 · 7 comments
Assignees
Labels
bug Things that are just plain broken. nightly-builds Anything having to do with nightly builds or continuous deployment.

Comments

@jdangerx
Member

jdangerx commented Mar 5, 2024

Overview

Something funny seems to have happened with arelle's local cache...

Next steps

What next steps do we need to do to understand or remediate the issue?

  • did any deps change on main?
  • can we reproduce this locally by running the same commands as the pudl etl script?

Looks like arelle changed on main. We can check whether 2.23.13 breaks locally too, and whether there are relevant code changes between those patch versions.

-  - arelle-release=2.23.10=pyhd8ed1ab_0
+  - arelle-release=2.23.13=pyhd8ed1ab_0
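
For the "did any deps change on main?" question, here is a minimal sketch of diffing pinned versions between two environment files. It assumes conda-style `name=version=build` lines like the ones above; the helper names `parse_pins` and `changed_pins` are hypothetical and not part of PUDL.

```python
def parse_pins(lines):
    """Map package name -> version for lines like '  - name=version=build'."""
    pins = {}
    for line in lines:
        spec = line.strip().lstrip("- ")
        name, sep, rest = spec.partition("=")
        if not sep or not name:
            continue  # skip headers such as 'dependencies:' or channel names
        # Take the version field; also tolerates pip-style 'name==version'.
        pins[name] = rest.lstrip("=").split("=")[0]
    return pins


def changed_pins(old_lines, new_lines):
    """Return {name: (old_version, new_version)} for every pin that differs."""
    old, new = parse_pins(old_lines), parse_pins(new_lines)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }
```

Feeding it the two arelle lines above would report `arelle-release` moving from 2.23.10 to 2.23.13; a package present on only one side shows up with `None` for the missing version.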

UPDATE 10:34 ET: 2.23.13 did not break the following ferc_to_sqlite run:

PUDL_OUTPUT=$PUDL_OUTPUT/tmp ferc_to_sqlite --loglevel DEBUG --gcs-cache-path gs://internal-zenodo-cache.catalyst.coop --workers 8 ~/work/pudl/src/

And the code changes between 2.23.10 and 2.23.13 didn't turn up anything suspicious.

Verify that everything is fixed!

Once you've applied any necessary fixes, make sure that the nightly build outputs are all in their right places.

Tasks

Relevant logs

link to build logs from internal distribution bucket

Exception: dagster._core.errors.DagsterExecutionStepExecutionError: Error occurred while executing op "ferc6_xbrl":

Stack Trace:
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_plan.py", line 286, in dagster_event_sequence_for_step
    for step_event in check.generator(step_events):
<... dagster continues to complain that there was an error ...>

The above exception was caused by the following exception:
FileExistsError: [Errno 17] File exists: '/home/mambauser/.config/arelle/cache/http/www.xbrl.org/dtr/type'

Stack Trace:
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/utils.py", line 54, in op_execution_error_boundary
    yield
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_utils/__init__.py", line 467, in iterate_with_context
    next_output = next(iterator)
                  ^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/compute_generator.py", line 131, in _coerce_op_compute_fn_to_iterator
    result = invoke_compute_fn(
             ^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/compute_generator.py", line 125, in invoke_compute_fn
    return fn(context, **args_to_pass) if context_arg_provided else fn(**args_to_pass)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/pudl/src/pudl/extract/xbrl.py", line 85, in inner_op
    convert_form(
  File "/home/mambauser/pudl/src/pudl/extract/xbrl.py", line 129, in convert_form
    run_main(
  File "/home/mambauser/env/lib/python3.11/site-packages/ferc_xbrl_extractor/cli.py", line 158, in run_main
    extracted = xbrl.extract(
                ^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/ferc_xbrl_extractor/xbrl.py", line 59, in extract
    table_defs = get_fact_tables(
                 ^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/ferc_xbrl_extractor/xbrl.py", line 249, in get_fact_tables
    taxonomy = Taxonomy.from_source(taxonomy_source, entry_point=entry_point)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/ferc_xbrl_extractor/taxonomy.py", line 255, in from_source
    taxonomy, view = load_taxonomy_from_archive(taxonomy_source, entry_point)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/ferc_xbrl_extractor/arelle_interface.py", line 48, in load_taxonomy_from_archive
    return _taxonomy_view(file_source)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/ferc_xbrl_extractor/arelle_interface.py", line 19, in _taxonomy_view
    taxonomy = ModelXbrl.load(model_manager, taxonomy_source)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
***** the above is the last stack frame that belongs to us vs. arelle

  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelXbrl.py", line 87, in load
    modelXbrl.modelDocument = ModelDocument.load(modelXbrl, url, base, isEntry=True, **kwargs)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 335, in load
    modelDocument.schemaDiscover(rootNode, isIncluded, isSupplemental, namespace)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 951, in schemaDiscover
    self.schemaDiscoverChildElements(rootElement, isSupplemental)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 975, in schemaDiscoverChildElements
    self.importDiscover(modelObject)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 1075, in importDiscover
    doc = load(self.modelXbrl, schemaLocation, base=importElementBase, isDiscovered=self.inDTS,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 335, in load
    modelDocument.schemaDiscover(rootNode, isIncluded, isSupplemental, namespace)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 951, in schemaDiscover
    self.schemaDiscoverChildElements(rootElement, isSupplemental)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 1007, in schemaDiscoverChildElements
    self.schemaDiscoverChildElements(modelObject, isSupplemental)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 1007, in schemaDiscoverChildElements
    self.schemaDiscoverChildElements(modelObject, isSupplemental)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 999, in schemaDiscoverChildElements
    self.schemaLinkbaseRefDiscover(modelObject)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 1121, in schemaLinkbaseRefDiscover
    return self.discoverHref(element, urlRewritePluginClass="ModelDocument.InstanceSchemaRefRewriter")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 1241, in discoverHref
    doc = _newDoc(self.modelXbrl, url, isDiscovered=not nonDTS, base=self.baseForElement(element), referringElement=element)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 337, in load
    modelDocument.linkbaseDiscover(rootNode)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 1172, in linkbaseDiscover
    href = self.discoverHref(linkElement, nonDTS=nonDTS)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 1241, in discoverHref
    doc = _newDoc(self.modelXbrl, url, isDiscovered=not nonDTS, base=self.baseForElement(element), referringElement=element)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 335, in load
    modelDocument.schemaDiscover(rootNode, isIncluded, isSupplemental, namespace)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 951, in schemaDiscover
    self.schemaDiscoverChildElements(rootElement, isSupplemental)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 975, in schemaDiscoverChildElements
    self.importDiscover(modelObject)
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 1075, in importDiscover
    doc = load(self.modelXbrl, schemaLocation, base=importElementBase, isDiscovered=self.inDTS,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/ModelDocument.py", line 109, in load
    filepath = modelXbrl.modelManager.cntlr.webCache.getfilename(mappedUri, reload=reloadCache, checkModifiedTime=kwargs.get("checkModifiedTime",False))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/WebCache.py", line 426, in getfilename
    return filepath if self._downloadFile(url, filepath) else None
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/arelle/WebCache.py", line 508, in _downloadFile
    os.makedirs(filedir)
  File "<frozen os>", line 225, in makedirs
@jdangerx jdangerx added bug Things that are just plain broken. nightly-builds Anything having to do with nightly builds or continuous deployment. labels Mar 5, 2024
@jdangerx jdangerx moved this from New to In progress in Catalyst Megaproject Mar 5, 2024
@jdangerx jdangerx self-assigned this Mar 5, 2024
@jdangerx
Member Author

jdangerx commented Mar 5, 2024

I looked at the code and found that the error occurs here:

        if not os.path.exists(filedir):
            os.makedirs(filedir)

This looks like a race condition: something creates the directory between the os.path.exists check and the os.makedirs call. I'm confused, though, because this happens before we split off into multiple workers. There's some concurrency code in the Arelle lib, but it looks mostly UI-related. Maybe there's some code path in there that I don't understand.
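
To make the suspected failure mode concrete, here is a minimal sketch of the check-then-create pattern quoted above next to a variant that tolerates an existing directory. This is not Arelle's code and does not establish that a race caused this build failure; the function names are hypothetical. `replay_race` reproduces the bad interleaving by hand so the resulting error is deterministic.

```python
import errno
import os
import threading


def ensure_dir_racy(path):
    """Check-then-create, as in the snippet above. The two steps are not atomic."""
    if not os.path.exists(path):
        # Another thread or process can create `path` right here.
        os.makedirs(path)


def ensure_dir_safe(path):
    """One call that tolerates a directory that already exists."""
    os.makedirs(path, exist_ok=True)


def replay_race(path):
    """Replay the bad interleaving by hand and return the resulting errno."""
    assert not os.path.exists(path)  # caller A: the check says "missing"
    os.makedirs(path)                # caller B: creates the directory first
    try:
        os.makedirs(path)            # caller A: the create now fails
    except FileExistsError as err:
        return err.errno
    return None


def hammer(fn, path, n_threads=16):
    """Call fn(path) from many threads at once and collect any OSError raised."""
    errors = []
    barrier = threading.Barrier(n_threads)

    def worker():
        barrier.wait()  # line the threads up so they start together
        try:
            fn(path)
        except OSError as err:
            errors.append(err)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return errors
```

`replay_race` returns `errno.EEXIST`, the same Errno 17 seen in the traceback. Running `hammer(ensure_dir_safe, path)` should come back with an empty list, whereas `hammer(ensure_dir_racy, path)` may or may not fail on any given run, which would be consistent with a re-run succeeding.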

@jdangerx
Member Author

jdangerx commented Mar 5, 2024

@bendnorman you had to do some fiddling to get the nightly build re-running last time, right? What was the fiddling?

@bendnorman
Member

Yes! I manually kicked off the GitHub action for the tag of the previous nightly build.

@jdangerx
Member Author

jdangerx commented Mar 6, 2024

Re-running fixed this. 🤷

@jdangerx jdangerx closed this as completed Mar 6, 2024
@github-project-automation github-project-automation bot moved this from In progress to Done in Catalyst Megaproject Mar 6, 2024
@zaneselvans
Member

I'm suspicious that this error may be related to the error we're getting in the FERC XBRL archivers: catalyst-cooperative/pudl-archiver#285

It's complaining about the same file early on: https://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd

@zaneselvans
Member

And I guess this never comes up when running the ETL locally because this file has looong since been cached in my personal local arelle cache.

@zaneselvans
Member

I tried removing the cache directory entirely in my local setup and re-running the FERC to SQLite extraction, and it doesn't seem to care at all. It's not even repopulating the cached files.
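
A reversible way to repeat that cold-cache experiment could look like the sketch below. It moves a cache directory aside for the duration of a block, records any files written there in the meantime, then restores the original. The helper name `cold_cache` is hypothetical, and the cache location is a parameter because the local path is an assumption; the traceback only shows where it lives inside the nightly build container.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def cold_cache(cache_dir):
    """Temporarily hide cache_dir; yield a list filled with files created meanwhile."""
    cache_dir = os.path.abspath(cache_dir)
    holding = None
    if os.path.isdir(cache_dir):
        # Park the existing cache next to its original location.
        holding = tempfile.mkdtemp(dir=os.path.dirname(cache_dir))
        shutil.move(cache_dir, os.path.join(holding, "saved"))
    repopulated = []
    try:
        yield repopulated
    finally:
        # Record whatever the run wrote into the fresh cache, then discard it.
        for root, _dirs, files in os.walk(cache_dir):
            repopulated.extend(os.path.join(root, name) for name in files)
        shutil.rmtree(cache_dir, ignore_errors=True)
        if holding is not None:
            # Put the original cache back exactly where it was.
            shutil.move(os.path.join(holding, "saved"), cache_dir)
            os.rmdir(holding)


# Usage sketch (the path is an assumption, adjust to the local setup):
# with cold_cache(os.path.expanduser("~/.config/arelle/cache")) as written:
#     ...run the extraction here...
# print(written)  # empty if nothing was re-downloaded into the cache
```

An empty list after the block would match the observation that the cache is not being repopulated, which could mean the files are being served from somewhere other than that directory.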
