
Error with dataclass in queue submission (?) #508

Closed
ligerzero-ai opened this issue Nov 23, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@ligerzero-ai

I get a weird error when I try to submit VASP to the queue:

from pymatgen.io.vasp.inputs import Incar
from ase.build import bulk
from pymatgen.io.ase import AseAtomsAdaptor
#from vasp_nodes import get_default_POTCAR_paths, write_POTCAR, VaspInput, vasp_job, create_WorkingDirectory, stack_element_string
import numpy as np
import subprocess  # used by submit_to_slurm below
from time import sleep  # needed for the polling loops below
from pyiron_workflow import for_node, Workflow

%load_ext autoreload
%autoreload 2

from pyiron_nodes.atomistic.engine.vasp import get_default_POTCAR_paths, write_POTCAR, VaspInput, vasp_job, create_WorkingDirectory, stack_element_string

incar = Incar.from_dict({
    "ENCUT": 520,
    "EDIFF": 1e-4,
    "ISMEAR": 1,
    "SIGMA": 0.1,
})

bulk_Fe = bulk("Fe", cubic=True, a=2.83)
bulk_Fe = AseAtomsAdaptor().get_structure(bulk_Fe)
vi = VaspInput(bulk_Fe, incar, potcar_paths=["/cmmc/u/hmai/vasp_potentials_54/Fe_sv/POTCAR"])

import pyiron_workflow as pwf

def submit_to_slurm(
    node,
    /,
    job_name=None,
    output_file=None,
    error_file=None,
    time_limit="00:05:00",
    account="hmai",
    partition="s.cmmg",
    nodes=1,
    ntasks=40,
    cpus_per_task=40,
    memory="1GB",
):
    """
    An example of a helper function for running nodes on slurm.

    - Saves the node
    - Writes a slurm batch script that 
        - Loads the node
        - Runs it
        - Saves it again
    - Runs the batch script
    """
    if node.graph_root is not node:
        raise ValueError(
            f"Can only submit parent-most nodes, but {node.full_label} "
            f"has root {node.graph_root.full_label}"
        )
        
    node.save(backend="pickle")
    p = node.as_path()
    
    if job_name is None:
        job_name = node.full_label 
        job_name = job_name.replace(node.semantic_delimiter, "_")
        job_name = "pwf" + job_name
        
    script_content = f"""#!/bin/bash
#SBATCH --partition={partition}
#SBATCH --nodes={nodes}
#SBATCH --ntasks={ntasks}
#SBATCH --cpus-per-task={cpus_per_task}
#SBATCH --time={time_limit}
#SBATCH --account={account}
#SBATCH --output={output_file or 'time.out'}
#SBATCH --error={error_file or 'error.out'}
#SBATCH --job-name={job_name}
#SBATCH --get-user-env=L
#SBATCH --mem={memory}
#SBATCH --hint=nomultithread
##SBATCH --reservation=benchmarking

# Execute Python script inline
python - <<EOF
from pyiron_workflow import PickleStorage
node = PickleStorage().load(filename="{node.as_path().joinpath('picklestorage').resolve()}")  # Load
node.run()  # Run
node.save(backend="pickle")  # Save again
EOF
"""
    submission_script = p.joinpath("node_submission.sh")
    submission_script.write_text(script_content)
    submission = subprocess.run(["sbatch", str(submission_script.resolve())])
    return submission

#v_job = vasp_job(workdir="/cmmc/u/hmai/github_dev_pyiron/pyiron_vasp_nodes/test3", vasp_input = vi)
wf = pwf.Workflow("vasp_queue")
wf.vasp_job_queue = vasp_job(
    workdir="/cmmc/u/hmai/github_dev_pyiron/pyiron_vasp_nodes/test3",
    vasp_input=vi,
)
# Send it off
submit_to_slurm(wf)

# Try reloading it (need to wait for `submit_to_slurm`)
while not wf.has_saved_content():
    sleep(0.2)
reloaded = pwf.Workflow(wf.label)

# Reload it until it's done
while reloaded.vasp_job_queue.outputs.vasp_output.value is pwf.NOT_DATA:
    sleep(1)
    reloaded.load()
print(reloaded.vasp_job_queue.outputs.to_value_dict()["vasp_output"]["generic"]["energy_pot"])
reloaded.outputs.to_value_dict()

In error.out:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pyiron_workflow/__init__.py", line 37, in <module>
    from pyiron_workflow.workflow import Workflow
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pyiron_workflow/workflow.py", line 13, in <module>
    from pyiron_workflow.nodes.composite import Composite
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pyiron_workflow/nodes/composite.py", line 15, in <module>
    from pyiron_workflow.create import HasCreator
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pyiron_workflow/create.py", line 16, in <module>
    from pyiron_workflow.nodes.function import function_node, as_function_node
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pyiron_workflow/nodes/function.py", line 10, in <module>
    from pyiron_workflow.mixin.preview import ScrapesIO
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pyiron_workflow/mixin/preview.py", line 28, in <module>
    from pyiron_workflow.channels import NOT_DATA
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pyiron_workflow/channels.py", line 19, in <module>
    from pyiron_workflow.type_hinting import (
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pyiron_workflow/type_hinting.py", line 10, in <module>
    from pint import Quantity
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pint/__init__.py", line 18, in <module>
    from .delegates.formatter._format_helpers import formatter
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pint/delegates/__init__.py", line 12, in <module>
    from . import txt_defparser
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pint/delegates/txt_defparser/__init__.py", line 12, in <module>
    from .defparser import DefParser
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pint/delegates/txt_defparser/defparser.py", line 10, in <module>
    from . import block, common, context, defaults, group, plain, system
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/site-packages/pint/delegates/txt_defparser/common.py", line 23, in <module>
    @dataclass(frozen=True)
     ^^^^^^^^^^^^^^^^^^^^^^
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/dataclasses.py", line 1220, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/cmmc/ptmp/pyironhb/mambaforge/envs/pyiron_mpie_cmti_2024-11-18/lib/python3.11/dataclasses.py", line 993, in _process_class
    raise TypeError('cannot inherit frozen dataclass from a '
TypeError: cannot inherit frozen dataclass from a non-frozen one
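
For reference, the final TypeError comes straight from the stdlib dataclasses module, independent of pint or pyiron: a frozen dataclass may not inherit from a non-frozen one. A minimal reproduction (hypothetical class names) is:

from dataclasses import dataclass

@dataclass  # non-frozen base
class Base:
    x: int = 1

@dataclass(frozen=True)  # frozen subclass: raises at class-creation time
class Child(Base):
    y: int = 2
# TypeError: cannot inherit frozen dataclass from a non-frozen one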
ligerzero-ai added the bug label on Nov 23, 2024
@jan-janssen
Member

That seems to be a pint-related issue (hgrecco/pint#1969 (comment)); @niklassiemer is already working on it in pyiron/docker-stacks#417.
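
A quick way to confirm which versions the environment actually resolves for pint and the parsing helper packages implicated in the linked issue (a hypothetical, stdlib-only check):

import importlib.metadata as md  # stdlib, Python 3.8+

for pkg in ("pint", "flexparser", "flexcache"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")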

@liamhuber
Member

I hope that's it, because otherwise I am not seeing how to fix it.

On the off chance that this isn't purely an upstream/pint issue and we need to dig deeper, try simplifying your example like this:

import pyiron_workflow as pwf
from dataclasses import dataclass

@dataclass
class SomeData:
    foo: str = "foo"
    bar: int = 42

@pwf.as_function_node
def SomeNode(data):
    return data

n = SomeNode(SomeData())

# Then the rest of the submission and load looping on this simpler node
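
For example, mirroring the submission and reload pattern from the issue body (a sketch assuming the submit_to_slurm helper above is in scope, with a hypothetical workflow label):

from time import sleep

wf = pwf.Workflow("dataclass_check")
wf.some_node = SomeNode(SomeData())
submit_to_slurm(wf)  # helper defined in the issue body

# Wait until the save from submit_to_slurm has landed, then reload
while not wf.has_saved_content():
    sleep(0.2)
reloaded = pwf.Workflow(wf.label)
reloaded.load()
print(reloaded.outputs.to_value_dict())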

@niklassiemer
Member

If it is the pint issue, it will break on a bare import pint and thus also on import pyiron_workflow. The resulting error is certainly very similar!
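
That is, a one-line check in the affected environment should already reproduce the traceback above, with no pyiron involved:

import pint  # in a broken environment this alone raises
             # TypeError: cannot inherit frozen dataclass from a non-frozen one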

@liamhuber
Member

Right, and the import line itself is indeed at the root of the stack trace; I should have noticed that. It looks very probable that this is a non-workflow issue, and hopefully the new release with the latest pint will resolve the downgrade issue you're having on the docker stack.

@ligerzero-ai
Author

I fixed (?) it by running in a completely separate environment with my own pyiron_workflow.

The good news is that the first VASP calculation using Liam's method of node serialisation was successful!

Looking forward to running production calculations.
