Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type annotations for io.vasp.outputs #3776

Merged
merged 24 commits into from
May 31, 2024

Conversation

DanielYang59
Copy link
Contributor

@DanielYang59 DanielYang59 commented Apr 21, 2024

Summary

  • Add type annotations for io.vasp.outputs, a follow-up PR of Add type annotations for io.vasp.inputs/optics #3740.
  • Use PathLike as type over str for all paths.
  • Add missing docstrings and clean up docstrings.
  • Lazily import Vasprun/Xdatcar in Trajectory to be consistent with other part of the class.

@janosh janosh added io Input/output functionality vasp Vienna Ab initio Simulation Package types Type all the things labels Apr 21, 2024
@DanielYang59
Copy link
Contributor Author

@janosh While you are working on improving dcostrings, can you please skip this io.vasp.outputs part to avoid overlapping? I would help fixing some docstring issues while adding types. Thanks!

@@ -766,33 +786,39 @@ def run_type(self) -> str:
if self.parameters.get("LUSE_VDW", False):
run_type += "+rVV10"
elif self.incar.get("IVDW") in IVDW_TYPES:
run_type += "+vdW-" + IVDW_TYPES[self.incar.get("IVDW")]
run_type += f"+vdW-{IVDW_TYPES[self.incar.get("IVDW", 0)]}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No sure if we really need the default IVDW as 0 (for no correction), see VASP wiki

Copy link
Contributor

@esoteric-ephemera esoteric-ephemera May 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be cognizant of downstream effects, especially in the materials project software stack. If there is no reason to remove/change something, then it may be best to leave as-is in case something downstream depends on it

Copy link
Contributor Author

@DanielYang59 DanielYang59 May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure about downstream effect, but I assume default of 0 instead of None might align better with VASP behavior.

I would revert this for now, and focus on typing things in this PR.

Copy link
Contributor Author

@DanielYang59 DanielYang59 May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can revisit this change?

With current implementation, if the IVDW tag is absent from INCAR, self.incar.get('IVDW') would return None, and f"+vdW-{IVDW_TYPES[self.incar.get('IVDW')]}" would break by using None as the index.

elif self.incar.get("IVDW") in IVDW_TYPES:
run_type += "+vdW-" + IVDW_TYPES[self.incar.get("IVDW")]

IVDW_TYPES = {
1: "DFT-D2",
10: "DFT-D2",
11: "DFT-D3",
12: "DFT-D3-BJ",
2: "TS",
20: "TS",
21: "TS-H",
202: "MBD",
4: "dDsC",
}

Also I think setting the default value of IVDW is a rational default which agrees with the VASP wiki:

IVDW = 0 | 1 | 10 | 11 | 12 | 13 | 2 | 20 | 21 | 202 | 263 | 3 | 4
Default: IVDW = 0 (no correction)
Description: IVDW specifies a vdW (dispersion) correction.

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented May 6, 2024

I'm now facing significant resistance from the "default value as None behavior" (most mypy errors are more or less related to possible default None values), similar to #3724.

Default values for get or similar methods

For the following example:

poscar_dict = {"lattice": Lattice, "pos": Position}  # a terrible example dict

- comment: str | None = poscar_dict.get("comment", None)
+ comment: str = poscar_dict.get("comment", "")

# More downstream operations that only works for `str`
clean_comment = comment.strip()

Using empty str "" could avoid the type being expanded to str | None.

Or perhaps:

@property
def net_magnetization(self):
"""Net magnetization from Chgcar"""
return np.sum(self.data["diff"]) if self.is_spin_polarized else None

Such default values as None would expand the type from float to float | None, meaning all downstream operations (or child objects) need to explicitly check for None.

Default values for functions/methods

For example

def __init__(
self,
filename: str | Path,
ionic_step_skip: int | None = None,
ionic_step_offset: int = 0,
parse_dos: bool = True,
parse_eigen: bool = True,
parse_projected_eigen: bool = False,
parse_potcar_file: bool = True,
occu_tol: float = 1e-8,
separate_spins: bool = False,
exception_on_bad_xml: bool = True,
) -> None:

The default ionic_step_skip is None, which would require all downstream methods to handle this case. Using a default int which will NOT normally happen during runtime could bypass this issue (for number of steps, -1 might be a good choice).

Some preliminary thoughts

  • For sequence types, use an empty sequence might be quite straightforward ("" for str, []/()/{} for list/tuple/dict). Certain not as a function default value with mutables.
  • For numbers, maybe use a number that would NOT happen during runtime (for example, -1 for number of CPU cores).

Summary

Such cases are happening everywhere across the code base (for all kinds of types) and need to insert huge amount of checks if var is None. I'm not sure how to resolve this gracefully (I cannot even describe this issue properly because it's happening everywhere and in various forms). Any suggestion would be appreciated @janosh.

@janosh
Copy link
Member

janosh commented May 6, 2024

Any suggestion would be appreciated @janosh.

definitely feel free to homogenize types where it make sense and simplifies the code. there are cases where it's important to be explicit about accepting None and handling that case separately. but if an empty string/tuple/... works for the default case, by all means use that over None

@DanielYang59 DanielYang59 marked this pull request as ready for review May 30, 2024 05:28
@DanielYang59
Copy link
Contributor Author

DanielYang59 commented May 30, 2024

@janosh Can you please revise this?

For most xml.etree or re related searches, I chose to suppress mypy error instead of check for None (for example 0346b92), because I think it's fairly safe to assume the source file is properly formatted and match should normally not return None.

Only change and need discussion is the change of default IVDW to agree with VASP in c990b01 and discussions in #3776 (comment).

I notice CI runs are cancelled randomly from time to time (Update: very frequently for example in https://github.com/materialsproject/pymatgen/actions/runs/9303003521/job/25604368181)?

@shyuep shyuep merged commit c70e5b8 into materialsproject:master May 31, 2024
23 checks passed
@DanielYang59 DanielYang59 deleted the type-vasp-outputs branch June 1, 2024 01:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
io Input/output functionality types Type all the things vasp Vienna Ab initio Simulation Package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants