Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor[next]: workflowify step3 #1516

Merged
merged 32 commits into from
Apr 29, 2024
Merged

Conversation

DropD
Copy link
Contributor

@DropD DropD commented Apr 2, 2024

Description

New:

  • ffront.stages.FieldOperatorDefinition

    • all the data to start the toolchain from a field operator dsl definition
  • ffront.stages.FoastOperatorDefinition

    • data after lowering from field operator dsl code
  • ffront.stages.FoastWithTypes

    • program argument types in addition to the foast definition for creating a program AST
  • ffront.stages.FoastClosure

    • program arguments in addition to the foast definition, ready to run the whole toolchain

Changed:

  • decorator.Program.__post_init__

    • implementation moved to past_passes.linters workflow steps
    • linting stage added to program transforms
  • decorator.FieldOperator.from_function

    • implementation moved to workflow step in ffront.func_to_foast
  • decorator.FieldOperator.as_program

    • implementation moved to workflow steps in ffront.foast_to_past
  • decorator.FieldOperator data attributes

    • added: definition_stage
    • removed:
      • .foast_node: replaced with .foast_stage.foast_node
      • .definition: replaced with .definition_stage.definition
  • next.backend.Backend

    • renamed: .transformer -> .transforms_prog
    • added: .transforms_fop, toolchain for starting from field operator
  • otf.recipes.FieldOpTransformWorkflow

    • now has all the steps from DSL field operator to ProgramCall via foast_to_past, with additional steps to go to the field operator IteratorIR expression directly instead (not run by default). The latter foast_to_itir step is required during lowering of programs that call a field operator.

Requirements

  • All fixes and/or new features come with corresponding tests.
  • Important design decisions have been documented in the approriate ADR inside the docs/development/ADRs/ folder.

@DropD DropD marked this pull request as ready for review April 5, 2024 09:20
@DropD DropD requested a review from havogt April 8, 2024 13:03
Copy link
Contributor

@havogt havogt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel you need to walk me through these changes, specifically the 2 different DEFAULT_X_TRANSFORMS and backend specifying 2 kinds of transforms.

src/gt4py/next/otf/recipes.py Outdated Show resolved Hide resolved
Copy link
Contributor

@havogt havogt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's talk with Enrique about the hashing part.

Comment on lines +103 to +105
if self.backend is not None and self.backend.transforms_prog is not None:
return self.backend.transforms_prog.func_to_past(self.definition_stage)
return next_backend.DEFAULT_PROG_TRANSFORMS.func_to_past(self.definition_stage)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have asked this question, in the previous refactoring: what's happening here? Can you document it? What's the case when the DEFAULT_PROG_TRANSFORMS should be used?

src/gt4py/next/ffront/decorator.py Show resolved Hide resolved
src/gt4py/next/ffront/decorator.py Show resolved Hide resolved
src/gt4py/next/ffront/decorator.py Show resolved Hide resolved
"""
Allows deriving dataclasses to modify what goes into the content hash per-field.

Warning: Using this will modify how the class gets pickled. If unpickling is desired,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@egparedes Can you have a look at this part? I am not sure I understand its implication and it seems like a second pair of eyes make sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the Mixin classes and the functions have been added here to support hashing the Stage classes with eve.content_hash but I don't really see why this is needed. I mean, if these classes need to be hashable and there is an unique way to implement the hash that makes sense in all use cases, then we could just write a custom __hash__ instead of hacking around with __setstate__ and __getstate__ methods (which in addition of breaking the pickleablity of the instances, it'd only work for the content_hash function).

An even better and more general solution, in my opinion, would be to define custom hash functions as required for different purposes as singledispatch free functions, and then register specific implementations for the classes that need to be hashable in that specific way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the singledispatch function is probably the solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My solution is up for review.

src/gt4py/next/backend.py Outdated Show resolved Hide resolved
@DropD DropD requested review from egparedes and havogt April 19, 2024 08:02
Copy link
Contributor

@havogt havogt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Let's wait for @egparedes feedback on the hashing.

src/gt4py/next/backend.py Show resolved Hide resolved
src/gt4py/next/backend.py Show resolved Hide resolved
Copy link
Contributor

@egparedes egparedes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only reviewed the changes in stages.py related to hashing. The code itself looks good, but I strongly suggest to rename the functions.

src/gt4py/next/ffront/stages.py Outdated Show resolved Hide resolved
src/gt4py/next/ffront/stages.py Outdated Show resolved Hide resolved
src/gt4py/next/ffront/stages.py Outdated Show resolved Hide resolved
src/gt4py/next/ffront/stages.py Outdated Show resolved Hide resolved
Comment on lines 107 to 137
@functools.singledispatch
def add_content_to_fingerprint(obj: Any, hasher: xtyping.HashlibAlgorithm) -> None:
hasher.update(str(obj).encode())


@add_content_to_fingerprint.register(FieldOperatorDefinition)
@add_content_to_fingerprint.register(FoastOperatorDefinition)
@add_content_to_fingerprint.register(FoastWithTypes)
@add_content_to_fingerprint.register(FoastClosure)
@add_content_to_fingerprint.register(ProgramDefinition)
@add_content_to_fingerprint.register(PastProgramDefinition)
@add_content_to_fingerprint.register(PastClosure)
def add_content_to_fingerprint_stages(obj: Any, hasher: xtyping.HashlibAlgorithm) -> None:
add_content_to_fingerprint(obj.__class__, hasher)
for field in dataclasses.fields(obj):
add_content_to_fingerprint(getattr(obj, field.name), hasher)


@add_content_to_fingerprint.register
def add_str_to_fingerprint(obj: str, hasher: xtyping.HashlibAlgorithm) -> None:
hasher.update(str(obj).encode())


@add_content_to_fingerprint.register(int)
@add_content_to_fingerprint.register(bool)
@add_content_to_fingerprint.register(float)
def add_builtin_to_fingerprint(
obj: None,
hasher: xtyping.HashlibAlgorithm,
) -> None:
hasher.update(str(obj).encode())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A suggestion to reduce boilerplate:

Suggested change
@functools.singledispatch
def add_content_to_fingerprint(obj: Any, hasher: xtyping.HashlibAlgorithm) -> None:
hasher.update(str(obj).encode())
@add_content_to_fingerprint.register(FieldOperatorDefinition)
@add_content_to_fingerprint.register(FoastOperatorDefinition)
@add_content_to_fingerprint.register(FoastWithTypes)
@add_content_to_fingerprint.register(FoastClosure)
@add_content_to_fingerprint.register(ProgramDefinition)
@add_content_to_fingerprint.register(PastProgramDefinition)
@add_content_to_fingerprint.register(PastClosure)
def add_content_to_fingerprint_stages(obj: Any, hasher: xtyping.HashlibAlgorithm) -> None:
add_content_to_fingerprint(obj.__class__, hasher)
for field in dataclasses.fields(obj):
add_content_to_fingerprint(getattr(obj, field.name), hasher)
@add_content_to_fingerprint.register
def add_str_to_fingerprint(obj: str, hasher: xtyping.HashlibAlgorithm) -> None:
hasher.update(str(obj).encode())
@add_content_to_fingerprint.register(int)
@add_content_to_fingerprint.register(bool)
@add_content_to_fingerprint.register(float)
def add_builtin_to_fingerprint(
obj: None,
hasher: xtyping.HashlibAlgorithm,
) -> None:
hasher.update(str(obj).encode())
@functools.singledispatch
def add_content_to_fingerprint(obj: Any, hasher: xtyping.HashlibAlgorithm) -> None:
hasher.update(str(obj).encode())
_default_impl = add_content_to_fingerprint[object]
for t in (str, int, bool, float):
add_content_to_fingerprint.register(t, _default_impl)
@add_content_to_fingerprint.register(FieldOperatorDefinition)
@add_content_to_fingerprint.register(FoastOperatorDefinition)
@add_content_to_fingerprint.register(FoastWithTypes)
@add_content_to_fingerprint.register(FoastClosure)
@add_content_to_fingerprint.register(ProgramDefinition)
@add_content_to_fingerprint.register(PastProgramDefinition)
@add_content_to_fingerprint.register(PastClosure)
def add_content_to_fingerprint_stages(obj: Any, hasher: xtyping.HashlibAlgorithm) -> None:
add_content_to_fingerprint(obj.__class__, hasher)
for field in dataclasses.fields(obj):
add_content_to_fingerprint(getattr(obj, field.name), hasher)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think registering a dispatch for the builtins is not even necessary at this point, now that the default is exactly the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was necessary for str and int, but only to avoid a max recursion depth error. I am not exactly sure why this helps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For str is needed because otherwise it will call the implementation for Iterable, not the default implementation, since str is Iterable (and that's why the infinite recursion appears). For int, I don't see why it would be needed. Are you sure that just str is not enough? Maybe I'm misunderstanding something...

Copy link
Contributor

@egparedes egparedes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still wonder why it is needed to register a specific implementation for int in the add_content_to_fingerprint singledispatch function, but other than that, the fingerprinting functions look good to me.

@DropD
Copy link
Contributor Author

DropD commented Apr 29, 2024

I still wonder why it is needed to register a specific implementation for int in the add_content_to_fingerprint singledispatch function, but other than that, the fingerprinting functions look good to me.

I also wonder why that should impact the recursion depth.

@DropD DropD merged commit 9cc7548 into GridTools:main Apr 29, 2024
51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants