Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collection class should not inherit from Catalog class #296

Closed
philvarner opened this issue Mar 29, 2021 · 14 comments
Closed

Collection class should not inherit from Catalog class #296

philvarner opened this issue Mar 29, 2021 · 14 comments
Labels

Comments

@philvarner
Copy link
Collaborator

This is a confusing part of the spec (which has changed recently) -- there is no longer a subtying relationship between the types Collection and Catalog. They're two independent types that just happen to share some fields of the same name.

pystac shouldn't conflate them in the implementation.

@philvarner
Copy link
Collaborator Author

Related: These tests #292

@m-mohr
Copy link
Contributor

m-mohr commented Apr 23, 2021

We said in calls, tooling could still implement an inheritance relationship though.

@lossyrob
Copy link
Member

I'm not sure it's worth it to remove this inheritance from the implementation. Considering they can both be roots, they operate in mostly the same way, the inheritance relationship is super useful. I'm not sure it would be worth it to create some CatalogCollectionCommonBase type and have that scattered around the library to have an empty Catalog implementation. I forget what the key difference was that made a collection not able to be a catalog - it was something about the number of item/child links? If it's a minimal distinction, I'd rather handle it as a special case rather than change the natural inheritance structure of PySTAC to accommodate the breakage of inheritance in the spec.

@m-mohr
Copy link
Contributor

m-mohr commented Apr 27, 2021

The primary reason was the type field, which has different values.

There are other minor reasons like the requirement for item/child links or different values for stac_extensions.

@philvarner
Copy link
Collaborator Author

One underlying issue is a recently-developed irrational hatred for subtyping inheritance.

But, the conceptual problem is that it violates the Liskov Substitution Principle for subtyping inheritance. A Collection can't be arbitrarily used anywhere that a Catalog can be used -- they're conceptually unrelated objects that just happen to have overlap in their field names.

I'd be glad to make the change, but since I haven't contributed to this component before, I wanted to discuss before doing the work (which should be minimal).

@m-mohr
Copy link
Contributor

m-mohr commented Apr 27, 2021

One underlying issue is a recently-developed irrational hatred for subtyping inheritance.

Who developed that?

But, the conceptual problem is that it violates the Liskov Substitution Principle for subtyping inheritance. A Collection can't be arbitrarily used anywhere that a Catalog can be used -- they're conceptually unrelated objects that just happen to have overlap in their field names.

I think I agree, but which are the places where a Catalog can't be substituted with a Collection?

@philvarner
Copy link
Collaborator Author

One underlying issue is a recently-developed irrational hatred for subtyping inheritance.

Who developed that?

Me! Sorry, I should have been more clear that I was talking about my own irrational hatred for it.

@lossyrob
Copy link
Member

I'm not sure there's enough to motivate a change like this here, but if you want to take a crack at it to prove it's a minimal change and show that there's a way to modify it in a non-disruptive way (which for someone who knows how much functionality Catalogs and Collections share, I am doubtful), I'd suggest proposing something against the #309 branch, as it has full type annotations and mypy. The tests aren't yet passing on it yet, that should be solved over this week, but it would be better to tinker around with the typed codebase with something like Pylance to check (which does check for breakage of the Liskov Substitution Principle) than the current main.

@lossyrob
Copy link
Member

Also, just to note:

they're conceptually unrelated objects that just happen to have overlap in their field names.

I can see where this opinion comes from, but I also do not agree with the strength of the stance that's implied here. Conceptually, they are related - that's why a Collection was a specific implementation of a Catalog with additional functionality to begin with. I know his has been argued ad nauseum in something that resembles the battle for functional folk to eradicate OO concepts from the world or whatever (I regret the days I was even resembling being on the side of that bike shedding war), but there's not many concepts that fit the description "things that can be the root of a STAC, and can hold child Collections or Catalogs as well as Items", and the two things that do fit that description are clearly related. From a practical standpoint, it is useful to have the inheritance. If there's some practical reason to separate the inheritance, that isn't rooted in ideology or coding tastes, I'm happy to weigh the pros vs cons of removing what is a useful programmatic concept to introduce more useful programmatic concepts.

@philvarner
Copy link
Collaborator Author

philvarner commented Apr 27, 2021 via email

@m-mohr
Copy link
Contributor

m-mohr commented Apr 27, 2021

I know his has been argued ad nauseum in something that resembles the battle for functional folk to eradicate OO concepts from the world or whatever

Hehe, I'm not sure that goes towards me being radical about inheritance in the discussion (and I hate that there's no inheritance any longer), but I'm actually strictly on the OO side and not on the functional side. ;-) Maybe to give some outside of Python point of view, all the JS implementations will also still implement inheritance for practical reasons.

I'm wondering, does Pylance fail in case the Collection class returns "Collection" instead of "Catalog" although it inherits from Catalog and Catalogs are always expected to return "Catalog" as type?

@philvarner
Copy link
Collaborator Author

Also, just to note:

they're conceptually unrelated objects that just happen to have overlap in their field names.

I can see where this opinion comes from, but I also do not agree with the strength of the stance that's implied here. Conceptually, they are related - that's why a Collection was a specific implementation of a Catalog with additional functionality to begin with. I know his has been argued ad nauseum in something that resembles the battle for functional folk to eradicate OO concepts from the world or whatever (I regret the days I was even resembling being on the side of that bike shedding war), but there's not many concepts that fit the description "things that can be the root of a STAC, and can hold child Collections or Catalogs as well as Items", and the two things that do fit that description are clearly related. From a practical standpoint, it is useful to have the inheritance. If there's some practical reason to separate the inheritance, that isn't rooted in ideology or coding tastes, I'm happy to weigh the pros vs cons of removing what is a useful programmatic concept to introduce more useful programmatic concepts.

Mine mostly comes from having been involved with several codebases that became overly-complicated and difficult to maintain because the inheritance hierarchy, for what amounted to basically just saving a few lines of duplicate code. Maybe this is the best way to structure it in a language like Python that doesn't have a typeclass mechanism? I'm still trying to work out a lot of these design principles myself, having done Java for a long time and then more recently trying to incorporate the Python & Rust & Scala & Haskell concepts in.

@lossyrob
Copy link
Member

Makes sense. Agree that inheritance isn't always (or often) the best solution, but does have it's place, especially in Python development. For sure it's always good to be trying things out and transferring concepts around where they're useful. Since leaning on the Pylance type checking functionality with full type annotations I'm enjoying some of the type-driven development I miss from Scala, though trying to keep Python code still looking like Python and not leaning too Haskell-y :-)

@lossyrob lossyrob removed the 1.0.0-RCx label May 4, 2021
gadomski added a commit that referenced this issue Nov 9, 2022
I'm generally uneasy with the way that the base catalog `extra_fields` holds
values that might be mutated on the collection. I think this is an argument
_for_ #296, namely that while the inheritance makes some things easier, it also
causes problems.
@gadomski gadomski added this to the 2.0 milestone Jan 31, 2023
@gadomski gadomski removed this from the 2.0 milestone Mar 24, 2023
@gadomski
Copy link
Member

Closing as wontfix, as I haven't seen the benefit of separating Catalog and Collection in a code sense (as correct as it is that they're different in a logical sense).

@gadomski gadomski closed this as not planned Won't fix, can't repro, duplicate, stale Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants