Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Improve datastore type hints #14678

Open
wants to merge 10 commits into
base: develop
Choose a base branch
from
Open

Improve datastore type hints #14678

wants to merge 10 commits into from

Conversation

clokep
Copy link
Member

@clokep clokep commented Dec 14, 2022

By being explicit about methods we should be able to get better type checking (although this isn't entirely proven in this PR). The idea is to mostly treat datastores the way we handle config classes, each as an individual unit that has to reach back to a "root" object to query other datastores.

This means that the main DataStore object would become a bundle of SQLBaseStore objects instead of a single entity which has super-classes of SQLBaseStore.

As a proof-of-concept I converted the RelationsStore (it is simple and I know it fairly well). Other stores could be done in follow-ups.

Related to #11165

Downsides of this approach is it might make #11733 worse as you now have a DataStore which includes a RelationsStore? It adds another layer to the puzzle.

@clokep
Copy link
Member Author

clokep commented Dec 14, 2022

(I should probably mention that this would need to be expanded to the other datastore classes, e.g. for workers, etc.)

@clokep clokep force-pushed the clokep/datastore-rejigger branch from 73f2fdb to 7e4ef0f Compare April 14, 2023 16:20
changelog.d/14678.misc Outdated Show resolved Hide resolved
@clokep clokep marked this pull request as ready for review April 14, 2023 16:24
@clokep clokep requested a review from a team as a code owner April 14, 2023 16:24
@clokep clokep removed the request for review from a team April 14, 2023 16:29
@clokep
Copy link
Member Author

clokep commented Apr 14, 2023

This needs a bit more to get it working with all the various datastores we create...

@clokep clokep changed the title Proof of concept for better typed datastores Improved datastore type hints Apr 14, 2023
@clokep clokep changed the title Improved datastore type hints Improve datastore type hints Apr 14, 2023
@clokep clokep force-pushed the clokep/datastore-rejigger branch from 7b54466 to 9796706 Compare May 17, 2023 18:32
@clokep clokep requested a review from a team May 17, 2023 18:36
Copy link
Contributor

@squahtx squahtx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay so far.

I think we're missing a few things before we can apply this pattern to the rest of the stores.

  • CacheInvalidationWorkerStore and SQLBaseStore assume there is only one instance of themselves. Is there a plan for how we're going to handle them?
  • The bit of SQLBaseStore that involves external_cached_functions won't work for split out stores. I'm not sure what it does.
  • process_replication_rows and process_replication_position won't get called on split out stores right now, so the massive chain of overrides and super() calls will break. We'd probably want to refactor things to avoid the massive chain anyway, and have stores register their interest in particular row types.

Comment on lines +145 to +148
try:
store = getattr(self, store_name)
except AttributeError:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be ignoring store_names that refer to missing fields here?

@squahtx squahtx requested a review from a team May 19, 2023 12:27
@squahtx
Copy link
Contributor

squahtx commented May 19, 2023

I'm going to put this back in the queue in case anyone has more thoughts.

@clokep
Copy link
Member Author

clokep commented May 23, 2023

  • CacheInvalidationWorkerStore and SQLBaseStore assume there is only one instance of themselves. Is there a plan for how we're going to handle them?

I'm failing to see how we would end up with more than one (per process)? Or are you just suggesting it might be possible to do that somehow?

  • The bit of SQLBaseStore that involves external_cached_functions won't work for split out stores. I'm not sure what it does.

Good-catch, I think we need to do something similar to _attempt_to_invalidate_cache. 👍

  • process_replication_rows and process_replication_position won't get called on split out stores right now, so the massive chain of overrides and super() calls will break. We'd probably want to refactor things to avoid the massive chain anyway, and have stores register their interest in particular row types.

This sounds like a good change to make regardless, maybe I'll just do this as a separate PR.

@squahtx
Copy link
Contributor

squahtx commented May 23, 2023

  • CacheInvalidationWorkerStore and SQLBaseStore assume there is only one instance of themselves. Is there a plan for how we're going to handle them?

I'm failing to see how we would end up with more than one (per process)? Or are you just suggesting it might be possible to do that somehow?

It's more that every store inherits from one of the two. Even now with this PR, I think there are two SQLBaseStores? One of them is the big data store and the other is the RelationsStore.

(I'm not suggesting that there should be exactly one or we should support multiple CacheInvalidationWorkerStore/SQLBaseStores. It's just something to think about.)

Copy link
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing to add to Sean's review — looks sensible overall and given this has stuck around for a while in the queue I don't think anyone else has any objections.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants