Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

♻️ [#2060] Replace get_paginated_results with pagination_helper #995

Merged
merged 3 commits into from
Feb 12, 2024

Conversation

stevenbal
Copy link
Contributor

@stevenbal stevenbal commented Jan 29, 2024

task: https://taiga.maykinmedia.nl/project/open-inwoner/task/2060

This PR replaces get_paginated_results (which was broken for zgw-consumers 0.28.0) with the new pagination_helper and refactors the client usage such that we have an APIClient for each API (zaken, catalogi, klanten, etc.)

@stevenbal stevenbal marked this pull request as draft January 29, 2024 14:29
@stevenbal stevenbal force-pushed the fix/2060-replace-get-paginated-results branch 5 times, most recently from 6063a72 to 0819fdf Compare February 1, 2024 13:46

return klanten
return client.retrieve_objectcontactmoment(contactmoment, object_type)


def fetch_klant(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure if we still need these functions after the changes, I suppose they do save us the hassle of having to build a client when we use it in views

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could remove them and just use the client instances in the views.

These are a level of indirection and we should really be re-using clients (for connection and expensive cert/schema setups etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove them 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed these functions and now use the clients directly. I wanted to switch to using context managers (with build_client("zaak"), but in some cases we can't do this currently, because we do not raise errors if we can't build a client but return None instead. This is related to the issue that we currently simply show no results if we can't build a client

"cases:{user_bsn}:{max_requests}:{identificatie}",
timeout=settings.CACHE_ZGW_ZAKEN_TIMEOUT,
)
def fetch_cases(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to rename this to be a bit more standardized (because now I see things like case / zaak mixed). Perhaps something like create/read/update/delete + name of the resource in the API: retrieve_zaken

@stevenbal
Copy link
Contributor Author

@alextreme @Bartvaderkin once this PR is approved, I'll ask Sergei if he can release a new version of zgw-consumers and I'll upgrade it in the requirements

@codecov-commenter
Copy link

codecov-commenter commented Feb 1, 2024

Codecov Report

Attention: 110 lines in your changes are missing coverage. Please review.

Comparison is base (4ae1a96) 94.74% compared to head (1dce7e0) 94.78%.

Files Patch % Lines
src/open_inwoner/openzaak/clients.py 85.45% 48 Missing ⚠️
src/open_inwoner/openklant/clients.py 81.53% 24 Missing ⚠️
src/open_inwoner/cms/cases/views/status.py 78.84% 11 Missing ⚠️
src/open_inwoner/openzaak/notifications.py 69.69% 10 Missing ⚠️
src/open_inwoner/openzaak/zgw_imports.py 67.74% 10 Missing ⚠️
src/open_inwoner/openklant/wrap.py 77.77% 2 Missing ⚠️
src/open_inwoner/cms/cases/views/mixins.py 92.30% 1 Missing ⚠️
src/open_inwoner/openklant/services.py 80.00% 1 Missing ⚠️
src/open_inwoner/openklant/views/contactform.py 95.83% 1 Missing ⚠️
...ak/migrations/0035_populate_zaaktypeconfig_urls.py 87.50% 1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #995      +/-   ##
===========================================
+ Coverage    94.74%   94.78%   +0.03%     
===========================================
  Files          872      871       -1     
  Lines        30573    30560      -13     
===========================================
- Hits         28966    28965       -1     
+ Misses        1607     1595      -12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@alextreme alextreme requested review from pi-sigma and removed request for alextreme February 1, 2024 16:29
@alextreme
Copy link
Member

Nice, thanks for this Steven! I'll wait for Bart and/or Paul to do a review first

Copy link
Contributor

@pi-sigma pi-sigma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Most of the comments are suggestions, so feel free to adopt what seems right.

src/open_inwoner/openklant/clients.py Show resolved Hide resolved
src/open_inwoner/openklant/clients.py Outdated Show resolved Hide resolved
src/open_inwoner/openklant/clients.py Outdated Show resolved Hide resolved
# eSuite doesn't implement a `object_type` query parameter
ret = [moment for moment in moments if moment.object_type == object_type]

return ret
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Since we're not using the list (?), you could:
  • yield from (moment for moment in moments if...)
  • (in the following function): return next(ocms, None)
  1. This seems to be the only use of the function which potentially returns multiple objects, but then we're only retrieving and returning the first (in retrieve_objectcontactmoment). Is this to allow for possible extensions where we actually use multiple objectcontactmomenten from this? Otherwise this and the following function could be collapsed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a API so best not to mix lists and generator return values (because you don't know how it is used).

(also: the example would be return (moment for moment in moments if...) (adding yield from just adds another generator around a generator))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could collapse it into one function retrieve_objectcontactmoment, since retrieve_objectcontactmomenten_for_object_type is indeed only used by that function currently. Though we might need this in the future, so that's why I added it initially

service = getattr(config, f"{type_}_service")
if service:
client = _build_client(service, client_factory=JSONParserClient)
client = _build_client(service, client_factory=client_class)
return client

logger.warning(f"no service defined for {type_}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger.warning("no service defined for {type}", type=type_) (better to avoid f-string interpolation with logging)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(better to avoid f-string interpolation with logging)

@pi-sigma Nooooo. Where does this information/advice come from?

.format() is much more dangerous because it'll actually raise runtime errors if you miss a replacement placeholder, while f-strings would catch it at parse time.

Copy link

@Viicos Viicos Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logger.warning("no service defined for %s", type_) is actually the correct form (see https://docs.python.org/3/howto/logging.html#optimization)

("no service defined for {type}", type=type_ is afaik not supported by the logging module)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That links to the optimization section. Here is the bit about variables: https://docs.python.org/3/howto/logging.html#logging-variable-data, which doesn't mention a correct form but says something about support.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought logger.warning("no service defined for {type}", type=type_) was equivalent to logger.warning("no service defined for %s", type_); perhaps that's not the case.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought logger.warning("no service defined for {type}", type=type_) was equivalent to logger.warning("no service defined for %s", type_); perhaps that's not the case.

log methods only allow pos. args (except for exc_info and similar). I also think that apart from performance (which can be meaningless is most cases), it allows easier grouping of log messages in tools like Sentry

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Bartvaderkin This not not (merely) about performance. The main motivation is to facilitate log aggregation in Sentry. If you do logger.warning("no service defined for %s", type_), the logger can group together logs with different instances under the same label. Both f-strings and .format should be avoided. Like I said, I didn't think of logger.warning("no service defined for {type}", type=type_) as equivalent to .format() logging.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we doing log aggregation in tools like Sentry in this project? I don't think the existing logs are setup for that so that would be introducing something new.

Copy link

@Viicos Viicos Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's true that with deferred string interpolation, it would only error at runtime if there's a mismatch in the number of arguments. I think linting rules exist to warn on this issue though (logging-too-many-args from Pylint)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we doing log aggregation in tools like Sentry in this project? I don't think the existing logs are setup for that so that would be introducing something new.

We do have Sentry set up for test/acceptance/prod and if logger.warning(f"no service defined for {type_}") gets triggered for two different type_s, it will create two separate issues in Sentry, whereas logger.warning(f"no service defined for %s", type_) will group them together

src/open_inwoner/openzaak/api_models.py Show resolved Hide resolved
# eSuite doesn't implement a `object_type` query parameter
ret = [moment for moment in moments if moment.object_type == object_type]

return ret
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a API so best not to mix lists and generator return values (because you don't know how it is used).

(also: the example would be return (moment for moment in moments if...) (adding yield from just adds another generator around a generator))

service = getattr(config, f"{type_}_service")
if service:
client = _build_client(service, client_factory=JSONParserClient)
client = _build_client(service, client_factory=client_class)
return client

logger.warning(f"no service defined for {type_}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(better to avoid f-string interpolation with logging)

@pi-sigma Nooooo. Where does this information/advice come from?

.format() is much more dangerous because it'll actually raise runtime errors if you miss a replacement placeholder, while f-strings would catch it at parse time.


return klanten
return client.retrieve_objectcontactmoment(contactmoment, object_type)


def fetch_klant(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could remove them and just use the client instances in the views.

These are a level of indirection and we should really be re-using clients (for connection and expensive cert/schema setups etc).

src/open_inwoner/openklant/wrap.py Show resolved Hide resolved
Comment on lines +15 to +28
def get_json_response(response: requests.Response) -> Optional[dict]:
try:
response_json = response.json()
except Exception:
response_json = None

results = []
try:
response.raise_for_status()
except requests.HTTPError as exc:
if response.status_code >= 500:
raise
raise ClientError(response_json) from exc

response = client.get(resource, *args, **kwargs)

def _get_results(response):
_results = response["results"]
if test_func:
_results = [result for result in _results if test_func(result)]
return _results

response = response
results += _get_results(response)

if minimum and len(results) >= minimum:
return results

while response.get("next"):
next_url = urlparse(response["next"])
query = parse_qs(next_url.query)
new_page = int(query["page"][0])

request_params["page"] = [new_page]
request_kwargs["params"] = request_params
kwargs["request_kwargs"] = request_kwargs

response = client.get(resource, *args, **kwargs)
results += _get_results(response)

if minimum and len(results) >= minimum:
return results

return results


class JSONParserClient(APIClient):
"""
Simple layer on top of `ape_pie.APIClient` to attempt to convert the response to
JSON and check that the request is successful (and raise the correct exceptions if not)
"""

def request(
self,
*args,
**kwargs,
) -> Union[List[Object], Object]:
response = super().request(*args, **kwargs)
try:
response_json = response.json()
except Exception:
response_json = None

try:
response.raise_for_status()
except requests.HTTPError as exc:
if response.status_code >= 500:
raise
raise ClientError(response_json) from exc

return response_json
return response_json
Copy link
Contributor

@Bartvaderkin Bartvaderkin Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The get_json_response() also feels like a common thing that would be part of ZGW or the client library, like the pagination?

Copy link
Contributor Author

@stevenbal stevenbal Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed this with Sergei and Open Formulieren doesn't use this pattern, so he said he won't put it in zgw-consumers. Additionally we should probably get rid of the ClientErrors altogether?

EDIT: I think I'll leave this as is for now, and we can look into the ClientError thing when we address error handling in general, like you mentioned here #995 (comment)

@stevenbal stevenbal force-pushed the fix/2060-replace-get-paginated-results branch 3 times, most recently from 289fc21 to adb911b Compare February 6, 2024 13:29
@stevenbal stevenbal force-pushed the fix/2060-replace-get-paginated-results branch from adb911b to 657549d Compare February 6, 2024 14:51
@stevenbal stevenbal marked this pull request as ready for review February 6, 2024 15:23
task: https://taiga.maykinmedia.nl/project/open-inwoner/task/2060

* create clients for each API with semantic methods
* replace `get_paginated_results` with `pagination_helper`
instead of using intermediate functions that each build a new client
@alextreme alextreme merged commit d7c5dba into develop Feb 12, 2024
16 checks passed
@alextreme alextreme deleted the fix/2060-replace-get-paginated-results branch February 12, 2024 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants