♻️ [#2060] Replace get_paginated_results with pagination_helper #995
src/open_inwoner/openklant/wrap.py
Outdated
    return klanten
    return client.retrieve_objectcontactmoment(contactmoment, object_type)


def fetch_klant(
I'm not entirely sure we still need these functions after the changes. I suppose they do save us the hassle of having to build a client when we use them in views.
I think we could remove them and just use the client instances in the views.
These are a level of indirection and we should really be re-using clients (for connection and expensive cert/schema setups etc).
I'll remove them 👍
I removed these functions and now use the clients directly. I wanted to switch to using context managers (with build_client("zaak")), but in some cases we can't do this currently, because when we can't build a client we return None instead of raising an error. This is related to the issue that we currently simply show no results if we can't build a client.
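To illustrate the obstacle described above: a `with` statement needs an object that implements the context manager protocol, so a builder that returns None on failure cannot be used in it directly. A minimal sketch follows; `DummyClient`, this `build_client` and both `fetch_*` helpers are hypothetical stand-ins, not the project's code.

```python
from typing import Optional


class DummyClient:
    """Hypothetical stand-in for an API client that is a context manager."""

    def __enter__(self) -> "DummyClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        return None  # do not swallow exceptions

    def fetch(self) -> list:
        return ["zaak-1"]


def build_client(type_: str, configured: bool = True) -> Optional[DummyClient]:
    # Assumption: mirrors the behaviour described in the comment, i.e.
    # return None instead of raising when no client can be built.
    return DummyClient() if configured else None


def fetch_with_guard(type_: str, configured: bool = True) -> list:
    client = build_client(type_, configured)
    if client is None:
        # Matches the current behaviour: silently show no results.
        return []
    with client:
        return client.fetch()


def fetch_without_guard(type_: str, configured: bool = True) -> list:
    # Fails when build_client() returns None, because None is not
    # a context manager.
    with build_client(type_, configured) as client:
        return client.fetch()
```

If `build_client` raised an exception instead of returning None, the unguarded form would be safe and the caller could decide how to report the failure.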
    "cases:{user_bsn}:{max_requests}:{identificatie}",
    timeout=settings.CACHE_ZGW_ZAKEN_TIMEOUT,
)
def fetch_cases(
We might want to rename this to be a bit more standardized (because now I see things like case / zaak mixed). Perhaps something like create/read/update/delete + name of the resource in the API: retrieve_zaken
@alextreme @Bartvaderkin once this PR is approved, I'll ask Sergei if he can release a new version of zgw-consumers and I'll upgrade it in the requirements
Nice, thanks for this Steven! I'll wait for Bart and/or Paul to do a review first
Nice work! Most of the comments are suggestions, so feel free to adopt what seems right.
        # eSuite doesn't implement a `object_type` query parameter
        ret = [moment for moment in moments if moment.object_type == object_type]

        return ret
- Since we're not using the list (?), you could: yield from (moment for moment in moments if...)
- (in the following function): return next(ocms, None)
- This seems to be the only use of the function which potentially returns multiple objects, but then we're only retrieving and returning the first (in retrieve_objectcontactmoment). Is this to allow for possible extensions where we actually use multiple objectcontactmomenten from this? Otherwise this and the following function could be collapsed.
This is an API, so best not to mix lists and generator return values (because you don't know how it is used).
(Also: the example would be return (moment for moment in moments if...); adding yield from just adds another generator around a generator.)
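The three variants discussed in this thread can be compared side by side. This is a sketch only: `Moment` and the function names are hypothetical, and the filter mirrors the `object_type` comparison from the diff.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Moment:
    """Hypothetical stand-in for an objectcontactmoment."""

    object_type: str
    id: int


def filter_as_list(moments: List[Moment], object_type: str) -> List[Moment]:
    # Returns a list: callers can index it, call len() and iterate repeatedly.
    return [m for m in moments if m.object_type == object_type]


def filter_with_return(moments: List[Moment], object_type: str) -> Iterator[Moment]:
    # Returns the generator expression itself: lazy and single-pass.
    return (m for m in moments if m.object_type == object_type)


def filter_with_yield_from(moments: List[Moment], object_type: str) -> Iterator[Moment]:
    # Also lazy, but this function is itself a generator that delegates to
    # the inner generator expression: one generator wrapped around another.
    yield from (m for m in moments if m.object_type == object_type)


def first_match(moments: List[Moment], object_type: str) -> Optional[Moment]:
    # next() with a default avoids StopIteration when nothing matches.
    return next(filter_with_return(moments, object_type), None)
```

A caller that iterates the result twice gets different behaviour from the list and generator variants, which is the reason given above for not mixing the two in one API.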
I could collapse it into one function, retrieve_objectcontactmoment, since retrieve_objectcontactmomenten_for_object_type is indeed only used by that function currently. Though we might need this in the future, which is why I added it initially.
    service = getattr(config, f"{type_}_service")
    if service:
        client = _build_client(service, client_factory=JSONParserClient)
        client = _build_client(service, client_factory=client_class)
        return client

    logger.warning(f"no service defined for {type_}")
logger.warning("no service defined for {type}", type=type_)
(better to avoid f-string interpolation with logging)
> (better to avoid f-string interpolation with logging)
@pi-sigma Nooooo. Where does this information/advice come from? .format() is much more dangerous because it'll actually raise runtime errors if you miss a replacement placeholder, while f-strings would catch it at parse time.
logger.warning("no service defined for %s", type_) is actually the correct form (see https://docs.python.org/3/howto/logging.html#optimization)
(the form "no service defined for {type}", type=type_ is afaik not supported by the logging module)
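Both claims above can be checked with the standard library alone. In this sketch the logger name `logging_demo` and the `ListHandler` class are hypothetical helpers used only to capture records.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger("logging_demo")
logger.setLevel(logging.WARNING)
logger.propagate = False  # keep the demo output away from the root logger
handler = ListHandler()
logger.addHandler(handler)

type_ = "zaak"

# Deferred %-style interpolation: the template and the arguments are stored
# separately on the record and only merged when the message is formatted.
logger.warning("no service defined for %s", type_)

# Arbitrary keyword arguments are not accepted by the logging methods.
try:
    logger.warning("no service defined for {type}", type=type_)
    keyword_form_supported = True
except TypeError:
    keyword_form_supported = False
```

The captured record keeps `msg` as the raw template and `args` as the values, and `getMessage()` combines them on demand.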
That links to the optimization section. Here is the bit about variables: https://docs.python.org/3/howto/logging.html#logging-variable-data, which doesn't mention a correct form but says something about support.
I thought logger.warning("no service defined for {type}", type=type_) was equivalent to logger.warning("no service defined for %s", type_); perhaps that's not the case.
> I thought logger.warning("no service defined for {type}", type=type_) was equivalent to logger.warning("no service defined for %s", type_); perhaps that's not the case.
Log methods only allow positional args (except for exc_info and similar). I also think that, apart from performance (which can be meaningless in most cases), it allows easier grouping of log messages in tools like Sentry.
@Bartvaderkin This is not (merely) about performance. The main motivation is to facilitate log aggregation in Sentry. If you do logger.warning("no service defined for %s", type_), the logger can group together logs with different instances under the same label. Both f-strings and .format should be avoided. Like I said, I didn't think of logger.warning("no service defined for {type}", type=type_) as equivalent to .format() logging.
Are we doing log aggregation in tools like Sentry in this project? I don't think the existing logs are set up for that, so that would be introducing something new.
But it's true that with deferred string interpolation, it would only error at runtime if there's a mismatch in the number of arguments. I think linting rules exist to warn on this issue though (logging-too-many-args from Pylint)
> Are we doing log aggregation in tools like Sentry in this project? I don't think the existing logs are setup for that so that would be introducing something new.
We do have Sentry set up for test/acceptance/prod, and if logger.warning(f"no service defined for {type_}") gets triggered for two different type_ values, it will create two separate issues in Sentry, whereas logger.warning("no service defined for %s", type_) will group them together.
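The difference that makes grouping possible is visible on the log record itself. The sketch below does not reproduce Sentry's grouping logic; it only counts records by their unformatted `msg`, using a hypothetical `TemplateCounter` handler and hypothetical logger names.

```python
import logging
from collections import Counter


class TemplateCounter(logging.Handler):
    """Counts records by their unformatted message (record.msg)."""

    def __init__(self) -> None:
        super().__init__()
        self.templates = Counter()

    def emit(self, record: logging.LogRecord) -> None:
        self.templates[record.msg] += 1


def count_templates(use_fstring: bool) -> Counter:
    logger = logging.getLogger(f"grouping_demo.{use_fstring}")
    logger.setLevel(logging.WARNING)
    logger.propagate = False
    handler = TemplateCounter()
    logger.addHandler(handler)
    try:
        for type_ in ("zaak", "catalogi"):
            if use_fstring:
                # Interpolated before logging sees it: every value of
                # type_ produces a different message string.
                logger.warning(f"no service defined for {type_}")
            else:
                # The template stays constant; the value travels in args.
                logger.warning("no service defined for %s", type_)
    finally:
        logger.removeHandler(handler)
    return handler.templates
```

With `use_fstring=False` both records share one template, while `use_fstring=True` yields two distinct messages, so any tool that keys on the template has nothing common to group on in the second case.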
def get_json_response(response: requests.Response) -> Optional[dict]:
    try:
        response_json = response.json()
    except Exception:
        response_json = None

    results = []
    try:
        response.raise_for_status()
    except requests.HTTPError as exc:
        if response.status_code >= 500:
            raise
        raise ClientError(response_json) from exc

    response = client.get(resource, *args, **kwargs)

    def _get_results(response):
        _results = response["results"]
        if test_func:
            _results = [result for result in _results if test_func(result)]
        return _results

    response = response
    results += _get_results(response)

    if minimum and len(results) >= minimum:
        return results

    while response.get("next"):
        next_url = urlparse(response["next"])
        query = parse_qs(next_url.query)
        new_page = int(query["page"][0])

        request_params["page"] = [new_page]
        request_kwargs["params"] = request_params
        kwargs["request_kwargs"] = request_kwargs

        response = client.get(resource, *args, **kwargs)
        results += _get_results(response)

        if minimum and len(results) >= minimum:
            return results

    return results


class JSONParserClient(APIClient):
    """
    Simple layer on top of `ape_pie.APIClient` to attempt to convert the response to
    JSON and check that the request is successful (and raise the correct exceptions if not)
    """

    def request(
        self,
        *args,
        **kwargs,
    ) -> Union[List[Object], Object]:
        response = super().request(*args, **kwargs)
        try:
            response_json = response.json()
        except Exception:
            response_json = None

        try:
            response.raise_for_status()
        except requests.HTTPError as exc:
            if response.status_code >= 500:
                raise
            raise ClientError(response_json) from exc

        return response_json
    return response_json
The get_json_response() also feels like a common thing that would be part of ZGW or the client library, like the pagination?
I discussed this with Sergei and Open Formulieren doesn't use this pattern, so he said he won't put it in zgw-consumers. Additionally we should probably get rid of the ClientErrors altogether?
EDIT: I think I'll leave this as is for now, and we can look into the ClientError thing when we address error handling in general, like you mentioned here #995 (comment)
task: https://taiga.maykinmedia.nl/project/open-inwoner/task/2060
* create clients for each API with semantic methods
* replace `get_paginated_results` with `pagination_helper`
instead of using intermediate functions that each build a new client
task: https://taiga.maykinmedia.nl/project/open-inwoner/task/2060
This PR replaces get_paginated_results (which was broken for zgw-consumers 0.28.0) with the new pagination_helper and refactors the client usage such that we have an APIClient for each API (zaken, catalogi, klanten, etc.).
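As a rough illustration of the direction described here, the sketch below combines a per-API client exposing a semantic method with a generator that follows `next` links. It is not the zgw-consumers `pagination_helper` implementation: `iter_paginated`, `FakeZakenClient` and the page URLs are hypothetical, and the in-memory client replaces real HTTP calls.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_paginated(
    fetch_page: Callable[[str], dict],
    first_url: str,
    max_requests: Optional[int] = None,
) -> Iterator[dict]:
    """Yield results page by page, following the `next` link of each page."""
    url: Optional[str] = first_url
    requests_made = 0
    while url:
        if max_requests is not None and requests_made >= max_requests:
            return
        page = fetch_page(url)
        requests_made += 1
        yield from page["results"]
        url = page.get("next")


class FakeZakenClient:
    """Hypothetical in-memory client standing in for a zaken API client."""

    def __init__(self, pages: Dict[str, dict]) -> None:
        self._pages = pages
        self.requests_made = 0

    def get(self, url: str) -> dict:
        self.requests_made += 1
        return self._pages[url]

    def fetch_cases(self, max_requests: Optional[int] = None) -> list:
        # Semantic method: callers ask for cases, not for URLs and pages.
        return list(iter_paginated(self.get, "zaken?page=1", max_requests))
```

Because the client instance owns both the connection and the pagination, a view can hold one client and call `fetch_cases()` on it instead of going through an intermediate function that builds a new client per call.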