Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add a five minute cache to get_destination_retry_timings #3933

Merged
merged 9 commits into from
Oct 1, 2018

Conversation

erikjohnston
Copy link
Member

Hopefully helps with #3931

Based on: #3932

@erikjohnston erikjohnston requested a review from a team September 21, 2018 14:05
@erikjohnston
Copy link
Member Author

Hmm, this breaks metrics due to it returning a length estimate less than 0:

2018-09-21 16:11:41,429 - twisted - 243 - CRITICAL -  - 
Traceback (most recent call last):
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/twisted/web/server.py", line 198, in process
    self.render(resrc)
  File "/home/erikj/synapse/synapse/http/site.py", line 116, in render
    Request.render(self, resrc)
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/twisted/web/server.py", line 258, in render
    body = resrc.render(self)
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/twisted/web/resource.py", line 250, in render
    return m(request)
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/prometheus_client/twisted/_exposition.py", line 18, in render_GET
    return generate_latest(self.registry)
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/prometheus_client/exposition.py", line 69, in generate_latest
    for metric in registry.collect():
  File "/home/erikj/synapse/synapse/metrics/__init__.py", line 46, in collect
    for metric in REGISTRY.collect():
  File "/home/erikj/.synapse3/lib/python3.6/site-packages/prometheus_client/core.py", line 102, in collect
    for metric in collector.collect():
  File "/home/erikj/synapse/synapse/util/caches/__init__.py", line 85, in collect
    cache_size.labels(cache_name).set(len(cache))
ValueError: __len__() should return >= 0

erikjohnston and others added 2 commits September 21, 2018 16:25
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.
if value is SENTINEL:
return default

if self.iterable:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not call this an eviction. To me, eviction means that we have removed a valid entry to make room for a new one. I'd consider this as equivalent to an update with a magic "doesn't exist" value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem is that we're potentially conflating "eviction because of invalidation" (which this code may do) with "eviction because of some automated process" in the statistics. If something pops from the cache so it won't be cached anymore deliberately, I don't think that's useful to track in what this metric appears to do (assessing whether our cache is effective).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hawkowl: so I think you're agreeing with me? If so I'll remove this bit of code and merge it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, for some reason I thought that this was consistent with what we did elsewhere, but it isn't.

richvdh pushed a commit that referenced this pull request Sep 26, 2018
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.

(This appears to have been a longstanding bug, but one which has been made more
of a problem by #3932 and #3933).

(This was originally done by Erik as part of #3933. I'm cherry-picking it
because really it's a fix in its own right)
@erikjohnston erikjohnston requested a review from a team October 1, 2018 11:01
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm otherwise

@@ -95,6 +98,13 @@ def __getitem__(self, key):

return entry.value

def pop(self, key, default=None):
value = self._cache.pop(key, SENTINEL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the same as return self._cache.pop(key, default) now.

docstring wouldn't go amiss.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the same as return self._cache.pop(key, default) now.

It's not, actually, as self._cache.pop won't raise if default is None. Thinking about it we can probably just do return self._cache.pop(key, **kwargs)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, I've completely messed this up

if value is SENTINEL:
raise KeyError(key)

return value
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could have just done:

def pop(self, key, *args, **kwargs):
    "Identical functionality to `dict.pop(..)`"
    return self._cache.pop(key, *args, **kwargs)

But that feels quite opaque to me?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah sorry, I hadn't quite twigged that dict.pop(foo) was different to dict.pop(foo, None).

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@erikjohnston erikjohnston merged commit 8f5c23d into develop Oct 1, 2018
@erikjohnston erikjohnston deleted the erikj/destination_retry_cache branch December 12, 2018 15:45
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants