
Add some level of caching for speed-up #78

Closed · wants to merge 1 commit
16 changes: 11 additions & 5 deletions deploystream/__init__.py

@@ -4,6 +4,7 @@
 from os.path import join, dirname
 from flask import Flask
 
+
 APP_DIR = dirname(__file__)
 CONFIG_DIR = join(dirname(APP_DIR), 'config')
 STATIC_DIR = join(APP_DIR, 'static')
@@ -44,21 +45,26 @@
     }
 })

-from deploystream.lib import ensure_certifi_certs_installed
+from .lib import ensure_certifi_certs_installed
 ensure_certifi_certs_installed()
 
 # set the secret key. Dummy secret for flask. When using in real life, have
 # something that is actually a secret
 app.secret_key = 'mysecret'
 
 # Initialise the providers.
-from providers import init_providers
+from .providers import init_providers
 classes = init_providers(app.config['PROVIDERS'])
 
 # Configure additional routes needed for oauth
-from deploystream.apps.oauth.views import configure_oauth_routes
+from .apps.oauth.views import configure_oauth_routes
 configure_oauth_routes(classes)
 
 # Import any views we want to register here at the bottom of the file:
-import deploystream.views # NOQA
-import deploystream.apps.feature.views # NOQA
+from . import views # NOQA
+from .apps.feature import views as feature_views # NOQA
+
+# As this requests caching monkey patches requests.Session, it must be done
+# at the end, else the Jira provider fails
+from .lib import cache
+cache.activate_requests_caching()
2 changes: 2 additions & 0 deletions deploystream/apps/feature/views.py

@@ -4,6 +4,7 @@
 
 from deploystream import app
 from deploystream.apps.feature.lib import get_feature_info, get_all_features
+from deploystream.lib import cache
 from deploystream.lib.transforms import nativify
 from deploystream.decorators import needs_providers
 
@@ -23,6 +24,7 @@ def _wrapped(*args, **kwargs):
 
 @app.route('/features', methods=['GET'])
 @needs_providers
+@cache.cached()
 @as_json
 def list_features(providers):
     features = get_all_features(providers)
39 changes: 39 additions & 0 deletions deploystream/lib/cache.py

@@ -0,0 +1,39 @@
from functools import wraps

from flask import request
from werkzeug.contrib.cache import SimpleCache


cache = SimpleCache()
EXPIRY_SECONDS = 5 * 60


def cached(timeout=EXPIRY_SECONDS, key='view/%s'):
    """
    A decorator for caching views.

    Source: http://flask.pocoo.org/docs/patterns/viewdecorators/

    With a little additional work it could be made into a generic caching
    decorator for other functions that return pickleable data.
    """
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            cache_key = key % request.path
Member: I think this also wants to take into account the user. Is there something in the request that is unique to the user?

Member (Author): We will have to get the session ID from somewhere?

            rv = cache.get(cache_key)
            if rv is not None:
                return rv
            rv = f(*args, **kwargs)
            cache.set(cache_key, rv, timeout=timeout)
            return rv
        return decorated_function
    return decorator
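The per-user caching the thread above asks about could be sketched like this. Everything here is an illustrative assumption rather than part of the PR: the decorator takes a caller-supplied `key_func` instead of hard-coding `request.path`, and a plain dict with lazy expiry stands in for werkzeug's `SimpleCache` so the sketch is self-contained.

```python
import time
from functools import wraps

# Plain-dict stand-in for werkzeug's SimpleCache; entries expire lazily on read.
_store = {}

EXPIRY_SECONDS = 5 * 60


def cached(timeout=EXPIRY_SECONDS, key_func=None):
    """Cache a function's result under a caller-supplied key.

    key_func lets a view fold in whatever identifies the request,
    e.g. (user_id, request.path), so two users never share an entry.
    """
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            if key_func is not None:
                cache_key = key_func(*args, **kwargs)
            else:
                cache_key = (f.__name__, args)
            entry = _store.get(cache_key)
            if entry is not None:
                value, expires_at = entry
                if time.time() < expires_at:
                    return value
            rv = f(*args, **kwargs)
            _store[cache_key] = (rv, time.time() + timeout)
            return rv
        return decorated_function
    return decorator


# Usage: key on (user, path); the render function is a hypothetical view.
calls = []


@cached(key_func=lambda user, path: (user, path))
def render(user, path):
    calls.append((user, path))
    return 'page for %s at %s' % (user, path)
```

In a real Flask view the user id would come from the session (which is exactly the open question in the thread above); here it is simply passed in as an argument.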


def activate_requests_caching():

Member: Why do we want both activate_requests_caching and cached? Doesn't the cached decorator negate the need for the other?

Member (Author): They serve different purposes: one caches what we serve to the user, the other caches what we get from other tools. I believe there will be cases where what we deliver to the user changes (different views of the same data set) but what we need to fetch from the tools stays the same, especially depending on how we get data from those tools (e.g. fewer requests that fetch a large dataset rather than many small requests - I think you were planning to use this kind of approach for fetching branch info from GitHub).

User -(cached)- Ployst -(requests_caching)- 3rd party tools

Eventually we may want to review how we fetch and cache 3rd-party data so that some of it can be pre-fetched in the background, without impacting response time to the user. That lives in a longer-term future.

Member: Yeah, I understood the difference in when they were caching - but in the use case of list_features, activate_requests_caching won't ever be used.

I'm a little uneasy with the way activate_requests_caching does the caching - if it means all installed libraries using requests also get that cache, that might not be the right way to go about it. It feels like caching should be a per-provider thing, as source code info is more important to keep live than planning info, for example.

Happy to merge if you add a config param to decide whether to call this on line 70 of deploystream/__init__.py :)
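The per-provider policy suggested above could look something like this. The provider names, expiry values, and the `CACHE_EXPIRY` config key are all assumptions for illustration; nothing in this PR defines them.

```python
# Sketch: per-provider cache expiry instead of one global requests cache.
DEFAULT_EXPIRY = 5 * 60

# Illustrative defaults: source code info stays closer to live than
# planning info, per the review comment.
PROVIDER_EXPIRY = {
    'github': 60,
    'jira': 15 * 60,
}


def expiry_for(provider_name, config=None):
    """Resolve a provider's cache expiry, honouring a config override.

    config is expected to be a dict-like app config; the CACHE_EXPIRY
    key is a hypothetical per-provider override mapping.
    """
    config = config or {}
    overrides = config.get('CACHE_EXPIRY', {})
    if provider_name in overrides:
        return overrides[provider_name]
    return PROVIDER_EXPIRY.get(provider_name, DEFAULT_EXPIRY)
```

A config gate for the global activation (the "config param" asked for above) would then be a one-line check of the same config dict before calling activate_requests_caching().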

Member (Author): In retrospect, this looked like a quick win but it is a shoddy solution. I am concerned about the load time of the page - I want my team to use ployst, and I didn't want page load time to be a pain point.

But I am not sure what the best place is to implement caching for provider calls. Either we 1) wrap it around calls to providers (preferable, but at that level we may not be aware of which variables make the output change, such as provider configuration), or 2) leave that responsibility to the providers' implementations (which will be more aware of what caching makes sense for the service they call, and of which variables matter).

Leaning towards 1), but changes in provider configuration may require busting the cache for that provider.

    """
    Call once to activate caching for all HTTP GETs issued by 'requests'.
    """
    import requests_cache
    requests_cache.install_cache(expire_after=EXPIRY_SECONDS)
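Option 1) from the discussion - wrapping provider calls - could fold a hash of the provider's configuration into the cache key, so a config change naturally busts the old entries (they are simply never hit again). This is a sketch under stated assumptions: that each provider exposes a `.config` dict is an assumption of this example, not something the PR establishes, and FakeProvider below is purely illustrative.

```python
import hashlib
import json
import time
from functools import wraps

# Module-level store shared by all wrapped provider methods.
_store = {}


def cached_provider_call(timeout=5 * 60):
    """Memoize a provider method, keying on the provider's configuration.

    Assumes the wrapped method's first argument (self) carries a
    JSON-serialisable .config dict.
    """
    def decorator(method):
        @wraps(method)
        def wrapper(provider, *args, **kwargs):
            config_hash = hashlib.sha1(
                json.dumps(provider.config, sort_keys=True).encode('utf-8')
            ).hexdigest()
            key = (method.__name__, config_hash, args,
                   tuple(sorted(kwargs.items())))
            entry = _store.get(key)
            if entry is not None and time.time() < entry[1]:
                return entry[0]
            rv = method(provider, *args, **kwargs)
            _store[key] = (rv, time.time() + timeout)
            return rv
        return wrapper
    return decorator


# Usage with a hypothetical provider: repeated calls hit the cache, while
# a provider with a different config gets its own entries.
class FakeProvider:
    def __init__(self, config):
        self.config = config
        self.calls = 0

    @cached_provider_call()
    def fetch(self, what):
        self.calls += 1
        return '%s@%s' % (what, self.config['url'])
```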
1 change: 1 addition & 0 deletions requirements/runtime.txt

@@ -7,3 +7,4 @@ zope.interface # define and enforce interfaces
 certifi # Module for Mozilla's CA bundle
 apitopy # Required to implement the sprint.ly client
 jira-python>=0.13 # Required to access JIRA api
+requests-cache # A transparent caching library for 'requests'