
Reduce API response times by 30% by using memcache migration flag #5239

Merged

Conversation

AlanCoding
Member

SUMMARY

When profiling request times (or just looking at the debug toolbar output), the migration middleware came up as a big resource consumer. It was tricky to pin down, because the cost wasn't noticeable when tested with isolated requests.

I ran timings with the debug settings off, just logged middleware times, and exercised some typical UI use. In general, ~40% of request time was spent in middleware, and ~30% was spent in the migration middleware alone.

This is not hard to reduce, but this PR still needs more testing via the means above and verification that functionality is unchanged.

ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME
  • API
AWX VERSION
9.0.1
ADDITIONAL INFORMATION

@ryanpetrello
Contributor

ryanpetrello commented Nov 5, 2019

~30% was spent in the migration middleware alone.

Do you have some examples? When we talk about 30%, what is the actual total time saved in seconds on a fairly naive request (i.e., are we talking hundredths of a second per request, or tenths)? Are we talking about requests that don't otherwise have database access (e.g., GET /api/v2/)?

return redirect(reverse("ui:migrations_notran"))
if cache.get('migration_in_progress', False):
executor = MigrationExecutor(connection)
plan = executor.migration_plan(executor.loader.graph.leaf_nodes())

So I take it this is getting to be fairly expensive per-request?
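The change replaces that per-request migration plan computation with a cheap cache lookup. A minimal sketch of the pattern under discussion, using a dict-backed stand-in for Django's cache API rather than AWX's actual middleware (the helper names here are illustrative, not AWX code):

```python
class FakeCache:
    """Stand-in for Django's cache API (cache.get/set/delete)."""
    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


cache = FakeCache()


def expensive_migration_plan():
    # Placeholder for MigrationExecutor(connection).migration_plan(...),
    # which inspects the migration graph and hits the database.
    return []


def needs_migration_screen():
    # Cheap path: a single cache lookup on the vast majority of requests.
    if not cache.get('migration_in_progress', False):
        return False
    # Slow path: only taken while the flag is raised.
    plan = expensive_migration_plan()
    if not plan:
        # Migrations are done; lower the flag so later requests
        # take the cheap path again.
        cache.delete('migration_in_progress')
        return False
    return True
```

The point of the pattern is that the expensive executor work only runs while the flag is raised, and the first request that finds an empty plan clears the flag.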

awx/main/apps.py Outdated
from django.utils.translation import ugettext_lazy as _


def raise_migration_flag(**kwargs):
from awx.main.tasks import set_migration_flag
set_migration_flag()

You probably actually want set_migration_flag.apply_async() so that it gets broadcast everywhere, right?
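The distinction the reviewer is drawing: a plain function call raises the flag only in the process handling the signal, while dispatching it as a task can broadcast it to every node. A toy sketch of that difference, where the queue, node names, and decorator-free task function are all stand-ins rather than AWX's real dispatch API:

```python
class FakeBroadcastQueue:
    """Stand-in for a broadcast task queue: apply_async 'delivers'
    the task to every node instead of running it only locally."""
    def __init__(self, nodes):
        self.nodes = nodes

    def apply_async(self, fn):
        for node in self.nodes:
            fn(node)


flags = {}  # which nodes have raised their migration flag


def set_migration_flag(node):
    # In the real system this would set the flag in that node's cache.
    flags[node] = True


queue = FakeBroadcastQueue(['awx-1', 'awx-2', 'awx-3'])


def raise_migration_flag(**kwargs):
    # Dispatching (rather than calling set_migration_flag() directly)
    # raises the flag on every node, not just the local one.
    queue.apply_async(set_migration_flag)
```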

@ryanpetrello left a comment


I'm interested in chatting about numbers, but I really like this idea in general if we can show that the migration planner is contributing notable overhead to certain requests.

@ryanpetrello
Contributor

[image attachment]

@softwarefactory-project-zuul
Contributor

Build failed.

@AlanCoding
Member Author

Oh, I forgot the other thing that's still TBD on this.

Hopefully reasonable statement: we should not run the task manager while migrations are running.

It's probably benign either way: right now the task manager will run and just roll back in the event of errors until the migration completes. But there's a reasonable user-experience case for logging "migration is running" instead of a stream of errors.

@softwarefactory-project-zuul
Contributor

Build succeeded.

@AlanCoding
Member Author

In spite of substantial hardware differences, I actually get almost the same number for that timing, as low as 6 seconds.

Here is the branch I used for the timings previously, which I just rebased:

https://github.com/ansible/awx/compare/devel...AlanCoding:middleware_timings?expand=1

I very consistently get higher numbers using that method. When I first did this, my number was 157 ms. I re-tested just now over a slightly longer time frame and got 200 ms. This sounds suspiciously high, I know.

This is why I was doing the more complicated task of bracketing the middleware methods: those numbers remained stubbornly higher than when tested in isolation. I cannot tell you exactly why; my speculation falls short of justifying the magnitude of the difference. Your 6-second number averages out to 63 ms per request, which isn't exactly small either, considering that it applies to all requests.
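For reference, "bracketing the middleware methods" can be done by wrapping each callable and accumulating its wall-clock time; this is a rough, generic sketch of the idea, not the code in the linked timing branch:

```python
import time
from collections import defaultdict

# Accumulated seconds per middleware name.
timings = defaultdict(float)


def timed(name, fn):
    """Wrap a middleware-style callable and accumulate its wall time."""
    def wrapper(request):
        start = time.perf_counter()
        try:
            return fn(request)
        finally:
            timings[name] += time.perf_counter() - start
    return wrapper


# Usage with a dummy innermost handler:
def inner(request):
    return "response"


handler = timed("MigrationRanCheckMiddleware", inner)
handler({"path": "/api/v2/"})
```

Logging the `timings` dict after a batch of requests gives per-middleware totals like the JSON output further down in this thread.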

@ryanpetrello
Contributor

ryanpetrello commented Nov 5, 2019

Even if we're not talking about saving multiple tenths of seconds, I do think this change is simple enough, and worth it. It's always bothered me that we rebuild the migration executor on every single request just for the purpose of showing an upgrade screen.

I like the idea of setting a flag in the cache once migrations start, and having the local instances unset their flag as requests land there and they discover migration is done.

@ryanpetrello
Contributor

Hopefully reasonable statement: we should not run the task manager while migrations are running.

I suppose it would be reasonable to add a check to the task manager's startup. This should definitely lean on the same memcached flag check, though.
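A sketch of what that startup check could look like, leaning on the same cache flag; the class shape and flag handling here are assumptions for illustration, not AWX's real TaskManager:

```python
import logging

logger = logging.getLogger('awx.main.scheduler')


class TaskManagerSketch:
    """Illustrative only: guard the scheduling cycle behind the same
    shared 'migration_in_progress' flag the middleware consults."""
    def __init__(self, cache):
        self.cache = cache  # any mapping with .get(), e.g. Django's cache
        self.ran_cycle = False

    def schedule(self):
        if self.cache.get('migration_in_progress', False):
            # Friendly log line instead of repeated rollback errors.
            logger.debug('Migration in progress; skipping task manager cycle')
            return
        self.ran_cycle = True  # placeholder for the real scheduling work
```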

@softwarefactory-project-zuul
Contributor

Build succeeded.

@AlanCoding AlanCoding marked this pull request as ready for review November 12, 2019 04:15
@softwarefactory-project-zuul
Contributor

Build succeeded.

@AlanCoding
Member Author

Did some basic manual testing. This shows the same behavior as we have now: if you migrate back one migration, the migration screen never goes away. I also observed the log messages related to the task manager.

Filed #5302 related to this.

@AlanCoding
Member Author

I combined this change with my timing instrumentation

https://github.com/ansible/awx/compare/devel...AlanCoding:migration_cache_verification?expand=1

and got some updated numbers.

{
  "total": 0.3692765372923051,
  "request": 1.916118051813937e-05,
  "MigrationRanCheckMiddleware_request": 0.019325396110271585,
  "ActivityStreamMiddleware_request": 0.006098179981626314,
  "URLModificationMiddleware_request": 4.861820703265311e-05,
  "SessionTimeoutMiddleware_request": 0.0005964273693917811,
  "URLModificationMiddleware": 1.2784168638032058e-05,
  "ActivityStreamMiddleware": 0.0014353072506257858
}

That's for 87 data points. At 0.02 seconds, this is not as good as I had hoped, but it is much improved over the 0.3 seconds I was seeing before with the same hardware and settings.

Anyway, I think I'm finished at this point. Checking these numbers was my last TODO item.

@AlanCoding
Member Author

@ryanpetrello are you in favor of merge?

@ryanpetrello
Contributor

Oh, I thought I merged this. Yep, let's merge it.

@softwarefactory-project-zuul
Contributor

Build failed (gate pipeline). For information on how to proceed, see
http://docs.openstack.org/infra/manual/developers.html#automated-testing

@wenottingham
Contributor

regate

@wenottingham
Contributor

(needs rebase before merging)

@softwarefactory-project-zuul
Contributor

Build failed (gate pipeline). For information on how to proceed, see
http://docs.openstack.org/infra/manual/developers.html#automated-testing

@softwarefactory-project-zuul
Contributor

Build succeeded.

@softwarefactory-project-zuul
Contributor

Build succeeded (gate pipeline).
