UI: Stat charts everywhere #4704

Merged
merged 27 commits into from
Sep 26, 2018


DingoEatingFuzz

Okay, not everywhere, only where they make sense, which is:

  1. Client detail
  2. Allocation detail
  3. Task detail

This has been a tricky challenge. Nomad itself doesn't store any utilization figures. It can provide them on demand, but any sort of time series graphing needs to be done outside of the API.

Typically this is built out by hooking Nomad up to another tool, and despite this PR, that is still recommended. However, we want a good out-of-the-box experience.

To add time series graphs to the UI, the UI itself has to keep track of stats over time in memory. Left unchecked, this could lead to the page consuming a lot of memory.

I put a few measures in place to avoid that problem.

1. Use lightweight objects to store stats.

The obvious solution for storing stats would be to use Ember Data, since it is used for everything else. However, records in Ember Data come with a lot of overhead in order to support relationships, dirty-tracking, computed properties, deduplication, and more. Since these stats objects need none of those features, plain arrays and objects can be used to cut down on bloat.

2. Limit stats history.

If a tab is left open monitoring a service for hours, it would end up with hours' worth of stats data at a ~2s resolution. This is unacceptable, so each tracker has a max length in place. Once that length is reached, older data is dropped as newer data comes in.
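
As a rough illustration of measures 1 and 2 (not the code in this PR), a tracker's history could be a plain array of plain objects with a cap on its length. The `StatFrame` shape, the `BoundedHistory` name, and the field names are all assumptions made up for this sketch.

```typescript
// Hypothetical frame shape: a plain object, not an Ember Data record.
interface StatFrame {
  timestamp: number; // milliseconds since epoch
  used: number; // absolute usage figure
  percent: number; // usage as a fraction of what was allocated
}

// Keeps at most `maxLength` frames, dropping the oldest ones first.
class BoundedHistory {
  private frames: StatFrame[] = [];
  private maxLength: number;

  constructor(maxLength: number) {
    this.maxLength = maxLength;
  }

  append(frame: StatFrame): void {
    this.frames.push(frame);
    // Trim from the front so the newest frames always survive.
    if (this.frames.length > this.maxLength) {
      this.frames.splice(0, this.frames.length - this.maxLength);
    }
  }

  get length(): number {
    return this.frames.length;
  }

  // Returns a copy so callers cannot grow the history past the cap.
  toArray(): StatFrame[] {
    return this.frames.slice();
  }
}
```

With a cap of 3, appending five frames leaves only the last three, so memory per tracker stays constant no matter how long the tab is open.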

3. Limit how many objects are being tracked.

The two easiest paths for keeping tabs on stats by object are a) don't keep tabs at all, and b) keep tabs on everything. The problem with option a) is that it is normal to click away from a page and come back. Losing all the graph data every time you click away is unfortunate. And the issue with option b) is that if there are lots of clients and lots of allocations and you are really investigating something, your memory will fill up with stats you are never going to look at again.

To achieve the best of both worlds, an LRU cache is used to hold on to some trackers, but eventually remove the stale ones in favor of fresh ones. The logic being that diagnosing an issue may involve juggling a few clients and a handful of allocations, but eventually you are going to move on to the next issue which may involve different clients and allocations.
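
To make the LRU idea concrete, here is a minimal sketch that leans on the insertion order of a JavaScript `Map`. The `LruCache` class, its method names, and the string keys are assumptions for illustration and are not taken from the PR.

```typescript
// A small least-recently-used cache built on Map's insertion ordering:
// the first key in the Map is always the least recently used one.
class LruCache<V> {
  private entries = new Map<string, V>();
  private limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.limit) {
      // Evict the stalest entry to make room for the fresh one.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  keys(): string[] {
    return Array.from(this.entries.keys());
  }
}
```

With a limit of 2, storing trackers for `a` and `b`, reading `a` again, and then storing `c` evicts `b`, since `b` is the one that has gone longest without being looked at.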


Separate from the "how do I avoid using all the memory" problem is the "how do I avoid saturating the network" problem, and the related "even though I can't fetch historical data, can I still make a useful ux" problem.

So to avoid saturating the network, stats are only fetched for the resources immediately on the page. Even though stats are stored for past graphs that have been on the page, as soon as a graph is no longer on the page, the stats poller is paused.

Additionally, trackers are tracked in a global registry so any number of components can use the same tracker, which avoids making redundant requests.
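
A registry like that could look something like the sketch below. `SharedTracker`, `TrackerRegistry`, and the `type:id` key format are hypothetical names chosen for the example; the real registry may key and construct trackers differently.

```typescript
// Hypothetical stand-in for a stats tracker; only its identity matters here.
class SharedTracker {
  key: string;

  constructor(key: string) {
    this.key = key;
  }
}

// Hands out one tracker per resource, so every component asking about the
// same resource shares a single tracker (and a single stream of requests).
class TrackerRegistry {
  private trackers = new Map<string, SharedTracker>();
  created = 0; // counts how many trackers were actually constructed

  trackerFor(type: string, id: string): SharedTracker {
    const key = `${type}:${id}`; // assumed key format
    let tracker = this.trackers.get(key);
    if (!tracker) {
      tracker = new SharedTracker(key);
      this.trackers.set(key, tracker);
      this.created++;
    }
    return tracker;
  }
}
```

Two charts on the same allocation page would both call `trackerFor("allocation", id)` and get the same object back, so only one poller ever runs for that allocation.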

Pausing trackers introduces a new issue: returning to a page that already has data will resume a tracker, but the graph will jump from the last datum recorded before the pause to the newest datum, unaware that there is missing data in between.

To overcome this, the line chart now handles gaps in data, and trackers will append special null frames when paused. Ember Concurrency is used here with great effect to avoid trackers becoming stateful or components using trackers having to coordinate amongst themselves.
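
One way to picture the null frame approach is sketched below. The `Frame` type, `markPause`, and `splitAtGaps` are assumed names for illustration; the actual chart may handle gaps inside its line generator instead of splitting the data up front.

```typescript
// A frame whose value is null marks a stretch where the tracker was paused.
type Frame = { timestamp: number; value: number | null };

// Appends a null frame when a tracker pauses, unless one is already there.
function markPause(frames: Frame[], timestamp: number): void {
  const last = frames[frames.length - 1];
  if (last && last.value === null) return;
  frames.push({ timestamp, value: null });
}

// Splits the series into runs of real data so each run can be drawn as its
// own line segment, leaving a visible gap where the null frames were.
function splitAtGaps(frames: Frame[]): Frame[][] {
  const segments: Frame[][] = [];
  let current: Frame[] = [];
  for (const frame of frames) {
    if (frame.value === null) {
      if (current.length > 0) {
        segments.push(current);
        current = [];
      }
    } else {
      current.push(frame);
    }
  }
  if (current.length > 0) segments.push(current);
  return segments;
}
```

Drawing each segment separately means the chart never interpolates across a pause, so a long absence shows up as a break in the line rather than a misleading straight stroke.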

The last UX gripe is that when first visiting a page, there is no historical data, and a sliver of a line chart is not a great way to read metrics. This is why each chart also visualizes the current value and presents it as both a percentage and an absolute figure.



It encapsulates all the tracker, polling, and markup for this style
of metric.
In favor of the new primary-metric components
This solves two problems:

1. redundant trackers making redundant requests
2. trackers being obliterated as soon as the primary metric component
   is destroyed

It introduces a new problem where visiting more and more node and
allocation pages adds to an ever-growing list of trackers that can
consume a lot of memory, but it solves the problem by using a
least-recently-used cache to limit the number of trackers tracked.
This is the best of three options:

1. Users of stats trackers control polling (old method)
2. Stat tracker is stateful and has start/stop methods (like logging)
3. Stat trackers blindly throttle requests

This is the best option because it means N number of concurrent users of
a stats tracker can request polling without inundating the tracker with
redundant frames (or the network with redundant requests), but they also
don't have to coordinate amongst themselves to determine what state a
tracker should be in.
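
A minimal sketch of blind throttling, assuming option 3 is the one being argued for above. The `ThrottledPoller` name, the injected `request` callback, and passing the clock in as `now` are assumptions made to keep the example self-contained; they are not the tracker's real interface.

```typescript
// Any number of callers can ask for a poll; the poller only issues a request
// when enough time has passed since the last one. Callers never need to know
// about each other or about the poller's state.
class ThrottledPoller {
  private lastRequestAt = -Infinity;
  private intervalMs: number;
  private request: () => void;

  constructor(intervalMs: number, request: () => void) {
    this.intervalMs = intervalMs;
    this.request = request;
  }

  // Returns true if a request was made, false if the call was throttled.
  poll(now: number): boolean {
    if (now - this.lastRequestAt < this.intervalMs) return false;
    this.lastRequestAt = now;
    this.request();
    return true;
  }
}
```

If three components all call `poll` within the same interval, only the first call triggers a request; the other two are dropped without any of the components having to coordinate.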
@DingoEatingFuzz DingoEatingFuzz merged commit f5eaffe into f-ui-improved-stats-charts Sep 26, 2018
@DingoEatingFuzz DingoEatingFuzz mentioned this pull request Sep 26, 2018
5 tasks
@DingoEatingFuzz DingoEatingFuzz requested a review from a team September 26, 2018 18:06