Nomad UI painfully slow when job counts goes from hundreds to thousands #14787

djenriquez · 2022-10-03T20:48:09Z

Nomad version

Output from nomad version
Nomad v1.3.3 (428b2cd8014c48ee9eae23f02712b7219da16d30)

Operating system and Environment details

Amazon Linux release 2 (Karoo)

Issue

We have a particular use case where Nomad is used to orchestrate full sandboxes for our developers in our development environment. These sandboxes represent our complete stack of services, which means ~100 jobs, including periodic batch jobs.

The higher the number of total jobs, the slower the Nomad UI becomes. Initially, we thought this might be an issue with the Nomad servers handling the sheer amount of work, but that is not the case. At one point, Nomad's core handled over 10,000 jobs with maybe 50,000 allocations just fine. RPC calls through its API were responsive, and the metrics we track showed no struggle whatsoever.

However, the UI was a different story: it would sit on the Nomad loader graphic for a period of time that seemed to grow linearly with the number of jobs being run. Interestingly, the API requests the UI made to the Nomad servers were responsive according to Chrome DevTools, which supports the view that the backend is not the issue.

Also, when looking at the waterfall chart from Chrome DevTools, we see a call to /v1/namespaces?index=1 that is eventually canceled by the browser. Not sure if this request is misleading, but the page renders once that request shows up in the network panel, so it seems there is some blocking call at that point in the flow.

Reproduction steps

Spin up at least 1000 jobs with ~3000 allocations, then navigate to the UI.

Expected Result

UI load time grows proportionately with the API response time for requests made to the Nomad server.

Actual Result

UI load time degrades as more jobs and allocations are running on the Nomad cluster while the API responds performantly.

We're open to scheduling a remote session if that makes it easier to see the issue.


philrenaud · 2022-10-03T21:04:53Z

Hi @djenriquez, thanks for raising this — we'll take a look and update this once we have more info.

ChaiWithJai · 2022-10-27T18:19:43Z

Hey @djenriquez! Nice to meet you. We're super grateful that you raised this issue and it looks like the Nomad Community at large is also noticing this problem.

We're noticing that the issue may be the result of JavaScript Promises on the /jobs and /jobs/:jobId views starving the event loop. We investigated the issue along with possible solutions, and we have 2 commits that you can pull down:

For the /jobs/:jobId (The Job Detail Overview page) we're very confident that this commit will resolve that problem.

But for the /jobs (The Main Jobs List page) we tried to implement our pagination logic. There will be some regressions because we're mixing server and client-side filtering and sorting now. You can try out this commit.
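As an illustration of the event-loop starvation described above (this is not the code from either commit): promise callbacks run as microtasks, and the browser does not get back to rendering until the microtask queue is empty, so thousands of per-job promise callbacks can keep a loader on screen even when the API has already answered. A minimal TypeScript sketch of one common mitigation, processing records in batches and yielding to the event loop between batches, might look like this. The names `chunk`, `yieldToEventLoop`, and `processInBatches` are hypothetical and do not come from the Nomad UI codebase.

```typescript
// Hypothetical illustration only; not taken from the Nomad UI or the commits above.

// Split a list into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// setTimeout schedules a macrotask, so awaiting this promise gives the
// browser a chance to render and handle input before work resumes.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, 0));
}

// Apply `work` to every item, pausing between batches instead of
// running the whole list in one uninterrupted stretch.
async function processInBatches<T, R>(
  items: T[],
  size: number,
  work: (item: T) => R,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, size)) {
    for (const item of batch) {
      results.push(work(item));
    }
    await yieldToEventLoop();
  }
  return results;
}
```

Batching trades a little total throughput for responsiveness; the batch size is a tuning knob, and whether this approach matches what the commits do is not stated in this thread.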

We're very excited to work with you to find the right solution and we welcome any and all feedback about how you're searching and filtering for jobs (along with any feedback about the Nomad UI). We're in the process of planning a lot of great new features for the UI and we're eager to solve any big challenges or even small "papercuts" that you're experiencing.

I'll be heading out on vacation soon, but I'll try my best to be responsive today and tomorrow on this issue and revisit this when I return. Looking forward to hearing from you!

Life is so rich,
Jai

djenriquez · 2022-10-27T21:43:51Z

Hi @ChaiWithJai, thanks so much for providing these commits. I'll go ahead and see how I might be able to plug this into our current system and verify its results. It will likely be next week before I can provide results, however.

ChaiWithJai · 2022-11-11T15:39:44Z

Hey @djenriquez! I'm back in the office and wanted to circle back up with you. Were you able to try these commits out?

djenriquez · 2023-05-03T18:12:48Z

Hi @ChaiWithJai, I realize I dropped the ball on checking back on this issue. Are we able to reconvene?

jhyx2022 · 2023-05-22T22:56:16Z

Greetings! Is there any update on the fix? The UI slows to a halt whenever there are more than a thousand jobs (including dead jobs) in the cluster.

djenriquez · 2023-05-22T23:28:46Z

Looks like there's a PR: #14989. I'm looking to test this out against 1.5.3 and just need quick confirmation on compatibility with #14989 (comment).

philrenaud · 2023-05-25T14:29:05Z

Dropping a note to say that this is something we intend to prioritize soon; see #14989 (comment) for a little more context.

jhyx2022 · 2024-01-24T04:01:28Z

> Dropping a note to say that this is something we intend to prioritize soon; see #14989 (comment) for a little more context.

Hi there, is there an update on the fix yet or expected version for the fix? Thanks!

philrenaud · 2024-01-24T14:38:31Z

@jhyx2022 Serendipitous timing! We've been developing a new endpoint to complement /jobs that should make things a lot snappier. You can follow along with a few of the issues:

These should have the effect of a more limited initial pull of jobs on the main index in the UI. There'll still be the ability to paginate, search, and filter your list down, but those functions will no longer be front-end dependent.
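To make the idea of moving pagination and filtering off the front end concrete, here is a minimal TypeScript sketch that uses an in-memory stand-in for a server. Everything in it is an assumption for illustration: the types `JobStub` and `Page`, the functions `listJobsPage` and `collectPages`, and the token scheme are hypothetical and are not the new endpoint's actual contract.

```typescript
// Hypothetical types and names, for illustration only.
interface JobStub {
  id: string;
  status: string;
}

interface Page {
  items: JobStub[];
  nextToken: string | null; // id of the first job on the next page, if any
}

// Stand-in for a server that filters, sorts, and paginates before
// responding, so the client only ever holds one page of jobs.
function listJobsPage(
  all: JobStub[],
  perPage: number,
  nextToken: string | null,
  status?: string,
): Page {
  const filtered = status === undefined ? all : all.filter((j) => j.status === status);
  const sorted = [...filtered].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  let start = 0;
  if (nextToken !== null) {
    const token: string = nextToken;
    const idx = sorted.findIndex((j) => j.id >= token);
    start = idx === -1 ? sorted.length : idx;
  }
  const items = sorted.slice(start, start + perPage);
  const next = sorted[start + perPage];
  return { items, nextToken: next ? next.id : null };
}

// Client side: request pages one at a time, up to a cap, instead of
// pulling the full job list and filtering it in the browser.
function collectPages(
  all: JobStub[],
  perPage: number,
  maxPages: number,
  status?: string,
): JobStub[] {
  const seen: JobStub[] = [];
  let token: string | null = null;
  for (let i = 0; i < maxPages; i++) {
    const page: Page = listJobsPage(all, perPage, token, status);
    seen.push(...page.items);
    token = page.nextToken;
    if (token === null) break;
  }
  return seen;
}
```

In this sketch the initial render only needs the first call to `listJobsPage`, so its cost depends on the page size rather than on the total number of jobs in the cluster.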

jhyx2022 · 2024-01-24T23:46:09Z

Great news, appreciate the update!

philrenaud · 2024-05-09T14:09:10Z

Thanks to everyone for your patience on this issue. Pleased to say that #20452 is now merged and will be releasing in the upcoming Nomad 1.8. Among other things, it handles pagination for the jobs index and doesn't overload itself with child jobs that eat up memory at index level. I hope that this makes the overall experience of using the web UI much smoother!

github-actions · 2024-12-28T02:16:40Z

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

djenriquez added the type/bug label Oct 3, 2022

tgross added the theme/ui label Oct 3, 2022

tgross added this to Nomad UI Oct 3, 2022

tgross moved this to Backlog in Nomad UI Oct 3, 2022

philrenaud moved this from Backlog to Todo in Nomad UI Oct 3, 2022

philrenaud self-assigned this Oct 4, 2022

ChaiWithJai moved this from Todo to In Progress in Nomad UI Oct 17, 2022

ChaiWithJai self-assigned this Oct 17, 2022

ChaiWithJai mentioned this issue Oct 20, 2022

ui: fix long loading times on /jobs/:jobId page #14989

Closed

ChaiWithJai moved this from In Progress to In Review in Nomad UI Oct 20, 2022

This was referenced Oct 21, 2022

ui: fix loading problem on jobs.index route #15007

Closed

[draft] Update Jobs index to watch store, not model #15033

Closed

ChaiWithJai added the type/enhancement label Dec 14, 2022

tgross unassigned ChaiWithJai Jun 20, 2023

philrenaud moved this from In Review to Todo in Nomad UI Jul 13, 2023

philrenaud moved this from Todo to In Progress in Nomad UI Jan 24, 2024

philrenaud closed this as completed May 9, 2024

github-project-automation bot moved this from In Progress to Done in Nomad UI May 9, 2024

tgross added this to Nomad - Community Issues Triage Jun 24, 2024

tgross moved this to Done in Nomad - Community Issues Triage Jun 24, 2024

github-actions bot locked as resolved and limited conversation to collaborators Dec 28, 2024
