docs: release notes & updates for Kubernetes pods to jobs #9443

Merged

carolinaecalderon (Contributor) commented May 29, 2024

Ticket

RM-258

Description

Add a release note and update the Kubernetes docs for the pods-to-jobs project.

Test Plan

N/A

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

cla-bot added the cla-signed label May 29, 2024
determined-ci added the documentation label May 29, 2024
carolinaecalderon changed the base branch from main to stoksc/feat/kubernetesjobs May 29, 2024 17:30

netlify bot commented May 29, 2024

Deploy Preview for determined-ui ready!

Latest commit: 44c279a
Latest deploy log: https://app.netlify.com/sites/determined-ui/deploys/665766360b73150007815927
Deploy Preview: https://deploy-preview-9443--determined-ui.netlify.app

determined-ci requested a review from a team May 29, 2024 17:30
carolinaecalderon marked this pull request as ready for review May 29, 2024 17:31

codecov bot commented May 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 49.04%. Comparing base (66f9d1e) to head (6f888c4).

Additional details and impacted files
@@                      Coverage Diff                       @@
##           stoksc/feat/kubernetesjobs    #9443      +/-   ##
==============================================================
- Coverage                       49.04%   49.04%   -0.01%     
==============================================================
  Files                            1233     1233              
  Lines                          159205   159205              
  Branches                         2778     2777       -1     
==============================================================
- Hits                            78084    78076       -8     
- Misses                          80947    80955       +8     
  Partials                          174      174              
Flag Coverage Δ
backend 43.56% <ø> (-0.01%) ⬇️
harness 63.97% <ø> (-0.01%) ⬇️
web 44.38% <ø> (ø)


see 3 files with indirect coverage changes

tara-det-ai (Contributor) left a comment

Added a suggested edit to match the style, voice, and tone of the release notes.

determined-ci requested a review from a team May 29, 2024 18:30
carolinaecalderon merged commit 42ead04 into stoksc/feat/kubernetesjobs May 29, 2024
80 of 95 checks passed
carolinaecalderon deleted the carolinac/pods2jobs-docs branch May 29, 2024 18:56
stoksc added a commit that referenced this pull request May 31, 2024
This change updates the Kubernetes resource manager to submit one Kubernetes job per Determined allocation instead of many pods. This is complicated, but we think it is worth it because:
- Jobs play nice with resource quotas and other Kubernetes features out of the box.
- Eventually we can delegate restarts, TTL, pause/resume (using suspend), and more to jobs.
- They allow us to better integrate with Kueue and other tools in the ML ecosystem.
- Supporting VolcanoJobs (or similar alternatives) alongside Jobs is realistic.
- The refactor is net positive w.r.t. test coverage (20% to 80%) and code quality.

This commit is the result of several PRs, enumerated here for easier discovery.
- #9296 contains most of the code changes.
- #9443 
- #9447 
- #9450 
- #9451

Co-authored-by: Carolina Calderon <[email protected]>
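
For readers unfamiliar with the pods-to-jobs idea described in the commit message above, here is a minimal sketch of what "one Kubernetes job per allocation" could look like as a `batch/v1` Job manifest. It is not taken from the Determined codebase: the function `build_job_manifest`, the `alloc-` naming scheme, the label key, and the container name are all hypothetical. Only the Job fields themselves (`parallelism`, `completions`, `backoffLimit`, `suspend`, `restartPolicy`) are standard Kubernetes fields.

```python
from typing import Any, Dict


def build_job_manifest(
    allocation_id: str,
    num_pods: int,
    image: str,
    suspend: bool = False,
) -> Dict[str, Any]:
    """Return a batch/v1 Job manifest that runs num_pods pods for one allocation.

    Illustrative only: the names and labels here are assumptions, not
    Determined's actual conventions.
    """
    if num_pods < 1:
        raise ValueError("num_pods must be at least 1")
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            # Hypothetical naming scheme and label key.
            "name": f"alloc-{allocation_id}",
            "labels": {"example.com/allocation-id": allocation_id},
        },
        "spec": {
            # One Job fans out to all pods of the allocation.
            "parallelism": num_pods,
            "completions": num_pods,
            # Do not let the Job controller retry failed pods in this sketch.
            "backoffLimit": 0,
            # The standard Job field that could back pause/resume.
            "suspend": suspend,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{"name": "task", "image": image}],
                }
            },
        },
    }


manifest = build_job_manifest("abc123", num_pods=4, image="example/image:latest")
```

Because the allocation is a single Job object rather than a set of loose pods, namespace-level features such as resource quotas apply to one object, and flipping `suspend` on that object is one way a pause/resume feature could later be delegated to Kubernetes.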