-
Notifications
You must be signed in to change notification settings - Fork 357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: fix some failing long-running tests related to password requirements #9421
Conversation
✅ Deploy Preview for determined-ui ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see znode tests are still failing too. i'm running https://app.circleci.com/pipelines/github/determined-ai/determined/56052/workflows/4c820a30-5f45-407c-b11b-115cfc9aead6 to see if it fixes it. if it does i can merge into your branch.
- run: | ||
name: Set initial user password | ||
command: | | ||
echo "export INITIAL_USER_PASSWORD=${INITIAL_USER_PASSWORD}" >> "$BASH_ENV" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this do anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sets the variable in the environment of future commands run within the same job.
@@ -1932,6 +1933,11 @@ jobs: | |||
- install-devcluster | |||
- start-devcluster: | |||
target-stage: db | |||
- run: | | |||
sudo mkdir -p /etc/systemd/system/determined-master.service.d | |||
echo "[Service]" | sudo tee /etc/systemd/system/determined-master.service.d/password.override.conf >/dev/null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why tee -a
and throwing away the duped output >/dev/null
instead of than just >>
echo "[Service]" | sudo tee /etc/systemd/system/determined-master.service.d/password.override.conf >/dev/null | |
echo "[Service]" >>/etc/systemd/system/determined-master.service.d/password.override.conf |
even better might be a heredoc to write the entire conf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shell pipes and redirects (like >>
) create new processes, but there isn't a way to apply a temporary owner to them; sudo echo narf >> poit
runs echo
as a superuser, but opens the poit
file with the current session's user, which in this case wouldn't have permission to write to /etc/systemd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to fail in weird ways if sudo is configured to require a password.
But more importantly, why not do the whole file in one block so there's less potential "use -a
except on the first line" confusion? Either create it as a template file and sed
the password into the template while outputting to the override location, or perhaps printf so you don't have nested quote problems?
printf '[Service]\nEnvironment="DET_SECURITY_INITIAL_USER_PASSWORD=%s"\n' "$INITIAL_USER_PASSWORD" \
| sudo tee /etc/systemd/system/determined-master.service.d/password.override.conf > /dev/null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh, ...drat, wish I'd thought of that. I'll give that a try when I can and maybe patch it in as a separate very-smol PR
@@ -103,7 +105,9 @@ def resource_manager( | |||
def test_cluster_down() -> None: | |||
name = "test_cluster_down" | |||
|
|||
with resource_manager(Resource.CLUSTER, name, {}, ["no-gpu"]): | |||
with resource_manager( | |||
Resource.CLUSTER, name, {"initial-user-password": conf.USER_PASSWORD}, ["no-gpu"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's probably worth just adding this to
@contextlib.contextmanager
def resource_manager(
resource: Resource,
name: str,
kwflags: Optional[Dict[str, str]] = None,
boolean_flags: Optional[List[str]] = None,
positional_arguments: Optional[List[str]] = None,
) -> Iterator[None]:
+ kwflags.update({"initial-user-password": conf.USER_PASSWORD})
"""Context manager to bring resources up and down."""
resource_up(resource, name, kwflags, boolean_flags, positional_arguments)
try:
yield
finally:
resource_down(resource, name)
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #9421 +/- ##
=======================================
Coverage 48.98% 48.99%
=======================================
Files 1238 1238
Lines 160518 160518
Branches 2783 2783
=======================================
+ Hits 78629 78638 +9
+ Misses 81714 81705 -9
Partials 175 175
Flags with carried forward coverage won't be shown. Click here to find out more. |
c9101b0
to
2f79c70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine to me from an infra perspective. My only note is that the variable and config refers to initial "user" password - is that setting the admin and determined user both to the same initial password? I don't remember how that works. :) As long as that's the case, I'm happy. My other in-line note is more trivial style opinion than anything functional.
Yes, it sets the admin and determined user passwords to the same value. |
Ticket
Description
Addressing CircleCI test failures on Slurm clusters, Debian package install, and e2e "local" and perf tests.
Test Plan
CI tests pass
Checklist
docs/release-notes/
.See Release Note for details.