ci: fix some failing long-running tests related to password requirements #9421

Merged 12 commits on Jun 12, 2024
69 changes: 64 additions & 5 deletions .circleci/real_config.yml
@@ -661,6 +661,7 @@ commands:
name: Run det-deploy tests
working_directory: ./e2e_tests
command: |
DET_SECURITY_INITIAL_USER_PASSWORD=$INITIAL_USER_PASSWORD \
pytest -vv -s \
-m <<parameters.mark>> \
--junitxml=/tmp/test-results/det-deploy-tests.xml \
@@ -1939,6 +1940,11 @@ jobs:
- install-devcluster
- start-devcluster:
target-stage: db
- run: |
sudo mkdir -p /etc/systemd/system/determined-master.service.d
echo "[Service]" | sudo tee /etc/systemd/system/determined-master.service.d/password.override.conf >/dev/null
Contributor

Why tee -a, throwing away the duplicated output with >/dev/null, instead of just >>?

Suggested change
echo "[Service]" | sudo tee /etc/systemd/system/determined-master.service.d/password.override.conf >/dev/null
echo "[Service]" >>/etc/systemd/system/determined-master.service.d/password.override.conf

Even better might be a heredoc that writes the entire conf in one go.
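
A minimal sketch of the heredoc idea, written against a temporary directory so it runs without root. The `conf_dir` path and the dummy password value are assumptions for illustration only; in the CI job the `tee` would run under `sudo` and target the real override path.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins: a temp directory instead of the real
# /etc/systemd/system/determined-master.service.d, and a dummy password.
conf_dir=$(mktemp -d)
conf_file="$conf_dir/password.override.conf"
INITIAL_USER_PASSWORD='example-Passw0rd'

# The unquoted EOF delimiter lets the shell expand ${INITIAL_USER_PASSWORD}.
# tee (rather than a plain redirect) is what would be prefixed with sudo in CI,
# so that the privileged process is the one opening the file.
tee "$conf_file" >/dev/null <<EOF
[Service]
Environment="DET_SECURITY_INITIAL_USER_PASSWORD=${INITIAL_USER_PASSWORD}"
EOF

cat "$conf_file"
```

Inside a CircleCI 2.1 config the heredoc operator would need to be escaped as `\<<`, as the existing `--data-binary @- \<< EOF` step in this file does.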

Contributor Author

Shell redirects (like >>) are opened by the calling shell before the command starts, and there isn't a way to apply a temporary owner to them. sudo echo narf >> poit runs echo as a superuser, but the poit file is opened by the current session's user, which in this case wouldn't have permission to write to /etc/systemd. Piping into sudo tee makes the privileged process open the file instead.
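
The behaviour described here can be checked without root. The sketch below uses a made-up command name and a temp directory (both assumptions) to show that the shell itself opens the redirect target, regardless of what the command on the left is or who it runs as.

```shell
#!/usr/bin/env bash
# Scratch directory so nothing outside it is touched.
work=$(mktemp -d)

# no_such_command_narf is a deliberately nonexistent command. The shell still
# opens (and creates) the redirect target before trying to run it.
no_such_command_narf >> "$work/poit" 2>/dev/null || true

if [ -e "$work/poit" ]; then
  echo "poit was created by the shell, not by the command"
fi

# Two common ways to have the privileged process perform the write
# (left as comments because they need root):
#   echo narf | sudo tee -a /etc/some/file >/dev/null
#   sudo sh -c 'echo narf >> /etc/some/file'
```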

Contributor

This is going to fail in weird ways if sudo is configured to require a password.

But more importantly, why not do the whole file in one block so there's less potential "use -a except on the first line" confusion? Either create it as a template file and sed the password into the template while outputting to the override location, or perhaps printf so you don't have nested quote problems?

printf '[Service]\nEnvironment="DET_SECURITY_INITIAL_USER_PASSWORD=%s"\n' "$INITIAL_USER_PASSWORD" \
| sudo tee /etc/systemd/system/determined-master.service.d/password.override.conf > /dev/null
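
For comparison, a rough sketch of the template-plus-sed alternative mentioned in the same comment. The template file name, the `@PASSWORD@` placeholder, the temp directory, and the dummy password are all hypothetical. The substitution assumes the password contains no `|`, `&`, or backslash characters, which is one reason the printf form may be the safer choice.

```shell
#!/usr/bin/env bash
# Scratch directory and dummy password, for illustration only.
work=$(mktemp -d)
INITIAL_USER_PASSWORD='example-Passw0rd'

# Hypothetical template; the quoted 'EOF' keeps the placeholder literal.
cat > "$work/password.override.conf.tmpl" <<'EOF'
[Service]
Environment="DET_SECURITY_INITIAL_USER_PASSWORD=@PASSWORD@"
EOF

# '|' is used as the sed delimiter; in CI the output side would be
# piped to `sudo tee` on the real override path instead of a plain redirect.
sed "s|@PASSWORD@|${INITIAL_USER_PASSWORD}|" \
  "$work/password.override.conf.tmpl" > "$work/password.override.conf"

cat "$work/password.override.conf"
```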

Contributor Author

Ooh, ...drat, wish I'd thought of that. I'll give that a try when I can and maybe patch it in as a separate very-smol PR

echo "Environment=\"DET_SECURITY_INITIAL_USER_PASSWORD=${INITIAL_USER_PASSWORD}\"" | sudo tee -a /etc/systemd/system/determined-master.service.d/password.override.conf >/dev/null
sudo systemctl daemon-reload
- run: python3 .circleci/scripts/wait_for_server.py localhost 5432
- run: sudo systemctl restart determined-master
- run: python3 .circleci/scripts/wait_for_server.py localhost 8080 || { journalctl --no-pager -u determined-master; exit 1; }
@@ -2694,6 +2700,11 @@ jobs:
echo "export OPT_DEVBOX_PREFIX=circleci-job-$(echo -n "${CIRCLE_USERNAME}-${CIRCLE_BRANCH}-${CIRCLE_JOB}" | md5sum | awk '{print $1}')" >> "$BASH_ENV"
fi

- run:
name: Set initial user password
command: |
echo "export INITIAL_USER_PASSWORD=${INITIAL_USER_PASSWORD}" >> "$BASH_ENV"
Contributor

does this do anything?

Contributor

Sets the variable in the environment of future commands run within the same job.
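
A small sketch of the mechanism, assuming (as is my understanding of CircleCI) that each step runs in a fresh non-interactive bash, which sources the file named by `BASH_ENV` on startup. The temp file stands in for the real `$BASH_ENV` file and the password is a dummy value.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the file that $BASH_ENV points at in the job.
env_file=$(mktemp)
INITIAL_USER_PASSWORD='example-Passw0rd'

# Same shape as the step in the diff: append an export line to the env file.
echo "export INITIAL_USER_PASSWORD=${INITIAL_USER_PASSWORD}" >> "$env_file"

# Simulate a later step: the variable is removed from the inherited environment,
# and a new non-interactive bash picks it up only by sourcing BASH_ENV.
result=$(env -u INITIAL_USER_PASSWORD BASH_ENV="$env_file" \
  bash -c 'echo "$INITIAL_USER_PASSWORD"')
echo "$result"
```

Because the value is written unquoted, a password containing spaces or shell metacharacters would need extra quoting in the export line.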


- attach_workspace:
at: .

@@ -2812,6 +2823,8 @@ jobs:
name: Query the slot count to ensure slots are allocated
command: |
tries=20
export DET_USER=determined
export DET_PASS=${INITIAL_USER_PASSWORD}
det slot list
until [[ $(det slot list | wc -l) -gt 2 ]] ; do
if [[ $((--tries)) -eq 0 ]]; then
@@ -2913,6 +2926,8 @@ jobs:
auth_file: /home/launcher/.launcher.$HOSTNAME.token
path: /opt/singularity/bin:/usr/local/bin:${PATH}
ld_library_path:
security:
initial_user_password: ${INITIAL_USER_PASSWORD}
reserved_ports_znode50:
type: string
default: |
@@ -2936,9 +2951,6 @@ jobs:
determined_admin_username:
type: string
default: admin
determined_admin_password:
type: string
default: ""
database_username:
type: string
default: postgres
@@ -3087,7 +3099,7 @@ jobs:
--data-binary @- \<< EOF | jq -r '.token'
{
"username": "<<parameters.determined_admin_username>>",
"password": "<<parameters.determined_admin_password>>"
"password": "$INITIAL_USER_PASSWORD"
}
EOF
)
@@ -3147,6 +3159,7 @@ jobs:
name: Query the slot count to ensure slots are allocated
command: |
tries=20
export DET_PASS=${INITIAL_USER_PASSWORD}
det slot list
until [[ $(det slot list | wc -l) -gt 2 ]] ; do
if [[ $((--tries)) -eq 0 ]]; then
@@ -3431,7 +3444,7 @@ jobs:
command: |
export PERF_DOCKER_FLAGS="--network=host"
export PERF_K6_FLAGS='-e DET_ADMIN_USERNAME="admin" \
-e DET_ADMIN_PASSWORD="" \
-e DET_ADMIN_PASSWORD="${INITIAL_USER_PASSWORD}" \
-e model_name="tnjpuojqzbluqiyyqilftulsw" \
-e model_version_number="1" \
-e trial_id="8282" \
@@ -4296,6 +4309,8 @@ workflows:
- test-debian-packaging:
requires:
- package-and-push-system-local-ee
context:
- dev-ci-cluster-default-user-credentials
filters:
branches:
only:
@@ -4305,6 +4320,8 @@ workflows:
name: test-e2e-slurm-misconfigured
requires:
- package-and-push-system-local-ee
context:
- dev-ci-cluster-default-user-credentials
filters:
branches:
only:
@@ -4346,12 +4363,16 @@ workflows:
auth_file: /home/launcher/.launcher.$HOSTNAME.token
path: /opt/singularity/bin:/usr/local/bin:${PATH}
ld_library_path:
security:
initial_user_password: ${INITIAL_USER_PASSWORD}

- test-e2e-slurm:
name: test-e2e-slurm-gpu
mark: "e2e_slurm_gpu"
requires:
- package-and-push-system-local-ee
context:
- dev-ci-cluster-default-user-credentials
filters:
branches:
only:
@@ -4364,6 +4385,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-slurm-singularity-gcp]
@@ -4382,6 +4404,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-slurm-podman-gcp]
@@ -4401,6 +4424,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-slurm-enroot-gcp]
@@ -4420,6 +4444,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-pbs-singularity-gcp]
@@ -4439,6 +4464,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-pbs-podman-gcp]
@@ -4460,6 +4486,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-pbs-enroot-gcp]
@@ -4480,6 +4507,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-slurm-agent-podman-gcp]
@@ -4799,6 +4827,8 @@ workflows:

- test-det-deploy:
name: test-det-deploy-local
context:
- dev-ci-cluster-default-user-credentials
requires:
- package-and-push-system-local
- package-and-push-system-local-ee
@@ -5155,10 +5185,14 @@ workflows:
requires:
- package-and-push-system-local-ee
- request-packaging-tests
context:
- dev-ci-cluster-default-user-credentials

# Local deployment
- test-det-deploy:
name: test-det-deploy-local
context:
- dev-ci-cluster-default-user-credentials
requires:
- package-and-push-system-local
- package-and-push-system-local-ee
@@ -5207,6 +5241,8 @@ workflows:

- test-e2e-slurm:
name: test-e2e-slurm-misconfigured
context:
- dev-ci-cluster-default-user-credentials
filters: *upstream-feature-branch
requires:
- package-and-push-system-local-ee
@@ -5248,9 +5284,13 @@ workflows:
auth_file: /home/launcher/.launcher.$HOSTNAME.token
path: /opt/singularity/bin:/usr/local/bin:${PATH}
ld_library_path:
security:
initial_user_password: ${INITIAL_USER_PASSWORD}

- test-e2e-slurm:
name: test-e2e-slurm-gpu
context:
- dev-ci-cluster-default-user-credentials
filters: *upstream-feature-branch
mark: "e2e_slurm_gpu"
requires:
@@ -5265,6 +5305,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-slurm-singularity-gcp]
@@ -5281,6 +5322,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-slurm-podman-gcp]
@@ -5298,6 +5340,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-slurm-enroot-gcp]
@@ -5315,6 +5358,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-pbs-singularity-gcp]
@@ -5332,6 +5376,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-pbs-podman-gcp]
@@ -5351,6 +5396,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-pbs-enroot-gcp]
@@ -5369,6 +5415,7 @@ workflows:
# that's required by the "gh" command for authentication.
- github-read
- gcp
- gcp-ci-cluster-default-user-credentials
matrix:
parameters:
name: [test-e2e-slurm-agent-podman-gcp]
@@ -5486,23 +5533,31 @@ workflows:
context: github-read
- test-e2e-slurm:
name: test-e2e-slurm-restart
context:
- dev-ci-cluster-default-user-credentials
mark: "e2e_slurm_restart"
requires:
- package-and-push-system-local-ee
extra-pytest-flags: "--no-compare-stats"
- test-e2e-slurm:
name: test-e2e-slurm-preemption
context:
- dev-ci-cluster-default-user-credentials
mark: "e2e_slurm_preemption"
requires:
- package-and-push-system-local-ee
extra-pytest-flags: "--no-compare-stats"
- test-e2e-slurm:
name: test-e2e-slurm-znode
context:
- dev-ci-cluster-default-user-credentials
requires:
- package-and-push-system-local-ee
extra-pytest-flags: "--no-compare-stats"
- test-e2e-slurm:
name: test-e2e-slurm-enroot-znode
context:
- dev-ci-cluster-default-user-credentials
matrix:
parameters:
mark: ["e2e_slurm and not deepspeed"]
@@ -5552,8 +5607,12 @@ workflows:
auth_file: /home/launcher/.launcher.$HOSTNAME.token
path: /opt/singularity/bin:/usr/local/bin:${PATH}
ld_library_path:
security:
initial_user_password: ${INITIAL_USER_PASSWORD}
- test-e2e-slurm:
name: test-e2e-slurm-agent-singularity-znode
context:
- dev-ci-cluster-default-user-credentials
requires:
- package-and-push-system-local-ee
agent-use: "-A"