Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Makes the janitor use the correct id so it doesn't delete literally e… #778

Merged
merged 2 commits into from
Nov 6, 2018

Conversation

kurtwheeler
Copy link
Contributor

…very working directory...

Issue Number

N/A Came up during staging test

Purpose/Implementation Notes

Our job models have two id fields. One of them is id, which is an auto-incrementing primary key. The other is nomad_job_id which is a very long alphanumeric string identifying a nomad job. We make working directories for each of our jobs using their PK fields: id. We have to query the nomad API using the id field that Nomad knows about: nomad_job_id.

We were using the wrong PK to query nomad jobs, so we thought no jobs were running so we indiscriminately deleted every single working directory while jobs were using them.

Types of changes

  • Bugfix (non-breaking change which fixes an issue)

Functional tests

This is still a WIP so I haven't done any testing yet.

Checklist

  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

@ghost ghost assigned kurtwheeler Nov 5, 2018
@kurtwheeler kurtwheeler mentioned this pull request Nov 5, 2018
Copy link
Contributor

@Miserlou Miserlou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/giphy this is why we have a staging server

@kurtwheeler
Copy link
Contributor Author

Honestly I think we have a staging server to catch bugs that are less obvious and not as easy to test for. There's no reason this should have made it to the staging server.

I've improved the unit test so it makes sure the janitor does what it's supposed to and doesn't do stuff it's not supposed to. I've also fixed the other bug that this unit test exposed. Wanna take a look at those changes since they're much more significant than the original bug fix?

@Miserlou
Copy link
Contributor

Miserlou commented Nov 6, 2018

Still looks good to me. Was the other bug about not being able to reach Nomad?

@kurtwheeler
Copy link
Contributor Author

The other bug was about if we ended up with an id that a job didn't exist for. Because the try catch block around that just had a pass in it, it would continue without reassigning the job variable so it would just try to delete the same directory as the last iteration of the loop did.

It probably won't come up in practice, at least not unless we start hitting RDS volume issues and decide to implement something that cleans up old jobs or something.

@kurtwheeler kurtwheeler merged commit 7a1e964 into dev Nov 6, 2018
kurtwheeler added a commit that referenced this pull request Jan 10, 2019
Makes the janitor use the correct id so it doesn't delete literally e…
@wvauclain wvauclain deleted the kurtwheeler/fix-janitor branch July 11, 2019 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants