DATAUP-729 job ts implementation #2960

Open
wants to merge 10 commits into
base: develop

Contributor

@n1mus n1mus commented May 9, 2022

Description of PR purpose/changes

  • Please include a summary of the change and which issue is fixed.
  • Please also include relevant motivation and context.
  • List any dependencies that are required for this change.

Jira Ticket / Issue

Related Jira ticket: https://kbase-jira.atlassian.net/browse/DATAUP-X

  • Added the Jira Ticket to the title of the PR (e.g. DATAUP-69 Adds a PR template)

Testing Instructions

  • Details for how to test the PR:
  • Tests pass locally and in GitHub Actions
  • Changes available by spinning up a local narrative and navigating to X to see Y

Dev Checklist:

  • My code follows the guidelines at https://sites.google.com/lbl.gov/trussresources/home?authuser=0
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • (JavaScript) I have run Prettier and ESLint on changed code manually or with a git precommit hook
  • (Python) I have run Black and Flake8 on changed Python code manually or with a git precommit hook
  • Any dependent changes have been merged and published in downstream modules

Updating Version and Release Notes (if applicable)

@ialarmedalien
Collaborator

This is looking good!

Can you add in functionality such that if there are no updated jobs in response to a request, the backend returns an error?

Thanks!

@n1mus n1mus force-pushed the DATAUP-729-just-job-ts branch from bf55bdb to cd0194c Compare May 12, 2022 11:01
@n1mus n1mus marked this pull request as ready for review May 12, 2022 11:01
@lgtm-com

lgtm-com bot commented May 12, 2022

This pull request introduces 1 alert when merging cd0194c into c57f696 - view on LGTM.com

new alerts:

  • 1 for Unused import

@codecov

codecov bot commented May 12, 2022

Codecov Report

Merging #2960 (6bbd36d) into develop (86c5af6) will increase coverage by 0.19%.
The diff coverage is 93.75%.

@@             Coverage Diff             @@
##           develop    #2960      +/-   ##
===========================================
+ Coverage    73.25%   73.45%   +0.19%     
===========================================
  Files           36       36              
  Lines         3903     3906       +3     
===========================================
+ Hits          2859     2869      +10     
+ Misses        1044     1037       -7     

Impacted Files                                 Coverage Δ
src/biokbase/narrative/jobs/job.py             93.11% <86.95%> (+2.60%) ⬆️
src/biokbase/narrative/jobs/jobcomm.py         98.96% <100.00%> (+<0.01%) ⬆️
src/biokbase/narrative/jobs/jobmanager.py      95.56% <100.00%> (+0.20%) ⬆️
src/biokbase/narrative/jobs/util.py            100.00% <100.00%> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 014f72a...6bbd36d. Read the comment docs.

@n1mus n1mus changed the title from "job ts implementation" to "DATAUP-729 job ts implementation" May 13, 2022
Comment on lines +521 to +524
if msg_type == MESSAGE_TYPE["STATUS"]:
    now = time_ns()
    for output_state in content.values():
        output_state["last_checked"] = now
Collaborator

Why not add this timestamp when the job manager is putting together the list of jobs, instead of adding an extra iteration through the job state data here?

Contributor Author

Wasn't sure since the CANCEL_JOBS request also responds with a STATUS message.

Contributor Author

I decided not to filter the STATUS response for CANCEL_JOBS, though, because I figured that in theory they should all get updated, whether successfully or just coming back with an error.

Collaborator

Because everything is asynchronous, the FE doesn't have any way of knowing what triggered a job status message -- whether it was a cancel request, a status request, or the BE job loop. That's why I say it's better to put the timestamp on in the job manager, so that all job state objects that the FE receives have a timestamp on them.
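
A minimal sketch of that suggestion, for illustration only. It assumes job states are plain dicts keyed by job ID; `construct_job_states` and the `stored` data are hypothetical stand-ins, not the job manager's actual code. The point is that stamping `last_checked` where the state set is built means every STATUS payload carries it, whatever triggered the message.

```python
from time import time_ns
from typing import Any, Dict, Iterable


def construct_job_states(
    jobs: Dict[str, Dict[str, Any]], job_ids: Iterable[str]
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical stand-in for the job manager's state-building step."""
    now = time_ns()  # one timestamp shared by every state in this batch
    states = {}
    for job_id in job_ids:
        # Shallow copy so the stored state is not modified by the stamping.
        state = dict(jobs[job_id])
        state["last_checked"] = now
        states[job_id] = state
    return states


# Illustrative data only.
stored = {
    "job_a": {"job_id": "job_a", "status": "running"},
    "job_b": {"job_id": "job_b", "status": "completed"},
}
states = construct_job_states(stored, ["job_a", "job_b"])
```

With this placement the comm layer no longer needs its own loop over `content.values()`, and cancel responses, status requests, and the job loop all emit the same shape.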

Contributor Author

I think one allure of putting everything into JobComm is less test surgery ... But putting it deep in the stack, at the origin of the STATUS response data structure, seems less googly-eyed.

Contributor Author

Okay, I tried putting all the filtering/last_checked logic at the source, _construct_job_state_set, but the tests were complaining, so I'm abandoning that effort for the sake of time. Is the current placement of the filtering/last_checked good enough?

@n1mus n1mus force-pushed the DATAUP-729-just-job-ts branch from 2d673b4 to 6bbd36d Compare May 19, 2022 20:51
@sonarqubecloud

SonarCloud Quality Gate failed.

Bugs: 1 (rating C)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 4 (rating A)

Coverage: no information
Duplication: 0.0%

@@ -32,6 +32,13 @@ def generate_error(job_id, err_type):
    return error_strings[err_type]


def trim_ee2_state(ee2_state, exclude_fields):
Collaborator

Don't we have this code somewhere else?

Contributor Author

Yep, Job._trim_ee2_state. I just got tired of using that in tests when usually we use independent testing functions
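
For readers without the diff open, a helper with this signature might look roughly like the sketch below. Only the name and parameters come from the hunk above; the body is an assumption. It is written to mutate the dict in place (the call site in the next hunk discards the return value), to handle top-level fields only, and to ignore fields that are absent.

```python
from typing import Any, Dict, Iterable


def trim_ee2_state(ee2_state: Dict[str, Any], exclude_fields: Iterable[str]) -> None:
    """Remove the named top-level fields from a state dict, in place.

    Assumed behaviour: fields that are not present are silently skipped.
    """
    for field in exclude_fields:
        # pop() with a default avoids a KeyError for missing fields
        ee2_state.pop(field, None)


# Illustrative state only; real ee2 states carry many more fields.
state = {"job_id": "abc", "status": "running", "job_input": {"app_id": "x"}}
trim_ee2_state(state, ["job_input", "not_present"])
```

Keeping a standalone copy like this in the test utilities means tests do not rely on the `Job` method they are checking, at the cost of two implementations to keep in sync.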

Comment on lines +248 to +253
ee2_states = self.job_state_data
if params.get("exclude_fields"):
    for ee2_state in ee2_states.values():
        trim_ee2_state(ee2_state, params["exclude_fields"])
if params.get("return_list"):
    ee2_states = list(ee2_states.values())
Collaborator

Do any of those params ever change? There's only one place where check_workspace_jobs gets called, and the params are always the same, so...

Contributor Author

No, but I thought it might be a good idea to implement the "exclude_fields" param since here I'm paying closer attention to when state updates are triggered

Collaborator

Has adding the exclude_fields param changed the output of the function?

Contributor Author

Well .... now that you mention it ... probably not

Collaborator

Today's "good idea to implement" is tomorrow's "why on earth did someone write this?". YAGNI. 😄

job.update_state({})
self.assertEqual(last_updated, job.last_updated)

# job has init ee2 state
Collaborator

This test looks suspiciously spaghetti code-like. Does it need to be done as this long series of transitions or can it be split into separate tests?

Contributor Author

I thought it followed a very similar pattern throughout and so could flow in a single function. The punchline is that the last_updated defined at the top never changes throughout these tests. Is there a benefit to making test methods small?

Collaborator

It's much easier to read, understand, and update/edit a couple of stanzas of code than it is a long series of stanzas. Unless there is a specific need to test a sequence of modifications (e.g. there's something going on elsewhere that changes state as a result of these mods), it's best to make tests as simple as possible to assist future codebase editors and maintainers.

Contributor Author

Ah. But what if it's two long stanzas of highly repetitive code? With a common punchline that is accentuated by more repetition?

Collaborator

If it's highly repetitive, it suggests that the repetition could be abstracted out into a function... or that it could be replaced by individual tests that validate the atomic operations involved.
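
As a rough illustration of both options, the sketch below pulls the repeated stanza into one helper and calls it from small, individually named tests. `FakeJob` is a hypothetical stand-in written only so the example runs; its `update_state` semantics (a counter bumped only on real changes) are an assumption, not the behaviour of the actual `Job` class.

```python
import unittest


class FakeJob:
    """Hypothetical stand-in for the Job class under test."""

    def __init__(self, state):
        self._acc_state = dict(state)
        self.last_updated = 0

    def update_state(self, state):
        # Only bump last_updated when the incoming state changes something.
        merged = {**self._acc_state, **state}
        if merged != self._acc_state:
            self._acc_state = merged
            self.last_updated += 1


class LastUpdatedTest(unittest.TestCase):
    def check_no_change(self, initial, update):
        """Shared stanza: applying `update` must leave last_updated alone."""
        job = FakeJob(initial)
        before = job.last_updated
        job.update_state(update)
        self.assertEqual(before, job.last_updated)

    def test_empty_update(self):
        self.check_no_change({"status": "running"}, {})

    def test_identical_update(self):
        self.check_no_change({"status": "running"}, {"status": "running"})

    def test_changed_update_bumps(self):
        job = FakeJob({"status": "running"})
        job.update_state({"status": "completed"})
        self.assertEqual(1, job.last_updated)
```

Each test starts from a fresh object, so a failure names the one transition that broke instead of stopping partway through a long sequence.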

@@ -757,7 +821,7 @@ def test_in_cells__batch__same_cell(self):
batch_job, child_jobs = batch_fam[0], batch_fam[1:]

for job in child_jobs:
-    job.cell_id = "hello"
+    job._acc_state["job_input"]["narrative_cell_info"]["cell_id"] = "hello"
Collaborator

Was this the only place you could find where an attribute was changed (other than via the update_state method)?

Contributor Author

I checked every field in job.__setattr__ that was from the "job_input". I didn't check anything in the outer level of the ee2 state.

Contributor Author

But update_state is the only place _acc_state is mutated

Collaborator
@ialarmedalien ialarmedalien May 20, 2022

You left a TODO comment about whether the attribute setter was ever used in job.py -- seems as though you've answered it here, so can delete the comment.

2 participants