completed flag check may erroneously stop on regex matches #470

Closed
nsheff opened this issue Feb 17, 2024 · 6 comments

Comments

@nsheff
Contributor

nsheff commented Feb 17, 2024

I'm trying to submit 6 jobs with looper. It's a brand new project, so I've never submitted any before. I noticed that one of them says:

Found existing status: completed. Skipping sample.

This is bizarre because it's a brand new project! It has never been submitted before.

I realize this sample shares a prefix with another sample: one is named pairs_swap_maintain_coords, which the pipeline runs, and then the next sample is named pair_swap -- which the pipeline incorrectly says is already completed.

I'm guessing there's a regex that's looking for {sample_name}*_completed.flag. If that's the case, the first sample's completed flag would also match for the second sample, so the second job would never be submitted.
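
To make the suspected failure mode concrete, here is a minimal sketch assuming flag files are named {sample_name}_completed.flag and are matched with a glob-style pattern. The sample names, file names, and both helper functions are hypothetical and chosen for illustration only; this is not looper's or pipestat's actual code.

```python
from fnmatch import fnmatch

# Hypothetical flag file left behind by a sample named "frog_1_extended".
flags_on_disk = ["frog_1_extended_completed.flag"]


def is_completed_loose(sample_name, flag_files):
    """Wildcard check: any file matching {sample_name}*_completed.flag counts."""
    pattern = f"{sample_name}*_completed.flag"
    return any(fnmatch(f, pattern) for f in flag_files)


def is_completed_exact(sample_name, flag_files):
    """Exact check: only {sample_name}_completed.flag counts."""
    return f"{sample_name}_completed.flag" in flag_files


# "frog_1" has never run, but the wildcard absorbs "_extended".
print(is_completed_loose("frog_1", flags_on_disk))
print(is_completed_exact("frog_1", flags_on_disk))
```

Under these assumptions, the loose check reports the sample as completed while the exact check does not, which is the behavior described above.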

@nsheff
Contributor Author

nsheff commented Feb 17, 2024

This may actually be a bug with pipestat, rather than looper. I'm guessing this coincides with a switch to using pipestat for status checks.

@donaldcampbelljr
Contributor

I can't reproduce this using a modified hello_looper example for both the basic and the pipestat approaches (which look for their flags in slightly different ways).

@donaldcampbelljr
Contributor

However, looking at the basic, non-pipestat example, I can see where the function fetch_sample_flags might have issues if you had a flag from a different sample in the results folder, because of this logic:

looper/looper/utils.py

Lines 93 to 98 in 1468956

    folder_contents = [os.path.join(sfolder, f) for f in os.listdir(sfolder)]
    return [
        x
        for x in folder_contents
        if os.path.splitext(x)[1] == ".flag" and os.path.basename(x).startswith(pl_name)
    ]

It appears to check only the .flag extension and the pipeline name prefix. The sample name isn't considered at all.
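
One way to make such a filter sample-aware is sketched below. It assumes flag files are named {pipeline_name}_{sample_name}_{status}.flag and that the set of possible statuses is known up front. The function name, the status list, and the naming scheme are all assumptions for illustration, not the change made in the referenced commit.

```python
import os

# Assumed status names; the real set depends on the pipeline's configuration.
ASSUMED_STATUSES = ("completed", "running", "failed")


def fetch_flags_for_sample(sfolder, pl_name, sample_name, statuses=ASSUMED_STATUSES):
    """Return paths of flag files in sfolder that belong to exactly this sample.

    Builds the full expected file names and compares by equality, so a
    sample whose name is a prefix of another sample's name cannot pick up
    the other sample's flags.
    """
    expected = {f"{pl_name}_{sample_name}_{status}.flag" for status in statuses}
    return sorted(
        os.path.join(sfolder, f) for f in os.listdir(sfolder) if f in expected
    )
```

Comparing against complete file names rather than using startswith avoids the prefix problem, at the cost of needing to know the status names in advance.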

@nsheff
Contributor Author

nsheff commented Mar 12, 2024

Was this fixed by the pipestat update referenced above?

@nsheff nsheff added this to the v1.8.0 milestone Mar 12, 2024
@donaldcampbelljr
Contributor

I don't believe so. The pipestat code above was broken for filebackend and is not used for getting sample statuses.

donaldcampbelljr added a commit that referenced this issue Mar 13, 2024
@donaldcampbelljr
Contributor

Should be solved with the above commit.
