
Fix file processor running on files that are on unfinished upload runs. #1506

Merged
deepsidhu85 merged 6 commits into main from fix_early_file_processor
Dec 14, 2023

Conversation

@JeffreyThiessen (Member) commented Dec 13, 2023

Description of changes

Added a filtering step to SequencingObjectProcessingService so that files belonging to a sequencing run that is not in the COMPLETE state are not picked up for processing.
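
The filtering rule can be illustrated with a minimal Python sketch. This is not the Java code in SequencingObjectProcessingService; the `SequencingRun` and `SequencingObject` classes and the `eligible_for_processing` function are hypothetical stand-ins, and only the UPLOADING and COMPLETE state names come from this PR. It assumes that files with no associated sequencing run (for example, web GUI uploads) remain eligible, as described in the test steps.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SequencingRun:
    """Hypothetical stand-in for an IRIDA sequencing run."""
    run_id: int
    upload_status: str  # e.g. "UPLOADING" or "COMPLETE"


@dataclass
class SequencingObject:
    """Hypothetical stand-in for an uploaded file awaiting processing."""
    name: str
    run: Optional[SequencingRun] = None  # None when there is no associated run


def eligible_for_processing(objects: List[SequencingObject]) -> List[SequencingObject]:
    """Keep objects that have no sequencing run, or whose run is COMPLETE."""
    return [
        obj for obj in objects
        if obj.run is None or obj.run.upload_status == "COMPLETE"
    ]


uploading = SequencingRun(run_id=1, upload_status="UPLOADING")
complete = SequencingRun(run_id=2, upload_status="COMPLETE")
queue = [
    SequencingObject("still_uploading.fastq.gz", uploading),
    SequencingObject("finished.fastq.gz", complete),
    SequencingObject("web_upload.fastq.gz"),
]
picked = [obj.name for obj in eligible_for_processing(queue)]
print(picked)  # ['finished.fastq.gz', 'web_upload.fastq.gz']
```

Once the first run's status changes to COMPLETE, the same call would return all three objects, which mirrors the behaviour the test steps check for.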

Related issue

Fixes #1505

This fixes the race condition in which FastQC could run on files before they had finished uploading.

How to test changes

  1. Run IRIDA.
  2. In IRIDA, create a new project and a new sample in that project. Make note of the project ID and the sample name.
  3. Check out and build the irida-uploader codebase, then start a python3 interpreter in which to import it:
cd irida-uploader
git pull origin development
make
source .virtualenv/bin/activate
python3
  4. Use the iridauploader libraries to create a sequencing run and upload a file:
import iridauploader
from iridauploader import api, model
# Create an API client for IRIDA.
# If you built IRIDA with the dev database seed, the following credentials should work.
a = api.ApiCalls("sequencer", "N9Ywc6GKWWZotzsJGutj3BZXJDRn65fXJqjrk29yTjI", "http://localhost:8080/api/", "admin","Password1!")
# test connection
a.get_irida_version()

# Create a sequence file object.
# A valid file that will pass FastQC can be found in the irida-uploader source: irida-uploader/examples/directory_run/file_1.fastq.gz
sf = iridauploader.model.SequenceFile(['/path/to/a/fastq.gz/file/mysample.fastq.gz'])

# create a new sequencing run in IRIDA
run_id = a.create_seq_run(metadata={'layoutType': 'SINGLE_END'}, sequencing_run_type='miseq')
# View your sequencing run in the IRIDA admin panel at http://localhost:8080/admin/sequencing-runs

# Use your project id and sample name from before
p_id = 1 # Note: this should be an int
s_name = 'my_sample' # Note: this should be a string

# upload the data
a.send_sequence_files(sf, s_name, p_id, run_id)
# The response should look something like this:
{'resource': {'file': '/tmp/irida/sequence-files/45/1/valid.fastq.gz',
              'createdDate': 1702507892000,
              'modifiedDate': 1702507892000,
              'uploadSha256': None,
              'fileName': 'valid.fastq.gz',
              'label': 'valid.fastq.gz',
              'fileSizeBytes': 864,
              'links': [{'rel': 'sample/sequenceFiles', 'href': 'http://localhost:8080/api/samples/140/sequenceFiles'},
                        {'rel': 'self', 'href': 'http://localhost:8080/api/samples/140/unpaired/24/files/45'},
                        {'rel': 'sample', 'href': 'http://localhost:8080/api/samples/140'},
                        {'rel': 'sequenceFile/sequencingObject', 'href': 'http://localhost:8080/api/samples/140/unpaired/24'}],
              'identifier': '45'}}
  5. In IRIDA, confirm that the sequencing run is still in the UPLOADING state.
  6. In IRIDA, wait a few minutes and confirm that the file has not been processed by FastQC.
  7. In the Python interpreter, set the sequencing run to COMPLETE:
a.set_seq_run_complete(run_id)
  8. In IRIDA, confirm that the sequencing run is in the COMPLETE state.
  9. In IRIDA, assuming your dev environment is set up correctly, confirm that FastQC has run on the sample. (This can also be seen in the dev output logs.)
  10. Upload a file to a sample via the web GUI. Confirm that FastQC runs on it, since there is no associated sequencing run.
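
For repeated testing, the upload sequence from the steps above could be wrapped in a small helper. The sketch below is only an illustration: `upload_then_complete` and `RecordingApi` are hypothetical names that are not part of iridauploader, and the helper only calls the three client methods already used in the steps (`create_seq_run`, `send_sequence_files`, `set_seq_run_complete`). The stub lets the call order be checked without a running IRIDA instance.

```python
from typing import Any, Callable, List


def upload_then_complete(api: Any, seq_file: Any, sample_name: str,
                         project_id: int, pause: Callable[[], None]) -> Any:
    """Create a run, upload one file, pause while the run is UPLOADING, then complete it."""
    run_id = api.create_seq_run(metadata={'layoutType': 'SINGLE_END'},
                                sequencing_run_type='miseq')
    api.send_sequence_files(seq_file, sample_name, project_id, run_id)
    pause()  # window in which FastQC should NOT pick up the file
    api.set_seq_run_complete(run_id)
    return run_id


class RecordingApi:
    """Hypothetical dry-run stand-in that records calls instead of contacting IRIDA."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def create_seq_run(self, metadata, sequencing_run_type):
        self.calls.append("create_seq_run")
        return 1  # made-up run id

    def send_sequence_files(self, seq_file, sample_name, project_id, run_id):
        self.calls.append("send_sequence_files")

    def set_seq_run_complete(self, run_id):
        self.calls.append("set_seq_run_complete")


dry_run = RecordingApi()
dry_run_id = upload_then_complete(dry_run, seq_file=None, sample_name="my_sample",
                                  project_id=1,
                                  pause=lambda: dry_run.calls.append("pause"))
print(dry_run.calls)
# ['create_seq_run', 'send_sequence_files', 'pause', 'set_seq_run_complete']
```

Against a real instance, the same helper would be called with the `a`, `sf`, `s_name`, and `p_id` values from the steps above and a `pause` that sleeps for a few minutes (for example via `time.sleep`), checking the IRIDA admin panel during the pause.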

Checklist

Things for the developer to confirm they've done before the PR should be accepted:

  • CHANGELOG.md (and UPGRADING.md if necessary) updated with information for new change.
  • Tests added (or description of how to test) for any new features.
  • User documentation updated for UI or technical changes.

@JeffreyThiessen JeffreyThiessen marked this pull request as ready for review December 13, 2023 23:44

@deepsidhu85 (Contributor) left a comment

Looks good to me and working as expected! Just a few things to update:

  1. Update the CHANGELOG.
  2. Change the target branch to main (you will need to pull in the changes currently in main).
  3. Verify that the version number in the Gradle build file is set to 23.10.1.

@JeffreyThiessen JeffreyThiessen changed the base branch from development to main December 14, 2023 16:33
@JeffreyThiessen JeffreyThiessen force-pushed the fix_early_file_processor branch from 2e36c4b to 173053f Compare December 14, 2023 16:44

@deepsidhu85 (Contributor) left a comment

Looks good to me!

@deepsidhu85 deepsidhu85 merged commit 3486466 into main Dec 14, 2023
12 checks passed
Linked issue: file processing race condition
2 participants