Move to src/
layout?
#813
Labels
core
Changes to the cpg-workflows api
enhancement
New feature or request
shared
The change impacts both the LC and RD pipeline.
Inspired by the python best practices @dancoates mentioned earlier in the week
https://www.pyopensci.org/python-package-guide/package-structure-code/python-package-structure.html
We have a fundamental issue with our usage of cpg_workflows in Cloud infrastructure:
cpg_workflows
is at the root of the project folder)cpg_workflows
at the front of the Python PATH (i.e. whenmain.py
imports stagesfrom cpg_workflows.X
it uses the git cloned version of the code, not the installed version. The packages and versions available in the environment at runtime are sourced from the installation in the image). This is an inherent conflictcpg_utils
, orseqr-loading-pipelines
are sourced from the installation inside the image, while cpg_workflows is sourced from the cloned folder.query_command
we pickle a bit of source code, pass it into a new VM, and run it there. When this code is run, we import the packages inside that image. i.e. for every one of ourquery_modules
scripts which is run usingquery_command
, we potentially import different versions of the cpg_workflows library (here) compared with whenmain.py
was run to create the batch jobs.Something as basic as moving to a
src
layout could be a partial solution to this. We would never be in a position to accidentally importfrom cpg_workflows import X
and pick up the cloned code instead of the installed code. That would create stability between the code running when creating a batch, and the code running inside a batch.The issue with the flat layout we have is that you can accidentally run code you don't intend to. Normally running in a container with pre-installed versions would overcome this, but analysis-runner clones the codebase into the running container before doing anything else, so we're right back at this vulnerability.
This change would make the pipeline more cumbersome to run while undergoing development - to test new changes, you need to push a new image and use it as a driver image - but it would add zero burden to code in
main
, and stability there should be the most important factor when we are trying to nail down exact versions of our tools and code.The text was updated successfully, but these errors were encountered: