Chargestate - dev container - dev guide (#59)
- Chargestate prediction data, models, and other resources
- VSCode dev container for containerized development
- Guide for development
Showing 23 changed files with 6,687 additions and 71 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,241 @@ | ||
# Development Guides for dlomix PyTorch Implementation | ||
|
||
This file provides guidelines for contributing PyTorch implementations to the dlomix project, a deep learning framework for proteomics. | ||
|
||
Based on your environment, please follow the respective setup guide: | ||
|
||
- [Dev Containers in VSCode](#dev-containers-in-vscode): Recommended if you would like to isolate everything in a Docker container. If you have Apple Silicon, local development is likely the better option.
- [Local Development Guide](#local-development-guide): Recommended if you are comfortable managing Python virtual environments, dependencies, etc.
- [Google Colab Development Guide](#google-colab-development-guide): More exploratory; it does not give you full control over the development environment (terminal, etc.).
|
||
For other options, such as GitHub Codespaces or similar, please follow the local development guide.
|
||
For contributing, please follow our [implementation guidelines](#implementation-guidelines).
|
||
|
||
## Dev Containers in VSCode | ||
|
||
### Steps | ||
|
||
1. Ensure you have Docker installed on your system and the Docker daemon is running. To validate, run the following command and ensure you get a listing with CONTAINER ID and other column headers rather than an error:
```bash | ||
docker ps | ||
``` | ||
2. If you don't yet have Docker installed, follow these instructions: https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository
|
||
3. To run Docker without sudo (a VSCode requirement), follow these post-installation steps: https://docs.docker.com/engine/install/linux-postinstall/
|
||
4. Open VSCode and install the Dev Containers extension from the Extensions tab.
|
||
5. Clone the forked GitHub repository of DLOmix https://github.com/omsh/dlomix | ||
|
||
6. Open the repository in a Dev Container by clicking on the arrows in the bottom left corner and choosing "Reopen in Container".
|
||
![alt text](vscode-screenshot.png) | ||
|
||
7. The first time, the container build will take some time, after which VSCode connects to the running container. Once it is done, run the following command in the VSCode terminal to install DLOmix with the development packages in editable mode:
|
||
```bash | ||
make install-dev | ||
``` | ||
8. You are now ready to make changes to the source code and see the impact directly. Once you make the changes, they should be reflected in the editable install inside your dev container. | ||
|
||
VSCode Official Tutorial: https://code.visualstudio.com/docs/devcontainers/tutorial | ||
VSCode documentation for Dev Containers: https://code.visualstudio.com/docs/devcontainers/containers
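
The Docker check from step 1 can also be scripted. The sketch below is a hypothetical helper (the name `docker_status` and its return strings are illustrative and not part of DLOmix). It assumes only that `docker ps` exits with a non-zero status when the daemon cannot be reached:

```python
import shutil
import subprocess


def docker_status(which=shutil.which, run=subprocess.run):
    """Return "missing", "daemon-unreachable", or "ok" for the local Docker setup.

    `which` and `run` are injectable so the logic can be exercised without Docker.
    """
    if which("docker") is None:
        return "missing"  # the docker CLI is not on PATH
    # `docker ps` has to talk to the daemon, so a non-zero exit code means the
    # daemon is not running or cannot be accessed by the current user.
    result = run(["docker", "ps"], capture_output=True, text=True)
    return "ok" if result.returncode == 0 else "daemon-unreachable"
```

Calling `docker_status()` before opening the repository in a container gives a quick hint about which of steps 1 to 3 still needs attention.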
|
||
|
||
## Local Development Guide | ||
|
||
### Environment Setup | ||
|
||
#### Option 1: Using venv (Recommended) | ||
|
||
1a. Create and activate a virtual environment: | ||
```bash | ||
python -m venv venv | ||
# On Windows | ||
.\venv\Scripts\activate | ||
# On Unix or MacOS | ||
source venv/bin/activate | ||
``` | ||
|
||
#### Option 2: Using conda | ||
|
||
1b. Create and activate a conda environment: | ||
```bash | ||
conda create -n dlomix python=3.9 | ||
conda activate dlomix | ||
``` | ||
|
||
2. Clone the repository and `cd` into the directory of the cloned repo: | ||
```bash | ||
git clone https://github.com/omsh/dlomix.git | ||
cd dlomix | ||
``` | ||
|
||
3. Install the development dependencies. Ensure the torch-related packages you need are listed in this file; otherwise, extend it:
```bash | ||
pip install -r ./.devcontainer/dev-requirements.txt | ||
``` | ||
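
To see which PyTorch build the requirements file actually installed, a quick check like the following may help. This is a sketch: `describe_torch_build` is a hypothetical helper name, and it relies on `torch.version.cuda` being `None` for CPU-only builds.

```python
def describe_torch_build():
    """Summarize the installed PyTorch build, or report that torch is missing."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    cuda_version = torch.version.cuda  # None for CPU-only builds
    if cuda_version is None:
        return f"torch {torch.__version__} (CPU-only build)"
    return f"torch {torch.__version__} (built with CUDA {cuda_version})"


print(describe_torch_build())
```

If the output does not match what you expected, revisit the index URL in the requirements file before installing the package itself.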
|
||
### DLOmix Editable installation | ||
|
||
Install the package with the dev option and in editable mode: | ||
```bash | ||
pip install -e ".[dev]"
``` | ||
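
One way to confirm that the editable install took effect is to check that the package resolves to your clone rather than to a copy in `site-packages`. The helper below is an illustrative sketch using only the standard library; the function name `resolves_inside` is an assumption, not part of DLOmix.

```python
import importlib.util
from pathlib import Path


def resolves_inside(module_name, directory):
    """Return True if `module_name` would be imported from a file under `directory`."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.has_location:
        return False  # not installed, or no file location (e.g. built-in module)
    origin = Path(spec.origin).resolve()
    return Path(directory).resolve() in origin.parents
```

Run from the root of the cloned repository, `resolves_inside("dlomix", ".")` should return `True` once the editable install is in place.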
|
||
|
||
## Google Colab Development Guide | ||
|
||
### Initial Setup | ||
|
||
1. Create a new Colab notebook and mount your Google Drive: | ||
```python | ||
from google.colab import drive | ||
drive.mount('/content/drive') | ||
``` | ||
|
||
2. Clone the forked dlomix repository: | ||
```bash | ||
!git clone https://github.com/omsh/dlomix.git | ||
``` | ||
|
||
3. Install the development dependencies. Ensure the torch-related packages you need are listed in this file; otherwise, extend it:
```bash
!pip install -r ./dlomix/.devcontainer/dev-requirements.txt
``` | ||
|
||
4. Install the package in development mode: | ||
```bash | ||
!pip install -e "./dlomix[dev]" | ||
``` | ||
|
||
|
||
## Implementation Guidelines | ||
|
||
1. Add PyTorch implementations following the current project structure: | ||
``` | ||
dlomix/ | ||
├── models/ | ||
│ ├── pytorch/ | ||
│ │ ├── __init__.py | ||
│ │ └── model.py | ||
│ └── existing_models/ | ||
``` | ||
|
||
2. Ensure compatibility with existing APIs: | ||
```python
# dlomix/models/pytorch/model.py
import torch
import torch.nn as nn


# Example of maintaining a consistent API
class PrositRTPyTorch(nn.Module):
    """PyTorch implementation of the Prosit retention time model"""

    def __init__(self, *args, seq_length=30, **kwargs):
        super().__init__()
        # Placeholder layer so that the example runs; put the actual
        # PyTorch implementation here
        self.regressor = nn.Linear(seq_length, 1)

    def forward(self, sequences):
        # Maintain the same input/output structure as the TensorFlow version:
        # (batch, seq_length) in, (batch, 1) out
        retention_times = self.regressor(sequences.float())
        return retention_times
```
|
||
3. Add corresponding tests: | ||
```python
# tests/test_pytorch_models.py
import pytest
import torch

from dlomix.models.pytorch import PrositRTPyTorch

# NOTE: `PrositRT` below is a placeholder name for the existing TF model;
# import the actual class from dlomix.models


def test_model_compatibility():
    tf_model = PrositRT()  # Existing TF implementation
    pt_model = PrositRTPyTorch()

    # Test with the same input
    sequence_input = "PEPTIDE"
    # Placeholder encoding; use the same encoding scheme as the TF model
    encoded_sequence = [[0] * 30]
    tf_output = tf_model.predict(sequence_input)
    pt_output = pt_model(torch.tensor(encoded_sequence))

    assert tf_output.shape == pt_output.detach().numpy().shape


def test_model_forward_pass():
    model = PrositRTPyTorch()
    expected_shape = (128, 1)
    input_size = 30

    x = torch.randn(128, input_size)  # Match existing input dimensions
    output = model(x)
    assert output.shape == expected_shape
```
|
||
4. Add a usage example of the new PyTorch implementation, preferably in a notebook under `./notebooks` | ||
|
||
|
||
### Development Workflow | ||
|
||
#### (Optional, but recommended) Pre-commit hooks | ||
We use some simple pre-commit hooks to ensure consistency in file and code formatting. To use pre-commit hooks:
- Install pre-commit with `pip install pre-commit`.
- Add the hooks by running `pre-commit install` in the root directory of the repo.
- If you like, you can run the checks manually after staging but before committing: `pre-commit run` runs the hooks against your staged changes.
|
||
1. Create a new branch: | ||
```bash | ||
git checkout -b feature/FEATURE_NAME | ||
``` | ||
|
||
2. Add your implementation | ||
|
||
3. Write tests under `./tests` to ensure your code runs as expected. | ||
|
||
4. Run the test suite using make: | ||
```bash | ||
make test-local | ||
``` | ||
|
||
In Google Colab, you can run:
```bash | ||
!python -m pytest tests/ | ||
``` | ||
|
||
5. Format your code using the project's style guidelines: | ||
```bash | ||
make format | ||
``` | ||
|
||
6. Create a pull request with:
- A clear description of the changes
- Any new dependencies added
- A mention of the usage example under `./notebooks`
|
||
|
||
### General Considerations | ||
|
||
1. Sequence Data:
- Assume the same sequence encoding schemes | ||
|
||
2. Model Architecture: | ||
- Closely mimic the existing Keras implementations or the original implementations of papers | ||
- Maintain similar model inputs and outputs (datatype, shape, etc.)
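
As a concrete illustration of keeping encodings aligned, the sketch below integer-encodes a peptide and pads it to a fixed length. The alphabet, the padding index 0, and the length of 30 are assumptions for illustration only (30 matches the `input_size` used in the test example above); the actual scheme should be taken from the existing DLOmix implementation.

```python
# Hypothetical alphabet: the 20 standard amino acids, with 0 reserved for padding.
ALPHABET = {aa: idx for idx, aa in enumerate("ACDEFGHIKLMNPQRSTVWY", start=1)}
PADDING_INDEX = 0


def encode_sequence(sequence, max_length=30):
    """Integer-encode a peptide and right-pad it with PADDING_INDEX to max_length."""
    if len(sequence) > max_length:
        raise ValueError(f"sequence longer than {max_length} residues")
    encoded = [ALPHABET[aa] for aa in sequence]  # raises KeyError on unknown residues
    return encoded + [PADDING_INDEX] * (max_length - len(encoded))


print(encode_sequence("PEPTIDE")[:8])  # [13, 4, 13, 17, 8, 3, 4, 0]
```

Feeding both the Keras and the PyTorch model from one such function makes it easier to compare their outputs on identical inputs.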
|
||
### Resources | ||
|
||
#### PyTorch | ||
- PyTorch Installation: https://pytorch.org/get-started/locally/
- PyTorch Documentation (always make sure the version selected in the top left corner matches the one you have installed): https://pytorch.org/docs/stable/index.html
|
||
#### Keras and TensorFlow
- TensorFlow API Documentation 2.15 (Version 2.16 introduced some breaking changes with respect to Keras) https://www.tensorflow.org/versions/r2.15/api_docs/python/tf | ||
- TensorFlow Keras Guide https://www.tensorflow.org/guide/keras | ||
|
||
#### HuggingFace Datasets
|
||
- PROSPECT PTMs is available on HuggingFace for Retention time, Fragment ion intensity, and Charge state prediction https://huggingface.co/collections/Wilhelmlab/prospect-ptms-665db48431a7e844634660ba | ||
|
||
|
||
#### Python Environments
- If you like to use conda, try out miniforge https://github.com/conda-forge/miniforge |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
FROM python:3.9-slim | ||
|
||
ENV PYTHONUNBUFFERED=1 | ||
ENV PYTHONDONTWRITEBYTECODE=1 | ||
|
||
# Install extra development dependencies | ||
COPY dev-requirements.txt /tmp/dev-requirements.txt | ||
RUN pip install --upgrade --no-cache-dir pip && \ | ||
pip install --no-cache-dir -r /tmp/dev-requirements.txt | ||
|
||
# Install system dependencies | ||
RUN set -ex && \ | ||
apt-get update && \ | ||
apt-get install -y --no-install-recommends \ | ||
build-essential \ | ||
bash-completion \ | ||
git \ | ||
openssh-client \ | ||
ca-certificates \ | ||
rsync \ | ||
vim \ | ||
nano \ | ||
wget \ | ||
curl \ | ||
&& update-ca-certificates \ | ||
&& rm -rf /var/lib/apt/lists/* | ||
|
||
# Set the working directory | ||
WORKDIR /workspaces/dlomix | ||
|
||
USER root |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Comment out the following line to install PyTorch with CUDA support; if it is left uncommented, the CPU-only version is installed
--index-url https://download.pytorch.org/whl/cpu | ||
|
||
torch | ||
#wandb >= 0.15 # enable this line to install wandb |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
{ | ||
"name": "DLOmix Dev Container", | ||
"build": { | ||
"dockerfile": "Dockerfile" | ||
}, | ||
"runArgs": [ | ||
"--platform=linux/amd64" | ||
], | ||
"remoteUser": "root", | ||
"containerUser": "root", | ||
"workspaceFolder": "/workspaces/dlomix", | ||
"workspaceMount": "source=${localWorkspaceFolder},target=/workspaces/dlomix,type=bind,consistency=cached", | ||
"customizations": { | ||
"vscode": { | ||
"extensions": [ | ||
"ms-azuretools.vscode-docker", | ||
"eamodio.gitlens", | ||
"ms-python.python", | ||
"ms-python.black-formatter", | ||
"ms-python.vscode-pylance", | ||
"tamasfe.even-better-toml", | ||
"ms-toolsai.jupyter" | ||
], | ||
"settings": { | ||
"git.path": "/usr/bin/git", | ||
"[python]": { | ||
"python.pythonPath": "/usr/local/bin/python", | ||
"editor.defaultFormatter": "ms-python.black-formatter" | ||
} | ||
}, | ||
"postCreateCommand": "pip install -e .[dev]" | ||
} | ||
} | ||
} |