Commit

Merge pull request #360 from MannLabs/development
Release 1.8.2
GeorgWa authored Oct 28, 2024
2 parents 2a05b38 + 4f5afa3 commit 06439d2
Showing 37 changed files with 520 additions and 37 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -32,6 +32,7 @@

## Features
- Empirical library and fully predicted library search
- End-to-end transfer learning for custom RT, mobility, and MS2 models
- Label free quantification
- DIA multiplexing

2 changes: 1 addition & 1 deletion alphadia/__init__.py
@@ -1,3 +1,3 @@
#!python

-__version__ = "1.8.1"
+__version__ = "1.8.2"
25 changes: 22 additions & 3 deletions alphadia/data/bruker.py
@@ -567,7 +567,7 @@ def assemble_push(
n_precursor_indices = len(unique_precursor_index)
n_tof_slices = len(tof_limits)

-# scan valuesa
+# scan values
mobility_start = int(scan_limits[0, 0])
mobility_stop = int(scan_limits[0, 1])
mobility_len = mobility_stop - mobility_start
@@ -586,6 +586,16 @@ def assemble_push(
dtype=np.float32,
)

# intensities below HIGH_EPSILON will be set to zero
HIGH_EPSILON = 1e-26

# LOW_EPSILON is used to avoid division by zero:
# it is added to both the numerator and the denominator.
# intensity values approaching LOW_EPSILON would pull the updated dim1 values towards 1,
# therefore LOW_EPSILON should be orders of magnitude smaller than HIGH_EPSILON
# TODO: refactor the calculation of dim1 for performance and numerical stability
LOW_EPSILON = 1e-36

if absolute_masses:
pass
else:
@@ -636,12 +646,18 @@ def assemble_push(
]

new_intensity = self.intensity_values[idx]
new_intensity = new_intensity * (
new_intensity > HIGH_EPSILON
)

if absolute_masses:
new_dim1 = (
accumulated_dim1 * accumulated_intensity
+ new_intensity * measured_mz_value
-) / (accumulated_intensity + new_intensity)
++ LOW_EPSILON
+) / (
+accumulated_intensity + new_intensity + LOW_EPSILON
+)

else:
new_error = (
@@ -652,7 +668,10 @@ def assemble_push(
new_dim1 = (
accumulated_dim1 * accumulated_intensity
+ new_intensity * new_error
-) / (accumulated_intensity + new_intensity)
++ LOW_EPSILON
+) / (
+accumulated_intensity + new_intensity + LOW_EPSILON
+)

dense_output[
0,
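
A minimal sketch of the arithmetic changed in `assemble_push`, written for plain Python floats rather than the NumPy arrays used in `bruker.py`. The helper name `update_weighted_mean` and its return of the summed intensity are assumptions made for illustration; only the two constants and the form of the numerator and denominator are taken from the diff.

```python
# Constants copied from the diff.
HIGH_EPSILON = 1e-26  # intensities not above this value are set to zero
LOW_EPSILON = 1e-36   # added to numerator and denominator


def update_weighted_mean(acc_value, acc_intensity, new_value, new_intensity):
    """Hypothetical helper: one step of the intensity-weighted running mean."""
    # Zero out negligible intensities (float * bool gives 0.0 or the value).
    new_intensity = new_intensity * (new_intensity > HIGH_EPSILON)
    numerator = acc_value * acc_intensity + new_intensity * new_value + LOW_EPSILON
    denominator = acc_intensity + new_intensity + LOW_EPSILON
    # Returning the summed intensity is an assumption for this sketch.
    return numerator / denominator, acc_intensity + new_intensity


# Two real peaks: the result is the intensity-weighted mean of 500.0 and 502.0.
mean, total = update_weighted_mean(500.0, 100.0, 502.0, 300.0)

# Empty bin: without LOW_EPSILON this would divide zero by zero.
empty_mean, empty_total = update_weighted_mean(0.0, 0.0, 0.0, 0.0)

# A peak below HIGH_EPSILON is ignored and leaves the mean unchanged.
kept_mean, kept_total = update_weighted_mean(500.0, 100.0, 900.0, 1e-30)
```

For realistic intensities the added `LOW_EPSILON` is far below floating-point resolution and does not change the weighted mean. In the empty case the ratio evaluates to 1, which matches the code comment and is why `LOW_EPSILON` is kept orders of magnitude below `HIGH_EPSILON`.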
6 changes: 5 additions & 1 deletion docs/_static/css/custom.css
@@ -4,7 +4,7 @@

.content {
font-size: 14px;
-width: 60em;
+width: 70em;
}

.autosummary.longtable{
@@ -14,3 +14,7 @@
.sidebar-brand{
width: 50%;
}

.sidebar-drawer{
width: calc(50% - 34em);
}
Binary file added docs/_static/images/libfree-gui/first_input.png
Binary file added docs/_static/images/libfree-gui/first_result.png
Binary file added docs/_static/images/libfree-gui/first_start.png
Binary file added docs/_static/images/libfree-gui/second_input.png
34 changes: 34 additions & 0 deletions docs/guides.md
@@ -1,2 +1,36 @@
# Guides

```{toctree}
:maxdepth: 1
:hidden:
Library-free DIA <guides/libfree-gui>
DIA Transfer Learning <guides/transfer-dimethyl>
Evaluate Transfer Learning <guides/evaluate_model>
```


Below you will find a collection of guides on how to use and configure alphaDIA for specific searches.

```{card} Library free DIA search using the GUI
:link: guides/libfree-gui.html
Perform a DIA search of label-free bulk data using a fully predicted spectral library from a FASTA digest.
{bdg-success}`GUI`
```

```{card} DIA Transfer Learning for Dimethyl labeled samples
:link: guides/transfer-dimethyl.html
Use the AlphaDIA GUI to train a custom peptdeep model for dimethyl labeled DIA data.
The custom model is then used to search non-multiplexed dimethyl labeled data with match between runs.
{bdg-success}`GUI` {bdg-primary}`Transfer Learning`
```

```{card} Evaluate Transfer Learning Metrics
:link: guides/evaluate_model.html
Evaluate the transfer learning by visualizing the metrics for the RT, Charge, Mobility and MS2 prediction tasks.
{bdg-warning}`Jupyter Notebook` {bdg-primary}`Transfer Learning`
```
210 changes: 210 additions & 0 deletions docs/guides/evaluate_model.ipynb

Large diffs are not rendered by default.

59 changes: 59 additions & 0 deletions docs/guides/libfree-gui.md
@@ -0,0 +1,59 @@
# Library-free DIA search using the GUI
**This tutorial was created using alphaDIA 1.8.1 - please be aware that there might be changes in your version**

## 1. Prerequisites
Make sure that you have a machine with at least 64 gigabytes of memory.
Please download the test data for this tutorial [here](https://datashare.biochem.mpg.de/s/Nsp8CaHMBf7FHq1). We will be using replicates of label-free bulk DIA data of HeLa digests acquired on the Orbitrap Astral.
Also make sure you have a valid alphaDIA installation including the GUI. The easiest option is the one-click installer, and a summary of all installation options can be found [here](<project:../installation.md>).
Also ensure the right execution engine has been selected and your version is up to date.
<img src="../_static/images/libfree-gui/initial_engine.png" width="100%" height="auto">

## 2. Project Structure
We will be performing two DIA searches: a first search for library generation and a second search for joint quantification. To accommodate this, we prepare the project directory to have a `first_pass` and a `second_pass` folder. You can change the project path in the `Output Files` tab.
<img src="../_static/images/libfree-gui/folder_structure.png" width="100%" height="auto">

## 3. First search
To set up the first search:

1. Select all raw files by clicking the `.raw` button and add them to the file list.
2. Add the FASTA file which will be used for library prediction by clicking the `SELECT FILE` button.

<img src="../_static/images/libfree-gui/first_input.png" width="100%" height="auto">

For this search, most parameters can be left at their default values. To speed up processing, set `thread_count` to the number of logical cores you have available in your system. Also enable library prediction from FASTA and set the `precursor_mz` range to the range of the dataset `380`-`980` to predict only the relevant subset of precursors. By default, this search will have `Carbamidomethyl@C` as a fixed modification and up to two variable modifications of `Oxidation@M` and `Acetyl@Protein_N-term`.

For the search, we will use a known `target_ms1_tolerance` of 4 ppm and `target_ms2_tolerance` of 7 ppm. These values are optimal for Orbitrap Astral data and can be reused. For lower resolution instruments, 10 ppm or 15 ppm might be optimal. If the optimal mass tolerance is not known, it can be set to `0` to activate automatic optimization. The `target_rt_tolerance` will also be set to `0` for automatic optimization. Lastly, we increase `target_num_candidates`, the number of peak groups used for deep-learning based scoring, to `3`.

:::{tip}
Keeping track of optimized mass tolerance values for different instrument setups can save time in future analyses and ensure consistent results across projects.
:::
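
To get a feel for what these ppm tolerances mean in absolute terms, the following sketch converts a ppm value into an m/z window. It assumes the tolerance is applied symmetrically around the target m/z; the function name `ppm_window` and the example m/z of 500 are illustrative and not part of alphaDIA.

```python
def ppm_window(mz, tolerance_ppm):
    """Return the (low, high) m/z bounds for a symmetric ppm tolerance."""
    # 1 ppm corresponds to one millionth of the m/z value.
    delta = mz * tolerance_ppm * 1e-6
    return mz - delta, mz + delta


# MS1 tolerance of 4 ppm and MS2 tolerance of 7 ppm at an example m/z of 500.
ms1_low, ms1_high = ppm_window(500.0, 4.0)
ms2_low, ms2_high = ppm_window(500.0, 7.0)
```

At m/z 500, 4 ppm corresponds to about ±0.002 and 7 ppm to about ±0.0035, so the absolute window scales linearly with both the m/z and the chosen tolerance.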

<img src="../_static/images/libfree-gui/first_settings.png" width="100%" height="auto">

Start the first search by clicking the "Run Workflow" button. This will take between one and two hours depending on your system.

## 4. Second search
For the second search, we will use the library generated in the first search to quantify precursors across samples. Load all raw files as previously but remove the FASTA file. Instead, select the `speclib.mbr.hdf` as the spectral library.
<img src="../_static/images/libfree-gui/second_input.png" width="100%" height="auto">

For the second search, configure the `thread_count`, `target_ms1_tolerance`, and `target_ms2_tolerance` as before. Do not activate library prediction and instead set the `inference_strategy` to `library` to reuse the protein grouping from the first search. In the second search, it can be beneficial to increase the number of peak groups `target_num_candidates` to `5`. Values larger than `5` will most likely not have an effect, and we expect that future versions of alphaDIA will have an improved peak group selection making this step unnecessary.
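
The differences between the two searches can be summarized as plain Python dictionaries. This is only a record of the values named in this guide, not alphaDIA's configuration file format; the key `predict_library` is a hypothetical name for the library prediction switch, and settings the guide does not state for a given search are left out.

```python
import os

# os.cpu_count() reports the number of logical cores (may be None on some systems).
threads = os.cpu_count()

first_search = {
    "thread_count": threads,
    "predict_library": True,      # hypothetical key: library prediction from FASTA
    "precursor_mz": (380, 980),
    "target_ms1_tolerance": 4,    # ppm
    "target_ms2_tolerance": 7,    # ppm
    "target_rt_tolerance": 0,     # 0 activates automatic optimization
    "target_num_candidates": 3,
}

second_search = {
    "thread_count": threads,
    "predict_library": False,     # use speclib.mbr.hdf from the first search instead
    "target_ms1_tolerance": 4,
    "target_ms2_tolerance": 7,
    "inference_strategy": "library",
    "target_num_candidates": 5,
}

# Settings whose value differs, or which are only stated for one search.
changed = {
    key: (first_search.get(key), second_search.get(key))
    for key in sorted(set(first_search) | set(second_search))
    if first_search.get(key) != second_search.get(key)
}
```

Keeping such a record next to the project folders makes it easier to check that the mass tolerances were carried over unchanged between the `first_pass` and `second_pass` runs.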

<img src="../_static/images/libfree-gui/second_settings.png" width="100%" height="auto">

Finally, start the search as before. This search will take only around 2 minutes per file.

## 5. Results
In the end, both folders should contain a full search output.
Use the precursor-level file `precursors.tsv` or the protein matrix in `pg.matrix.tsv` for any downstream analysis.
<img src="../_static/images/libfree-gui/final_folders.png" width="100%" height="auto">

You can get a quick overview from the contents of the `stat.tsv` file. This two-step search strategy resulted in more than 115,000 precursors and 9,300 protein groups across the six files.

| run | channel | precursors | proteins | ms1_accuracy | fwhm_rt | fwhm_mobility | ms2_error | ms1_error | rt_error | mobility_error |
|----------------------------------------------------------------------|------------|---------------|-------------|-----------------|-------------|------------------|--------------|--------------|--------------|-------------------|
| 20231024_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_before_03 | 0 | 118286 | 9356 | 0.597696 | 2.772127 | 0.000000 | 7.000000 | 4.000000 | 17.771221 | 0.100000 |
| 20231024_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_before_02 | 0 | 120276 | 9362 | 0.594870 | 2.748140 | 0.000000 | 7.000000 | 4.000000 | 25.645317 | 0.100000 |
| 20231024_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_before_01 | 0 | 119902 | 9355 | 0.596190 | 2.735716 | 0.000000 | 7.000000 | 4.000000 | 17.838232 | 0.100000 |
| 20231023_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_after_03 | 0 | 118977 | 9352 | 0.589555 | 2.733382 | 0.000000 | 7.000000 | 4.000000 | 25.810515 | 0.100000 |
| 20231023_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_after_02 | 0 | 116552 | 9359 | 0.590638 | 2.760179 | 0.000000 | 7.000000 | 4.000000 | 27.136864 | 0.100000 |
| 20231023_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_after_01 | 0 | 120054 | 9355 | 0.583652 | 2.725050 | 0.000000 | 7.000000 | 4.000000 | 18.924332 | 0.100000 |
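
The summary numbers above can be checked with a few lines of Python. The sketch below assumes `stat.tsv` is tab-separated with the column names shown in the table, and it uses an in-memory copy of the first four columns so that it runs without the actual file.

```python
import csv
import io

header = ("run", "channel", "precursors", "proteins")
rows = [
    ("20231024_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_before_03", 0, 118286, 9356),
    ("20231024_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_before_02", 0, 120276, 9362),
    ("20231024_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_before_01", 0, 119902, 9355),
    ("20231023_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_after_03", 0, 118977, 9352),
    ("20231023_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_after_02", 0, 116552, 9359),
    ("20231023_OA3_TiHe_ADIAMA_HeLa_200ng_Evo01_21min_F-40_iO_after_01", 0, 120054, 9355),
]

# Build tab-separated text that stands in for the contents of stat.tsv.
stat_text = "\n".join("\t".join(str(v) for v in row) for row in (header, *rows))

# Parse it the same way a real file opened with open(path, newline="") would be parsed.
records = list(csv.DictReader(io.StringIO(stat_text), delimiter="\t"))

min_precursors = min(int(r["precursors"]) for r in records)
min_proteins = min(int(r["proteins"]) for r in records)
```

Taking the minimum across runs confirms that every one of the six files exceeds 115,000 precursors and 9,300 protein groups.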