Support for fractionated data / normalization #14

Open
timosachsenberg opened this issue Nov 20, 2020 · 2 comments

@timosachsenberg

Hi,
We added a basic Triqler export to OpenMS but were not sure whether we should
normalize the data at all. We were also unsure how to export fractionated data.
Best,
Timo

@MatthewThe
Contributor

Hi Timo,

Cool that you added a Triqler export to OpenMS!

Normalization:

  • If you export directly to the Triqler input format, no normalization will be applied by Triqler. It might be wise to do some basic normalization, such as median centering of the log intensities across samples, to reduce the influence of batch effects.
  • We have a retention time based normalization scheme as part of the different Triqler converters (MQ, Quandenser, Dinosaur). However, since retention time is not included in the normal Triqler input, we cannot run this scheme as part of Triqler itself.
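The median centering of log intensities mentioned in the first bullet could be sketched as follows. This is only an illustration, not part of Triqler or OpenMS: the function name is hypothetical, the rows are assumed to be dicts with `run` and `intensity` keys, and the result is written back on a linear scale on the assumption that the downstream input expects raw intensities.

```python
import math
from collections import defaultdict
from statistics import median


def median_center_log_intensities(rows):
    """Return copies of rows whose intensities are median-centered per run.

    rows: list of dicts with at least the keys "run" and "intensity"
    (assumed names). Centering is done in log2 space.
    """
    # Collect log2 intensities per run, skipping missing or non-positive values.
    logs_by_run = defaultdict(list)
    for row in rows:
        if row["intensity"] > 0:
            logs_by_run[row["run"]].append(math.log2(row["intensity"]))

    if not logs_by_run:
        return [dict(row) for row in rows]

    run_medians = {run: median(vals) for run, vals in logs_by_run.items()}
    # Shift every run to the mean of the run medians so that the overall
    # intensity scale stays roughly the same.
    target = sum(run_medians.values()) / len(run_medians)

    normalized = []
    for row in rows:
        new_row = dict(row)
        if row["intensity"] > 0:
            shift = target - run_medians[row["run"]]
            new_row["intensity"] = 2 ** (math.log2(row["intensity"]) + shift)
        normalized.append(new_row)
    return normalized
```

After this step every run has the same median log intensity, which removes a constant per-run offset but nothing more subtle, such as retention time dependent effects.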

Fractionated data:

  • Support for fractionated data is very rudimentary: you simply give all fractions of the same sample the same name in the run column. Triqler will then only use the best-scoring PSM across all fractions. If you have any suggestions on how to do this in a better way, please let me know.
  • As part of the Triqler converters, the fractions can be specified in the --file_list_file input (see figure 2 in the Triqler manual). The converter then does the retention time based normalization per fraction and automatically groups the fractions by sample as mentioned above.
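The grouping described in the first bullet could look roughly like this on the export side. It is a minimal sketch under stated assumptions: the function names and the `file_to_sample` mapping are hypothetical, the row keys (`run`, `peptide`, `charge`, `searchScore`) are assumed, and a higher score is assumed to be better. The second function only illustrates the "best-scoring PSM across all fractions" behaviour described above and is not Triqler's actual code.

```python
def collapse_fractions(rows, file_to_sample):
    """Replace each row's run (a per-fraction file name) by its sample name."""
    return [dict(row, run=file_to_sample[row["run"]]) for row in rows]


def best_psm_per_sample(rows):
    """Keep the highest-scoring row per (run, peptide, charge).

    The grouping key and the score direction are assumptions made for
    this illustration.
    """
    best = {}
    for row in rows:
        key = (row["run"], row["peptide"], row["charge"])
        if key not in best or row["searchScore"] > best[key]["searchScore"]:
            best[key] = row
    return list(best.values())
```

With this approach the exporter only has to apply `collapse_fractions`; the information about which fraction a PSM came from is lost at that point.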

An alternative would be to provide an intermediate output format by OpenMS (e.g. similar to the MaxQuant evidence.txt output) for which we can write a Triqler converter. This would give "native" support for retention time based normalization and fractionated samples.

Hope that answers your questions!

@timosachsenberg
Author

Hi Matthew! Thanks for the quick answer.
OK, I think from our side it would probably be easy to export an augmented Triqler file with additional columns for rt, fraction and sample, and have a converter condense that to the actual Triqler input.
Regarding your question about how to deal with the same PSMs across fractions: I hope I understood correctly, so here are my thoughts. As Triqler has no fraction column in its input, I think you discard potentially valuable information by selecting the best-scoring PSM across all fractions. It might be better to determine the fraction number F that has the best score and discard the same PSMs in the other fractions (!= F) across all samples (or perhaps keep the neighboring fractions?). But I guess this is something one would need to test.
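The fraction selection idea above could be prototyped along these lines. This is an untested heuristic sketched under assumptions: the function name and the `neighbor_window` parameter are hypothetical, the rows are assumed to come from the augmented file with an integer `fraction` column, a PSM is identified by (peptide, charge), and a higher `searchScore` is assumed to be better.

```python
def keep_best_fraction(rows, neighbor_window=0):
    """Keep only rows from the best-scoring fraction F of each peptide.

    For every (peptide, charge), F is the fraction of the highest-scoring
    row across all samples. Rows whose fraction differs from F by more than
    neighbor_window are discarded in every sample.
    """
    best = {}  # (peptide, charge) -> (best score, fraction of that score)
    for row in rows:
        key = (row["peptide"], row["charge"])
        if key not in best or row["searchScore"] > best[key][0]:
            best[key] = (row["searchScore"], row["fraction"])

    kept = []
    for row in rows:
        best_fraction = best[(row["peptide"], row["charge"])][1]
        if abs(row["fraction"] - best_fraction) <= neighbor_window:
            kept.append(row)
    return kept
```

Setting `neighbor_window=1` corresponds to the variant that also keeps the neighboring fractions. Whether either variant improves on the best-PSM-per-sample approach would have to be checked on real data.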
