Add option to detokenize peptide into AA list #22

Merged
merged 7 commits into from
Nov 15, 2022
Changes from 1 commit
7 changes: 5 additions & 2 deletions depthcharge/components/transformers.py
@@ -184,12 +184,15 @@ def tokenize(self, sequence, partial=False):
tokens = torch.tensor(tokens, device=self.device)
return tokens

-def detokenize(self, tokens):
+def detokenize(self, tokens, pep_as_str=True):
"""Transform tokens back into a peptide sequence

Parameters
----------
tokens : torch.Tensor of shape (n_amino_acids)
+pep_as_str: bool, optional
+Return peptide sequence in str format by default.
+If "False", returns a list of amino acids.
"""
sequence = [self._idx2aa.get(i.item(), "") for i in tokens]
if "$" in sequence:
@@ -199,7 +202,7 @@ def detokenize(self, tokens):
if self.reverse:
sequence = list(reversed(sequence))

-return "".join(sequence)
+return "".join(sequence) if pep_as_str else sequence

@property
def vocab_size(self):
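To illustrate what the new `pep_as_str` flag changes, here is a minimal sketch that mirrors the patched `detokenize` logic outside of depthcharge. `ToyDetokenizer` and its vocabulary are hypothetical, plain integers stand in for the `torch.Tensor` input, and truncating at the `$` stop token is an assumption about the lines the diff does not show.

```python
class ToyDetokenizer:
    """Hypothetical stand-in for the class patched in this PR."""

    def __init__(self, residues, reverse=False):
        # Assumption: index 0 is reserved for padding, so residues start at 1.
        self._idx2aa = {i + 1: aa for i, aa in enumerate(residues)}
        self.reverse = reverse

    def detokenize(self, tokens, pep_as_str=True):
        # Unknown indices (such as padding) map to an empty string.
        sequence = [self._idx2aa.get(int(i), "") for i in tokens]
        if "$" in sequence:
            # Assumption: drop the stop token and everything after it.
            sequence = sequence[: sequence.index("$")]
        if self.reverse:
            sequence = list(reversed(sequence))
        # The behavior added by the PR: join into a str, or return the list.
        return "".join(sequence) if pep_as_str else sequence


decoder = ToyDetokenizer(["G", "A", "M+15.995", "$"])
tokens = [1, 2, 3, 4, 0]  # three residues, stop token, padding

as_str = decoder.detokenize(tokens)
as_list = decoder.detokenize(tokens, pep_as_str=False)
print(as_str)
print(as_list)
```

With a vocabulary like this one, the list form keeps a modified residue such as `M+15.995` as a single element, whereas the joined string would have to be re-parsed to recover residue boundaries. That may be one reason to prefer `pep_as_str=False` downstream.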