Does this project have this function? #3162
Comments
@frankniujc it is helpful
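The probability of a sentence factorizes by the chain rule, treating s0 as the beginning-of-sequence token (so P(s0) = 1):

    P(s0 s1 s2 ... sn) = P(s1 | s0) * P(s2 | s0 s1) * P(s3 | s0 s1 s2) * ... * P(sn | s0 s1 ... sn-1)

So you can do something like this. It assumes `model` and `tokenizer` are an already-loaded GPT-2 style language model and its tokenizer, with the model on the GPU. Note that it returns the log-probability of the sentence; exponentiate the result to get the probability itself.

    import torch
    import torch.nn.functional as F

    def sentence_probability(sent):
        # Prepend '<|endoftext|>', used here as the start-of-sequence token s0.
        bos = tokenizer.encode('<|endoftext|>')
        tokens = bos + tokenizer.encode(sent)
        input_ids = torch.tensor(tokens).unsqueeze(0).to('cuda')
        sent_probs = []
        with torch.no_grad():
            for i, next_word in enumerate(tokens[1:]):
                # Logits for the token that follows the first i+1 tokens.
                next_word_logits = model(input_ids[:, :i + 1])[0][0, -1]
                # Log-probability the model assigns to the actual next token.
                next_word_prob = F.log_softmax(next_word_logits, dim=0)[next_word].item()
                sent_probs.append(next_word_prob)
        # Sum of log-probabilities = log of the product in the formula above.
        return sum(sent_probs)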
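To make the chain-rule sum concrete without loading a model, here is a minimal pure-Python sketch on made-up numbers. The helper names `log_softmax` and `sentence_log_prob`, the 3-token vocabulary, and the logits are all hypothetical and only illustrate the arithmetic; with a real causal language model the per-position logits would come from the model instead of being hard-coded.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a plain list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def sentence_log_prob(step_logits, tokens):
    """Sum log P(tokens[i+1] | tokens[:i+1]) over the sentence.

    step_logits[i] holds the next-token logits obtained after reading
    tokens[:i+1], so it is used to score tokens[i+1].
    """
    total = 0.0
    for logits, next_token in zip(step_logits, tokens[1:]):
        total += log_softmax(logits)[next_token]
    return total

# Toy data (assumed, not from any real model): token 0 plays the role of BOS.
tokens = [0, 2, 1]
step_logits = [
    [0.0, 0.0, 0.0],            # after [0]: uniform, so P(2 | 0) = 1/3
    [0.0, math.log(3.0), 0.0],  # after [0, 2]: P(1 | 0 2) = 3/5
]

log_p = sentence_log_prob(step_logits, tokens)
# The product of the conditionals is 1/3 * 3/5 = 1/5.
print(log_p, math.exp(log_p))
```

Summing log-probabilities rather than multiplying raw probabilities avoids numerical underflow on long sentences. Because the sum grows more negative with every token, scores of sentences with different lengths are not directly comparable unless they are normalized, for example by dividing by the number of scored tokens.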
@lovejasmine Have a look at It is a tiny wrapper around
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
🚀 Feature request
Can we use this project to calculate the probability that an input text is a real/reasonable sentence, based on the corpus we trained on?