Docs fix: Multinomial sampling decoding needs "num_beams=1", since by default it is usually not 1. #22473
Fix error in docs: multinomial sampling decoding strategy
As indicated in the library source code:
https://github.com/huggingface/transformers/blob/228792a9dc0c36f1e82ab441e1b1991d116ee0a0/src/transformers/generation/utils.py#LL1364-L1367

Multinomial sampling needs `num_beams=1`. However, this is not indicated in the docs, which can lead users to execute beam-search multinomial sampling instead of the intended multinomial sampling. This deviation from the expected behaviour happens quite often, since many models set the parameter `num_beams` to a value higher than 1 in their `generation_config.json`. This is the case, for example, for the majority of top translation models on the Hub.

I have also included "ancestral sampling" as another name for multinomial sampling, since it is the most common name in the decoding-algorithms literature.
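For illustration, here is a minimal sketch of forcing plain multinomial sampling with `generate()`; the checkpoint name is only an assumed example of a model whose `generation_config.json` may ship with `num_beams > 1`:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed example checkpoint; many Hub translation models ship a
# generation_config.json with num_beams > 1.
model_name = "Helsinki-NLP/opus-mt-en-fr"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")

# do_sample=True alone is not enough: if the loaded generation config has
# num_beams > 1, generate() runs beam-search multinomial sampling instead.
# Passing num_beams=1 explicitly selects multinomial (ancestral) sampling.
outputs = model.generate(**inputs, do_sample=True, num_beams=1, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```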
Before submitting
Who can review?
Original authors of this piece of documentation: @gante, @sgugger, @stevhliu and @MKhalusova