
Why is there no padding or truncation based on max length? #9

Open

PrAsAnNaRePo opened this issue Oct 9, 2024 · 1 comment

@PrAsAnNaRePo

No description provided.

@zhangfaen
Owner

In the finetune.py, there is:
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct", min_pixels=256*28*28, max_pixels=512*28*28, padding_side="right")

This sets the padding strategy: shorter samples in a batch are padded (on the right) with padding tokens until they match the longest sample in that batch.

See https://github.com/huggingface/transformers/blob/main/src/transformers/tokenization_utils_fast.py#L82 for more details.
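
For illustration, here is a minimal sketch (not from this repository) of that behavior, calling the tokenizer directly; the example strings are placeholders:

from transformers import AutoTokenizer

# Same model and padding_side as in finetune.py.
tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct", padding_side="right"
)

batch = tokenizer(
    ["a short prompt", "a noticeably longer prompt with several more tokens"],
    padding=True,          # pad shorter samples to the longest in the batch
    return_tensors="pt",
)

print(batch["input_ids"].shape)    # both rows share the length of the longest sample
print(batch["attention_mask"][0])  # trailing zeros mark the added padding tokens

Truncation to a fixed maximum length only happens if you additionally pass truncation=True and max_length=... to the tokenizer/processor call; padding alone never shortens a sample.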
