
Support --remove_input_padding for BERT models? #1755

Closed · Altair-Alpha opened this issue Jun 8, 2024 · 4 comments
Comments

@Altair-Alpha (Contributor)
Hi, we're trying out building and running inference for BERT models (specifically, BertForSequenceClassification). Our request batches contain samples that can vary greatly in seq_len, and according to the documentation, the --remove_input_padding option, which packs samples into a 1D tensor without padding, should be beneficial for performance.

However, we couldn't find this parameter in examples/bert/build.py, and the engine built with that script has "remove_input_padding": False in config.json. We also didn't see any implementation of this in tensorrt_llm/models/bert/model.py, whereas the enc-dec model does have one. Is there a plan to support this feature for BERT models? Or are we missing something?
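
For illustration only (not part of the original question), here is a minimal sketch of the layout difference the option refers to, using plain PyTorch and made-up token IDs and shapes: with padding removed, all real tokens are concatenated into one 1-D tensor and per-sample lengths are passed separately.

```python
import torch

# Padded layout (remove_input_padding=False): [batch_size, max_seq_len],
# shorter samples are filled with pad tokens (0 here).
padded_ids = torch.tensor([
    [101, 2023, 102,    0,    0],  # sample 0, length 3, padded to 5
    [101, 2178, 2742, 2003, 102],  # sample 1, length 5
])

# Packed layout (remove_input_padding=True): one 1-D tensor with only the
# real tokens, plus per-sample lengths to recover the sample boundaries.
packed_ids = torch.tensor([101, 2023, 102, 101, 2178, 2742, 2003, 102])
input_lengths = torch.tensor([3, 5])
```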

@nv-guomingz (Collaborator)

Hi @QiJune, would you please take a look at this question?

nv-guomingz added the question and triaged labels on Jun 11, 2024
@Altair-Alpha (Contributor, Author)

We made our implementation by referring to the enc_dec models. For anyone interested, the things you basically need to modify are: (1) the inputs, following the enc_dec implementation; (2) set remove_input_padding to true in plugin_config; (3) the forward pass of the final pooler layer in BertForSequenceClassification, so that it selects the first token of each sample according to input_lengths. Closing this.
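
As an editorial aside, the idea behind step (3) can be sketched in plain PyTorch (this is not the actual TensorRT-LLM network code from the PR; hidden_states, input_lengths, and the function name are hypothetical stand-ins): with padding removed, the hidden states of all samples are packed into one [total_tokens, hidden_size] tensor, so the pooler has to gather each sample's first token at offsets derived from the cumulative input_lengths.

```python
import torch

def pooled_first_tokens(hidden_states: torch.Tensor,
                        input_lengths: torch.Tensor) -> torch.Tensor:
    """Select the first token of each packed sample.

    hidden_states: [total_tokens, hidden_size] -- all samples concatenated
                   without padding (remove_input_padding layout).
    input_lengths: [batch_size] -- per-sample sequence lengths.
    Returns:       [batch_size, hidden_size] -- the first-token ([CLS]-position)
                   hidden state of every sample, ready for the classifier head.
    """
    # Start offset of each sample: exclusive prefix sum of the lengths,
    # e.g. lengths [3, 5] -> offsets [0, 3].
    offsets = torch.cumsum(input_lengths, dim=0) - input_lengths
    return hidden_states.index_select(0, offsets)

# Hypothetical usage: two samples of length 3 and 5 packed into 8 tokens.
hidden = torch.randn(8, 768)
lengths = torch.tensor([3, 5])
cls_states = pooled_first_tokens(hidden, lengths)  # shape [2, 768]
```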

@QiJune (Collaborator) commented Jun 24, 2024

@Altair-Alpha Do you mind contributing your changes back to GitHub? Thanks

@Altair-Alpha (Contributor, Author)

@QiJune Hi, I made a PR here: #1834. Note that this is only implemented and tested for BertForSequenceClassification models, and maybe the official team can work on this further :)
