Add TGI additional options #402

Merged (1 commit) on Sep 6, 2024

yongfengdu (Collaborator)

Add user-configurable shm_size support.
Add an interface for additional TGI CLI parameters.

Description

The summary of the proposed changes as well as the relevant motivation and context.

Issues

List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

List any newly introduced third-party dependencies.

Tests

Describe the tests that you ran to verify your changes.

Add user configurable shm_size support.
Add interface for additional TGI cli parameters.

Signed-off-by: Dolpher Du <[email protected]>
daisy-ycguo (Collaborator) left a comment

lgtm

daisy-ycguo merged commit bf10bdd into opea-project:main on Sep 6, 2024
12 checks passed
eero-t (Contributor) commented Sep 6, 2024

It seems this is needed only when the model requires more memory than fits on a given device, i.e. when it has to be sharded over multiple devices using something like DeepSpeed?

This means it would need to go in the relevant top-level DEVICE-values.yaml files, where the model and device allocations are specified. That is, the model, the number of devices (>1) allocated for it, the matching SHM size, and the TGI sharding options all need to go hand in hand.
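
As a rough illustration of the coupling described in this comment, a device-specific values file might keep these settings side by side. This is only a sketch: the key names (`LLM_MODEL_ID`, `shmSize`, `extraCmdArgs`), the model name, the `habana.ai/gaudi` resource, and all sizes are assumptions chosen for illustration and are not confirmed by this PR. `--sharded` and `--num-shard` are options of the TGI launcher.

```yaml
# Hypothetical DEVICE-values.yaml fragment. Key names and values are
# illustrative assumptions, not taken from this PR.
tgi:
  LLM_MODEL_ID: "example-org/large-model"  # placeholder: a model too large for one device
  resources:
    limits:
      habana.ai/gaudi: 8                   # assumed device resource name; allocation > 1
  shmSize: 16Gi                            # arbitrary value; shared memory sized for sharding
  extraCmdArgs:                            # additional TGI CLI parameters
    - "--sharded"
    - "true"
    - "--num-shard"
    - "8"                                  # should match the device allocation above
```

Keeping the shard count, device allocation, and shared-memory size in one file makes it harder for them to drift apart when the model changes.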
