Guidance on how to run HELM benchmarks against LLM models deployed on SageMaker Endpoints #1713
Comments
AWS SageMaker support is something we definitely want to have in HELM eventually. (a) and (b) are both reasonable approaches. I would suggest doing a fork and doing (b) if you have an urgent use case. For (a), you're welcome to open a PR to contribute a more general way of doing this. I imagine that it would work similarly to how the HuggingFace Hub integration works with the I also think it would be good to move these things to a config file instead of flags eventually, but we don't have plans for that yet either.
I decided to follow option (b) as it is a more scalable approach. I deployed an AI21 Jurassic-J2-Jumbo model on a SageMaker endpoint and fronted it with an API Gateway + Lambda. API_URL and API_KEY can be used to invoke the model, e.g.:
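A minimal sketch of such a call, assuming the API Gateway method requires an API key sent in the `x-api-key` header and the Lambda accepts a JSON body with a `prompt` field. The payload shape and the function names `build_request` and `invoke` are illustrative assumptions, not part of HELM or of this particular deployment.

```python
import json
import urllib.request


def build_request(api_url: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an API Gateway endpoint protected by an API key."""
    # Assumed payload shape; the real fields depend on what the Lambda expects.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # header API Gateway reads API keys from
        },
        method="POST",
    )


def invoke(api_url: str, api_key: str, prompt: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(api_url, api_key, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With both values exported as environment variables, a call would look like `invoke(os.environ["API_URL"], os.environ["API_KEY"], "Hello")`.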
At least in the first pass of the integration, we aren't going to make all models in SageMaker JumpStart available through public APIs (it would be too costly to host).
The Amazon API Gateway URL is different for every newly deployed API: both the alphanumeric string following https:// (e.g. https://mkyfzfz3si) and the region (e.g. us-west-2) vary. Each user would deploy a new API Gateway, so it would not be possible to infer the URL from a model name.
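To make the point concrete, here is a small sketch of how the default invoke URL of an API Gateway REST API is composed. The API ID and region reuse the examples above; the stage name `prod`, the resource `generate`, and the helper name `api_gateway_url` are made-up values for illustration.

```python
def api_gateway_url(api_id: str, region: str, stage: str, resource: str = "") -> str:
    """Compose the default invoke URL of an API Gateway REST API."""
    base = f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}"
    return f"{base}/{resource.lstrip('/')}" if resource else base


# The API ID is generated per deployment, so none of these parts follow from a model name.
print(api_gateway_url("mkyfzfz3si", "us-west-2", "prod", "generate"))
```

Since the API ID is assigned at deployment time, the URL has to be supplied by the user rather than derived.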
The conf file option sounds reasonable to me:
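One hypothetical shape such a config file could take, sketched in Python with an inline JSON document. The keys `api_url` and `api_key_env`, the model name, and the helper `resolve_endpoint` are all illustrative assumptions and not an existing HELM format.

```python
import json
import os
from typing import Mapping, Tuple

# Hypothetical config: keys and model name are illustrative, not a HELM format.
EXAMPLE_CONFIG = """
{
  "my-org/j2-jumbo-sagemaker": {
    "api_url": "https://<api-id>.execute-api.<region>.amazonaws.com/<stage>",
    "api_key_env": "SAGEMAKER_API_KEY"
  }
}
"""


def resolve_endpoint(
    config_text: str, model_name: str, env: Mapping[str, str] = os.environ
) -> Tuple[str, str]:
    """Return (api_url, api_key) for a model, reading the key from the environment."""
    config = json.loads(config_text)
    if model_name not in config:
        raise KeyError(f"No endpoint configured for model {model_name!r}")
    entry = config[model_name]
    # The key itself stays out of the file; only the variable name is stored.
    return entry["api_url"], env[entry["api_key_env"]]
```

Keeping the API key in an environment variable rather than in the file is a design choice of this sketch, so the config can be shared without exposing credentials.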
Can the authors provide prescriptive guidance on how to run HELM benchmarks against LLMs deployed on SageMaker? There seem to be two obvious approaches:
(a) Front the SageMaker Endpoint with an AWS API Gateway and provide an API_Key. What's not clear is where and how to provide API_URL. This approach is more scalable because it allows the SageMaker Endpoint to be invoked from outside the AWS account where the endpoint was deployed.
(b) Clone the git repo onto a machine with AWS account credentials and hack the code to invoke the SageMaker Endpoint directly instead of going the API_Key route.
Any suggestions/experience with either one of the approaches?
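For the direct-invocation approach, a minimal sketch of what the call could look like with the `invoke_endpoint` operation of a boto3 `sagemaker-runtime` client. The payload shape, the endpoint name, and the `FakeRuntimeClient` stand-in are assumptions for illustration; the client is passed in so the sketch runs without AWS credentials.

```python
import io
import json


def invoke_sagemaker_endpoint(client, endpoint_name: str, prompt: str) -> dict:
    """Call a SageMaker endpoint through a sagemaker-runtime client."""
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"prompt": prompt}),  # assumed payload shape
    )
    # response["Body"] is a stream; read it fully before decoding.
    return json.loads(response["Body"].read().decode("utf-8"))


class FakeRuntimeClient:
    """Offline stand-in that echoes the request, used only for this sketch."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        echoed = {"endpoint": EndpointName, "prompt": json.loads(Body)["prompt"]}
        return {"Body": io.BytesIO(json.dumps(echoed).encode("utf-8"))}


# With real credentials: client = boto3.client("sagemaker-runtime", region_name="us-west-2")
print(invoke_sagemaker_endpoint(FakeRuntimeClient(), "my-endpoint", "Hello"))
```

Swapping the fake for a real boto3 client requires credentials with permission to invoke the endpoint, which is why this route only works from inside the AWS account (or a role that can access it).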