sagemaker: Support asynchronous endpoints #23149

Open
1 of 2 tasks
petermeansrock opened this issue Nov 29, 2022 · 1 comment
Labels
@aws-cdk/aws-sagemaker Related to AWS SageMaker effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p3

Comments

@petermeansrock
Contributor

petermeansrock commented Nov 29, 2022

Describe the feature

As described in the SageMaker Endpoint L2 construct RFC:

Asynchronous Inference: By default, a deployed endpoint is synchronous: a customer issues an InvokeEndpoint operation to SageMaker with an attached input payload and the resulting response contains the output payload from the endpoint. To instead support asynchronous invocation, the AsyncInferenceClientConfig CloudFormation attribute was added to the endpoint config resource. To interact with an asynchronous endpoint, a customer issues an InvokeEndpointAsync operation to SageMaker with an attached input location in S3; SageMaker asynchronously reads the input from S3, invokes the endpoint, and writes the output to an S3 location specified within the AsyncInferenceClientConfig attribute.
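For illustration, here is a rough sketch of the endpoint config fragment that would carry the async settings. It assumes the CloudFormation property on `AWS::SageMaker::EndpointConfig` is `AsyncInferenceConfig`, with the output location under `OutputConfig.S3OutputPath` and the concurrency limit under `ClientConfig.MaxConcurrentInvocationsPerInstance`. The helper `buildAsyncInferenceConfig` and its parameters are hypothetical names, not part of any existing CDK API.

```typescript
// Assumed shape of the AsyncInferenceConfig property (only the fields used here).
interface AsyncInferenceConfigFragment {
  OutputConfig: { S3OutputPath: string };
  ClientConfig?: { MaxConcurrentInvocationsPerInstance: number };
}

// Hypothetical helper: builds the fragment from a bucket name and key prefix.
function buildAsyncInferenceConfig(
  outputBucket: string,
  outputPrefix: string,
  maxConcurrentInvocations?: number,
): AsyncInferenceConfigFragment {
  // Trim leading/trailing slashes so the URI has exactly one separator.
  const prefix = outputPrefix.replace(/^\/+|\/+$/g, "");
  const fragment: AsyncInferenceConfigFragment = {
    OutputConfig: { S3OutputPath: `s3://${outputBucket}/${prefix}` },
  };
  // ClientConfig is only emitted when a concurrency limit was requested.
  if (maxConcurrentInvocations !== undefined) {
    fragment.ClientConfig = {
      MaxConcurrentInvocationsPerInstance: maxConcurrentInvocations,
    };
  }
  return fragment;
}
```

An L2 construct would presumably derive the bucket and prefix from an `IBucket` reference rather than raw strings; plain strings are used here only to keep the sketch dependency-free.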

Please 👍 this issue to help with the prioritization of this feature.

Use Case

"This option is ideal for requests with large payload sizes (up to 1GB), long processing times (up to 15 minutes), and near real-time latency requirements. Asynchronous Inference enables you to save on costs by autoscaling the instance count to zero when there are no requests to process, so you only pay when your endpoint is processing requests." (link)

Proposed Solution

As described in the SageMaker Endpoint L2 construct RFC:

As discussed with the RFC bar raiser here, there are a few ways to tackle the addition of this functionality. One option is to add attribute(s) to the L2 endpoint config construct to support asynchronous inference along with synthesis-time error handling to catch configuration conflicts (e.g., asynchronous endpoints are only capable of supporting a single instance-based production variant today). Alternatively, an AsyncEndpointConfig subclass of EndpointConfig could be introduced to provide a better compile-time contract to customers (while still implementing the generic functionality within EndpointConfig). Either way, the proposed contracts would only undergo backward-compatible changes.
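To make the two options concrete, here is a minimal, dependency-free sketch. Every name in it (`EndpointConfigProps`, `AsyncInferenceOptions`, `validateEndpointConfig`, and the simplified `EndpointConfig` and `AsyncEndpointConfig` classes) is hypothetical and does not mirror the actual aws-cdk-lib contracts. It only shows where each check would live.

```typescript
// Hypothetical, simplified props; not the real aws-cdk-lib types.
interface ProductionVariant {
  readonly variantName: string;
  readonly instanceType?: string; // left undefined for a non-instance-based variant
}

interface AsyncInferenceOptions {
  readonly outputS3Uri: string;
}

interface EndpointConfigProps {
  readonly variants: ProductionVariant[];
  readonly asyncInference?: AsyncInferenceOptions;
}

// Option 1: one generic construct plus a synthesis-time style check.
function validateEndpointConfig(props: EndpointConfigProps): string[] {
  const errors: string[] = [];
  if (props.asyncInference === undefined) {
    return errors; // synchronous configs are not restricted by these rules
  }
  if (props.variants.length !== 1) {
    errors.push("async endpoints support exactly one production variant");
  }
  for (const variant of props.variants) {
    if (variant.instanceType === undefined) {
      errors.push(`variant '${variant.variantName}' must be instance-based`);
    }
  }
  return errors;
}

class EndpointConfig {
  constructor(public readonly props: EndpointConfigProps) {
    const errors = validateEndpointConfig(props);
    if (errors.length > 0) {
      throw new Error(errors.join("; "));
    }
  }
}

// Option 2: a subclass whose constructor signature rules out the conflicts,
// while the generic logic stays in EndpointConfig.
interface InstanceVariant {
  readonly variantName: string;
  readonly instanceType: string; // required, so the compiler enforces it
}

class AsyncEndpointConfig extends EndpointConfig {
  constructor(variant: InstanceVariant, asyncInference: AsyncInferenceOptions) {
    super({ variants: [variant], asyncInference });
  }
}
```

With option 1 a misconfiguration surfaces as an error when the construct is created or synthesized. With option 2 passing two variants or omitting the instance type fails to compile, at the cost of a second public class.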

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.54.0-alpha.0

Environment details (OS name and version, etc.)

macOS Ventura

@petermeansrock petermeansrock added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Nov 29, 2022
@github-actions github-actions bot added the @aws-cdk/aws-sagemaker Related to AWS SageMaker label Nov 29, 2022
@peterwoodworth peterwoodworth added p2 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Nov 29, 2022
@perrozzi

@madeline-k madeline-k removed their assignment Oct 30, 2023
@pahud pahud added p3 and removed p2 labels Jun 11, 2024
Projects
None yet
Development

No branches or pull requests

5 participants