sagemaker: Support asynchronous endpoints #23149
Labels
@aws-cdk/aws-sagemaker
Related to AWS SageMaker
effort/medium
Medium work item – several days of effort
feature-request
A feature should be added or improved.
p3
Describe the feature
As described in the SageMaker Endpoint L2 construct RFC:
Please 👍 this issue to help with the prioritization of this feature.
Use Case
"This option is ideal for requests with large payload sizes (up to 1GB), long processing times (up to 15 minutes), and near real-time latency requirements. Asynchronous Inference enables you to save on costs by autoscaling the instance count to zero when there are no requests to process, so you only pay when your endpoint is processing requests." (link)
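For illustration only, the use case above maps onto roughly the following CloudFormation-level settings. This is a minimal sketch, not a proposed L2 API: the helper names (`asyncEndpointConfigProperties`, `scaleToZeroTargetProperties`), the `async-results/` prefix, and the variant name `AllTraffic` are hypothetical. The property names are those of `AWS::SageMaker::EndpointConfig` and `AWS::ApplicationAutoScaling::ScalableTarget`, and only a subset of each resource's properties is shown.

```typescript
// Hypothetical input type for this sketch; not part of aws-cdk-lib.
interface AsyncEndpointOptions {
  modelName: string;
  instanceType: string;
  outputBucket: string; // assumed to be an existing S3 bucket name
  maxConcurrentInvocationsPerInstance?: number;
}

interface AsyncInferenceConfig {
  OutputConfig: { S3OutputPath: string };
  ClientConfig?: { MaxConcurrentInvocationsPerInstance: number };
}

interface EndpointConfigProperties {
  ProductionVariants: Array<{
    VariantName: string;
    ModelName: string;
    InstanceType: string;
    InitialInstanceCount: number;
    InitialVariantWeight: number;
  }>;
  AsyncInferenceConfig: AsyncInferenceConfig;
}

// Builds a subset of the Properties of an AWS::SageMaker::EndpointConfig
// resource. The AsyncInferenceConfig block is what makes the endpoint asynchronous.
function asyncEndpointConfigProperties(opts: AsyncEndpointOptions): EndpointConfigProperties {
  const asyncConfig: AsyncInferenceConfig = {
    // Inference results are written to S3 instead of being returned inline.
    OutputConfig: { S3OutputPath: `s3://${opts.outputBucket}/async-results/` },
  };
  if (opts.maxConcurrentInvocationsPerInstance !== undefined) {
    asyncConfig.ClientConfig = {
      MaxConcurrentInvocationsPerInstance: opts.maxConcurrentInvocationsPerInstance,
    };
  }
  return {
    ProductionVariants: [
      {
        VariantName: "AllTraffic", // hypothetical variant name
        ModelName: opts.modelName,
        InstanceType: opts.instanceType,
        InitialInstanceCount: 1,
        InitialVariantWeight: 1,
      },
    ],
    AsyncInferenceConfig: asyncConfig,
  };
}

// Builds a subset of the Properties of an
// AWS::ApplicationAutoScaling::ScalableTarget resource whose minimum
// capacity is zero, so the variant can scale in when there are no requests.
function scaleToZeroTargetProperties(endpointName: string, variantName: string, maxCapacity: number) {
  return {
    ServiceNamespace: "sagemaker",
    ScalableDimension: "sagemaker:variant:DesiredInstanceCount",
    ResourceId: `endpoint/${endpointName}/variant/${variantName}`,
    MinCapacity: 0,
    MaxCapacity: maxCapacity,
  };
}

// Example usage with placeholder names.
const exampleConfig = asyncEndpointConfigProperties({
  modelName: "my-model",
  instanceType: "ml.m5.xlarge",
  outputBucket: "my-output-bucket",
});
console.log(JSON.stringify(exampleConfig, null, 2));
console.log(JSON.stringify(scaleToZeroTargetProperties("my-endpoint", "AllTraffic", 2), null, 2));
```

An L2 construct would presumably hide most of this behind a few props, which is the point of the request.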
Proposed Solution
As described in the SageMaker Endpoint L2 construct RFC:
Other Information
No response
Acknowledgements
CDK version used
2.54.0-alpha.0
Environment details (OS name and version, etc.)
macOS Ventura