-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC 431: SageMaker Model Hosting L2 Constructs #433
Conversation
Co-authored-by: Matt McClean <[email protected]> Co-authored-by: Long Yao <[email protected]> Co-authored-by: Drew Jetter <[email protected]> Co-authored-by: Murali Ganesh <[email protected]> Co-authored-by: Abilash Rangoju <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First quick pass, will take another look later.
text/0431-sagemaker-l2-endpoint.md
Outdated
An inference pipeline is an Amazon SageMaker model that is composed of a linear sequence of two to | ||
five containers that process requests for inferences on data. You use an inference pipeline to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why two to five containers? The docs indicate a maximum of 15 containers: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-sagemaker-model.html#cfn-sagemaker-model-containers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. As described in the drawback section, the initial draft of this RFC was based on a Q3 2019 snapshot of the SageMaker feature set; SageMaker has since increased the number of supported containers. As part of authoring the second revision of this RFC, I'll audit the SageMaker CloudFormation resources to bring the RFC in line with the current supported attributes/features.
text/0431-sagemaker-l2-endpoint.md
Outdated
container: { | ||
image: image1, modelData: modelData1 | ||
}, | ||
extraContainers: [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we make a single containers
property? With one containers
property, we can decide if we want it to be an inference model or a single-container model based on the number the user specifies (eg if it's 1 container, then do a single-container model, and if it's more then do an inference pipeline model).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. An api with container
and extraContainers
seems clumsy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As at least one container is required, following Adam's previous recommendation to introduce productionVariant
+ extraProductionVariants
, my co-authors and I used the same naming convention for the containers.
Once there's consensus on this thread, I plan to apply the same naming convention to both the containers and production variants (i.e., I just want to be doubly sure that the CDK team wants me to undo Adam's recommendation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't read everything yet, comments so far
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
## Model | ||
|
||
By creating a model, you tell Amazon SageMaker where it can find the model components. This includes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know yet what a model is, and did you meant to say "by creating a Model
" ? As in, refer to a class here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the current verbiage in the README was taken from SageMaker's public documentation (link relevant to this section), which uses "model" to describe both the SageMaker resource and its associated abstract ML concept.
Which of the following are you looking for?
- Use/paraphrase a one-liner from the AWS glossary, as in the following
In machine learning (ML), a mathematical model that generates predictions by finding patterns in data.
- A more substantial write-up where I describe ML concepts to non-ML customers
- Simply use
Model
in place of the first use of "model"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[This applies to all documentation in this readme section]
The SageMaker public documentation is a good starting point, but I caution actually copying it over verbatim. We can always link to the documentation with something like "For more information on Amazon SageMaker, see SageMaker docs"
To answer this specific question, I'd like to see something like this:
## Model
To create a machine learning model with Amazon Sagemaker, use the `Model` construct.
This construct includes properties that can be configured to define...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: As I had already expanded the wording a bit more (to provide a bit more ML related context), the version in the next revision will be slightly longer.
text/0431-sagemaker-l2-endpoint.md
Outdated
the S3 path where the model artifacts are stored and the Docker registry path for the image that | ||
contains the inference code. The `ContainerDefinition` interface encapsulates both the specification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between artifacts and inference code? Why don't they both live in the Docker image? (All to say: I would appreciate one sentence of levelsetting here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the next revision, I'll emphasize that a model's code usually changes at a slower rate than a model's artifacts (which will likely change every time the model is re-trained, while the code remains static) making their separation natural from a decoupling standpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we (the customers) deserve to see an example of a Model
after the blurb about what it is and how you can configure it. It's okay if the additional configuration sections come after it; I want to see like the most basic use case front and center.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've moved up the Model
examples prior to diving into container image and model data assets.
text/0431-sagemaker-l2-endpoint.md
Outdated
const image = sagemaker.ContainerImage.fromEcrRepository(repository, 'tag'); | ||
``` | ||
|
||
#### `AssetImage` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
People don't actually see the class AssetImage
themselves, so the section title might be confusing.
We generally recommend people use assets. Can we move this paragraph up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll generalize the EcrImage
and AssetImage
headers and move the assets section above the ECR one.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
### Model Artifacts | ||
|
||
Models are often associated with model artifacts, which are specified via the `modelData` property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would I use them? One sentence of context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll also emphasize the point about decoupling trained artifacts from the inference code here in the next revision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreeing that in general, every section title should start with a sentence about what it is and why would one use it.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
### AutoScaling | ||
|
||
The `autoScaleInstanceCount` method on the `IEndpointProductionVariant` interface can be used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too much detail on the interface here I think :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean you'd like me to remove "on the IEndpointProductionVariant
interface" from this sentence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply "To enable autoscaling on the production variant, use the autoScaleInstanceCount
method:"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've adjusted both references to IEndpointProductionVariant
in the README accordingly.
text/0431-sagemaker-l2-endpoint.md
Outdated
import * as sagemaker from '@aws-cdk/aws-sagemaker'; | ||
|
||
const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', { | ||
productionVariant: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be productionVariants: ProductionVariant[]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreement. In general, throughout this RFC, I'd prefer fooBars[]
over foobar
and extraFooBars[]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the earliest PR for these constructs, @skinny85 commented:
So here's an interesting pattern that you can use here.
Since at least one ProductionVariant is required, have in props:
productionVariant: ProductionVariant; // required extraProductionVariants?: ProductionVariant[]; // optionalThis way, you communicate to the clients of this class, at compile time, that at least one variant has to be provided.
Given that this new feedback contradicts Adam's original guidance, in light of this context, I just wanted to double-check that I should indeed recombine these attributes back into a single array.
CC-ing @rix0rrr as he gave the same feedback in another comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Want to hear from others as well but I feel like a synth-time check for at least one production variant would suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its okay to go forward with the consensus that we need just one property productionVariants
and one property containers
. These will be required properties so it stands to reason that at least one variant and one container must be provided. If we have to go back on this decision I will eat crow :)
text/0431-sagemaker-l2-endpoint.md
Outdated
container: { | ||
image: image1, modelData: modelData1 | ||
}, | ||
extraContainers: [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. An api with container
and extraContainers
seems clumsy.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
The `Model` construct associates container images with their optional model data. | ||
|
||
#### Single Container Model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think this should be the start of the README. The Container Images and Model Artifacts sections are important, but they represent familiar APIs to CDK users and do not demonstrate the main L2 that this RFC is proposing. The first example we should see should be the Single Container Model, without fixture:
import * as sagemaker from '@aws-cdk/aws-sagemaker';
const image = sagemaker.ContainerImage.fromAsset(this, 'Image', {
directory: path.join('path', 'to', 'Dockerfile', 'directory')
});
const modelData = sagemaker.ModelData.fromAsset(this, 'ModelData',
path.join('path', 'to', 'artifact', 'file.tar.gz'));
const model = new sagemaker.Model(this, 'PrimaryContainerModel', {
container: {
image: image,
modelData: modelData,
},
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I'll move the model content up in the next revision.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
### What are we launching today? | ||
|
||
We are launching the first set of L2 constructs for an existing module (`@aws-cdk/aws-sagemaker`), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This syntax is only valid for V1. In text like this it will not translate properly to V2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the following paraphrasing of your comment correct?
- In CDK v2, stable modules are published under the
aws-cdk-lib
package while experimental modules are published under module-specific packages like@aws-cdk/aws-sagemaker-alpha
. As a result, usage of@aws-cdk/aws-sagemaker
does not align with the stable nor experimental package naming conventions for v2.
Assuming that's correct, would the following reword be okay?
We are launching the first set of L2 constructs for the SageMaker module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above is all correct, and the reword is good. We currently do some magic to turn the import * as blah from '@aws-cdk/aws-blah'
into import * as blah from 'aws-cdk-lib/aws-blah'
, but that only happens in the examples, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was still new and too dumb to know this.
text/0431-sagemaker-l2-endpoint.md
Outdated
###### Container Image | ||
|
||
The following interface and abstract class provide mechanisms for configuring a container image. | ||
These closely mirror [analogous entities from the `@aws-cdk/ecs` module][ecs-image] but, rather than |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as the v1/v2 comment above. Though, actually, the module is named incorrectly regardless. Granted, this only matters if you intend these sections to go in the README when it's written. So, take this comment for what you will.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make the adjustments in the next revision to avoid confusion. Will just "ECS module" be appropriately version agnostic?
text/0431-sagemaker-l2-endpoint.md
Outdated
ECR as an image source while ECS was capable of sourcing images from either ECR or a | ||
customer-owned private repository. Given the fact that these two products' supported images | ||
sources may yet again diverge in the future, maybe it would be best to keep their | ||
`ContainerImage` APIs separate within their respective modules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree with this point. In general, we should avoid duplicate code. Being unsure of where the place is isn't a compelling reason to have basically two copies of the same thing. A base class/interface that can be extended/implemented as appropriate is better design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that reuse is better, but in this particular case reusing from ecs
doesn't sound great. We might at some point support a more generic ContainerImage
--for now this is okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being unsure of where the place is isn't a compelling reason to have basically two copies of the same thing.
That's fair.
I agree that reuse is better, but in this particular case reusing from
ecs
doesn't sound great. We might at some point support a more genericContainerImage
--for now this is okay.
Just to double-check: does this mean that it's alright to keep the new, duplicated-ish ContainerImage
APIs on this RFC and record potential future work to converge the APIs into a generic solution within the "Are there any open issues that need to be addressed later" section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combining ContainerImage
is out of scope for this RFC. You're good to go ahead with SageMaker specific ContainerImage
API. Creating a generic ContainerImage
is our problem to solve, you don't really need to worry about it :).
(Unless if you have ideas in which case feel free to remember to open a github issue after this L2 is merged!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've reworded this bit and moved it into the "Are there any open issues that need to be addressed later?" section.
text/0431-sagemaker-l2-endpoint.md
Outdated
This RFC and its [original associated implementation PR][original-pr] were based on a Q3 2019 | ||
feature set of SageMaker real-time inference endpoints. Since that point in time, SageMaker has | ||
launched the following features which would require further additions to the L2 API contracts: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are part of SageMaker's core feature set, I think they need to be taken into account in this RFP. We may be creating one way doors with contracts set here that will make the user experience more complicated if these are added later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least I'd like a good story for how the API would evolve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are part of SageMaker's core feature set, I think they need to be taken into account in this RFP. We may be creating one way doors with contracts set here that will make the user experience more complicated if these are added later.
That's fair. I'll begin familiarizing myself with the last few years of SageMaker features and revise the RFC accordingly.
Quick heads-up: this is going to require a fair bit of experimentation (i.e., I'm actually going to, at least in a rudimentary way, implement the construct changes and test these features to make sure I understand how they are configured and how they function). With upcoming OOTO plans, this means it may be a month or more before I'm done with the next revision. Please let me know if you all have concerns with that timing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear. We won't actually be creating any one-way doors because sagemaker will start out as an alpha module. I think we should be thinking about how these constructs can be extended to support other SageMaker resources in the future, but it's not necessary to include them in the current RFC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kaizencc I've familiarized myself with asynchronous inference and serverless inference (the two feature launches with the biggest impact on the proposed APIs in this RFC) by deploying a few sample endpoints via CloudFormation and found a number of feature disparities that I was hoping to get some design guidance on questions related to compile-time, synthesis-time, and deploy-time detection of issues. Particularly, strong compile-time contracts seem like a good idea in practice, but in reality, I would hope that some of these feature gaps disappear. To avoid that future evolution from preventing convergence of the API contracts, using synthesis-time validation seems ideal, but then again, I have no guarantee that these products will actually converge. Anyway, to go into more detail:
-
Lack of feature parity across inference products: The serverless inference documentation states that the following features are absent from SageMaker's serverless feature:
Some of the features currently available for SageMaker Real-time Inference are not supported for Serverless Inference, including GPUs, AWS marketplace model packages, private Docker registries, Multi-Model Endpoints, VPC configuration, network isolation, data capture, multiple production variants, Model Monitor, and inference pipelines.
Although I haven't found corresponding documentation, asynchronous inference has similar limitations: you cannot use multiple production variants nor can the single supported variant be serverless (it must use instance-based hosting). This led me down two separate paths:
- Asynchronous Endpoints: Whether or not an endpoint should be treated as sync vs async is controlled cross-variant at the level of the
EndpointConfig
resource. When anEndpointConfig
resource is configured as async, it is currently only capable of supporting a single production variant. Should I use a compile-time solution (e.g. two different compile-time*Props
interfaces to differentiate between the supported features of sync vs async) or instead simply use synthesis-time validation to identify a misconfigured resource (e.g., an async endpoint config with multiple variants)? - Serverless Production Variants: Similarly, even though the instance-vs-serverless specification is per variant, the presence of one serverless production variant prevents customers from using additional variants (i.e., an aspect of one variant influences the behavior of the entire EndpointConfig resource). Would a synthesis-time check be sufficient here to error out on the incompatible variants or would a compile-time contract guarding this scenario be preferred?
- Asynchronous Endpoints: Whether or not an endpoint should be treated as sync vs async is controlled cross-variant at the level of the
-
Feature-specific metrics: Asynchronous endpoints have unique metrics like
ApproximateBacklogSize
on the endpoint name dimension whereas serverless variants have unique metrics likeModelSetupTime
on the endpoint name + production variant dimension. When addingmetric*
helper APIs for these scenarios, should they be implemented in such a way to guarantee that the metric is available for the specific use-case from a compile-time perspective (i.e., each disparate use-case has an appropriate interface with only its suitable methods) or is it sufficient to use synthesis-time validation in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clarification has given me a change of heart, and now I'm interested in the following idea: two constructs, one generic
EndpointConfig
, and one "L2.5"AsyncEndpointConfig
that provides further restrictions on the productionVariant property.
L2.5 AsyncEndpointConfig
construct
For should such an AsyncEndpointConfig
, which of the following would apply?
AsyncEndpointConfig extends Resource
and instantiates anEndpointConfig
construct: This is the classic L2 strategy, but would (I believe) introduce an intermediate async construct in the hierarchy, meaning customers couldn't replace anAsyncEndpointConfig
with anEndpointConfig
without changing their CloudFormation logical IDs.AsyncEndpointConfig extends EndpointConfig
: I assume this would retain the identical construct hierarchy but would violate the following documented CDK design guideline:As a rule of thumb, most constructs should directly extend the Construct or Resource instead of another construct. Prefer representing polymorphic behavior through interfaces and not through inheritance.
- I'm assuming there could be another scenario where
AsyncEndpointConfig
instantiatesEndpointConfig
with a customer-provided scope and ID without extendingConstruct
/Resource
to avoid impacting the hierarchy, but I'm not familiar with this strategy being employed in the CDK codebase.
Serverless Production Variants
The other limitation called out above, for synchronous endpoints, is that the presence of a single serverless variant prevents other variants from being configured. Would a synthesis-time approach be appropriate here (i.e., taking ProductionVariant[]
on the EndpointConfigProps
but erroring out if multiple are present alongside a serverless variant) or would you prefer a model more similar to your serverless-vs-instance specific properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reading the design guidelines! You are one of the few :).
We don't have to implement AsyncEndpointConfig
day 1 of the library. Lets add it to an "extensions" section and then turn those extensions into github issues later.
I think this may be a special case where we are okay with extending EndpointConfig
. We generally don't want to support that many L2.5s in the library, which is why we document that we want to extend the resource, but this could be a good candidate. I could be wrong, and be forced to change my mind later, but I'm looking at NodeJsFunction
and other L2.5s in the main repo and they extend another construct. At any rate, this doesn't need to be implemented along with EndpointConfig
so we don't need to finalize how to instantiate it just yet.
How about mutually exclusive properties serverlessProductionVariant: ServerlessProductionVariant
and productionVariants: ProductionVariant[]
? That seems like a good mix of property differentiation and synth time validation (that only one of the two props are set).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have to implement
AsyncEndpointConfig
day 1 of the library. Lets add it to an "extensions" section and then turn those extensions into github issues later.
Sounds good! I assume the serverless pieces can be added to the same section.
How about mutually exclusive properties
serverlessProductionVariant: ServerlessProductionVariant
andproductionVariants: ProductionVariant[]
? That seems like a good mix of property differentiation and synth time validation (that only one of the two props are set).
Two quick follow-up questions:
- As both types of variants share a variant name & weight (and ideally could be used as siblings if the SageMaker folks release support for multi-heterogenous variants per endpoint config), to adjust your proposal slightly, would it make sense to define a shared interface like
ProductionVariant
and then have two interfaces extending it:ServerlessProductionVariant
andInstanceProductionVariant
? We could simply not exportProductionVariant
yet as we don't have a need for polymorphic representation (e.g.,ProductionVariant[]
) and that way, both exported interfaces are specifically targeting their use-case (e.g., serverless vs instance-based). - Challenging one statement I made above: if the SageMaker folks later do support heterogeneous variants, would it make sense to collapse both variant specifications into a single
ProductionVariant[]
on theEndpointConfigProps
(i.e., is that polymorphic APIjsii
-compatible)? If so (and assuming we can foresee SageMaker's product evolution), would it be unusual to present two separate variant specifications now if we know they will eventually be "doomed" APIs (i.e., ideal for removal during a major version bump)?
Thanks so much for the help, Kaizen; this conversation has been immensely helpful! I'll start incorporating your recommended changes as my final pair of questions are mostly focused on minor remaining details that can be adjusted rather easily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answer for 1 is yes.
Answer for 2 is hmmm. One the one hand, predicting what SageMaker is going to do in the future is future-proofing and we want to avoid doing that. We have no guarantee that that will ever happen. If it happens when SageMaker is in alpha, then we can make that breaking change. If it happens after SageMaker is stabilized, then we can deprecate serverlessProductionVariant
and instanceProductionVariants
and introduce a new productionVariant
property. Or, we can keep both APIs and just remove/modify the synth-time validation -- they're not necessarily doomed and it might be fine to specify different production variants in different properties. I think if that happens it's okay, we're not stuck, and any changes we have to make won't really hurt the API too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feature-specific metrics
: I think the RFC should not worry about specificXxxMetric
APIs and rather just expose the generic metric API and allow the user to specify exactly what metrics they want to use. This item can be a fast-follow item but isn't required for an alpha release, and there's already a lot of ideas to implement in this RFC. To answer your question though, I think synth-time validation makes sense / is the best we can do.
As I just published rev2, I wanted to highlight one bit: I kept the previously proposed metric*
APIs on the newly renamed IEndpointInstanceProductionVariant
as they had already been implemented and would be easy enough to remove if you feel strongly that they shouldn't be included.
text/0431-sagemaker-l2-endpoint.md
Outdated
CDK integration test perspective, specifying `--no-clean` will allow the generation of a snapshot | ||
regardless of whether stack deletion will succeed or fail but may hinder snapshot re-generation | ||
by subsequent CDK contributors. For this reason, it may be helpful to exclude VPC specification | ||
from the endpoint integration test at this time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than excluding testing, we should add retries to the tests that take into account these delays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This is out of scope for the current RFC, but it may give @corymhall some inspiration 😜
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than excluding testing, we should add retries to the tests that take into account these delays.
I agree. This is out of scope for the current RFC, but it may give (at)corymhall some inspiration 😜
So, hypothetically, if this RFC were bar raised & signed off in 30 days, does this mean that the implementation PR merge would be blocked on the configuration of CloudFormation stack tear down retries to the associated integ test code?
Thanks for all the feedback so far! I'll be OOTO for the remainder of the week, so I'll begin fielding comments early next week. |
text/0431-sagemaker-l2-endpoint.md
Outdated
import * as sagemaker from '@aws-cdk/aws-sagemaker'; | ||
|
||
const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig }); | ||
const productionVariant = endpoint.findProductionVariant('variantName'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do I make a production variant? Is it created when I call this? Is it pre-created? How do I know its name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The customer currently specifies the variantName
attribute of a production variant when defining an EndpointConfig
resource.
- `IModel` -- interface for defined and imported models | ||
|
||
```ts | ||
export interface IModel extends cdk.IResource, iam.IGrantable, ec2.IConnectable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see a model is Grantable
and Connectable
. Does a Model correspond 1:1 with compute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming I'm understanding your question correctly: not exactly. A single model (to which VPC configuration is associated) can be shared by all of the following resources (each with its own compute):
- Multiple endpoints
- Multiple production variants of a single endpoint
- An asynchronous, batch transform job started by the
CreateTransformJob
SageMaker API
Does your question imply that we may be misusing IConnectable
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
text/0431-sagemaker-l2-endpoint.md
Outdated
* The VPC to deploy the endpoint to. | ||
* | ||
* @default none |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is about the endpoint, then doesn't this need to be configured on the Endpoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll adjust the comment as the VPC is not endpoint-specific. For example, without ever creating an endpoint, a customer can create an asynchronous, batch transform job via the CreateTransformJob
referencing their model. SageMaker will instantiate temporary EC2 instances within the customer's VPC for the lifetime of the transform job.
* @param bucket The S3 bucket within which the model artifacts are stored | ||
* @param objectKey The S3 object key at which the model artifacts are stored | ||
*/ | ||
public static fromBucket(bucket: s3.IBucket, objectKey: string): ModelData { ... } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to deal with object versions here? Can we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add objectVersion?: string
as a parameter to this method (and double-check that SageMaker supports versioned objects).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell (from reading documentation & experimenting with SageMaker operations), SageMaker does not support versioned objects as a source of model data. I've also posted a question to repost.aws to confirm, and the one response I've gotten agrees (at "~90%" confidence) with my conclusion.
text/0431-sagemaker-l2-endpoint.md
Outdated
* | ||
* @default average over 5 minutes | ||
*/ | ||
metricGPUUtilization(props?: cloudwatch.MetricOptions): cloudwatch.Metric; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No uppercase abbreviations. metricGpuUtilization
, metricCpuUtilization
, etc.
text/0431-sagemaker-l2-endpoint.md
Outdated
ECR as an image source while ECS was capable of sourcing images from either ECR or a | ||
customer-owned private repository. Given the fact that these two products' supported images | ||
sources may yet again diverge in the future, maybe it would be best to keep their | ||
`ContainerImage` APIs separate within their respective modules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that reuse is better, but in this particular case reusing from ecs
doesn't sound great. We might at some point support a more generic ContainerImage
--for now this is okay.
text/0431-sagemaker-l2-endpoint.md
Outdated
prevents customers from reusing configuration across endpoints. For this reason, an explicit | ||
L2 construct for endpoint configuration was incorporated into this RFC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for getting into this. What would be a typical use case to use the same config for multiple endpoints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid that I can't say what is typical or atypical for SageMaker customers, but I could imagine the following scenario:
- Producer A exposes ten endpoints, each unique to a different consumer (let's label these B thru K).
- Each of these endpoints could use one of, say, three endpoint configs (let's label these 1 thru 3) based on the features needed by each consumer.
- Consumer B's endpoint is currently associated with endpoint config 1.
At some later point, consumer B wants to leverage a new feature, so in collaboration with the consumer, producer A updates B's endpoint to reference endpoint config 3.
As a result, without switching endpoints, consumer B was able to begin using the features enabled via the pre-built, shared endpoint config 3.
Again though, since this is a completely made up example, I'm not sure if it represents a normal SageMaker use-case. Is it possible for your team to reach out to an internal subject matter expert in the SageMaker product space that can speak more to endpoint config reuse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Responded on a different comment but this makes sense, just please upstream it to the EndpointConfig readme section (just a simple "By using the EndpointConfig
construct, you can define one endpoint and reuse it on multiple Endpoint
constructs")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I now mention reuse of EndpointConfig
across Endpoint
resources in the README.
text/0431-sagemaker-l2-endpoint.md
Outdated
This RFC and its [original associated implementation PR][original-pr] were based on a Q3 2019 | ||
feature set of SageMaker real-time inference endpoints. Since that point in time, SageMaker has | ||
launched the following features which would require further additions to the L2 API contracts: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least I'd like a good story for how the API would evolve.
text/0431-sagemaker-l2-endpoint.md
Outdated
CDK integration test perspective, specifying `--no-clean` will allow the generation of a snapshot | ||
regardless of whether stack deletion will succeed or fail but may hinder snapshot re-generation | ||
by subsequent CDK contributors. For this reason, it may be helpful to exclude VPC specification | ||
from the endpoint integration test at this time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This is out of scope for the current RFC, but it may give @corymhall some inspiration 😜
text/0431-sagemaker-l2-endpoint.md
Outdated
import * as sagemaker from '@aws-cdk/aws-sagemaker'; | ||
|
||
const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', { | ||
productionVariant: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreement. In general, throughout this RFC, I'd prefer fooBars[]
over foobar
and extraFooBars[]
text/0431-sagemaker-l2-endpoint.md
Outdated
/** | ||
* Name of the SageMaker Model. | ||
* | ||
* @default AWS CloudFormation generates a unique physical ID and uses that ID for the model's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: all @default
s that have a text description of the value (like this one) need to be @default -
. @default
s that do not have a text description, and only a value, should not have a -
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Responding to remaining comments on the first draft.
text/0431-sagemaker-l2-endpoint.md
Outdated
the S3 path where the model artifacts are stored and the Docker registry path for the image that | ||
contains the inference code. The `ContainerDefinition` interface encapsulates both the specification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the next revision, I'll emphasize that a model's code usually changes at a slower rate than a model's artifacts (which will likely change every time the model is re-trained, while the code remains static) making their separation natural from a decoupling standpoint.
text/0431-sagemaker-l2-endpoint.md
Outdated
const image = sagemaker.ContainerImage.fromEcrRepository(repository, 'tag'); | ||
``` | ||
|
||
#### `AssetImage` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll generalize the EcrImage
and AssetImage
headers and move the assets section above the ECR one.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
### Model Artifacts | ||
|
||
Models are often associated with model artifacts, which are specified via the `modelData` property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll also emphasize the point about decoupling trained artifacts from the inference code here in the next revision.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
The `Model` construct associates container images with their optional model data. | ||
|
||
#### Single Container Model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I'll move the model content up in the next revision.
text/0431-sagemaker-l2-endpoint.md
Outdated
prevents customers from reusing configuration across endpoints. For this reason, an explicit | ||
L2 construct for endpoint configuration was incorporated into this RFC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid that I can't say what is typical or atypical for SageMaker customers, but I could imagine the following scenario:
- Producer A exposes ten endpoints, each unique to a different consumer (let's label these B thru K).
- Each of these endpoints could use one of, say, three endpoint configs (let's label these 1 thru 3) based on the features needed by each consumer.
- Consumer B's endpoint is currently associated with endpoint config 1.
At some later point, consumer B wants to leverage a new feature, so in collaboration with the consumer, producer A updates B's endpoint to reference endpoint config 3.
As a result, without switching endpoints, consumer B was able to begin using the features enabled via the pre-built, shared endpoint config 3.
Again though, since this is a completely made up example, I'm not sure if it represents a normal SageMaker use-case. Is it possible for your team to reach out to an internal subject matter expert in the SageMaker product space that can speak more to endpoint config reuse?
text/0431-sagemaker-l2-endpoint.md
Outdated
This RFC and its [original associated implementation PR][original-pr] were based on a Q3 2019 | ||
feature set of SageMaker real-time inference endpoints. Since that point in time, SageMaker has | ||
launched the following features which would require further additions to the L2 API contracts: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are part of SageMaker's core feature set, I think they need to be taken into account in this RFP. We may be creating one way doors with contracts set here that will make the user experience more complicated if these are added later.
That's fair. I'll begin familiarizing myself with the last few years of SageMaker features and revise the RFC accordingly.
Quick heads-up: this is going to require a fair bit of experimentation (i.e., I'm actually going to, at least in a rudimentary way, implement the construct changes and test these features to make sure I understand how they are configured and how they function). With upcoming OOTO plans, this means it may be a month or more before I'm done with the next revision. Please let me know if you all have concerns with that timing.
text/0431-sagemaker-l2-endpoint.md
Outdated
CDK integration test perspective, specifying `--no-clean` will allow the generation of a snapshot | ||
regardless of whether stack deletion will succeed or fail but may hinder snapshot re-generation | ||
by subsequent CDK contributors. For this reason, it may be helpful to exclude VPC specification | ||
from the endpoint integration test at this time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than excluding testing, we should add retries to the tests that take into account these delays.
I agree. This is out of scope for the current RFC, but it may give (at)corymhall some inspiration 😜
So, hypothetically, if this RFC were bar raised & signed off in 30 days, does this mean that the implementation PR merge would be blocked on the configuration of CloudFormation stack tear down retries to the associated integ test code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @petermeansrock! Just doing some housekeeping and responding to some of your inquiries. Will be waiting for the next revision before I take another look. In general, let's focus mostly on the README, and far less on the "technical solution" FAQ. That question is meant to be more high-level anyway, you're in a unique position where you have actually implemented most of what we're designing.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
## Model | ||
|
||
By creating a model, you tell Amazon SageMaker where it can find the model components. This includes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[This applies to all documentation in this readme section]
The SageMaker public documentation is a good starting point, but I caution actually copying it over verbatim. We can always link to the documentation with something like "For more information on Amazon SageMaker, see SageMaker docs"
To answer this specific question, I'd like to see something like this:
## Model
To create a machine learning model with Amazon Sagemaker, use the `Model` construct.
This construct includes properties that can be configured to define...
text/0431-sagemaker-l2-endpoint.md
Outdated
the S3 path where the model artifacts are stored and the Docker registry path for the image that | ||
contains the inference code. The `ContainerDefinition` interface encapsulates both the specification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we (the customers) deserve to see an example of a Model
after the blurb about what it is and how you can configure it. It's okay if the additional configuration sections come after it; I want to see like the most basic use case front and center.
text/0431-sagemaker-l2-endpoint.md
Outdated
const image = sagemaker.ContainerImage.fromAsset(this, 'Image', { | ||
directory: path.join('path', 'to', 'Dockerfile', 'directory') | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An asset isn't a resource, so I don't think it needs the scope and construct Id. Instead, the API should look like:
sagemaker.ContainerImage.fromAsset(directory: string)
OR, if you have additional options,
sagemaker.ContainerImage.fromAsset(assetOptions: {})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've pivoted to having the new proposed APIs take an Asset
from '@aws-cdk/aws-s3-assets' (for ModelData
) and DockerImageAsset
from '@aws-cdk/aws-ecr-assets' (for ContainerImage
) to avoid adding any assets into the hierarchy myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented elsewhere but I think the correct API here is to create the asset for the user (and let them supply only a path to a directory, and optional options.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
### Model Artifacts | ||
|
||
Models are often associated with model artifacts, which are specified via the `modelData` property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreeing that in general, every section title should start with a sentence about what it is and why would one use it.
text/0431-sagemaker-l2-endpoint.md
Outdated
prevents customers from reusing configuration across endpoints. For this reason, an explicit | ||
L2 construct for endpoint configuration was incorporated into this RFC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Responded on a different comment but this makes sense, just please upstream it to the EndpointConfig readme section (just a simple "By using the EndpointConfig
construct, you can define one endpoint and reuse it on multiple Endpoint
constructs")
text/0431-sagemaker-l2-endpoint.md
Outdated
This RFC and its [original associated implementation PR][original-pr] were based on a Q3 2019 | ||
feature set of SageMaker real-time inference endpoints. Since that point in time, SageMaker has | ||
launched the following features which would require further additions to the L2 API contracts: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear. We won't actually be creating any one-way doors because sagemaker will start out as an alpha module. I think we should be thinking about how these constructs can be extended to support other SageMaker resources in the future, but it's not necessary to include them in the current RFC.
text/0431-sagemaker-l2-endpoint.md
Outdated
import * as sagemaker from '@aws-cdk/aws-sagemaker'; | ||
|
||
const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', { | ||
productionVariant: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its okay to go forward with the consensus that we need just one property productionVariants
and one property containers
. These will be required properties so it stands to reason that at least one variant and one container must be provided. If we have to go back on this decision I will eat crow :)
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
### AutoScaling | ||
|
||
The `autoScaleInstanceCount` method on the `IEndpointProductionVariant` interface can be used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply "To enable autoscaling on the production variant, use the autoScaleInstanceCount
method:"
- `IModel` -- interface for defined and imported models | ||
|
||
```ts | ||
export interface IModel extends cdk.IResource, iam.IGrantable, ec2.IConnectable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Responding to a few comments about the direction that was taken in authoring revision 2.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
## Model | ||
|
||
By creating a model, you tell Amazon SageMaker where it can find the model components. This includes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: As I had already expanded the wording a bit more (to provide a bit more ML related context), the version in the next revision will be slightly longer.
text/0431-sagemaker-l2-endpoint.md
Outdated
the S3 path where the model artifacts are stored and the Docker registry path for the image that | ||
contains the inference code. The `ContainerDefinition` interface encapsulates both the specification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've moved up the Model
examples prior to diving into container image and model data assets.
text/0431-sagemaker-l2-endpoint.md
Outdated
const image = sagemaker.ContainerImage.fromAsset(this, 'Image', { | ||
directory: path.join('path', 'to', 'Dockerfile', 'directory') | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've pivoted to having the new proposed APIs take an Asset
from '@aws-cdk/aws-s3-assets' (for ModelData
) and DockerImageAsset
from '@aws-cdk/aws-ecr-assets' (for ContainerImage
) to avoid adding any assets into the hierarchy myself.
text/0431-sagemaker-l2-endpoint.md
Outdated
|
||
### AutoScaling | ||
|
||
The `autoScaleInstanceCount` method on the `IEndpointProductionVariant` interface can be used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've adjusted both references to IEndpointProductionVariant
in the README accordingly.
text/0431-sagemaker-l2-endpoint.md
Outdated
* @param id The id to assign to the image asset | ||
* @param props The properties of a Docker image asset | ||
*/ | ||
public static fromAsset(scope: Construct, id: string, props: assets.DockerImageAssetProps): ContainerImage { ... } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've decided to pivot to having the new proposed APIs take an Asset
from '@aws-cdk/aws-s3-assets' (for ModelData
) and DockerImageAsset
from '@aws-cdk/aws-ecr-assets' (for ContainerImage
) to avoid adding any assets into the hierarchy myself.
text/0431-sagemaker-l2-endpoint.md
Outdated
* @param id The id to associate with the new asset | ||
* @param path The local path to a model artifact file as a gzipped tar file | ||
*/ | ||
public static fromAsset(scope: Construct, id: string, path: string): ModelData { ... } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted above but copying here as well:
In rev2, I've pivoted to having the new proposed APIs take an
Asset
from '@aws-cdk/aws-s3-assets' (forModelData
) andDockerImageAsset
from '@aws-cdk/aws-ecr-assets' (forContainerImage
) to avoid adding any assets into the hierarchy myself.
text/0431-sagemaker-l2-endpoint.md
Outdated
* | ||
* @default 1 | ||
*/ | ||
readonly initialInstanceCount?: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've called these features out explicitly in the "Are there any open issues that need to be addressed later?" section.
text/0431-sagemaker-l2-endpoint.md
Outdated
ECR as an image source while ECS was capable of sourcing images from either ECR or a | ||
customer-owned private repository. Given the fact that these two products' supported images | ||
sources may yet again diverge in the future, maybe it would be best to keep their | ||
`ContainerImage` APIs separate within their respective modules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I've reworded this bit and moved it into the "Are there any open issues that need to be addressed later?" section.
text/0431-sagemaker-l2-endpoint.md
Outdated
prevents customers from reusing configuration across endpoints. For this reason, an explicit | ||
L2 construct for endpoint configuration was incorporated into this RFC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rev2, I now mention reuse of EndpointConfig
across Endpoint
resources in the README.
Pull request has been modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @petermeansrock, I'm pretty happy with this RFC modulo some small comments. And we may get further involvement from the sagemaker team as well.
I also didn't really look at the code in this RFC. i'll review that on the actual PR when the time comes, but I'm much less worried about the actual implementation. i focused primarily on the readme and the other parts of the FAQ.
text/0431-sagemaker-l2-endpoint.md
Outdated
const image = sagemaker.ContainerImage.fromAsset(this, 'Image', { | ||
directory: path.join('path', 'to', 'Dockerfile', 'directory') | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented elsewhere but I think the correct API here is to create the asset for the user (and let them supply only a path to a directory, and optional options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay @petermeansrock! We're calling this approved :).
Co-authored-by: Matt McClean [email protected]
Co-authored-by: Long Yao [email protected]
Co-authored-by: Drew Jetter [email protected]
Co-authored-by: Murali Ganesh [email protected]
Co-authored-by: Abilash Rangoju [email protected]
This is a request for comments about SageMaker model hosting L2 constructs. See #431 for
additional details.
APIs are signed off by @kaizencc.
Rendered Version
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache-2.0 license