Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid Response Headers Set for Non MME Inference Scenario #149

Open
Grassycup opened this issue Nov 10, 2022 · 0 comments
Open

Invalid Response Headers Set for Non MME Inference Scenario #149

Grassycup opened this issue Nov 10, 2022 · 0 comments

Comments

@Grassycup
Copy link

Background
I have a custom inference container implementing the inference lifecycle methods described in https://docs.aws.amazon.com/sagemaker/latest/dg/adapt-inference-container.html.
Functionally, I have no issues, but when it comes to using features such as https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-online-explainability.html which requires correct response header I run into issues due to the response type always being text/html instead of one which I specify (tested via aws sagemaker-runtime invoke-endpoint)

After digging through the source and testing, I realized MME works as expected and returns proper response header.
This is because https://github.com/aws/sagemaker-inference-toolkit/blob/2ed6f3d19daffccab390d8e9142c9c39c9932271/src/sagemaker_inference/transformer.py#L154 handles the response header correctly.

So the issue is with the non MME case.
The reason for this behavior is due to the split on how different types of servers are ran

server.start(env.ServingEnv().framework_module)

And this split causes difference in the way response is treated due to their handler code coming from different repositories.

Possible Solution
A solution is fixing https://github.com/aws/sagemaker-containers/blob/master/src/sagemaker_containers/_transformer.py#L207 so instead of raw result it should be return _worker.Response(response=result, mimetype=request.accept) which should force the response to have the correct response header instead of text/html.
However, this package is archived and in read only mode.
So I patched out the problematic file in my dockerfile.
COPY inference/patches/_transformer.py /miniconda3/lib/python3.8/site-packages/sagemaker_containers/_transformer.py
And this seemed to yield the desired results.

I haven't dug deep enough to say this is the correct solution, but I don't think the current implementation is in an ideal state.
XGBoost also suffers from similar issue https://github.com/aws/sagemaker-xgboost-container
The change lines up with the existing documentation as well https://docs.aws.amazon.com/sagemaker/latest/dg/adapt-inference-container.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant