fix(docs): first draft of the securing endpoints #5991

Merged on Dec 9, 2024 (14 commits, into v2 from securing-end-points-rajie)

Rajakavitha1 (Contributor)

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

CLAassistant commented Oct 25, 2024

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ paulb-seldon
✅ Rajakavitha1
❌ Rakavitha Kodhandapani


Rakavitha Kodhandapani seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

@Rajakavitha1 changed the title from "first draft of the securing endpoints" to "fix(docs): first draft of the securing endpoints" on Oct 25, 2024
@Rajakavitha1 requested a review from cherrymu on October 25, 2024 12:16
@Rajakavitha1 marked this pull request as ready for review on October 25, 2024 12:16
@paulb-seldon (Contributor) commented:

I have made an edit to add context to the beginning of the page.

Another suggestion is to add more content about what we are doing ahead of each of the steps. Instead of saying 'to secure model endpoints' ahead of the steps, I think we need to expand a bit more. That phrasing is fine in the context / intro, but when we go into the details we should explain exactly what we are doing, i.e. what the auth policy is doing and what the authentication policy as shown is doing.

Also, ahead of each step, can we call out what the placeholders are? Maybe we just need that for $MESH_IP.
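
To make the question about the two policies concrete, here is a minimal sketch of what they might look like. The resource names `ingress-jwt-auth` and `deny-empty-jwt` come from the test output later in this thread; the namespace, the selector label, the issuer, and the JWKS URL are assumptions for illustration and are not taken from the merged guide. Roughly, the `RequestAuthentication` tells the gateway how to validate a JWT when one is presented, and the `AuthorizationPolicy` denies any request that carries no authenticated request principal.

```shell
# Sketch only: write two Istio policies to local files for review.
# Assumed (not from the PR): namespace, selector label, issuer, jwksUri.

# RequestAuthentication: declares which JWT issuer and key set are trusted.
# On its own it rejects invalid tokens but still admits requests with no token.
cat > ingress-jwt-auth.yaml <<'EOF'
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: ingress-jwt-auth
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  jwtRules:
  - issuer: "https://issuer.example.com"
    jwksUri: "https://issuer.example.com/.well-known/jwks.json"
EOF

# AuthorizationPolicy: denies requests that have no authenticated
# request principal, i.e. requests without a valid JWT.
cat > deny-empty-jwt.yaml <<'EOF'
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-empty-jwt
  namespace: istio-system
spec:
  action: DENY
  selector:
    matchLabels:
      istio: ingressgateway
  rules:
  - from:
    - source:
        notRequestPrincipals: ["*"]
EOF

# To apply: kubectl apply -f ingress-jwt-auth.yaml -f deny-empty-jwt.yaml
```

The two resources are kept in separate files so that each can be applied and explained as its own step, which is what the comment above asks the guide to do.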

@Rajakavitha1 (Contributor Author) left a comment:

Hi @paulb-seldon, I just rephrased the paragraph to adhere to the writing guidelines.

docs-gb/models/securing-endpoints.md (outdated, resolved)
Co-authored-by: Rajakavitha Kodhandapani <[email protected]>
@Rajakavitha1 requested a review from lc525 on November 6, 2024 04:34

@Rajakavitha1 (Contributor Author) commented:

Hi @paulb-seldon, @lc525, and @cherrymu, I have tested and validated the steps documented in the guide.

NAME                                        STATUS   ROLES    AGE   VERSION
gke-rajie-core-default-pool-808de050-ds6t   Ready    <none>   41h   v1.30.5-gke.1014003
gke-rajie-core-default-pool-808de050-g74t   Ready    <none>   41h   v1.30.5-gke.1014003
gke-rajie-core-default-pool-808de050-q6sh   Ready    <none>   41h   v1.30.5-gke.1014003
➜  ~ kubectl -f ingress-jwt-auth.yaml
error: unknown shorthand flag: 'f' in -f
See 'kubectl --help' for usage.
➜  ~ kubectl apply -f ingress-jwt-auth.yaml
requestauthentication.security.istio.io/ingress-jwt-auth created
➜  ~ kubectl apply -f deny-empty-jwt.yaml  
Warning: configured AuthorizationPolicy will deny all traffic to TCP ports under its scope due to the use of only HTTP attributes in a DENY rule; it is recommended to explicitly specify the port
authorizationpolicy.security.istio.io/deny-empty-jwt created
➜  ~  seldon model infer iris --inference-host 35.204.190.15:80 \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
Error: Post "http://35.204.190.15:80/v2/models/iris/infer": dial tcp 35.204.190.15:80: connect: connection refused
➜  ~ seldon model infer iris --inference-host 34.141.255.75:80 \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
{
	"model_name": "iris_1",
	"model_version": "1",
	"id": "08b1910a-ed60-48eb-b034-1720d2c3d645",
	"parameters": {},
	"outputs": [
		{
			"name": "predict",
			"shape": [
				1,
				1
			],
			"datatype": "INT64",
			"parameters": {
				"content_type": "np"
			},
			"data": [
				2
			]
		}
	]
}
➜  ~  curl -i http://34.141.255.75/v2/models/iris/infer \
 -H "Content-Type: application/json" \
 -H "seldon-model":iris \
 -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
HTTP/1.1 200 OK
ce-endpoint: iris_1
ce-id: c9045202-6bad-466a-bf06-7c6eb4c85f88
ce-inferenceservicename: mlserver
ce-modelid: iris_1
ce-namespace: seldon-mesh
ce-requestid: c9045202-6bad-466a-bf06-7c6eb4c85f88
ce-source: io.seldon.serving.deployment.mlserver.seldon-mesh
ce-specversion: 0.3
ce-type: io.seldon.serving.inference.response
content-length: 213
content-type: application/json
date: Fri, 29 Nov 2024 03:56:10 GMT
server: envoy
x-request-id: ct4jmmjc2nks739innrg
x-envoy-upstream-service-time: 8
x-seldon-route: :iris_1:

{"model_name":"iris_1","model_version":"1","id":"c9045202-6bad-466a-bf06-7c6eb4c85f88","parameters":{},"outputs":[{"name":"predict","shape":[1,1],"datatype":"INT64","parameters":{"content_type":"np"},"data":[2]}]}
➜  ~  curl -i http://35.204.190.15/v2/models/iris/infer \
 -H "Content-Type: application/json" \
 -H "seldon-model":iris \
 -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
curl: (7) Failed to connect to 35.204.190.15 port 80 after 191 ms: Couldn't connect to server
➜  ~ seldon model infer iris --inference-host 34.141.255.75:80 \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
{
	"model_name": "iris_1",
	"model_version": "1",
	"id": "a76e4ea9-9138-4c26-bf5c-7844992e155b",
	"parameters": {},
	"outputs": [
		{
			"name": "predict",
			"shape": [
				1,
				1
			],
			"datatype": "INT64",
			"parameters": {
				"content_type": "np"
			},
			"data": [
				2
			]
		}
	]
}
➜  ~ kubectl apply -f virtual-service.yaml 
virtualservice.networking.istio.io/iris-route created
➜  ~ seldon model infer iris --inference-host 34.141.255.75:80 \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
{
	"model_name": "iris_1",
	"model_version": "1",
	"id": "d13e5568-1f43-48af-8bf3-c527334a7b6d",
	"parameters": {},
	"outputs": [
		{
			"name": "predict",
			"shape": [
				1,
				1
			],
			"datatype": "INT64",
			"parameters": {
				"content_type": "np"
			},
			"data": [
				2
			]
		}
	]
}
➜  ~ seldon model infer iris --inference-host 35.204.190.15:80 \                 
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
Error: Post "http://35.204.190.15:80/v2/models/iris/infer": dial tcp 35.204.190.15:80: connect: connection refused
➜  ~ seldon model infer iris --inference-host 34.141.255.75:80 \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
{
	"model_name": "iris_1",
	"model_version": "1",
	"id": "11eae1dc-dda5-4b22-92b0-1af20bd74d6f",
	"parameters": {},
	"outputs": [
		{
			"name": "predict",
			"shape": [
				1,
				1
			],
			"datatype": "INT64",
			"parameters": {
				"content_type": "np"
			},
			"data": [
				2
			]
		}
	]
}
➜  ~ curl -i http://34.141.255.75/v2/models/iris/infer \
 -H "Content-Type: application/json" \
 -H "seldon-model":iris \
 -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

HTTP/1.1 200 OK
ce-endpoint: iris_1
ce-id: 41dea464-a171-4dda-aeb7-682603e9fa59
ce-inferenceservicename: mlserver
ce-modelid: iris_1
ce-namespace: seldon-mesh
ce-requestid: 41dea464-a171-4dda-aeb7-682603e9fa59
ce-source: io.seldon.serving.deployment.mlserver.seldon-mesh
ce-specversion: 0.3
ce-type: io.seldon.serving.inference.response
content-length: 213
content-type: application/json
date: Fri, 29 Nov 2024 07:17:18 GMT
server: envoy
x-request-id: ct4mkvrc2nks739inntg
x-envoy-upstream-service-time: 9
x-seldon-route: :iris_1:

{"model_name":"iris_1","model_version":"1","id":"41dea464-a171-4dda-aeb7-682603e9fa59","parameters":{},"outputs":[{"name":"predict","shape":[1,1],"datatype":"INT64","parameters":{"content_type":"np"},"data":[2]}]}
➜  ~ kubectl get svc istio-ingressgateway -n istio-system
NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                                      AGE
istio-ingressgateway   LoadBalancer   34.118.224.41   35.204.190.15   15021:31959/TCP,80:32567/TCP,443:32475/TCP   41h
➜  ~ kubectl apply -f istio-seldon-gateway.yaml
gateway.networking.istio.io/seldon-gateway created
➜  ~ curl -i http://34.141.255.75/v2/models/iris/infer \ 
 -H "Content-Type: application/json" \
 -H "seldon-model":iris \
 -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

HTTP/1.1 200 OK
ce-endpoint: iris_1
ce-id: e380fbca-55c4-4cb7-b1cd-57576346248d
ce-inferenceservicename: mlserver
ce-modelid: iris_1
ce-namespace: seldon-mesh
ce-requestid: e380fbca-55c4-4cb7-b1cd-57576346248d
ce-source: io.seldon.serving.deployment.mlserver.seldon-mesh
ce-specversion: 0.3
ce-type: io.seldon.serving.inference.response
content-length: 213
content-type: application/json
date: Fri, 29 Nov 2024 07:21:47 GMT
server: envoy
x-request-id: ct4mn2rc2nks739innu0
x-envoy-upstream-service-time: 8
x-seldon-route: :iris_1:

{"model_name":"iris_1","model_version":"1","id":"e380fbca-55c4-4cb7-b1cd-57576346248d","parameters":{},"outputs":[{"name":"predict","shape":[1,1],"datatype":"INT64","parameters":{"content_type":"np"},"data":[2]}]}
➜  ~ curl -i http://35.204.190.15/v2/models/iris/infer \
 -H "Content-Type: application/json" \
 -H "seldon-model":iris \
 -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

HTTP/1.1 403 Forbidden
content-length: 19
content-type: text/plain
date: Fri, 29 Nov 2024 07:22:34 GMT
server: istio-envoy
connection: close

RBAC: access denied
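
The manual checks in the log above could also be scripted. The following is a minimal sketch, assuming `curl` is available; the function name `expect_status` and the variable name `INGRESS_IP` used in the note below are hypothetical and chosen only for illustration. It sends the same inference request as in the log and compares the returned HTTP status code with an expected value.

```shell
# Payload copied from the requests in the log above.
PAYLOAD='{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

# expect_status EXPECTED_CODE HOST [MODEL]
# Returns 0 when the inference endpoint answers with EXPECTED_CODE, 1 otherwise.
expect_status() {
  expected="$1"
  host="$2"
  model="${3:-iris}"
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code
  actual=$(curl -s -o /dev/null -w '%{http_code}' \
    "http://${host}/v2/models/${model}/infer" \
    -H "Content-Type: application/json" \
    -H "seldon-model: ${model}" \
    -d "$PAYLOAD")
  if [ "$actual" = "$expected" ]; then
    echo "OK: ${host} returned ${actual}"
  else
    echo "FAIL: ${host} returned ${actual}, expected ${expected}"
    return 1
  fi
}
```

For example, with `INGRESS_IP` set to the external IP of `istio-ingressgateway`, `expect_status 403 "$INGRESS_IP"` would be expected to succeed for a request without a JWT once both policies and the gateway are applied, matching the `403 Forbidden` response at the end of the log.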

@paulb-seldon (Contributor) left a comment:

Looks good. Thank you for this work, and for testing it out after the initial documentation!

@Rajakavitha1 merged commit 2abeb80 into v2 on Dec 9, 2024
3 of 5 checks passed
@Rajakavitha1 deleted the securing-end-points-rajie branch on December 9, 2024 13:01

@Rajakavitha1 (Contributor Author) commented:

Merging after confirming with @paulb-seldon.

driev added a commit to driev/seldon-core that referenced this pull request Dec 12, 2024
commit 373df43
Author: Lucian Carata <[email protected]>
Date:   Thu Dec 12 01:09:30 2024 +0000

    feat(k6): add scenario with multiple stages ramping up/down RPS (SeldonIO#6031)

    The added load test scenario allows one to configure an arbitrary number
    of stages, with each consisting of a linear ramp-up/down to the desired
    requests per second and a hold/plateau time.

    Within each stage, the duration for which the inference RPS is held constant
    is configured via one element in the `CONSTANT_RATE_DURATIONS_SECONDS`
    environment variable (a vector of comma separated values), with the ramp-up/
    down duration preceding it being 1/3rd of the hold time.

commit 34cf313
Author: paulb-seldon <[email protected]>
Date:   Wed Dec 11 16:59:20 2024 +0000

    fix(docs): Docs on upgrading from 2.7 - 2.8 (SeldonIO#6143)

    * Docs on upgrading from 2.7 - 2.8

    * Wording update

commit 1c40f62
Author: Sherif Akoush <[email protected]>
Date:   Wed Dec 11 14:32:40 2024 +0000

    fix: Add timeout to contexts in client calls (SeldonIO#6125)

    * add timeout context from infer call for modelgateway

    * add timeout context to pipeline gateway

    * set timeout context on process request

    * add a test for grpc call timeout

    * add agent k8s api call timeout

    * add context timeout for shutting down services

    * add timeout for controller k8s api calls

    * add timeout for control plane context

    * add timeout context to reconcile logic

    * pr comments

commit 74032a4
Author: paulb-seldon <[email protected]>
Date:   Tue Dec 10 17:17:14 2024 +0000

    Format spaces in install docs (SeldonIO#6140)

commit 7e6c8f1
Author: Sherif Akoush <[email protected]>
Date:   Tue Dec 10 16:32:37 2024 +0000

    fix(docs): add a table for core 2 dependencies in docs (SeldonIO#6139)

    * add table for core 2 deps in docs

    * review comments

commit c1d320e
Author: Niall D <[email protected]>
Date:   Tue Dec 10 16:16:55 2024 +0000

    feat(scheduler): account for multiple instances of a model per server when scheduling (SeldonIO#6054)

    * just checking in whatever I have

    * testing all the code

    * remove comment

    * linting

    * document unused param

    * changing the proto around

    * use parallelWorkers instead of instanceCount for mlserver

    * comma

    * rename ModelConfig

    * use modelWithVersion as param

commit a7bfb00
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Dec 9 21:35:13 2024 +0000

    Bump grafana/grafana from 11.3.1 to 11.4.0 in /scheduler (SeldonIO#6133)

    Bumps grafana/grafana from 11.3.1 to 11.4.0.

    ---
    updated-dependencies:
    - dependency-name: grafana/grafana
      dependency-type: direct:production
      update-type: version-update:semver-minor
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit f129bd1
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Dec 9 21:33:47 2024 +0000

    Bump envoyproxy/envoy from v1.32.1 to v1.32.2 in /scheduler (SeldonIO#6134)

    Bumps envoyproxy/envoy from v1.32.1 to v1.32.2.

    ---
    updated-dependencies:
    - dependency-name: envoyproxy/envoy
      dependency-type: direct:production
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 208791b
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Dec 9 21:31:49 2024 +0000

    Bump google.golang.org/grpc from 1.68.0 to 1.68.1 in /hodometer (SeldonIO#6136)

    Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.68.0 to 1.68.1.
    - [Release notes](https://github.com/grpc/grpc-go/releases)
    - [Commits](grpc/grpc-go@v1.68.0...v1.68.1)

    ---
    updated-dependencies:
    - dependency-name: google.golang.org/grpc
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 2abeb80
Author: Rajakavitha Kodhandapani <[email protected]>
Date:   Mon Dec 9 18:31:14 2024 +0530

    fix(docs): first draft of the securing endpoints (SeldonIO#5991)

    * first draft of the securing endpoints

    * added the output

    * updated the policy name

    * added a note

    * Added context, minor grammar edits

    * Update docs-gb/models/securing-endpoints.md

    Co-authored-by: Rajakavitha Kodhandapani <[email protected]>

    * incorporate review suggestions

    * fixing the links

    * added an example for all models

    * removed the example to create a vs for all models

    * fixed formatting

    * formatting changes

    * Update securing-endpoints.md

    * added a link to the services meshes main docs page

    ---------

    Co-authored-by: Rakavitha Kodhandapani <[email protected]>
    Co-authored-by: Paul Bridi <[email protected]>
    Co-authored-by: paulb-seldon <[email protected]>

commit 4125273
Author: Niall D <[email protected]>
Date:   Fri Dec 6 13:52:35 2024 +0000

    refactor(envoy): moving envoy/resources headers to util (SeldonIO#6129)

    * moving headers to util

    * removing a newline

    * lint

commit f284b4a
Author: Sherif Akoush <[email protected]>
Date:   Fri Dec 6 09:45:15 2024 +0000

    fix(cli): Kafka inspect output formatting (SeldonIO#6130)

    * add kafka inspect consumer timeout (-d) as parameter

    * add formatting

commit 6d89d57
Author: Lucian Carata <[email protected]>
Date:   Fri Dec 6 01:51:54 2024 +0000

    feat(docs): improve HPA documentation (SeldonIO#6091)

    * highlight constraints and limitations of a HPA-based approach
    * remove note on statefulsets being created sequentially - we are specifically configuring k8s to allow for parallel creation of statefulset pods.
    * highlight importance of the `metrics-relist-interval` setting
    * simplify config example to no longer use regex metric matches
    * clarify example using HPA label selectors
    * clarify the need to use the `AverageValue` target type
    * clarify the relation between query rate window size and prometheus scrape interval
Merge branch 'v2' into INFRA-1420/add-clusters-before-updating-routes-part-2

4 participants