feat(ray): support containerized model serving #116

heiruwu · 2024-03-20T07:12:25Z

Because

- We are going to support containerized model serving with `Instill Model`

This commit

- add deployment handle return
- add modules for building and pushing model image
- expose cpu/gpu/memory resource allocation configs
- update `instill_deployable` decorator
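As a rough illustration of what exposing cpu/gpu/memory allocation can look like, here is a minimal sketch. `ResourceConfig` and its fields are hypothetical names, not the SDK's actual API; the only outside fact relied on is that Ray Serve's `ray_actor_options` accepts the keys `num_cpus`, `num_gpus`, and `memory`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ResourceConfig:
    """Hypothetical container for per-replica resource requests."""

    num_cpus: float = 1.0
    num_gpus: float = 0.0
    memory_bytes: Optional[int] = None  # None means no explicit memory request

    def to_ray_actor_options(self) -> Dict[str, Any]:
        """Build a dict shaped like Ray Serve's `ray_actor_options`."""
        if self.num_cpus < 0 or self.num_gpus < 0:
            raise ValueError("resource requests must be non-negative")
        options: Dict[str, Any] = {
            "num_cpus": self.num_cpus,
            "num_gpus": self.num_gpus,
        }
        # Only include the memory key when a request was actually made.
        if self.memory_bytes is not None:
            options["memory"] = self.memory_bytes
        return options


# Example: half a GPU, two CPUs, and 4 GiB of memory per replica.
gpu_config = ResourceConfig(num_cpus=2, num_gpus=0.5, memory_bytes=4 * 1024**3)
print(gpu_config.to_ray_actor_options())
```

Keeping the requests in one small object means the decorator or deployment wrapper only has to forward a single dict, whatever the real option names turn out to be.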



codecov · 2024-03-20T07:25:01Z

Codecov Report

Attention: Patch coverage is 0%, with 107 lines in your changes missing coverage. Please review.

Project coverage is 24.16%. Comparing base (3fd5914) to head (0182cad).

| Files | Patch % | Lines |
|---|---|---|
| instill/helpers/build.py | 0.00% | 41 Missing ⚠️ |
| instill/helpers/push.py | 0.00% | 40 Missing ⚠️ |
| instill/helpers/ray_config.py | 0.00% | 25 Missing ⚠️ |
| instill/helpers/const.py | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #116      +/-   ##
==========================================
- Coverage   24.49%   24.16%   -0.34%
==========================================
  Files         189      191       +2
  Lines        6609     6700      +91
  Branches     1047     1061      +14
==========================================
  Hits         1619     1619
- Misses       4973     5064      +91
  Partials       17       17
```


🤖 I have created a release *beep* *boop*

---

## [0.8.0](v0.7.1...v0.8.0) (2024-04-25)

### Features

* **deps:** upgrade ray version ([cc61b85](cc61b85))
* **ray:** adapt to native docker client instead of docker sdk ([#138](#138)) ([7d19ccb](7d19ccb))
* **ray:** add accelerator and custom resource support ([#118](#118)) ([f974f98](f974f98))
* **ray:** add llava 13b to predeploy list ([3fd5914](3fd5914))
* **ray:** add metadata and infer constructor for llm tasks ([#137](#137)) ([be122d1](be122d1))
* **ray:** generate sha256 as tag if not presented ([#120](#120)) ([6abb538](6abb538))
* **ray:** inject accelerator type at runtime ([#121](#121)) ([f78a2d0](f78a2d0))
* **ray:** support containerized model serving ([#116](#116)) ([ad0f250](ad0f250))
* **ray:** support custom accelerator type ([#134](#134)) ([ae6c139](ae6c139))
* **ray:** use env for resource and deprecate deploy/undeploy ([#124](#124)) ([a58bc50](a58bc50))
* **ray:** use tmp folder for image building ([#122](#122)) ([9512cec](9512cec))

### Bug Fixes

* **deps:** downgrade ray to avoid grpc servicer issue ([#128](#128)) ([9ead421](9ead421))
* **dockerfile:** avoid build hang at ARG statement ([#130](#130)) ([f02a27c](f02a27c))
* **ray:** fix etrypoint module not found ([#126](#126)) ([f1ed83d](f1ed83d))
* **ray:** fix missing default resource value ([#129](#129)) ([b2f564a](b2f564a))
* **ray:** fix multi-platform build stage ([6f358fd](6f358fd))
* **ray:** support target platform for image building ([#127](#127)) ([f4825fc](f4825fc))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

heiruwu and others added 3 commits March 20, 2024 15:11

fix(ray): add missing cuda suffix arg (#115)

c02f5d7

Because

- Missing `CUDA_SUFFIX` ARG will cause model container to lose GPU capability

This commit

- add missing cuda suffix arg
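To make the failure mode concrete, the sketch below assembles a `docker build` invocation that always passes `CUDA_SUFFIX` as a build argument. The helper name `docker_build_command`, the `-gpu` value, and the way the Dockerfile consumes the argument are all assumptions for illustration, not taken from this repository; only the `docker build` flags `--build-arg` and `-t` are standard.

```python
from typing import List


def docker_build_command(image_tag: str, context_dir: str, use_gpu: bool) -> List[str]:
    """Return an argv list for `docker build` that always sets CUDA_SUFFIX.

    The "-gpu" value is an assumed convention; an empty string selects a
    CPU-only base image in this sketch.
    """
    cuda_suffix = "-gpu" if use_gpu else ""
    return [
        "docker",
        "build",
        "--build-arg",
        f"CUDA_SUFFIX={cuda_suffix}",  # forgetting this arg is the bug described above
        "-t",
        image_tag,
        context_dir,
    ]


# Example: print the command instead of running it, so no Docker daemon is needed.
print(" ".join(docker_build_command("my-model:latest", ".", use_gpu=True)))
```

Returning an argv list rather than a shell string keeps the command safe to hand to `subprocess.run` without quoting issues.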

chore: add back variables for backward compatibility

b0efdb3

droplet-bot added the instill core label Mar 20, 2024

chore: fix format

0182cad

heiruwu force-pushed the dockerized branch from 55f887c to 0182cad Compare March 20, 2024 07:23

heiruwu merged commit ad0f250 into main Mar 20, 2024
10 checks passed

heiruwu deleted the dockerized branch March 20, 2024 07:25

droplet-bot mentioned this pull request Mar 20, 2024

chore(main): release 0.8.0 #106

Merged
