feat(ray): support containerized model serving #116

Merged
merged 4 commits into from
Mar 20, 2024

@heiruwu (Member) commented Mar 20, 2024

Because

  • We are going to support containerized model serving with Instill Model

This commit

  • add deployment handle return
  • add modules for building and pushing model image
  • expose cpu/gpu/memory resource allocation configs
  • update instill_deployable decorator
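To illustrate what exposing cpu/gpu/memory allocation through a decorator can look like, here is a minimal sketch. It is not the SDK's actual `instill_deployable` implementation: `ResourceConfig`, `deployable`, and `EchoModel` are hypothetical names introduced for illustration, and the option keys (`num_cpus`, `num_gpus`, `memory`) are assumed to mirror the resource option names Ray uses.

```python
from dataclasses import dataclass
from typing import Callable, Type


@dataclass
class ResourceConfig:
    """Hypothetical container for per-replica resource requests."""

    num_cpus: float = 1.0
    num_gpus: float = 0.0
    memory_bytes: int = 0  # 0 means "no explicit memory request" in this sketch

    def to_actor_options(self) -> dict:
        # Emit only the keys that were explicitly requested, so that
        # unset resources fall back to the scheduler's defaults.
        options = {"num_cpus": self.num_cpus}
        if self.num_gpus > 0:
            options["num_gpus"] = self.num_gpus
        if self.memory_bytes > 0:
            options["memory"] = self.memory_bytes
        return options


def deployable(resources: ResourceConfig) -> Callable[[Type], Type]:
    """Hypothetical decorator: attach resource options to a model class."""

    def wrap(cls: Type) -> Type:
        cls.actor_options = resources.to_actor_options()
        return cls

    return wrap


@deployable(ResourceConfig(num_cpus=2, num_gpus=1, memory_bytes=4 * 1024**3))
class EchoModel:
    """Toy model that returns its input unchanged."""

    def __call__(self, text: str) -> str:
        return text
```

Keeping the resource values in a separate config object means they can be overridden (for example from environment variables) without touching the model class itself.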

heiruwu and others added 3 commits March 20, 2024 15:11
Because

- We are going to support containerized model serving with `Instill Model`

This commit

- add deployment handle return
- add modules for building and pushing model image
- expose cpu/gpu/memory resource allocation configs
- update `instill_deployable` decorator
Because

- A missing `CUDA_SUFFIX` ARG causes the model container to lose GPU capability

This commit

- add missing cuda suffix arg
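
As a rough illustration of why the ARG matters: a value passed with `--build-arg` only takes effect if the Dockerfile declares a matching `ARG`, otherwise references to it expand to an empty string. The sketch below composes a `docker build` command line that always passes `CUDA_SUFFIX`. The helper name `docker_build_argv`, the image name, and the `-cuda` suffix value are assumptions for illustration, not taken from this PR.

```python
from typing import List, Optional


def docker_build_argv(
    image: str,
    tag: str,
    cuda_suffix: str = "",
    platform: Optional[str] = None,
) -> List[str]:
    """Compose a `docker build` argv list (hypothetical helper)."""
    argv = ["docker", "build", "-t", f"{image}:{tag}"]
    if platform:
        argv += ["--platform", platform]
    # Pass the suffix explicitly, even when empty, so CPU and GPU builds
    # go through the same code path. The Dockerfile is assumed to declare
    # `ARG CUDA_SUFFIX`; without that declaration the value is ignored.
    argv += ["--build-arg", f"CUDA_SUFFIX={cuda_suffix}"]
    argv.append(".")  # build context: current directory
    return argv


gpu_cmd = docker_build_argv("example/model", "v1", cuda_suffix="-cuda")
cpu_cmd = docker_build_argv("example/model", "v1", platform="linux/amd64")
```

The resulting list can be handed to `subprocess.run(argv, check=True)` to invoke the Docker CLI.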

codecov bot commented Mar 20, 2024

Codecov Report

Attention: Patch coverage is 0%, with 107 lines in your changes missing coverage. Please review.

Project coverage is 24.16%. Comparing base (3fd5914) to head (0182cad).

| Files | Patch % | Lines |
| --- | --- | --- |
| instill/helpers/build.py | 0.00% | 41 Missing ⚠️ |
| instill/helpers/push.py | 0.00% | 40 Missing ⚠️ |
| instill/helpers/ray_config.py | 0.00% | 25 Missing ⚠️ |
| instill/helpers/const.py | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #116      +/-   ##
==========================================
- Coverage   24.49%   24.16%   -0.34%     
==========================================
  Files         189      191       +2     
  Lines        6609     6700      +91     
  Branches     1047     1061      +14     
==========================================
  Hits         1619     1619              
- Misses       4973     5064      +91     
  Partials       17       17              
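
The figures in this report can be cross-checked with a few lines of arithmetic. The sketch below assumes coverage is computed as hits divided by total lines (hits + misses + partials); the exact rounding Codecov applies is not stated here, so the results are only expected to agree with the reported percentages to within about 0.01 percentage points.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, assuming hits / total lines."""
    return 100.0 * hits / lines


# Base (main) and head (#116) totals taken from the diff table above.
base = coverage_percent(hits=1619, lines=6609)
head = coverage_percent(hits=1619, lines=6700)

# Per-file missing lines from the patch table sum to the reported total.
patch_missing = 41 + 40 + 25 + 1

# No new hits were added, so every added line shows up as a miss.
added_lines = 6700 - 6609
added_misses = 5064 - 4973

print(f"base={base:.2f}% head={head:.2f}% patch_missing={patch_missing}")
```

Because hits stay at 1619 while the line count grows by 91, the project percentage drops even though no existing coverage was lost.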


@heiruwu heiruwu merged commit ad0f250 into main Mar 20, 2024
10 checks passed
@heiruwu heiruwu deleted the dockerized branch March 20, 2024 07:25
heiruwu pushed a commit that referenced this pull request Apr 25, 2024
🤖 I have created a release *beep* *boop*
---


## [0.8.0](v0.7.1...v0.8.0) (2024-04-25)


### Features

* **deps:** upgrade ray version ([cc61b85](cc61b85))
* **ray:** adapt to native docker client instead of docker sdk ([#138](#138)) ([7d19ccb](7d19ccb))
* **ray:** add accelerator and custom resource support ([#118](#118)) ([f974f98](f974f98))
* **ray:** add llava 13b to predeploy list ([3fd5914](3fd5914))
* **ray:** add metadata and infer constructor for llm tasks ([#137](#137)) ([be122d1](be122d1))
* **ray:** generate sha256 as tag if not presented ([#120](#120)) ([6abb538](6abb538))
* **ray:** inject accelerator type at runtime ([#121](#121)) ([f78a2d0](f78a2d0))
* **ray:** support containerized model serving ([#116](#116)) ([ad0f250](ad0f250))
* **ray:** support custom accelerator type ([#134](#134)) ([ae6c139](ae6c139))
* **ray:** use env for resource and deprecate deploy/undeploy ([#124](#124)) ([a58bc50](a58bc50))
* **ray:** use tmp folder for image building ([#122](#122)) ([9512cec](9512cec))


### Bug Fixes

* **deps:** downgrade ray to avoid grpc servicer issue ([#128](#128)) ([9ead421](9ead421))
* **dockerfile:** avoid build hang at ARG statement ([#130](#130)) ([f02a27c](f02a27c))
* **ray:** fix entrypoint module not found ([#126](#126)) ([f1ed83d](f1ed83d))
* **ray:** fix missing default resource value ([#129](#129)) ([b2f564a](b2f564a))
* **ray:** fix multi-platform build stage ([6f358fd](6f358fd))
* **ray:** support target platform for image building ([#127](#127)) ([f4825fc](f4825fc))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).