fix service url of flanT5 #39

Merged 1 commit on Apr 16, 2024


rcarrata
Collaborator

Fixes issue #35, where the service URL still pointed to the older Ollama service, which has been reverted in favor of FlanT5.

Checked in the Insurance Claim Jupyter Notebook after switching to the fix branch, and it works like a charm:

Success: Minio is reachable on minio.ic-shared-minio.svc.cluster.local:9000
Success: Gitea is reachable on gitea.gitea.svc.cluster.local:3000
Success: Postgres Database is reachable on claimdb.ic-shared-db.svc.cluster.local:5432
Success: LLM Service is reachable on llm.ic-shared-llm.svc.cluster.local:8000
Success: LLM Service-FlanT5 is reachable on llm-flant5.ic-shared-llm.svc.cluster.local:3000
Success: ModelMesh is reachable on modelmesh-serving.ic-shared-img-det.svc.cluster.local:8033
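
For readers who want to reproduce a check like the one above outside the notebook, here is a minimal sketch in Python. It assumes a plain TCP connection attempt is enough to count a service as reachable. The helper name `check_service`, the `SERVICES` list name, and the wording of the failure line are hypothetical; the notebook's actual implementation is not shown in this PR. The host names and ports are copied from the output above.

```python
import socket

# Hypothetical helper: the notebook's real check is not shown in this PR.
def check_service(name: str, host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and return a one-line status message."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return f"Success: {name} is reachable on {host}:{port}"
    except OSError as exc:
        # Covers DNS failures, refused connections, and timeouts.
        # The "Failure" wording is an assumption, not taken from the notebook.
        return f"Failure: {name} is not reachable on {host}:{port} ({exc})"

# Endpoints copied from the output in this PR description.
SERVICES = [
    ("Minio", "minio.ic-shared-minio.svc.cluster.local", 9000),
    ("Gitea", "gitea.gitea.svc.cluster.local", 3000),
    ("Postgres Database", "claimdb.ic-shared-db.svc.cluster.local", 5432),
    ("LLM Service", "llm.ic-shared-llm.svc.cluster.local", 8000),
    ("LLM Service-FlanT5", "llm-flant5.ic-shared-llm.svc.cluster.local", 3000),
    ("ModelMesh", "modelmesh-serving.ic-shared-img-det.svc.cluster.local", 8033),
]
```

From a workbench inside the cluster, looping over `SERVICES` and printing `check_service(name, host, port)` for each entry should give one status line per service. Outside the cluster, the `svc.cluster.local` names will not resolve, so every entry would report a failure.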

rcarrata assigned rcarrata and unassigned rcarrata on Apr 16, 2024
@rcarrata
Collaborator Author

@guimou @erwangranger @RHRolun could you take a look and merge this one to close the issue?

erwangranger force-pushed the fix/#35-ollama-checks-failed branch from 6e5b94b to eeebe25 on April 16, 2024 16:42
erwangranger merged commit 60d41a3 into dev on Apr 16, 2024
1 check passed
erwangranger deleted the fix/#35-ollama-checks-failed branch on April 16, 2024 16:43
guimou added a commit that referenced this pull request Apr 19, 2024
* adding pre-puller for pipeline images (#139)

* adding pre-puller for pipeline images

* Predicting the same values

* requested changes for section 5

* quick improvements.

* cleaning trailing dots

* spelling

* link to contributors

* Lowered the requirement for acceptable response similarity

* Removing model_sha as it seems volatile

* Updated 03/06 to use VLLM and improved response time

Removed streaming and fixed the response quality

Updated requirements for VLLM

Updated responsetime to hit the text generation endpoint

* minor fixes in lab1

* minor fixes in lab2 and update images from 2.8

* minor fixes in lab4 and update images for serving in 2.8

* minor adjustments in lab6

* Update to vLLM and Ollama for model serving

* Better leverage antora variables (#170)

* add antora rhoai, argocd and ocp vars

* added lab vars

* minor typo fix

* update theme for Summit (#171)

* Started drafting the instructions for DIY vs AUTO on the projects (#176)

* 02-02 diy vs auto - start

* split up of 02-03

* disclaimer at top of diy versions

* wordlist addition

* Added instructions for locally downloading YAML (#167)

* prepare for main

* prepare for merge to main

* add pre-puller for pipeline images

* Added instructions for downloading

* use dev branch instead of main

---------

Co-authored-by: Guillaume Moutier <[email protected]>
Co-authored-by: Guillaume Moutier <[email protected]>

* 03/06 and 05/05 pipelines working (#172)

* 03/06 and 05/05 pipelines working

* New image with openai deps

* Feature/streamline section 4 (#174)

* micro changes for clarity

* marking a chunk of section 04 as optional

* spellcheck

* Feat/rename sanity pipeline (#177)

* Updated the file name

* Images and guide updated

* Instructions + Working GitOps (#180)

* Fixed navigation (#195)

* Update workbench image version to 2.0.1 (rh-aiservices-bu#204)

* updating prepulled images to match 2.8.1 (rh-aiservices-bu#205)

* * hide diy (rh-aiservices-bu#206)

* fix the RH1-related stuff.
* formatting and bullet points
* specify user again

* Fix/sanity (rh-aiservices-bu#208)

* removing references to sanity.
* adding confidence pipeline image to puller. speeds up pipelines.
* also copied the image into a repo with a new name in quay.io

* change repo reference

* Fix #11 -  Revert model comparison to Flan-T5 (#26)

* Revert to Flan-T5

Remove CUDA from container

update packages and switch to CPU torch

update workbenches IS

update deployment

switch back to 1.3

update wb image

* update conclusion

* quick readme update (#27)

* adding parasol insurance throughout and updating titles (#28)

* Fix/remove auto (#30)

* removing auto from code

* adjusting instructions (pre-created is better than auto-created)

* git clone the right project

* adjust paths to new project name (#32)

* 05-05 changes branch to dev

* Added a step for the user to go to admin view (#37)

* fix service url of flanT5 (#39)

* WIP: Fixes (#40)

* what type of pipeline

* openAI clarifications

* let's not force it. users should see choice between kserve and modelmesh

* refreshed screenshots

* added jupyter notebook guide and rearrange order (#42)

* bump minio buckets to 101 (#44)

* added run button (#49)

* repo name in image fix

* Change userX to variable

* add token to vllm deployment (#53)

fix namespace

* New App (#54)

New look + RAG
Workbench update
Content for part 3 - RAG

* add demo app (#55)

fix bootstrap

* adapt files for main

---------

Co-authored-by: Erwan Granger <[email protected]>
Co-authored-by: RHRolun <[email protected]>
Co-authored-by: adrezni <[email protected]>
Co-authored-by: rcarrata <[email protected]>
Co-authored-by: Cedric Clyburn <[email protected]>
Co-authored-by: RHRolun <[email protected]>
ritzshah pushed a commit to redhat-gpte-devopsautomation/rad-model-deployment that referenced this pull request Aug 14, 2024
2 participants