
Revert model comparison to Flan-T5 #11

Closed
guimou opened this issue Apr 12, 2024 · 0 comments
Comments

@guimou
Collaborator

guimou commented Apr 12, 2024

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

@guimou guimou self-assigned this Apr 12, 2024
@guimou guimou transferred this issue from rh-aiservices-bu/insurance-claim-processing Apr 15, 2024
guimou added a commit that referenced this issue Apr 15, 2024
* Revert to Flan-T5

Remove CUDA from container

update packages and switch to CPU torch

update workbenches IS

update deployment

switch back to 1.3

update wb image

* update conclusion
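
The commit above reverts the comparison model to Flan-T5 and moves the workbench to CPU-only PyTorch (no CUDA in the container). A minimal sketch of what that looks like in a notebook, assuming the `google/flan-t5-large` checkpoint and a CPU-only torch wheel; the exact model size and package pins used in the workbench image are not stated in this issue:

```python
# Minimal sketch: run Flan-T5 on CPU with transformers.
# Assumptions: google/flan-t5-large is a hypothetical choice of checkpoint, and the
# CPU-only torch wheel is installed (e.g. from the https://download.pytorch.org/whl/cpu
# index); the actual packages in the workbench image may differ.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "google/flan-t5-large"  # hypothetical model size
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)  # loads on CPU, no CUDA required

prompt = "Summarize the following insurance claim: ..."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```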
@guimou guimou closed this as completed Apr 15, 2024
guimou added a commit that referenced this issue Apr 19, 2024
* adding pre-puller for pipeline images (#139)

* adding pre-puller for pipeline images

* Predicting the same values

* requested changes for section 5

* quick improvements.

* cleaning trailing dots

* spelling

* link to contributors

* Lowered the requirement for acceptable response similarity

* Removing model_sha as it seems volatile

* Updated 03/06 to use VLLM and improved response time

Removed streaming and fixed the response quality

Updated requirements for VLLM

Updated responsetime to hit the text generation endpoint

* minor fixes in lab1

* minor fixes in lab2 and update images from 2.8

* minor fixes in lab4 and update images for serving in 2.8

* minor adjustments in lab6

* Update to vLLM and Ollama for model serving

* Better leverage antora variables (#170)

* add antora rhoai, argocd and ocp vars

* added lab vars

* minor typo fix

* update theme for Summit (#171)

* Started drafting the instructions for DIY vs AUTO on the projects (#176)

* 02-02 diy vs auto - start

* split up of 02-03

* disclaimer at top of diy versions

* wordlist addition

* Added instructions for locally downloading YAML (#167)

* prepare for main

* prepare for merge to main

* add pre-puller for pipeline images

* Added instructions for downloading

* use dev branch instead of main

---------

Co-authored-by: Guillaume Moutier <[email protected]>
Co-authored-by: Guillaume Moutier <[email protected]>

* 03/06 and 05/05 pipelines working (#172)

* 03/06 and 05/05 pipelines working

* New image with openai deps

* Feature/streamline section 4 (#174)

* minor changes for clarity

* marking a chunk of section 04 as optional

* spellcheck

* Feat/rename sanity pipeline (#177)

* Updated the file name

* Images and guide updated

* Instructions + Working GitOps (#180)

* Fixed navigation (#195)

* Update workbench image version to 2.0.1 (rh-aiservices-bu#204)

* updating prepulled images to match 2.8.1 (rh-aiservices-bu#205)

* * hide diy (rh-aiservices-bu#206)

* fix the RH1-related stuff.
* formatting and bullet points
* specify user again

* Fix/sanity (rh-aiservices-bu#208)

* removing references to sanity.
* adding confidence pipeline image to puller. speeds up pipelines.
* also copied the image into a repo with a new name in quay.io

* change repo reference

* Fix #11 -  Revert model comparison to Flan-T5 (#26)

* Revert to Flan-T5

Remove CUDA from container

update packages and switch to CPU torch

update workbenches IS

update deployment

switch back to 1.3

update wb image

* update conclusion

* quick readme update (#27)

* adding parasol insurance throughout and updating titles (#28)

* Fix/remove auto (#30)

* removing auto from code

* adjusting instructions (pre-created is better than auto-created)

* git clone the right project

* adjust paths to new project name (#32)

* 05-05 changes branch to dev

* Added a step for the user to go to admin view (#37)

* fix service url of flanT5 (#39)

* WIP: Fixes (#40)

* what type of pipeline

* openAI clarifications

* let's not force it. users should see choice between kserve and modelmesh

* refreshed screenshots

* added jupyter notebook guide and rearrange order (#42)

* bump minio buckets to 101 (#44)

* added run button (#49)

* repo name in image fix

* Change userX to variable

* add token to vllm deployment (#53)

fix namespace

* New App (#54)

New look + RAG
Workbench update
Content for part 3 - RAG

* add demo app (#55)

fix bootstrap

* adapt files for main

---------

Co-authored-by: Erwan Granger <[email protected]>
Co-authored-by: RHRolun <[email protected]>
Co-authored-by: adrezni <[email protected]>
Co-authored-by: rcarrata <[email protected]>
Co-authored-by: Cedric Clyburn <[email protected]>
Co-authored-by: RHRolun <[email protected]>
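
Several of the commits listed above move model serving to vLLM and measure latency against its OpenAI-compatible text generation endpoint ("Updated 03/06 to use VLLM and improved response time", "Updated responsetime to hit the text generation endpoint"). A minimal sketch of such a timing check; the endpoint URL, model name, and token below are placeholders, none of which are given in this issue:

```python
# Minimal sketch: time a single completion against a vLLM OpenAI-compatible endpoint.
# Assumptions: URL, model name, and token are hypothetical; the actual route and auth
# are defined by the lab's deployment manifests.
import time
import requests

URL = "https://vllm.example.com/v1/completions"    # hypothetical route
TOKEN = "changeme"                                 # hypothetical auth token
payload = {
    "model": "mistralai/Mistral-7B-Instruct-v0.2", # hypothetical model name
    "prompt": "Summarize the following insurance claim: ...",
    "max_tokens": 128,
    "temperature": 0.0,
}

start = time.perf_counter()
resp = requests.post(URL, json=payload,
                     headers={"Authorization": f"Bearer {TOKEN}"}, timeout=60)
resp.raise_for_status()
elapsed = time.perf_counter() - start
print(f"response time: {elapsed:.2f}s")
print(resp.json()["choices"][0]["text"])
```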
ritzshah pushed a commit to redhat-gpte-devopsautomation/rad-model-deployment that referenced this issue Aug 14, 2024
* adding code to deploy minio via gitops
*  PVC of 50 GB
* creates 50 buckets
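
The commit above deploys MinIO via GitOps with a 50 GB PVC and creates 50 buckets. A minimal sketch of the bucket-creation step using the minio Python client; the endpoint, credentials, and naming scheme are placeholders, and the actual GitOps setup may instead create buckets from a Job or MinIO's own tooling:

```python
# Minimal sketch: create 50 buckets against a MinIO endpoint.
# Assumptions: endpoint, credentials, and the "userN" naming scheme are hypothetical.
from minio import Minio

client = Minio(
    "minio.example.com:9000",  # hypothetical endpoint
    access_key="minio",        # hypothetical credentials
    secret_key="minio123",
    secure=False,
)

for i in range(1, 51):
    bucket = f"user{i}"        # hypothetical naming scheme
    if not client.bucket_exists(bucket):
        client.make_bucket(bucket)
        print(f"created bucket {bucket}")
```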
ritzshah pushed a commit to redhat-gpte-devopsautomation/rad-model-deployment that referenced this issue Aug 14, 2024
Fix #11 - Revert model comparison to Flan-T5 (rh-aiservices-bu#26)

* Revert to Flan-T5

Remove CUDA from container

update packages and switch to CPU torch

update workbenches IS

update deployment

switch back to 1.3

update wb image

* update conclusion