Mount failed for \mnt\opea-models #522

Closed
ezelanza opened this issue Aug 2, 2024 · 2 comments

Comments

ezelanza commented Aug 2, 2024

Hi, I've followed the instructions to create a Kubernetes cluster on Xeon without GMC: https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/kubernetes/manifests/README.md

My environment:
- AWS EKS 1.30
- Node: M7i instances

kubectl get pods
NAME                                       READY   STATUS              RESTARTS         AGE
chatqna-79d8c5ffff-gl44s                   1/1     Running             0                94m
chatqna-data-prep-59cbff99d7-chct8         1/1     Running             0                94m
chatqna-embedding-usvc-97f8f5f5f-zrxlf     1/1     Running             0                94m
chatqna-llm-uservice-7f744cd68b-4kwzt      0/1     Running             8 (10m ago)      94m
chatqna-redis-vector-db-5dcd98f579-bpjtc   1/1     Running             0                94m
chatqna-reranking-usvc-8666ffcf65-tmn9s    1/1     Running             0                94m
chatqna-retriever-usvc-77b545d47f-n2cqp    0/1     Running             17 (7m21s ago)   94m
chatqna-tei-74b9fd8f64-mrhjd               0/1     ContainerCreating   0                94m
chatqna-teirerank-68b7794ff6-mfw67         0/1     ContainerCreating   0                94m
chatqna-tgi-bf455497b-jfxkr                0/1     ContainerCreating   0                94m

Logs

kubectl describe pod chatqna-tgi-bf455497b-jfxkr
Name:             chatqna-tgi-bf455497b-jfxkr
Namespace:        default
Priority:         0
Service Account:  default
Node:             ip-172-31-66-47.ec2.internal/172.31.66.47
Start Time:       Fri, 02 Aug 2024 13:31:48 -0400
Labels:           app.kubernetes.io/instance=chatqna
                  app.kubernetes.io/name=tgi
                  pod-template-hash=bf455497b
Annotations:      <none>
Status:           Pending
IP:               
IPs:              <none>
Controlled By:    ReplicaSet/chatqna-tgi-bf455497b
Containers:
  tgi:
    Container ID:   
    Image:          ghcr.io/huggingface/text-generation-inference:2.1.0
    Image ID:       
    Port:           2080/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment Variables from:
      chatqna-tgi-config  ConfigMap  Optional: false
    Environment:          <none>
    Mounts:
      /data from model-volume (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mmwgw (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   False 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  model-volume:
    Type:          HostPath (bare host directory volume)
    Path:          /mnt/opea-models
    HostPathType:  Directory
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-mmwgw:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                   From     Message
  ----     ------       ----                  ----     -------
  Warning  FailedMount  2m30s (x53 over 94m)  kubelet  MountVolume.SetUp failed for volume "model-volume" : hostPath type check failed: /mnt/opea-models is not a directory

FYI

curl http://localhost:8888/v1/chatqna \
    -H 'Content-Type: application/json' \
    -d '{"messages": "What is the revenue of Nike in 2023?"}'
Internal Server Error

lianhao (Collaborator) commented Aug 3, 2024

Please pay attention to the following NOTES from README:

You need to make sure you have created the directory /mnt/opea-models to save the cached model on the node where the ChatQnA workload is running. Otherwise, you need to modify the chatqna.yaml file to change the model-volume to a directory that exists on the node.
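
The first option in the note can be sketched as a small shell helper. This is only an illustration, assuming you have shell access to the worker node where the pods are scheduled (for example over SSH) and enough privileges to write under /mnt. The function name `ensure_model_dir` is hypothetical and not part of the README or the manifests. The manifest declares the volume with HostPathType `Directory`, so the kubelet check fails unless the path already exists as a directory.

```shell
#!/bin/sh
# Hypothetical helper (not from the README): make sure a hostPath target
# exists as a directory. Run it on the worker node itself, not inside a pod.
ensure_model_dir() {
    dir="$1"
    if [ -d "$dir" ]; then
        # Already a directory: nothing to do.
        echo "ok: $dir already exists"
    elif [ -e "$dir" ]; then
        # Something that is not a directory occupies the path; leave it alone.
        echo "error: $dir exists but is not a directory" >&2
        return 1
    else
        # Path is missing: create it, including any missing parents.
        mkdir -p "$dir" && echo "created: $dir"
    fi
}

# Example call on the node (assumption: run as root or via sudo):
#   ensure_model_dir /mnt/opea-models
```

If the cluster has several worker nodes, the directory presumably has to exist on every node where the model-serving pods can be scheduled, since a hostPath volume refers to the local filesystem of whichever node runs the pod.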

ezelanza (Author) commented Aug 3, 2024

> Please pay attention to the following NOTES from README:
>
> You need to make sure you have created the directory /mnt/opea-models to save the cached model on the node where the ChatQnA workload is running. Otherwise, you need to modify the chatqna.yaml file to change the model-volume to a directory that exists on the node.

Thanks, it worked! :)
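
The thread does not say which of the two options from the README note was used. For the second option (pointing model-volume at a directory that already exists on the node), a minimal sketch follows. It assumes the manifest contains a line of the form `path: /mnt/opea-models`, as the `kubectl describe` output above suggests. The function name `set_model_path` and the example target `/data/models` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from the README): rewrite the hostPath used by
# model-volume in a manifest file.
# Assumption: the manifest contains "path: /mnt/opea-models" lines.
# Limitation: new_path must not contain '#', which is used as the sed delimiter.
set_model_path() {
    manifest="$1"
    new_path="$2"
    tmp="$(mktemp)" || return 1
    # Edit into a temp file, then copy back so the manifest keeps its permissions.
    sed "s#path: /mnt/opea-models#path: ${new_path}#" "$manifest" > "$tmp" &&
        cat "$tmp" > "$manifest"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example call (assumption: /data/models already exists on the node):
#   set_model_path chatqna.yaml /data/models
```

After editing, the manifest would need to be re-applied, for instance with `kubectl apply -f chatqna.yaml`, so that the pods are recreated with the new volume path.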

ezelanza closed this as completed Aug 3, 2024