Error on GCP Deploy #40

dominguesg · 2023-09-02T23:12:21Z

Hello everyone!

First, I would like to thank you for the amazing work you've been doing.

I need your help! I'm new to GCP and I'm trying to deploy the rag-stack falcon7b model to it.

I'm getting an error when I run deploy-gcp.sh. Below is the trace:

guilhermedomingues@cloudshell:~/rag-stack/scripts/gcp (llama-rag-test)$ sh deploy-gcp.sh
[ASCII-art banner printed by the script]

Enter your GCP project ID: llama-rag-test
(https://cloud.google.com/iam/docs/keys-create-delete#creating) Enter the path to your GCP service account key file: llama-rag-test-f40c5f7db02f.json
Enter the GCP region (default: us-west1): us-central1-c
Enter your Huggingface API Token: MY_HUGGING_API
Model to deploy (llama2-7b or falcon7b): falcon7b

Initializing the backend...
Initializing modules...

Initializing provider plugins...

Reusing previous version of hashicorp/kubernetes from the dependency lock file
Reusing previous version of hashicorp/google from the dependency lock file
Using previously-installed hashicorp/kubernetes v2.23.0
Using previously-installed hashicorp/google v4.51.0

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Success! The configuration is valid.

module.gke-cluster.google_container_cluster.gpu_cluster: Refreshing state... [id=projects/llama-rag-test/locations/us-central1-c/clusters/gpu-cluster]
module.gke-cluster.google_container_node_pool.primary_preemptible_nodes: Refreshing state... [id=projects/llama-rag-test/locations/us-central1-c/clusters/gpu-cluster/nodePools/gpu-node-pool]
data.google_client_config.default: Reading...
data.google_container_cluster.default: Reading...
data.google_client_config.default: Read complete after 0s [id=projects/llama-rag-test/regions/us-central1-c/zones/]
data.google_container_cluster.default: Read complete after 0s [id=projects/llama-rag-test/locations/us-central1-c/clusters/gpu-cluster]
kubernetes_service.falcon7b_service[0]: Refreshing state... [id=default/falcon7b-service]
kubernetes_deployment.falcon7b[0]: Refreshing state... [id=default/falcon7b]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:

  + create

Terraform will perform the following actions:

  # google_cloud_run_service.qdrant will be created
  + resource "google_cloud_run_service" "qdrant" {
      + autogenerate_revision_name = false
      + id = (known after apply)
      + location = "us-central1-c"
      + name = "qdrant"
      + project = (known after apply)
      + status = (known after apply)

      + template {
          + spec {
              + container_concurrency = (known after apply)
              + service_account_name = (known after apply)
              + serving_state = (known after apply)
              + timeout_seconds = (known after apply)

              + containers {
                  + image = "qdrant/qdrant:v1.3.0"

                  + ports {
                      + container_port = 6333
                      + name = (known after apply)
                    }
                }
            }
        }

      + traffic {
          + latest_revision = true
          + percent = 100
          + url = (known after apply)
        }
    }

  # google_cloud_run_service.ragstack-server will be created
  + resource "google_cloud_run_service" "ragstack-server" {
      + autogenerate_revision_name = false
      + id = (known after apply)
      + location = "us-central1-c"
      + name = "ragstack-server"
      + project = (known after apply)
      + status = (known after apply)

      + template {
          + spec {
              + container_concurrency = (known after apply)
              + service_account_name = (known after apply)
              + serving_state = (known after apply)
              + timeout_seconds = (known after apply)

              + containers {
                  + image = "jfan001/ragstack-server:latest"

                  + env {
                      + name = "LLM_URL"
                      + value = "http://35.193.123.142"
                    }
                  + env {
                      + name = "QDRANT_PORT"
                      + value = "443"
                    }
                  + env {
                      + name = "QDRANT_URL"
                      + value = (known after apply)
                    }

                  + resources {
                      + limits = {
                          + "memory" = "2Gi"
                        }
                    }
                }
            }
        }

      + traffic {
          + latest_revision = true
          + percent = 100
          + url = (known after apply)
        }
    }

  # google_cloud_run_service_iam_member.public will be created
  + resource "google_cloud_run_service_iam_member" "public" {
      + etag = (known after apply)
      + id = (known after apply)
      + location = "us-central1-c"
      + member = "allUsers"
      + project = (known after apply)
      + role = "roles/run.invoker"
      + service = "qdrant"
    }

Plan: 3 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.

Enter a value: yes

google_cloud_run_service.qdrant: Creating...
╷
│ Error: Error creating Service: googleapi: got HTTP response code 404 with body:
│
│ <title>Error 404 (Not Found)!!1</title>
│ 404. That’s an error.
│ The requested URL /apis/serving.knative.dev/v1/namespaces/llama-rag-test/services was not found on this server. That’s all we know.
│
│ with google_cloud_run_service.qdrant,
│ on main.tf line 195, in resource "google_cloud_run_service" "qdrant":
│ 195: resource "google_cloud_run_service" "qdrant" {

Can you help me with this?

Thank you again!

Have a nice weekend! :)
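
One detail in the trace may be worth checking, though nothing in this thread confirms it as the cause: the script asked for a GCP region (default: us-west1), and the value entered was us-central1-c, which has the shape of a zone name. The plan then shows `location = "us-central1-c"` on the Cloud Run resources, and Cloud Run services are regional. Below is a minimal Python sketch of how such an input could be normalized before use. The helper name `region_from_location` and the regex are hypothetical, not part of rag-stack, and the pattern assumes the usual `<region>-<letter>` zone naming.

```python
import re

# Assumption: zone names follow "<region>-<letter>" (e.g. "us-central1-c")
# and region names end in a digit (e.g. "us-central1"). Heuristic only.
ZONE_PATTERN = re.compile(r"^(?P<region>[a-z]+(?:-[a-z]+)*\d+)-[a-z]$")


def region_from_location(location: str) -> str:
    """Return the region part of a zone-like name, or the input unchanged.

    Hypothetical helper for illustration; it does not call any GCP API
    and does not check that the region actually exists.
    """
    cleaned = location.strip().lower()
    match = ZONE_PATTERN.match(cleaned)
    return match.group("region") if match else cleaned


if __name__ == "__main__":
    for value in ("us-central1-c", "us-west1"):
        print(value, "->", region_from_location(value))
```

If this is indeed the problem, rerunning the script with a region such as us-central1 instead of the zone would be the thing to try.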
