Error on GCP Deploy #40

Open
dominguesg opened this issue Sep 2, 2023 · 0 comments


dominguesg commented Sep 2, 2023

Hello everyone!

First, I would like to thank you for the amazing work you've been doing.

I need your help! I'm new to GCP and I'm trying to deploy the rag-stack falcon7b model to it.

I'm getting an error when I run deploy-gcp.sh. Here is the full output:

guilhermedomingues@cloudshell:~/rag-stack/scripts/gcp (llama-rag-test)$ sh deploy-gcp.sh
[ASCII-art banner printed by deploy-gcp.sh]


Enter your GCP project ID: llama-rag-test
(https://cloud.google.com/iam/docs/keys-create-delete#creating) Enter the path to your GCP service account key file: llama-rag-test-f40c5f7db02f.json
Enter the GCP region (default: us-west1): us-central1-c
Enter your Huggingface API Token: MY_HUGGING_API
Model to deploy (llama2-7b or falcon7b): falcon7b

Initializing the backend...
Initializing modules...

Initializing provider plugins...

  • Reusing previous version of hashicorp/kubernetes from the dependency lock file
  • Reusing previous version of hashicorp/google from the dependency lock file
  • Using previously-installed hashicorp/kubernetes v2.23.0
  • Using previously-installed hashicorp/google v4.51.0

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Success! The configuration is valid.

module.gke-cluster.google_container_cluster.gpu_cluster: Refreshing state... [id=projects/llama-rag-test/locations/us-central1-c/clusters/gpu-cluster]
module.gke-cluster.google_container_node_pool.primary_preemptible_nodes: Refreshing state... [id=projects/llama-rag-test/locations/us-central1-c/clusters/gpu-cluster/nodePools/gpu-node-pool]
data.google_client_config.default: Reading...
data.google_container_cluster.default: Reading...
data.google_client_config.default: Read complete after 0s [id=projects/llama-rag-test/regions/us-central1-c/zones/]
data.google_container_cluster.default: Read complete after 0s [id=projects/llama-rag-test/locations/us-central1-c/clusters/gpu-cluster]
kubernetes_service.falcon7b_service[0]: Refreshing state... [id=default/falcon7b-service]
kubernetes_deployment.falcon7b[0]: Refreshing state... [id=default/falcon7b]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:

  + create

Terraform will perform the following actions:

  # google_cloud_run_service.qdrant will be created
  + resource "google_cloud_run_service" "qdrant" {
      + autogenerate_revision_name = false
      + id                         = (known after apply)
      + location                   = "us-central1-c"
      + name                       = "qdrant"
      + project                    = (known after apply)
      + status                     = (known after apply)

      + template {
          + spec {
              + container_concurrency = (known after apply)
              + service_account_name  = (known after apply)
              + serving_state         = (known after apply)
              + timeout_seconds       = (known after apply)

              + containers {
                  + image = "qdrant/qdrant:v1.3.0"

                  + ports {
                      + container_port = 6333
                      + name           = (known after apply)
                    }
                }
            }
        }

      + traffic {
          + latest_revision = true
          + percent         = 100
          + url             = (known after apply)
        }
    }

  # google_cloud_run_service.ragstack-server will be created
  + resource "google_cloud_run_service" "ragstack-server" {
      + autogenerate_revision_name = false
      + id                         = (known after apply)
      + location                   = "us-central1-c"
      + name                       = "ragstack-server"
      + project                    = (known after apply)
      + status                     = (known after apply)

      + template {
          + spec {
              + container_concurrency = (known after apply)
              + service_account_name  = (known after apply)
              + serving_state         = (known after apply)
              + timeout_seconds       = (known after apply)

              + containers {
                  + image = "jfan001/ragstack-server:latest"

                  + env {

                  + env {
                      + name  = "QDRANT_PORT"
                      + value = "443"
                    }
                  + env {
                      + name  = "QDRANT_URL"
                      + value = (known after apply)
                    }

                  + resources {
                      + limits = {
                          + "memory" = "2Gi"
                        }
                    }
                }
            }
        }

      + traffic {
          + latest_revision = true
          + percent         = 100
          + url             = (known after apply)
        }
    }

  # google_cloud_run_service_iam_member.public will be created
  + resource "google_cloud_run_service_iam_member" "public" {
      + etag     = (known after apply)
      + id       = (known after apply)
      + location = "us-central1-c"
      + member   = "allUsers"
      + project  = (known after apply)
      + role     = "roles/run.invoker"
      + service  = "qdrant"
    }

Plan: 3 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.

Enter a value: yes

google_cloud_run_service.qdrant: Creating...

│ Error: Error creating Service: googleapi: got HTTP response code 404 with body:
│ <title>Error 404 (Not Found)!!1</title>
│ 404. That’s an error.
│ The requested URL /apis/serving.knative.dev/v1/namespaces/llama-rag-test/services was not found on this server. That’s all we know.
│
│   with google_cloud_run_service.qdrant,
│   on main.tf line 195, in resource "google_cloud_run_service" "qdrant":
│  195: resource "google_cloud_run_service" "qdrant" {
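
One thing I noticed while reading the output back, though I have not confirmed it is the cause: at the region prompt I typed us-central1-c, which looks like a zone name rather than a region, and the plan passes that same value as the location of both Cloud Run services. Cloud Run services are created in a region (for example us-central1), so that mismatch might explain the 404. Below is a minimal sketch of how the input could be normalized. The to_region helper is hypothetical and is not part of deploy-gcp.sh; it only assumes that zone names are a region followed by a single-letter suffix.

```shell
#!/bin/sh
# Hypothetical helper, not part of deploy-gcp.sh.
# Assumption: a zone name is "<region>-<letter>", e.g. "us-central1-c".
# Prints the region for a zone, and prints a region unchanged.
to_region() {
  case "$1" in
    *[0-9]-[a-z]) printf '%s\n' "${1%-?}" ;;  # zone: strip the trailing "-c"
    *)            printf '%s\n' "$1" ;;       # already a region
  esac
}

to_region "us-central1-c"   # prints us-central1
to_region "us-west1"        # prints us-west1
```

If that guess is right, entering us-central1 at the region prompt should have the same effect without changing the script.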

Can you help me with this?

Thank you again!

Have a nice weekend! :)
