Implement KIND based e2e test for Teraslice in Kubernetes #3427

Closed · godber opened this issue Oct 4, 2023 · 10 comments
Labels: k8s (Applies to Teraslice in kubernetes cluster mode only.), pkg/teraslice, tests

godber (Member) commented Oct 4, 2023

I have been performing acceptance testing manually for all of my changes to Teraslice in Kubernetes. We should implement at least one simple e2e-like test in KIND (Kubernetes in Docker) in CI.

The tests will have to do the following (roughly sketched below):

  • Start KIND
  • Check that Kubernetes is up
  • Launch Elasticsearch in KIND
  • Build a Teraslice Image and get it into KIND
  • Launch Teraslice in KIND
  • Start a job in Teraslice
  • Make sure the job completes successfully, without errors and with the desired output

https://kind.sigs.k8s.io/
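Roughly, those steps with the kind and kubectl CLIs might look like this (a sketch; the cluster name, image tag, manifest paths, and job file are placeholders, not actual project files):

# create the cluster and check that Kubernetes is up
kind create cluster --name k8se2e
kubectl cluster-info --context kind-k8se2e
kubectl wait --for=condition=Ready nodes --all --timeout=300s

# build the Teraslice image and side-load it into the KIND nodes
docker build -t teraslice:e2e .
kind load docker-image teraslice:e2e --name k8se2e

# launch Elasticsearch and Teraslice from manifests
kubectl apply -f e2e/k8s/elasticsearch.yaml
kubectl apply -f e2e/k8s/teraslice.yaml

# submit a job to the master API (assuming the master is port-forwarded
# to localhost:5678), then poll the execution status until it completes
curl -sS -XPOST http://localhost:5678/v1/jobs -d @e2e/k8s/example-job.json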

godber added the k8s (Applies to Teraslice in kubernetes cluster mode only.), pkg/teraslice, and tests labels on Oct 4, 2023
busma13 (Contributor) commented Oct 5, 2023

I had an issue running Teraslice in minikube, possibly related to the version of k8s I'm using. Here are the teraslice-master logs. I've removed or abbreviated what didn't look relevant to me.

kubectl -n ts-dev1 logs teraslice-master-6f65f6bcc4-5mt99 | bunyan
[2023-10-04T22:26:45.037Z]  INFO: teraslice/7 on teraslice-master-6f65f6bcc4-5mt99: Service starting (assignment=node_master)
...
(skipping setup, ES, asset deployment, etc)
...
[2023-10-04T22:34:14.911Z] DEBUG: example-data-generator-job/14 on teraslice-master-6f65f6bcc4-5mt99: enqueueing execution to be processed (queue size 0) (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_, active=true, analytics=true, performance_metrics=false, autorecover=false, lifecycle=once, max_retries=3, probation_window=300000, slicers=1, workers=2, stateful=false, labels=null, env_vars={}, targets=[], ephemeral_storage=true, pod_spec_override={}, volumes=[], job_id=5aee9a70-65ea-47ed-be53-ead76c096b4d, _context=ex, _created=2023-10-04T22:34:14.895Z, _updated=2023-10-04T22:34:14.895Z, ex_id=d5222f35-ad6a-4269-8b4e-e846636e44d4, metadata={}, _status=pending, _has_errors=false, _slicer_stats={}, _failureReason="")
    assets: [
      "65ee07b97850ce15e78068224febff5c5deb9ae7",
      "2b4f08ae993293c44af418d2f9ec3746d98039ae",
      "00183d8c533503f4acba7aa001931563a001791f"
    ]
    --
    operations: [
      {
        "_op": "data_generator",
        "_encoding": "json",
        "_dead_letter_action": "throw",
        "json_schema": null,
        "size": 5000000,
        "start": null,
        "end": null,
        "format": null,
        "stress_test": false,
        "date_key": "created",
        "set_id": null,
        "id_start_key": null
      },
      {
        "_op": "example",
        "_encoding": "json",
        "_dead_letter_action": "none",
        "type": "string"
      },
      {
        "_op": "delay",
        "_encoding": "json",
        "_dead_letter_action": "throw",
        "ms": 30000
      },
      {
        "_op": "elasticsearch_bulk",
        "_encoding": "json",
        "_dead_letter_action": "throw",
        "size": 5000,
        "connection": "default",
        "index": "terak8s-example-data",
        "type": "events",
        "delete": false,
        "update": false,
        "update_retry_on_conflict": 0,
        "update_fields": [],
        "upsert": false,
        "create": false,
        "script_file": "",
        "script": "",
        "script_params": {},
        "api_name": "elasticsearch_sender_api"
      }
    ]
    --
    apis: [
      {
        "_name": "elasticsearch_sender_api",
        "_encoding": "json",
        "_dead_letter_action": "throw",
        "size": 5000,
        "connection": "default",
        "index": "terak8s-example-data",
        "type": "events",
        "delete": false,
        "update": false,
        "update_retry_on_conflict": 0,
        "update_fields": [],
        "upsert": false,
        "create": false,
        "script_file": "",
        "script": "",
        "script_params": {},
        "_op": "elasticsearch_bulk"
      }
    ]
[2023-10-04T22:34:15.230Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: Scheduling execution: d5222f35-ad6a-4269-8b4e-e846636e44d4 (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:34:15.273Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: execution allocating slicer (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_, apiVersion=batch/v1, kind=Job)
    metadata: {
        ...
    }
     --
    spec: {
        ...
    }
[2023-10-04T22:34:15.284Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s slicer job submitted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_, kind=Job, apiVersion=batch/v1, status={})
    metadata: {
        ...
    }
     --
    spec: {
        ...
    }
[2023-10-04T22:34:15.289Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: waiting for pod matching: controller-uid=undefined (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
(repeats 10 times)
...
[2023-10-04T22:34:25.673Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: execution d5222f35-ad6a-4269-8b4e-e846636e44d4 is connected (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:34:26.375Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: waiting for pod matching: controller-uid=undefined (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
(repeats 48 more times)
...
[2023-10-04T22:35:16.021Z]  WARN: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: Failed to provision execution d5222f35-ad6a-4269-8b4e-e846636e44d4 (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.059Z]  WARN: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: Calling stopExecution on execution: d5222f35-ad6a-4269-8b4e-e846636e44d4 to clean up k8s resources. (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.063Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 execution_controller jobs deleting: ts-exc-example-data-generator-job-5aee9a70-65ea (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.072Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 worker deployments has already been deleted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.205Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: execution d5222f35-ad6a-4269-8b4e-e846636e44d4 finished, shutting down execution (assignment=cluster_master, module=execution_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.209Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 execution_controller jobs has already been deleted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.212Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: k8s._deleteObjByExId: d5222f35-ad6a-4269-8b4e-e846636e44d4 worker deployments has already been deleted (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
[2023-10-04T22:35:16.297Z]  INFO: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: client d5222f35-ad6a-4269-8b4e-e846636e44d4 disconnected { reason: 'client namespace disconnect' } (assignment=cluster_master, module=messaging:server, worker_id=WvOOtRs_)

busma13 (Contributor) commented Oct 5, 2023

Starting my Minikube cluster with Kubernetes version 1.23.17 resolves the issue:
minikube start --memory 4096 --cpus 4 --kubernetes-version=v1.23.17

godber (Member, Author) commented Oct 5, 2023

This is the log I was hoping for, thanks Peter:

[2023-10-04T22:34:15.289Z] DEBUG: teraslice/14 on teraslice-master-6f65f6bcc4-5mt99: waiting for pod matching: controller-uid=undefined (assignment=cluster_master, module=kubernetes_cluster_service, worker_id=WvOOtRs_)
(repeats 10 times)

godber (Member, Author) commented Oct 6, 2023

Maybe we should replicate what we do for e2e tests here:

https://github.com/terascope/teraslice/blob/master/packages/scripts/src/helpers/test-runner/index.ts#L62-L65

Make a new k8se2e suite type, implement a runk8sE2eTest function similar to the existing runE2ETest function, then go from there.
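A minimal self-contained sketch of what runk8sE2eTest could do (the command strings, paths, and yarn script below are assumptions, not the repo's actual layout):

import { execSync } from 'child_process';

// Hypothetical sketch of runk8sE2eTest: stand up a KIND cluster, run the
// same jest e2e tests against it, and always tear the cluster down after.
export async function runk8sE2eTest(): Promise<void> {
    const run = (cmd: string) => execSync(cmd, { stdio: 'inherit' });
    run('kind create cluster --name k8se2e');
    try {
        run('kubectl wait --for=condition=Ready nodes --all --timeout=300s');
        // build/load the Teraslice image and deploy elasticsearch + teraslice
        // here, then reuse the existing e2e jest entry point:
        run('yarn jest --config e2e/jest.config.js');
    } finally {
        run('kind delete cluster --name k8se2e');
    }
}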

godber (Member, Author) commented Oct 16, 2023

@jsnoble suggested parameterizing the existing e2e tests so that the existing jest tests can be reused. This is a great suggestion, since there's no real need to implement separate Teraslice tests. There may be subsets of tests that are "platform" specific (native clustering vs k8s clustering); we'd need a way to omit the inapplicable ones in each case.

I suggested adding something like a platform option to the test options type. That could be used to start the services in the right place (k8s vs Docker), launch Teraslice the right way, and omit platform-specific tests.
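Something like this (all names here are hypothetical; describe.skip is standard jest, the rest is illustrative only):

// a platform option on the test options type
type Platform = 'native' | 'kubernetes';

interface TestOptions {
    platform: Platform;
    // ...existing options
}

// in a spec file, only run suites that apply to the current platform
const platform = (process.env.TEST_PLATFORM || 'native') as Platform;
const describeNativeOnly = platform === 'native' ? describe : describe.skip;
const describeK8sOnly = platform === 'kubernetes' ? describe : describe.skip;

describeK8sOnly('pod spec overrides', () => {
    it('applies pod_spec_override to the execution controller', () => {
        // assertions that only make sense in kubernetes cluster mode
    });
});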

busma13 (Contributor) commented Oct 20, 2023

Initial pass/fail of all e2e tests using k8s:

Test Suites: 4 failed, 1 skipped, 6 passed, 10 of 11 total

 PASS   e2e  test/cases/data/elasticsearch-bulk-spec.js (10.747 s)   
 PASS   e2e  test/cases/validation/job-spec.js (8.55 s)
 PASS   e2e  test/cases/cluster/job-state-spec.js (25.801 s)
 PASS   e2e  test/cases/cluster/api-spec.js (47.587 s)
 PASS   e2e  test/cases/data/reindex-spec.js (67.138 s)
 PASS   e2e  test/cases/assets/simple-spec.js (99.074 s)

 FAIL   e2e  test/cases/cluster/worker-allocation-spec.js (17.359 s)
 FAIL   e2e  test/cases/data/recovery-spec.js
 FAIL   e2e  test/cases/cluster/state-spec.js (12.324 s)
 FAIL   e2e  test/cases/kafka/kafka-spec.js (122.693 s)

godber (Member, Author) commented Oct 20, 2023

We'll want to get the k8s e2e tests running in GitHub Actions for Node 14 and Node 16, as well as against the supported versions of Elasticsearch/OpenSearch.
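A sketch of such a workflow matrix (the file name, matrix values, env var, and yarn script are illustrative assumptions, not existing CI config):

# .github/workflows/k8s-e2e.yml (hypothetical)
name: k8s e2e
on: [pull_request]
jobs:
  k8s-e2e:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [14, 16]
        search: [elasticsearch6, elasticsearch7, opensearch1]
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
      - uses: helm/kind-action@v1   # creates a KIND cluster in the runner
      - run: yarn && yarn test:k8se2e   # hypothetical script name
        env:
          SEARCH_VERSION: ${{ matrix.search }}   # hypothetical env var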

busma13 (Contributor) commented Nov 2, 2023

ref: #3449

busma13 (Contributor) commented Nov 3, 2023

Second try: ref #3454

godber (Member, Author) commented Nov 9, 2023

This is done.

godber closed this as completed on Nov 9, 2023