EIS integration #111154

Merged on Aug 9, 2024 (32 commits)

@timgrein (Contributor) commented Jul 22, 2024

This PR integrates EIS (Elastic Inference Service) with Elasticsearch behind a feature flag.

Useful ES commands:

  • Running Elasticsearch (from the ES root directory):
    • Run ES: ./gradlew run
    • Run ES in debug mode: ./gradlew run --debug-jvm (you can then attach a debugger to the debug port, which is logged)
    • Run ES and set the EIS gateway URL via the CLI: ./gradlew run -Dtests.es.xpack.inference.eis.gateway.url=http://localhost:8000
  • Running tests:
    • Run all tests in the inference API plugin: ./gradlew ':x-pack:plugin:inference:test'
    • Run a specific test class in the inference API plugin: ./gradlew ':x-pack:plugin:inference:test' --tests "org.elasticsearch.xpack.inference.external.response.elastic.ElasticInferenceServiceSparseEmbeddingsResponseEntityTests" (pass --tests with the package and class name)
    • Run a specific test method in the inference API plugin: ./gradlew ':x-pack:plugin:inference:test' --tests "org.elasticsearch.xpack.inference.external.response.elastic.ElasticInferenceServiceSparseEmbeddingsResponseEntityTests.testSparseEmbeddingsResponse_SingleEmbeddingInData_NoMeta_NoTruncation" (pass --tests with the package, class, and method name)
  • Check style/formatting:
    • Check style in inference API plugin: ./gradlew ':x-pack:plugin:inference:checkstyleTest'
    • Apply spotless/formatting: ./gradlew spotlessApply
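
The --tests filter in the commands above is the package and class name, optionally followed by a dot and the method name. As a minimal sketch of how such an invocation could be assembled, the Python helper below builds the argument list; the name gradle_test_command is hypothetical (it is not part of the Elasticsearch build), and the function only constructs the command, it does not run Gradle.

```python
from typing import List, Optional

# Gradle project path of the inference API plugin, as used in the commands above.
INFERENCE_PROJECT = ":x-pack:plugin:inference"


def gradle_test_command(
    package: Optional[str] = None,
    class_name: Optional[str] = None,
    method: Optional[str] = None,
) -> List[str]:
    """Build the ./gradlew argument list for the inference plugin tests.

    Hypothetical helper, for illustration only.
    """
    command = ["./gradlew", f"{INFERENCE_PROJECT}:test"]
    if class_name is None:
        if method is not None:
            raise ValueError("a method filter requires a class name")
        # No filter: run every test in the plugin.
        return command

    # The filter is package + class, optionally followed by ".method".
    test_filter = f"{package}.{class_name}" if package else class_name
    if method is not None:
        test_filter = f"{test_filter}.{method}"
    return command + ["--tests", test_filter]
```

The returned list can be passed to subprocess.run from the ES root directory; because it is a list rather than a shell string, no quoting of the project path or the filter is needed.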

Testing locally:

  • Start eis-model-server (or eis-gateway, if ELSERv2 endpoint is integrated) on port {PORT}
  • Start ES with a configured eis-gateway: ./gradlew run -Dtests.es.xpack.inference.eis.gateway.url=http://localhost:{PORT}
  • Create an EIS inference endpoint:
PUT {ES_HOST}/_inference/sparse_embedding/eis

{
    "service": "elastic",
    "service_settings": {
        "model_id": ".elser_model_2"
    }
}
  • Perform inference (single embedding in a list):
POST {ES_HOST}/_inference/sparse_embedding/eis

{
    "input": "A blue sky"
}
  • Perform inference (multiple embeddings):
POST {ES_HOST}/_inference/sparse_embedding/eis

{
    "input": [
        "Embed this text",
        "Embed this text, too",
        "This text should also be embedded"
    ]
}
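
For scripting the steps above, here is a minimal Python sketch that builds the same three requests using only the standard library. The function names are hypothetical, the ES host is whatever {ES_HOST} is in your setup, and authentication is deliberately left out (add whatever headers your local cluster requires). The sketch only constructs the requests; nothing is sent until you call urllib.request.urlopen.

```python
import json
import urllib.request
from typing import List, Union

# Path taken from the requests above: task type "sparse_embedding", endpoint id "eis".
INFERENCE_PATH = "/_inference/sparse_embedding/eis"


def _json_request(method: str, es_host: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the EIS inference endpoint."""
    return urllib.request.Request(
        url=es_host.rstrip("/") + INFERENCE_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def create_endpoint_request(
    es_host: str, model_id: str = ".elser_model_2"
) -> urllib.request.Request:
    """PUT request that creates the inference endpoint for service "elastic"."""
    body = {"service": "elastic", "service_settings": {"model_id": model_id}}
    return _json_request("PUT", es_host, body)


def inference_request(
    es_host: str, texts: Union[str, List[str]]
) -> urllib.request.Request:
    """POST request that runs inference on a single string or a list of strings."""
    return _json_request("POST", es_host, {"input": texts})
```

Against a running cluster, urllib.request.urlopen(create_endpoint_request(es_host)) followed by urllib.request.urlopen(inference_request(es_host, "A blue sky")) would mirror the manual steps.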

Testing in serverless:

  • Enable the feature flag for your project or for a whole environment by opening a PR in our corresponding gitops repo that sets jvmOptions: "-Des.eis_feature_flag_enabled=true"
  • Repeat the steps from "Testing locally"

TODOs:

  • Write tests for ElasticInferenceService
  • Write tests for ElasticInferenceServiceActionCreator
  • Write tests for ElasticInferenceServiceResponseHandler
  • Write tests for ElasticInferenceServiceSparseEmbeddingsRequest
  • Implement checkModelConfig in ElasticInferenceService
  • Implement doChunkedInfer in ElasticInferenceService
  • Implement truncation in ElasticInferenceServiceSparseEmbeddingsRequest
  • Add docs for ElasticInferenceServiceFeature
  • Handle the error codes specified in the inference Task API spec in ElasticInferenceServiceResponseHandler
  • (There may be some additional smaller TODOs that I forgot to list here; I usually mark them with //TODO:)

When ready for review:

  • Mark ready for review
  • Add labels (this will ping the ML team for reviews):
    • >non-issue
    • :ml
    • Team:ML

Out of scope for this PR:

  • "Always-on" experience (performing inference without creating an endpoint for service elastic upfront)
  • Rate limiting
  • Auth/Secret Settings

@demjened force-pushed the timgrein/inference-api-integrate-eis branch from 49a755c to 9eb637f August 6, 2024 19:26
@demjened marked this pull request as ready for review August 6, 2024 21:10
@elasticsearchmachine added the needs:triage label Aug 6, 2024
@demjened added the :SearchOrg/Inference label Aug 6, 2024
@elasticsearchmachine added the Team:SearchOrg and Team:Search - Inference labels and removed the needs:triage label Aug 6, 2024
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/search-inference-team (Team:Search - Inference)

@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/ent-search-eng (Team:SearchOrg)

@demjened requested a review from a team August 6, 2024 21:13
@demjened changed the title from "[DRAFT] EIS integration" to "EIS integration" Aug 6, 2024
@elasticsearchmachine (Collaborator) commented:

Hi @timgrein, I've created a changelog YAML for you.

@davidkyle (Member) left a comment:

LGTM

@demjened force-pushed the timgrein/inference-api-integrate-eis branch from 74ddbcb to d3ef457 August 9, 2024 14:14
@demjened merged commit 13cc380 into elastic:main Aug 9, 2024
15 checks passed
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Aug 9, 2024
* upstream/main: (22 commits)
  Prune changelogs after 8.15.0 release
  Bump versions after 8.15.0 release
  EIS integration (elastic#111154)
  Skip LOOKUP/INLINESTATS cases unless on snapshot (elastic#111755)
  Always enforce strict role validation (elastic#111056)
  Mute org.elasticsearch.xpack.esql.analysis.VerifierTests testUnsupportedAndMultiTypedFields elastic#111753
  [ML] Force time shift integration test (elastic#111620)
  ESQL: Add tests for sort, where with unsupported type (elastic#111737)
  [ML] Force time shift documentation (elastic#111668)
  Fix remote cluster credential secure settings reload   (elastic#111535)
  ESQL: Fix for overzealous validation in case of invalid mapped fields (elastic#111475)
  Pass allow security manager flag in gradle test policy setup plugin (elastic#111725)
  Rename streamContent/Separator to bulkContent/Separator (elastic#111716)
  Mute org.elasticsearch.tdigest.ComparisonTests testSparseGaussianDistribution elastic#111721
  Remove 8.14 from branches.json
  Only emit product origin in deprecation log if present (elastic#111683)
  Forward port release notes for v8.15.0 (elastic#111714)
  [ES|QL] Combine Disjunctive CIDRMatch (elastic#111501)
  ESQL: Remove qualifier from attrs (elastic#110581)
  Force using the last centroid during merging (elastic#111644)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceNamedWriteablesProvider.java
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this pull request Sep 4, 2024
* WIP

* Add ElasticInferenceServiceTests TODOs

* Add ElasticInferenceServiceActionCreatorTests TODOs

* Add ElasticInferenceServiceResponseHandlerTests TODOs

* Add ElasticInferenceServiceSparseEmbeddingsRequestTests TODOs

* Add ElasticInferenceServiceSparseEmbeddingsModelTests TODOs

* spotless apply

* Fix conflicts

* Add EmptySecretSettingsTests

* Add named writeables to InferenceNamedWriteablesProvider

* Remove addressed todos

* Translate model to correct endpoint

* Remove addressed TODO

* Add docs to ElasticInferenceServiceFeature

* Implement and test truncation/request

* Add some EIS tests

* Support chunked inference

* Check model config

* Add more tests

* Add response handler

* Add more tests + HTTP 413 handling

* Fix some tests

* Spotless

* Fixes

* Switch back to original response structure

* Implement pass-through chunking

* Spotless

* Fix after rebase

* Spotless

* Log error upon failing to parse error response

* Remove TODOs

* Update docs/changelog/111154.yaml

---------

Co-authored-by: Adam Demjen <[email protected]>
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Sep 5, 2024
Labels: >feature, :SearchOrg/Inference, Team:Search - Inference, Team:SearchOrg, v8.16.0
5 participants