PoC Refactor: Move relevance search functions from :core to :opensearch #2025


Yury-Fridlyand
Collaborator

Description

Phase 2 of UDF (User Defined Function) implementation
(See phase 1 in #2019)

Depends on #2001 (a rough fix for that is included in this PR, but it should be merged in advance as part of a separate PR).
The fix proposed in #2002 could be applied before these changes to simplify the refactor.

These changes include:

  • All OpenSearch-specific code was moved from :core to :opensearch
  • OpenSearch-specific analyzers were moved too
  • StorageEngine provides an API to get DS-specific (data source-specific) analyzers, which the common analyzer in :core (shared by all DS) uses
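
As a rough illustration of the third point, the sketch below shows a common analyzer that handles what it knows and delegates the rest to analyzers obtained from the storage engines. Every type and method name here (DsAnalyzer, SketchStorageEngine, CommonAnalyzer, OpenSearchEngineSketch) is hypothetical and simplified to strings; it does not reproduce the actual StorageEngine interface or the analyzer classes in this PR.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a DS-specific analyzer; not the plugin's real type.
interface DsAnalyzer {
  // Returns the resolved expression, or empty if this DS does not recognize it.
  Optional<String> analyze(String expression);
}

// Hypothetical stand-in for StorageEngine, reduced to the accessor discussed above.
interface SketchStorageEngine {
  String name();

  DsAnalyzer analyzer();
}

// Common analyzer (conceptually in :core): resolves what it knows, delegates the rest.
final class CommonAnalyzer {
  private final List<SketchStorageEngine> engines;

  CommonAnalyzer(List<SketchStorageEngine> engines) {
    this.engines = engines;
  }

  String analyze(String expression) {
    // Assumption: a trivial rule standing in for logic common to all DS.
    if (expression.startsWith("abs(")) {
      return "core:" + expression;
    }
    // Unknown to the common analyzer: ask each DS-specific analyzer in turn.
    for (SketchStorageEngine engine : engines) {
      Optional<String> resolved = engine.analyzer().analyze(expression);
      if (resolved.isPresent()) {
        return engine.name() + ":" + resolved.get();
      }
    }
    throw new IllegalArgumentException("Unsupported expression: " + expression);
  }
}

// Hypothetical engine (conceptually in :opensearch) that recognizes a relevance function.
final class OpenSearchEngineSketch implements SketchStorageEngine {
  @Override
  public String name() {
    return "opensearch";
  }

  @Override
  public DsAnalyzer analyzer() {
    return expr -> expr.startsWith("match(") ? Optional.of(expr) : Optional.empty();
  }
}
```

With this shape, :core never references OpenSearch types directly; it only sees whatever the engine hands back through the accessor.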

TODOs:

  • Complete TODOs left in code
  • Make Analyzers and CanPaginateVisitor abstract; every DS should provide its own implementation

Next Steps (phases):

  • In the current state, the code checks DS analyzers whenever the common analyzer encounters something unknown (DS-specific). This means that analyzers from different DS may be invoked while processing a single query.
    The SQL engine should instead try to execute a query on each StorageEngine (data source), using only one StorageEngine from the very beginning to the very end (QueryPlanFactory - ExecutionEngine::execute), and pick the one that succeeds in building a Physical Plan tree.
    TBD: what if multiple StorageEngines can build a tree?
    Note: reuse common parts for better performance.
    Important: this could be moved to the next phase if the current one is implemented with no more than one DS.
  • Let each DS have its own parser, and free the SQL and PPL parsers from DS-specific logic.
  • Move Prometheus to another DS
  • Extend config to allow configuring DS list (and their parameters) before starting the cluster
  • Extend the DS management API to plug DS in and out, together with their UDFs, at runtime
  • Improve identifier resolution strategy per DS
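
The first next step (one StorageEngine per query, end to end) could look roughly like the sketch below. All names (PlanningEngine, KeywordEngine, EngineSelector) are hypothetical, plans are reduced to strings, and "failure to build a plan" is modeled as an UnsupportedOperationException. Picking the first engine that succeeds is only an assumption made for the sketch; the TBD above about multiple capable engines remains open.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical view of a StorageEngine that can attempt to plan a whole query.
interface PlanningEngine {
  String name();

  // Returns a plan description, or throws UnsupportedOperationException
  // if this engine cannot build a physical plan for the query.
  String buildPhysicalPlan(String query);
}

// Toy engine: it can plan a query only if the query contains its keyword.
final class KeywordEngine implements PlanningEngine {
  private final String name;
  private final String keyword;

  KeywordEngine(String name, String keyword) {
    this.name = name;
    this.keyword = keyword;
  }

  @Override
  public String name() {
    return name;
  }

  @Override
  public String buildPhysicalPlan(String query) {
    if (!query.contains(keyword)) {
      throw new UnsupportedOperationException(name + " cannot plan: " + query);
    }
    return "Scan[" + query + "]";
  }
}

final class EngineSelector {
  private EngineSelector() {}

  // Tries each engine on the whole query; a single engine handles it from start to end.
  // Assumption: the first engine that succeeds wins.
  static Optional<String> planWithFirstCapableEngine(List<PlanningEngine> engines, String query) {
    for (PlanningEngine engine : engines) {
      try {
        return Optional.of(engine.name() + " -> " + engine.buildPhysicalPlan(query));
      } catch (UnsupportedOperationException e) {
        // This engine cannot handle the query; try the next one.
      }
    }
    return Optional.empty();
  }
}
```

Because each attempt uses exactly one engine, analyzers from different DS are never mixed within a single query, which is the property the bullet asks for.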

Issues Resolved

P2 of #811

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

GumpacG and others added 8 commits August 14, 2023 10:24
Signed-off-by: Guian Gumpac <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Note: response formatters were never actually used to format an error.

Signed-off-by: Yury-Fridlyand <[email protected]>
@acarbonetto acarbonetto changed the title Refactor: Move relevance search functions from :core to :opensearch PoC Refactor: Move relevance search functions from :core to :opensearch Aug 23, 2023
@Swiddis
Collaborator

Swiddis commented Dec 27, 2024

Is this still in progress?

@Swiddis
Collaborator

Swiddis commented Dec 27, 2024

Closing as stale -- feel free to reopen if work is resumed

@Swiddis Swiddis closed this Dec 27, 2024