[Search pipeline] Add script processor #7607

Merged
merged 4 commits into from
May 22, 2023

@noCharger noCharger commented May 17, 2023

Description

Because the script processor is quite flexible, the design should be cautious about the range of fields that it may read and modify. One approach to managing the request/response context is to resolve classes and fields dynamically at runtime through Java's class loader and reflection. The key advantage of this method is its flexibility: it provides a general solution adaptable to various contexts. However, its broad applicability can also be a liability, as it could inadvertently create security vulnerabilities unless explicitly disabled. Alternatively, an explicit field check offers more control and security by supporting only specific fields. While this approach reduces the risk of security breaches, it requires developers to add support for each new field. The recommendation is approach two, iterating with community engagement.

Search source fields for request
https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/search/builder/SearchSourceBuilder.java

Fields supported by the search request script processor (not a final decision; iterating and looking for community feedback)

| No. | Field type and name | Description | Readable (Y / N / Partially) | Writable (Y / N / Partially) | Notes |
|---|---|---|---|---|---|
| 1 | int from | Starting document offset. Defaults to 0. | Y | Y | |
| 2 | int size | Defines the number of hits to return. Defaults to 10. | Y | Y | |
| 3 | Boolean explain | If true, returns detailed information about score computation as part of a hit. Defaults to false. | Y | Y | |
| 4 | Boolean version | If true, returns document version as part of a hit. Defaults to false. | Y | Y | |
| 5 | Boolean seqNoAndPrimaryTerm | If true, returns sequence number and primary term of the last modification of each hit. See Optimistic concurrency control. | Y | Y | |
| 6 | boolean trackScores | If true, calculates and returns document scores, even if the scores are not used for sorting. Defaults to false. | Y | Y | |
| 7 | Integer trackTotalHitsUpTo | Number of hits matching the query to count accurately. Defaults to 10000. If true, the exact number of hits is returned at the cost of some performance. If false, the response does not include the total number of hits matching the query. | Y | Y | |
| 8 | Float minScore | Minimum `_score` for matching documents. Documents with a lower `_score` are not included in the search results. | Y | Y | |
| 9 | TimeValue timeout | Specifies the period of time to wait for a response. If no response is received before the timeout expires, the request fails and returns an error. Defaults to no timeout. | N | N | Requires more complex handling on read and rewrite. |
| 10 | int terminateAfter | The maximum number of documents to collect for each shard; upon reaching it, query execution terminates early. Defaults to 0, which does not terminate query execution early. | Y | Y | |
| 11 | List<SortBuilder<?>> sorts | A comma-separated list of `<field>:<direction>` pairs. | N | N | Requires more complex handling on read and rewrite. |
| 12 | List docValueFields | A comma-separated list of fields to return as the docvalue representation of a field for each hit. | N | N | Requires more complex handling on read and rewrite. |
| 13 | List scriptFields | List of script fields to compute and include in the response. | N | N | Requires more complex handling on read and rewrite. |
| 14 | List fetchFields | List of fields to fetch and return as part of the search response. | N | N | Requires more complex handling on read and rewrite. |
| 15 | StoredFieldsContext storedFieldsContext | Contains information about which stored fields to include in the response. | N | N | Requires more complex handling on read and rewrite. |
| 16 | FetchSourceContext fetchSourceContext | Specifies whether the source of each hit should be returned and how the source should be filtered. | N | N | Valid values for `_source`: `true` (Boolean), the entire document source is returned; `false` (Boolean), the document source is not returned; (string), a comma-separated list of source fields to return. Wildcard (`*`) patterns are supported. |
| 17 | List indexBoosts | Boosts the `_score` of documents from specified indices. | N | N | Requires more complex handling on read and rewrite. |
| 18 | List stats | Stats groups to associate with the search. Each group maintains a statistics aggregation for its associated searches. You can retrieve these stats using the indices stats API. | N | N | Requires more complex handling on read and rewrite. |
| 19 | boolean profile | If true, the response will include detailed profiling information. | Y | Y | |
| 20 | QueryBuilder queryBuilder | Used to build the main search query. | P | P | Only some fields of certain types of queries for the 2.8/2.9 release. |
| 21 | QueryBuilder postQueryBuilder | Used to build post-processing queries. | P | P | Only some fields of certain types of queries for the 2.8/2.9 release. |
| 22 | SearchAfterBuilder searchAfterBuilder | Used to construct searches that fetch the next page of results. | N | N | Scroll-related features are not supported as of now. Open to community feedback. |
| 23 | PointInTimeBuilder pointInTimeBuilder | Used to build point-in-time specifications for consistent search results. | N | N | PIT is not supported for now. Open to community feedback. |
| 24 | SliceBuilder sliceBuilder | Used to build slice specifications for parallel scrolling. | N | N | Scroll-related features are not supported as of now. Open to community feedback. |
| 25 | CollapseBuilder collapse | Used to build field collapsing specifications. | N | N | Open to community feedback. |
| 26 | Map<String, Object> searchPipelineSource | Contains the source of the ad hoc search pipeline. | N | N | Open to community feedback. |
| 27 | AggregatorFactories.Builder aggregations | Used to build aggregation specifications for the search. | N | N | Open to community feedback. |
| 28 | HighlightBuilder highlightBuilder | Used to build highlighting specifications. | N | N | Open to community feedback. |
| 29 | SuggestBuilder suggestBuilder | Used to build suggestions for the search. | N | N | Open to community feedback. |
| 30 | List rescoreBuilders | Used to build rescorers that adjust the relevance score of search hits. | N | N | Open to community feedback. |
| 31 | List extBuilders | List of builders for search extensions. | N | N | Open to community feedback. |
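
To make the explicit field check from the description more concrete, here is a minimal sketch in plain Java. It is not the code from this PR: the class name `AllowListedRequestFields`, its methods, and the choice of boxed types are assumptions made for illustration. The allow-list contains only the rows marked Y for both reading and writing in the table above; every other field is rejected.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of an explicit field check (approach two).
 * Names and structure are illustrative and not taken from the PR.
 */
class AllowListedRequestFields {
    // Allowed field name -> expected value type. Only the table rows marked
    // Y/Y are listed; boxed types are an assumption of this sketch.
    private static final Map<String, Class<?>> ALLOWED = Map.of(
        "from", Integer.class,
        "size", Integer.class,
        "explain", Boolean.class,
        "version", Boolean.class,
        "seqNoAndPrimaryTerm", Boolean.class,
        "trackScores", Boolean.class,
        "trackTotalHitsUpTo", Integer.class,
        "minScore", Float.class,
        "terminateAfter", Integer.class,
        "profile", Boolean.class
    );

    // Backing store standing in for the real search source.
    private final Map<String, Object> values = new HashMap<>();

    /** Reads a field; fails for anything outside the allow-list. */
    public Object get(String field) {
        requireAllowed(field);
        return values.get(field);
    }

    /** Writes a field after checking both its name and its value type. */
    public void put(String field, Object value) {
        Class<?> type = requireAllowed(field);
        if (!type.isInstance(value)) {
            throw new IllegalArgumentException(
                "Field [" + field + "] expects " + type.getSimpleName());
        }
        values.put(field, value);
    }

    private static Class<?> requireAllowed(String field) {
        Class<?> type = (field == null) ? null : ALLOWED.get(field);
        if (type == null) {
            throw new UnsupportedOperationException(
                "Field [" + field + "] is not supported");
        }
        return type;
    }

    public static void main(String[] args) {
        AllowListedRequestFields ctx = new AllowListedRequestFields();
        ctx.put("size", 20);        // allowed name, matching type
        ctx.put("minScore", 0.5f);  // must be a Float in this sketch
        System.out.println(ctx.get("size"));
        try {
            ctx.put("aggregations", "anything"); // not on the allow-list
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The trade-off described above shows up directly: supporting a new field means adding one entry to the map, and nothing outside the map can be reached from a script.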

Related Issues

#6712

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@noCharger noCharger force-pushed the feature-script-processor branch from 01485ca to 2e8b828 Compare May 17, 2023 16:57
@noCharger noCharger self-assigned this May 17, 2023
@noCharger noCharger force-pushed the feature-script-processor branch from 7b71bd6 to 499f0d5 Compare May 18, 2023 01:27
@noCharger noCharger force-pushed the feature-script-processor branch from 9849595 to 1e561d8 Compare May 18, 2023 17:17
@noCharger noCharger marked this pull request as ready for review May 18, 2023 17:38
@noCharger noCharger force-pushed the feature-script-processor branch from 77c276c to a75fe42 Compare May 22, 2023 22:34
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT:
  • URL:
  • CommitID: a75fe42
    Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
    Is the failure a flaky test unrelated to your change?

@noCharger noCharger force-pushed the feature-script-processor branch from a75fe42 to c04f202 Compare May 22, 2023 22:36
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT:
  • URL:
  • CommitID: c04f202
    Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
    Is the failure a flaky test unrelated to your change?

@noCharger
Contributor Author

> I'm concerned about all of the getters that return non-primitive types.
>
> Without a yaml test showing how they might be used in a script, I don't know how it would look to interact with those source properties.

It's a good point: we shouldn't include OpenSearch-specific classes as dependencies of the Painless syntax. There are at least two routes on the read path: (1) return their primitive representations, or (2) do not support reading them at all. This is worth a community discussion via an RFC, and I have removed support for all non-primitive types from this PR.
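
For discussion purposes, route 1 could look roughly like the sketch below. Everything in it is hypothetical: `PrimitiveReadView`, `toPrimitive`, and the `SortSpec` stand-in are not OpenSearch classes, `java.time.Duration` is used in place of `TimeValue`, and the `field:direction` string form for sorts is an assumption. The idea is only that a script sees numbers, booleans, strings, and lists of those, never the builder types themselves.

```java
import java.time.Duration;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical read-path view that exposes only primitive representations. */
class PrimitiveReadView {

    /** Stand-in for a sort specification; not an OpenSearch class. */
    static final class SortSpec {
        final String field;
        final boolean ascending;

        SortSpec(String field, boolean ascending) {
            this.field = field;
            this.ascending = ascending;
        }
    }

    /** Converts a value to a form that needs no OpenSearch-specific class. */
    static Object toPrimitive(Object value) {
        if (value == null || value instanceof Number
            || value instanceof Boolean || value instanceof String) {
            return value; // already primitive-like, pass through unchanged
        }
        if (value instanceof Duration) {
            // Assumption: a timeout is exposed as milliseconds.
            return ((Duration) value).toMillis();
        }
        if (value instanceof SortSpec) {
            SortSpec sort = (SortSpec) value;
            return sort.field + ":" + (sort.ascending ? "asc" : "desc");
        }
        if (value instanceof List) {
            // Convert each element recursively.
            return ((List<?>) value).stream()
                .map(PrimitiveReadView::toPrimitive)
                .collect(Collectors.toList());
        }
        // Anything else stays unreadable, which corresponds to route 2.
        throw new UnsupportedOperationException(
            "No primitive representation for " + value.getClass().getName());
    }
}
```

A type without a conversion falls through to the exception, so the two routes can coexist per field while the RFC is open.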

@noCharger noCharger force-pushed the feature-script-processor branch from c04f202 to 4ca4f48 Compare May 22, 2023 22:49
Signed-off-by: Louis Chu <[email protected]>
@noCharger noCharger force-pushed the feature-script-processor branch from 4ca4f48 to 1dfe800 Compare May 22, 2023 22:57
@noCharger noCharger requested a review from msfroh May 22, 2023 22:59
@noCharger noCharger added the backport 2.x Backport to 2.x branch label May 22, 2023
@tlfeng tlfeng merged commit 959910b into opensearch-project:main May 22, 2023
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-7607-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 959910b977eb9d005e11b00721a233ed90548e5c
# Push it to GitHub
git push --set-upstream origin backport/backport-7607-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-7607-to-2.x.

@noCharger noCharger added backport 2.x Backport to 2.x branch and removed backport 2.x Backport to 2.x branch labels May 22, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request May 23, 2023
Be able to insert a painless or mustache script to manipulate search requests.

Signed-off-by: Louis Chu <[email protected]>
(cherry picked from commit 959910b)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
tlfeng pushed a commit that referenced this pull request May 23, 2023
Be able to insert a painless or mustache script to manipulate search requests.


(cherry picked from commit 959910b)

Signed-off-by: Louis Chu <[email protected]>
Labels: backport 2.x, enhancement, Search, v2.8.0