Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set target routing shard by partition key #316

Merged

Conversation

acarbonetto
Copy link

@acarbonetto acarbonetto commented Jul 26, 2023

Description

Allows user to set routing shard via the relation partition keys. For example:

select _id, _index, _routing, int0 from calcs_routing PARTITION(KEY1, KEY2) where _routing = 'KEY1' or _routing = 'KEY2'

Note that multiple keys can be send to OpenSearch storage by using comma-delimited list.

IT tests

New IT tests running against a multi-node cluster have been added to the build. These can be run using ./gradlew :integ-test:multiClusterSearch and will be automatically run against CI.

TODO

  • Remove quotes/backticks from around the routingId before sending
  • Rebase
  • Propose PPL syntax

Issues Resolved

Note (out of scope): PPL syntax is still TBD

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@codecov
Copy link

codecov bot commented Jul 26, 2023

Codecov Report

Merging #316 (087a5b3) into integ-metafields-set-routing-shard (31b2b28) will decrease coverage by 0.01%.
The diff coverage is 97.43%.

@@                           Coverage Diff                            @@
##             integ-metafields-set-routing-shard     #316      +/-   ##
========================================================================
- Coverage                                 97.42%   97.41%   -0.01%     
- Complexity                                 4647     4651       +4     
========================================================================
  Files                                       408      408              
  Lines                                     11526    11549      +23     
  Branches                                    839      845       +6     
========================================================================
+ Hits                                      11229    11251      +22     
  Misses                                      290      290              
- Partials                                      7        8       +1     
Flag Coverage Δ
sql-engine 97.41% <97.43%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...java/org/opensearch/sql/storage/StorageEngine.java 100.00% <ø> (ø)
...ql/prometheus/storage/PrometheusStorageEngine.java 100.00% <ø> (ø)
...ensearch/sql/spark/storage/SparkStorageEngine.java 100.00% <ø> (ø)
...org/opensearch/sql/sql/domain/SQLQueryRequest.java 100.00% <ø> (ø)
...ql/opensearch/request/OpenSearchScrollRequest.java 98.24% <88.88%> (-1.76%) ⬇️
...ain/java/org/opensearch/sql/analysis/Analyzer.java 100.00% <100.00%> (ø)
...sql/opensearch/request/OpenSearchQueryRequest.java 100.00% <100.00%> (ø)
...l/opensearch/request/OpenSearchRequestBuilder.java 100.00% <100.00%> (ø)
...ch/sql/opensearch/response/OpenSearchResponse.java 100.00% <100.00%> (ø)
...search/sql/opensearch/storage/OpenSearchIndex.java 100.00% <100.00%> (ø)
... and 2 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@acarbonetto acarbonetto changed the title Dev metafields set routing shard Set target routing shard by partition key Jul 27, 2023
Copy link

@MaxKsyunz MaxKsyunz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acarbonetto do you think we can pass routingId from query to OpenSearchRequest without having to modify

@@ -98,6 +99,7 @@ private static SQLQueryRequest createSqlQueryRequest(String query, Optional<Stri
builder.endObject();
JSONObject jsonContent = new JSONObject(Strings.toString(builder));

// TODO pass through

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO?

@@ -363,6 +363,17 @@ SQL query::
{
"query" : "SELECT account_number FROM accounts/account"
}
Example 4: Selecting From Index using Partition Shard

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when PARTITION is provided for another data source, like Prometheus or Spark?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They will need to implement a sharing or partition option separately. Most datasources would have such an option.


POST /_plugins/_sql
{
"query" : "SELECT account_number FROM account PARTITION(shard1, shard2)"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be confusing to use PARTITION an existing operation in SQL for this.

What about introducing a general notion of data source options, keeping them as key-value pairs in Relation that get analyzed by each StorageEngine.

This would allow other data sources to benefit from this and avoid question about features that only apply to OpenSearch.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly what the PARTITION in SQL is designed for. Also, the proposal in the issue suggested using this syntax.

I DO like your idea, but I'd rather stick with well-understood SQL syntax than introduce something new.

* Fix core refactor: StreamIO from common to core.common

Signed-off-by: acarbonetto <[email protected]>

* Fix core refactor: StreamIO from common to core.common

Signed-off-by: acarbonetto <[email protected]>

---------

Signed-off-by: acarbonetto <[email protected]>
Signed-off-by: acarbonetto <[email protected]>
Signed-off-by: acarbonetto <[email protected]>
Signed-off-by: acarbonetto <[email protected]>
@acarbonetto acarbonetto force-pushed the dev-metafields-set-routing-shard branch 2 times, most recently from b1928ae to 430d7a9 Compare July 28, 2023 16:06
@acarbonetto acarbonetto merged commit f6643f7 into integ-metafields-set-routing-shard Aug 23, 2023
@acarbonetto acarbonetto deleted the dev-metafields-set-routing-shard branch August 23, 2023 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants