-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding protos for search classes #13178
Adding protos for search classes #13178
Conversation
Compatibility status:Checks if related components are compatible with change b6f1d25 Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git] |
❌ Gradle check result for 855f7ce: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for b6f1d25: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works to quickly add protobuf, but I think we must refactor the classes to separate transport-specific parts in this one.
/** | ||
* Gates the functionality of integrating protobuf within search API and node-to-node communication. | ||
*/ | ||
public static final String PROTOBUF = "opensearch.experimental.feature.search_with_protobuf.enabled"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call this something more generic since we may be adding more than search and gate all of protobuf transport behind it? What do you think about "opensearch.experimental.feature.transport.protobuf.enabled"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, only search related transport will be gated behind this since when we add the transport specific node to node protobuf structure (in the next PR), it will consider only search related classes for now. We can keep adding more classes once we expand the use case. That message would be https://github.com/opensearch-project/OpenSearch/pull/11910/files#diff-3b6cef54e769860b6ff96d7052208f6fe2ef8a86ea04aec232044984ba52e5db.
@@ -93,14 +98,17 @@ public class FeatureFlags { | |||
|
|||
public static final Setting<Boolean> PLUGGABLE_CACHE_SETTING = Setting.boolSetting(PLUGGABLE_CACHE, false, Property.NodeScope); | |||
|
|||
public static final Setting<Boolean> PROTOBUF_SETTING = Setting.boolSetting(PROTOBUF, false, Property.NodeScope, Property.Dynamic); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PROTOBUF_TRANSPORT_SETTING
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The feature/effort is not just transport specific though, protobuf is at the transport level and also at the API request response level. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this variable and the name of the setting should match our intent. Won't we be rolling out the search API with protobuf support, then another API, etc.? What's the setting that will live the longest?
Either way this variable and the setting string should match I think.
The most generic one:
PROTOBUF = opensearch.experimental.feature.protobuf.enabled
PROTOBUF_SETTING
Search specific.
PROTOBUF_SEARCH = opensearch.experimental.feature.protobuf.search.enabled
PROTOBUF_SEEARCH_SETTING
I'm good with either, but I prefer (1) because I don't think we'll ever want to have 2 APIs with protobuf support enabled/disabled separately. I could totally be wrong, too.
@@ -64,9 +72,24 @@ public FetchSearchResult(StreamInput in) throws IOException { | |||
hits = new SearchHits(in); | |||
} | |||
|
|||
public FetchSearchResult(InputStream in) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a code smell. We have org.opensearch.core.common.io.stream.StreamInput
and java.io.InputStream
, and two constructors that do completely different things depending on whether we pass StreamInput
or InputStream
. We're lucky they are called different class names :)
This needs to be disambiguated, likely by implementing a base class and separate implementations for one vs. the other stream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both the constructors come from its base class implementing Writeable and BytesWriteable (this is not part of this PR). Let me see if I can make this a little more disambiguated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to do the same things as with DocumentField
- the class should not deal with all possible (de)serealizations
public FetchSearchResult(ShardSearchContextId id, SearchShardTarget shardTarget) { | ||
this.contextId = id; | ||
setSearchShardTarget(shardTarget); | ||
this.fetchSearchResultProto = FetchSearchResultProto.FetchSearchResult.newBuilder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Transport-specific implementation needs to live in its own namespaces/classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a transport specific implementation though. This is the response/request specific serialization and deserialization. Transport specific protobuf implementation will be separate which is sent over the wire of which this message is a part. Transport specific protobuf will be a part of the next PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way it's written the request/response can serialize its way two ways, however the serialization primitives are all over the class. I think there are a few options (and possibly more).
- Multiple
serialize
methods which take different stream types, and no locals anywhere else in the class. - Different
serialize
methods in the style oftoString()
, soserializeToNative(..)
orserializeToProtobuf(..)
. - A base class
FetchResult
and child classesNativeFetchResult
andProtobufFetchResult
that both have aserialize
method that comes fromNativeSerializable
andProtobufSerializable
. - A helper serializer class
NativeFetchResultSerializer
andProtobufFetchResultSerializer
.
Maybe @reta has better ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dblock definitively +1 (https://github.com/opensearch-project/OpenSearch/pull/13178/files#r1566328274)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing something similar to DocumentField
might not work since the response classes implement Writeable for native protocol (and will do BytesWriteable as well). But I think option 3 specified by @dblock might be a way to go where the specific protocol class can implement the related constructor for Writeable or BytesWriteable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hint, we probably need to align the that between transport and protocol, I will look into it shortly, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VachaShah OK, I think we won't be able to move on with this change before we solve the protocol handling by outbound communication (we sort of addressed the inbound only, but not outbound, TransportResponseHandler
& are heavily relying on "native" serialization). In any case, I think the serde should be moved away from the response classes, and handled by the TransportResponseHandler
, depending on the protocol of the inbound message).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm with this. I think the "right" way is to first PR splitting out the native serialization and deserialization out of FetchSearchResult
. It's a hefty piece of work, but who dares wins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! I will look into this. For abstracting the outbound side of transport and adding support for multiple protocols on that side of transport, I have created #13293. Next, I am looking into decoupling Writeable
for native serialization from TransportResponseHandler
.
I think I missed adding some explanation here. Sorry for the confusion! This is not adding any transport specific implementations. In this PR, we are adding proto structures and serialization logic for API level response classes and model classes for search. The transport specific implementation for node to node message in protobuf structure will come in the next PR where these response classes will be used as part of it. |
Understood. I think the fact that
If you create a fetch result with a specific constructor that knows nothing about protobuf, and that will never be serialized to/from protobuf, you still have something related to protobuf. Can we do better? |
...src/main/java/org/opensearch/common/document/serializer/DocumentFieldProtobufSerializer.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/common/document/serializer/DocumentFieldSerializer.java
Outdated
Show resolved
Hide resolved
b6f1d25
to
288f99c
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #13178 +/- ##
============================================
+ Coverage 71.42% 71.48% +0.06%
- Complexity 59978 61155 +1177
============================================
Files 4985 5064 +79
Lines 282275 287916 +5641
Branches 40946 41731 +785
============================================
+ Hits 201603 205815 +4212
- Misses 63999 65102 +1103
- Partials 16673 16999 +326 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @reta and I are saying the same thing. Refactor serialization/deserialization out of FetchSearchResult
in a separate PR, then the protobuf one will be easy to add.
@@ -93,14 +98,17 @@ public class FeatureFlags { | |||
|
|||
public static final Setting<Boolean> PLUGGABLE_CACHE_SETTING = Setting.boolSetting(PLUGGABLE_CACHE, false, Property.NodeScope); | |||
|
|||
public static final Setting<Boolean> PROTOBUF_SETTING = Setting.boolSetting(PROTOBUF, false, Property.NodeScope, Property.Dynamic); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this variable and the name of the setting should match our intent. Won't we be rolling out the search API with protobuf support, then another API, etc.? What's the setting that will live the longest?
Either way this variable and the setting string should match I think.
The most generic one:
PROTOBUF = opensearch.experimental.feature.protobuf.enabled
PROTOBUF_SETTING
Search specific.
PROTOBUF_SEARCH = opensearch.experimental.feature.protobuf.search.enabled
PROTOBUF_SEEARCH_SETTING
I'm good with either, but I prefer (1) because I don't think we'll ever want to have 2 APIs with protobuf support enabled/disabled separately. I could totally be wrong, too.
public FetchSearchResult(ShardSearchContextId id, SearchShardTarget shardTarget) { | ||
this.contextId = id; | ||
setSearchShardTarget(shardTarget); | ||
this.fetchSearchResultProto = FetchSearchResultProto.FetchSearchResult.newBuilder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm with this. I think the "right" way is to first PR splitting out the native serialization and deserialization out of FetchSearchResult
. It's a hefty piece of work, but who dares wins.
I think it depends how we would implement it, you might be very right and we may need to keep response classes implementing the default (native) serialization, but protobuf should be 100% out , I am sure |
Signed-off-by: Vacha Shah <[email protected]>
Signed-off-by: Vacha Shah <[email protected]>
Signed-off-by: Vacha Shah <[email protected]>
288f99c
to
9c07269
Compare
Update: Merged the latest changes from #13293. Working on architecture for keeping serde separate from classes for protobuf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented on the next refactor in #13697 (comment).
This PR can become mergeable when FetchSearchResult
doesn't have both native stream and protobuf mixed together.
This PR is stalled because it has been open for 30 days with no activity. |
This PR is stalled because it has been open for 30 days with no activity. |
Closing this PR since won't be working on this actively. If this change needs to be picked up, please feel free to use my branch https://github.com/VachaShah/OpenSearch/tree/add-protos-for-search. |
Description
Coming from #11910, extracting some classes for introducing proto structures in this PR for easier reviews. This PR mainly contains:
Note:
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Part of #10684
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.