Drop user agent matches and a bugfix for over-committing KafkaRecords too early #468
@@ -13,6 +13,7 @@
 import org.opensearch.migrations.trafficcapture.protos.EndOfSegmentsIndication;
 import org.opensearch.migrations.trafficcapture.protos.ReadObservation;
 import org.opensearch.migrations.trafficcapture.protos.ReadSegmentObservation;
+import org.opensearch.migrations.trafficcapture.protos.RequestIntentionallyDropped;
 import org.opensearch.migrations.trafficcapture.protos.TrafficObservation;
 import org.opensearch.migrations.trafficcapture.protos.TrafficStream;
 import org.opensearch.migrations.trafficcapture.protos.WriteObservation;
@@ -204,6 +205,16 @@ public CompletableFuture<T> flushCommitAndResetStream(boolean isFinal) throws IO
         return future;
     }

+    @Override
+    public void cancelCaptureForCurrentRequest(Instant timestamp) throws IOException {
+        beginSubstreamObservation(timestamp, TrafficObservation.REQUESTDROPPED_FIELD_NUMBER, 1);
+        getOrCreateCodedOutputStream().writeMessage(TrafficObservation.REQUESTDROPPED_FIELD_NUMBER,
+                RequestIntentionallyDropped.getDefaultInstance());
+        this.readObservationsAreWaitingForEom = false;
+        this.firstLineByteLength = -1;
+        this.headersByteLength = -1;
+    }
+
     @Override
     public void addBindEvent(Instant timestamp, SocketAddress addr) throws IOException {
         // not implemented for this serializer. The v1.0 version of the replayer will ignore this type of observation

Review thread on cancelCaptureForCurrentRequest:

Do we have any logic on the Replayer to handle this new observation?

Good question - as of when you reviewed this, no. I've pushed a new commit.
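For context, here is a minimal sketch of what that Replayer-side handling could look like. It is illustrative only, not the commit that was pushed: the accumulator class is hypothetical, and the hasRequestDropped() accessor is an assumption based on the REQUESTDROPPED oneof field in the generated protobuf code.

import java.util.ArrayList;
import java.util.List;

import org.opensearch.migrations.trafficcapture.protos.TrafficObservation;

// Hypothetical accumulator: buffers a request's observations and discards
// them when the capture side signals the request was intentionally dropped.
class DropAwareRequestAccumulator {
    private final List<TrafficObservation> pendingObservations = new ArrayList<>();

    void accept(TrafficObservation observation) {
        // Accessor name assumed from the REQUESTDROPPED oneof field.
        if (observation.hasRequestDropped()) {
            // Tombstone: forget everything buffered for the in-flight request
            // so it never reaches the replay pipeline.
            pendingObservations.clear();
            return;
        }
        pendingObservations.add(observation);
        // ...on an EndOfMessage observation, hand pendingObservations to the
        // replayer and reset for the next request...
    }
}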
@@ -0,0 +1,29 @@ (new file: HeaderValueFilteringCapturePredicate.java)
+package org.opensearch.migrations.trafficcapture.netty;
+
+import io.netty.handler.codec.http.HttpRequest;
+
+import java.util.Map;
+import java.util.Optional;
+import java.util.regex.Pattern;
+import java.util.stream.Collectors;
+
+public class HeaderValueFilteringCapturePredicate extends RequestCapturePredicate {
+    private final Map<String, Pattern> headerToPredicateRegexMap;
+
+    public HeaderValueFilteringCapturePredicate(Map<String, String> suppressCaptureHeaderPairs) {
+        super(new PassThruHttpHeaders.HttpHeadersToPreserve(suppressCaptureHeaderPairs.keySet()
+                .toArray(String[]::new)));
+        headerToPredicateRegexMap = suppressCaptureHeaderPairs.entrySet().stream()
+                .collect(Collectors.toMap(Map.Entry::getKey, kvp -> Pattern.compile(kvp.getValue())));
+    }
+
+    @Override
+    public CaptureDirective apply(HttpRequest request) {
+        return headerToPredicateRegexMap.entrySet().stream().anyMatch(kvp ->
+                Optional.ofNullable(request.headers().get(kvp.getKey()))
+                        .map(v -> kvp.getValue().matcher(v).matches())
+                        .orElse(false)
+        ) ? CaptureDirective.DROP : CaptureDirective.CAPTURE;
+    }
+}
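A hedged usage sketch of the new predicate follows; the header name and regex are hypothetical, and CaptureDirective is assumed to be the enum returned by apply above. It drops requests whose User-Agent fully matches a pattern and captures everything else, which is the "drop user agent matches" behavior named in this PR's title.

import java.util.Map;

import io.netty.handler.codec.http.DefaultFullHttpRequest;
import io.netty.handler.codec.http.HttpMethod;
import io.netty.handler.codec.http.HttpRequest;
import io.netty.handler.codec.http.HttpVersion;

class PredicateUsageExample {
    public static void main(String[] args) {
        // Suppress capture for anything identifying itself as a health checker.
        var predicate = new HeaderValueFilteringCapturePredicate(
                Map.of("User-Agent", ".*health-checker.*"));

        HttpRequest request = new DefaultFullHttpRequest(
                HttpVersion.HTTP_1_1, HttpMethod.GET, "/");
        request.headers().set("User-Agent", "internal-health-checker/1.0");

        // Prints DROP: the User-Agent value fully matches the configured regex.
        System.out.println(predicate.apply(request));
    }
}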
Review thread on HeaderValueFilteringCapturePredicate:

If I understand correctly, we will still write any packets up to the point that we can decipher the headers, at which point we will add this new observation to signify that the preceding observations can be ignored. It seems like ideally we wouldn't even write them to Kafka, but I do agree that is not necessarily an easy thing to do with our current logic.
That is correct. If a caller wants to make sure that there's no trace, it's reasonable to expect that they can send a request that the filtering logic will match early enough (i.e., in the first packet). That said, we don't parse the headers as they come in, but rather once they've all arrived; parsing them incrementally would be a really nice-to-have.

However, there may be cases that are spread across time. We have two choices: buffer and manage, or offload ASAP (keeping in mind that there's a lot of buffering through the stacks that is just not super easy to retract). Not capturing is an optimization on the wire protocol, but at the expense of compute and memory for the proxy. Not replaying is the visible high-level requirement, which we'll meet by adding the tombstone. Trying to do more for what could be very rare cases, considering that we're going to be set up to log nearly all of the traffic anyway, doesn't seem like an investment that would ever pay off (as much as it pains me to NOT do that wire optimization).
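To make the tombstone semantics concrete, here is an illustrative sketch of the stream shape for a request that was partially captured and then dropped. The builder and field names (setConnectionId, setRead, setData, setRequestDropped) are assumptions about the generated protobuf code, not verified against this repo.

import com.google.protobuf.ByteString;

import org.opensearch.migrations.trafficcapture.protos.ReadObservation;
import org.opensearch.migrations.trafficcapture.protos.RequestIntentionallyDropped;
import org.opensearch.migrations.trafficcapture.protos.TrafficObservation;
import org.opensearch.migrations.trafficcapture.protos.TrafficStream;

class TombstoneShapeExample {
    static TrafficStream droppedRequestStream() {
        return TrafficStream.newBuilder()
                // Hypothetical connection id; setters assumed from the .proto files.
                .setConnectionId("conn-1")
                // Early reads were already committed before the drop decision.
                .addSubStream(TrafficObservation.newBuilder()
                        .setRead(ReadObservation.newBuilder()
                                .setData(ByteString.copyFromUtf8(
                                        "GET /secret HTTP/1.1\r\nUser-Agent: blocked/1.0\r\n"))))
                // Tombstone: consumers must discard every observation
                // accumulated for this request instead of replaying it.
                .addSubStream(TrafficObservation.newBuilder()
                        .setRequestDropped(RequestIntentionallyDropped.getDefaultInstance()))
                .build();
    }
}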
However, there may be cases that are spread across time. We have two choices. Buffer and manage, or offload ASAP (keeping in mind that there's a lot of buffering through the stacks, just not super easy to retract). Not capturing is an optimization on the wire protocol, but at the expense of compute & memory for the proxy. Not replaying is the visible high-level requirement, which we'll meet by adding the tombstone. Trying to do more for what could be very rare cases, considering that we're going to be setup to log nearly all of the traffic anyway doesn't seem like an investment that would ever pay off. (as much as it pains me to NOT do that wire optimization)