feat: Add stream pull queries to http2 #8124
Conversation
) {
  // If we are supposed to stop at the endOffsets, but we didn't get any, then we should
In what scenario are we not getting any endOffsets even though we are supposed to?
It's a bit weird, but if a partition is empty, then we just won't get an "end offset" for it at all. So, if all the partitions are empty, then the whole map will be empty. In this case, it just means we're supposed to stop at the "end" of an empty topic. Aka, we should return zero results.
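For illustration, the guard being described might look roughly like this (a minimal sketch, not the PR's exact code; signalComplete is a hypothetical name for whatever closes out the query):

  // If we were asked to stop at the end offsets but the admin client returned
  // none, every partition was empty, so the query is trivially complete with
  // zero results.
  if (endOffsets.isEmpty()) {
    signalComplete();
    return;
  }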
  queue.acceptRow(null, record.value());
}

checkCompletion();
For every record added to the queue, we check the endOffsets across all partitions in checkCompletion. I am wondering if we can make it cheaper (might not be needed if this is already very cheap) by just checking for completion in the partition the current record was read from, since the other partitions cannot have reached their endOffset.
That's a good observation, but it should actually be pretty cheap already, since this is only checking local variables. I'd be surprised if just checking the current record's partition is noticeably faster, but I'll go ahead and do it.
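As a sketch, the per-partition variant agreed on here might be invoked like this from process() (assuming the new Processor API; names are illustrative):

  // Only the partition the current record came from can newly reach its end
  // offset, so check just that one instead of scanning all partitions.
  context.recordMetadata().ifPresent(m ->
      checkCompletion(new TopicPartition(m.topic(), m.partition())));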
context.schedule(
    Duration.ofMillis(100),
    PunctuationType.WALL_CLOCK_TIME,
    timestamp -> checkCompletion()
);
We check for completion both periodically and when reading records. For my education, why is the periodic one needed as well?
You're right; I should add a comment :)
The process() call is only triggered when processing a record. If an upstream processor filters records out, then this processor (which is right at the end of the topology) might never get triggered and hence might never complete. I figured that out the hard way :/
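A minimal sketch of how the punctuator registration fits into the processor's init() (the Void generic types are an assumption about the sink processor's signature):

  @Override
  public void init(final ProcessorContext<Void, Void> context) {
    this.context = context;
    // process() only runs when a record actually reaches this processor, so an
    // upstream filter that drops every record would otherwise leave the query
    // hanging. The wall-clock punctuator re-checks completion regardless.
    context.schedule(
        Duration.ofMillis(100),
        PunctuationType.WALL_CLOCK_TIME,
        timestamp -> checkCompletion());
  }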
Hey @vpapavas and @AlanConfluent ,
I'm ready for a real review pass now. I've left some breadcrumbs to help your review.
@@ -279,9 +281,9 @@ public static void setUpClass() throws Exception {
   }

   private static void writeConnectConfigs(final String path, final Map<String, String> configs) throws Exception {
-    try (PrintWriter out = new PrintWriter(new OutputStreamWriter(
+    try (final PrintWriter out = new PrintWriter(new OutputStreamWriter(
While trying to resolve all the checkstyle errors, I wound up accepting some IntelliJ suggestions. I went ahead and kept the super minor ones.
@@ -371,6 +373,74 @@ public void shouldStreamPushQuerySync() throws Exception {
     assertThat(streamedQueryResult.isComplete(), is(false));
   }

+  @Test
+  public void shouldStreamPullQueryOnStreamAsync() throws Exception {
In lieu of running all the functional test scenarios from RQTT, I'm just adding a couple of smoke tests as hand-written ITs for the HTTP/2 endpoint.
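For a sense of the shape of such a smoke test, here is a hedged sketch using the ksqlDB Java client (TEST_STREAM and EXPECTED_ROW_COUNT are placeholder names, and the exact assertions in the PR may differ):

  @Test
  public void shouldStreamPullQueryOnStreamSync() throws Exception {
    final StreamedQueryResult result =
        client.streamQuery("SELECT * FROM " + TEST_STREAM + ";").get();

    final List<Row> rows = new ArrayList<>();
    Row row;
    // poll() blocks for the next row and returns null once the query completes
    while ((row = result.poll()) != null) {
      rows.add(row);
    }

    verifyStreamRows(rows, EXPECTED_ROW_COUNT);
    assertThat(result.isComplete(), is(true));
  }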
@@ -675,119 +745,6 @@ public void shouldHandleErrorResponseFromInsertInto() {
     assertThat(e.getCause().getMessage(), containsString("Cannot insert into a table"));
   }

-  @Test
-  public void shouldStreamQueryWithProperties() throws Exception {
These tests actually mutate the shared state and therefore affect other tests. I've moved them to a new IT file called ClientMutationIntegrationTest.
SGTM.
   private static void verifyStreamRows(final List<Row> rows, final int numRows) {
-    assertThat(rows, hasSize(numRows));
-    for (int i = 0; i < numRows; i++) {
+    for (int i = 0; i < Math.min(numRows, rows.size()); i++) {
       verifyStreamRowWithIndex(rows.get(i), i);
     }
+    if (rows.size() < numRows) {
+      fail("Expected " + numRows + " but only got " + rows.size());
+    } else if (rows.size() > numRows) {
+      final List<Row> extra = rows.subList(numRows, rows.size());
+      fail("Expected " + numRows + " but got " + rows.size() + ". The extra rows were: " + extra);
+    }
+
+    // not strictly necessary after the other checks, but just to specify the invariant
+    assertThat(rows, hasSize(numRows));
   }
This change was part of my debugging that led to discovering that some of the tests were inserting new data into the shared stream (those tests are now ejected into a separate test file).
I figured I'd go ahead and keep this test improvement, since it is nice to get some extra information about why we didn't get the expected row count and whether the rows we did get were the expected ones.
import org.junit.rules.RuleChain;

@Category({IntegrationTest.class})
public class ClientMutationIntegrationTest {
Basically just copied from ClientIntegrationTest to isolate the tests that want to mutate the shared test data (like inserting new records). It seems like we didn't want to take the time to completely clean and re-create the test data in between each test, so the tests we write need to know whether we expect the data to change or not.
Separating the "mutation" tests from the "static" tests is advantageous because it makes it clear whether a particular test needs to worry about the underlying data changing or not.
callback = new LimitQueueCallback() {
  @Override
  public boolean shouldQueue() {
    return parent.shouldQueue();
  }

  @Override
  public void onQueued() {
    parent.onQueued();
    queuedCallback.run();
  }
This complexity is a big chunk of the reason why I inlined this callback. We were already piggy-backing the "queued" callback on the limit handler, and it was really hard to see what was actually going on in the code. By flattening it into this class, we can actually follow this logic now.
I agree. I always found this parent chaining to be a bit hard to understand.
+1
if (remaining != null && remaining.decrementAndGet() <= 0) {
  limitHandler.limitReached();
}
if (queuedCallback != null) {
  queuedCallback.run();
}
If you drilled down into that composite limit + queued handler callback, this is all it was actually doing.
this("STRUCTURED_TYPES"); | ||
} | ||
|
||
public StructuredTypesDataProvider(final String namePrefix) { |
Pulling out this constructor lets us give the topics and tables/streams a different name (which we use for the new ClientMutationIntegrationTest).
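For example, the new IT might construct an isolated copy of the data along these lines (the prefix value here is hypothetical):

  private static final StructuredTypesDataProvider TEST_DATA_PROVIDER =
      new StructuredTypesDataProvider("STRUCTURED_TYPES_MUTABLE");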
@@ -93,17 +93,13 @@
       }
     },
     {
-      "name": "correct results",
+      "name": "simple",
I know it's a bummer to change the tests and logic at the same time, but having all these cases in the same test was making it hard to debug.
     final String message = String.format(
         "%s - %s [%s] \"%s %s %s\" %d %d \"-\" \"%s\" %d",
-        routingContext.request().remoteAddress().host(),
+        socketAddress == null ? "null" : socketAddress.host(),
Not sure why, but I got an NPE on this line a couple of times. This should prevent it.
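A sketch of the guard (assuming socketAddress is extracted from the Vert.x request just above; the surrounding code is not shown in the diff):

  // remoteAddress() can evidently come back null here (root cause unclear), so
  // guard before formatting the access-log line.
  final SocketAddress socketAddress = routingContext.request().remoteAddress();
  final String remoteHost = socketAddress == null ? "null" : socketAddress.host();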
Force-pushed from 6ac5f60 to dfe4e0a
@@ -471,7 +422,7 @@ private TopicDescription getTopicDescription(final Admin admin, final String sou
     final ListOffsetsResult listOffsetsResult = admin.listOffsets(
         topicPartitions,
         new ListOffsetsOptions(
-            isolationLevel
+            IsolationLevel.READ_UNCOMMITTED
Why read uncommitted rather than the code from before?
I hard-coded the stream pull query app to be "at least once" (because it is pure side-effect, there is no point in EOS), which means we only need to get the end offsets in "read uncommitted" mode.
Ok, I added a comment to clarify this.
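The pairing being clarified is roughly this (a sketch; props and the comment wording are assumptions, though the config keys are the real Streams/Admin ones):

  // The stream pull query app is pure side-effect, so exactly-once buys
  // nothing; hard-code at-least-once processing...
  props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.AT_LEAST_ONCE);

  // ...which in turn means READ_UNCOMMITTED is sufficient when fetching the
  // end offsets to stop at.
  final ListOffsetsOptions options =
      new ListOffsetsOptions(IsolationLevel.READ_UNCOMMITTED);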
// not currently used in pull queries, although future refactoring might be able to
// take advantage of this mechanism.
I poked through the code and saw how we do this for pull queries at the moment: we use the LimitHandler and just fire it, even if there's nothing in the queue. It's effectively the same thing as this completion handler. Can we make a follow-up task to consolidate these two?
// We ignore null values in streams as invalid data and drop the record.
// If we're already done with the current partition, we just drop the record.
if (record.value() != null && !alreadyDone) {
Do push queries not handle null data values? For some reason I thought they did.
Yeah, this code was here before, so I guess not.
That is consistent with KS semantics: null values are considered invalid and always filtered out. In tables they are treated as tombstones, of course, and we more or less leave users responsible for handling the difference between the two when converting between them.
  }
}

private void checkCompletion(final TopicPartition topicPartition) {
I found it just a bit confusing having three different versions of checkCompletion. Maybe it makes sense to call this something like checkCompletionPartition and the next one checkCompletionAllPartitions?
If you had checkCompletion() call this method, you could inline the body of the third method here. Is the idea that you want a consistent snapshot of currentPositions? There is only one thread acting on everything, right?
Roger that. I kind of phoned it in on naming these methods. I'll fix it.
It's handy to keep the third method so that we don't have to separately call currentPositions before invoking the single-partition version. Maybe you can take another look after I push my current batch of updates and see if you think it's ok.
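A sketch of the resulting shape (method names follow the renaming discussion above; bodies are elided or assumed):

  // Punctuator entry point: snapshot the positions once, then check every partition.
  private void checkCompletion() {
    final Map<TopicPartition, Long> positions = currentPositions();
    positions.keySet().forEach(tp -> checkCompletion(positions, tp));
  }

  // Convenience overload used from process(): fetches the snapshot itself so
  // callers don't have to call currentPositions() separately.
  private void checkCompletion(final TopicPartition tp) {
    checkCompletion(currentPositions(), tp);
  }

  // The actual single-partition check against a snapshot of positions.
  private void checkCompletion(final Map<TopicPartition, Long> positions, final TopicPartition tp) {
    // compare positions.get(tp) against the end offset recorded for tp...
  }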
// Use reflection to pull the StreamTask out of the ProcessorContextImpl so we
// can ask it for its committable offsets (see the discussion below).
final Field streamTask = ProcessorContextImpl.class.getDeclaredField("streamTask");
streamTask.setAccessible(true);
final StreamTask task = (StreamTask) streamTask.get(context);
final Method committableOffsetsAndMetadata =
We should make an issue to expose this in streams so that this isn't required.
I wanted to sanity-check the overall algorithm before starting that process. Since it seems like you agree with the approach, I'll go ahead and propose it for AK now.
Here's a POC I put together for an AK feature, but I haven't taken the time to formally propose it: apache/kafka#11336
@@ -650,4 +671,131 @@ private static void updateListProperty(
     valueList.add(value);
     properties.put(key, valueList);
   }

+  private static class TransientQuerySinkProcessor implements
This is complex enough that we might want to pull it out into its own class.
Great idea.
@@ -364,15 +354,16 @@ public StreamPullQueryMetadata createStreamPullQuery(
         .putAll(statementOrig.getSessionConfig().getOverrides())
         .put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
         .put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 1)
Is this important for threadsafety in your processor? Might make sense to comment on that here if so.
No, I just think it'd be wasteful to spend more than one thread on this operation.
Turns out, I did have a comment explaining the full set of configs a few lines above, but I'd already forgotten to update it when I removed the commit interval, which isn't a good sign.
I moved the comments down to explain each config option independently, so this is hopefully clearer now.
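The per-option comments presumably read something like this (a sketch; the PR's exact wording may differ):

  // a pull query always reads the source topic from the beginning
  .put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
  // one thread is enough for this one-shot scan; more would be wasteful
  .put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 1)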
Made a pass as well, I really like the cleanup you did alongside this PR!
One meta question I had is around why we need to split the phases of stream pull queries into create first and then wait, instead of a single execute as in the others. I think I just lack some critical context here :P Will sync with @vvcephei separately offline.
final Field streamTask = ProcessorContextImpl.class.getDeclaredField("streamTask");
streamTask.setAccessible(true);
final StreamTask task = (StreamTask) streamTask.get(context);
final Method committableOffsetsAndMetadata =
If the current position is referring to the source topic's position, could we get it via ProcessorContext#recordMetadata()#offset()? cc @AlanConfluent
I tried that first, only to sadly be reminded that the endOffsets returned by the admin client are not the "offsets of the last records in the partitions", but rather "the offset after the last record in the partition", aka "the offset that the next produced record will get".
The reason I cracked the safe to get at this method here is that it specifically computes that position for the purpose of committing offsets. I.e., we don't commit the offsets of the records we have processed; we commit the offsets of the next records we need to process. It might be tempting to just use recordMetadata#offset() + 1 for this, but the reality is that it's not so trivial, thanks to skips in the offset sequence due to compaction, "invisible" offsets for transaction markers, and stuff like that.
The method I'm exposing here specifically accounts for all of those effects:
https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamTask.java#L435-L455
In addition to the above, we have another problem: upstream processors might filter out arbitrary records before they even reach this processor. Specifically, if you write a stream pull query that should only match one record, and there are more records after it in the topic, the query would never terminate, since the process() method would never see any offsets past the one we're filtering for. That's why I also registered the punctuator, and the punctuator needs to be able to check the "current position" of the task as a whole, since it has no recordMetadata available to get the offset from.
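To make the mechanism concrete, here is a hedged sketch of the reflective call and the completion predicate it enables (the predicate itself is an assumption; the method name and return type come from the linked StreamTask code):

  // committableOffsetsAndMetadata() yields, per partition, the offset that
  // would be committed next (already corrected for compaction gaps and
  // transaction markers), so it is directly comparable to endOffsets.
  final Method committableOffsetsAndMetadata =
      StreamTask.class.getDeclaredMethod("committableOffsetsAndMetadata");
  committableOffsetsAndMetadata.setAccessible(true);
  @SuppressWarnings("unchecked")
  final Map<TopicPartition, OffsetAndMetadata> committable =
      (Map<TopicPartition, OffsetAndMetadata>) committableOffsetsAndMetadata.invoke(task);

  // Complete once every partition's committable position has reached its end offset.
  final boolean complete = endOffsets.entrySet().stream().allMatch(e -> {
    final OffsetAndMetadata position = committable.get(e.getKey());
    return position != null && position.offset() >= e.getValue();
  });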
Thanks John, that makes sense!
Thanks for taking a look, @guozhangwang ! I responded above to your question about the Streams state I'm peeking into. Please let me know if it still seems wrong to you. About the create/wait separation: that separation is actually gone now. Specifically, this PR removes the "wait" method in favor of using the same mechanism to complete the query that we're currently using to implement the "limit" clause for push queries. That would have been a better way to do it from the start, but I didn't have the inspiration until now. I considered renaming the "create" method to "execute" like the others, but I'm actually more inclined to rename them all to "create", for the simple reason that none of those methods actually execute anything. They all only create the query metadata, and then some other component later calls start().
Looking at the function names in KsqlEngine, I'm wondering if we could consider renaming some of them, since:
- executeTransientQuery only creates the query, and then start() is triggered in the publisher
- executeTablePullQuery may trigger start() if startImmediately is true (and in the new APIs it is indeed set to true); if not, it is triggered in the publisher
- executeScalablePushQuery is the same as executeTransientQuery

So maybe it's better to consider renaming all of them to createXXX, similar to createStreamPullQuery, except createAndMaybeExecuteTablePullQuery? WDYT @vvcephei @AlanConfluent @vpapavas
Looks good
Thanks for looking at that, @guozhangwang ! I agree with your proposal, but I'd like to tackle it in a later PR.
Thanks for the review @AlanConfluent !
Adapting the stream pull query execution to the HTTP/2 endpoint, which wound up being a significant refactor.