Dekaf materialization endpoint support #1840
Conversation
…s using `/authorize/dekaf` and `/authorize/task`. Also add a hint for passing a collection name as a topic name when the binding has renamed that topic.

* Connector projections: emit recommended constraints for all fields -- you get everything by default, and you can modify the selection however you like
* Schema: build a schema from the materialization's built spec's `field_selection` and the collection's projections that will match the extracted documents
* Extraction: implement field extraction using the `extractors` crate to emit documents that match the "value schema"
… a Session and write them to the correct ops logs journal. Also support filtering logs by the requested shard log level.
Then implement some tests to validate field selection logic.
This is part of #1876 (dekaf: Improvements to handle higher scale): we want to implement broker fallback so that Dekaf can connect to any of the brokers in the cluster if one doesn't respond. An improvement here would be to periodically fetch metadata from at least one of the responding brokers and update this list of addresses, so that future sessions know about and can use any newly created members of the cluster. I don't anticipate changing the topology of our cluster that frequently, and if we do, updating Dekaf's deployment configs isn't that big of a deal. I may eat my hat on this; we'll see. In addition, we want to move people over to the new MSK cluster, so this also implements routing new-style connections to a separate cluster with separate credentials.

A couple of things to note:
* I originally tried to create a single `journal::Client` responsible for appending both logs and stats, but I ended up realizing that `/authorize/task` only allows a token to be authorized for a single task/prefix at a time. So I took the simpler route of creating two clients, rather than teaching `/authorize/task` how to handle multiple tasks, which has some fairly delicate requirements.
* As it turns out, the stats rollups assume the presence of a `shard` field on both logs and stats. So I ended up needing to craft a `ShardRef` that just contains the Dekaf materialization name, and attach it to both the logs and stats documents that get emitted (see the sketch after this list).
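For illustration only, here is a minimal sketch of what broker fallback could look like; the `Connection` type and `connect` helper are placeholders for whatever client Dekaf actually constructs, not the real implementation:

```rust
use std::time::Duration;

// Placeholder connection type and connector, for illustration only.
struct Connection;
async fn connect(_addr: &str) -> anyhow::Result<Connection> {
    Ok(Connection)
}

/// Try each known broker address in order until one responds, so a single
/// unresponsive broker doesn't take down new sessions.
async fn connect_with_fallback(addrs: &[String]) -> anyhow::Result<Connection> {
    let mut last_err = None;
    for addr in addrs {
        match tokio::time::timeout(Duration::from_secs(5), connect(addr)).await {
            Ok(Ok(conn)) => return Ok(conn),
            Ok(Err(err)) => last_err = Some(err),
            Err(_elapsed) => last_err = Some(anyhow::anyhow!("timed out connecting to {addr}")),
        }
    }
    Err(last_err.unwrap_or_else(|| anyhow::anyhow!("no broker addresses configured")))
}
```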
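A small sketch of the second note, attaching one shard reference to both document types. The field names and the task name here are assumptions for illustration, not the actual `ops::ShardRef` proto:

```rust
use serde::Serialize;
use serde_json::json;

// Illustration only: field names are assumptions, not the real ops::ShardRef.
#[derive(Serialize, Clone)]
struct ShardRef {
    kind: String, // "materialization"
    name: String, // the Dekaf materialization's task name
}

fn main() {
    let shard = ShardRef {
        kind: "materialization".to_string(),
        name: "acmeCo/my-dekaf-materialization".to_string(), // hypothetical name
    };

    // Both the log and stats documents carry the same `shard` field, which is
    // what the stats rollups expect to find.
    let log = json!({ "shard": &shard, "level": "info", "message": "session started" });
    let stats = json!({ "shard": &shard, "docsTotal": 42, "bytesTotal": 4096 });
    println!("{log}\n{stats}");
}
```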
I noticed that after roughly 1-2 hours, Dekaf would stop writing logs and stats. I tracked that down to an error appending logs, specifically:

```
Grpc(Status { code: DeadlineExceeded, message: "context deadline exceeded" })
```

It turns out that this is the error Gazette returns when the auth token you pass it is expired, and the appending machinery in Dekaf wasn't taking token expiry into account. So this commit refactors `GazetteWriter` to be composed of two `GazetteAppender`s, one for logs and one for stats. Each `GazetteAppender` is capable of refreshing its internal client when necessary.
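A minimal sketch of that shape, assuming hypothetical internals; the real `GazetteAppender` wraps a gazette journal client, which is elided here:

```rust
use std::time::{Duration, Instant};

struct GazetteAppender {
    journal: String,
    token_expiry: Instant,
    // client: journal::Client, // elided in this sketch
}

impl GazetteAppender {
    async fn append(&mut self, data: bytes::Bytes) -> anyhow::Result<()> {
        // Refresh the client before its auth token expires, rather than
        // discovering expiry via a DeadlineExceeded error mid-append.
        if self.token_expiry <= Instant::now() + Duration::from_secs(60) {
            self.refresh_client().await?;
        }
        self.do_append(data).await
    }

    async fn refresh_client(&mut self) -> anyhow::Result<()> {
        // Stand-in: re-authorize and rebuild the inner client for `self.journal`.
        self.token_expiry = Instant::now() + Duration::from_secs(3600);
        Ok(())
    }

    async fn do_append(&mut self, _data: bytes::Bytes) -> anyhow::Result<()> {
        // Stand-in for the actual append to `self.journal`.
        Ok(())
    }
}

/// One appender for logs and one for stats, each refreshing independently.
struct GazetteWriter {
    logs: GazetteAppender,
    stats: GazetteAppender,
}
```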
We can still fetch suspended journals with a regular `ListRequest`. This will return journal specs which contain a `suspend` field. If `journal.spec.suspend.level` is `FULL`, it's not possible to read from that journal, so we need to (roughly sketched below):
* Report both the low and high watermarks as `journal.spec.suspend.offset`
* Serve empty result sets for any read against this partition
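A minimal sketch of that check, using simplified stand-in types rather than the actual broker protos (field names are assumptions):

```rust
#[derive(PartialEq, Clone, Copy)]
enum SuspendLevel {
    None,
    Partial,
    Full,
}

struct Suspend {
    level: SuspendLevel,
    offset: i64,
}

struct JournalSpec {
    suspend: Option<Suspend>,
}

/// Returns (low_watermark, high_watermark, serve_empty_reads) for a partition.
fn watermarks_for(spec: &JournalSpec, current_write_head: i64) -> (i64, i64, bool) {
    match &spec.suspend {
        Some(s) if s.level == SuspendLevel::Full => {
            // Fully suspended: report both watermarks at the suspend offset and
            // serve empty result sets for any fetch against this partition.
            (s.offset, s.offset, true)
        }
        // Simplification: a readable journal reports its current write head.
        _ => (0, current_write_head, false),
    }
}
```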
```rust
    record_bytes += tmp.len();
    buf.extend_from_slice(&tmp);
    tmp.clear();
    Some(buf.split().freeze())
};

input_bytes += next_offset - self.offset;
```
This will count acks... should we skip those?
Yep, we skip acks in the runtime -- they're neither counted as a "doc" nor do we accumulate their bytes.
LGTM! 🚢
I know you're still working on amortizing multiple log appends, but all of the big picture stuff looks solid, and I think this can land as-is.
Going forward (not this PR), a refactor I'd urge you to look into is separating the pre-authorization session loop from the post-authorization loop. A refactored pre-auth session loop would only expect authorization-related messages. It would build up all of the bits of session state & context, and then the moment the session authorization is complete, it would tail-call into the post-authorization handler.
The post-auth handler then gets the benefit of assuming the session context is fully established, that the `tokio::task_local!` is set and `Some`, and it can also use `tracing::instrument` for contextual fields of the session (instead of using `tracing-record-hierarchical` to mutate the parent Span).
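A rough sketch of that split, with hypothetical message and context types standing in for the real session machinery:

```rust
// Hypothetical sketch: a pre-auth loop that only accepts authorization
// messages, then tail-calls into a post-auth handler which can assume a
// fully-established session context.
struct SessionContext {
    task_name: String,
    client_id: Option<String>,
}

enum Message {
    SaslAuthenticate { token: String },
    Other,
}

async fn pre_auth_loop(mut recv: tokio::sync::mpsc::Receiver<Message>) -> anyhow::Result<()> {
    while let Some(msg) = recv.recv().await {
        match msg {
            Message::SaslAuthenticate { token } => {
                let ctx = authorize(&token).await?;
                // Authorization complete: hand the rest of the session to the
                // post-auth handler with its context fully established.
                return post_auth_loop(ctx, recv).await;
            }
            Message::Other => anyhow::bail!("message not allowed before authorization"),
        }
    }
    Ok(())
}

#[tracing::instrument(skip(ctx, recv), fields(task = %ctx.task_name))]
async fn post_auth_loop(
    ctx: SessionContext,
    mut recv: tokio::sync::mpsc::Receiver<Message>,
) -> anyhow::Result<()> {
    while let Some(_msg) = recv.recv().await {
        // Handle post-authorization requests here, with `task` already
        // attached to every span via the instrument attribute.
    }
    Ok(())
}

async fn authorize(_token: &str) -> anyhow::Result<SessionContext> {
    // Stand-in for the real authorization call.
    Ok(SessionContext { task_name: "example/task".to_string(), client_id: None })
}
```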
```rust
tracing::debug!(client_id=?header.client_id, "Got client ID!");
session.client_id = header.client_id.clone().map(|id| id.to_string());
if let Some(client_id) = &header.client_id {
    tracing::Span::current()
```
nit, and no changes in this PR, but noting for the future: the fact that this is reaching for `record_hierarchical` tells me that the session handling ought to be refactored into a pre-authorization handler, which then tail-calls down into a post-authorization handler that uses `tracing::instrument`.
crates/dekaf/src/log_appender.rs
"Got recoverable error trying to write logs, retrying" | ||
); | ||
|
||
tokio::time::sleep(Duration::from_millis(wait_ms)).await; |
`gazette` handles backoff sleeps for you already, so this will be on top of what it's doing.
crates/dekaf/src/log_appender.rs
```rust
loop {
    match resp.try_next().await {
        Ok(_) => return Ok(()),
        Err(RetryError { inner: err, .. })
```
nit: This case seems like it could be removed
```rust
        break name;
    }
    Some(TaskWriterMessage::Log(log)) => {
        pending_logs.push_front(log);
```
What's logged prior to authorization that's valuable to preserve in task logs? We can still direct stuff to application-level logs if needed. I'm just wondering if the juice is worth the squeeze.
```rust
let registry = tracing_subscriber::registry()
    .with(tracing_record_hierarchical::HierarchicalRecord::default())
    .with(
        ops::tracing::Layer::new(
```
glad this worked out! Way better
```rust
/// Note that since avro encoding can happen piecewise, there's never a need to
/// put together the whole extracted document, and instead we can build up the
/// encoded output iteratively
fn extract_and_encode<'a>(
```
👍 nice
```rust
// This lets us add our own "virtual" fields to Dekaf without having to add them to
// doc::Extractor and all of the other platform machinery.
impl CustomizableExtractor {
```
IMO it's very reasonable to add this in `extractors.rs`, but 🤷♂️
`Client::append` is fine if you only have a single buffer to append, but if you want to append an ongoing stream of messages in order, you fundamentally need somewhere to buffer your messages until they can be included in an append request.
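A minimal sketch of that buffering pattern; `append_once` here is a stand-in for the real client append call, not the actual `journal::Client` API:

```rust
use bytes::{Bytes, BytesMut};
use tokio::sync::mpsc;

// Stand-in for a single append request against a journal.
async fn append_once(_journal: &str, _data: Bytes) -> anyhow::Result<()> {
    Ok(())
}

/// Drain an ordered stream of messages into a buffer, and flush one append at
/// a time so ordering is preserved even while new messages keep arriving.
async fn run_appender(journal: String, mut rx: mpsc::Receiver<Bytes>) -> anyhow::Result<()> {
    let mut buf = BytesMut::new();
    while let Some(first) = rx.recv().await {
        buf.extend_from_slice(&first);
        // Opportunistically coalesce whatever else is already queued, so one
        // append request can carry many messages.
        while let Ok(next) = rx.try_recv() {
            buf.extend_from_slice(&next);
        }
        append_once(&journal, buf.split().freeze()).await?;
    }
    Ok(())
}
```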
Description:

This adds the server-side half of Dekaf's support for materialization endpoints. At a high level, Dekaf is just another way to get data out of Flow, and we already have a well-fleshed-out concept for that: materializations. So back in #1665 we introduced support for a new materialization endpoint type, `dekaf`, which lives alongside `local` and `connector` as the third kind of materialization.

The second part of this work is for Dekaf the server to support this mode of operation. Briefly, it needs to:
* Authorize incoming sessions to their materialization task
* Build a schema from each binding's field selection and the collection's projections
* Extract and encode fields from collection documents to match that schema
* Write per-task logs and stats to the correct ops journals

I still have a couple of things on my list before this is fully wrapped up:
* Route `SessionAuthentication::Task` sessions to a new/different backing store for migration purposes