[Fleet] Implement Kafka output form UI #143324

Closed
6 of 62 tasks
Tracked by #16
nimarezainia opened this issue Oct 13, 2022 · 59 comments · Fixed by #159110 or #160112
Assignees
Labels
8.10 candidate epic Feature:Fleet Fleet team's agent central management project OLM Sprint QA:Validated Issue has been validated by QA Team:Defend Workflows “EDR Workflows” sub-team of Security Solution Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@nimarezainia
Contributor

nimarezainia commented Oct 13, 2022

Kafka output UI

Similar to Logstash output, we need to add the option for users to specify Kafka as an output option for their data. In 8.8, this UI will be hidden behind an experimental flag as the shipper portion is not ready until 8.9.

Tasks

API

The output API should support a new output type: kafka.

See Kafka Output type

This output type should have the following properties:

hosts[]: uri
version?: string // defaults to 1.0.0 by beats/agent if not set
key?: string
compression?: 'snappy' | 'lz4' | 'gzip' // defaults to gzip
compression_level?: integer // only for gzip compression, defaults to 4
client_id: string

// authentication can be done using:
//   username/password, ssl, or kerberos
auth_type: 'user_pass' | 'ssl' | 'kerberos'

// auth: username/password
username?: string
password?: string
sasl.mechanism?: 'PLAIN' | 'SCRAM-SHA-256' | 'SCRAM-SHA-512'

// auth: ssl
ssl.certificate_authorities?: string
ssl.certificate?: string
ssl.key?: string

// auth: kerberos - should be marked as beta
// TBD: to check if we should do this as part of phase 1 if it is in beta

// partitioning settings
partition: 'random' | 'round_robin' | 'hash' // defaults to 'hash'
random.group_events?: integer
round_robin.group_events?: integer
hash.hash?: string
hash.random?: boolean // TBD: check the type of this field

// topics array
topics:
  topic: string
  when.type?: 'equals' | 'contains' | 'regexp' | 'range' | 'network' | 'has_fields' | 'or' | 'and' | 'not'
  when.condition?: string

// headers array
headers?:
  key: string
  value: string

// broker
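
Taken together, the property list above suggests a request body shape along these lines. This is only an illustrative sketch in TypeScript: the field names come from the list above, the values are made up, and the sasl/ssl/partition sub-fields are omitted for brevity.

```typescript
// Illustrative shape of a kafka output, derived from the property list above.
interface KafkaOutput {
  type: 'kafka';
  hosts: string[];
  version?: string;            // defaults to 1.0.0 in beats/agent if not set
  key?: string;
  compression?: 'snappy' | 'lz4' | 'gzip';
  compression_level?: number;  // gzip only, defaults to 4
  client_id: string;
  auth_type: 'user_pass' | 'ssl' | 'kerberos';
  username?: string;
  password?: string;
  partition?: 'random' | 'round_robin' | 'hash';
  topics: Array<{ topic: string; when?: { type: string; condition: string } }>;
  headers?: Array<{ key: string; value: string }>;
}

// Hypothetical example payload; broker addresses and credentials are invented.
const example: KafkaOutput = {
  type: 'kafka',
  hosts: ['broker-1.example.com:9092', 'broker-2.example.com:9092'],
  client_id: 'Elastic Agent',
  auth_type: 'user_pass',
  username: 'agent',
  password: 'changeme',
  compression: 'gzip',
  compression_level: 4,
  topics: [{ topic: 'logs-default' }],
};
```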

UI tasks

  • Implement Kafka output form based on designs
    • Add "Kafka" option to output type dropdown if experimental flag is enabled
    • Top section
      • Kafka Version dropdown
        • Valid range 0.8.2.0 through 2.6.0 (verify all the available options)
        • Defaults to 1.0.0
      • Hosts field - List of brokers on port 9092
        • Required field
        • Description: Specify the URLs that your agents will use to connect to Kafka. For more information, see the Fleet User Guide
        • Needs to have an Add row button below
    • Authentication section - selection by radio buttons and different fields for:
      • Option 1 - Username/Password
        • SASL Mechanism (radio buttons list)
          • PLAIN
          • SCRAM-SHA-256
          • SCRAM-SHA-512
      • Option 2 - SSL (Similar design pattern as Logstash)
        • Server SSL certificate authorities (text input)
        • Add multiple certificate authorities (button)
        • Client SSL certificate (text input)
        • Client SSL certificate key (text input)
      • Option 3 - Kerberos (TBD, may not be needed in this phase. It might be marked as beta)
    • Partitioning section
      • Selection by radio buttons and different fields for:
        • Option 1 - Random
          • Number of events (input box)
        • Option 2 - Round robin
          • Number of events (input box)
        • Option 3 - Hash
          • Comma-separated list of fields used to compute the partition hash value
    • Topics section
      • Default topic - this should be saved to the last item of the topics[] array (text input box)
      • Processors - should be able to add arbitrary amount of processors and be able to remove and reorder them
      • All the possible conditions are described here
    • Headers section
      • Key and value text inputs
      • Should be able to add arbitrary amount of key/value pairs (more definition needed on this)
      • Client ID field that defaults to Elastic Agent
    • Compression section
      • Toggle to enable/disable
      • If enabled, provide a dropdown for the different compression types
      • Defaults to gzip and compression level 4
      • Dropdown can be one of:
        • none
        • snappy
        • lz4
        • gzip
      • If gzip, also show field for compression level
    • Broker settings section
      • Broker Timeout dropdown - default 30s
        • Description: Define how long a Kafka server waits for data in the same cluster
      • Broker Reachability Timeout dropdown - default 30s
        • Description: Define how long an Agent would wait for a response from Kafka Broker
      • Channel Buffer Size - default 256
        • Description: Define the number of messages buffered in output pipeline
      • ACK reliability dropdown - default Wait for local commit
        • Description: Reliability level required from the broker
        • Options:
          • Wait for local commit
          • Wait for all replicas to commit
          • Do not wait
    • Key field (text input)
      • Optional formatted string specifying the Kafka event key
      • Syntax described here and examples are here
      • Description: If configured, the event key can be extracted from the event using a format string
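
Since the Hosts field is required and expects broker addresses on a port (conventionally 9092), its client-side validation could be sketched as follows. This is a hypothetical helper, not the actual Fleet implementation; the function name and error messages are invented.

```typescript
// Hypothetical validation for one entry of the Hosts field.
// Returns an error message for invalid input, or undefined when valid.
function validateKafkaHost(input: string): string | undefined {
  const match = /^([a-zA-Z0-9.-]+):(\d{1,5})$/.exec(input.trim());
  if (!match) return 'Expected host:port, e.g. broker.example.com:9092';
  const port = Number(match[2]);
  if (port < 1 || port > 65535) return 'Port must be between 1 and 65535';
  return undefined; // valid
}
```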
Designs

Kafka_output_1 Kafka_output_2 Kafka_output_4 Kafka_output_5

Open questions

  • Should we add an "experimental" or "beta" badge to the UI (similar to what was done for the shipper section)?
@nimarezainia nimarezainia added the Feature:Fleet Fleet team's agent central management project label Oct 13, 2022
@elasticmachine
Contributor

Pinging @elastic/fleet (Feature:Fleet)

@botelastic botelastic bot added the needs-team Issues missing a team label label Oct 13, 2022
@dmlemeshko dmlemeshko added the Team:Fleet Team label for Observability Data Collection Fleet team label Oct 14, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Oct 14, 2022
@jlind23
Contributor

jlind23 commented Jan 3, 2023

@jen-huang @nimarezainia Do we already have the design for this work?

@jen-huang
Contributor

@jlind23 Yes we do, link to the designs can be found in the product definition doc in parent issue of this one.

@jlind23
Contributor

jlind23 commented Mar 28, 2023

@jen-huang is the tech definition ready to be worked on in our next sprint?

@jen-huang
Contributor

@jlind23 I'm still going to work on it this week.

@jen-huang jen-huang changed the title [Elastic Agent] Implementing UI elements of Kafka Output for fleet managed agents [Fleet] Kafka output UI Mar 30, 2023
@jen-huang jen-huang changed the title [Fleet] Kafka output UI [Fleet] Implement Kafka output form UI Mar 30, 2023
@jlind23
Contributor

jlind23 commented Apr 3, 2023

@jen-huang As you changed this issue's title to "implement", I believe the status should be changed to "ready" accordingly? Shall I also remove your assignment?

@kpollich kpollich assigned criamico and unassigned jen-huang Apr 4, 2023
@kpollich kpollich added QA:Needs Validation Issue needs to be validated by QA v8.8.0 labels Apr 5, 2023
@criamico
Contributor

I'm currently looking at the schema validation for the new kafka type and it would be much cleaner if we moved from

/api/fleet/outputs
{
  type: 'kafka'
  ... 
}

to

/api/fleet/outputs/kafka
{
  ...
}

Of course we would keep the old endpoint for a few releases and mark it as deprecated. @kpollich suggested redirecting requests to the right handler based on the type property in the request body.
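
The suggested redirect could be sketched roughly like this. The handler registry and function names are hypothetical, purely to illustrate dispatching on the `type` property of the request body.

```typescript
// Sketch of the suggestion above: keep POST /api/fleet/outputs working,
// but dispatch internally based on the `type` in the request body.
type OutputRequest = { type: 'kafka' | 'elasticsearch' | 'logstash'; [k: string]: unknown };

function routeOutputRequest(
  body: OutputRequest,
  handlers: Record<OutputRequest['type'], (b: OutputRequest) => string>
): string {
  const handler = handlers[body.type];
  if (!handler) throw new Error(`Unknown output type: ${body.type}`);
  return handler(body); // delegate to the type-specific handler
}
```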

@jen-huang
Contributor

As discussed, will move this to Sprint 12 and continue the work there.

@joshdover
Contributor

We will likely need to feature flag this as the Agent work will not be ready in the same release. However, we still want to enable customers to test SNAPSHOT builds of the agent once it is ready. I think we should use the "Advanced Settings" Kibana infra to do the feature flagging instead of kibana.yml settings to easily enable a customer to turn on this feature without having to reconfigure and restart Kibana.
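
Gating on a Kibana advanced setting rather than a kibana.yml flag might look like the sketch below. The setting key is invented for illustration; the real key and settings client would come from Kibana's uiSettings infrastructure.

```typescript
// Hypothetical gate on a Kibana advanced setting, so the Kafka option can be
// toggled without reconfiguring or restarting Kibana. The key is made up.
interface UiSettingsLike {
  get(key: string, fallback: boolean): boolean;
}

function kafkaOutputEnabled(uiSettings: UiSettingsLike): boolean {
  return uiSettings.get('fleet:enableKafkaOutput', false);
}
```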

@jlind23
Contributor

jlind23 commented May 24, 2023

@criamico this will be included in our next sprint. @joshdover had a great idea about first delivering the API experience to unblock users and in a second PR work on the UI part. Both should land in separate releases if needed. What do you think?

cc @juliaElastic

@nimarezainia
Contributor Author

The API approach is fine as a first step. However, what users will need is the full UI capability.
We would also need some sample API calls showing users how to configure Kafka in this case, and to think about how such an output would show up in the Fleet UI alongside other outputs.

In other words, if a user uses the API to create and configure the output, what would other users see in the Fleet UI?

@jlind23
Contributor

jlind23 commented May 31, 2023

@nimarezainia The API first approach has the benefit of unblocking Elastic Agent E2E tests with Kafka output, it does not necessarily imply that we should ship it to our users without any UI.

@joshdover
Contributor

joshdover commented May 31, 2023

In otherwords, if the user uses the API to create the output and configure it, what would the other users see in the Fleet UI?

This is a good question. We'll still have to make a few UI adjustments to make sure this new output type doesn't break our existing UIs.

I would suggest just showing the output row for the Kafka outputs in the Settings tab, but disabling the edit button with a tooltip: "Use the Fleet API to edit this output"

@andrewkroh
Member

Thanks @juliaElastic. So is the Fleet API request format exactly the same as what is written into the Agent policy outputs section? For example, if an API request comes in with ca_trusted_fingerprint: 79f956a0175 then the Agent policy would contain this or is there some translation?

outputs:
  default:
    type: kafka
    ca_trusted_fingerprint: 79f956a0175

@juliaElastic
Contributor

There is some translation, e.g. ca_trusted_fingerprint gets a prefix of ssl.: https://github.com/szwarckonrad/kibana/blob/main/x-pack/plugins/fleet/server/services/agent_policies/full_agent_policy.ts#L214
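
The kind of translation described above could be sketched as follows. The set of prefixed fields here is illustrative only; the authoritative list lives in `full_agent_policy.ts` linked above.

```typescript
// Sketch of the translation described above: some flat API fields are
// written under an `ssl.` prefix when the full agent policy is generated.
// The field list below is illustrative, not the actual Fleet implementation.
const SSL_PREFIXED_FIELDS = new Set(['ca_trusted_fingerprint', 'certificate', 'key']);

function toAgentPolicyOutput(apiFields: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [name, value] of Object.entries(apiFields)) {
    // Nest known TLS fields under ssl.*, copy everything else through as-is.
    out[SSL_PREFIXED_FIELDS.has(name) ? `ssl.${name}` : name] = value;
  }
  return out;
}
```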

@mjmbischoff
Contributor

Perhaps a silly question, but is there any integration with the other end of Kafka? We have https://docs.elastic.co/integrations/kafka_log, and if I gathered the tickets correctly we also have an opinionated default name for topics. It would be great if some option was given to deploy an agent that consumes the topics the integrations are sending to, using the default data streams / pipelines, to ease rollout.

@brian-mckinney

I have a question about topics and processors in general. It is unlikely that Endpoint will be able to support the full gamut of available processors, considering we do not have access to libbeat and would have to write all the parsing code from scratch in C++. Is there a minimum set of required processors that could maybe help us tone down the scope of what we will need to provide? cc: @nfritts @ferullo

ref: https://www.elastic.co/guide/en/beats/filebeat/current/defining-processors.html

@mjmbischoff
Contributor

@brian-mckinney I'm unsure how this ties into this issue specifically, but processors are generally there for edge processing: typically to drop traffic / reduce payload, or to collect additional (local context) information (the add_*_metadata processors plus dns). With Kafka in the middle:

Source -> shipper(agent/endpoint?) -> Kafka -> forwarder(agent/filebeat) -> Elasticsearch

you still have the option to use all processors at the forwarder, though the add_*_metadata processors aren't useful there, as they would record the forwarder's local context rather than the shipper's. Regardless, it should improve things over the currently available options. And of course there are ingest pipelines for everything that doesn't require edge processing.

I think it's a separate issue, but things like decode_* can be skipped, as can parse_aws_vpc_flow_log, which seems like a poorly named decode_ variant.

@joshdover
Contributor

joshdover commented Jul 19, 2023

I have a question about topics and processors in general. It is unlikely that Endpoint will be able to support the full gamut of available processors, considering we do not have access to libbeat and would have to write all the parsing code from scratch in C++.

I fully agree that Endpoint should avoid re-implementing Beats processors. From my understanding, the Kafka output only makes use of the conditional processors for topic selection, which does pare down the list somewhat. But I wonder if we could ship the first version of this without support for dynamic topic selection at all and only support for the static topic field? @nimarezainia

If/when we do want to support dynamic topic selection, I think we could omit some conditions, like network or contains (covered by regexp).
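
If dynamic topic selection is kept, a minimal evaluator for a reduced condition set (equals / regexp, per the suggestion above) might look like the sketch below. Everything here — types, field access, and the fallback to a default topic — is an assumption for illustration, not the beats semantics.

```typescript
// Hypothetical, reduced topic-selection evaluator: first matching rule wins,
// otherwise fall back to the default topic (the last item of topics[] per the
// spec above).
type TopicCondition =
  | { type: 'equals'; field: string; value: string }
  | { type: 'regexp'; field: string; value: string };

function selectTopic(
  event: Record<string, string>,
  rules: Array<{ topic: string; when?: TopicCondition }>,
  defaultTopic: string
): string {
  for (const rule of rules) {
    if (!rule.when) return rule.topic; // unconditional rule always matches
    const actual = event[rule.when.field];
    if (actual === undefined) continue;
    if (rule.when.type === 'equals' && actual === rule.when.value) return rule.topic;
    if (rule.when.type === 'regexp' && new RegExp(rule.when.value).test(actual)) return rule.topic;
  }
  return defaultTopic;
}
```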

@joshdover
Contributor

Perhaps a silly question, but is there any integration with the other end of the kafka? we have docs.elastic.co/integrations/kafka_log and if I gathered the tickets correctly we also have an opinionated default name for topics so it would be great if some option was given to deploy an agent that consumes the topics for which integrations are sending and uses the default datastreams / pipelines, to ease rollout

This is a good suggestion - but not considered at this point. It would make sense for our default and examples in docs to match up with what the Kafka input package expects.

@brian-mckinney
Copy link

I fully agree that Endpoint should avoid re-implementing Beats processors. From my understanding, the Kafka output only makes use of the conditional processors for topic selection, which does pare down the list somewhat. But I wonder if we could ship the first version of this without support for topic selection at all and only support for the static topic field? @nimarezainia

If/when we do want to support dynamic topic selection, I think we could omit some conditions, like network or contains (covered by regexp).

Thanks @joshdover. I'm very interested in the outcome of this discussion. Scoping out dynamic topic selection in the first version would definitely reduce the amount of effort and testing complexity (on Endpoint at least) for the first version.

kevinlog pushed a commit that referenced this issue Jul 19, 2023
Ruhshan pushed a commit to Ruhshan/kibana that referenced this issue Jul 19, 2023
@nimarezainia
Contributor Author

I fully agree that Endpoint should avoid re-implementing Beats processors. From my understanding, the Kafka output only makes use of the conditional processors for topic selection, which does pare down the list somewhat. But I wonder if we could ship the first version of this without support for topic selection at all and only support for the static topic field? @nimarezainia

If/when we do want to support dynamic topic selection, I think we could omit some conditions, like network or contains (covered by regexp).

Thanks @joshdover. I'm very interested in the outcome of this discussion. Scoping out dynamic topic selection in the first version would definitely reduce the amount of effort and testing complexity (on Endpoint at least) for the first version.

@joshdover & @brian-mckinney dynamic topic selection is an attractive aspect of this solution. I have had a few customers engaging on that. However given where we are and the fact that this will be a beta to begin with, I think it's fair to address this as a followup. I will communicate this to our Beta candidates when the time comes.

@nimarezainia
Contributor Author

Could someone please clarify what Authentication methods will be available in the first phase? (appreciate it thx)

@joshdover
Contributor

@szwarckonrad Could you clarify which options we ended up implementing the UI for this first phase?

@nimarezainia I think we're also limited by what Endpoint ends up supporting, which is still in progress. @brian-mckinney should be able to help clarify this.

@szwarckonrad
Contributor

@joshdover Following the mockups I went with UI for username/password and SSL

@nimarezainia
Contributor Author

@szwarckonrad Could you clarify which options we ended up implementing the UI for this first phase?

@nimarezainia I think we're also limited by what Endpoint ends up supporting, which is still in progress. @brian-mckinney should be able to help clarify this.

thank you. I just need to know these limitations as we engage the beta customers.

@kevinlog
Contributor

Reopening for further testing.

See: elastic/elastic-agent-shipper#116 (comment)

@kevinlog kevinlog reopened this Jul 28, 2023
@amolnater-qasource

Hi @kevinlog

Could you please share a guide with valid values for the various fields of the Kafka output configuration and, if possible, share a demo recording for testing this feature?

It will be very helpful for us to understand the working and test this feature.

Thanks!

@kevinlog
Contributor

@amolnater-qasource - yes, we can work towards that.

@nimarezainia @faec - can I work with you to provide a good test config for the Kafka output? In addition, we could record a demo during the integration testing on the shared server.

ThomThomson pushed a commit to ThomThomson/kibana that referenced this issue Aug 1, 2023
@juliaElastic
Contributor

@kevinlog Can we close this issue or carry over to Sprint 16?

@jlind23
Contributor

jlind23 commented Aug 14, 2023

@juliaElastic Closing this as the latest PR has now been merged.

@jlind23 jlind23 closed this as completed Aug 14, 2023
@amolnater-qasource

amolnater-qasource commented Sep 8, 2023

Hi @cmacknz @nimarezainia

We have executed 08 testcases under the Feature test run for the 8.10.0 release at the link:

Status:

  • PASS: 08

Further 01 testcase is pending, and we would require some help in executing this:

Could anyone please help us set the correct details in the processor?

  • Any specific value we can add under processor:

Logs from the main topic- quickstart-events topic:
quickstart-events.txt

Here, we aim to get data under quickstart-2.

Build details:
VERSION: 8.10.0 BC6
BUILD: 66340
COMMIT: 1b2de6d
Artifact Link: https://staging.elastic.co/8.10.0-e882546e/summary-8.10.0.html

Please let us know if anything else is required from our end.

cc: @jlind23
Thanks

@amolnater-qasource

Thank you for resolving @pierrehilbert

  • We are now able to get the data for the dynamic topic processor.

The pending testcase execution is also now done under the Feature test plan at: Kafka Output

Status:
PASS: 9/9

As the testing is completed on this feature, we are marking this as QA:Validated.

Please let us know if any other scenario is required to be tested from our end.

Thanks!

@amolnater-qasource amolnater-qasource added QA:Validated Issue has been validated by QA and removed QA:Needs Validation Issue needs to be validated by QA labels Sep 11, 2023