
TCP/TLS Output for Beats #33107

Closed
JAndritsch opened this issue Sep 16, 2022 · 12 comments
Labels
Stalled Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Comments

@JAndritsch

Describe the enhancement:

I would like to see a generic TCP output (with optional TLS) added to Beats. This was originally proposed in #11942 but closed due to inactivity.

Describe a specific use case for the enhancement or feature:

There were some good examples in the original issue. Personally, I would like a TCP output so that I can use Beat-to-Beat communication. This would enable me to deploy minimally-configured instances of Winlogbeat and have them all converge to a central Beat. The Winlogbeat instances would be on a separate section of the network from Elasticsearch and wouldn't require any credentials or authentication management. They would simply gather data and send events to a central "Collector" Beat installed on a section of the network that can talk to Elasticsearch. The "Collector" Beat would manage configuration for the Elasticsearch output, including proper authentication/credentials.

This example would be very useful for shipping logs in heavily segmented networks and also simplifies the management of configuration and credentials needed to send the data to Elasticsearch.

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Sep 16, 2022
@legoguy1000
Contributor

A lumberjack input was added to the main branch, which will do this.

@JAndritsch
Author

JAndritsch commented Sep 17, 2022

@legoguy1000 Sweet! Do you have any samples or docs for using this, or could you point me at the source for this input?

Edit: I think I found it: #32175

This looks like an input that can receive events via the lumberjack protocol. Is there an output in Filebeat that works with it? I found #27951 which would add a generic HTTP output but it was closed because it sounds like Elastic isn't supporting new output types for Beats.

@legoguy1000
Contributor

The logstash output uses the lumberjack protocol.

@JAndritsch
Author

The logstash output uses the lumberjack protocol.

I think I follow. You're saying I can configure the Logstash output in Filebeat and point it at another Filebeat that uses the new lumberjack input?

@legoguy1000
Contributor

Correct. That should work.
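For reference, a minimal sketch of that pairing. The host, port, and certificate paths are placeholders, and the option names are based on the standard Beats `ssl.*` settings and the lumberjack input's defaults; since the input is undocumented, double-check them against the source:

```yaml
# Filebeat A (sender): ship events with the standard Logstash output.
# The "Logstash" endpoint is really another Filebeat running the
# lumberjack input; the wire protocol is the same.
output.logstash:
  hosts: ["collector.example.internal:5044"]   # placeholder host
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
```

```yaml
# Filebeat B (receiver): accept lumberjack connections over TLS.
filebeat.inputs:
  - type: lumberjack
    listen_address: "0.0.0.0:5044"   # default is localhost:5044
```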

@JAndritsch
Author

I'll try testing that out and close this issue once I've confirmed. Thanks for your help!

@endorama endorama added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Sep 19, 2022
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Sep 19, 2022
@JAndritsch
Author

So I haven't gotten around to testing this yet, but the following description in the PR (#32175) makes it sound like the lumberjack input is only a temporary addition to Filebeat and will eventually be moved into Elastic Agent:

Similar to the winlog input I am not adding documentation to Filebeat for lumberjack. The goal is to make this available to Elastic Agent. Once Elastic Agent fully supports the "input v2" architecture where standalone input binaries send data via the shipper then we will want to be able to migrate and remove it from Filebeat.

Is this the intent? Will this input remain undocumented and eventually get pulled from Filebeat?

@legoguy1000
Contributor

The agent is just a management process; Filebeat and Metricbeat still run underneath it. So for the agent to have it, Filebeat has to have it.

@botelastic

botelastic bot commented Sep 30, 2023

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@JAndritsch
Author

So I finally got around to testing this and can confirm that using a logstash output on one Filebeat and a lumberjack input on another works. The original event sent to the Beat with the lumberjack input ends up nested under a lumberjack.* field.

I have a use case where I have Winlogbeat deployed to endpoints, and I want those logs to flow to another Filebeat acting as a relay. That relay then sends to another Filebeat in a different part of the network. So: Filebeat A --> Filebeat B --> Filebeat C

In my test, each hop through a lumberjack input adds another level of nesting. After going through two lumberjack inputs, I end up with an event that looks like this:

{
    // data from Filebeat C ...

    "lumberjack": {
        // data from Filebeat B ...

        "lumberjack": {
            // data from Filebeat A, aka the original event ...
        }
    }
}

At some point this event is going to make it into Elasticsearch, and I'll have to unravel this nesting before doing any processing of the event via ingest pipelines.

Is there a way to promote the original event to the root of the document so that it doesn't end up nested under a lumberjack field (something similar to fields_under_root)? I could probably do this with a script processor but I'm hoping to avoid that if possible.

@JAndritsch
Author

JAndritsch commented Jan 10, 2024

Just posting for anyone else who may be interested, but this seems doable via the script processor of an ingest pipeline:

def outer = ctx;

// Handle multiple nested levels of lumberjack
while (outer["lumberjack"] != null) {
  outer = outer["lumberjack"];
  // Promote the fields from the lowest-level event under the lumberjack
  // namespace (aka the original event) to the top level of the document
  for (def key : outer.keySet()) {
    ctx[key] = outer.get(key);
  }
}

Then you can remove the top-level ctx.lumberjack field (e.g. with a remove processor).

This assumes you don't care about losing the intermediate Beat context as the event passes through each hop; the script will make it appear as if the event went through a single Filebeat.
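For anyone who wants to sanity-check the promotion logic outside Elasticsearch, here is the same loop sketched in Python. The function name and sample event are purely illustrative, not part of any Beats or Elasticsearch API:

```python
def flatten_lumberjack(event):
    """Promote fields from nested lumberjack.* levels to the top level,
    mirroring the Painless script above. Deeper levels win on key clashes."""
    outer = event
    while outer.get("lumberjack") is not None:
        outer = outer["lumberjack"]
        # Copy this level's fields up to the root of the document
        for key, value in outer.items():
            event[key] = value
    # Finally drop the now-redundant top-level lumberjack field
    event.pop("lumberjack", None)
    return event

# A doubly nested event, as produced by two lumberjack hops
doubly_nested = {
    "relay": "C",
    "lumberjack": {
        "relay": "B",
        "lumberjack": {"message": "original event", "relay": "A"},
    },
}
print(flatten_lumberjack(doubly_nested))
# → {'relay': 'A', 'message': 'original event'}
```

Note that, as in the Painless version, fields added by intermediate relays are overwritten by the innermost (original) event when names collide.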
