Proposal: Introduce connection prefix, move source / destination #51
Recently there have been several discussions around source, destination and connection, especially in elastic#9. My conclusion is that source and destination normally belong to a connection, and that we are actually missing a connection prefix. Some information currently under network, like `forwarded_ip`, also belongs more to a connection than to network.

An additional change I made to source and destination is that they now both contain a host prefix. All the fields in source and destination also exist in `host`, so the host prefix can be reused here too. This makes ECS very predictable: every time `host.*` shows up, it contains the same fields. Source and destination could also contain additional data like the location, see elastic#50 for more details.

The connection fields now look as follows:

| Field | Description | Type |
|---|---|---|
| <a name="connection.destination.host.ip"></a>`connection.destination.host.ip` | IP address of the destination.<br/>Can be one or multiple IPv4 or IPv6 addresses. | ip |
| <a name="connection.destination.host.name"></a>`connection.destination.host.name` | Hostname of the destination. | keyword |
| <a name="connection.destination.host.port"></a>`connection.destination.host.port` | Port of the destination. | long |
| <a name="connection.destination.host.mac"></a>`connection.destination.host.mac` | MAC address of the destination. | keyword |
| <a name="connection.destination.host.domain"></a>`connection.destination.host.domain` | Destination domain. | keyword |
| <a name="connection.destination.host.subdomain"></a>`connection.destination.host.subdomain` | Destination subdomain. | keyword |
| <a name="connection.source.host.ip"></a>`connection.source.host.ip` | IP address of the source.<br/>Can be one or multiple IPv4 or IPv6 addresses. | ip |
| <a name="connection.source.host.name"></a>`connection.source.host.name` | Hostname of the source. | keyword |
| <a name="connection.source.host.port"></a>`connection.source.host.port` | Port of the source. | long |
| <a name="connection.source.host.mac"></a>`connection.source.host.mac` | MAC address of the source. | keyword |
| <a name="connection.source.host.domain"></a>`connection.source.host.domain` | Source domain. | keyword |
| <a name="connection.source.host.subdomain"></a>`connection.source.host.subdomain` | Source subdomain. | keyword |
| <a name="connection.direction"></a>`connection.direction` | Direction of the network traffic.<br/>Recommended values are:<br/> * inbound<br/> * outbound<br/> * unknown | keyword |
| <a name="connection.forwarded_ip"></a>`connection.forwarded_ip` | Host IP address when the source IP address is the proxy. | ip |

I opened a PR to discuss this instead of an issue, as it allows us to discuss the high-level parts as comments but also the details directly in the code.
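To make the nesting of the proposed names concrete, here is a minimal Python sketch that expands dotted field names into a nested document. The `expand_dotted` helper and all sample values are hypothetical and only for illustration; the field names are the ones listed in the proposal.

```python
from typing import Any, Dict


def expand_dotted(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Expand dotted field names (a.b.c) into a nested dict."""
    nested: Dict[str, Any] = {}
    for dotted_name, value in flat.items():
        node = nested
        *parents, leaf = dotted_name.split(".")
        for part in parents:
            # Create intermediate objects such as "connection" and "host" on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Sample values are invented (documentation-range IP addresses).
flat_event = {
    "connection.source.host.ip": "192.0.2.10",
    "connection.source.host.port": 49152,
    "connection.destination.host.ip": "198.51.100.7",
    "connection.destination.host.port": 443,
    "connection.direction": "outbound",
}

event = expand_dotted(flat_event)
```

In the nested form, `source` and `destination` each carry a `host` object with the same shape, which is the predictability argument made in the description.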
As discussed in #9, there are also cases where the host from source and destination should end up in one field. For these cases the `copy_to` feature could be used. Here is a small example:
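As a hedged illustration (not the example from the PR itself), the idea could be sketched as an Elasticsearch mapping built in Python. `copy_to` is a real mapping parameter; the combined field name `connection.all_ips` is a hypothetical choice, and the typeless mapping layout assumes a recent Elasticsearch version.

```python
import json
from typing import Any, Dict

# Hypothetical name for the combined field; the proposal does not define one.
ALL_IPS_FIELD = "connection.all_ips"


def host_side() -> Dict[str, Any]:
    """Mapping for one side (source or destination) whose host.ip is copied."""
    return {
        "properties": {
            "host": {
                "properties": {
                    # copy_to takes the full path of the target field.
                    "ip": {"type": "ip", "copy_to": ALL_IPS_FIELD}
                }
            }
        }
    }


mapping = {
    "mappings": {
        "properties": {
            "connection": {
                "properties": {
                    "source": host_side(),
                    "destination": host_side(),
                    "all_ips": {"type": "ip"},  # receives both copies
                }
            }
        }
    }
}

mapping_json = json.dumps(mapping, indent=2)
```

With a mapping like this, a single query on the combined field would match an address regardless of which side of the connection it appeared on.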
The |
As |
This is a pretty fundamental change to the schema -- should we be planning that this will be included if we are hoping to conform to this schema? Do we know when this might be approved or merged? |
I quite dislike how this pull request introduces vastly longer field names for mostly superfluous categorisation. In terms of daily usability I much prefer |
@urso Interesting idea. So you are basically saying a proxy is also a host with additional info? Is
@strawgate There is no conclusion yet on this topic, which is why it is here for discussion. The outcome is not clear yet.
@praseodym Can you share a bit more background on the problem of long field names? If it is the typing, I wonder how much the autocomplete in newer Kibana versions solves this issue. |
No idea where In the schema the Checking the proposal again, I wonder if |
I think that there should be a consideration for tools that do not log Source and Destination. I propose there should be 2 different fields. One for tools that use source and destination, and another for tools that are sessionized. For example, the tool |
I like the concept for connection oriented tools, but can't we just put it under
As for @spartan782's comment, the way Bro actually does this today is for the connection log, the I propose that those cases are actually annotated in the protocol-specific prefixes (i.e. ^1 Migration to ECS is underway for RockNSM, so this is a timely topic. |
Proxy IPs come from reverse proxies. The reason it's an array is that there can be more than one reverse proxy in front of the application logging the event. For example: Cloudflare => NGINX => application. Your list of forwarded IPs would include Cloudflare's edge node, then your NGINX load balancer's. You can have more than two, of course, just add in Varnish and Apache running PHP-fpm as the "application". I love the idea of building a plain array of all seen IPs for a given event, for ease of pivoting. This would not only help catch situations where a proxy is compromised, and that's the hostile entity, but would also simplify pivoting for situations where we have potentially hostile IPs in "source" as well as in "destination" IPs, like DNS (see this discussion for more context). |
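The two ideas in the comment above, a list of forwarded IPs and a plain array of all IPs seen in an event, can be sketched in Python. This assumes the forwarded addresses arrive as a comma-separated, `X-Forwarded-For`-style string, which the thread does not state; the function names and sample addresses are hypothetical.

```python
import ipaddress
from typing import List


def parse_forwarded_for(header_value: str) -> List[str]:
    """Split a comma-separated header value into valid IP strings, in order."""
    ips: List[str] = []
    for raw in header_value.split(","):
        candidate = raw.strip()
        try:
            ips.append(str(ipaddress.ip_address(candidate)))
        except ValueError:
            continue  # skip entries that are not a bare IP, e.g. "unknown"
    return ips


def all_seen_ips(source_ip: str, destination_ip: str, forwarded: List[str]) -> List[str]:
    """De-duplicated list of every IP in the event, preserving first-seen order."""
    seen: List[str] = []
    for ip in [source_ip, *forwarded, destination_ip]:
        if ip not in seen:
            seen.append(ip)
    return seen


forwarded = parse_forwarded_for("203.0.113.5, 198.51.100.20, unknown")
pivot_ips = all_seen_ips("203.0.113.5", "192.0.2.1", forwarded)
```

Keeping the forwarded list ordered preserves the proxy chain, while the flat de-duplicated array is what makes pivoting on a single address simple.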
++ on having a place for all
|
Having host.ip contain all IP addresses from the event seems confusing. The norm is to have things like src_ip and dst_ip. The current ECS makes that source.ip and destination.ip, and this proposal now makes it connection.destination.host.ip and connection.source.host.ip. I'm not entirely sure there is a benefit to this. Fields that relate to a host are vast (architecture, OS, timezone, etc.), whereas fields that relate to a host that is part of a connection witnessed by a router, switch, or firewall are minimal. So I'm not sure why prefixing each field with the object type is useful here, and stuff like this:
doesn't really help with the confusion. |
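The three naming schemes mentioned in the comment above can be put side by side in a small Python sketch. The `RENAMES` table and the `rename_fields` helper are hypothetical; only the field names themselves come from the thread.

```python
from typing import Any, Dict

# Common name -> (current ECS name, name proposed in this PR).
RENAMES = {
    "src_ip": ("source.ip", "connection.source.host.ip"),
    "dst_ip": ("destination.ip", "connection.destination.host.ip"),
}

SCHEME_INDEX = {"current": 0, "proposed": 1}


def rename_fields(event: Dict[str, Any], scheme: str) -> Dict[str, Any]:
    """Rename known keys of a flat event to the chosen scheme; keep other keys."""
    index = SCHEME_INDEX[scheme]
    return {
        (RENAMES[key][index] if key in RENAMES else key): value
        for key, value in event.items()
    }


raw = {"src_ip": "192.0.2.1", "dst_ip": "198.51.100.7", "bytes": 10}
current = rename_fields(raw, "current")
proposed = rename_fields(raw, "proposed")
```

The sketch only shows how much longer the proposed names are for the same two values; it does not argue for either scheme.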
I'm not sure I see the benefit of adding
In a connection scenario, the process only knows host details about its own side of the connection, and not the other side. This means the bulk of the host details will actually shift around between source and destination. Two examples:
Application handling inbound requests:
Application calling out to an external system:
This is what I mean by the host details shifting around. I'm ok with having
|
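The "shifting around" described in the comment above can be sketched in Python, under the assumption that the reporting process knows rich details only about its own host and just an IP for the remote side. The `place_local_host` function and the sample host details are hypothetical; the direction values are the ones recommended in the proposal.

```python
from typing import Any, Dict


def place_local_host(direction: str, local_host: Dict[str, Any], remote_ip: str) -> Dict[str, Any]:
    """Put the locally known host details on the side matching the direction."""
    local_side = dict(local_host)      # rich details: the process knows its own host
    remote_side = {"ip": remote_ip}    # sparse details: only what is seen on the wire
    if direction == "inbound":
        # Remote peer initiated the connection, so the local host is the destination.
        return {"source": remote_side, "destination": local_side}
    if direction == "outbound":
        # Local host initiated the connection, so it is the source.
        return {"source": local_side, "destination": remote_side}
    raise ValueError(f"unsupported direction: {direction!r}")


local = {"ip": "192.0.2.10", "name": "app-01", "os": "linux"}
inbound_event = place_local_host("inbound", local, "203.0.113.5")
outbound_event = place_local_host("outbound", local, "198.51.100.7")
```

The same host details land under `destination` in one event and under `source` in the other, which is the asymmetry the comment points out.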
@ruflin thanks for this PR. Clearly a great topic and a needed discussion, as it's generated a lot of sub-topics! Here's my $0.02
@strawgate #51 (comment) I agree, this would be a big change, and I prefer to work through any shortcomings with the current set of namespaces/objects/prefixes.
@praseodym #51 (comment) I agree that vastly longer names without significant value will detract from two key ECS benefits, Ease of Recall and Ease of Deduction, and should therefore be avoided.
@urso #51 (comment) The network.forwarded_ip field definition may need some improvement. The original intent was to populate this field with the IP address(es) of network entity(ies) (e.g., proxies) forwarding network traffic associated with an event, when the
@spartan782 #51 (comment) The
@dcode #51 (comment) +1 to keeping the connection-related fields in the
@strawgate #51 (comment) Agreed with your point that the details (fields) relating to sources and destinations in a network event will be fewer than those relating to a host in a host event. This was a key factor in originally choosing
@webmat #51 (comment) Agreed, thanks. |
@webmat Having |
@ruflin Are you ok if we close this? I think it's clear we're not going to move in this direction after all :-) |
I don't think this is something we should do for ECS 1.0, but I still think we should do it in the long term to support more complex connection data. Given the recent changes, the initial PR will need updating, but my proposal to have a connection object still stands. I suggest keeping this open but putting it on hold for now. |
Now that we are also introducing server / client, let's close this for now. I still like the idea of a connection though ;-) |