Add network locality fields #288

Closed

andrewkroh
Member

@andrewkroh andrewkroh commented Dec 18, 2018

This adds network.locality, which simply has a value of either private or public. If either source.ip or destination.ip is a public IP address, then network.locality is elevated to "public". Otherwise, if both source.ip and destination.ip are non-public, network.locality is private.

The IPv4 and IPv6 ranges that are considered private are strictly specified in the definition of network.locality.
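
As a rough illustration of the rule above, the sketch below classifies each address and then derives network.locality. The range list is an assumption (RFC 1918 for IPv4 and RFC 4193 for IPv6); the authoritative list is whatever the network.locality definition specifies. The helper names `ip_locality` and `network_locality` are hypothetical.

```python
import ipaddress

# Assumed private ranges, for illustration only:
# RFC 1918 (IPv4) and RFC 4193 unique local addresses (IPv6).
PRIVATE_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")
]


def ip_locality(ip: str) -> str:
    """Return "private" if the address is in an assumed private range, else "public"."""
    addr = ipaddress.ip_address(ip)
    # Membership tests across IP versions simply evaluate to False.
    return "private" if any(addr in net for net in PRIVATE_NETWORKS) else "public"


def network_locality(source_ip: str, destination_ip: str) -> str:
    """A flow is "public" as soon as either endpoint is public."""
    localities = (ip_locality(source_ip), ip_locality(destination_ip))
    return "public" if "public" in localities else "private"
```

An explicit range list is used here instead of `ipaddress`'s `is_private` property, which also covers loopback, link-local, and other special-purpose ranges.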

This is a useful means of filtering on flows. Some common queries I use are

  • network.locality:public and source.locality:public
  • network.locality:public and destination.locality:public

Contributor

@MikePaquette MikePaquette left a comment

@andrewkroh this seems useful - I have a few questions please:

  1. If a user has a list of internally managed IP addresses available (e.g., via Infoblox), even if they are routable, should those addresses be considered private as well? Or just the ones specified by the listed RFCs? (I think this could be the distinction between "internal" and "private".)

  2. If private == internal, then could this information be represented in the network.direction field? It seems that:
    network.direction: external == source.locality: public and destination.locality: public
    network.direction: internal == source.locality: private and destination.locality: private
    (but I think "inbound" and "outbound" will get us back into the troubling discussion about how to determine/define this.)

  3. In your example, isn't network.locality:public and source.locality:public == source.locality:public ?

  4. If we go forward with these fields, should we also have client.locality and server.locality for performing similar filtering on bidirectional flows? If so, then we'd have to decide which one populates the network.locality field in the case where they conflict. (i.e. when source.* <> client.*)

@andrewkroh
Member Author

andrewkroh commented Feb 23, 2019

  1. If a user has a list of internally managed IP addresses available (e.g., via Infoblox), even if they are routable, should those addresses be considered private as well? Or just the ones specified by the listed RFCs? (I think this could be the distinction between "internal" and "private".)

I'd like this to be based strictly on the RFCs, which makes it very easy to implement everywhere since it doesn't require much configuration.

  2. If private == internal, then could this information be represented in the network.direction field?

What constitutes "internal" and "external" has some flexibility, which is great. If someone treats internal as the RFC private addresses then yes, they are equivalent. You get a difference if you configure your "internal" ranges to be a subset of the private address space, like only 10.20.0.0/16, and you have outbound traffic to some other private address. You also get a difference if you have a mix of public and private addresses that you treat as "internal".

  3. In your example, isn't network.locality:public and source.locality:public == source.locality:public ?

Indeed it is, my bad. I was trying to show that you could infer some directionality based on the query and meant to use a private in there. Like network.locality:public and source.locality:private is probably a flow initiated from the inside to some public facing internet service.
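
To make the point concrete, here is a small sketch that enumerates every source/destination combination under the public/private rule. It assumes the "either endpoint public" rule from the PR description; `network_locality` is a hypothetical helper that takes the two per-endpoint localities directly.

```python
from itertools import product


def network_locality(source_locality: str, destination_locality: str) -> str:
    """A flow is "public" as soon as either endpoint is public."""
    pair = (source_locality, destination_locality)
    return "public" if "public" in pair else "private"


# All four (source.locality, destination.locality) combinations.
combos = list(product(("private", "public"), repeat=2))

# network.locality:public and source.locality:public
redundant_query = [
    (s, d) for s, d in combos if network_locality(s, d) == "public" and s == "public"
]

# source.locality:public on its own
source_public = [(s, d) for s, d in combos if s == "public"]

# network.locality:public and source.locality:private
inside_to_public = [
    (s, d) for s, d in combos if network_locality(s, d) == "public" and s == "private"
]
```

The first two selections match the same flows, while the corrected query isolates the single case of a private source talking to a public destination.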

  4. ... If so, then we'd have to decide which one populates the network.locality field in the case where they conflict. (i.e. when source.* <> client.*)

I agree that we should decide on a precedence. I'm just having trouble imagining cases where the two sets of client/server and source/destination addresses are different (maybe DHCP). As long as client/server and source/destination are the same sets of addresses, the computation for network.locality always works out the same.

Let's say that source/destination take precedence?

@andrewkroh andrewkroh force-pushed the feature-network-locality branch from 098d373 to b7895d1 Compare February 23, 2019 03:07
@andrewkroh
Member Author

andrewkroh commented Mar 8, 2019

I've changed my thinking on this after seeing a user request to make the network ranges configurable. I'm thinking it's better to keep the source/destination.locality fields in sync with the network.direction field.

| source.locality | destination.locality | network.locality |
| --- | --- | --- |
| internal | internal | internal |
| internal | external | outbound |
| external | internal | inbound |
| external | external | external |
| | | unknown |
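
A minimal sketch of how the table above might be applied with configurable ranges. The internal ranges shown are hypothetical examples (10.20.0.0/16 from the earlier comment, plus a documentation range standing in for a company-owned public block), the helper names are made up for illustration, and treating any missing locality as "unknown" is an assumption about the last row.

```python
import ipaddress
from typing import Optional, Sequence, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Hypothetical user-configured "internal" ranges.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.20.0.0/16"),
    ipaddress.ip_network("203.0.113.0/24"),
]

# Direct encoding of the table: (source.locality, destination.locality) -> result.
LOCALITY_TABLE = {
    ("internal", "internal"): "internal",
    ("internal", "external"): "outbound",
    ("external", "internal"): "inbound",
    ("external", "external"): "external",
}


def ip_locality(ip: Optional[str], internal_networks: Sequence[Network]) -> Optional[str]:
    """Classify one address against the configured ranges; None if no address is known."""
    if ip is None:
        return None
    addr = ipaddress.ip_address(ip)
    return "internal" if any(addr in net for net in internal_networks) else "external"


def classify(source_locality: Optional[str], destination_locality: Optional[str]) -> str:
    """Look up the combined value; anything not in the table is "unknown"."""
    return LOCALITY_TABLE.get((source_locality, destination_locality), "unknown")
```

With these ranges, traffic from 10.20.1.1 to 10.30.0.1 comes out as outbound even though both addresses are RFC private, which is the difference described in the earlier reply.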

You can see how this would be used in elastic/beats#11147.

@andrewkroh
Member Author

The general idea behind having both a network.direction and a network.locality is to allow monitoring tools to report the direction that they observed (like Zeek running on a desktop) and to also be able to classify that traffic at a higher level (like classifying the traffic against all company owned networks).

For example, a server connects to another server on the company's network. A network monitoring tool running on that server sets network.direction: outbound for that connection. The network.locality field would be set to internal since both the source and destination are internal to the company's network.
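
The server-to-server example could look roughly like the event below. This is only a sketch: the IP addresses are hypothetical, and both are assumed to fall inside the company's configured internal ranges.

```python
# Hypothetical event for a connection between two servers on the company network.
event = {
    "source": {"ip": "10.20.1.15", "locality": "internal"},
    "destination": {"ip": "10.20.2.40", "locality": "internal"},
    "network": {
        # Direction as observed by the monitoring tool on the originating server.
        "direction": "outbound",
        # Higher-level classification against all company-owned networks.
        "locality": "internal",
    },
}
```

The two fields answer different questions, so they can legitimately disagree on the same event.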

Contributor

@webmat webmat left a comment

@andrewkroh @MikePaquette I like this proposal, and I'd like to get this in soon.

Andrew, is the PR still in line with your latest thinking, as described in comments from March and your proposal in Beats?

I can help rebase/merge the PR if you'd like. Much has changed in the last little while.

If you attempt the merge yourself, you'll have to add a short: description as well

@@ -20,6 +20,7 @@ All notable changes to this project will be documented in this file based on the

* Added pointer in description of `http` field set to `url` field set. #330
* Added an optional short field description. #330
* Add `network.locality`, `source.locality`, and `desination.locality`. #288
Contributor

There's a typo in destination, and please add the new fields as well, under client & server

@andrewkroh
Member Author

Andrew, is the PR still in line with your latest thinking, as described in comments from March and your proposal in Beats?

Yes, it is. And if you want to take over that would be much appreciated.

@webmat
Contributor

webmat commented Apr 25, 2019

You got it! 👍

@webmat
Contributor

webmat commented Apr 29, 2019

@andrewkroh I have a branch where the code of this PR has been ported on top of master.

However looking back at the later comments here, I realize that the PR code uses values "private" and "public", whereas this table and issue elastic/beats#11147 both talk about values "internal" and "external" for network.locality, source.locality and destination.locality...

Another inconsistency between the PR and the discussion is that the PR states that the values for locality should be based on the RFCs, but the thinking now seems to be that they should be configurable by users based on their network address ranges.

So I have a few things I'd like to discuss:

  1. Which pairs of values should we use for the 5 locality fields?
    • private/public
    • internal/external
  2. Should we separate the concept of RFC private/public from the concept of whether an IP range (public or not) is under the user's control?
  3. If we introduce the concept of configurability wrt one's network for these fields, people will want levels of trust of these "trusted" ranges, as well. I've had this discussion with a few people already.

I do see the value in having the configurability. But my take is that starting with RFC & no configurability keeps it straightforward and useful, while avoiding opening the Pandora's box of "how much is the network trusted".

In other words, my answer to Q 2 would be we separate the concepts for now, and when tackling the configurability, we also tackle the trust levels.

I'm curious what you think about this.

@dainperkins
Contributor

some thoughts:

  1. public/private are specific concepts and should be used as such. Internal/external are significantly more relevant in terms of security, APM, etc., but will at the very least require the definition and lookup of internal public addresses and external private addresses (assume all undefined are private/internal or public/external)

  2. absolutely, they are not mutually inclusive or exclusive (at any level, plenty of vpn B2B use private addresses on each side)

  3. network.risk.score, network.risk.tag :)

additionally, looking at e.g. NAPM, having some concept of asset-type categorization of network zones would be extremely useful (network.tag or similar would be useful for describing network "subnets" in relation to physical locations in the organization, or alternatively network.location, perhaps via internal geoip additions or pipelines that map IPs to a network name/CIDR index)

an array for network.tag could hold any of the 4 (pub/prv/int/ext), but a network.location could also hold organizational data identifying a specific location (Houston-DMZ, Data-Center, WAN-Hub, GCloud-Project, etc)

Add one for source & destination and suddenly there's the possibility of n/apm: adding network info to APM information for troubleshooting, e.g. User -> (fw to WAF) -> (WAF to array of web servers), tracking the number of resets/retransmits or QoS info across the various web servers to troubleshoot issues at the network level.

@andrewkroh andrewkroh closed this Oct 21, 2020